Setting up to use External Tables in SQL Server 2019 BDC

SQL Server 2019 Big Data Clusters is one of the cool new features that’s available with the release of SQL Server 2019. The idea behind SQL Server 2019 Bid Data Clusters (BDC) is that you can keep all your data on their native systems, and have BDC point to the tables needed using what are called External Tables. These External Tables are just basically pointers that point you to the actual data.

Think about this scenario, you’ve got data in a SQL Server 2012 database, you’ve got other data in an Oracle database and you need to create a report that includes columns from both sources.

In the olden days, you’d have to ETL this data from one server to another on some sort of schedule, usually to a third server (usually a data mart) so that the report could be run there.

With SQL Server 2019 BDC you don’t have to worry about creating and maintaining this ETL. You can simply use Azure Data Studio to spin up a SQL 2019 BDC instance (either in an Azure Kubernetes environment called aks, or in an on-prem Kubernetes environment) and create external tables to point to your source data.

In order to create external tables, there’s a little bit of setup that you’ll need to do on the BDC instance.

The first step will be to create a database that you want to use. The normal CREATE DATABASE command is all you need here.

From there you'll need to create a master key in that database (after switching into the database you just created).

USE test;

After the master key is created then you’ll need to create the credential that you’ll be using to log into the remote server.

WITH IDENTITY = N'bdc-login', SECRET = N'MyP@ssw0rd';

After the Credential is created, then you need to tell BDC where the remote data is going to be accessed from. You do this by creating an EXTERNAL DATA SOURCE. We also use this to specify the login that we’re going to use to log in to the remote server.

WITH (LOCATION = N'sqlserver://', CREDENTIAL = [sql-auth]);

At this point, we’re ready to create our external tables. These are done using the CREATE EXTERNAL TABLE command. You specify the schema of the table, then in the WITH section we tell BDC what table we’re accessing the data from, and what data source (server and authentication method) that we’re going to use.

ProductId int,
ProductName nvarchar(30)
with (location = 'SampleDb.dbo.Product', data_source = sourceName)

Once the external table is setup we can access it just like any other table that we access. We can create multiple external tables, all pointing to different places, and join those together as needed to get the data we want.

One thing to keep in mind when you’re setting up BDC is database collation. If the database collation of the source table is different than BDC then you’ll need to specify the source collation when you create your external table. That’s done easily like this where we are using the Chinese_PRC_CI_AS collation as that is the collation that is being used in our source tables.

ProductId int,
ProductName nvarchar(30) COLLATE Chinese_PRC_CI_AS
with (location = 'SampleDb.dbo.Product', data_source = sourceName)

Hopefully, this quick walkthrough on creating external tables on SQL Server 2019 BDC was helpful.

SQL Server Big Data Cluster requires a good amount of planning in order to have a successful deployment.  The team at Denny Cherry & Associates Consulting would love to help you plan and deploy SQL Server Big Data Cluster with you.  Contact the team and we’ll get the process started.



Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Trust DCAC with your data

Your data systems may be treading water today, but are they prepared for the next phase of your business growth?