Monday, February 11, 2019

How to copy data between Azure databases

What is the simplest way to move data between Azure Databases? Let's explore the options.
Whenever I have a requirement, the first thing that I ask myself is what is the simplest way I can solve a problem. This time, I was asked to migrate an Azure Sql Table to Production. Let's see how we can do it.

bcp 

One simple option definitely would be using bcp, so why not? I downloaded it and during the installation, was surprised with the message below. Since I don't like installing things for an one-off usage, I decided to search an even simpler solution.

PolyBase


PolyBase is a recent feature available for Sql Server 16+ and Azure Sql databases, that allows us data integration between distributed data sources. Here's how Microsoft describes it:
PolyBase does not require you to install additional software. You query external data by using the same T-SQL syntax used to query a database table. The support actions implemented by PolyBase all happen transparently.

And this is a simple representation on how it works:
It seems PolyBase is a candidate to transfer data between Azure Databases because it allows us to:
  • create a connection on your target server
  • create an external data source on the server were working on
  • create an external table: a virtual representation of your remote table
  • insert/select/join normally as if the remote table were in the same database

Using Polybase

So let's migrate this data in  simple steps.

Step 1 - Create an external data source

An external data source is a virtual representation of your remote database. With it, you'll be able to access your remote database locally. The first thing we'll need to do is to create a SCOPED CREDENTIAL to store the remote data sort address/credentials on your local Sql Database.

Here's how you create them:

-- You are required to create a scoped credential if your server still does have one
CREATE DATABASE SCOPED CREDENTIAL RemoteCred
WITH IDENTITY = '<remote-username>', SECRET = '<remote-password>';

-- The external data source is a virtual representation of your remote database
CREATE EXTERNAL DATA SOURCE RemoteDataSource
WITH
(
    TYPE=RDBMS,
    LOCATION='<remote-server-address>',
    DATABASE_NAME='<remote-db-name>',
    CREDENTIAL= RemoteCred
);

Note that for the CREATE DATABASE SCOPED CREDENTIAL to work, we have to have MASTER KEY ENCRYPTION set in the database. If it isn't created on your server, you can create one with:
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<a-very-strong-password>';

Step 2 - Create an external table

An external table is a virtual representation of your remote table. We can create one with:
CREATE EXTERNAL TABLE [dbo].[RemoteTbl] (
    --your column definitions here as a normal create table
    --note that Identity fields are not allowed
)
WITH
(
    DATA_SOURCE = RemoteDataSource,
    SCHEMA_NAME = '<remote-schema-name>',
    OBJECT_NAME = '<remote-table-name>'
)

Once created, we can already select from that table and see results from the remote host in our current connection.

Step 3 -  Insert your records

Last and final record for me was to move the data between servers. In my case, a simple select into resolved my problem:
select *
into [dbo].[YourLocalTable]
from [dbo].[RemoteTbl]
Note that you can also insert on an existing local table using SELECT INTO:
-- if columns match
INSERT INTO dbo.<dest-table>
SELECT <col1>, <col2>, <coln>
  FROM dbo.<source-table>
 WHERE <condition>

-- specific columns
INSERT INTO dbo.<dest-table>
  (<col1>, <col2>)
SELECT <col1>, <col2>
  FROM dbo.<source-table>
 WHERE <condition>

Other Operations

Step 3 above was to illustrate how to move data between servers. In fact, after you created your external table, you can join, query, and do almost everything else as if the remote database were present on the source server.

Conclusion

On this post we reviewed how can we easily migrate data between databases using Sql Server's PolyBase feature. There's a lot more on this topic and I encourage you to research and learn more about it. Hope it helps.

Source Code

As always, the full source code for this article is available on my GitHub.

References

See Also

For other  posts about Azure on this blog, please click here.

About the Author

Bruno Hildenbrand