
Monday, February 11, 2019

How to copy data between Azure databases

What is the simplest way to move data between Azure databases? Let's explore the options.
Whenever I get a requirement, the first thing I ask myself is: what is the simplest way to solve this problem? This time, I was asked to migrate an Azure SQL table to production. Let's see how we can do it.


One simple option would definitely be bcp, so why not? I downloaded it and, during the installation, was surprised by the message it displayed. Since I don't like installing things for a one-off usage, I decided to search for an even simpler solution.


PolyBase is a feature available in SQL Server 2016+ and Azure SQL databases that enables data integration between distributed data sources. Here's how Microsoft describes it:
PolyBase does not require you to install additional software. You query external data by using the same T-SQL syntax used to query a database table. The support actions implemented by PolyBase all happen transparently.

And this is a simple representation of how it works:
It seems PolyBase is a good candidate to transfer data between Azure databases because it allows us to:
  • create a connection to your target server
  • create an external data source on the server we're working on
  • create an external table: a virtual representation of your remote table
  • insert/select/join normally, as if the remote table were in the same database

Using PolyBase

So let's migrate this data in three simple steps.

Step 1 - Create an external data source

An external data source is a virtual representation of your remote database. With it, you'll be able to access your remote database locally. The first thing we'll need to do is create a SCOPED CREDENTIAL to store the remote data source's address and credentials on your local SQL database.

Here's how you create them:

-- you are required to create a scoped credential if your database doesn't have one yet
CREATE DATABASE SCOPED CREDENTIAL RemoteCred
WITH IDENTITY = '<remote-username>', SECRET = '<remote-password>';

-- the external data source is a virtual representation of your remote database
CREATE EXTERNAL DATA SOURCE RemoteDataSource
WITH (TYPE = RDBMS, LOCATION = '<remote-server>.database.windows.net',
      DATABASE_NAME = '<remote-database-name>', CREDENTIAL = RemoteCred);

Note that for CREATE DATABASE SCOPED CREDENTIAL to work, the database must have a MASTER KEY. If your database doesn't have one yet, you can create it with:
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<a-very-strong-password>';

Step 2 - Create an external table

An external table is a virtual representation of your remote table. We can create one with:

CREATE EXTERNAL TABLE [dbo].[RemoteTbl]
(
    -- your column definitions here, as in a normal CREATE TABLE
    -- note that IDENTITY columns are not allowed
    <your-column-definitions>
)
WITH (
    DATA_SOURCE = RemoteDataSource,
    SCHEMA_NAME = '<remote-schema-name>',
    OBJECT_NAME = '<remote-table-name>'
);

Once created, we can already select from that table and see results from the remote host in our current connection.
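For example, a quick sanity check against the external table above (using the [dbo].[RemoteTbl] name from this post; adjust it to your own table):

```sql
-- this query runs locally but transparently fetches rows from the remote database
SELECT TOP 10 *
  FROM [dbo].[RemoteTbl];
```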

Step 3 - Insert your records

The last and final step for me was to move the data between servers. In my case, a simple SELECT INTO solved my problem:
SELECT *
  INTO [dbo].[YourLocalTable]
  FROM [dbo].[RemoteTbl];
Note that you can also insert into an existing local table using INSERT INTO ... SELECT:
-- if columns match
INSERT INTO dbo.<dest-table>
SELECT <col1>, <col2>, <coln>
  FROM dbo.<source-table>
 WHERE <condition>

-- specific columns
INSERT INTO dbo.<dest-table>
  (<col1>, <col2>)
SELECT <col1>, <col2>
  FROM dbo.<source-table>
 WHERE <condition>

Other Operations

Step 3 above illustrates how to move data between servers. In fact, after you've created your external table, you can join, query, and do almost everything else as if the remote table were present on the source server.
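For instance, here's a hypothetical join between a local table and the external table (table, column, and key names are illustrative; replace them with your own):

```sql
-- joins local rows against the remote table as if both lived in the same database
SELECT l.<col1>, r.<col2>
  FROM [dbo].[YourLocalTable] l
  JOIN [dbo].[RemoteTbl] r
    ON r.<key> = l.<key>;
```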


In this post we reviewed how we can easily migrate data between databases using SQL Server's PolyBase feature. There's a lot more on this topic and I encourage you to research and learn more about it. Hope it helps.

Source Code

As always, the full source code for this article is available on my GitHub.


See Also

For other posts about Azure on this blog, please click here.

Monday, May 21, 2018

Seven Databases in Seven Weeks, 2nd Edition

The book is an interesting overview of the databases being used in different fields throughout the world.
You may or may not have heard about polyglot persistence. The fact is that more and more, software projects are making use of different technologies. And when it comes to the database world, the number of options is immense: it can be relational, document or columnar databases, key-value stores, graph databases, not to mention other cloud infrastructure options like service buses, blobs, storage accounts and queues.

What to use? Where? And how does that affect developers?

Choosing a database is perhaps one of the most important architectural decisions a developer can make. In that regard, I'd like to recommend a very interesting book that addresses some of that discussion: Seven Databases in Seven Weeks, 2nd Edition.

Why you should read this book

Because the book:
  • provides practical and conceptual introductions to Redis, Neo4J, CouchDB, MongoDB, HBase, PostgreSQL, and DynamoDB
  • introduces you to different technologies, encouraging you to run your own experiments
  • reviews important topics like the CAP Theorem
  • gives you an overview of what’s out there so you can choose the best tool for the job
  • explores some cutting-edge databases available - from traditional relational databases to newer NoSQL approaches
  • helps you make informed decisions about challenging data storage problems
  • tackles a real-world problem for each database, highlighting the concepts and features that make it shine


Whether you're a programmer building the next big thing, a data scientist seeking solutions to thorny problems, or a technology enthusiast venturing into new territory, you will find something to inspire you in this book.


See Also

About the Author

Bruno Hildenbrand