
Importing data to Azure Cosmos DB using Data migration tool
In this section, we will import our data from SQL server to Cosmos DB on Azure. We are using SQL API in our demonstration.
I am going to use AdventureWorks2016 sample database, will import this database to cosmos DB using Data migration tool. You can download this database from the GitHub repository from the following URL:
On download, we get the .bak file, which has to be moved in our MSSQL Backup folder as below:
Step -1 Restore Database
Now I have restored this database to my on-prem SQL server database. I did this by using restore and move SQL query. You can do this by restore database wizard also.
So, I have AdventureWorks database available in my database list and I am querying some data to be imported on cosmos. You can choose the data as per your business need.
Here we see several tables and views of AdventureWorks database, I am going to use View “vStoreWithAddresses”
You might have noticed that the names of the views and tables are in dot notation where first part denotes the primary entity name.
For example, "vSalesPerson" is prefixed by Sales Entity. And therefore, the full name is [Sales.vSalesPerson]
This view joins many tables and returns 712 rows on selecting.
Similarly, if we fetch those addresses where adresstype = ‘Main office’, we will get 701 rows. But we need to modify this query in dot notation so that we can use partition key in cosmos DB to display the denormalized data in JSON format.
USE [AdventureWorks2016]
GO
SELECT CAST([BusinessEntityID] as varchar) as [id]
,[Name] as [name]
,[AddressType] as [address.addressType]
,[AddressLine1] as [address.addressLine1]
,[AddressLine2] as [address.addressLine2]
,[City] as [address.location.city]
,[StateProvinceName] as [address.location.stateProvinceName]
,[PostalCode] as [address.postalCode]
,[CountryRegionName] as [address.countryRegionName]
FROM [Sales].[vStoreWithAddresses]
WHERE [AddressType] = 'Main Office'
GO
In above query Address Type, line1, Line2, postalCode, countryRegionName are part of the Address and therefore prefixed by address, and City and StateProvineName are part of the Location.
We will import this to cosmos Db and will see how cosmos DB leverages this notation and gives us the nested format of data.
So as the result of above query we get 701 filtered rows as below:
Step-2 Download the Document DB data Migration tool
Data migration tool can be downloaded from Microsoft download center. Below is the URL:
You will get a compressed folder in your download file location. Extract it and run the dtui.exe for GUI based installation.
Step -3 Run the Document DB Data Migration Tool
Once you extract the files, you will see two executables files dt.exe and dtui, select dtui.exe for windows using wizard as below-
Double click on this setup and start the configuration settings.
It will pop up a dialog box, click on next, and we select SQL from “import from” dropdown because our data source is SQL . We enter a dot(.) as the nested separator. because we want to use it to determine the nested data. Here .(dot) will work as a delimiter for us.
We click next, and on the next screen, we have to enter the connection string to connect with our SQL server.
I am trying to connect with AdventureWorks2016 database from my local SQL server instance. You can try with any database you want to query with cosmos DB.
My connection string is data source=.;initial catalog=AdventureWorks2016;integrated security=true
Now copy paste the select view query as below:
Click on Next and in the next screen, we will enter the details of our cosmos DB account.
Go to your cosmos DB account on Azure portal. Search Key and copy the primary Key and paste in the data migration tool.
Copy paste this connection string in Data migration tool and remember to append this connection string with the database name. Because this string points to your cosmos DB account, not the database. So, you need to specify the database name explicitly in which you want to import the data.
Here I am adding my connection string with Database=mydb;
This DB does not exist, so Cosmos DB creates it for us. Similarly, I am entering mystore for the collection. This collection does not exist so it will be generated by cosmos db .
We specify a partition key as : /address/postalCode
If you expand Advanced Options, you will see a Text area to enter index policy.
Right click on it and you will see two options – default and Range. We will use Default for now.
Click Next few screens and Click on Import, the tool will start importing the result of the query to the Cosmos DB in JSON document format.
Now go to your Cosmos DB account and refresh it. In the overview blade, we can see a new collection.
clicking on this collection we can see our data imported in the form of documents.
By clicking on every document, you can see the entire JSON formatted document on right side area, you can easily observe the structure with nested Address and Location properties and this is because we used dot limiter for our column alias in our sql server view.
For example, location is further nested into city and stateprovincename. This looks similar to a normalized separate table in sql, but here denormalized data managed in the form of nested elements, as we can see in below example.
So, all set up now, we get successfully imported the SQL data to cosmos DB.
Now we are ready to query with SQL API.
Happy Learning with CosmosDB !! 😊
No comments:
Post a Comment