Configuring incremental data updates using Azure Data Factory
In this article, we will incrementally move data from an Azure SQL Database to Azure Blob storage using Azure Data Factory.
Prerequisites
- An active Azure subscription. If you don't have one, you can sign up for a free account.
- Azure Data Factory
- Azure SQL Database
- Azure Blob Storage Account
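If you want to confirm these resources already exist in your subscription before starting, the following is a minimal sketch using the Azure SDK for Python, assuming the azure-identity and azure-mgmt-resource packages are installed and that you are already signed in to Azure; the subscription ID is a placeholder.

```python
# Minimal sketch: check the subscription for the prerequisite resource types.
# Assumes azure-identity and azure-mgmt-resource are installed and a sign-in
# (e.g. `az login`) is available. The subscription ID is a placeholder.
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

SUBSCRIPTION_ID = "<your-subscription-id>"  # placeholder

credential = DefaultAzureCredential()
client = ResourceManagementClient(credential, SUBSCRIPTION_ID)

# Resource types corresponding to the prerequisites listed above.
required_types = [
    "Microsoft.DataFactory/factories",
    "Microsoft.Sql/servers/databases",
    "Microsoft.Storage/storageAccounts",
]

for resource_type in required_types:
    matches = list(client.resources.list(filter=f"resourceType eq '{resource_type}'"))
    status = "found" if matches else "MISSING"
    print(f"{resource_type}: {status} ({len(matches)} resource(s))")
```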
Configure the Pipeline
1. Open Azure Data Factory Studio
2. Select Author from the side navigation bar
3. Click the ellipsis next to Data Flows and select New Data Flow
4. Click Add Source
5. Click + New to create a new Dataset
6. Select Azure SQL Database and click Continue
7. Click on + New Linked service
8. Ensure the following fields are populated:
   - Azure subscription
   - Server name
   - Database name
   - Authentication type
   - User name/password (depending on selected authentication type)
9. Click Create
10. Select your Table name and click OK
11. Your data source has now been configured
12. In this step, we will configure the incremental load. Select the Source options tab from the bottom panel.
13. Check the Change data capture checkbox
14. From the Column name dropdown, select the column you want to use to identify rows added since the previous run, and from the Run mode dropdown select Full on the first run, then incremental (the watermark sketch after this procedure illustrates the pattern this setting automates)
15. Click the + next to the source dataflow and select Sink from the dropdown
16. The next step is to create a new dataset for the Sink. Click + New
17. Select Azure Blob Storage and click Continue
18. Select DelimitedText and click Continue
19. Select + New from the Linked service dropdown
20. Select your Azure subscription and Storage account name, then click Create
21. Select From root from the File path dropdown
22. Select Root folder and click OK
23. Click OK on the Set Properties screen
24. Click Pipelines and select New pipeline
25. Drag your Dataflow into the pipeline
26. Click Publish all
27. Click Publish
28. When publishing is complete, click Add trigger and select Trigger now (a programmatic alternative is sketched after this procedure)
29. Browse to your Storage account to validate that the pipeline triggered successfully and that the .csv file has been created
30. Next, update your Azure SQL database with additional records (see the test-data sketch after this procedure)
31. Navigate back to Azure Data Factory Studio and run your trigger again
32. After the pipeline has run successfully, browse to your Storage account to validate that the newly added records were picked up
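For reference, the Change data capture setting in step 14 automates a watermark pattern: remember the highest value of the chosen column from the previous run, read everything on the first run, and read only rows beyond the stored value on later runs. Below is a minimal sketch of that pattern outside Data Factory, assuming pyodbc is installed; the table (dbo.Orders) and watermark column (OrderId, an increasing integer key) are illustrative names, not objects from this walkthrough.

```python
# Minimal sketch of the watermark pattern that step 14 automates, assuming
# pyodbc is installed and the source database is reachable. The table
# (dbo.Orders) and watermark column (OrderId) are hypothetical examples.
import json
import os

import pyodbc

CONN_STR = (
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<server>.database.windows.net;Database=<db>;Uid=<user>;Pwd=<password>"
)  # placeholder connection string
WATERMARK_FILE = "watermark.json"  # persists the last-seen key between runs


def load_watermark():
    """Return the stored watermark, or None on the very first run."""
    if os.path.exists(WATERMARK_FILE):
        with open(WATERMARK_FILE) as f:
            return json.load(f)["last_value"]
    return None


def save_watermark(value):
    with open(WATERMARK_FILE, "w") as f:
        json.dump({"last_value": value}, f)


def extract_rows():
    watermark = load_watermark()
    conn = pyodbc.connect(CONN_STR)
    cursor = conn.cursor()
    if watermark is None:
        # First run: full load ("Full on the first run").
        cursor.execute("SELECT * FROM dbo.Orders")
    else:
        # Later runs: only rows added since the previous run (incremental).
        cursor.execute("SELECT * FROM dbo.Orders WHERE OrderId > ?", watermark)
    rows = cursor.fetchall()

    # Advance the watermark to the highest key seen so far.
    cursor.execute("SELECT MAX(OrderId) FROM dbo.Orders")
    newest = cursor.fetchone()[0]
    if newest is not None:
        save_watermark(newest)

    conn.close()
    return rows


if __name__ == "__main__":
    print(f"Extracted {len(extract_rows())} row(s)")
```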
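For steps 28 and 31, you can also start the published pipeline from code instead of the Trigger now button. This is a minimal sketch, assuming the azure-identity and azure-mgmt-datafactory packages are installed; the resource group, factory, and pipeline names are placeholders, not values from this article.

```python
# Minimal sketch for steps 28 and 31: start the published pipeline from code.
# Assumes azure-identity and azure-mgmt-datafactory are installed; names below
# are placeholders.
import time

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

SUBSCRIPTION_ID = "<your-subscription-id>"  # placeholder
RESOURCE_GROUP = "<your-resource-group>"    # placeholder
FACTORY_NAME = "<your-data-factory>"        # placeholder
PIPELINE_NAME = "<your-pipeline>"           # placeholder

credential = DefaultAzureCredential()
adf_client = DataFactoryManagementClient(credential, SUBSCRIPTION_ID)

# Kick off a pipeline run (equivalent to Add trigger > Trigger now).
run = adf_client.pipelines.create_run(
    RESOURCE_GROUP, FACTORY_NAME, PIPELINE_NAME, parameters={}
)

# Poll until the run finishes.
while True:
    pipeline_run = adf_client.pipeline_runs.get(
        RESOURCE_GROUP, FACTORY_NAME, run.run_id
    )
    if pipeline_run.status not in ("Queued", "InProgress"):
        break
    time.sleep(15)

print(f"Pipeline run {run.run_id} finished with status: {pipeline_run.status}")
```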
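Finally, for steps 30 and 32, the sketch below shows one way to insert a few test rows into the source table and then confirm that new .csv output arrived in the destination container, assuming pyodbc and the azure-storage-blob package are installed; the table, columns, container name, and connection strings are placeholders, not values from this article.

```python
# Minimal sketch for steps 30 and 32: insert test rows into the source table,
# then list blobs in the destination container to confirm new output.
# Table (dbo.Orders), columns, container name, and connection strings are
# hypothetical placeholders.
import pyodbc
from azure.storage.blob import BlobServiceClient

SQL_CONN_STR = (
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<server>.database.windows.net;Database=<db>;Uid=<user>;Pwd=<password>"
)  # placeholder
BLOB_CONN_STR = "<storage-account-connection-string>"  # placeholder
CONTAINER = "<container-name>"  # placeholder

# Step 30: add a few new records to the source table.
conn = pyodbc.connect(SQL_CONN_STR)
cursor = conn.cursor()
for customer in ("Contoso", "Fabrikam", "Tailwind"):
    cursor.execute(
        "INSERT INTO dbo.Orders (CustomerName, Amount) VALUES (?, ?)", customer, 100
    )
conn.commit()
conn.close()

# Step 32: after re-running the trigger, list the .csv output in the container.
service = BlobServiceClient.from_connection_string(BLOB_CONN_STR)
container = service.get_container_client(CONTAINER)
for blob in container.list_blobs():
    if blob.name.endswith(".csv"):
        print(blob.name, blob.last_modified, blob.size)
```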
Summary
In this article, we used Azure Data Factory to incrementally move data from an Azure SQL database to an Azure Storage account.