Configuring incremental data updates using Azure Data Factory
    #ServerlessTips - Azure Data Factory
Author: Dave McCollough, Technical Consultant

    In this article, we will incrementally move data from an Azure SQL Database to Azure Blob storage using Azure Data Factory.

    Prerequisites

• An active Azure subscription. If you don't have a subscription, you can sign up for a free account.
    • Azure Data Factory
    • Azure SQL Database
    • Azure Blob Storage Account
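
The walkthrough also assumes a source table with a column that can identify new rows. If you need one to experiment with, here is a minimal sketch using pyodbc; the server, database, credentials, and the dbo.Orders table are all placeholder names, not values from this article.

    # Minimal sketch (assumed names throughout): create a source table whose
    # LastModifiedDate column can later drive the incremental load.
    import pyodbc

    conn = pyodbc.connect(
        "Driver={ODBC Driver 18 for SQL Server};"
        "Server=tcp:yourserver.database.windows.net,1433;"
        "Database=yourdatabase;Uid=youruser;Pwd=yourpassword;Encrypt=yes;"
    )
    cursor = conn.cursor()
    cursor.execute("""
    IF OBJECT_ID('dbo.Orders', 'U') IS NULL
    CREATE TABLE dbo.Orders (
        OrderId          INT IDENTITY(1,1) PRIMARY KEY,
        CustomerName     NVARCHAR(100) NOT NULL,
        Amount           DECIMAL(10,2) NOT NULL,
        -- A monotonically increasing timestamp gives the data flow a way
        -- to detect rows added since the previous run.
        LastModifiedDate DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME()
    );
    """)
    conn.commit()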

    Configure the Pipeline

    1. Open Azure Data Factory Studio
    2. Select Author from the side navigation bar


3. Click the ellipsis next to Data Flows and select New Data Flow

4. Click Add Source

5. Click + New to create a new Dataset

6. Select Azure SQL Database and click Continue

7. Click + New Linked service

8. Ensure the following fields are populated:
• Azure subscription
• Server name
• Database name
• Authentication type
• User name/password (depending on the selected authentication type)

9. Click Create
10. Select your Table name and click OK


11. Your data source has now been configured
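
The same source setup can also be scripted. Below is a sketch using the azure-mgmt-datafactory Python SDK to create the linked service and dataset from steps 5 through 10; the subscription ID, resource group, factory name, connection string, and resource names are all assumptions for illustration.

    # Sketch only: programmatic counterpart of steps 5-10. All names
    # (resource group, factory, connection string) are placeholders.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import (
        LinkedServiceResource, AzureSqlDatabaseLinkedService,
        DatasetResource, AzureSqlTableDataset, LinkedServiceReference,
    )

    adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
    rg, factory = "my-rg", "my-adf"

    # Linked service pointing at the Azure SQL Database (steps 7-9).
    adf.linked_services.create_or_update(
        rg, factory, "AzureSqlLinkedService",
        LinkedServiceResource(properties=AzureSqlDatabaseLinkedService(
            connection_string="Server=tcp:yourserver.database.windows.net;"
                              "Database=yourdatabase;User ID=youruser;"
                              "Password=yourpassword;"
        )),
    )

    # Dataset over the source table (step 10).
    adf.datasets.create_or_update(
        rg, factory, "SourceTableDataset",
        DatasetResource(properties=AzureSqlTableDataset(
            linked_service_name=LinkedServiceReference(
                type="LinkedServiceReference",
                reference_name="AzureSqlLinkedService"),
            table_name="dbo.Orders",
        )),
    )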

12. In this step, we will configure the incremental load. Select the Source options tab from the bottom panel.

13. Check the Change data capture checkbox

14. From the Column name dropdown, select the column that will identify rows added since the previous run, then select Full on the first run, then incremental from the Run mode dropdown
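
Conceptually, this option keeps a high-watermark checkpoint for the chosen column. The sketch below is not what Data Factory runs internally; it only illustrates the full-then-incremental pattern by hand, reusing the hypothetical dbo.Orders table and a local checkpoint file.

    # Illustration of the pattern only: full load when no checkpoint exists,
    # incremental (rows past the watermark) afterwards. Names are placeholders.
    import json, os
    import pyodbc

    CHECKPOINT = "watermark.json"
    conn = pyodbc.connect(
        "Driver={ODBC Driver 18 for SQL Server};"
        "Server=tcp:yourserver.database.windows.net,1433;"
        "Database=yourdatabase;Uid=youruser;Pwd=yourpassword;Encrypt=yes;"
    )
    cur = conn.cursor()

    if not os.path.exists(CHECKPOINT):
        cur.execute("SELECT * FROM dbo.Orders")  # first run: full load
    else:
        last = json.load(open(CHECKPOINT))["last_value"]
        cur.execute(  # later runs: only rows newer than the stored watermark
            "SELECT * FROM dbo.Orders WHERE LastModifiedDate > ?", last)
    rows = cur.fetchall()

    # Advance the watermark so the next run starts where this one stopped.
    cur.execute("SELECT MAX(LastModifiedDate) FROM dbo.Orders")
    with open(CHECKPOINT, "w") as f:
        json.dump({"last_value": str(cur.fetchone()[0])}, f)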

15. Click the + next to the source data flow and select Sink from the dropdown

16. The next step is to create a new dataset for the Sink. Click + New

17. Select Azure Blob Storage and click Continue

18. Select DelimitedText and click Continue

19. Select + New from the Linked service dropdown

20. Select your Azure subscription and Storage account name, then click Create

21. Select From root from the File path dropdown

22. Select Root folder and click OK

23. Click OK on the Set Properties screen
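
Steps 16 through 23 can likewise be scripted. A sketch with assumed names (the "output" container, the dataset and linked service names, and the account key are placeholders):

    # Sketch only: blob linked service and DelimitedText (CSV) sink dataset,
    # the programmatic counterpart of steps 16-23. Names are placeholders.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import (
        LinkedServiceResource, AzureBlobStorageLinkedService,
        DatasetResource, DelimitedTextDataset, AzureBlobStorageLocation,
        LinkedServiceReference,
    )

    adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
    rg, factory = "my-rg", "my-adf"

    adf.linked_services.create_or_update(
        rg, factory, "BlobLinkedService",
        LinkedServiceResource(properties=AzureBlobStorageLinkedService(
            connection_string="DefaultEndpointsProtocol=https;"
                              "AccountName=youraccount;AccountKey=<key>;"
                              "EndpointSuffix=core.windows.net",
        )),
    )

    adf.datasets.create_or_update(
        rg, factory, "SinkCsvDataset",
        DatasetResource(properties=DelimitedTextDataset(
            linked_service_name=LinkedServiceReference(
                type="LinkedServiceReference",
                reference_name="BlobLinkedService"),
            location=AzureBlobStorageLocation(container="output"),  # "From root"
            column_delimiter=",",
            first_row_as_header=True,
        )),
    )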

24. Click on Pipelines and select New pipeline

25. Drag your Dataflow into the pipeline
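
Dragging the data flow onto the canvas corresponds to adding an Execute Data Flow activity to the pipeline. A sketch of the same step via the SDK, assuming the data flow was saved as "dataflow1" and the pipeline is named "pipeline1":

    # Sketch only: a pipeline wrapping the data flow in an Execute Data Flow
    # activity. "dataflow1" and "pipeline1" are assumed names.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import (
        PipelineResource, ExecuteDataFlowActivity, DataFlowReference,
    )

    adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
    rg, factory = "my-rg", "my-adf"

    adf.pipelines.create_or_update(
        rg, factory, "pipeline1",
        PipelineResource(activities=[ExecuteDataFlowActivity(
            name="IncrementalCopy",
            data_flow=DataFlowReference(
                type="DataFlowReference", reference_name="dataflow1"),
        )]),
    )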

26. Click Publish all

27. Click Publish

28. When publishing is complete, click Add trigger and select Trigger now
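
Trigger now starts an on-demand run. Programmatically, this maps to create_run plus polling the run status; a sketch, reusing the assumed names from above:

    # Sketch only: trigger the pipeline on demand and poll until it finishes.
    import time
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient

    adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
    rg, factory = "my-rg", "my-adf"

    run = adf.pipelines.create_run(rg, factory, "pipeline1")
    status = "InProgress"
    while status in ("Queued", "InProgress"):
        time.sleep(15)
        status = adf.pipeline_runs.get(rg, factory, run.run_id).status
    print(f"Pipeline run {run.run_id} finished with status: {status}")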

29. Browse to your Storage account to validate that the pipeline ran successfully and that a .csv file has been created

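You can also validate from code instead of the portal. A sketch using azure-storage-blob, assuming the same placeholder storage account and an "output" container:

    # Sketch only: list blobs in the sink container to confirm the .csv landed.
    from azure.storage.blob import BlobServiceClient

    svc = BlobServiceClient.from_connection_string(
        "DefaultEndpointsProtocol=https;AccountName=youraccount;"
        "AccountKey=<key>;EndpointSuffix=core.windows.net"
    )
    for blob in svc.get_container_client("output").list_blobs():
        print(blob.name, blob.size, blob.last_modified)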

30. In this next step, update your Azure SQL database with additional records (see the sketch below)
31. Navigate back to Azure Data Factory Studio and run your trigger again
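
A sketch of step 30, reusing the hypothetical dbo.Orders table and placeholder connection values from earlier, so the next incremental run has new rows to pick up:

    # Hypothetical example: add rows to the source table so the next
    # incremental run detects new data. Connection values are placeholders.
    import pyodbc

    conn = pyodbc.connect(
        "Driver={ODBC Driver 18 for SQL Server};"
        "Server=tcp:yourserver.database.windows.net,1433;"
        "Database=yourdatabase;Uid=youruser;Pwd=yourpassword;Encrypt=yes;"
    )
    cur = conn.cursor()
    cur.executemany(
        "INSERT INTO dbo.Orders (CustomerName, Amount) VALUES (?, ?)",
        [("Contoso", 125.50), ("Fabrikam", 89.99)],
    )
    conn.commit()  # LastModifiedDate defaults to SYSUTCDATETIME() for these rows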

32. After the pipeline has successfully run, browse to your Storage account and confirm that the new records were captured in the output file

    Summary

In this article, we used Azure Data Factory to incrementally update data from an Azure SQL database to an Azure Storage account.

