Trigger Synapse / Azure Data factory Pipeline via SQL Server on Events within the Database

Problem Statement :

In my previous blog Event Trigger Data Sync from SQL Server to Synapse via Azure Data Factory / Synapse Pipeline, we were able to trigger a Synapse / ADF pipeline via SQL server leveraging a Dedicated SQL Pool.

Is it possible to instantaneously trigger Synapse / ADF Pipeline in case of any data changes (Insert / Update /Delete) within a table in SQL server database without leveraging Linked server/ Dedicated SQL Pool.

Prerequisites :

  1. SQL Server ( 2022 )
  2. Azure Blob Storage
  3. Synapse / Data Factory

Solution :

To trigger Synapse / ADF pipeline via SQL server, we would be leveraging CETAS (Polybase) feature of SQL server 2022.

  1. Ensure Polybase is Enabled on the database via which we want to trigger the Pipelines.
exec sp_configure @configname = 'polybase enabled', @configvalue = 1;

RECONFIGURE;

Run the following command and verify the return value is 1 to ensure PolyBase is installed.

SELECT SERVERPROPERTY ('IsPolyBaseInstalled') AS IsPolyBaseInstalled;

2. To Leverage CETAS functionality, enable Polybase Export by executing the below command

exec sp_configure @configname = 'allow polybase export', @configvalue = 1;

RECONFIGURE;

3. Set up External Data Source and File format for creation of an External table

a) Create a Master Key (If already not exists)

CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<<>>';

b) Create Database Scoped Credential to Azure Blob storage

CREATE DATABASE SCOPED CREDENTIAL [BlobSAS]
WITH
IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = 'sv=2022-11-02<<>>' ;
GO

Generate the SAS by Selecting Container & Object in Allowed Resource Types

Note : Access Key is not supported in SQL 2022 and only SAS is supported.

C) Create External Data Source

CREATE  EXTERNAL DATA SOURCE [BlobSource]
WITH
( LOCATION = 'abs://<<BlobName>>.blob.core.windows.net/<<ContainerName>>' ,
CREDENTIAL = [BlobSAS],
) ;

D) Create External File Format

CREATE EXTERNAL FILE FORMAT [CommaDelimited] WITH
(
FORMAT_TYPE = DELIMITEDTEXT,
FORMAT_OPTIONS (FIELD_TERMINATOR = N',',
STRING_DELIMITER = N'"',
FIRST_ROW = 2,
USE_TYPE_DEFAULT = True)
)

4. The purpose behind #3 is for us to leverage this in the CETAS creation which in turn would generate a file in the Azure blob storage which we can leverage for Blob trigger purpose. So next action is to create a Stored procedure in SQL Server which can be triggered in case of any data change.

CREATE PROCEDURE dbo.EventTriggerPipeline
AS
IF EXISTS ( SELECT 1 FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'EventTrigger' AND TABLE_SCHEMA='dbo')
    DROP EXTERNAL TABLE dbo.EventTrigger




CREATE EXTERNAL TABLE dbo.EventTrigger
WITH (
LOCATION='Adhoc/',
DATA_SOURCE=[BlobSource],
FILE_FORMAT =[CommaDelimited]

)
AS
SELECT GETDATE() AS Event

5. Set up Blob Event Type trigger on the Pipeline which needs to be triggered.

Note: The Blob path begins is equivalent to the Location specified in CETAS script in #4.

Also, at the end of pipeline, make sure to delete the file via which the Pipeline was triggered via Delete activity to avoid the below error :

Cannot create external table. External table location already exists.

6. The next aspect would be to set up Trigger in the SQL server on the table whose data changes needs to trigger the pipeline.

Create Table dbo.DataSharkX
(
C1 Varchar(10),
C2 Varchar(10)
);

Create Trigger EventPipe
On dbo.DataSharkx
AFTER INSERT,UPDATE,DELETE
AS
BEGIN
	EXEC dbo.EventTriggerPipeline
END

So the set up is established which would flow in the below order whenever any Data changes (Insert / Update / Delete) happens within the SQL server table :

1) SQL Trigger activated => 2) SQL Server Stored Procedure Executed => 3) External Table creation (CETAS) => 4) File Creation in Blob => 5) ADF/Synapse Event Trigger activated => 6) ADF/Synapse Pipeline triggered => 7) Delete files via Delete activity at end

Result :

a) Insert Data into SQL Server table

1 record was inserted into table and files were generated in blob via CETAS

Published by Nandan Hegde

Microsoft Data MVP |Microsoft Data platform Architect | Blogger | MSFT Community Champion I am a MSFT Data Platform MVP and Business Intelligence and Data Warehouse professional working within the Microsoft data platform eco-system which includes Azure Synapse Analytics ,Azure Data Factory ,Azure SQL Database and Power BI. To help people keep up with this ever-changing landscape, I frequently posts on LinkedIn, Twitter and to his blog on https://datasharkx.wordpress.com. LinkedIn Profile : www.linkedin.com/in/nandan-hegde-4a195a66 GitHUB Profile : https://github.com/NandanHegde15 Twitter Profile : @nandan_hegde15 MSFT MVP Profile : https://mvp.microsoft.com/en-US/MVP/profile/8977819f-95fb-ed11-8f6d-000d3a560942

Leave a comment

Design a site like this with WordPress.com
Get started