Automated Way to get the ID of User Manually Triggering the Azure Data Factory Pipeline

Problem Statement :

As of 18 May 2023, ADF has no out-of-the-box feature or system variable that returns the ID or details of the user who manually triggered an ADF pipeline.

We do have a system variable called Pipeline Trigger Type, but it only reports how the run was started (Manual or a schedule trigger). For a manual run it does not provide the user details.

The same is the case with the Monitor tab, which also does not show who started a manual run:

The user ID details can be retrieved manually from the Activity Log.

Go to the Activity Log of the particular ADF and filter on the operation name Create Pipeline Run. The matching entries show when the pipeline was triggered and by whom (Event initiated by).

So, is there an automated way to get the ID of the user manually triggering the Azure Data Factory pipeline?

Prerequisites :

  1. Azure Data Factory

Solution :

  1. We will leverage the Activity Logs REST API (Activity Logs - List) to get the user ID details.
  2. Grant the Data Factory Reader access on the resource group in which it is created, so that the pipeline can authenticate to the API via Managed Identity.

a) Go to Access control (IAM) of the resource group, click Add and select Add role assignment.

b) Search for the Reader role and proceed further, assigning it to the Data Factory's managed identity.

3. We need the Activity Log entries in the resource group associated with the ADF that were logged after the pipeline trigger time, and only for the specific pipeline whose runs we want to trace.

So we modify the URL to filter on eventTimestamp and resourceUri:

https://management.azure.com/subscriptions/<<SubscriptionID>>/providers/microsoft.insights/eventtypes/management/values?api-version=2015-04-01&$filter=eventTimestamp ge '<<Pipeline Trigger time>>' and resourceUri eq '/subscriptions/<<SubscriptionID>>/resourcegroups/<<ResourceGroupName>>/providers/Microsoft.DataFactory/factories/<<DataFactoryName>>/pipelines/<<PipelineName>>'
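To make the shape of this URL easier to see, here is a minimal Python sketch that assembles it outside ADF. The function name build_activity_log_url and its parameter names are hypothetical, and percent-encoding the filter is an assumption made for a plain HTTP client; it is not something the pipeline expressions in this post do explicitly.

```python
from urllib.parse import quote


def build_activity_log_url(subscription_id, resource_group, factory, pipeline, trigger_time):
    """Return the Activity Logs - List URL filtered by trigger time and pipeline."""
    # Resource URI of the pipeline, in the same form as in the URL above.
    resource_uri = (
        f"/subscriptions/{subscription_id}/resourcegroups/{resource_group}"
        f"/providers/Microsoft.DataFactory/factories/{factory}/pipelines/{pipeline}"
    )
    # Filter on the event timestamp and the resource URI, both in single quotes.
    odata_filter = (
        f"eventTimestamp ge '{trigger_time}' and resourceUri eq '{resource_uri}'"
    )
    # Assumption: percent-encode the filter so that spaces and quotes are URL-safe.
    return (
        f"https://management.azure.com/subscriptions/{subscription_id}"
        "/providers/microsoft.insights/eventtypes/management/values"
        f"?api-version=2015-04-01&$filter={quote(odata_filter, safe='')}"
    )


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    print(build_activity_log_url("sub-123", "rg-demo", "adf-demo", "pl-demo",
                                 "2023-05-18T10:00:00Z"))
```

The trigger time is passed in as an already formatted UTC string, matching the yyyy-MM-ddTHH:mm:ss plus Z format used in the pipeline expression.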

4.

GitHub Repo

5. First we need to generate the resource URI (the quoted resourceUri value in the filter above) and store it in a pipeline variable named ResourceURI.

Value : @concat('''','/subscriptions/',pipeline().parameters.SubscriptionID,'/resourcegroups/',pipeline().parameters.ResourceGroupName,'/providers/Microsoft.DataFactory/factories/',pipeline().DataFactory,'/pipelines/',pipeline().Pipeline,'''')

6. Leverage a Web activity (named GetActivityLogs in the expressions that follow) to call the REST API.

URL :

@concat('https://management.azure.com/subscriptions/',pipeline().parameters.SubscriptionID,'/providers/microsoft.insights/eventtypes/management/values?api-version=2015-04-01&$filter=eventTimestamp ge ','''',formatDateTime(pipeline().TriggerTime, 'yyyy-MM-ddTHH:mm:ss'),'Z' ,'''' ,' and resourceUri eq ', variables('ResourceURI'))

Method : GET

Authentication : System Assigned Managed Identity

Resource : https://management.azure.com
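For readers who want to try the same call outside ADF, the following Python sketch shows roughly what the Web activity sends. It assumes you already hold an access token issued for https://management.azure.com; acquiring that token is not shown, and the function names build_request and fetch_activity_logs are hypothetical.

```python
import json
import urllib.request


def build_request(url, access_token):
    """Build the GET request with a bearer token in the Authorization header."""
    # Assumption: access_token was issued for https://management.azure.com.
    return urllib.request.Request(
        url,
        method="GET",
        headers={"Authorization": f"Bearer {access_token}"},
    )


def fetch_activity_logs(url, access_token):
    """Send the request and return the parsed JSON body (real network call)."""
    with urllib.request.urlopen(build_request(url, access_token)) as response:
        return json.load(response)
```

Inside the pipeline none of this is needed, since the Web activity obtains the token through the System Assigned Managed Identity.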

7. Activity Log entries are not synced the instant an event is generated. Because of that, the Web activity may return no results on its first execution, and the call has to be repeated until the user details are available (after the data is synchronized in the Activity Log).

So we use an If Condition activity to check whether the Web activity returned any output. If there is no output, wait for some time before calling the REST API again, and use an Until activity to repeat this until the details are returned.

Expression :

@empty(activity('GetActivityLogs').output.value)

If True (meaning the output of the Web activity is empty), wait for 60 seconds.

If False (the Web activity has an output), get the user details.

Value :

@activity('GetActivityLogs').output.value[0].caller
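The empty check and the caller lookup can be pictured with a small Python sketch. The helper name get_caller is hypothetical, and the sample response is a heavily trimmed, made-up payload that keeps only the value array and the caller field used by the expressions above.

```python
def get_caller(activity_log_response):
    """Return the caller of the first event, or None when no events came back."""
    events = activity_log_response.get("value", [])
    if not events:
        # Mirrors the True branch: nothing synced yet, so wait and retry.
        return None
    # Mirrors output.value[0].caller in the False branch.
    return events[0].get("caller")


if __name__ == "__main__":
    # Hypothetical sample data; real events carry many more fields.
    sample = {"value": [{"caller": "user@example.com"}]}
    print(get_caller(sample))
    print(get_caller({"value": []}))
```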

UNTIL Activity :

Expression :

@if(empty(activity('GetActivityLogs').output.value),false,true)

To avoid infinite iterations within the Until activity, you can add counter logic with a maximum iteration count, as described in the blog:

Avoidance of Infinite Iterations of Until activity within Azure Data Factory / Synapse Pipelines
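As a rough illustration of the Until loop combined with an iteration cap, here is a minimal Python sketch. The names wait_for_caller, fetch_logs, max_attempts and wait_seconds are hypothetical, and the default of 10 attempts is an arbitrary assumption; only the 60-second wait comes from the steps above.

```python
import time


def wait_for_caller(fetch_logs, max_attempts=10, wait_seconds=60, sleep=time.sleep):
    """Poll fetch_logs() until it returns events or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        events = fetch_logs().get("value", [])
        if events:
            # Same lookup as the pipeline: caller of the first event.
            return events[0].get("caller")
        if attempt < max_attempts:
            # No entries synced yet: wait before calling the API again.
            sleep(wait_seconds)
    # The cap was hit without any result, so stop instead of looping forever.
    return None
```

Here fetch_logs is any zero-argument callable returning the parsed API response, and the sleep parameter exists only so the wait can be stubbed out when testing.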

Output :

Web Activity Input :

Web Activity Output :

Limitations :

  1. There is no direct link between the pipeline run ID and the other details in the Activity Log. So if two users execute the same pipeline in parallel, the above logic may not return the correct details: it takes the caller from the first returned entry, so the logs of both pipeline runs might show the same user name.

Note : One can also use Azure Monitor with Kusto Query Language in Azure Log Analytics (if diagnostic settings are enabled).

Published by Nandan Hegde


2 thoughts on “Automated Way to get the ID of User Manually Triggering the Azure Data Factory Pipeline”

  1. Thanks for sharing this. Is it possible to do the same for Synapse pipelines? Unfortunately, I do not see pipeline runs in audit logs for Synapse.
