Problem Statement :
As of 18 May 2023, ADF has no out-of-the-box feature or system variable that returns the ID or details of the user who manually triggers an ADF pipeline.
There is a system variable called Pipeline Trigger Type,
but it only reports how the run was started (Manual, or the type of trigger, for example ScheduleTrigger); for a Manual run it gives no user details.
The same is true of the Monitor tab, which does not show who started the run.
The user details can be looked up manually in the Activity Log.
Go to the Activity Log of the particular ADF and filter on the operation name Create Pipeline Run; the matching entries show when the pipeline was triggered and by whom (Event initiated by).
So, is there an automated way to get the ID of the user who manually triggers an Azure Data Factory pipeline?
Prerequisites :
- Azure Data Factory
Solution :
1. We will use the Activity Logs REST API (Activity Logs – List) to get the user ID details.
2. Grant the Data Factory the Reader role on the resource group in which it was created, so that it can authenticate via its managed identity.
a) Go to Access control (IAM) of the resource group, click Add, and select Add role assignment.
b) Search for the Reader role, assign it to the Data Factory's managed identity, and proceed.
3. We need the Activity Log entries for the resource group associated with the ADF, from the pipeline trigger time onward and only for the specific pipeline whose runs we want to capture.
So we modify the URL as shown below, filtering on eventTimestamp and resourceUri:
https://management.azure.com/subscriptions/<<SubscriptionID>>/providers/microsoft.insights/eventtypes/management/values?api-version=2015-04-01&$filter=eventTimestamp ge '<<Pipeline Trigger time>>' and resourceUri eq '/subscriptions/<<SubscriptionID>>/resourcegroups/<<ResourceGroupName>>/providers/Microsoft.DataFactory/factories/<<DataFactoryName>>/pipelines/<<PipelineName>>'
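To make the shape of the filtered URL concrete, here is a minimal Python sketch that fills in the template above. The function name and the sample values are hypothetical and only for illustration; the pipeline itself builds the same string with ADF expressions.

```python
from datetime import datetime, timezone

BASE = "https://management.azure.com"
API_VERSION = "2015-04-01"


def build_activity_log_url(subscription_id, resource_group, factory_name,
                           pipeline_name, trigger_time):
    """Fill in the Activity Logs URL template with a time and resource filter."""
    # Resource path of the pipeline, same layout as in the template above.
    resource_uri = (
        f"/subscriptions/{subscription_id}"
        f"/resourcegroups/{resource_group}"
        f"/providers/Microsoft.DataFactory/factories/{factory_name}"
        f"/pipelines/{pipeline_name}"
    )
    # Whole-second UTC timestamp followed by a literal Z.
    stamp = trigger_time.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S") + "Z"
    odata_filter = (
        f"eventTimestamp ge '{stamp}' and resourceUri eq '{resource_uri}'"
    )
    return (
        f"{BASE}/subscriptions/{subscription_id}"
        f"/providers/microsoft.insights/eventtypes/management/values"
        f"?api-version={API_VERSION}&$filter={odata_filter}"
    )


if __name__ == "__main__":
    # All values below are placeholders, not real resources.
    print(build_activity_log_url(
        "00000000-0000-0000-0000-000000000000",
        "my-rg", "my-adf", "my-pipeline",
        datetime(2023, 5, 18, 10, 15, 30, tzinfo=timezone.utc),
    ))
```

Printing the result is a quick way to compare it with what the Web activity shows as its input URL.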
4. GitHub Repo
5. First, generate the Resource URI (the resourceUri value in the URL above) and store it in a variable named ResourceURI.
Value : @concat('''','/subscriptions/',pipeline().parameters.SubscriptionID,'/resourcegroups/',pipeline().parameters.ResourceGroupName,'/providers/Microsoft.DataFactory/factories/',pipeline().DataFactory,'/pipelines/',pipeline().Pipeline,'''')
6. Leverage a Web activity to trigger the REST API
URL :
@concat('https://management.azure.com/subscriptions/',pipeline().parameters.SubscriptionID,'/providers/microsoft.insights/eventtypes/management/values?api-version=2015-04-01&$filter=eventTimestamp ge ','''',formatDateTime(pipeline().TriggerTime, 'yyyy-MM-ddTHH:mm:ss'),'Z' ,'''' ,' and resourceUri eq ', variables('ResourceURI'))
Method : GET
Authentication : System Assigned Managed Identity
Resource : https://management.azure.com
7. Activity Log entries are not available the instant an event is generated. Because of that, the Web activity may return no result on its first execution, and the call has to be repeated until the user details appear (once the data has synchronized to the Activity Log).
So we use an If Condition activity to check whether the Web activity returned any output. If it did not, wait for some time before calling the REST API again, and wrap the whole sequence in an Until activity so that it repeats until the details are returned.
Expression :
@empty(activity('GetActivityLogs').output.value)
If True (the Web activity output is empty), wait for 60 seconds.
If False (the Web activity returned output), get the user details.
Value :
@activity('GetActivityLogs').output.value[0].caller
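The emptiness check and the caller lookup can be mirrored outside ADF for testing. The sketch below assumes only what the expressions above rely on: the response has a value list whose entries carry a caller field. The helper name and the sample payload are hypothetical.

```python
def first_caller(response):
    """Return the caller of the first event, or None when there are no events."""
    # Mirrors @empty(activity('GetActivityLogs').output.value)
    events = response.get("value") or []
    if not events:
        return None
    # Mirrors @activity('GetActivityLogs').output.value[0].caller
    return events[0].get("caller")


if __name__ == "__main__":
    # Hypothetical payloads, trimmed to the one field used here.
    print(first_caller({"value": []}))
    print(first_caller({"value": [{"caller": "user@example.com"}]}))
```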
UNTIL Activity :
Expression :
@if(empty(activity('GetActivityLogs').output.value),false,true)
To avoid infinite iterations in the Until activity, you can add counter logic with a maximum number of iterations, as described in the blog post:
Avoidance of Infinite Iterations of Until activity within Azure Data Factory / Synapse Pipelines
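The retry-until-data pattern with a cap on iterations can be sketched in Python as follows. This is an illustration of the control flow only, not the counter logic from the linked post; fetch is a hypothetical stand-in for the Web activity call, and the default of 10 attempts is an assumption.

```python
import time


def wait_for_caller(fetch, max_attempts=10, wait_seconds=60, sleep=time.sleep):
    """Call fetch() until it returns events or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        events = fetch().get("value") or []
        if events:
            # Data has reached the Activity Log: stop looping.
            return events[0].get("caller")
        if attempt < max_attempts:
            # Nothing yet: wait before calling the API again.
            sleep(wait_seconds)
    # Gave up after max_attempts empty responses.
    return None


if __name__ == "__main__":
    # Simulated responses: two empty results, then one event (hypothetical data).
    responses = iter([
        {"value": []},
        {"value": []},
        {"value": [{"caller": "user@example.com"}]},
    ])
    print(wait_for_caller(lambda: next(responses), sleep=lambda seconds: None))
```

Passing sleep as a parameter keeps the sketch testable without actually waiting 60 seconds per attempt.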
Output :
Web Activity Input :
Web Activity Output :
Limitations :
- There is no direct link between the pipeline run ID and the other details in the Activity Log. So if two users execute the same pipeline in parallel, the above logic may not return the right details: it takes the caller of the first entry returned, so both pipeline runs might report the same user name.
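A small Python sketch of why this happens, using hypothetical events: when two runs start close together, both queries can return the same list of entries, and taking the first entry gives the same caller for both.

```python
def caller_seen_by_run(events):
    """Top-1 lookup, as in value[0].caller."""
    return events[0]["caller"] if events else None


# Hypothetical Activity Log entries for two near-simultaneous manual runs.
# Nothing in an entry ties it to a specific pipeline run ID.
events = [
    {"caller": "user.b@example.com", "eventTimestamp": "2023-05-18T10:15:40Z"},
    {"caller": "user.a@example.com", "eventTimestamp": "2023-05-18T10:15:31Z"},
]

# Both runs filter on the same pipeline and overlapping time windows,
# so both can receive the same list and pick the same first entry.
run_a = caller_seen_by_run(events)
run_b = caller_seen_by_run(events)

if __name__ == "__main__":
    print(run_a, run_b)
```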
Note : One can also use Azure Monitor with Kusto Query Language in Azure Log Analytics (if diagnostic settings are enabled).
I was searching for this from a long time, found it at the right time. Thanks for sharing this.
Thanks for sharing this. Is it possible to do the same for Synapse pipelines? Unfortunately, I do not see pipeline runs in audit logs for Synapse.