Refresh Power BI Dataflow / Dataset from Azure Data Factory / Synapse Pipeline via Service Principal (AAD Token Generation) - Part 1


Problem Statement :

Is it possible to refresh a Power BI Dataflow / Dataset via Azure Data Factory (ADF) / Synapse, given that there is no out-of-the-box connector for Power BI?

Prerequisites :

  1. Power BI Dataset / Dataflow
  2. Service Principal

Solution :

Note : Service Principal authentication can also be leveraged in cross-tenant scenarios, wherein the ADF / Synapse pipelines reside in a tenant other than the Power BI tenant.

1. Allow the Service Principal to access Power BI APIs
    • Go to the Power BI Service at https://app.powerbi.com and open the Admin Portal.
    • Go to the 'Tenant settings' section and scroll to the setting 'Allow service principals to use Power BI APIs'. Enable the setting and either allow the entire organization to use the REST APIs (not recommended) or specify an Azure Active Directory security group and make sure your Service Principal is a member of that group.

2. Grant the Service Principal Member access on the workspace hosting the Power BI Dataset / Dataflow.

3. We would be leveraging the Power BI REST APIs Datasets – Refresh Dataset and / or Dataflows – Refresh Dataflow to trigger the refresh of a Power BI Dataset or Dataflow respectively, and Datasets – Get Refresh History and / or Dataflows – Get Dataflow Transactions to retrieve the refresh status of a Power BI Dataset or Dataflow respectively.
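For reference, here is a minimal sketch (in Python, with placeholder GUIDs) of the endpoint patterns behind these four APIs; in the pipeline itself, the calls are issued from the Web activities configured below.

# Sketch of the Power BI REST endpoint patterns used in this post.
# workspace_id / entity_id are placeholders for the GUIDs captured as pipeline parameters in step 4.
BASE = "https://api.powerbi.com/v1.0/myorg/groups"

def refresh_url(workspace_id: str, entity_id: str, entity_type: str) -> str:
    # POST to this URL triggers the refresh (Datasets - Refresh Dataset / Dataflows - Refresh Dataflow).
    kind = "datasets" if entity_type.upper() == "DATASET" else "dataflows"
    return f"{BASE}/{workspace_id}/{kind}/{entity_id}/refreshes"

def status_url(workspace_id: str, entity_id: str, entity_type: str) -> str:
    # GET from this URL returns the latest refresh record
    # (Datasets - Get Refresh History / Dataflows - Get Dataflow Transactions).
    if entity_type.upper() == "DATASET":
        return f"{BASE}/{workspace_id}/datasets/{entity_id}/refreshes?$top=1"
    return f"{BASE}/{workspace_id}/dataflows/{entity_id}/transactions?$top=1"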

GitHub Code

4. Create 8 pipeline parameters

Below are the values required for each of them:

a) EntityId : This represents the Power BI Dataset or Dataflow Id.

Log in to the required workspace and click on the Dataset / Dataflow; the Id is the GUID visible in the browser URL.

Dataset: the GUID that follows /datasets/ in the URL.

Dataflow: the GUID that follows /dataflows/ in the URL.

b) EntityType : This parameter can be either Dataset or Dataflow, based on the entity type we are planning to process.

c) WorkspaceId : This is the Id of the workspace hosting the dataset / dataflow.

d) EntityName : This is the dataset / dataflow name, which is needed for the failure message in case of refresh failures.

e) TenantId : The Id of the tenant in which the Power BI entity is published.

f) KeyVaultName : Name of the Azure Key Vault in which the Client Secret of the Service Principal is stored.

g) ClientId : The Application (client) Id of the Service Principal.

h) ClientSecretKeyVaultSecretName : Name of the Key Vault secret that holds the Service Principal's Client Secret.
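As an illustration only, the 8 parameter values could look like the following sketch (all values are placeholders; substitute your own GUIDs and names):

# Example pipeline parameter values (placeholders only).
pipeline_parameters = {
    "EntityId": "<dataset-or-dataflow-guid>",
    "EntityType": "Dataset",                     # or "Dataflow"
    "WorkspaceId": "<workspace-guid>",
    "EntityName": "<dataset-or-dataflow-name>",  # used only in the failure message
    "TenantId": "<tenant-guid>",
    "KeyVaultName": "<key-vault-name>",
    "ClientId": "<service-principal-application-id>",
    "ClientSecretKeyVaultSecretName": "<key-vault-secret-name>",
}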

Get Client Secret value from Key Vault Config :

URL : @concat('https://',pipeline().parameters.KeyVaultName,'.vault.azure.net/secrets/',pipeline().parameters.ClientSecretKeyVaultSecretName,'?api-version=7.0')
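Outside the pipeline, the same secret lookup can be sketched in Python, assuming the azure-identity and azure-keyvault-secrets packages and a caller with Get permission on secrets (in the pipeline, the Web activity would typically authenticate with the Data Factory / Synapse managed identity instead):

from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

key_vault_name = "<key-vault-name>"       # pipeline().parameters.KeyVaultName
secret_name = "<key-vault-secret-name>"   # pipeline().parameters.ClientSecretKeyVaultSecretName

# DefaultAzureCredential plays the role of the Web activity's managed identity authentication.
secret_client = SecretClient(
    vault_url=f"https://{key_vault_name}.vault.azure.net",
    credential=DefaultAzureCredential(),
)
client_secret = secret_client.get_secret(secret_name).value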

Get AAD Token Config :

URL : @concat('https://login.microsoftonline.com/',pipeline().parameters.TenantId,'/oauth2/token')

Body : @concat('grant_type=client_credentials&resource=https://analysis.windows.net/powerbi/api&client_id=',pipeline().parameters.ClientId,'&client_secret=',encodeUriComponent(activity('Get Client Secret from Key Vault').output.value))
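The equivalent token request can be sketched with the Python requests library against the same AAD v1 endpoint (all identifiers below are placeholders):

import requests

tenant_id = "<tenant-guid>"                       # pipeline().parameters.TenantId
client_id = "<service-principal-application-id>"  # pipeline().parameters.ClientId
client_secret = "<client-secret-from-key-vault>"  # output of the previous Key Vault call

# Client-credentials grant scoped to the Power BI resource, mirroring the Web activity body above.
token_response = requests.post(
    f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    data={
        "grant_type": "client_credentials",
        "resource": "https://analysis.windows.net/powerbi/api",
        "client_id": client_id,
        "client_secret": client_secret,
    },
)
token_response.raise_for_status()
token = token_response.json()                     # contains token_type and access_token
authorization_header = f"{token['token_type']} {token['access_token']}"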

Trigger Power BI Refresh Config :

URL : @concat('https://api.powerbi.com/v1.0/myorg/groups/',pipeline().parameters.WorkspaceId,if(equals(toUpper(pipeline().parameters.EntityType),'DATASET'),'/datasets/','/dataflows/'),pipeline().parameters.EntityId,'/refreshes')

Authorization : @concat(string(activity('Get AAD Token').output.token_type),' ',string(activity('Get AAD Token').output.access_token))
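A Python sketch of the same refresh trigger (placeholder Ids and token; the notifyOption value is an assumption, accepted by both the dataset and dataflow refresh APIs):

import requests

workspace_id = "<workspace-guid>"                     # pipeline().parameters.WorkspaceId
entity_id = "<dataset-or-dataflow-guid>"              # pipeline().parameters.EntityId
entity_type = "Dataset"                               # pipeline().parameters.EntityType: "Dataset" or "Dataflow"
authorization_header = "<token_type> <access_token>"  # from the Get AAD Token step

kind = "datasets" if entity_type.upper() == "DATASET" else "dataflows"
refresh_url = f"https://api.powerbi.com/v1.0/myorg/groups/{workspace_id}/{kind}/{entity_id}/refreshes"

# POST queues the refresh; NoNotification suppresses e-mail notifications.
response = requests.post(
    refresh_url,
    headers={"Authorization": authorization_header},
    json={"notifyOption": "NoNotification"},
)
response.raise_for_status()  # a 2xx response only confirms the refresh was accepted, not that it succeeded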

5. The Refresh API call is asynchronous, so the successful execution of the Web activity only means the refresh was triggered; it does not mean the refresh was a success. To check the status of the refresh, one can leverage another Web activity that retrieves the Power BI Dataset / Dataflow refresh status via the Power BI REST API.

URL : @concat('https://api.powerbi.com/v1.0/myorg/groups/',pipeline().parameters.WorkspaceId,if(equals(toUpper(pipeline().parameters.EntityType),'DATASET'),'/datasets/','/dataflows/'),pipeline().parameters.EntityId,if(equals(toUpper(pipeline().parameters.EntityType),'DATASET'),'/refreshes?$top=1','/transactions?$top=1'))

Authorization : @concat(string(activity('Get AAD Token').output.token_type),' ',string(activity('Get AAD Token').output.access_token))

We have to add a polling pattern that periodically checks the status of the refresh until it is complete. We start with an Until activity. In the settings of the Until loop, we set the expression so that the loop executes until the status returned by the above web activity is no longer 'Unknown' (in case of a dataset) or 'InProgress' (in case of a dataflow). While a dataset is refreshing, 'Unknown' is the status returned until it completes or fails; while a dataflow is refreshing, 'InProgress' is the status returned until it completes or fails.

Expression : @not(equals(first(json(string(activity('GetRefreshDetails').output)).value).status,if(equals(toUpper(pipeline().parameters.EntityType),'DATASET'),'Unknown','InProgress')))
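The Until loop behaviour can be sketched as a simple polling loop in Python; the 30-second sleep stands in for a Wait activity inside the loop and is an assumed interval, not a value from this post:

import time
import requests

workspace_id = "<workspace-guid>"
entity_id = "<dataset-or-dataflow-guid>"
entity_type = "Dataset"                               # or "Dataflow"
authorization_header = "<token_type> <access_token>"

if entity_type.upper() == "DATASET":
    status_url = (f"https://api.powerbi.com/v1.0/myorg/groups/{workspace_id}"
                  f"/datasets/{entity_id}/refreshes?$top=1")
    in_progress_status = "Unknown"      # dataset refreshes report 'Unknown' while running
else:
    status_url = (f"https://api.powerbi.com/v1.0/myorg/groups/{workspace_id}"
                  f"/dataflows/{entity_id}/transactions?$top=1")
    in_progress_status = "InProgress"   # dataflow refreshes report 'InProgress' while running

while True:
    # Same call as the 'GetRefreshDetails' Web activity: fetch the most recent refresh record.
    latest = requests.get(status_url, headers={"Authorization": authorization_header}).json()["value"][0]
    if latest["status"] != in_progress_status:   # mirrors the Until activity expression
        break
    time.sleep(30)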

6. The final activity is an If Condition activity that checks the refresh status and leverages a Fail activity to fail the pipeline in case of refresh failure.

IF activity expression : @equals(first(json(string(activity('GetRefreshDetails').output)).value).status,'Failed')

Fail message : @if(equals(toUpper(pipeline().parameters.EntityType),'DATAFLOW'),concat(pipeline().parameters.EntityName,' failed to refresh'),string(json(first(json(string(activity('GetRefreshDetails').output))?.value)?.serviceExceptionJson)))
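The If Condition + Fail pair translates to a final status check; here is a sketch as a small helper function (the function name is illustrative, and latest_record is the most recent entry returned by the status call above):

def fail_if_refresh_failed(latest_record: dict, entity_type: str, entity_name: str) -> None:
    # Mirrors the If Condition: raise (the Fail activity) when the latest record reports 'Failed'.
    if latest_record.get("status") != "Failed":
        return
    if entity_type.upper() == "DATAFLOW":
        raise RuntimeError(f"{entity_name} failed to refresh")
    # Dataset refresh history carries the error details in serviceExceptionJson.
    raise RuntimeError(str(latest_record.get("serviceExceptionJson", f"{entity_name} failed to refresh")))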

