Refresh Power BI Dataflow / Dataset from Azure Data Factory / Synapse Pipeline via Managed Identity

Related: 1|2|3

Problem Statement :

Is it possible to refresh Power BI Dataflow / Dataset via Azure Data Factory (ADF) / Synapse as there is no out of the box connector for Power BI.

Prerequisites :

  1. Power BI Dataset / Dataflow
  2. Azure AD Group

Solution :

  1. Create an Azure AD security group and add the Managed identity of ADF / Synapse to it.

and search for your data factory / synapse under Members and add that within the group.

2. Allow Service Principal to access Power BI APIs

Go to the Power BI Service at https://app.powerbi.com and open the Admin Portal.

Go to the ‘Tenant Settings’ section, and scroll to the setting for ‘Allow service principals to use Power BI APIs’. Enable the setting and Either allow the entire organization to use the Rest APIs (not recommended) or specify an Azure Active Directory Group.

3. Grant the Azure AD group containing the ADF / Synapse, Member access on the workspace hosting the Power BI Dataset / Dataflow.

4.  We would be leveraging the Power BI REST API : Datasets – Refresh Dataset and / or Dataflows – Refresh Dataflow for refreshing Power BI Dataset and Dataflow respectively and Datasets – Get Refresh History and / or Dataflows – Get Dataflow Transactions for getting refresh status of Power BI Dataset and Dataflow respectively.

GitHub Code for the sample pipeline.

5. Create 4 pipeline parameters

For which below are the values required :

a) EntityId : This represents either the Power BI Dataset / Dataflow Id.

Login to the required workspace and click on the Dataset / Dataflow

Dataset:

Dataflow:

b) EntityType : This Parameter can either be Dataset or Dataflow based on the entity type we are planning to process.

c) WorkspaceId : This is the Id of the workspace hosting the dataset / dataflow.

d) EntityName : This is the dataset / dataflow name which is needed for Messaging purpose in case of failures.

6. One can leverage Web activity to trigger Power BI Dataset / Dataflow refresh via Power BI REST API through Managed Identity Authentication as below :

URL : @concat(‘https://api.powerbi.com/v1.0/myorg/groups/’,pipeline().parameters.WorkspaceId,if(equals(toUpper(pipeline().parameters.EntityType),’DATASET’),’/datasets/’,’/dataflows/’),pipeline().parameters.EntityId,’/refreshes’)

Resource : https://analysis.windows.net/powerbi/api

7. The Refresh API call is asynchronous. Hence, you do not know whether the dataset/dataflow refresh has actually succeeded. The successful execution of the web activity does only mean that a refresh was a success. To check the status of the refresh one can leverage Web activity to trigger Power BI Dataset / Dataflow refresh status via Power BI REST API through Managed Identity Authentication as below :

URL : @concat(‘https://api.powerbi.com/v1.0/myorg/groups/’,pipeline().parameters.WorkspaceId,if(equals(toUpper(pipeline().parameters.EntityType),’DATASET’),’/datasets/’,’/dataflows/’),pipeline().parameters.EntityId,if(equals(toUpper(pipeline().parameters.EntityType),’DATASET’),’/refreshes?$top=1′,’/transactions?$top=1′))

Resource : https://analysis.windows.net/powerbi/api

We have to add a polling pattern to periodically check on the status of the refresh until it is complete. We start with an until activity. In the settings of the until loop, we set the expression so that the loop executes until the output of the above web activity is not equal to Unknown (in case of dataset) or InProgress (in case of dataflow). When a dataset is refreshing, “Unknown” is the status returned until it completes or fails and when a dataflow is refreshing, “InProgress” is the status returned until it completes or fails.

Expression : @not(equals(first(json(string(activity(‘GetRefreshDetails’).output)).value).status,if(equals(toUpper(pipeline().parameters.EntityType),’DATASET’),’Unknown’,’InProgress’)))

8. The Final activity is the IF activity that checks the Refresh status and leverages a Fail activity to fail the pipeline in case of refresh failure.

IF activity expression : @equals(first(json(string(activity(‘GetRefreshDetails’).output)).value).status,’Failed’)

Fail message : @if(equals(toUpper(pipeline().parameters.EntityType),’DATAFLOW’),concat(pipeline().parameters.EntityName,’ failed to refresh’),string(json(first(json(string(activity(‘GetRefreshDetails’).output))?.value)?.serviceExceptionJson)))

Published by Nandan Hegde

Microsoft Data MVP |Microsoft Data platform Architect | Blogger | MSFT Community Champion I am a MSFT Data Platform MVP and Business Intelligence and Data Warehouse professional working within the Microsoft data platform eco-system which includes Azure Synapse Analytics ,Azure Data Factory ,Azure SQL Database and Power BI. To help people keep up with this ever-changing landscape, I frequently posts on LinkedIn, Twitter and to his blog on https://datasharkx.wordpress.com. LinkedIn Profile : www.linkedin.com/in/nandan-hegde-4a195a66 GitHUB Profile : https://github.com/NandanHegde15 Twitter Profile : @nandan_hegde15 MSFT MVP Profile : https://mvp.microsoft.com/en-US/MVP/profile/8977819f-95fb-ed11-8f6d-000d3a560942

Leave a comment

Design a site like this with WordPress.com
Get started