Extract Data from Power BI through Azure Data Factory / Synapse Pipeline via T-SQL From Azure SQL Database

Problem Statement :

Is it possible to copy data from Power BI via Azure Data Factory (ADF) / Synapse as there is no out of the box connector for Power BI.

Prerequisites :

  1. Power BI Report / Dataset
  2. Azure Function ( In this blog, we would be leveraging PowerShell Core Runtime Stack )
  3. Azure SQL Database

Solution :

In my previous blogs Copy Data from Power BI through Azure Data Factory / Synapse Pipeline – Part 3 & Copy Data from Power BI through Azure Data Factory / Synapse Pipeline via Managed Identity Authentication– Part 4 , we saw how we can Web activity & Data Factory / Synapse Dataflow to get data from Power BI data set via REST API.

In this blog, we would be extracting the data via T-SQL from Azure SQL Database.

Note : We would be leveraging Azure SQL Database External REST Endpoints Integration via which we would be triggering an Azure function and Power BI REST API.

  1. In my previous blog Copy Data from Power BI through Azure Data Factory / Synapse Pipeline – Part 1 we extract the data from Power BI Dataset by generating an AAD Token of the Service Principal which has access on the needed Power BI Dataset.
  2. We would be following a similar route in the current flow ; wherein we would be triggering a Managed identity enabled Azure Function to generate a AAD Token which we would in turn use to extract the data from Power BI Dataset via API.

Note : The AAD Token can have an expiration time between 60 – 90 min. So rather than preserving that as a Database Scoped Credential, we can generate it at run time by triggering the Azure function (in this scenario)

3. Assuming we have a PowerShell Core Runtime Stack Function App set up, update the requirements.psd1 ( present within App files section of App Function) to leverage Az modules.

4. Create a Function of HTTP Trigger type and Function Authorization Level

Enable Managed Identity for the Azure function, which we would need further to provide it access on the needed Power BI Dataset .

Add the below code (for AAD Token creation) in run.ps1 file within Code + Test Section and Save it.

using namespace System.Net

# Input bindings are passed in via param block.
param($Request, $TriggerMetadata)

Connect-AzAccount -Identity

$AADToken=(Get-AzAccessToken -resourceUrl "https://analysis.windows.net/powerbi/api")

# Associate values to output bindings by calling 'Push-OutputBinding'.
Push-OutputBinding -Name Response -Value ([HttpResponseContext]@{
    StatusCode = [HttpStatusCode]::OK
    Body = $AADToken 
})

5. Follow Steps from #1 to #3 stated in Copy Data from Power BI through Azure Data Factory / Synapse Pipeline – Part 1 for providing the necessary access on the Power BI dataset.

6. Login to the Azure SQL database via which we need to execute the T-SQL for Power BI data extraction and execute the queries in the below provided sequence.

a) 
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<<>>'

b) 
CREATE DATABASE SCOPED CREDENTIAL [https://<<AzureFunctionName>>.azurewebsites.net/api/<<FunctionName>>]
WITH IDENTITY = 'HTTPEndpointHeaders', SECRET = '{"x-functions-key":"<<FunctionKey>>"}'

Where FunctionKey value is as below :

7. Create a Stored Procedure within the Azure SQL Database which we would trigger via ADF / Synapse.

CREATE PROCEDURE dbo.DataSharkXSP
AS 
DECLARE @Appurl NVARCHAR(4000) = N'https://<<AzureFunctionName>>.azurewebsites.net/api/<<FunctionName>>';
DECLARE @Apppayload NVARCHAR(max) = N'{ }';
DECLARE @Token NVARCHAR(max);
DECLARE @Appret int, @Appresponse NVARCHAR(max);
DECLARE @PowerBIurl NVARCHAR(4000) = N'https://api.powerbi.com/v1.0/myorg/groups/<<WorkspaceID>>/datasets/<<DatasetID>>/executeQueries';
DECLARE @PowerBIpayload NVARCHAR(max) = N'{

  "queries": [

    {

      "query": "' + string_escape('Evaluate SUMMARIZECOLUMNS( ''Date''[DateDimId], ''Date''[MonthNm], "CountRows",CALCULATE(COUNTROWS(''Date'')))
', 'json') + '"

    }

  ],

  "serializerSettings": {

    "includeNulls": true

  }

}'
DECLARE @PowerBIret int, @PowerBIresponse NVARCHAR(max);

--To Trigger the Azure Function in order to generate the AAD Token at run time

EXEC @Appret = sp_invoke_external_rest_endpoint 

    @url = @Appurl,
    @method = 'POST',
    @payload = @Apppayload,
    @credential = [https://<<AzureFunctionName>>.azurewebsites.net/api/<<FunctionName>>],
    @response = @Appresponse output;

    
SET @Token= (SELECT * FROM OPENJSON(@Appresponse,  '$.result') WITH

  (
    "Token" NVARCHAR(max)
  ))


DECLARE @PowerBIheaders NVARCHAR(4000)=N'{"Authorization": "Bearer '+ @Token + '"}';

--To Trigger the Power BI DataSet refresh REST API

EXEC @PowerBIret = sys.sp_invoke_external_rest_endpoint 

	@method = 'POST',
	@url = @PowerBIurl,
	@payload = @PowerBIpayload,
	@headers = @PowerBIheaders,
	@response = @PowerBIresponse output;


SELECT * FROM OPENJSON(@PowerBIresponse,  '$.result.results[0].tables[0].rows') WITH

  (

  "Date[DateDimId]" nvarchar(100),
  "Date[MonthNm]" varchar(100),
  "[CountRows]" int

  )

Output / Result of Final Query within the above Stored Procedure :

Query 1 :

SELECT * FROM OPENJSON(@PowerBIresponse,  '$.result.results[0].tables[0].rows') WITH
  (
  "Date[DateDimId]" nvarchar(100),
  "Date[MonthNm]" varchar(100),
  "[CountRows]" int
  )

Query 2 :

SELECT * FROM OPENJSON(@PowerBIresponse,  '$.result.results[0].tables[0].rows')

Query 3 :

SELECT @PowerBIresponse

{“response”:{“status”:{“http”:{“code”:200,”description”:””}},”headers”:{“Cache-Control”:”no-store, must-revalidate, no-cache”,”Date”:”Sat, 03 Dec 2022 08:16:12 GMT”,”Pragma”:”no-cache”,”Content-Type”:”application\/json”,”strict-transport-security”:”max-age=31536000; includeSubDomains”,”x-frame-options”:”deny”,”x-content-type-options”:”nosniff”,”requestid”:”3e6b00d9-0fde-454a-a95c-1a26ad986aec”,”access-control-expose-headers”:”RequestId”,”request-redirected”:”true”,”home-cluster-uri”:”https:\/\/wabi-west-us-redirect.analysis.windows.net\/”}},”result”:{“informationProtectionLabel”:{“id”:”5fae8262-b78e-4366-8929-a5d6aac95320″,”name”:”Recipients Have Full Control”},”results”:[{“tables”:[{“rows”:[{“Date[DateDimId]”:20000101,”Date[MonthNm]”:”January”,”[CountRows]”:1},{“Date[DateDimId]”:20000102,”Date[MonthNm]”:”January”,”[CountRows]”:1}]}]}]}}

8. Create an ADF / Synapse Pipeline with Copy Activity with Source as the Azure SQL database and Sink (in this case) as Azure Blob storage.

Note : One can Parameterize the Query, Power BI URL within the Stored Procedure and pass it at run time via Azure Data Factory / Synapse Pipeline for a Generic logic.

Published by Nandan Hegde

Microsoft Data MVP |Microsoft Data platform Architect | Blogger | MSFT Community Champion I am a MSFT Data Platform MVP and Business Intelligence and Data Warehouse professional working within the Microsoft data platform eco-system which includes Azure Synapse Analytics ,Azure Data Factory ,Azure SQL Database and Power BI. To help people keep up with this ever-changing landscape, I frequently posts on LinkedIn, Twitter and to his blog on https://datasharkx.wordpress.com. LinkedIn Profile : www.linkedin.com/in/nandan-hegde-4a195a66 GitHUB Profile : https://github.com/NandanHegde15 Twitter Profile : @nandan_hegde15 MSFT MVP Profile : https://mvp.microsoft.com/en-US/MVP/profile/8977819f-95fb-ed11-8f6d-000d3a560942

Leave a comment

Design a site like this with WordPress.com
Get started