Listing Unique records within an Array in Azure Data Factory

Problem Statement :

Is it possible to remove the duplicates within an Array in Azure Data Factory?

Prerequisites :

  1. Azure Data Factory

Solution :

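  1. The “union()” function in ADF returns a collection that has all the items from the specified collections, with each distinct item appearing only once. So one can pass the same Array as both arguments to get its unique list.
  2. Let’s say we have a list of values, some of them repeated, in an Array variable named DuplicateArray (A1, B2, C3, A1, A5, B2 in the pipeline JSON below).
  3. Using a Set Variable activity, we can assign the unique list from that Array to a second Array variable, UniqueArray, with the following expression: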

@union(variables('DuplicateArray'),variables('DuplicateArray'))
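
To make the behaviour of the expression easier to reason about, here is a rough Python model of it. This is only a sketch, not ADF code: the helper name `union` is borrowed from the ADF function for readability, and keeping items in first-occurrence order is an assumption of the model rather than something the post states.

```python
def union(*collections):
    """Rough model of ADF's union() for arrays: every distinct item, once.

    Assumptions (not taken from the original post): items are hashable
    values such as strings, and first-occurrence order is preserved.
    """
    seen = set()
    result = []
    for collection in collections:
        for item in collection:
            if item not in seen:
                seen.add(item)
                result.append(item)
    return result


# Same values as the DuplicateArray default in the pipeline JSON.
duplicate_array = ["A1", "B2", "C3", "A1", "A5", "B2"]

# Mirrors @union(variables('DuplicateArray'), variables('DuplicateArray')).
unique_array = union(duplicate_array, duplicate_array)
print(unique_array)  # ['A1', 'B2', 'C3', 'A5']
```

Passing the same array twice is needed only because the function expects at least two collections; the second copy contributes no new items.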

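Output :

With the default values defined in the pipeline JSON below, UniqueArray contains only the distinct values: A1, B2, C3 and A5.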

ADF JSON :

{
    "name": "ReturnUnique",
    "properties": {
        "activities": [
            {
                "name": "Remove Duplicates",
                "type": "SetVariable",
                "dependsOn": [],
                "userProperties": [],
                "typeProperties": {
                    "variableName": "UniqueArray",
                    "value": {
                        "value": "@union(variables('DuplicateArray'),variables('DuplicateArray'))",
                        "type": "Expression"
                    }
                }
            }
        ],
        "variables": {
            "DuplicateArray": {
                "type": "Array",
                "defaultValue": [
                    "A1",
                    "B2",
                    "C3",
                    "A1",
                    "A5",
                    "B2"
                ]
            },
            "UniqueArray": {
                "type": "Array"
            }
        },
        "annotations": []
    }
}

Published by Nandan Hegde

Microsoft Data MVP | Microsoft Data Platform Architect | Blogger | MSFT Community Champion

I am a MSFT Data Platform MVP and a Business Intelligence and Data Warehouse professional working within the Microsoft data platform ecosystem, which includes Azure Synapse Analytics, Azure Data Factory, Azure SQL Database and Power BI. To help people keep up with this ever-changing landscape, I frequently post on LinkedIn, Twitter and my blog at https://datasharkx.wordpress.com.

LinkedIn Profile : www.linkedin.com/in/nandan-hegde-4a195a66
GitHub Profile : https://github.com/NandanHegde15
Twitter Profile : @nandan_hegde15
MSFT MVP Profile : https://mvp.microsoft.com/en-US/MVP/profile/8977819f-95fb-ed11-8f6d-000d3a560942
