Problem Statement :
In Azure Data Factory / Synapse, there is no dedicated function under regular expressions to Sort an Array.
So is it possible to Sort an Array in ADF / Synapse without the use of Dataflow.
Prerequisites :
- Azure Data Factory / Synapse
Solution :
To Sort an Array, we can follow the below approach
GitHUB Code
- Create a Pipeline parameter : InputArray and Pipeline Variables as below
2. Initially Assign the InputArray parameter value to a variable: SourceArray which we would leverage and update based on the Sort Logic.
3. Set the Iteration range (equivalent to #1 in Code Snippet)
@range(0,add(length(variables('SourceArray')),-1))
4. Until Activity (Equivalent to While Iteration in Code)
5. Within Until Activity :
6. Get the Minimum Index in the Iteration
7. Get (Min + 1) Index value
@string(add(int(variables('CurrentItemValue')),1))
8. Compare the Elemental values (Equivalent to #2 in Code Snippet)
@greater(variables('SourceArray')[int(variables('CurrentItemValue'))], variables('SourceArray')[int(variables('NextItemValue'))])
9. In case if the initial element is greater than the next element value (TRUE Activities), Swap the values (Equivalent to #3 in Code Snippet)
Please refer the blog : Swapping Two Array Elements in Azure Data Factory / Synapse Pipeline via Regular Expressions for understanding the highlighted Section in above image .
Finally Reset the iteration ( Equivalent to #4 in Code Snippet)
@range(0,add(length(variables('SourceArray')),-1))
10. In case if the initial element is less than or equal than the next element value (FALSE Activities)
Increment the iteration ( Equivalent to yellow highlighted section in Code Snippet)
In Synapse / ADF , it is not possible to self reference a variable in Set Variable Activity. Hence, we would need to create a TempArray variable and then over ride the original array with the TempArray one.
11. The Final Activity : Debug is only for debugging purpose to get the sorted value in the last activity log (This can be avoided in real case scenarios as we are mapping the original sorted Array into a new variable)
Result :
Numeric :
Input :
Output :
String :
Input :
Output :
Note :
To get the Sort in Descending order, we can follow the below logics :
a) Reverse the Array once it is sorted in Ascending order
Reversing an Array in Azure Data Factory / Synapse Pipeline via Regular Expressions
b) In #8 , IF Activity ; use the below logic
@less(variables('SourceArray')[int(variables('CurrentItemValue'))], variables('SourceArray')[int(variables('NextItemValue'))])