Implementing DATEDIFF function in Data Factory / Synapse via Pipeline Expressions

Problem Statement :

In Azure Data Factory / Synapse, there are no dedicated functions under pipeline expressions equivalent to DATEDIFF.

So is it possible to reciprocate DATEDIFF in ADF / Synapse without the use of Dataflow.

Date1Date2IntervalOutput
2015/08/25 07:002017/08/25 12:45Year2
2015/08/25 07:002017/08/25 12:45Quarter8
2015/08/25 07:002017/08/25 12:45Month24
2015/08/25 07:002017/08/25 12:45Week104
2015/08/25 07:002017/08/25 12:45Day731
2015/08/25 07:002017/08/25 12:45Hour17549
2015/08/25 07:002017/08/25 12:45Minute1052985
2015/08/25 07:002017/08/25 12:45Second63179100
2015/08/25 07:002017/08/25 12:45MilliSecond63179100000

Prerequisites :

  1. Azure Data Factory / Synapse

Solution :

GitHUB Code

VARIABLES :

where Week, Day, Hour, Minute, Second, Millisecond are variables with values for conversion from Tick to the named entity.

  1. For most of the Intervals, we can use the tick function to work around this as we do not have an inbuild function get the time difference between two variables.
Value :
@ticks(pipeline().parameters.Date1)

Similarly, for Tick2 variable :

Value :
@ticks(pipeline().parameters.Date2)

2. Use a SWITCH activity for every interval and have Set Variable activity to do the necessary calculations depending on the interval value.

a) Year :

Value :
@sub(int(formatDateTime(pipeline().parameters.Date2,'yyyy')),int(formatDateTime(pipeline().parameters.Date1,'yyyy')))

b) Quarter :

We would be leveraging the below logic for Quarter calculation :

=CEILING(((YEAR(Date2)*12+MONTH(Date2))-(YEAR(Date1)*12+MONTH(Date1)))/3,1)

The above logic would be split in 2 activities : the one highlighted in bold within the initial activity and the remaining aspect of CEILING in another activity as there is no out of box function within ADF pipeline expression to derive the CEILING value.

Initial Expression : TempQuarter

Value :
@string(div(sub(add(mul(int(formatDateTime(pipeline().parameters.Date2,'yyyy')),12),int(formatDateTime(pipeline().parameters.Date2,'MM'))),add(mul(int(formatDateTime(pipeline().parameters.Date1,'yyyy')),12),int(formatDateTime(pipeline().parameters.Date1,'MM')))),3.0))

For CEILING function, please refer this blog : Implementing ROUND, FLOOR and CEILING Function in Data Factory / Synapse via Pipeline Expressions for understanding of the expression.

Value :
@if (greater(length(split(variables('TempQuarter'), '.')), 1),
   if (greater(int(split(variables('TempQuarter'), '.')[1]), 0),
      if (less(float(variables('TempQuarter')),0),string(split(variables('TempQuarter'), '.')[0]) , 	
	   add(int(split(variables('TempQuarter'), '.')[0]), 1)), 
   split(variables('TempQuarter'), '.')[0]),
int(variables('TempQuarter')))

c) Month :

We would be leveraging the below logic for Month calculation :

((date2.Year - date1.Year) * 12) + date2.Month - date1.Month
Value :
@sub(add(mul(sub(int(formatDateTime(pipeline().parameters.Date2,'yyyy')),int(formatDateTime(pipeline().parameters.Date1,'yyyy'))),12),int(formatDateTime(pipeline().parameters.Date2,'MM'))),int(formatDateTime(pipeline().parameters.Date1,'MM')))

d) Day :

We would convert the Date to the Tick values and take a difference of the Tick values, post which would convert the Tick to the corresponding Day conversion value.

Value :
@div(sub(variables('Tick2'),variables('Tick1')),variables('Day'))

Similarly, we would do the same for the below intervals :

e) Week :

Value :
@div(sub(variables('Tick2'),variables('Tick1')),variables('Week'))

f) Hour :

Value :
@div(sub(variables('Tick2'),variables('Tick1')),variables('Hour'))

g) Minute :

Value :
@div(sub(variables('Tick2'),variables('Tick1')),variables('Minute'))

h) Second :

Value :
@div(sub(variables('Tick2'),variables('Tick1')),variables('Second'))

i) Millisecond :

Value :
@div(sub(variables('Tick2'),variables('Tick1')),variables('Millisecond'))

j) In case if some random value ( or different Case Sensitive value) is provided in Interval Parameter, Default SWITCH case :

Output :

For the below Inputs :

Date1 : 2015/08/25 07:00

Date2 : 2017/08/25 12:45

For Interval values :

a) Year

b) Quarter :

c) MilliSecond :

d) Different CASE or different Value

i) year

ii) Random value like ‘yyyy’

Published by Nandan Hegde

Microsoft Data MVP |Microsoft Data platform Architect | Blogger | MSFT Community Champion I am a MSFT Data Platform MVP and Business Intelligence and Data Warehouse professional working within the Microsoft data platform eco-system which includes Azure Synapse Analytics ,Azure Data Factory ,Azure SQL Database and Power BI. To help people keep up with this ever-changing landscape, I frequently posts on LinkedIn, Twitter and to his blog on https://datasharkx.wordpress.com. LinkedIn Profile : www.linkedin.com/in/nandan-hegde-4a195a66 GitHUB Profile : https://github.com/NandanHegde15 Twitter Profile : @nandan_hegde15 MSFT MVP Profile : https://mvp.microsoft.com/en-US/MVP/profile/8977819f-95fb-ed11-8f6d-000d3a560942

Leave a comment

Design a site like this with WordPress.com
Get started