Problem Statement :
Is it possible to copy only the files that are present in the Source but not in the Sink (i.e. the Delta / Missing files between Source and Sink)?
Prerequisites :
- Azure Data Factory / Synapse
- Azure Blob Storage
Solution :
GitHub Code
1. Get the list of files in the Source dataset and the Sink dataset via the Get Metadata activity (with Child items selected in the Field list, so that the output exposes childItems).
where the Dataset settings (Binary, as we need to move the files as-is) are as below for Source and Sink :
2. To get the Delta / Missing files between the Source and Sink, leverage the Filter activity :
Items : @activity('Get List of Source Files').output.childItems
Condition : @not(contains(activity('Get List Of Sink Files').output.childItems,item()))
3. Use the ForEach activity to iterate over the Missing files
Items : @activity('Get Delta Or Missing Files').output.value
4. Copy the Missing file of each iteration via the Copy activity to sync the Source and Sink
Source Dataset :
wherein a Dataset Parameter is created : FileNm
Sink Dataset :
wherein a Dataset Parameter is created : FileNm
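The four steps above can be mirrored outside Data Factory to sanity-check the logic. The Python below is only a rough local-filesystem analogue, not the pipeline itself: local folders stand in for the Blob containers, and the helper names (list_child_items, missing_items, copy_missing) are hypothetical. It assumes each childItems entry is an object with a name and a type, and that the Filter condition compares whole objects. Skipping folders is a choice made for this sketch, not something the pipeline above does.

```python
import shutil
import tempfile
from pathlib import Path


def list_child_items(folder: Path) -> list[dict]:
    """Mimic the shape of Get Metadata's childItems: one {name, type} object per entry."""
    return [
        {"name": p.name, "type": "Folder" if p.is_dir() else "File"}
        for p in sorted(folder.iterdir())
    ]


def missing_items(source_items: list[dict], sink_items: list[dict]) -> list[dict]:
    """Analogue of the Filter condition: keep source items not contained in the sink list."""
    return [item for item in source_items if item not in sink_items]


def copy_missing(source: Path, sink: Path) -> list[str]:
    """Analogue of ForEach + Copy: copy each missing file, using its name on both sides."""
    delta = missing_items(list_child_items(source), list_child_items(sink))
    copied = []
    for item in delta:
        if item["type"] != "File":
            continue  # sketch-only assumption: folders are skipped
        file_nm = item["name"]  # plays the role of the FileNm dataset parameter
        shutil.copy2(source / file_nm, sink / file_nm)
        copied.append(file_nm)
    return copied


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source, sink = Path(tmp) / "source", Path(tmp) / "sink"
        source.mkdir()
        sink.mkdir()
        for name in ("File1.csv", "File2.csv", "File3.csv"):  # hypothetical file names
            (source / name).write_text("sample")
        (sink / "File2.csv").write_text("sample")
        print("Copied:", copy_missing(source, sink))
```

In this analogue the name of each missing item is used for both the source path and the sink path. The likely pipeline equivalent is to pass the current item's name into the FileNm parameter of both datasets, but that is an assumption, as the parameter value is not shown in the text above. Running the function a second time copies nothing, since the Source and Sink are then in sync.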
Result :
Input :
Source Location:
Sink Location :
Output :
Get Metadata Source –
Get Metadata Sink –
Filter Activity :
Final Sink Location :