How can I back up an Azure Table using Azure Data Factory without relying on the Copy Data activity for a full-table backup?

I currently have a pipeline that copies multiple tables from an Azure Table Storage account to Azure Blob Storage. This pipeline, however, fails consistently during the Copy Data activity for reasons that appear to be out of my control. I’ve determined that some of the Azure Table read requests randomly take ~19 seconds to complete, and when this happens the whole activity fails and has to restart from the beginning. These latency spikes seem to be random, but they occur frequently enough that our largest tables never finish copying.

Since I can’t run the Copy Data activity over the entire table, what other Azure Data Factory approaches would allow a full backup of an Azure Table? Is there some way to break the backup into smaller chunks, so that when one chunk fails it doesn’t force the whole table to restart?

I’ve considered iterating through dates and filtering by Timestamp, but since that column isn’t indexed, every filtered query would still scan the whole table, and with tables this large that doesn’t seem like a reasonable solution.
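For concreteness, this is roughly the kind of per-day Timestamp chunking I had in mind, sketched with the azure-data-tables Python SDK rather than ADF just to show the filter shape (the connection string, table name, and date range are placeholders, not my real values):

```python
from datetime import datetime, timedelta, timezone
from azure.data.tables import TableServiceClient

# Placeholder connection details -- not the real values.
CONNECTION_STRING = "<storage-account-connection-string>"
TABLE_NAME = "MyLargeTable"

service = TableServiceClient.from_connection_string(CONNECTION_STRING)
table = service.get_table_client(TABLE_NAME)

day = datetime(2024, 1, 1, tzinfo=timezone.utc)
end = datetime(2024, 1, 8, tzinfo=timezone.utc)
while day < end:
    next_day = day + timedelta(days=1)
    # Pull one day's worth of entities by Timestamp. Because Timestamp
    # isn't indexed, the service still scans the whole table for each
    # of these queries -- which is why this approach seems unreasonable.
    flt = (
        f"Timestamp ge datetime'{day.strftime('%Y-%m-%dT%H:%M:%SZ')}' "
        f"and Timestamp lt datetime'{next_day.strftime('%Y-%m-%dT%H:%M:%SZ')}'"
    )
    for entity in table.query_entities(query_filter=flt):
        pass  # write the entity out to blob storage here
    day = next_day
```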

I also won’t know the partition keys for these tables, so iterating over and filtering by partition key seems impractical for this project.
