Here is a piece of code that initialize a TableBatchOperation designed to retrieve two rows in a single batch:
TableBatchOperation batch = new TableBatchOpe
Okay, so a batch retrieve operation, best case scenario is a table query. Less optimal situation would require parallel retrieve operations.
Depending on your PK, RK design you can based on a list of (PK, RK) figure out what is the smallest/most efficient set of retrieve/query operations that you need to perform. You then fetch all these things in parallel and sort out the exact answer client side.
IMAO, it was a design miss by Microsoft to add the Retrieve method to the TableBatchOperation class because it conveys semantics not supported by the table storage API.
Right now, I'm not in the mood to write something super efficient, so I'm just gonna leave this super simple solution here.
var retrieveTasks = new List>();
foreach (var item in list)
{
retrieveTasks.Add(table.ExecuteAsync(TableOperation.Retrieve(item.pk, item.rk)));
}
var retrieveResults = new List();
foreach (var retrieveTask in retrieveTasks)
{
retrieveResults.Add(await retrieveTask);
}
This asynchronous block of code will fetch the entities in list in parallel and store the result in the retrieveResults preserving the order. If you have continuous ranges of entities that you need to fetch you can improve this by using a rang query.
There a sweet spot (that you'll have to find by testing this) is where it's probably faster/cheaper to query more entities than you might need for a specific batch retrieve then discard the results you retrieve that you don't need.
If you have a small partition you might benefit from a query like so:
where pk=partition1 and (rk=rk1 or rk=rk2 or rk=rk3)
If the lexicographic (i.e. sort order) distance is great between your keys you might want to fetch them in parallel. For example, if you store the alphabet in table storage, fetching a and z which are far apart is best to do with parallel retrieve operations while fetching a, b and c which are close together is best to do with a query. Fetching a, b c, and z would benefit from a hybrid approach.
If you know all this up front you can compute what is the best thing to do given a set of PKs and RKs. The more you know about how the underlying data is sorted the better your results will be. I'd advice a general approach to this one and instead, try to apply what you learn from these different query patterns to solve your problem.