Question
I send a query to a service and get back a result. I want to know whether I have already received the same "answer" in the past, so I am planning to use Azure Table storage as a cache mechanism.
I made this small POC:
TableBatchOperation batchOperation = new TableBatchOperation();
CachedUrl customer1 = new CachedUrl(Guid.Empty, "test1");
CachedUrl customer2 = new CachedUrl(Guid.Empty, "test2");
batchOperation.Insert(customer1);
batchOperation.Insert(customer2);
table.ExecuteBatch(batchOperation);
When I run this code for the first time, it works fine, and I end up with 2 rows in the table.
The problem appears on the second run, when I execute this code:
TableBatchOperation batchOperation = new TableBatchOperation();
CachedUrl customer1 = new CachedUrl(Guid.Empty, "test1");
CachedUrl customer2 = new CachedUrl(Guid.Empty, "test2");
CachedUrl customer3 = new CachedUrl(Guid.Empty, "test3");
batchOperation.Insert(customer1);
batchOperation.Insert(customer2);
batchOperation.Insert(customer3);
table.ExecuteBatch(batchOperation);
(Note the addition of customer3.)
What I expect to get is a message that says:
- customer1 - exists
- customer2 - exists
- customer3 - added
What I actually get is this exception (thrown by the ExecuteBatch() method):
Request Information RequestID:5116ee8a-0002-0024-7ac1-415787000000 RequestDate:Fri, 18 Nov 2016 17:33:08 GMT StatusMessage:0:The specified entity already exists. ErrorCode:EntityAlreadyExists
The server found that the first entity already exists and therefore rejected the whole batch.
How can I get the expected answer?
The naive solution is to add all N items one by one, but that is the slowest option (N HTTP requests instead of 1).
Answer 1:
An Azure Table Storage batch operation is atomic, so it is expected to stop at the first failed operation. A batch may contain up to 100 operations, all on the same partition key, and there is not much point in the table service continuing to execute the remaining operations once it has detected the first failure.
The StorageException reports the index of the failed operation within the batch, together with the error for that operation.
In your example, the index of the failed operation is 0 and the error is EntityAlreadyExists:
0:The specified entity already exists. ErrorCode:EntityAlreadyExists
You can write retry logic that catches the StorageException and parses the error. If the error is EntityAlreadyExists, remove the operation at that index from your batch and resubmit the batch.
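The retry idea described above could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the classic WindowsAzure.Storage SDK, at most 100 entities that all share one partition key, and that the index-prefixed message ("0:The specified entity already exists.") is exposed through ExtendedErrorInformation.ErrorMessage or, failing that, HttpStatusMessage. The names BatchInsertHelper, InsertSkippingExisting, and TryParseFailedIndex are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

public static class BatchInsertHelper
{
    // Hypothetical helper: inserts the entities in one batch, dropping and
    // retrying whenever the service reports that one of them already exists.
    // Returns the entities that were found to exist already.
    public static List<ITableEntity> InsertSkippingExisting(
        CloudTable table, IEnumerable<ITableEntity> entities)
    {
        var pending = new List<ITableEntity>(entities);
        var existing = new List<ITableEntity>();

        while (pending.Count > 0) // an empty batch must not be executed
        {
            var batch = new TableBatchOperation();
            foreach (var entity in pending)
            {
                batch.Insert(entity);
            }

            try
            {
                table.ExecuteBatch(batch);
                break; // every remaining entity was inserted
            }
            catch (StorageException ex)
            {
                var info = ex.RequestInformation.ExtendedErrorInformation;
                if (info == null || info.ErrorCode != "EntityAlreadyExists")
                {
                    throw; // some other failure: do not swallow it
                }

                // Assumption: the message starts with "<index>:".
                int index;
                if (!TryParseFailedIndex(info.ErrorMessage, out index) &&
                    !TryParseFailedIndex(ex.RequestInformation.HttpStatusMessage, out index))
                {
                    throw;
                }
                if (index < 0 || index >= pending.Count)
                {
                    throw;
                }

                existing.Add(pending[index]);
                pending.RemoveAt(index); // resubmit the batch without it
            }
        }
        return existing;
    }

    // Parses the leading operation index from a message such as
    // "0:The specified entity already exists."
    private static bool TryParseFailedIndex(string message, out int index)
    {
        index = -1;
        if (string.IsNullOrEmpty(message))
        {
            return false;
        }
        int colon = message.IndexOf(':');
        return colon > 0 && int.TryParse(message.Substring(0, colon), out index);
    }
}
```

In the worst case, where every entity already exists, this still costs one request per entity, so it pays off mainly when duplicates are rare.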
See the Azure Storage exception parser that I published on NuGet. It extracts the index of the failed operation and other useful information, such as the HttpStatusCode, from the StorageException object for you: https://www.nuget.org/packages/AzureStorageExceptionParser/
To avoid a round trip to Azure for each failed operation, here is an alternative that you can explore:
Every time you insert an entity into the table, you also write a second entity in the same partition whose only property is the list of row keys. Let's call this second entity the RowKeyTracker entity. It has the same partition key as the original entity, so both can go into one batch operation. It has a unique, well-known row key, so that you can query it, and a single property that holds the concatenated row keys for that partition. If the RowKeyTracker entity already exists, you append the new row key to its row keys property every time you insert a new entity. Likewise, when you delete an entity, you remove its row key from the RowKeyTracker entity.
You can then query the RowKeyTracker entity first to find out whether a given row key has already been inserted in that partition.
You can combine this approach with the first one (retry) to get a more robust solution.
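The tracker idea above could look something like the following sketch. It rests on several assumptions that are not part of the original answer: the classic WindowsAzure.Storage SDK, a reserved row key "__rowkeys__" for the tracker, '|' as a separator that never occurs in real row keys, and at most 99 entities per call so that the tracker still fits in the same batch. RowKeyTracker, TrackedInsert, and InsertNew are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.WindowsAzure.Storage.Table;

// Hypothetical tracker entity: one per partition, listing the known row keys.
public class RowKeyTracker : TableEntity
{
    public const string TrackerRowKey = "__rowkeys__"; // assumed reserved key

    public RowKeyTracker() { }

    public RowKeyTracker(string partitionKey)
        : base(partitionKey, TrackerRowKey) { }

    // Known row keys joined by '|' (assumed never to appear in a row key).
    public string RowKeys { get; set; }
}

public static class TrackedInsert
{
    // Inserts only the entities whose row keys the tracker has not seen yet.
    // Returns the row keys that were already recorded.
    public static List<string> InsertNew(
        CloudTable table, string partitionKey, IEnumerable<ITableEntity> entities)
    {
        // One read to fetch the tracker for this partition (null if absent).
        TableResult result = table.Execute(
            TableOperation.Retrieve<RowKeyTracker>(partitionKey, RowKeyTracker.TrackerRowKey));
        var tracker = result.Result as RowKeyTracker ?? new RowKeyTracker(partitionKey);

        var known = new HashSet<string>(
            (tracker.RowKeys ?? string.Empty)
                .Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries));

        var batch = new TableBatchOperation();
        var existing = new List<string>();
        foreach (var entity in entities)
        {
            if (known.Add(entity.RowKey))
            {
                batch.Insert(entity); // not seen before
            }
            else
            {
                existing.Add(entity.RowKey); // already recorded
            }
        }

        if (batch.Count > 0)
        {
            // Write the new entities and the updated tracker atomically.
            tracker.RowKeys = string.Join("|", known);
            batch.InsertOrReplace(tracker);
            table.ExecuteBatch(batch);
        }
        return existing;
    }
}
```

This costs two requests per call regardless of how many duplicates there are. The unconditional InsertOrReplace on the tracker can lose updates when several writers run concurrently, and a single string property has a size limit, so the sketch suits small partitions; a stale tracker would surface as EntityAlreadyExists, which is where the retry approach comes back in.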
Answer 2:
This is expected behavior. An entire batch fails as soon as any entity in that batch fails.
One thing you could do is use the InsertOrReplace method instead of Insert. This replaces the entity if it exists and inserts it otherwise.
From the documentation:
Adds a TableOperation to the TableBatchOperation that inserts the specified entity into a table if the entity does not exist; if the entity does exist then its contents are replaced with the provided entity.
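As a rough illustration of this suggestion, assuming the classic WindowsAzure.Storage SDK and entities that all share one partition key (UpsertExample and UpsertAll are hypothetical names):

```csharp
using System.Collections.Generic;
using Microsoft.WindowsAzure.Storage.Table;

public static class UpsertExample
{
    // Writes all entities in a single batch; entities that already exist
    // are overwritten instead of causing EntityAlreadyExists.
    public static void UpsertAll(CloudTable table, IEnumerable<ITableEntity> entities)
    {
        var batch = new TableBatchOperation();
        foreach (var entity in entities)
        {
            batch.InsertOrReplace(entity);
        }

        if (batch.Count > 0) // an empty batch must not be executed
        {
            table.ExecuteBatch(batch);
        }
    }
}
```

Note that this keeps the batch from failing, but as far as I know it does not tell you which entities existed beforehand, so it fits a cache that only needs the latest value rather than the "exists / added" report asked for in the question.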
Source: https://stackoverflow.com/questions/40682971/azure-storage-table-insert-batch-of-row-and-check-if-they-exists