Retrieving many rows using a TableBatchOperation is not supported?

被撕碎了的回忆 2021-01-11 11:55

Here is a piece of code that initializes a TableBatchOperation designed to retrieve two rows in a single batch:

    TableBatchOperation batch = new TableBatchOperation();
    batch.Add(TableOperation.Retrieve("somePartition", "rowKey1"));  // placeholder partition/row keys
    batch.Add(TableOperation.Retrieve("somePartition", "rowKey2"));

7 Answers
  •  情深已故 2021-01-11 12:04

    I know that this is an old question, but as Azure STILL does not support secondary indexes, it seems it will be relevant for some time.

    I was hitting the same type of problem. In my scenario, I needed to look up hundreds of items within the same partition, where there are millions of rows (imagine a GUID as the row key). I tested a couple of options for looking up 10,000 rows:

    1. (PK && RK)
    2. (PK && RK1) || (PK & RK2) || ...
    3. PK && (RK1 || RK2 || ... )

    I was using the Async API, with a maximum of 10 degrees of parallelism (at most 10 outstanding requests). I also tested a couple of different batch sizes (10 rows, 50, 100).
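
    As a rough sketch, option 3 can be expressed with the classic Microsoft.WindowsAzure.Storage.Table SDK (the same library TableBatchOperation comes from) along the following lines. The table reference, key names and batch size are placeholders, not the exact code from my tests, and the 10-way throttling is left out:

        using System.Collections.Generic;
        using System.Linq;
        using System.Threading.Tasks;
        using Microsoft.WindowsAzure.Storage.Table;

        static class BatchedLookup
        {
            // Queries PK && (RK1 || RK2 || ...) in chunks of `batchSize` row keys.
            public static async Task<List<DynamicTableEntity>> QueryRowKeysAsync(
                CloudTable table, string partitionKey, IEnumerable<string> rowKeys, int batchSize = 100)
            {
                var results = new List<DynamicTableEntity>();
                string pkFilter = TableQuery.GenerateFilterCondition(
                    "PartitionKey", QueryComparisons.Equal, partitionKey);

                // One query per chunk of row keys; parallelism can be layered on top.
                foreach (var chunk in rowKeys.Select((rk, i) => new { rk, i })
                                             .GroupBy(x => x.i / batchSize, x => x.rk))
                {
                    // Build RK1 || RK2 || ... for this chunk.
                    string rkFilter = chunk
                        .Select(rk => TableQuery.GenerateFilterCondition(
                            "RowKey", QueryComparisons.Equal, rk))
                        .Aggregate((a, b) => TableQuery.CombineFilters(a, TableOperators.Or, b));

                    var query = new TableQuery<DynamicTableEntity>()
                        .Where(TableQuery.CombineFilters(pkFilter, TableOperators.And, rkFilter));

                    // Drain all result pages for this chunk.
                    TableContinuationToken token = null;
                    do
                    {
                        var segment = await table.ExecuteQuerySegmentedAsync(query, token);
                        results.AddRange(segment.Results);
                        token = segment.ContinuationToken;
                    } while (token != null);
                }
                return results;
            }
        }

    To get the "max 10 outstanding requests" behaviour, the per-chunk queries can be started as tasks and throttled with something like a SemaphoreSlim; that part is omitted above.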

    Test                        Batch Size  API calls   Elapsed (sec)
    (PK && RK)                  1           10000       95.76
    (PK && RK1) || (PK && RK2)  10          1000        25.94
    (PK && RK1) || (PK && RK2)  50          200         18.35
    (PK && RK1) || (PK && RK2)  100         100         17.38
    PK && (RK1 || RK2 || … )    10          1000        24.55
    PK && (RK1 || RK2 || … )    50          200         14.90
    PK && (RK1 || RK2 || … )    100         100         13.43
    

    NB: These are all within the same partition - just multiple rowkeys.

    I would have been happy to just reduce the number of API calls. But as an added benefit, the elapsed time is also significantly less, saving on compute costs (at least on my end!).

    Not too surprisingly, the batches of 100 rows delivered the best elapsed performance. There are obviously other performance considerations, especially network usage (#1 hardly uses the network at all, for example, whereas the others push it much harder).

    EDIT: Be careful when querying for many row keys. There is (of course) a URL length limitation on the query. If you exceed the length, the query will still succeed, because the service cannot tell that the URL was truncated. In our case, we limited the combined query length to about 2500 characters (URL encoded!).
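
    One rough way to stay under that limit is to chunk the row keys so that each chunk's encoded filter stays below a budget. The 2500-character budget below is just the figure from our case, and estimating the encoded length with Uri.EscapeDataString is an assumption rather than documented service behaviour:

        using System;
        using System.Collections.Generic;
        using Microsoft.WindowsAzure.Storage.Table;

        static class FilterChunker
        {
            // Splits row keys into chunks whose URL-encoded "RowKey eq '...' or" filter
            // text stays under maxEncodedChars, leaving headroom for the rest of the URL.
            public static IEnumerable<List<string>> ChunkByEncodedLength(
                IEnumerable<string> rowKeys, int maxEncodedChars = 2500)
            {
                var current = new List<string>();
                int length = 0;
                foreach (var rk in rowKeys)
                {
                    // Approximate encoded cost of this key's clause in the query string.
                    int cost = Uri.EscapeDataString(
                        TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.Equal, rk) + " or ").Length;
                    if (current.Count > 0 && length + cost > maxEncodedChars)
                    {
                        yield return current;
                        current = new List<string>();
                        length = 0;
                    }
                    current.Add(rk);
                    length += cost;
                }
                if (current.Count > 0)
                    yield return current;
            }
        }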
