CosmosDB, very long index that's also the partition key

我与影子孤独终老i 提交于 2019-12-12 16:59:29

问题


We are storing a folder tree, the number of items is huge so we have created a partition on the parent folder.

When we issue queries such as

SELECT * FROM root WHERE root.parentPath = "\\server\share\shortpath" AND root.isFile 

The RUs is very low and the performance is very good.

But, when we have a long path eg

SELECT * FROM root WHERE root.parentPath = "\\server\share\a very\long\path\longer\than\this" AND root.isFile

The RUs go up to 5000 and the performance suffers.

parentPath works well as a partition key as all queries include this field in the filter.

If I add another clause to the query it also becomes very fast, eg if I do something like and root.name = 'filename'

It's almost like it's scanning the entire partition based on the hash that's derived from it.

The Query returns NO DATA

which is fine as its someone looking for child folders under a given node, once you get deep it just gets very slow.

Query Metrics

x-ms-documentdb-query-metrics: 

totalExecutionTimeInMs=1807.61;
queryCompileTimeInMs=0.08;
queryLogicalPlanBuildTimeInMs=0.04;
queryPhysicalPlanBuildTimeInMs=0.06;
queryOptimizationTimeInMs=0.01;
VMExecutionTimeInMs=1807.11;
indexLookupTimeInMs=0.65;
documentLoadTimeInMs=1247.08;
systemFunctionExecuteTimeInMs=0.00;
userFunctionExecuteTimeInMs=0.00;
retrievedDocumentCount=72554;
retrievedDocumentSize=59561577;
outputDocumentCount=0;
outputDocumentSize=49;
writeOutputTimeInMs=0.00;
indexUtilizationRatio=0.00

From string

x-ms-documentdb-query-metrics: totalExecutionTimeInMs=1807.61;queryCompileTimeInMs=0.08;queryLogicalPlanBuildTimeInMs=0.04;queryPhysicalPlanBuildTimeInMs=0.06;queryOptimizationTimeInMs=0.01;VMExecutionTimeInMs=1807.11;indexLookupTimeInMs=0.65;documentLoadTimeInMs=1247.08;systemFunctionExecuteTimeInMs=0.00;userFunctionExecuteTimeInMs=0.00;retrievedDocumentCount=72554;retrievedDocumentSize=59561577;outputDocumentCount=0;outputDocumentSize=49;writeOutputTimeInMs=0.00;indexUtilizationRatio=0.00

回答1:


This is because of a path length limit in Indexing v1.

We have increased the path length limit to a larger value in the new index layout, therefore migrating the collections to this new layout would fix the issue and provide many performance benefit.

We have rolled out the new index layout for new collections by default. If it is possible for you to recreate the current collection and migrate existing data over there, it would be great. Otherwise, an alternative is to trigger the migration process to move existing collections to the new index layout. The following C# method can be used to do that:

static async Task UpgradeCollectionToIndexV2Async(

        DocumentClient client,

        string databaseId,

        string collectionId)

    {

        DocumentCollection collection = (await client.ReadDocumentCollectionAsync(string.Format("/dbs/{0}/colls/{1}", databaseId, collectionId))).Resource;

        collection.SetPropertyValue("IndexVersion", 2);

        ResourceResponse<DocumentCollection> replacedCollection = await client.ReplaceDocumentCollectionAsync(collection);

        Console.WriteLine(string.Format(CultureInfo.InvariantCulture, "Upgraded indexing version for database {0}, collection {1} to v2", databaseId, collectionId));

    }

It could take several hours for the migration to complete, depending on the amount of data in the collection. The issue should be addressed once it is completed.

(This was copy pasted from an email conversation we had to resolve this issue)



来源:https://stackoverflow.com/questions/54479827/cosmosdb-very-long-index-thats-also-the-partition-key

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!