Index does not work when using order().by() in Titan

Submitted by 纵饮孤独 on 2019-12-02 03:06:02

Question


The Titan documentation says that:

Mixed indexes support ordering natively and efficiently. However, the property key used in the order().by() method must have been previously added to the mixed index for native result ordering support. This is important in cases where the order().by() key is different from the query keys. If the property key is not part of the index, then sorting requires loading all results into memory.

So I created a mixed index on the prop1 property. The index works well when a value is specified.

gremlin> g.V().has('prop1', gt(1)) /* this gremlin uses the mixed index */
==>v[6017120]
==>v[4907104]
==>v[8667232]
==>v[3854400]
...

But when I use order().by() on prop1, I cannot take advantage of the mixed index.

gremlin> g.V().order().by('prop1', incr) /* doesn't use the mixed index */
17:46:00 WARN  com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx  - Query requires iterating over all vertices [()]. For better performance, use indexes
Could not execute query since pre-sorting requires fetching more than 1000000 elements. Consider rewriting the query to exploit sort orders

Also, count() takes a very long time.

gremlin> g.V().has('prop1').count()
17:44:47 WARN  com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx  - Query requires iterating over all vertices [()]. For better performance, use indexes

I'd be grateful to know what I'm doing wrong. Here is my Titan setup:

  • Titan Version: 1.0.0-hadoop1
  • Storage Backend: Cassandra 2.1.1
  • Index Backend: ElasticSearch 1.7

Thank you.


Answer 1:


You must supply a value to filter on for the indices to be used. Here:

g.V().order().by('prop1', incr)

you don't provide any filter, so Titan has to iterate over all of V() and then apply the sort.

Here:

g.V().has('prop1').count()

you supply an indexed key but don't specify a value to filter on, so it is still iterating over all of V(). You could do:

g.V().has("prop1", textRegex(".*")).count()

In this case you fake Titan out a bit, but the query could still be slow if it returns a lot of results to iterate over.
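
The same idea can be applied to the order().by() query. As a rough sketch, reusing the range filter from the question gives Titan an indexed condition to start from, after which the sort on the same key can be handed to the mixed index. The bound gt(1) and the limit(10) are illustrative assumptions rather than values from the answer, and whether the index is really used should be checked on your own graph (the "iterating over all vertices" warning should no longer appear).

```groovy
// Filter on the indexed key first so the query starts from the mixed index,
// then order by that same indexed key.
// limit(10) is an arbitrary page size chosen for illustration.
g.V().has('prop1', gt(1)).order().by('prop1', incr).limit(10)
```

Keeping a limit on the result also avoids pulling the whole sorted result set back from the index backend.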



Source: https://stackoverflow.com/questions/34285006/index-does-not-work-when-using-order-by-in-titan
