Lucene in Neo4j has some misbehaviours in terms of reliable search queries - compared to OrientDB

谁说我不能喝 posted on 2019-12-03 20:28:46

Here is what I have found out about my issues so far:

Notes on Queries #0, #1 and #2:

  1. It is not possible to change the TF-IDF scoring in Neo4j. It uses its own implementation, which cannot be changed.
  2. In OrientDB, ordering in combination with a Lucene search is currently slow:

    SELECT FROM (
      SELECT title,ID FROM Appln WHERE title LUCENE "solar*" ORDER BY ID ASC
    )  LIMIT 1
    
    Query executed in 11.531 sec. Returned 1 record(s)
    
    
    SELECT FROM (
      SELECT title,ID FROM Appln WHERE title LUCENE "solar*" ORDER BY ID ASC
    )  LIMIT 10
    
    Query executed in 225.176 sec. Returned 10 record(s)
    

    The reason it is that slow is that the ordering does not cooperate with Lucene.

Fixing Queries #3, #4 and #5:

The query is not correct. The equals form (=) is an exact match, not a fuzzy one. So

    START n=node:titles(title="solar panel") RETURN n.title,n.ID ORDER BY n.ID ASC LIMIT 10

needs to be replaced by

    START n=node:titles('title:solar\\ panel') RETURN n.title,n.ID ORDER BY n.ID ASC LIMIT 10
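
The escaping in the query above does not have to be typed by hand. Below is a minimal sketch in Python that generates it, under some assumptions: the helper names (`escape_lucene`, `cypher_string_literal`, `exact_title_query`) are made up for illustration, and the list of special characters is my reading of what the Lucene classic query parser treats as syntax, so it may need adjusting for your Lucene version. The space is escaped as well so that the two words stay one term.

```python
# Characters assumed to be syntax for the Lucene classic query parser,
# plus the space, so "solar panel" is kept together as a single term.
LUCENE_SPECIAL = set('\\+-!():^[]"{}~*?|&/ ')


def escape_lucene(text):
    """Put a backslash in front of every character Lucene would interpret."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL else ch for ch in text)


def cypher_string_literal(value):
    """Wrap a value in single quotes, escaping backslashes and quotes for Cypher."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def exact_title_query(index, field, text):
    """Build the START query for one escaped term (hypothetical helper)."""
    lucene = f"{field}:{escape_lucene(text)}"
    return (
        f"START n=node:{index}({cypher_string_literal(lucene)}) "
        "RETURN n.title,n.ID ORDER BY n.ID ASC LIMIT 10"
    )


if __name__ == "__main__":
    print(exact_title_query("titles", "title", "solar panel"))
```

Two levels of escaping are involved: one backslash for Lucene, which is then doubled because it sits inside a Cypher string literal. That is where the `\\` in the query comes from.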

It is really awkward that you need to escape things in Cypher. Here the order of the two words matters. But there is another way to express it:

    START n=node:titles('title:SoLar AND title:Panel') RETURN n.title,n.ID ORDER BY n.ID ASC LIMIT 10

This is also awkward: if you imagine you have an arbitrary string and just want to ask Neo4j for results, you need a parser to build the query. Here, however, the order of the words does not matter.
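
The "parser" mentioned above can be quite small for the simple case. The following Python sketch splits a free-text string on whitespace and joins the words into an AND query of the form shown above. The function names (`escape_term`, `and_query`, `cypher_for`) are hypothetical, the special-character list is an assumption about the Lucene classic query parser, and input words that are themselves operators (AND, OR, NOT) are not handled.

```python
# Characters assumed to be syntax for the Lucene classic query parser.
LUCENE_SPECIAL = set('\\+-!():^[]"{}~*?|&/')


def escape_term(term):
    """Backslash-escape Lucene syntax characters inside a single word."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL else ch for ch in term)


def and_query(field, text):
    """Turn 'SoLar Panel' into 'title:SoLar AND title:Panel'."""
    terms = text.split()
    if not terms:
        raise ValueError("search text must contain at least one word")
    return " AND ".join(f"{field}:{escape_term(t)}" for t in terms)


def cypher_for(index, field, text):
    """Embed the Lucene query in a single-quoted Cypher string literal."""
    lucene = and_query(field, text)
    literal = "'" + lucene.replace("\\", "\\\\").replace("'", "\\'") + "'"
    return (
        f"START n=node:{index}({literal}) "
        "RETURN n.title,n.ID ORDER BY n.ID ASC LIMIT 10"
    )


if __name__ == "__main__":
    print(cypher_for("titles", "title", "SoLar Panel"))
```

Because every word becomes its own clause, the result does not depend on the order of the words in the input string.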

Fixing Query #6:

OrientDB is currently working on making counting faster (milliseconds). This is planned for the 2.0 release, due in a few days.

Neo4j has no plans for this.
