SolrClient python update document

Submitted by 天涯浪子 on 2020-01-07 02:29:33

Question


I'm trying to write a small Python program that uses SolrClient to index some files.

I want to index the content of some files and then add attributes to enrich each document. I indexed the files with the post command-line tool, and then tried to enrich the documents from a Python program, something like this:

doc = solr.get('collection', id)
doc['new_attribute'] = 'value'
solr.index_json('collection',json.dumps([doc]))
solr.commit(openSearcher=True)

The problem is that the index of the file content seems to be lost. If I query for a word that appears in the attributes of the document, I find it.

If I query for a word that appears only in the file content, nothing is found (the same query works when I index the file with post and skip my update attempt).

I'm not sure how to update the document while keeping the index created by the post command.

I hope this is clear enough; maybe I have misunderstood how this works.

Thanks a lot


Answer 1:


If I understand correctly, you want to modify an existing record. You should be able to do something like this without using a solr.get:

doc = [{'id': 'value', 'new_attribute':{'set': 'value'}}]
solr.index_json('collection', json.dumps(doc))

See also: https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents
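One possible explanation for the lost content, offered as an assumption rather than a diagnosis of this setup: Solr replaces the whole document when it is re-indexed, so a field that is indexed but not stored is not returned by solr.get and is dropped when the fetched document is posted back. Atomic updates have a related requirement: Solr rebuilds the document from its stored (or docValues) fields, so non-stored fields can be lost there too. The sketch below only builds the atomic-update payload; build_atomic_update is a hypothetical helper, not part of SolrClient.

```python
import json


def build_atomic_update(doc_id, fields, id_field="id"):
    """Return a JSON string for a Solr atomic update that sets each field.

    Hypothetical helper for illustration; it is not part of SolrClient.
    """
    doc = {id_field: doc_id}
    for name, value in fields.items():
        # The "set" modifier asks Solr to replace only this field's value.
        doc[name] = {"set": value}
    # The JSON update handler accepts a list of documents.
    return json.dumps([doc])


payload = build_atomic_update("value", {"new_attribute": "value"})
print(payload)
# The payload could then be sent with the call used in this thread:
# solr.index_json('collection', payload)
```

Note that the document list is serialized once; wrapping an existing list in another list before json.dumps produces a nested array, which is not the shape shown above.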




Answer 2:


Trying with curl did not change anything. I took a different approach and now it works. Instead of adding the file with the post command and modifying it afterwards, I read the file into a string and index it in a "content" field. This means every document is added in one shot.

The content field is defined as not stored, so it is only indexed.

It works fine and suits my needs. It is also simpler, since it drops many attributes set by the post command that I don't need.
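A minimal sketch of that one-shot approach, assuming plain-text files (binary formats such as PDF would need text extraction first). The helper name build_document, the sample file, and the attribute names are hypothetical; only the "content" field name and the index_json and commit calls come from this thread.

```python
import json
import tempfile
from pathlib import Path


def build_document(path, doc_id, attributes, content_field="content"):
    """Combine a file's text and extra attributes into one Solr document."""
    doc = {"id": doc_id}
    doc.update(attributes)
    # Read the whole file as text; errors="replace" avoids failing on odd bytes.
    doc[content_field] = Path(path).read_text(encoding="utf-8", errors="replace")
    return doc


# Demo with a temporary file standing in for a real document.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("hello solr", encoding="utf-8")
    doc = build_document(sample, "doc-1", {"new_attribute": "value"})

payload = json.dumps([doc])
# With a live Solr instance, the payload could be sent in one shot:
# solr.index_json('collection', payload)
# solr.commit('collection', openSearcher=True)
```

Because the content and the attributes arrive in the same document, nothing has to be read back from Solr, so it does not matter that the content field is not stored.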

If I find some time, I'll try the partial update again and update this post.

Thanks Rémi




Answer 3:


It worked for me this way; it may be useful to someone:

import json

from SolrClient import SolrClient

solrConect = SolrClient("http://xx.xx.xxx.xxx:8983/solr/")
# Atomic update: "set" replaces only count_related_like on document my_id
doc = [{'id': 'my_id', 'count_related_like': {'set': 10}}]
solrConect.index_json("my_collection", json.dumps(doc))
solrConect.commit("my_collection", softCommit=True)


Source: https://stackoverflow.com/questions/42092635/solrclient-python-update-document
