whoosh

Whoosh - accessing search_page result items throws ReaderClosed exception

半世苍凉 submitted on 2021-01-27 16:32:26
Question: The following is a simple pagination function.

    from whoosh import index

    def _search(q):
        wix = index.open_dir(settings.WHOOSH_INDEX_DIR)
        term = Term("title", q) | Term("content", q)
        page_id = 1
        with wix.searcher() as s:
            return s.search_page(term, page_id, pagelen=settings.ITEMS_PER_PAGE)

    In [15]: p = _search("like")
    In [16]: p.results[0].reader.is_closed
    Out[16]: True

If I try to access an attribute of a Hit, I get a ReaderClosed exception.

    In [19]: p.results
    Out[19]: <Top 10 Results for Or([Term(

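Python word segmentation tool: jieba

爷,独闯天下 submitted on 2020-05-08 10:24:18
Introduction to jieba

Python is used more and more widely in data mining. To do text analysis in Python, word segmentation is an indispensable step. Among Python's third-party packages, jieba is arguably the leader in word segmentation. GitHub: https://github.com/fxsjy/jieba

Installation

- Fully automatic: easy_install jieba, or pip install jieba / pip3 install jieba
- Semi-automatic: download from http://pypi.python.org/pypi/jieba/ , unpack, then run python setup.py install
- Manual: place the jieba directory in the current directory or in the site-packages directory

Main algorithms

- Efficient word-graph scanning based on a prefix dictionary, producing a directed acyclic graph (DAG) of all the possible words that the Chinese characters in a sentence can form
- Dynamic programming to find the maximum-probability path, which gives the best segmentation based on word frequency
- For out-of-vocabulary words, an HMM based on the word-forming capability of Chinese characters, decoded with the Viterbi algorithm

Features

- Three segmentation modes:
  - Precise mode tries to cut the sentence as accurately as possible and is suited to text analysis;
  - Full mode scans out every word that can be formed in the sentence; it is very fast but cannot resolve ambiguity;
  - Search-engine mode builds on precise mode and cuts long words again to improve recall, which suits search-engine tokenization.
- Supports Traditional Chinese segmentation
- Supports custom dictionaries
- MIT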
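The first two algorithm steps (building a DAG from a dictionary, then using dynamic programming to pick the maximum-probability path) can be sketched in plain Python. This is a toy illustration and not jieba's actual code: the dictionary `FREQ` and its counts are made up, the helper names are hypothetical, and the HMM step for out-of-vocabulary words is left out.

```python
import math

# Toy word-frequency dictionary (made-up counts, not jieba's real data).
FREQ = {
    "研究": 50, "研究生": 30, "生命": 40, "的": 100, "起源": 20,
    "研": 2, "究": 2, "生": 8, "命": 5, "起": 3, "源": 3,
}
TOTAL = sum(FREQ.values())


def build_dag(sentence):
    """Map each start index to the end indices of dictionary words starting there."""
    dag = {}
    n = len(sentence)
    for i in range(n):
        ends = [j for j in range(i, n) if sentence[i:j + 1] in FREQ]
        dag[i] = ends or [i]  # unknown character: treat it as a one-character word
    return dag


def best_route(sentence, dag):
    """Right-to-left dynamic programming over the DAG using log probabilities."""
    n = len(sentence)
    log_total = math.log(TOTAL)
    route = {n: (0.0, 0)}
    for i in range(n - 1, -1, -1):
        route[i] = max(
            (math.log(FREQ.get(sentence[i:j + 1], 1)) - log_total + route[j + 1][0], j)
            for j in dag[i]
        )
    return route


def segment(sentence):
    """Follow the best route from the start of the sentence and collect the words."""
    route = best_route(sentence, build_dag(sentence))
    words, i = [], 0
    while i < len(sentence):
        j = route[i][1]
        words.append(sentence[i:j + 1])
        i = j + 1
    return words


print(segment("研究生命的起源"))
```

With these toy counts the path 研究 / 生命 / 的 / 起源 scores higher than 研究生 / 命 / 的 / 起源, because the product of the word probabilities is larger.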

django haystack whoosh not showing any errors also no results

本秂侑毒 submitted on 2020-03-23 23:59:20
Question: I am trying to use django-haystack with Whoosh. The search runs, but the page shows no results. I have also looked at these similar questions:

    Django-haystack-whoosh is giving no results
    Django Haystack & Whoosh Search Working, But SearchQuerySet Return 0 Results

pip freeze:

    python==3.6
    django==2.0.7
    Whoosh==2.7.4
    -e git://github.com/django-haystack/django.haystack.git#egg=django-haystack

Here is my code:

search_indexes.py

    import datetime
    from haystack import indexes
    from search

How flask-whooshalchemy index data imported manually?

﹥>﹥吖頭↗ submitted on 2020-01-23 18:21:49
Question: I'm using flask-whooshalchemy on SQLite and manually imported a lot of data; now Whoosh can't find any of it. I think that's because Whoosh hasn't indexed any of the data, right? How can I build the Whoosh index for that data manually?

Answer 1: Have a look at https://gist.github.com/davb5/21fbffd7a7990f5e066c I've just written this to solve the same issue: rebuilding search indices after a bulk data import. It won't work out of the box for anyone else (my "lib" import contains all of my third party

How to use n-grams in whoosh

蹲街弑〆低调 submitted on 2020-01-14 09:34:05
Question: I'm trying to use n-grams to get "autocomplete-style" searches using Whoosh. Unfortunately I'm a little confused. I have made an index like this:

    if not os.path.exists("index"):
        os.mkdir("index")
    ix = create_in("index", schema)
    ix = open_dir("index")
    writer = ix.writer()
    q = MyTable.select()
    for item in q:
        print('adding %s' % item.Title)
        writer.add_document(title=item.Title, content=item.content, url=item.URL)
    writer.commit()

I then search the title field like this:

    querystring = 'my

Index related table using Haystack/Whoosh

吃可爱长大的小学妹 submitted on 2020-01-06 03:06:45
Question: How can I index a related table?

    class Foo(models.Model):
        name = models.CharField(max_length=50)

    class FooImg(models.Model):
        image = models.ImageField(upload_to='img/', default='img/no-img.jpg', verbose_name='Image')
        foo = models.ForeignKey(Foo, default=None, null=True, blank=True)

I want to index FooImg so that I can get the images associated with Foo. I have already indexed Foo, and that works perfectly fine: it returns the expected result. In my template I have:

    {% for r in foo_search %}

django-haystack + Whoosh SearchQuerySet().all() always None

删除回忆录丶 submitted on 2020-01-05 05:55:19
Question: I am using:

    django: 1.9.7
    django-haystack: 2.5.0
    whoosh: 2.7.4

search_index.py

    class ProfileIndex(indexes.SearchIndex, indexes.Indexable):
        text = indexes.CharField(document=True, use_template=True)
        last_name = indexes.CharField(model_attr='last_name')
        content_auto = indexes.EdgeNgramField(model_attr='first_name')

        def get_model(self):
            return User

        def index_queryset(self, using=None):
            """Used when the entire index for model is updated."""
            return self.get_model().objects.all()

user_text.txt

    {{

Where does Whoosh (Python) physically store the indexed content?

一笑奈何 submitted on 2020-01-04 09:19:09
Question: I am beginning to research content indexing implementations and was having a look at Whoosh (https://pypi.python.org/pypi/Whoosh/). I am curious to know where Whoosh physically stores its content. Is it using files?

Answer 1: Whoosh uses a pluggable storage system; if you use the create_in() function, a FileStorage() class is used that stores indexes in files in a directory. See the Whoosh quickstart:

    Once you have the schema, you can create an index using the create_in function:

        import os

Haystack search on a many to many field is not working

巧了我就是萌 submitted on 2020-01-02 03:29:33
Question: I'm trying to run a search on a model that has a many-to-many field, and I want to filter the search using this field. Here is my current code:

search_indexes.py

    class ListingInex(indexes.SearchIndex, indexes.Indexable):
        text = indexes.CharField(document=True, use_template=True)
        business_name = indexes.CharField(model_attr='business_name')
        category = indexes.MultiValueField(indexed=True, stored=True)
        city = indexes.CharField(model_attr='city')
        neighborhood = indexes.CharField(model_attr=

Django + Haystack + Whoosh, no results in production

耗尽温柔 submitted on 2019-12-25 04:04:00
Question: I'm building a Django application using Haystack+Whoosh for search. In the development environment, search works as expected. However, in production, searches consistently return no results.

Development:

    $> python manage.py rebuild_index
    ...
    All documents removed.
    Indexing 8 categories
    Indexing 4 documents
    $> python manage.py shell
    ...
    >>> from haystack.query import SearchQuerySet
    >>> SearchQuerySet().all().count()
    12

Production:

    $> dokku run proj python manage.py rebuild_index -v2
    ...
    All