sphinx

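[coreseek/sphinx study notes 1] -- Introduction

假装没事ソ submitted on 2019-12-01 21:26:30
[Based on the Coreseek Full-Text Search Server 2.0 (Sphinx 0.9.8) reference manual; for details see http://www.coreseek.cn/docs/sphinx_doc_zhcn_0.9.pdf ]

1.1 What is Sphinx
Sphinx is an acronym for SQL Phrase Index, but it unfortunately shares its name with CMU's Sphinx project. Coreseek Full-Text Search Server 2.0 is full-text search software built on top of Sphinx and released under the GPLv2 license.

1.2 Features:
(1) Fast indexing (peak performance of up to 10 MB/s on modern CPUs);
(2) High-performance searching (average response time per query under 0.1 s on 2 to 4 GB of text data);
(3) Handles very large data sets (known to handle more than 100 GB of text data, and 100 M documents on a single-CPU system);
(4) Good relevance ranking, using a composite ranking method based on phrase proximity and statistical (BM25) ranking;
(5) Supports distributed searching;
(6) Provides document excerpt generation;
(7) Can serve searches as a MySQL storage engine;
(8) Supports multiple query modes, including boolean, phrase, and word-proximity queries;
(9) Supports multiple full-text fields per document (up to 32);
(10) Supports multiple additional attributes per document (for example, group information, timestamps);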

[coreseek/sphinx study notes 6] -- Example code

我的梦境 submitted on 2019-12-01 21:26:15
[This is a coreseek/sphinx search I implemented at work, slightly modified; suggestions for optimization are welcome]

1. Environment
Install directory: /var/coreseek/coreseek
Configuration file: (./etc/sphinx.conf)

#searchd service configuration
searchd
{
    port = 9312
    log = /var/coreseek/coreseek/var/log/searchd.log
    query_log = /var/coreseek/coreseek/var/log/query.log
    read_timeout = 5
    max_children = 30
    pid_file = /var/coreseek/coreseek/var/log/searchd.pid
    max_matches = 1000
    seamless_rotate = 1
    preopen_indexes = 0
    unlink_old = 1
    listen = localhost:9306:mysql41
}

#data source configuration
source ztm_sheet_xml_source
{
    type = xmlpipe
    xmlpipe_command = cat /var/www/htdocs/myweb/application/data/sphinxforsheet.pipe.xml
    xmlpipe_field = sh

solr or sphinx? which is better? [duplicate]

百般思念 submitted on 2019-12-01 19:45:58
Question: Possible Duplicate: Choosing a stand-alone full-text search server: Sphinx or SOLR? I will use it to do full-text search in my Ruby on Rails app. Which is the best choice? Solr uses Java to do this job. Or Sphinx, in Ruby?

Answer 1: I have no experience with Solr, but Sphinx is easy to install, fast and works great with Thinking Sphinx: http://freelancing-god.github.com/ts/en/indexing.html There is also a good railscast: http://railscasts

Sphinx and “did you mean … ?” suggestions idea. Will it work?

落花浮王杯 submitted on 2019-12-01 17:47:35
I'm trying to come up with the fastest way to make search suggestions. At first I thought a Levenshtein UDF combined with a MySQL table would do the job. But with Levenshtein, MySQL would have to go over every row in the table (tons of words), which would make the query really slow. I recently installed and started to use Sphinx (http://sphinxsearch.com/) for full-text searching, mainly because of its performance and its tight MySQL integration through SphinxSE. So I asked myself whether I could implement a "did you mean" algorithm using Sphinx to boost performance somehow, and I think I found a
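To make the cost described above concrete, here is a minimal Python sketch of the naive approach: compute the Levenshtein distance between the query word and every word in a dictionary. The function names `levenshtein` and `suggest`, the sample word list, and the `max_distance` threshold are illustrative assumptions, not part of the original post or of Sphinx. The point is only that each lookup touches every entry, which is what makes the plain MySQL UDF approach slow on a large table.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # drop a character of a
                current[j - 1] + 1,      # add a character of b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def suggest(word: str, dictionary: list[str], max_distance: int = 2) -> list[str]:
    """Naive "did you mean": score the word against every dictionary entry."""
    scored = [(levenshtein(word, candidate), candidate) for candidate in dictionary]
    # Exact matches (distance 0) are not suggestions; closest candidates come first.
    return [candidate for distance, candidate in sorted(scored)
            if 0 < distance <= max_distance]


if __name__ == "__main__":
    # Hypothetical word list, for illustration only.
    print(suggest("sphix", ["sphinx", "solr", "search"]))
```

A faster design would first narrow the candidate set (for example with an index lookup) and only then run the distance function on the few words that remain.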

When updating an index in sphinx.conf is restarting searchd in sphinx always required?

给你一囗甜甜゛ submitted on 2019-12-01 06:23:18
If I update a resource in my sphinx.conf file I can reindex with --rotate and everything works fine. If I update an index in my sphinx.conf or add a new index, --rotate has no effect and I have to restart searchd. Am I doing this correctly? I feel like --rotate should correctly index the new or modified index configurations.

It depends on your Sphinx version. In the latest versions just about anything (except maybe the searchd config section) will work with changing the config file. If you are only changing the settings on an individual index, a --rotate indexing of that particular index is enough. If you
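As a small illustration of the two reindexing variants discussed above, the following Python sketch builds the `indexer` command line for rotating either one index or all of them. The helper name `indexer_command`, the config path, and the index name are hypothetical; `--config`, `--rotate`, and `--all` are standard `indexer` options. Whether searchd picks up a newly added index without a restart depends on the Sphinx version, as the answer notes.

```python
import shlex
from typing import List, Optional


def indexer_command(config_path: str, index_name: Optional[str] = None) -> List[str]:
    """Build an argv list for a rotating reindex.

    With index_name=None every index in the config is rebuilt (--all);
    otherwise only the named index is rebuilt.
    """
    argv = ["indexer", "--config", config_path, "--rotate"]
    argv.append(index_name if index_name is not None else "--all")
    return argv


if __name__ == "__main__":
    # Hypothetical path and index name, for illustration only.
    print(shlex.join(indexer_command("/etc/sphinx/sphinx.conf", "my_index")))
    print(shlex.join(indexer_command("/etc/sphinx/sphinx.conf")))
```

The resulting list can be passed to `subprocess.run` on a machine where `indexer` is installed.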

When updating an index in sphinx.conf is restarting searchd in sphinx always required?

一曲冷凌霜 submitted on 2019-12-01 05:30:57
Question: If I update a resource in my sphinx.conf file I can reindex with --rotate and everything works fine. If I update an index in my sphinx.conf or add a new index, --rotate has no effect and I have to restart searchd. Am I doing this correctly? I feel like --rotate should correctly index the new or modified index configurations.

Answer 1: It depends on your Sphinx version. In the latest versions just about anything (except maybe the searchd config section) will work with changing the config file. Just

Site-wide full-text search with lnmp + coreseek (installation)

≡放荡痞女 submitted on 2019-11-30 17:30:17
Software packages

Install environment
System environment: centos7.2, 1 core, 2 GB
Software environment: coreseek-3.2.14, lnmp1.5

Installing mmseg
Update the dependencies and install the build environment:
    yum -y install m4 autoconf automake libtool
    yum -y install gcc gcc-c++ wget
    yum -y install mysql-devel

Installing coreseek
    tar -xzvf coreseek-3.2.14.tar.gz
    cd coreseek-3.2.14
    cd mmseg-3.2.14/
    ./bootstrap
    ./configure --prefix=/usr/local/mmseg3
    make
    make install
    cd ../csft-3.2.14/
    sh buildconf.sh
    ./configure --prefix=/usr/local/coreseek --without-python --without-unixodbc --with-mmseg --with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg3/lib/ --with-mysql --host=arm
    make
    make install

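[coreseek/sphinx study notes 2] -- Installation

↘锁芯ラ submitted on 2019-11-30 17:30:03
[Based on the Coreseek Full-Text Search Server 2.0 (Sphinx 0.9.8) reference manual; for details see http://www.coreseek.cn/docs/sphinx_doc_zhcn_0.9.pdf ]

2.1 Platforms
At this stage, the Windows version of Sphinx can be used for testing and debugging, but it is not recommended for production systems. The two most prominent problems are: (1) no support for concurrent queries; (2) no support for hot-swapping index data. Although some production deployments have successfully worked around both problems, running Sphinx on Windows for a high-load search service is still not recommended.

2.2 Installation steps
(1) Extract the tar package you downloaded and enter the sphinx subdirectory:
    $ tar xzvf sphinx-0.9.7.tar.gz
    $ cd sphinx
(2) Run the configure program:
    $ ./configure
    configure has many options. The complete list can be obtained with the --help switch. The most important ones are:
    --prefix, which specifies where to install Sphinx;
    --with-mysql, which specifies where to find the MySQL header and library files when auto-detection fails;
    --with-pgsql, which specifies where to find the PostgreSQL header and library files.
(3) Build the binaries:
    $ make
(4) Install the binaries into the directory you chose:
    $ make install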

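[coreseek/sphinx study notes 3] -- Indexing

做~自己de王妃 submitted on 2019-11-30 17:29:51
[Based on the Coreseek Full-Text Search Server 2.0 (Sphinx 0.9.8) reference manual; for details see http://www.coreseek.cn/docs/sphinx_doc_zhcn_0.9.pdf ]

3.1 Data sources
The data to be indexed is a collection of structured documents, each of which is a collection of fields. If really necessary, the data for one index can come from multiple data sources. The data is processed strictly in the order defined in the configuration file. All documents fetched from these sources are merged to produce a single index, as if they came from one source.

3.2 Attributes
Attributes are additional pieces of information (values) attached to each document that can be used for filtering and sorting at search time. The attribute types currently supported are:
Unsigned integers (1 to 32 bits wide);
UNIX timestamps;
Floating-point values (32-bit, IEEE 754 single precision);
String ordinals (specifically, computed integer values);
MVA (multi-value attributes): variable-length lists of 32-bit unsigned integers.

3.3 Multi-value attributes (MVA)
MVAs (multi-valued attributes) are an important special case of document attributes. They make it possible to attach a list of values to a document as an attribute, which is very useful for article tags, product categories, and so on. MVAs support filtering and grouping (but not group sorting). Currently, the values in an MVA list are limited to 32-bit unsigned integers. The list length is not limited, as long as there is enough RAM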
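The document model described above (full-text fields plus attributes, with an MVA holding a list of 32-bit unsigned integers) can be sketched in Python as follows. This is a conceptual illustration only, not Sphinx's API or storage format: the `Document` class, the `filter_by_mva` helper, and the sample tag IDs are all hypothetical, and the "match if any value in the list is wanted" rule is an assumption about how an MVA filter behaves.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set

UINT32_MAX = 2**32 - 1  # MVA entries are 32-bit unsigned integers


@dataclass
class Document:
    doc_id: int
    fields: Dict[str, str]                                   # full-text fields
    attrs: Dict[str, int] = field(default_factory=dict)      # scalar attributes
    mva: Dict[str, List[int]] = field(default_factory=dict)  # multi-value attributes

    def __post_init__(self) -> None:
        # Reject values that would not fit in a 32-bit unsigned integer.
        for name, values in self.mva.items():
            for value in values:
                if not 0 <= value <= UINT32_MAX:
                    raise ValueError(f"MVA {name!r} value {value} is not a uint32")


def filter_by_mva(docs: Iterable[Document], name: str, wanted: Set[int]) -> List[Document]:
    """Keep documents whose MVA `name` contains at least one wanted value (assumed semantics)."""
    return [doc for doc in docs if wanted.intersection(doc.mva.get(name, []))]


if __name__ == "__main__":
    # Hypothetical articles tagged with numeric tag IDs.
    docs = [
        Document(1, {"title": "Installing Sphinx"}, {"published": 1575000000}, {"tags": [10, 20]}),
        Document(2, {"title": "Tuning searchd"}, {"published": 1575100000}, {"tags": [30]}),
    ]
    print([doc.doc_id for doc in filter_by_mva(docs, "tags", {20, 30})])
```

Tags or categories are stored as numeric IDs here because the MVA values themselves must be integers; the mapping from ID to label would live elsewhere, for example in the source database.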

Problem running Thinking Sphinx with Rails 2.3.5

﹥>﹥吖頭↗ submitted on 2019-11-30 17:10:49
Question: I just installed Sphinx (distro: archlinux) from the downloaded source. Then I installed the "Thinking Sphinx" plugin for Rails. I followed the setup on the official page and this screencast from Ryan Bates, but when I try to index the models it gives me this error:

$ rake thinking_sphinx:index
(in /home/benoror/Dropbox/Proyectos/cotizahoy)
Sphinx cannot be found on your system. You may need to configure the following settings in your config/sphinx.yml file:
* bin_path
* searchd_binary_name
* indexer_binary