Does Sphinx auto update is index when you add data to your SQL?

喜欢而已 提交于 2019-12-03 20:17:50

As found in the sphinx documentation part about real-time indexes

Real-time indexes (or RT indexes for brevity) are a new backend that lets you insert, update, or delete documents (rows) on the fly.

So to update an index on the fly you would just need to make a query like

{INSERT | REPLACE} INTO index [(column, ...)]
VALUES (value, ...)
[, (...)]

The answer is no and you need to tell sphinx to reindex your db.

There are some steps and requirements which you need to know:

  1. Main and delta are requirement
  2. First run you need to index your main index.
  3. After the first run, you may index delta by rotating it ( to make sure the service is running and the data on the web is can be used at the time )
  4. Before you go further step, you need to create a table to mark your "last indexed rows". THe last indexed rows ID could be used for the next indexing delta and merging delta into main.
  5. You need to merge your delta index to the main index. as inside the sphinx documents http://sphinxsearch.com/docs/current.html#index-merging
  6. Restart sphinx service.

    TIPS: Create your own program that could execute the index by using C# or other languages. You may try the task schedule of windows also can.

Here is my conf:

source Main
{
type            = mysql

sql_host        = localhost
sql_user        = root
sql_pass        = password
sql_db          = table1
sql_port        = 3306  # optional, default is 3306
sql_query_pre = REPLACE INTO table1.sph_counter SELECT 1, MAX(PageID) FROM table1.pages;
sql_query       = \
    SELECT  pd.`PageID`, pd.Status from table1.pages pd
    WHERE pd.PageID>=$start AND pd.PageID<=$end \
    GROUP BY pd.`PageID`

sql_attr_uint       = Status

sql_query_info      = SELECT * FROM table1.`pages` pd WHERE pd.`PageID`=$id
sql_query_range     = SELECT MIN(PageID),MAX(PageID)\
              FROM tabl1.`pages`
sql_range_step      = 1000000
}


source Delta : Main
{
sql_query_pre = SET NAMES utf8

sql_query = \
    SELECT  PageID, Status from pages \
    WHERE PageID>=$start AND PageID<=$end 

sql_attr_uint       = Status

sql_query_info      = SELECT * FROM table1.`pages` pd WHERE pd.`PageID`=$id
sql_query_range     = SELECT (SELECT MaxDoc FROM table1.sph_counter WHERE ID = 1) MinDoc,MAX(PageID) FROM table1.`pages`;
sql_range_step      = 1000000
}


index Main
{
source          = Main
path            = C:/sphinx/data/Main
docinfo         = extern
charset_type        = utf-8
}


index Delta : Main
{
    source = Delta
path = C:/sphinx/data/Delta
charset_type = utf-8
}

To expand on Anne's answer - if you're using SQL indices, it won't update automatically. You can manage the process of reindexing after every change - but that can be expensive. One way to get around this is have a core index with everything, and then a delta index with the same structure that indexes just the changes (this could be done by a boolean or timestamp column).

That way, you can just reindex the delta index (which is smaller, and thus faster) on a super-regular basis, and then process both core and delta together less regularly (but still, best to do it at least daily).

But otherwise, the new RT indices are worth looking at - you still need to update things yourself, and it's not tied to the database, so it's a different mindset. Also: RT indices don't have all the features that SQL indices do, so you'll need to decide what's more important.

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!