How to correctly create thesaurus dictionary for my custom text search configuration

Submitted by 生来就可爱ヽ(ⅴ<●) on 2020-08-10 18:57:37

Question


I use PostgreSQL 11.8, running in the docker image postgres:11-alpine. I want to create a custom full text search dictionary for multi-word expressions, so that, for example, hello world becomes hw.

First of all I have a custom full text search configuration my_swedish:

CREATE TEXT SEARCH CONFIGURATION my_swedish (
   COPY = swedish
);

ALTER TEXT SEARCH CONFIGURATION my_swedish
   DROP MAPPING FOR hword_asciipart;
ALTER TEXT SEARCH CONFIGURATION my_swedish
   DROP MAPPING FOR hword_part;

For this configuration I want to create and use a thesaurus dictionary. Following the PostgreSQL manual, I ran:

CREATE TEXT SEARCH DICTIONARY thesaurus_my_swedish (
    TEMPLATE = thesaurus,
    DictFile = thesaurus_my_swedish,
    Dictionary = pg_catalog.swedish_stem
);

but got this error:

ERROR:  could not open thesaurus file "/usr/local/share/postgresql/tsearch_data/thesaurus_my_swedish.ths": No such file or directory

I then created the file manually:

touch /usr/local/share/postgresql/tsearch_data/thesaurus_my_swedish.ths

Then I tried to map the dictionary to the configuration:

ALTER TEXT SEARCH CONFIGURATION my_swedish
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart
    WITH thesaurus_my_swedish;

which failed with:

ERROR:  text search configuration "my_swedish" does not exist

When I changed it to the default swedish configuration:

ALTER TEXT SEARCH CONFIGURATION swedish
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart
    WITH thesaurus_my_swedish;

I got the error:

ERROR:  text search dictionary "thesaurus_my_swedish" does not exist

How do I correctly create a thesaurus dictionary for my custom text search configuration?

UPDATE: I added the rule hello world : hw to my file thesaurus_my_swedish.ths, and now

SELECT to_tsvector('my_swedish', 'hello world');

returns 'hw':1.

But what about other words? to_tsvector('my_swedish', 'hello test') returns an empty result, whereas it should return the same as the default swedish configuration:

SELECT to_tsvector('swedish', 'hello test');
'hello':1 'test':2

What is wrong?

UPDATE:

I understand now: I need to add pg_catalog.swedish_stem to the mapping too:

ALTER TEXT SEARCH CONFIGURATION my_swedish
   ALTER MAPPING FOR asciihword, asciiword, hword, word
   WITH thesaurus_my_swedish, pg_catalog.swedish_stem;

Answer 1:


You did everything right, with a few exceptions:

  • thesaurus_my_swedish.ths should not be empty, but contain rules like this (taken from your example):

    hello world : hw
    
  • You should use the new dictionary for all token types that currently use swedish_stem, and keep swedish_stem after it so that words the thesaurus does not recognize are still stemmed:

    ALTER TEXT SEARCH CONFIGURATION my_swedish
       ALTER MAPPING FOR asciihword, asciiword, hword, word
       WITH thesaurus_my_swedish, swedish_stem;
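
One way to confirm that the mapping behaves as intended is to look at how each token is processed. The following is only a sketch, assuming the my_swedish configuration and the thesaurus_my_swedish dictionary from the question exist in the current database. The option name dummy is arbitrary: as far as the PostgreSQL documentation on dictionaries describes it, a no-op ALTER like this makes existing sessions re-read the edited .ths file.

```sql
-- Make sessions re-read thesaurus_my_swedish.ths after editing it.
-- "dummy" is not a real option; removing a nonexistent option changes nothing.
ALTER TEXT SEARCH DICTIONARY thesaurus_my_swedish (dummy);

-- For each token, show the dictionaries consulted and the one that produced the lexemes.
SELECT alias, token, dictionaries, dictionary, lexemes
FROM ts_debug('my_swedish', 'hello world');

SELECT alias, token, dictionaries, dictionary, lexemes
FROM ts_debug('my_swedish', 'hello test');
```

A token that none of the listed dictionaries recognizes is dropped, which would explain why hello test produced an empty tsvector while the thesaurus was the only dictionary in the mapping.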
    

This error is mysterious and should not have happened:

ERROR:  text search configuration "my_swedish" does not exist

Perhaps you connected to the wrong database, or you dropped the configuration again, or it is not on the search_path and you have to qualify it with its schema. Use \dF *.* in psql to list all existing configurations.

Of course you have to create the dictionary before you can use it in a text search configuration.
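
Both "does not exist" errors can also be checked directly against the system catalogs. This is a minimal sketch, assuming the object names from the question; it shows roughly what \dF *.* and \dFd *.* list in psql.

```sql
-- Which database and search_path is this session using?
SELECT current_database(), current_setting('search_path');

-- Text search configurations named my_swedish, in any schema.
SELECT n.nspname AS schema_name, c.cfgname
FROM pg_catalog.pg_ts_config AS c
JOIN pg_catalog.pg_namespace AS n ON n.oid = c.cfgnamespace
WHERE c.cfgname = 'my_swedish';

-- Text search dictionaries named thesaurus_my_swedish, in any schema.
SELECT n.nspname AS schema_name, d.dictname
FROM pg_catalog.pg_ts_dict AS d
JOIN pg_catalog.pg_namespace AS n ON n.oid = d.dictnamespace
WHERE d.dictname = 'thesaurus_my_swedish';
```

If a query returns no rows, the object does not exist in this database, for instance because the CREATE statement failed. If it returns a schema that is not on the search_path, qualify the name with that schema.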

Don't modify the configurations in pg_catalog; such modifications would be lost after an upgrade.



Source: https://stackoverflow.com/questions/62652659/how-to-correctly-create-thesaurus-dictionary-for-my-custom-text-search-configura
