Poor Performance when trigram similarity and full-text-search were combined with Q ind django using postgres

落爺英雄遲暮 提交于 2021-01-04 07:50:31

问题


I'm creating a web application to search people with their properties such as education, experience, etc. I can't use full-text-search for all the fields, because, some has to be a fuzzy match. (Eg: if we search for biotech, it should pick bio tech, biotech and also bio-tech). My database has about 200 entries in the profile model, which is to appear in the search results.

Other models like education and experience are connected to profile through foreign key

Therefore, I decided to be selective on what method to use on what field. For shorter fields like degree name (In the Education model) I want to use trigram similarity. For fields like education description, I use Full-text search.

However, Since I have to do this in multiple fields, I used simple lookups instead of using search vectors.

Profile.objects.filter(
    Q(first_name__trigram_similar=search_term) |
    Q(last_name__trigram_similar=search_term) |
    Q(vision_expertise__search=search_term) |
    Q(educations__degree__trigram_similar=search_term) |
    Q(educations__field_of_study__trigram_similar=search_term) |
    Q(educations__school__trigram_similar=search_term) |
    Q(educations__description__search=search_term) |
    Q(experiences__title__trigram_similar=search_term) |
    Q(experiences__company__trigram_similar=search_term) |
    Q(experiences__description__search=search_term) |
    Q(publications__title__trigram_similar=search_term) |
    Q(publications__description__search=search_term) |
    Q(certification__certification_name__trigram_similar=search_term) |
    Q(certification__certification_authority__trigram_similar=search_term) |
    Q(bio_description__search=search_term) |
)

I get the expected results on every search. However, the time it takes to get it is ridiculously slow. I can't figure it out how to make this faster.


回答1:


Without the class code it's difficult to find the better way to optimize your query.

You can add a Gin or Gist index to speed up the trigram similarity.

You can build an annotation with the SearchVector as below:

from django.contrib.postgres.aggregates import StringAgg
from django.contrib.postgres.search import SearchQuery, SearchVector

search_vectors = (
    SearchVector('vision_expertise') +
    SearchVector('bio_description') +
    SearchVector(StringAgg('experiences__description', delimiter=' ')) +
    SearchVector(StringAgg('educations__description', delimiter=' ')) +
    SearchVector(StringAgg('publications__description', delimiter=' '))
)

Profile.objects.annotate(
    search=search_vectors
).filter(
    Q(search=SearchQuery(search_term)) |
    Q(first_name__trigram_similar=search_term) |
    Q(last_name__trigram_similar=search_term) |
    Q(educations__degree__trigram_similar=search_term) |
    Q(educations__field_of_study__trigram_similar=search_term) |
    Q(educations__school__trigram_similar=search_term) |
    Q(experiences__title__trigram_similar=search_term) |
    Q(experiences__company__trigram_similar=search_term) |
    Q(publications__title__trigram_similar=search_term) |
    Q(certification__certification_name__trigram_similar=search_term) |
    Q(certification__certification_authority__trigram_similar=search_term)
)

You can speed-up the full-text search using a SearchVectorField

To find out about full-text search and trigram you can read the article I wrote on the subject:

"Full-Text Search in Django with PostgreSQL"



来源:https://stackoverflow.com/questions/56538419/poor-performance-when-trigram-similarity-and-full-text-search-were-combined-with

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!