Mysql Improve Search Performance with wildcards (%%)

后端 未结 6 764
青春惊慌失措
青春惊慌失措 2020-11-30 10:24

Below is a query I use for searching a person by email

  SELECT *
    FROM phppos_customers
    JOIN phppos_people ON phppos_customers.person_id = phppos_peo         


        
6条回答
  •  一向
    一向 (楼主)
    2020-11-30 11:08

    In my post here I describe, in detail, a technique that allows you to use the index with LIKE for fast %infix% search, at the cost of some extra storage:

    https://stackoverflow.com/a/22531268/543814

    As long as the strings are relatively small, the storage requirement is generally acceptable.

    According to Google, the average e-mail address is 25 characters long. This increases your required storage by a factor 12.5 on average, and gives you fast indexed search in return. (See my post for the calculations.)

    From my perspective, if you are storing 10'000 e-mail addresses, you should be fine storing (the equivalent of) about 100'000 e-mail addresses, too. If this is what it takes to allow you to use an index, that seems like an acceptable trade-off. Often, disk space is cheap, while non-indexed searches are unaffordable.

    If you choose to take this approach, I suggest that you limit the input length of e-mail addresses to 64 characters. Those rare (or attacker) e-mail addresses of such length will require up to 32 times the usual storage. This gives you:

    1. Protection against an attacker trying to flood your database, since these still aren't very impressive amounts of data.
    2. The expectation that most e-mail addresses are not of this length anyway.

    If you consider 64 characters too harsh a requirement, use 255 instead, for a worst-case storage increase factor of 127.5. Ridiculous? Possibly. Likely? No. Fast? Very.

提交回复
热议问题