Why does the cardinality of an index in MySQL remain unchanged when I add a new index?

别那么骄傲 2020-12-31 13:23

I have added a FULLTEXT index to one of my MySQL database tables as follows:

ALTER TABLE members ADD FULLTEXT(about,fname,lname,job_title);

2 Answers
  •  青春惊慌失措
    2020-12-31 13:44

    If you only have 1 row in the table, the cardinality for the index should be 1, of course. It's just counting the number of unique values.

    If you think of an index as a lookup-table based on buckets (like a hash), then the cardinality is the number of buckets.

    Here's how it works: When you build an index over a set of columns (a,b,c,d), then the database goes over all the rows in the table, looking at the ordered quadruplets of those 4 columns, for each row. Let's say your table looks like this:

    a  b  c  d  e   
    -- -- -- -- --  
    1  1  1  1  200 
    1  1  1  1  300
    1  2  1  1  200
    1  3  1  1  200
    

    So what the database looks at is just the 4 columns (a,b,c,d):

    a  b  c  d  
    -- -- -- --
    1  1  1  1 
    1  2  1  1 
    1  3  1  1 
    

    See that there are only 3 unique rows left? Those will become our buckets, but we'll get back to that. In reality, there's also a record id, or row identifier for each row in the table. So our original table looks like this:

    (row id) a  b  c  d  e   
    -------- -- -- -- -- --  
    00000001 1  1  1  1  200 
    00000002 1  1  1  1  300
    00000003 1  2  1  1  200
    00000004 1  3  1  1  200
    

    So when we look at only the 4 columns of (a,b,c,d), we're really looking also at the row id:

    (row id) a  b  c  d 
    -------- -- -- -- --
    00000001 1  1  1  1
    00000002 1  1  1  1
    00000003 1  2  1  1
    00000004 1  3  1  1
    

    But we want to do lookup by (a,b,c,d) and not by row id, so we produce something like this:

    (a,b,c,d) (row id)
    --------- --------
    1,1,1,1   00000001
    1,1,1,1   00000002
    1,2,1,1   00000003
    1,3,1,1   00000004
    

    And finally, we group together the row ids of all rows that have identical (a,b,c,d) values:

    (a,b,c,d) (row id)
    --------- ---------------------
    1,1,1,1   00000001 and 00000002
    1,2,1,1   00000003
    1,3,1,1   00000004
    

    See that? The values of (a,b,c,d), namely (1,1,1,1), (1,2,1,1) and (1,3,1,1), have become the keys of our lookup table into the rows of the original table.
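
    The grouping steps above can be sketched in a few lines of Python. This is only an illustration of the naive lookup table described here, not how MySQL stores an index; the `rows` dictionary, its integer row ids, and the `build_naive_index` helper are made up for the example.

```python
from collections import defaultdict

# The example table, keyed by a hypothetical integer row id.
rows = {
    1: {"a": 1, "b": 1, "c": 1, "d": 1, "e": 200},
    2: {"a": 1, "b": 1, "c": 1, "d": 1, "e": 300},
    3: {"a": 1, "b": 2, "c": 1, "d": 1, "e": 200},
    4: {"a": 1, "b": 3, "c": 1, "d": 1, "e": 200},
}


def build_naive_index(rows, columns):
    """Group row ids by the tuple of values in the indexed columns."""
    index = defaultdict(list)
    for row_id, row in rows.items():
        key = tuple(row[col] for col in columns)  # e.g. (1, 2, 1, 1)
        index[key].append(row_id)
    return dict(index)


index = build_naive_index(rows, ("a", "b", "c", "d"))
for key, row_ids in index.items():
    print(key, row_ids)

# The number of keys is what this answer calls the cardinality.
print(len(index))
```

    Column e never enters the key, so rows 1 and 2 land in the same bucket even though their e values differ.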

    A real database engine does not build its indexes this way, but this should give you a good idea of how a "naive" (i.e. straightforward) implementation of an index might work.

    But the bottom line is this: cardinality just measures how many distinct key values there are in an index. In our example that was the number of keys in our lookup table, which was 3.
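
    To make that concrete, here is a small Python sketch of the same counting rule. It assumes the simplified model used in this answer; the `cardinality` function and the sample rows are hypothetical, and a real server may report an estimate rather than an exact count.

```python
def cardinality(rows, columns):
    """Count the distinct value combinations over the given columns."""
    return len({tuple(row[col] for col in columns) for row in rows})


indexed = ("a", "b", "c", "d")

# A table with a single row has exactly one distinct key.
one_row = [{"a": 1, "b": 1, "c": 1, "d": 1, "e": 200}]
print(cardinality(one_row, indexed))

# The four-row example table from this answer.
table = one_row + [
    {"a": 1, "b": 1, "c": 1, "d": 1, "e": 300},
    {"a": 1, "b": 2, "c": 1, "d": 1, "e": 200},
    {"a": 1, "b": 3, "c": 1, "d": 1, "e": 200},
]
print(cardinality(table, indexed))

# A new row whose (a,b,c,d) values already exist adds no new key,
# so the count does not move.
table.append({"a": 1, "b": 3, "c": 1, "d": 1, "e": 999})
print(cardinality(table, indexed))
```

    The row count grows from 4 to 5 in the last step, but the number of distinct (a,b,c,d) keys does not, which is one way a cardinality figure can stay the same while the table changes.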

    Hope that helps!
