Get most frequent top 10 items found in a HTML column using Count

南方客 2021-01-14 09:45

I have a somewhat messy query that I am trying to figure out.

I have a column called "meta_value" and in that I have some HTML data such as:



        
3 Answers
  •  时光取名叫无心
    2021-01-14 10:24

    The following is a MySQL-only solution. You can run the query once or twice a day during off-peak hours to refresh the counts in a cache/summary table. Since there are only about 6,000 rows, it should not cause performance problems (depending on your server configuration).

    Because the number of cards in a row varies (roughly 40 to 60), we can use a sequence table. Define a permanent table in your database that stores the integers from 1 to 100 (you may find this table useful in many other cases as well):

    CREATE TABLE seq (n tinyint(3) UNSIGNED NOT NULL, PRIMARY KEY(n));
    INSERT INTO seq (n) VALUES (1), (2), ...... , (99), (100);
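
    If typing out 100 values is tedious, the table can also be filled with one statement. The following is a minimal sketch, assuming the seq table defined above exists and is still empty; the aliases ones, tens, and d are illustrative names that do not appear in the original answer. It cross-joins two ten-row digit lists to produce the numbers 1 through 100:

    ```sql
    -- Combine two digit lists (0..9) as tens * 10 + ones + 1, which yields 1..100.
    INSERT INTO seq (n)
    SELECT tens.d * 10 + ones.d + 1
    FROM (
           SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
           UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
           UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
         ) AS ones
    CROSS JOIN (
           SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
           UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
           UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
         ) AS tens;

    -- Sanity check: the three columns should read 100, 1 and 100.
    SELECT COUNT(*) AS row_count, MIN(n) AS min_n, MAX(n) AS max_n FROM seq;
    ```

    This form avoids recursive common table expressions, so it should also run on MySQL versions older than 8.0.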
    

    Next, we JOIN wph3_postmeta to the seq table on the number of occurrences of the substring 'data-name=""' in each meta_value, so that every row is paired with one value of n per card. The number of occurrences (which is also the number of cards in that row) can be computed as:

    (
      CHAR_LENGTH(wp.meta_value) 
      - CHAR_LENGTH(REPLACE(wp.meta_value, 'data-name=""', ''))
    ) / CHAR_LENGTH('data-name=""')
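
    To see the arithmetic on a concrete value, here is a small sketch that applies the same expression to a literal string. The sample markup in @html is made up for illustration (the original HTML sample is not shown here); only the 'data-name=""' delimiter is taken from the answer:

    ```sql
    -- Hypothetical two-card snippet; the real meta_value rows are much longer.
    SET @html = 'data-name=""Trap Hole"">x data-name=""Time Machine"">y';

    -- Removing every occurrence of the 12-character delimiter shortens the
    -- string by 12 characters per card, so the length difference divided by 12
    -- is the number of cards.
    SELECT (
             CHAR_LENGTH(@html)
             - CHAR_LENGTH(REPLACE(@html, 'data-name=""', ''))
           ) / CHAR_LENGTH('data-name=""') AS card_count;
    ```

    For this sample the length difference is 24, so card_count evaluates to 2 (the / operator returns a decimal, so MySQL displays it with decimal places).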
    
    

    We can then use the Substring_Index() function to extract the card names. Pairing each row with the different values of n lets us extract one card per joined row: with the negative count used in the query, n = 1 gives the last card, n = 2 the one before it, and so on.
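
    The nested Substring_Index() call can be tried in isolation. This is a sketch on a made-up two-card string (the markup in @html is an assumption, not the asker's data), with an inline derived table s standing in for the seq table:

    ```sql
    -- Hypothetical two-card snippet using the same delimiters as the answer.
    SET @html = 'data-name=""Trap Hole"">x data-name=""Time Machine"">y';

    -- Inner call: keep everything to the right of the n-th delimiter counted
    --             from the end of the string (negative count).
    -- Outer call: cut that remainder at the first '"">' to leave only the name.
    SELECT s.n,
           SUBSTRING_INDEX(
             SUBSTRING_INDEX(@html, 'data-name=""', -s.n),
             '"">', 1
           ) AS name
    FROM (SELECT 1 AS n UNION ALL SELECT 2) AS s;
    ```

    Here n = 1 yields Time Machine and n = 2 yields Trap Hole. If n were larger than the number of delimiters, the inner call would return the whole string, which is why the JOIN condition restricts n to the card count of each row.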

    Once every card name is in its own row, we can use that result set as a derived table and aggregate over it to get the required counts:

    Query (View on DB Fiddle)

    SELECT dt.name,
           COUNT(DISTINCT dt.meta_id) AS unique_metaid_count
    FROM   (SELECT wp.meta_id,
                   SUBSTRING_INDEX(
                     SUBSTRING_INDEX(wp.meta_value, 'data-name=""', -seq.n),
                     '"">', 1
                   ) AS name
            FROM   wph3_postmeta AS wp
                   JOIN seq
                     ON (
                          CHAR_LENGTH(wp.meta_value)
                          - CHAR_LENGTH(REPLACE(wp.meta_value, 'data-name=""', ''))
                        ) / CHAR_LENGTH('data-name=""') >= seq.n
            WHERE  wp.meta_key = 'deck_list') AS dt
    GROUP  BY dt.name
    ORDER  BY unique_metaid_count DESC
    /* To get top 10 counts only, add LIMIT 10 */
    

    Result

    | name                                          | unique_metaid_count |
    | --------------------------------------------- | ------------------- |
    | Call of the Haunted                           | 2                   |
    | Inferno Reckless Summon                       | 2                   |
    | Mystic Box                                    | 2                   |
    | Mystical Space Typhoon                        | 2                   |
    | Number 39: Utopia                             | 2                   |
    | #created by ygopro2                           | 1                   |
    | 98095162                                      | 1                   |
    | Abyss Dweller                                 | 1                   |
    | Advanced Ritual Art                           | 1                   |
    | Armed Dragon LV3                              | 1                   |
    | Armed Dragon LV5                              | 1                   |
    | Axe of Despair                                | 1                   |
    | B.E.S. Covered Core                           | 1                   |
    .....
    
    | The Dragon Dwelling in the Cave               | 1                   |
    | The Flute of Summoning Dragon                 | 1                   |
    | The Forces of Darkness                        | 1                   |
    | Threatening Roar                              | 1                   |
    | Time Machine                                  | 1                   |
    | Torike                                        | 1                   |
    | Tornado Dragon                                | 1                   |
    | Torrential Tribute                            | 1                   |
    | Tragoedia                                     | 1                   |
    | Trap Hole                                     | 1                   |
    | Treeborn Frog                                 | 1                   |
    | Trishula, Dragon of the Ice Barrier           | 1                   |
    | Twin Twisters                                 | 1                   |
    | Vanity's Ruler                                | 1                   |
    | Wind-Up Snail                                 | 1                   |
    | Wind-Up Soldier                               | 1                   |
    | Wulf, Lightsworn Beast                        | 1                   |
    | Zure, Knight of Dark World                    | 1                   |
    

    Note: If you want Top 10 only (by count), you can simply add LIMIT 10 at the end of the query.
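
    The cache/summary table mentioned at the start could be refreshed along these lines. This is only a sketch: the table name deck_card_top10 and its column sizes are assumptions chosen for illustration, and it presumes that the top card names fit in 255 characters:

    ```sql
    -- Hypothetical summary table; the name and column types are assumptions.
    CREATE TABLE IF NOT EXISTS deck_card_top10 (
      name                VARCHAR(255) NOT NULL,
      unique_metaid_count INT UNSIGNED NOT NULL
    );

    -- Refresh: clear the previous snapshot, then store the current top 10.
    DELETE FROM deck_card_top10;

    INSERT INTO deck_card_top10 (name, unique_metaid_count)
    SELECT dt.name,
           COUNT(DISTINCT dt.meta_id) AS unique_metaid_count
    FROM   (SELECT wp.meta_id,
                   SUBSTRING_INDEX(
                     SUBSTRING_INDEX(wp.meta_value, 'data-name=""', -seq.n),
                     '"">', 1
                   ) AS name
            FROM   wph3_postmeta AS wp
                   JOIN seq
                     ON (
                          CHAR_LENGTH(wp.meta_value)
                          - CHAR_LENGTH(REPLACE(wp.meta_value, 'data-name=""', ''))
                        ) / CHAR_LENGTH('data-name=""') >= seq.n
            WHERE  wp.meta_key = 'deck_list') AS dt
    GROUP  BY dt.name
    ORDER  BY unique_metaid_count DESC
    LIMIT  10;
    ```

    Pages that display the top 10 can then read from deck_card_top10 with a plain SELECT instead of re-running the expensive string parsing on every request.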
