How can I use group_concat on an entire subquery?

佛祖请我去吃肉 2020-12-21 22:54

...without making unnecessary comparisons

I want to get an md5 hash of a range of rows. Due to bandwidth limitations, I want it to happen server-side.

This w

1 Answer
  • 2020-12-21 23:05

    The solution was simply to omit group by 1 = 1 entirely. I had assumed that group_concat would require that I provide it a group, but it can be used directly on a subquery, like so:

    select group_concat(id,col1,col2) from
        (select * from some_table
         where id >= 2 and id < 5
         order by id desc) as some_table;
    

    Be aware that null values will need to be cast to something concat-friendly, like so:

    insert into some_table (col1, col2)
                    values ('a', 1),
                           ('b', 11),
                           ('c', NULL),
                           ('d', 25),
                           ('e', 50);
    
    select group_concat(id, col1, col2) from
        (select id, col1, ifnull(col2, 'NULL') as col2
         from some_table
         where id >= 2 and id < 5
         order by id desc) as some_table;
    

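    Output (the rows appear in ascending id order even though the subquery sorts descending; an order by inside a derived table is not guaranteed to reach group_concat, so when order matters it is safer to write the order by inside group_concat itself):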

    +------------------------------+
    | group_concat(id, col1, col2) |
    +------------------------------+
    | 2b11,3cNULL,4d25             |
    +------------------------------+
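
To make the null caveat concrete, here is a small Python model of the behaviour, not MySQL itself. It assumes the documented rules that concat returns NULL when any argument is NULL and that group_concat skips NULL values; the helper names `mysql_concat` and `mysql_group_concat` are made up for this illustration.

```python
from typing import Optional, Sequence


def mysql_concat(*parts: object) -> Optional[str]:
    """Model of concat(): the result is NULL (None) if any argument is NULL."""
    if any(p is None for p in parts):
        return None
    return "".join(str(p) for p in parts)


def mysql_group_concat(rows: Sequence[Sequence[object]]) -> Optional[str]:
    """Model of group_concat(a, b, ...): rows whose value is NULL are skipped."""
    values = [v for v in (mysql_concat(*row) for row in rows) if v is not None]
    return ",".join(values) if values else None


# The three rows selected by "id >= 2 and id < 5" in the example above.
rows = [(2, "b", 11), (3, "c", None), (4, "d", 25)]

# Without ifnull, the row containing NULL silently drops out of the result.
without_ifnull = mysql_group_concat(rows)

# With ifnull(col2, 'NULL'), every row contributes to the result.
with_ifnull = mysql_group_concat(
    [(i, c1, "NULL" if c2 is None else c2) for i, c1, c2 in rows]
)

print(without_ifnull)
print(with_ifnull)
```

The point of the model is that a dropped row would not change the hash in any visible way other than making it wrong, which is why the cast matters for fingerprinting.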
    

    Another caveat: MySQL caps the length of a group_concat result at the value of the variable group_concat_max_len, and truncates anything longer. In order to hash a concatenation of n table rows, I needed to:

    1. Hash each row with md5 so that it is represented by 32 hex characters, regardless of how many columns it has
    2. Ensure that group_concat_max_len > (n * 33) (each 32-character hash plus one byte for the comma that follows it)
    3. Hash the group_concat of the hashed rows.
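
The three steps above can be sketched on the client side in Python. This is only an illustration of the arithmetic and of the two-level hash; the function names are hypothetical, and the digests will match the server's only if the client renders every value (numbers, dates, character encoding) exactly as MySQL's concat does, which this sketch does not attempt to guarantee.

```python
import hashlib

MD5_HEX_LEN = 32  # an md5 digest printed as hex is always 32 characters


def row_fingerprint(values):
    """Step 1: reduce one row to a fixed-width md5 hex digest.

    NULL (None) is rendered as the string 'null', mirroring ifnull(col, 'null').
    """
    text = "".join("null" if v is None else str(v) for v in values)
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def min_group_concat_max_len(n_rows):
    """Step 2: smallest value that satisfies group_concat_max_len > n * 33."""
    return n_rows * (MD5_HEX_LEN + 1) + 1


def range_fingerprint(rows):
    """Step 3: hash the comma-joined row digests, as md5(group_concat(...)) would."""
    joined = ",".join(row_fingerprint(r) for r in rows)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


rows = [(4, "d", 25), (3, "c", None), (2, "b", 11)]
needed = min_group_concat_max_len(len(rows))

# Statement to run on the connection before the hashing query.
set_statement = f"set session group_concat_max_len = {needed}"
print(set_statement)
print(range_fingerprint(rows))
```

Because every row contributes exactly 32 characters, the required limit depends only on the number of rows, not on how wide the table is.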

    Ultimately I ended up using the client language to examine the name, number, and nullability of each column, and then build queries like this:

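    -- the inner md5 gives every row a fixed 32-character fingerprint (step 1);
    -- the outer md5 hashes the comma-joined fingerprints (step 3)
    select md5(group_concat(row_fingerprint)) from
        (select md5(concat(id, col1, ifnull(col2, 'null'))) as row_fingerprint
         from some_table
         where id >= 2 and id < 5
         order by id desc) as foo;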
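
A minimal sketch of how such a query might be generated from column metadata, assuming Python as the client language. The function `build_fingerprint_query`, its parameters, and the `sort_key` alias are hypothetical and are not taken from the code linked below. It follows the three steps listed earlier, so each row is md5-hashed before the outer group_concat, and it deliberately places the order by inside group_concat rather than in the subquery so that the order does not depend on how the server treats the derived table.

```python
import re

_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _check_identifier(name):
    """Reject anything that is not a plain identifier, since names are interpolated."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def build_fingerprint_query(table, columns, lo, hi, key="id"):
    """Build the hashing query for rows with lo <= key < hi.

    columns is a list of (name, nullable) pairs; nullable columns are wrapped
    in ifnull(..., 'null') so that a NULL cannot make the whole row vanish.
    """
    table = _check_identifier(table)
    key = _check_identifier(key)
    parts = [
        f"ifnull({_check_identifier(name)}, 'null')" if nullable
        else _check_identifier(name)
        for name, nullable in columns
    ]
    lo, hi = int(lo), int(hi)  # bounds must be integers, never raw strings
    return (
        "select md5(group_concat(row_fingerprint order by sort_key desc)) from\n"
        f"    (select {key} as sort_key,\n"
        f"            md5(concat({', '.join(parts)})) as row_fingerprint\n"
        f"     from {table}\n"
        f"     where {key} >= {lo} and {key} < {hi}) as foo;"
    )


query = build_fingerprint_query(
    "some_table", [("id", False), ("col1", False), ("col2", True)], 2, 5
)
print(query)
```

Running the same builder against both databases yields identical query text, so any difference in the returned hash points at the data in that id range.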
    

    For more detail, you can poke through my code here (see function: find_diff_intervals).
