Optimizing MySQL Table Indexing for Substring Queries

Submitted by 梦想的初衷 on 2019-12-23 15:53:17

Question


I have a MySQL indexing question for you guys.

I've got a very large table (~100 million records) in MySQL that contains information about files. Most of the queries I run on it involve substring operations on the file path column.

Here's the table DDL:

CREATE TABLE `filesystem_data`.`$tablename` (
    `file_id` INT( 14 ) NOT NULL AUTO_INCREMENT PRIMARY KEY ,
    `file_name` VARCHAR( 256 ) NOT NULL ,
    `file_share_name` VARCHAR( 100 ) NOT NULL ,
    `file_path` VARCHAR( 900 ) NOT NULL ,
    `file_size` BIGINT( 14 ) NOT NULL ,
    `file_tier` TINYINT( 1 ) UNSIGNED NULL ,
    `file_last_access` DATETIME NOT NULL ,
    `file_last_change` DATETIME NOT NULL ,
    `file_creation` DATETIME NOT NULL ,
    `file_extension` VARCHAR( 50 ) NULL ,
    INDEX ( `file_path`, `file_share_name` )
) ENGINE = MYISAM;

So, for example, I'll have a row with a file_path like:

'\\Server100\share2\Home\Zenshai\My Documents\'

And I'll extract the user's name (Zenshai in this example) with something like:

SELECT substring_index(substring_index(fp.file_path,'\\',6),'\\',-1) as Username
FROM (SELECT '\\\\Server100\\share2\\Home\\Zenshai\\My Documents\\' as file_path) fp

It gets a bit ugly, but that's not really my concern right now.

What I'd like some advice on is what kind of index (if any at all) can help speed up these types of queries on this table. Any other suggestions are welcome too.

Thanks.

P.S. Although the table gets very large, there is enough space for indexes.


Answer 1:


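You cannot use an index for these queries with your current table design: the value you search on is computed from file_path by an expression, and an ordinary index on the column cannot serve a lookup on that expression.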

You may add a column called USERNAME, fill it in an INSERT/UPDATE trigger with the expression you use in the SELECT, index it, and search on this column.
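A rough sketch of that idea in MySQL follows. The table name `files` stands in for the real table, and the `username` column, the index name, and the trigger names are all made up for illustration. It also assumes every path follows the layout from the question, so that the sixth backslash-delimited piece really is the user name; paths with a different layout would store whatever happens to be in that position.

```sql
-- Sketch only: `files`, `username`, `idx_username` and the trigger
-- names are hypothetical; adjust them to the real schema.
ALTER TABLE files
    ADD COLUMN username VARCHAR(100) NULL,
    ADD INDEX idx_username (username);

-- Fill the column whenever a row is inserted or updated,
-- using the same expression as the SELECT in the question.
CREATE TRIGGER files_bi BEFORE INSERT ON files
FOR EACH ROW
    SET NEW.username = SUBSTRING_INDEX(SUBSTRING_INDEX(NEW.file_path, '\\', 6), '\\', -1);

CREATE TRIGGER files_bu BEFORE UPDATE ON files
FOR EACH ROW
    SET NEW.username = SUBSTRING_INDEX(SUBSTRING_INDEX(NEW.file_path, '\\', 6), '\\', -1);

-- One-time backfill for rows that already exist.
UPDATE files
SET username = SUBSTRING_INDEX(SUBSTRING_INDEX(file_path, '\\', 6), '\\', -1);

-- Lookups by user are now a plain equality match on an indexed column.
SELECT file_id, file_path
FROM files
WHERE username = 'Zenshai';
```

On a table of this size the ALTER TABLE and the backfill will each take a long time, so they are probably best run in a maintenance window.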

P.S. Just curious: do you really have 100 million+ files on your server?




Answer 2:


I'd create a narrow subtable (few columns, not few records) that stores the file path broken out into its components, like so:

FK_TO_PARENT    PATH_PART
1               Server100
1               share2
1               Home
1               Zenshai
1               My Documents

And then just index PATH_PART. Of course, if the parent table has 100 million plus rows, then this table would be going into the billions of records.
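One way this could look in MySQL is sketched below. The table and column names are invented for illustration, `files` again stands in for the parent table, and the `part_pos` column is an addition that is not part of the layout above: it records where in the path each component sits, so a query can ask for "Zenshai as the fourth component" rather than "Zenshai anywhere in the path".

```sql
-- Sketch only: all names here are hypothetical.
CREATE TABLE file_path_parts (
    file_id   INT NOT NULL,               -- FK_TO_PARENT: points at files.file_id
    part_pos  TINYINT UNSIGNED NOT NULL,  -- 1-based position in the path (an assumption)
    path_part VARCHAR(256) NOT NULL,      -- PATH_PART: one path component
    PRIMARY KEY (file_id, part_pos),
    INDEX idx_path_part (path_part, part_pos)
) ENGINE = MYISAM;

-- Rows for the example path \\Server100\share2\Home\Zenshai\My Documents\
INSERT INTO file_path_parts (file_id, part_pos, path_part) VALUES
    (1, 1, 'Server100'),
    (1, 2, 'share2'),
    (1, 3, 'Home'),
    (1, 4, 'Zenshai'),
    (1, 5, 'My Documents');

-- Find all files whose fourth path component is 'Zenshai'.
SELECT f.file_id, f.file_path
FROM file_path_parts AS p
JOIN files AS f ON f.file_id = p.file_id
WHERE p.path_part = 'Zenshai'
  AND p.part_pos = 4;
```

Putting `path_part` first in the index lets the equality match on the component do most of the filtering; the trade-off is the row count noted above, roughly one row per path component per file.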



Source: https://stackoverflow.com/questions/546829/optimizing-mysql-table-indexing-for-substring-queries
