Question
I have a large table with 60M rows and 50 columns (including "company_idx" and "timestamp"). When I run a simple SQL query such as:
SELECT * FROM companies_Scores.Scores
WHERE `company_idx`=11
AND `timestamp` BETWEEN '"+start_date+" 00:00:00' AND '"+end_date+" 00:00:00'
It takes about 4 minutes to run, which is far too long. So I decided to index the table:
CREATE INDEX idx_time ON companies_Scores.Scores(company_idx, timestamp) USING BTREE;
However, the following query still takes about 4 minutes to run:
SELECT * FROM companies_Scores.Scores
USE INDEX(idx_time)
WHERE `company_idx`=11
AND `timestamp` BETWEEN '"+start_date+" 00:00:00' AND '"+end_date+" 00:00:00'
I'm really a beginner with SQL and indexes. So I'm not really sure how to use indexes in a query. I guess the one I've done above is correct? Why does it take so much time? How can I improve it? I'd like my queries for each company_idx to be as quick as possible.
When I list the indexes on the table (SHOW INDEX), I get:
[{'Cardinality': 115751,
'Collation': 'A',
'Column_name': 'company_idx',
'Comment': '',
'Index_comment': '',
'Index_type': 'BTREE',
'Key_name': 'idx_time',
'Non_unique': 1,
'Null': 'YES',
'Packed': None,
'Seq_in_index': 1,
'Sub_part': None,
'Table': 'Scores'},
{'Cardinality': 45831976,
'Collation': 'A',
'Column_name': 'timestamp',
'Comment': '',
'Index_comment': '',
'Index_type': 'BTREE',
'Key_name': 'idx_time',
'Non_unique': 1,
'Null': 'YES',
'Packed': None,
'Seq_in_index': 2,
'Sub_part': None,
'Table': 'Scores'}]
Answer 1:
Your index looks correct for the query. With USE INDEX you are steering the optimizer toward idx_time, so we can assume the index is being used, if possible.
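One way to check rather than assume is to prefix the query with EXPLAIN. A minimal sketch, assuming MySQL and using made-up literal dates in place of the start_date and end_date values from the question:

```sql
-- The two dates are hypothetical placeholders, not values from the question.
EXPLAIN
SELECT *
FROM companies_Scores.Scores USE INDEX (idx_time)
WHERE `company_idx` = 11
  AND `timestamp` BETWEEN '2019-01-01 00:00:00' AND '2019-02-01 00:00:00';
```

In the result, the key column names the index the optimizer actually chose (NULL if none), type shows the access method (range for an index range scan, ALL for a full table scan), and rows is the optimizer's estimate of how many rows it expects to examine.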
One issue may be that the index cannot be used. That happens when the column type and the comparison value do not match. For instance, the comparison value 11 is a number. If company_idx is a string column, MySQL has to convert every stored value before comparing, which prevents it from using the index. In that case the comparison value should be a string: '11'.
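A quick way to rule this out is to look at the declared column types. A sketch, assuming MySQL; the second statement applies only if company_idx turns out to be a character type, and its dates are made up:

```sql
-- Show the declared types of the two indexed columns.
SHOW COLUMNS FROM companies_Scores.Scores
WHERE Field IN ('company_idx', 'timestamp');

-- Only if company_idx is CHAR/VARCHAR: compare against a string literal
-- so that no per-row conversion is needed. Dates are hypothetical.
SELECT *
FROM companies_Scores.Scores
WHERE `company_idx` = '11'
  AND `timestamp` BETWEEN '2019-01-01 00:00:00' AND '2019-02-01 00:00:00';
```

If company_idx is already an integer type and timestamp is a DATETIME or TIMESTAMP, the original comparison is fine and the cause is likely elsewhere.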
Another issue is simply that there might be a lot of data. If more than a few percent of the rows match the conditions, the index is not going to help much. One major use of indexes is for "needle-in-the-haystack" queries: they help find a small subset. They don't help if you need all or much of the haystack.
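To see which case applies, it may help to count how many rows the conditions actually match and compare that with the 60M rows in the table. A sketch, assuming MySQL, again with made-up dates:

```sql
-- Rows for this company within the (hypothetical) date range.
SELECT COUNT(*) AS matching_rows
FROM companies_Scores.Scores
WHERE `company_idx` = 11
  AND `timestamp` BETWEEN '2019-01-01 00:00:00' AND '2019-02-01 00:00:00';

-- Rows for this company over all time, for comparison.
SELECT COUNT(*) AS company_rows
FROM companies_Scores.Scores
WHERE `company_idx` = 11;
```

Both counts filter only on the columns in idx_time, so MySQL can typically answer them from the index without reading the full 50-column rows. If matching_rows is a tiny fraction of the table and the original query is still slow, the problem is probably not selectivity. If it runs into the millions, most of the time is likely spent fetching the rows themselves.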
Source: https://stackoverflow.com/questions/58224677/how-to-index-and-query-a-very-large-db-with-60m-rows-and-50-columns