I am using a MySQL DB, and have the following table:
CREATE TABLE SomeTable (
PrimaryKeyCol BIGINT(20) NOT NULL,
A BIGINT(20) NOT NULL,
FirstX INT(11) N
Indexes will not help you in this scenario, except for a small percentage of all possible values of X.
Lets say for example that:
And you have following indexes:
FirstX, LastX,
LastX, FirstX,
Now:
If X is 50 the clause FirstX <= 50
matches approximately 5% rows while LastX >= 50
matches approximately 95% rows. MySQL will use the first index.
If X is 990 the clause FirstX <= 990
matches approximately 99% rows while LastX >= 990
matches approximately 5% rows. MySQL will use the second index.
Any X between these two will cause MySQL to not use either index (I don't know the exact threshold but 5% worked in my tests). Even if MySQL uses the index, there are just too many matches and the index will most likely be used for covering instead of seeking.
Your solution is the best. What you are doing is defining upper and lower bound of "range" search:
WHERE FirstX <= 500 -- 500 is the middle (worst case) value
AND FirstX >= 500 - 42 -- range matches approximately 4.3% rows
AND ...
In theory, this should work even if you search FirstX for values in the middle. Having said that, you got lucky with 4200000 value; possibly because the maximum difference between first and last is a smaller percentage.
If it helps, you can do the following after loading the data:
ALTER TABLE testdata ADD COLUMN delta INT NOT NULL;
UPDATE testdata SET delta = LastX - FirstX;
ALTER TABLE testdata ADD INDEX delta (delta);
This makes selecting MAX(LastX - FirstX)
easier.
I tested MySQL SPATIAL INDEXES which could be used in this scenario. Unfortunately I found that spatial indexes were slower and have many constraints.