Question
Let's say I have a table called ABC in an MS Access database.
There are several columns in this table, but only two are of interest for this question: "Hugo_Symbol" and "Start_Position". "Hugo_Symbol" holds gene names, and several rows can have the same Hugo_Symbol, meaning this column has duplicate entries. "Start_Position" holds numbers, anything from 1000 to 100000000.
I want to build a query that returns rows from table ABC that 1) have the same Hugo_Symbol AND 2) have Start_Position values within 20 of each other.
For example, the query should return:
Hugo_Symbol Start_Position
TP53 987654
TP53 987660
TP53 987662
APOB 12345
APOB 12350
APOB 12359
because these rows have the same Hugo_Symbol and their Start_Position values are within 20 of each other.
Is such a query possible? If so, what would the SQL code be?
Answer 1:
I don't use Access, but this is how I'd approach it with ANSI SQL.
SELECT
*
FROM
ABC AS first
INNER JOIN
ABC AS second
ON second.Hugo_Symbol = first.Hugo_Symbol
AND second.Start_Position <= first.Start_Position + 20
AND second.Start_Position > first.Start_Position
This may return more data than you expect, and in a different format than you expect: one row per matching pair, rather than one row per record.
First.Hugo_Symbol First.Start_Position Second.Hugo_Symbol Second.Start_Position
TP53 987654 TP53 987660
TP53 987654 TP53 987662
TP53 987660 TP53 987662
APOB 12345 APOB 12350
APOB 12345 APOB 12359
APOB 12350 APOB 12359
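To see what the self-join returns in practice, here is a minimal sketch that runs it on the sample rows from the question. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for Access, so Access-specific syntax is not covered; the helper name `close_pairs`, the `window` parameter, and the aliases `a` and `b` are illustrative and do not appear in the original answer.

```python
import sqlite3

# Sample rows from the question. The in-memory SQLite database is an
# assumption used only for illustration; the original table lives in Access.
ROWS = [
    ("TP53", 987654), ("TP53", 987660), ("TP53", 987662),
    ("APOB", 12345), ("APOB", 12350), ("APOB", 12359),
]


def close_pairs(rows, window=20):
    """Return (symbol, lower_position, higher_position) for every pair of
    rows that share a symbol and lie at most `window` apart."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ABC (Hugo_Symbol TEXT, Start_Position INTEGER)")
    conn.executemany("INSERT INTO ABC VALUES (?, ?)", rows)
    query = """
        SELECT a.Hugo_Symbol, a.Start_Position, b.Start_Position
        FROM ABC AS a
        INNER JOIN ABC AS b
            ON  b.Hugo_Symbol = a.Hugo_Symbol
            AND b.Start_Position >  a.Start_Position   -- list each pair once
            AND b.Start_Position <= a.Start_Position + :w
        ORDER BY a.Hugo_Symbol, a.Start_Position, b.Start_Position
    """
    result = conn.execute(query, {"w": window}).fetchall()
    conn.close()
    return result


pairs = close_pairs(ROWS)
for pair in pairs:
    print(pair)
```

Each symbol yields three pairs here because all three of its positions fall within 20 of one another. The strict `>` comparison keeps a row from pairing with itself and keeps each pair from appearing twice in mirrored order.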
EDIT:
The answer above is heavily influenced by the phrase "each other".
If you reframe the requirement as "all records where another record exists with the same symbol and a position within 20 of its own position", you could get something like...
SELECT
*
FROM
ABC AS data
WHERE
EXISTS (SELECT *
FROM ABC AS lookup
WHERE lookup.hugo_symbol = data.hugo_symbol
AND lookup.start_position >= data.start_position - 20
AND lookup.start_position <= data.start_position + 20
AND lookup.start_position <> data.start_position
)
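The EXISTS form can be tried the same way. The sketch below again assumes SQLite via Python's sqlite3 module rather than Access; the function name `rows_with_neighbor`, the `window` parameter, and the extra isolated row ("BRCA1", 500) are hypothetical additions made only to show a record being filtered out.

```python
import sqlite3

# Sample rows from the question plus one hypothetical isolated row (BRCA1)
# that has no neighbor and should therefore be dropped by the query.
ROWS = [
    ("TP53", 987654), ("TP53", 987660), ("TP53", 987662),
    ("APOB", 12345), ("APOB", 12350), ("APOB", 12359),
    ("BRCA1", 500),
]


def rows_with_neighbor(rows, window=20):
    """Return each (symbol, position) row for which another row with the
    same symbol exists at a different position within `window`."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ABC (Hugo_Symbol TEXT, Start_Position INTEGER)")
    conn.executemany("INSERT INTO ABC VALUES (?, ?)", rows)
    query = """
        SELECT d.Hugo_Symbol, d.Start_Position
        FROM ABC AS d
        WHERE EXISTS (
            SELECT 1
            FROM ABC AS l
            WHERE l.Hugo_Symbol = d.Hugo_Symbol
              AND l.Start_Position >= d.Start_Position - :w
              AND l.Start_Position <= d.Start_Position + :w
              AND l.Start_Position <> d.Start_Position  -- skip the row itself
        )
        ORDER BY d.Hugo_Symbol, d.Start_Position
    """
    result = conn.execute(query, {"w": window}).fetchall()
    conn.close()
    return result


kept = rows_with_neighbor(ROWS)
for row in kept:
    print(row)
```

Unlike the join, this returns one row per record in the same shape as the source table. One consequence of the `<>` comparison is that two rows with the same symbol and exactly the same position do not count as neighbors of each other.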
But Access 2000 is more limited than the databases I normally use, and I don't know what it can and can't do.
Answer 2:
SELECT ABC.Hugo_Symbol, ABC.Start_Position, ABC_1.Start_Position
FROM ABC INNER JOIN ABC AS ABC_1 ON
ABC.Hugo_Symbol = ABC_1.Hugo_Symbol AND
ABC.Start_Position + 20 >= ABC_1.Start_Position AND
ABC.Start_Position < ABC_1.Start_Position
Source: https://stackoverflow.com/questions/12377088/selecting-entries-that-are-numerically-close-to-each-other-in-a-database