Selecting entries that are numerically close to each other in a database

Posted by 断了今生、忘了曾经 on 2019-12-11 09:45:18

Question


Let's say I have a table called ABC in an MS Access database.

There are several columns in this table, but only two are of interest for this question: "Hugo_Symbol" and "Start_Position". "Hugo_Symbol" holds gene names, and several lines can have the same Hugo_Symbol, meaning this column has duplicate entries. "Start_Position" holds numbers, anything from 1000 to 100000000.

I want to build a query that returns lines from table ABC that 1) have the same Hugo_Symbol AND 2) have a Start_Position within 20 of each other.

For example, the query should return:

Hugo_Symbol         Start_Position

TP53                      987654
TP53                      987660
TP53                      987662
APOB                      12345
APOB                      12350
APOB                      12359

because these lines have the same Hugo_Symbol and Start_Position is within 20 of each other.

Is such a query possible? If so, what would the SQL code be?


Answer 1:


I don't use Access, but this is how I'd approach it with ANSI SQL.

SELECT
  *
FROM
  ABC    AS first
INNER JOIN
  ABC    AS second
    ON  second.Hugo_Symbol     = first.Hugo_Symbol
    AND second.Start_Position <= first.Start_Position + 20
    AND second.Start_Position >  first.Start_Position

This will potentially return more data than you expect, and potentially in a different format than you expect.

First.Hugo_Symbol First.Start_Position Second.Hugo_Symbol Second.Start_Position
     TP53              987654                TP53              987660
     TP53              987654                TP53              987662
     TP53              987660                TP53              987662
     APOB              12345                 APOB              12350
     APOB              12345                 APOB              12359
     APOB              12350                 APOB              12359
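
As a quick sanity check, here is a minimal sketch that runs the same self-join against the question's sample rows. It assumes Python's built-in sqlite3 module with an in-memory SQLite database standing in for Access, so Access-specific syntax is not exercised. The aliases are shortened to a and b, and two made-up rows (TP53 at 990000, BRCA1 at 5000) are added that should not pair with anything.

```python
import sqlite3

# Sample rows from the question, plus two made-up rows (TP53 at 990000 and
# BRCA1 at 5000) that are assumptions added here and should NOT match anything.
ROWS = [
    ("TP53", 987654), ("TP53", 987660), ("TP53", 987662),
    ("APOB", 12345), ("APOB", 12350), ("APOB", 12359),
    ("TP53", 990000), ("BRCA1", 5000),
]

# In-memory SQLite database used as a stand-in for the Access table ABC.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ABC (Hugo_Symbol TEXT, Start_Position INTEGER)")
conn.executemany("INSERT INTO ABC VALUES (?, ?)", ROWS)

# Self-join: pair each row with every row of the same symbol whose
# position is strictly greater and at most 20 higher.
PAIR_QUERY = """
SELECT a.Hugo_Symbol, a.Start_Position, b.Start_Position
FROM ABC AS a
INNER JOIN ABC AS b
    ON  b.Hugo_Symbol     = a.Hugo_Symbol
    AND b.Start_Position <= a.Start_Position + 20
    AND b.Start_Position >  a.Start_Position
ORDER BY a.Hugo_Symbol, a.Start_Position, b.Start_Position
"""

pairs = conn.execute(PAIR_QUERY).fetchall()
for symbol, first_pos, second_pos in pairs:
    print(symbol, first_pos, second_pos)
```

Each close pair shows up once, with the lower position on the left, so a cluster of three mutually close rows yields three pairs.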

EDIT:

The answer above is heavily influenced by the phrase "each other".

If you rephrase the requirement as "all records where another record exists with the same symbol and a position within 20 of its own position", you could get something like...

SELECT
  *
FROM
  ABC     AS data
WHERE
  EXISTS (SELECT *
            FROM ABC AS lookup
           WHERE lookup.hugo_symbol     = data.hugo_symbol
             AND lookup.start_position >= data.start_position - 20
             AND lookup.start_position <= data.start_position + 20
             AND lookup.start_position <> data.start_position
         )

But Access 2000 is more limited than the databases I normally use. I don't know what Access 2000 can and can't do.
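
For comparison, a minimal sketch of the EXISTS version run against the question's sample rows. As an assumption, it uses Python's built-in sqlite3 module with an in-memory SQLite database rather than Access, and it adds two made-up rows (TP53 at 990000, BRCA1 at 5000) to show that isolated records are filtered out.

```python
import sqlite3

# Sample rows from the question, plus two made-up rows (TP53 at 990000 and
# BRCA1 at 5000) that are assumptions added here and have no close neighbour.
ROWS = [
    ("TP53", 987654), ("TP53", 987660), ("TP53", 987662),
    ("APOB", 12345), ("APOB", 12350), ("APOB", 12359),
    ("TP53", 990000), ("BRCA1", 5000),
]

# In-memory SQLite database used as a stand-in for the Access table ABC.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ABC (Hugo_Symbol TEXT, Start_Position INTEGER)")
conn.executemany("INSERT INTO ABC VALUES (?, ?)", ROWS)

# Keep a row only if some other row has the same symbol and a
# different position no more than 20 away in either direction.
EXISTS_QUERY = """
SELECT data.Hugo_Symbol, data.Start_Position
FROM ABC AS data
WHERE EXISTS (
    SELECT 1
    FROM ABC AS lookup
    WHERE lookup.Hugo_Symbol     = data.Hugo_Symbol
      AND lookup.Start_Position >= data.Start_Position - 20
      AND lookup.Start_Position <= data.Start_Position + 20
      AND lookup.Start_Position <> data.Start_Position
)
ORDER BY data.Hugo_Symbol, data.Start_Position
"""

close_rows = conn.execute(EXISTS_QUERY).fetchall()
for symbol, position in close_rows:
    print(symbol, position)
```

This returns one line per record, which matches the output format asked for in the question. Note that the `<>` condition means two records with the same symbol and exactly the same position do not count as neighbours of each other; handling that case would need a unique key column to compare instead, which the question does not mention.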




Answer 2:


SELECT ABC.Hugo_Symbol, ABC.Start_Position, ABC_1.Start_Position
FROM ABC INNER JOIN ABC AS ABC_1 ON 
   ABC.Hugo_Symbol = ABC_1.Hugo_Symbol AND 
   ABC.Start_Position + 20 >= ABC_1.Start_Position AND
   ABC.Start_Position < ABC_1.Start_Position


Source: https://stackoverflow.com/questions/12377088/selecting-entries-that-are-numerically-close-to-each-other-in-a-database
