How do you identify record pattern sequences in records using TSQL?

Submitted by 半城伤御伤魂 on 2019-12-18 13:32:36

Question


This is a fairly new exercise for me, but I need to find a way to identify pattern sequences within a table. For example, let's say I have a simple table that resembles the following:

What I would like to do is identify and group all the records whose values follow the sequence 5, 9, 6, and return them from a query. How would you accomplish this using T-SQL?

The results should look like this:

I've looked for some potential examples of how this might be accomplished but couldn't find anything that really helps.


Answer 1:


Start with a CTE that assigns a sequence number to each value in the pattern you are looking for:

;WITH Seq AS (
    SELECT v, ROW_NUMBER() OVER(ORDER BY k) AS rn
    FROM (VALUES(1, 5), (2, 9), (3, 6)) x(k,v)
)
SELECT v, rn
FROM Seq

Output:

v   rn
-------
5   1
9   2
6   3

Using the above CTE you can identify islands, i.e. slices of sequential rows containing the whole of the sequence:

;WITH Seq AS (
    -- Number the pattern values in the order they must appear
    SELECT v, ROW_NUMBER() OVER(ORDER BY k) AS rn
    FROM (VALUES(1, 5), (2, 9), (3, 6)) x(k,v)
), Grp AS (
    -- Position in the table minus position in the pattern
    SELECT [Key], [Value],
           ROW_NUMBER() OVER (ORDER BY [Key]) - rn AS grp
    FROM mytable AS m
    LEFT JOIN Seq AS s ON m.Value = s.v
)
SELECT *
FROM Grp

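Output (only rows that match a pattern value are shown; with the LEFT JOIN, non-matching rows are also returned with a NULL grp, and the grp numbers may differ from those shown):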

    Key Value   grp
   -----------------
    1   5       0
    2   9       0
    3   6       0
    6   5       3
    7   9       3
    8   6       3

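The grp column identifies these islands: within an unbroken run of the pattern, the table row number and rn both increase by 1 from row to row, so their difference stays constant.

All that remains is to filter out the partial groups: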
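To see why the subtraction works, the sketch below runs the Grp step on a small table variable. The rows in @mytable are hypothetical (the question's original table is not shown here), and the table_pos and pattern_pos columns are added only for illustration.

```sql
-- Hypothetical sample data, chosen to contain two full 5, 9, 6 runs.
DECLARE @mytable TABLE ([Key] INT PRIMARY KEY, [Value] INT NOT NULL);
INSERT INTO @mytable ([Key], [Value])
VALUES (1, 5), (2, 9), (3, 6), (4, 1), (5, 2), (6, 5), (7, 9), (8, 6);

WITH Seq AS (
    SELECT v, ROW_NUMBER() OVER (ORDER BY k) AS rn
    FROM (VALUES (1, 5), (2, 9), (3, 6)) x(k, v)
)
SELECT m.[Key], m.[Value],
       ROW_NUMBER() OVER (ORDER BY m.[Key]) AS table_pos,   -- position in the table
       s.rn AS pattern_pos,                                 -- position in the pattern, NULL if no match
       ROW_NUMBER() OVER (ORDER BY m.[Key]) - s.rn AS grp   -- constant within an unbroken run
FROM @mytable AS m
LEFT JOIN Seq AS s ON m.[Value] = s.v
ORDER BY m.[Key];
```

With this sample data, keys 1 to 3 share one grp value, keys 6 to 8 share another, and keys 4 and 5 get NULL. The actual numbers depend on the data; only equality within a run matters.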

;WITH Seq AS (
    SELECT v, ROW_NUMBER() OVER(ORDER BY k) AS rn
    FROM (VALUES(1, 5), (2, 9), (3, 6)) x(k,v)
), Grp AS (
    SELECT [Key], [Value],
           ROW_NUMBER() OVER (ORDER BY [Key]) - rn AS grp
    FROM mytable AS m
    LEFT JOIN Seq AS s ON m.Value = s.v
)
SELECT g1.[Key], g1.[Value]
FROM Grp AS g1
INNER JOIN (
    -- Keep only groups that contain the whole pattern (3 values)
    SELECT grp
    FROM Grp
    GROUP BY grp
    HAVING COUNT(*) = 3 ) AS g2
ON g1.grp = g2.grp


Note: The initial version of this answer used an INNER JOIN to Seq. That does not work if the table contains values like 5, 42, 9, 6: the INNER JOIN filters out 42, so the sequence is falsely identified as a valid one. Credit goes to @HABO for this edit.
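The difference between the two joins can be checked with a short sketch. The table variable @t and its four rows are hypothetical, chosen to reproduce the 5, 42, 9, 6 case from the note; the InnerGrp and LeftGrp names are illustrative only.

```sql
-- Hypothetical data: a 42 interrupts what would otherwise be 5, 9, 6.
DECLARE @t TABLE ([Key] INT PRIMARY KEY, [Value] INT NOT NULL);
INSERT INTO @t ([Key], [Value]) VALUES (1, 5), (2, 42), (3, 9), (4, 6);

WITH Seq AS (
    SELECT v, ROW_NUMBER() OVER (ORDER BY k) AS rn
    FROM (VALUES (1, 5), (2, 9), (3, 6)) x(k, v)
), InnerGrp AS (
    -- The 42 row is dropped before numbering, so 5, 9, 6 look adjacent
    SELECT m.[Key], ROW_NUMBER() OVER (ORDER BY m.[Key]) - s.rn AS grp
    FROM @t AS m
    INNER JOIN Seq AS s ON m.[Value] = s.v
), LeftGrp AS (
    -- The 42 row keeps its position, so the numbering exposes the gap
    SELECT m.[Key], ROW_NUMBER() OVER (ORDER BY m.[Key]) - s.rn AS grp
    FROM @t AS m
    LEFT JOIN Seq AS s ON m.[Value] = s.v
)
SELECT 'INNER JOIN' AS variant, [Key], grp FROM InnerGrp
UNION ALL
SELECT 'LEFT JOIN', [Key], grp FROM LeftGrp
ORDER BY variant, [Key];
```

In the INNER JOIN rows, keys 1, 3 and 4 all share the same grp, so HAVING COUNT(*) = 3 would accept them. In the LEFT JOIN rows, key 1 has a different grp from keys 3 and 4, and key 2 has NULL, so no group reaches a count of 3.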




Answer 2:


Not very optimized, but I think it is a correct answer:

CREATE TABLE pattern (
    rowID INT IDENTITY(1,1) PRIMARY KEY, 
    rowValue INT NOT NULL
);

INSERT INTO pattern (rowValue) VALUES (5);
INSERT INTO pattern (rowValue) VALUES (9);
INSERT INTO pattern (rowValue) VALUES (6);

SELECT * FROM pattern;

SELECT Trg.* FROM Keys Trg 
INNER JOIN pattern Pt ON (Trg.fValue = Pt.rowValue) 
INNER JOIN (
    SELECT K.fKey - P.rowID AS X, COUNT(*) AS Xc FROM Keys K 
        LEFT JOIN pattern P ON (K.fValue = P.rowValue) 
    WHERE 
        (P.rowID IS NOT NULL) 
    GROUP BY K.fKey - P.rowID 
    HAVING COUNT(*) = (SELECT COUNT(*) FROM pattern)
) Z ON (Trg.fKey - Pt.rowID = Z.X);

I store the pattern in a table and join it to the main table. For each matching row I calculate the difference between the main table's key and the pattern's rowID, and I return only the rows whose difference is shared by as many rows as the pattern table contains.
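The intermediate difference can be inspected with a minimal sketch. The column names fKey, fValue, rowID and rowValue come from the query above; the table variables, the column types and the sample rows are assumptions, and rowID is inserted explicitly here instead of using IDENTITY to keep the numbering obvious.

```sql
-- Hypothetical stand-ins for the Keys and pattern tables.
DECLARE @Keys TABLE (fKey INT PRIMARY KEY, fValue INT NOT NULL);
INSERT INTO @Keys (fKey, fValue)
VALUES (1, 5), (2, 9), (3, 6), (4, 1), (5, 2), (6, 5), (7, 9), (8, 6);

DECLARE @pattern TABLE (rowID INT PRIMARY KEY, rowValue INT NOT NULL);
INSERT INTO @pattern (rowID, rowValue) VALUES (1, 5), (2, 9), (3, 6);

-- X is the offset at which a pattern would have to start for this row to fit it.
SELECT K.fKey, K.fValue, P.rowID, K.fKey - P.rowID AS X
FROM @Keys AS K
INNER JOIN @pattern AS P ON K.fValue = P.rowValue
ORDER BY K.fKey;
```

With this sample data, keys 1 to 3 share one X and keys 6 to 8 share another, and each X occurs three times, which matches the pattern length. Because the difference uses fKey directly, the approach assumes that the keys of consecutive rows differ by exactly 1.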



Source: https://stackoverflow.com/questions/31703765/how-do-you-identify-record-pattern-sequences-in-records-using-tsql
