Deleting records with number repeating more than 5

▼魔方 西西 提交于 2019-12-13 02:38:04

问题


I have data in a table of length 9 where data is like

999999969
000000089
666666689

I want to delete only those data in which any number from 1-9 is repeating more than 5 times.


回答1:


This can be much simpler with a regular expression using a back reference.

DELETE FROM tbl
WHERE  col ~ '([1-9])\1{5}';

That's all.

Explain

([1-9]) ... a character class with digits from 1 to 9, parenthesized for the following back reference.
\1 ... back reference to first (and only in this case) parenthesized subexpression.
{5} .. exactly (another) 5 times, making it "more than 5".

Per documentation:

A back reference (\n) matches the same string matched by the previous parenthesized subexpression specified by the number n [...] For example, ([bc])\1 matches bb or cc but not bc or cb.

SQL Fiddle demo.




回答2:


OK, so the logic here can be summed up as:

  • Find the longest series of the same consecutive digit in any given number; and
  • Return true if that longest value is > 5 digits

Right?

So, lets split it into series of consecutive digits:

regress=> SELECT regexp_matches('666666689', '(0+|1+|2+|3+|4+|5+|6+|7+|8+|9+)', 'g');
 regexp_matches 
----------------
 {6666666}
 {8}
 {9}
(3 rows)

then filter for the longest:

regress=> 

SELECT x[1] 
FROM regexp_matches('6666666898', '(0+|1+|2+|3+|4+|5+|6+|7+|8+|9+)', 'g') x 
ORDER BY length(x[1]) DESC 
LIMIT 1;

    x    
---------
 6666666
(1 row)

... but really, we don't actually care about that, just if any entry is longer than 5 digits, so:

SELECT x[1] 
FROM regexp_matches('6666666898', '(0+|1+|2+|3+|4+|5+|6+|7+|8+|9+)', 'g') x 
WHERE length(x[1]) > 5;

can be used as an EXISTS test, e.g.

WITH blah(n) AS (VALUES('999999969'),('000000089'),('666666689'),('15552555'))
SELECT n
FROM blah
WHERE EXISTS (
    SELECT x[1] 
    FROM regexp_matches(n, '(0+|1+|2+|3+|4+|5+|6+|7+|8+|9+)', 'g') x 
    WHERE length(x[1]) > 5
)

which is actually pretty efficient and return the correct result (always nice). But it can be simplified a little more with:

WITH blah(n) AS (VALUES('999999969'),('000000089'),('666666689'),('15552555'))
SELECT n
FROM blah
WHERE EXISTS (
    SELECT x[1] 
    FROM regexp_matches(n, '(0{6}|1{6}|2{6}|3{6}|4{6}|5{6}|6{6}|7{6}|8{6}|9{6})', 'g') x;
)

You can use the same WHERE clause in a DELETE.




回答3:


Horrible and terrible in terms of performance, but it should work:

DELETE FROM YOURTABLE 
 WHERE YOURDATA LIKE '%111111%' 
    OR YOURDATA LIKE '%222222%' 
    OR YOURDATA LIKE '%333333%'
    OR YOURDATA LIKE '%444444%'
    OR YOURDATA LIKE '%555555%'
    OR YOURDATA LIKE '%666666%'
    OR YOURDATA LIKE '%777777%'
    OR YOURDATA LIKE '%888888%'
    OR YOURDATA LIKE '%999999%'


来源:https://stackoverflow.com/questions/24363986/deleting-records-with-number-repeating-more-than-5

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!