How to use a regex capture group in redshift (or alternative)

血红的双手。 提交于 2019-12-23 11:46:03

问题


I have a field in a redshift column that looks like the following:

abcd1234df-TEXT_I-WANT

the characters and numbers in the first 10 digits can be either letters or numbers.

If I use a capture group regex, I would use a poorly written expression like (\w\w\w\w\w\w\w\w\w\w\W)(.*) and grap the 2nd group

But I'm having trouble implementing this in redshift, so not sure how I can grab only the stuff after the first hyphen


回答1:


As mentioned before, regex might be an overkill. However, it could be useful in some cases.

Here's a basic replace pattern:

SELECT
    regexp_replace(
      'abcd1234df-TEXT_I-WANT'  -- use your input column here instead
    , '^[a-z0-9]{10}-(.*)$'     -- matches whole string, captures "TEXT_I-WANT" in $1
    , '$1'                      -- inserts $1 to return TEXT_I-WANT
    )
;



回答2:


Regular expressions might be overkill. Basic string operations are good enough:

select substring(col from position('-' in col) + 1)



回答3:


This can be done with charindex and substring.

substring(col,charindex('-',col)+1)



回答4:


@wp78de gives a very good advice to use REGEX_REPLACE. I allows you to choose the capture group. Using your regex, it would look like that, although you don't need 2 groups in here and using one is sufficient here.

select 
  regexp_replace(
    'abcd1234df-TEXT_I-WANT',
    '(\\w\\w\\w\\w\\w\\w\\w\\w\\w\\w\\W)(.*)', 
    '$2' -- replacement selecting 2nd capture group
  );

Another oprion, although less flexible is using REGEX_SUBSTR with e parameter set (Extract a substring using a subexpression). It allows you to select a substring, but only of a first capture group in your regex. You also have to set the position and occurence parameters to default 1:

Using REGEX you suggested, but only with 1 group:

select 
  regexp_substr(
    'abcd1234df-TEXT_I-WANT',
    '\\w\\w\\w\\w\\w\\w\\w\\w\\w\\w\\W(.*)', 
    1, -- position
    1, -- occurrence
    'e' -- parameters
  );



回答5:


You can get the desired string by using the common table expression with regexp_substr as below :

with cte as
(
 select 'abcd1234df-TEXT_I-WANT' as str   
)    
select regexp_substr(str,'-.*') 
       as derived_str
  from cte;

derived_str
-------------
-TEXT_I-WANT  


来源:https://stackoverflow.com/questions/50710794/how-to-use-a-regex-capture-group-in-redshift-or-alternative

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!