Regex to extract nth token of a string separated by pipes

生来就可爱ヽ(ⅴ<●) posted on 2020-07-03 13:03:23

Question


I'm new to regex.

I need to count and extract tokens from the below sample text:

AA||CCCCCCCC|||FFFFFFFFFFF

Requesting the 4th token, I must get an empty string ''; requesting the 6th, I must get 'FFFFFFFFFFF'.

Is it possible to write such a regex?

Thanks in advance!

PS: For token counting I've matched '\|' and added 1 to the number of matches if the string is not empty. Surely there is a more efficient way to do that using just a regex...


Answer 1:


For DB2, please try the following to get the 6th element in the list. It works on Oracle and allows for NULL (empty) list elements. The syntax of the REGEXP_SUBSTR call is the same in DB2, so I suspect it will work there too:

regexp_substr('AA||CCCCCCCC|||FFFFFFFFFFF', '(.*?)(\||$)', 1, 6, 'c', 1)

EDIT: The 'c' argument makes the match case-sensitive.
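
For readers without a database at hand, the same idea (take capture group 1 of the n-th match of the pattern) can be sketched in Python. This is only an illustration of the technique, not a statement about how DB2 or Oracle evaluate the call; the helper name nth_match is hypothetical.

```python
import re
from itertools import islice

# Same pattern as in the REGEXP_SUBSTR call: a lazy token followed by a pipe
# or by the end of the string.
TOKEN = re.compile(r'(.*?)(\||$)')

def nth_match(text, n):
    """Return capture group 1 of the n-th match (1-based), or None if absent.

    nth_match is a hypothetical helper introduced for this sketch.
    """
    match = next(islice(TOKEN.finditer(text), n - 1, None), None)
    return match.group(1) if match else None

sample = 'AA||CCCCCCCC|||FFFFFFFFFFF'
fourth = nth_match(sample, 4)  # the empty 4th element
sixth = nth_match(sample, 6)   # the 6th element
```

Because the token group is lazy and the delimiter is matched explicitly, an empty element still produces a match of its own, which is why empty tokens keep their position in the count.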




Answer 2:


Splitting the string on | would be more efficient, but a regex works too.
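
As a point of comparison, here is a minimal sketch of the splitting approach in Python. The question does not name a language, so Python and the helper name nth_token are assumptions made for illustration.

```python
def nth_token(text, n, sep='|'):
    """Return the n-th token (1-based) of text split on sep.

    nth_token is a hypothetical helper; empty tokens are preserved
    because str.split with an explicit separator keeps them.
    """
    tokens = text.split(sep)
    if not 1 <= n <= len(tokens):
        raise IndexError('token %d does not exist' % n)
    return tokens[n - 1]

sample = 'AA||CCCCCCCC|||FFFFFFFFFFF'
token_count = len(sample.split('|'))  # number of pipes plus one
fourth = nth_token(sample, 4)         # the empty 4th token
sixth = nth_token(sample, 6)          # the last token
```

Note that splitting an empty string this way still yields one (empty) token, so an empty input needs separate handling if it should count as zero tokens.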

Code

The number between the curly brackets, {X}, acts as a counter that starts at 0. Setting it to 0 returns the 1st element, setting it to 5 returns the 6th element, and so on.

^(?:[^|]*\|){5}\K[^|]*

Alternatively, if \K isn't supported in your regex engine, you can use the following (result in the first capture group):

^(?:[^|]*\|){5}([^|]*)

Explanation

  • ^ Assert position at the start of the line
  • (?:[^|]*\|){5} Match the following exactly 5 times
    • [^|]* Match any character except | any number of times
    • \| Match | literally
  • \K Resets the starting point of the match. Any previously consumed characters are no longer included in the final match
  • [^|]* Match any character except | any number of times
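
The capture-group variant can be tried out with a short Python sketch. Python's built-in re module does not support \K, so the sketch uses the first capture group instead; the function name nth_token_regex is hypothetical, and the counter is derived as n - 1 for a 1-based token number n.

```python
import re

def nth_token_regex(text, n):
    """Return the n-th pipe-separated token (1-based), or None if it does not exist.

    nth_token_regex is a hypothetical helper for this sketch.
    """
    # Skip n - 1 "token plus pipe" groups, then capture the next token.
    pattern = r'^(?:[^|]*\|){%d}([^|]*)' % (n - 1)
    match = re.match(pattern, text)
    return match.group(1) if match else None

sample = 'AA||CCCCCCCC|||FFFFFFFFFFF'
fourth = nth_token_regex(sample, 4)   # counter {3}: the empty 4th token
sixth = nth_token_regex(sample, 6)    # counter {5}: the 6th token
missing = nth_token_regex(sample, 7)  # counter {6}: only 5 pipes exist, so no match
```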


Source: https://stackoverflow.com/questions/48285174/regex-to-extract-nth-token-of-a-string-separated-by-pipes