Regex to extract nth token of a string separated by pipes

Question

I'm new to regex.

I need to count and extract tokens from the sample text below:

AA||CCCCCCCC|||FFFFFFFFFFF

Requesting the 4th token must return an empty string (''), and requesting the 6th must return 'FFFFFFFFFFF'.

Is such a regex possible?

Thanks in advance!

PS: For token counting I've matched '\|' and added 1 to the result if the string is not empty. Surely there is a more efficient way to do that using just a regex...

Answer 1:

This works on Oracle and allows for NULL (empty) list elements. The syntax of the REGEXP_SUBSTR call is the same in DB2, so I suspect it will work there too. Please try this to get the 6th element in the list:

regexp_substr('AA||CCCCCCCC|||FFFFFFFFFFF', '(.*?)(\||$)', 1, 6, 'c', 1)

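EDIT: 'c' makes the match case-sensitive. The other arguments are the start position (1), the occurrence to return (6), and the capture group to return (1, the token without its trailing pipe).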
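To see why the 6th occurrence of that pattern is the 6th token, here is a rough Python emulation of the occurrence and capture-group arguments. It is only a sketch for experimenting outside the database, not DB2 or Oracle code, and the helper name nth_occurrence_group is hypothetical.

```python
import re
from itertools import islice

SAMPLE = "AA||CCCCCCCC|||FFFFFFFFFFF"

# Same pattern as in the REGEXP_SUBSTR call: a lazy token (group 1)
# followed by either a pipe or the end of the string (group 2).
PATTERN = r"(.*?)(\||$)"


def nth_occurrence_group(text, pattern, occurrence, group=1):
    """Return the given capture group of the n-th (1-based) match, or None.

    Hypothetical helper that mimics the 'occurrence' and 'subexpression'
    arguments of REGEXP_SUBSTR; it is not part of any library.
    """
    matches = re.finditer(pattern, text)
    match = next(islice(matches, occurrence - 1, None), None)
    return match.group(group) if match else None


print(repr(nth_occurrence_group(SAMPLE, PATTERN, 4)))  # ''
print(repr(nth_occurrence_group(SAMPLE, PATTERN, 6)))  # 'FFFFFFFFFFF'
```

Each match consumes one token plus its delimiter, so an empty token between two adjacent pipes still counts as an occurrence.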

Answer 2:

Splitting the string on | would be more efficient, but the regex below works too.
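For comparison, a minimal sketch of the split approach, written in Python purely for illustration (the question targets DB2, where the equivalent would look different). The function names are hypothetical. It also covers the token count asked about in the question.

```python
SAMPLE = "AA||CCCCCCCC|||FFFFFFFFFFF"


def split_tokens(text, sep="|"):
    """Split text on sep, keeping the empty tokens between adjacent separators."""
    return text.split(sep)


def nth_token(text, n, sep="|"):
    """Return the n-th (1-based) token, or None if there are fewer than n tokens."""
    tokens = split_tokens(text, sep)
    return tokens[n - 1] if 1 <= n <= len(tokens) else None


tokens = split_tokens(SAMPLE)
print(len(tokens))                  # 6, i.e. number of pipes + 1
print(repr(nth_token(SAMPLE, 4)))   # ''
print(repr(nth_token(SAMPLE, 6)))   # 'FFFFFFFFFFF'
```

Under this convention an empty input string still yields one (empty) token, so a caller that wants a count of 0 for empty input has to special-case it.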

Code

The number between the curly brackets, {X}, is the counter: it is how many tokens (each with its trailing pipe) to skip before the one you want. It is zero-based, so set it to 0 to get the 1st element, to 5 to get the 6th element, and so on (X = n - 1 for the nth element).


^(?:[^|]*\|){5}\K[^|]*

Alternatively, if \K isn't supported in your regex engine, you can use the following (result in the first capture group):

^(?:[^|]*\|){5}([^|]*)

Explanation

^ Assert position at the start of the line
(?:[^|]*\|){5} Match the following exactly 5 times
- [^|]* Match any character except | any number of times
- \| Match | literally
\K Resets the starting point of the match. Any previously consumed characters are no longer included in the final match
[^|]* Match any character except | any number of times
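The capture-group variant can be tried out as follows. This is a sketch in Python, whose built-in re module does not support \K, so the token is read from group 1; the helper name nth_token_regex is hypothetical and the counter is filled in as n - 1.

```python
import re

SAMPLE = "AA||CCCCCCCC|||FFFFFFFFFFF"


def nth_token_regex(text, n):
    """Return the n-th (1-based) pipe-separated token, or None if absent.

    Builds ^(?:[^|]*\\|){n-1}([^|]*) and reads the token from group 1.
    """
    if n < 1:
        raise ValueError("n must be 1 or greater")
    # Skip exactly n - 1 "token + pipe" pairs, then capture the next token.
    pattern = r"^(?:[^|]*\|){%d}([^|]*)" % (n - 1)
    match = re.match(pattern, text)
    return match.group(1) if match else None


print(repr(nth_token_regex(SAMPLE, 4)))  # ''
print(repr(nth_token_regex(SAMPLE, 6)))  # 'FFFFFFFFFFF'
print(repr(nth_token_regex(SAMPLE, 7)))  # None, only 6 tokens exist
```

A result of None (no match) is what distinguishes a missing token from an empty one, which comes back as ''.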

Source: https://stackoverflow.com/questions/48285174/regex-to-extract-nth-token-of-a-string-separated-by-pipes

Tags

regex

db2

db2-400