Regex find comma not inside quotes

不思量自难忘° 2020-12-05 07:52

I'm checking line by line in C#.

Example data:

bob jones,123,55.6,,,"Hello , World",,0
jim neighbor,432,66.5,,,Andy "Blank,,1
john smith,555,77.4,


        
5 answers
  • 2020-12-05 07:53

    The regexes below each parse a single field in a line, not the entire line.

    Apply the methodical and desperate regex technique: Divide and conquer

    Case: field does not contain a quote

    • abc,
    • abc(end of line)

    [^,"]*(,|$)

    Case: field contains exactly two quotes

    • abc"abc,"abc,
    • abc"abc,"abc(end of line)

    [^,"]*"[^"]*"[^,"]*(,|$)

    Case: field contains exactly one quote

    • abc"abc(end of line)
    • abc"abc, (and that there's no quote before the end of this line)

    [^,"]*"[^,"]*$

    [^,"]*"[^,"]*,(?!.*")

    Now that we have all the cases, we then '|' everything together and enjoy the resultant monstrosity.
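As an illustration, the cases above can be OR-ed together and applied field by field. This is a minimal Python sketch rather than the answerer's own code: the names FIELD and split_by_cases are hypothetical, and it assumes the one-quote patterns are meant to allow any run of characters after the quote (that is, [^,"]* rather than a single character).

```python
import re

# The four cases, OR-ed together in the order they are tried.
# Assumption: the one-quote cases use [^,"]* (any run) after the quote.
FIELD = re.compile(
    r'[^,"]*(?:,|$)'                # no quote
    r'|[^,"]*"[^"]*"[^,"]*(?:,|$)'  # exactly two quotes
    r'|[^,"]*"[^,"]*$'              # one quote, field ends the line
    r'|[^,"]*"[^,"]*,(?!.*")'       # one quote, no later quote on the line
)


def split_by_cases(line):
    """Consume one field at a time until a field without a trailing comma."""
    fields, pos = [], 0
    while True:
        m = FIELD.match(line, pos)
        if m is None:
            # None of the cases applies, e.g. three quotes in one field.
            raise ValueError(f"cannot parse field at offset {pos}")
        token = m.group()
        if token.endswith(","):
            fields.append(token[:-1])  # drop the delimiter
            pos = m.end()
        else:
            fields.append(token)       # last field on the line
            return fields


print(split_by_cases('jim neighbor,432,66.5,,,Andy "Blank,,1'))
```

Matching from an explicit position, instead of using finditer, avoids the extra empty match that the no-quote case would otherwise produce at the end of the line.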

  • 2020-12-05 08:04

    Stand back and be amazed!


    Here is the regex you seek:

    (?!\B"[^"]*),(?![^"]*"\B)


    Here is a demonstration:

    regex101 demo


    • It does not match the second line because the " you inserted does not have a closing quotation mark.
    • It will not match values like so: ,r"a string",10 because the letter on the edge of the " will create a word boundary, rather than a non-word boundary.
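To see how the word-boundary test behaves, here is a small Python sketch that splits on the pattern above. The helper name split_line and the second input are made up for illustration; the pattern itself is unchanged.

```python
import re

# The pattern from the answer above, unchanged.
COMMA = re.compile(r'(?!\B"[^"]*),(?![^"]*"\B)')


def split_line(line):
    """Split on every comma the pattern accepts."""
    return COMMA.split(line)


# First line of the question's sample data: the comma inside the quotes
# is followed by a closing quote and then a comma, so it is rejected.
print(split_line('bob jones,123,55.6,,,"Hello , World",,0'))

# Hypothetical input: the opening quote is followed by "-", a non-word
# character, so the comma before it is rejected as well.
print(split_line('x,"-1",y'))
```

The second call leaves x,"-1" as a single piece, which shows that the result depends on which characters sit next to the quotes.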

    Alternative version

    (".*?,.*?"|.*?(?:,|$))

    This matches each field's content together with its trailing comma, and it copes with values that are full of punctuation marks.

    regex101 demo

  • 2020-12-05 08:06

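    Try this pattern: ".*?"(*SKIP)(*FAIL)|, (demo). Note that (*SKIP) and (*FAIL) are PCRE backtracking control verbs, so this needs a PCRE-compatible engine; .NET's Regex class does not support them.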
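Python's built-in re module has no (*SKIP)(*FAIL) verbs, but the same idea can be approximated with alternation: match a quoted run first and ignore it, and capture only the commas that are left. A minimal sketch follows; TOKEN and comma_positions are hypothetical names.

```python
import re

# Left branch swallows a quoted run; right branch captures a bare comma.
TOKEN = re.compile(r'".*?"|(,)')


def comma_positions(line):
    """Return the offsets of commas that are not inside a quoted run."""
    return [m.start(1) for m in TOKEN.finditer(line) if m.group(1)]


print(comma_positions('a,"b,c",d'))
```

Only matches where group 1 took part are kept, which plays the role that (*SKIP)(*FAIL) plays in the original pattern.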

  • 2020-12-05 08:08
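    import re

    string = 'bob jones,123,55.6,,,"Hello , World",,0'
    # Remove every comma that is followed by an odd number of quotes, i.e. a comma inside quotes.
    print(re.sub(r',(?=[^"]*"[^"]*(?:"[^"]*"[^"]*)*$)', "", string))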
    
  • 2020-12-05 08:13

    The top answer, by Vasili Syrakis, does not work when a quoted value is a negative number, for example:

    bob jones,123,"-55.6",,,"Hello , World",,0
    jim neighbor,432,66.5
    

    The following regex works for this purpose:

    ,(?!(?=[^"]*"[^"]*(?:"[^"]*"[^"]*)*$))
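As a quick check, the pattern can be used to split the sample line. This is a Python sketch with hypothetical names (OUTSIDE_QUOTES, split_outside_quotes); the regex is the one given above. A comma is accepted only when an even number of quotes follows it up to the end of the line.

```python
import re

# A comma that is NOT followed by an odd number of quotes.
OUTSIDE_QUOTES = re.compile(r',(?!(?=[^"]*"[^"]*(?:"[^"]*"[^"]*)*$))')


def split_outside_quotes(line):
    """Split a line on commas that lie outside balanced quotes."""
    return OUTSIDE_QUOTES.split(line)


fields = split_outside_quotes('bob jones,123,"-55.6",,,"Hello , World",,0')
print(fields)
```

The quoted negative number and the quoted "Hello , World" each come back as a single field, eight fields in total.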
    

    However, I could not get it to work for this part of the input:

    ,Andy "Blank,
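One way around an unmatched quote like this is to drop the regex and scan the line character by character. The sketch below is only an illustration and rests on an assumption the question does not confirm: a quote with no closing partner later on the line is treated as an ordinary character. The name split_fields_scan is hypothetical.

```python
def split_fields_scan(line):
    """Split on commas, skipping over spans enclosed in a pair of quotes."""
    fields, start, i = [], 0, 0
    while i < len(line):
        ch = line[i]
        if ch == '"':
            close = line.find('"', i + 1)
            # Jump to the closing quote only when one exists; a lone
            # quote is assumed to be plain data and is stepped over.
            if close != -1:
                i = close
        elif ch == ",":
            fields.append(line[start:i])
            start = i + 1
        i += 1
    fields.append(line[start:])  # text after the last delimiter
    return fields


print(split_fields_scan('jim neighbor,432,66.5,,,Andy "Blank,,1'))
```

With that assumption, the field Andy "Blank stays intact and the commas after it still act as delimiters.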
    