Regex match all lines that don't end with ,0 and ,1

刺人心 2020-12-21 10:18

I have a malformed CSV file which has two columns: Text,Value

The value is either 1 or 0, but some lines are malformed and span two lines:

1. \"This          


        
4 answers
  • 2020-12-21 10:18

    ,[^01]$

    Make sure regex mode is on.

  • 2020-12-21 10:18

    General considerations

    In general, to match a line that does not end with a specific pattern, you may use

    ^(?!.*pattern$).*$
    

    where ^ matches the start of a line, (?!.*pattern$) is a negative lookahead that fails the match if there are 0 or more chars other than line break chars, as many as possible (.*), followed by pattern at the end of the line ($), and the .*$ actually matches the line.
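As a rough illustration of the general pattern outside an editor, here is a minimal Python sketch. It assumes Python's `re` module with `re.MULTILINE`, so that ^ and $ work per line; the function name `lines_not_ending_with` and the sample data are made up for this example.

```python
import re

def lines_not_ending_with(text, ending):
    """Return the lines of `text` that do not end with the regex fragment `ending`."""
    # ^(?!.*(?:ending)$).*$ : the lookahead rejects a line when `ending`
    # sits right before the end of that line; .*$ then consumes the line.
    pattern = re.compile(r"^(?!.*(?:" + ending + r")$).*$", re.MULTILINE)
    return [m.group(0) for m in pattern.finditer(text)]

# Hypothetical sample: the second line is the first half of a record split in two.
sample = 'good,1\n"This is\nbroken",0\nfine,0'
print(lines_not_ending_with(sample, r",[01]"))
```

Note that an empty line (including the one after a trailing line break) does not end with the pattern either, so it would be reported as a match too.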

    To remove a line that does not end with some pattern together with a line break at the end, use

    ^(?!.*pattern$).*\R?
    

    where \R? is an optional line break sequence.
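The removal variant can be sketched in Python as well. Python's `re` module has no `\R`, so this sketch assumes the text uses plain `\n` line endings and writes `\n?` in its place; `drop_lines_not_ending_with` is a hypothetical helper name.

```python
import re

def drop_lines_not_ending_with(text, ending):
    """Delete every line that does not end with the regex fragment `ending`."""
    # \n? stands in for \R? here (assumption: the input uses \n line endings).
    pattern = re.compile(r"^(?!.*(?:" + ending + r")$).*\n?", re.MULTILINE)
    return pattern.sub("", text)

sample = 'good,1\n"This is\nbroken",0\nfine,0\n'
print(drop_lines_not_ending_with(sample, r",[01]"))
```

With this sample, only the line `"This is` is dropped; the other three lines are kept unchanged.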

    In case of several fixed strings, you may use

    ^(?!.*(?:pattern|pattern2|patternN)$).*\R?
    

    If there are just one or two fixed strings to check at the end of the line, you may use a slightly quicker regex like

    ^.*$(?<!a)(?<!bcd)
    

    which matches any line that ends with neither a nor bcd. For example, for lines that end with neither 1 nor 0:

    ^.*$(?<!1)(?<!0)
    
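The lookbehind form can be tried in Python too, since each lookbehind has a fixed width. This is only a sketch: `lookbehind_regex` is a made-up helper that builds `^.*$(?<!…)(?<!…)` from a list of literal endings.

```python
import re

def lookbehind_regex(endings):
    """Build ^.*$(?<!e1)(?<!e2)... for a list of literal line endings."""
    # Each ending gets its own fixed-width negative lookbehind, checked
    # once .*$ has already reached the end of the line.
    checks = "".join("(?<!" + re.escape(e) + ")" for e in endings)
    return re.compile(r"^.*$" + checks, re.MULTILINE)

sample = "xa\nxbcd\nxyz\nb"
print(lookbehind_regex(["a", "bcd"]).findall(sample))
```

Here `xa` and `xbcd` are rejected by the lookbehinds, while `xyz` and `b` are returned.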

    Current problem solution

    So, for the current issue, to match a line not ending with 1 or 0, you may use

    ^(?!.*[01]$).*$    # without the line break
    ^(?!.*[01]$).*$\R? # with the line break
    

    Or,

    ^.*(?<![01])$    # without the line break
    ^.*(?<![01])$\R? # with the line break
    

    To remove/replace a line break on a line that does not end with a specific pattern you may use

    (?<![01])$\R?
    

    Replace with either an empty string (to remove the line break) or with any other delimiter string or character.
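For the CSV in the question, joining the split records might look like the following Python sketch. It assumes `\n` line endings (Python's `re` has no `\R`, so `\n?` is used instead), and `join_broken_lines` is a hypothetical name.

```python
import re

def join_broken_lines(text, delimiter=""):
    """Replace the line break after any line not ending in 0 or 1 with `delimiter`."""
    # (?<![01])$ matches at the end of a line whose last char is not 0 or 1;
    # \n? then consumes the line break so the next line is pulled up.
    return re.sub(r"(?<![01])$\n?", delimiter, text, flags=re.MULTILINE)

sample = 'good,1\n"This is\nbroken",0\nfine,0\n'
print(join_broken_lines(sample))
```

Because the line break is optional in the pattern, it also matches (emptily) at the very end of the text, so a non-empty delimiter would be appended there as well. Empty lines are removed for the same reason.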

  • 2020-12-21 10:26

    The expression you would use is

    ([^,].|,[^01])$
    

    Unfortunately, older versions of Notepad++ do not support alternation (the | operator) [1], so you can match the broken lines with these two expressions instead:

    [^,].$
    ,[^01]$
    

    Except, of course, if the "Text" part does end in ,0 or ,1 itself. :-)
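The two alternation-free expressions can be checked against sample data with a short Python sketch. It tests each line separately with both patterns, which mirrors running two separate searches; `broken_lines` and the sample rows are invented for illustration.

```python
import re

# The two expressions from above, applied one after the other instead of
# being combined with the | operator.
PATTERNS = [re.compile(r"[^,].$"), re.compile(r",[^01]$")]

def broken_lines(text):
    """Return the lines matched by at least one of the two patterns."""
    return [line for line in text.splitlines()
            if any(p.search(line) for p in PATTERNS)]

sample = 'good,1\n"This is\nbroken",0\noops,x'
print(broken_lines(sample))
```

The first pattern catches `"This is` (no comma before the last character) and the second catches `oops,x` (a comma followed by something other than 0 or 1).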

    [1] http://sourceforge.net/apps/mediawiki/notepad-plus/index.php?title=Unsupported_Regex_Operators

  • 2020-12-21 10:31

    I don't know how the other answer would work.

    Something like the following is what I would use in Notepad++:

    [^,][^01]$
    

    Here are the steps I followed:

    Use ([^,][^01])$ to match the lines and replace each match with \1{marked}

    Then switch to Extended mode and replace {marked}\r\n with an empty string to get a single line.
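The same two-step marker idea can be sketched in Python. This is an adaptation rather than an exact copy of the Notepad++ steps: it assumes `\n` line endings instead of `\r\n`, and it adds `\n` to both character classes because a negated class in Python would otherwise match a line break. `join_with_marker` is a hypothetical name, and the marker is assumed not to occur in the data.

```python
import re

MARKER = "{marked}"  # assumption: this string never appears in the real data

def join_with_marker(text):
    """Mark lines whose last two chars look broken, then drop marker + line break."""
    # Step 1: append the marker to lines matching ([^,][^01])$.
    marked = re.sub(r"([^,\n][^01\n])$", r"\1" + MARKER, text, flags=re.MULTILINE)
    # Step 2: remove the marker together with the line break that follows it.
    return marked.replace(MARKER + "\n", "")

sample = 'good,1\n"This is\nbroken",0\n'
print(join_with_marker(sample))
```

With this sample, the second line is marked and then merged with the third.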
