I have a text file that denotes remarks with a single '.
Some lines have two quotes, but I need to get everything from the first instance of a ' to the end of the line.
When I tried '.* in Windows (Notepad++), it matched everything from the first ' to the end of the last line.
To capture everything until end of that line I typed the following:
'.*?\n
This would only capture everything from ' until end of that line.
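To illustrate the difference, here is a minimal Python sketch. Python's re module stands in for Notepad++ here, and the sample text is made up. It assumes the greedy match in the question ran with the dot allowed to match line breaks, which re.DOTALL reproduces.

```python
import re

# Hypothetical sample text: code followed by a remark introduced by '
text = "x = 1 'first remark\ny = 2 'second 'remark\nz = 3\n"

# Greedy dot that also matches line breaks: one match that runs
# from the first ' to the very end of the text.
greedy = re.findall(r"'.*", text, flags=re.DOTALL)

# Lazy dot that stops at the first linefeed: one match per remark line.
lazy = re.findall(r"'.*?\n", text, flags=re.DOTALL)

print(greedy)
print(lazy)
```

The lazy version includes the trailing linefeed in each match, so strip it if you only want the remark text.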
The appropriate regex would be the ' char followed by any number of any chars [including zero chars] ending with an end of string/line token:
'.*$
And if you wanted to capture everything after the ' char but not include it in the output, you would use:
(?<=').*$
This basically says give me all characters that follow the ' char until the end of the line.
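As a rough check of both patterns, here is a small Python sketch. The sample text is invented, and re.MULTILINE is my assumption for making $ match at the end of every line rather than only at the end of the text.

```python
import re

# Hypothetical sample text; the second line contains two quotes.
text = "x = 1 'first remark\ny = 2 'second 'remark\nz = 3"

# '.*$ keeps the leading quote in each match.
with_quote = re.findall(r"'.*$", text, flags=re.MULTILINE)

# (?<=').*$ uses a lookbehind, so the quote is required but not returned.
without_quote = re.findall(r"(?<=').*$", text, flags=re.MULTILINE)

print(with_quote)     # ["'first remark", "'second 'remark"]
print(without_quote)  # ['first remark', "second 'remark"]
```

In both cases the match starts at the first ' on the line, so a second quote later in the line is simply part of the captured remark.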
Edit: It has been noted that $ is implicit when using .* (as long as . does not match line breaks) and is therefore not strictly required, so the pattern:
'.*
is technically correct. However, it is clearer to be specific and avoid confusion during later code maintenance, hence my use of the $. I believe it is always better to declare explicit behaviour than to rely on implicit behaviour in situations where clarity could be questioned.
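A quick way to compare the two forms is the Python sketch below. The sample text is made up, and the behaviour of $ shown here is specific to Python's re module; other engines and editors may default differently.

```python
import re

# Hypothetical sample text with remarks on the first two lines only.
text = "x = 1 'first remark\ny = 2 'second 'remark\nz = 3"

# Without DOTALL the dot stops at a line break, so '.* already ends at the line end.
implicit = re.findall(r"'.*", text)

# The explicit form gives the same matches once $ is allowed to match at each line end.
explicit = re.findall(r"'.*$", text, flags=re.MULTILINE)

# Without MULTILINE, $ only matches at the end of the whole text,
# so no remark on an earlier line can satisfy the pattern.
explicit_no_flag = re.findall(r"'.*$", text)

print(implicit == explicit)  # True
print(explicit_no_flag)      # []
```

So the explicit $ documents the intent, but it also makes the pattern depend on how the engine treats line ends.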
This will capture everything up to the ' in backreference 1 and everything after the ' in backreference 2. Depending on the language, you may need to escape the apostrophes (\').
/^([^']*)'?(.*)$/
Quick modification: if the line doesn't have a ', backreference 1 should still catch the whole line.
^ - start of string
([^']*) - capture any number of not ' characters
'? - match the ' 0 or 1 time
(.*) - capture any number of characters
$ - end of string
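Here is a minimal Python sketch of the pattern broken down above. The helper name split_remark is hypothetical, and it assumes each input is a single line with no embedded linefeed.

```python
import re

# Same pattern as above, without the / delimiters that Python does not use.
PATTERN = re.compile(r"^([^']*)'?(.*)$")

def split_remark(line):
    """Return (text before the first ', text after it) for a single line."""
    match = PATTERN.match(line)
    if match is None:
        # Only expected if the input contains a linefeed after a quote.
        raise ValueError("expected a single line of text")
    return match.group(1), match.group(2)

print(split_remark("y = 2 'second 'remark"))  # ('y = 2 ', "second 'remark")
print(split_remark("z = 3"))                  # ('z = 3', '')
```

When the line has no ', the first group holds the whole line and the second group is empty.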
'.*
I believe you need the Multiline option.
In your example I'd go for the following pattern:
'([^\n]+)$
Use the multiline and global options to match all occurrences.
To include the linefeed in the match you could use:
'[^\n]+\n
But this might miss the last line if it has no linefeed.
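The last-line problem is easy to reproduce. In this Python sketch the sample text is invented, re.findall plays the role of the global option, and re.MULTILINE plays the role of the multiline option.

```python
import re

# Hypothetical sample text whose final line has no trailing linefeed.
text = "x = 1 'first remark\ny = 2 'last remark"

# Requiring a linefeed drops the remark on the final line.
with_linefeed = re.findall(r"'[^\n]+\n", text)

# Anchoring with $ in multiline mode finds both remarks;
# findall returns the captured group, so the quote is not included.
anchored = re.findall(r"'([^\n]+)$", text, flags=re.MULTILINE)

print(with_linefeed)  # ["'first remark\n"]
print(anchored)       # ['first remark', 'last remark']
```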
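For a single line, if you don't need to match the linefeed, I'd prefer the following (note that inside a character class $ is a literal dollar sign, so this assumes the text contains no $ characters):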
'[^$]+$
'.*$
Starting with a single quote ('), match any character (.) zero or more times (*) until the end of the line ($).