What is a cross platform regex for removal of line breaks?

囚心锁ツ 2020-12-08 04:36

I am sure this has been asked before, but I cannot find it.

Basically, assuming you are parsing a text file of unknown origin and want to replace line breaks with so

5 answers
  •  小蘑菇 (original poster)
    2020-12-08 05:21

    The regex to find any Unicode line terminator should be (?>\x0D\x0A?|[\x0A-\x0C\x85\x{2028}\x{2029}]) rather than as drewk wrote it, at least in Perl. This is taken directly from the perl 5.10.0 documentation (it was removed in later versions). Note the braces after \x: U+2029 is \x{2029}, but \x2029 is an ASCII space (\x20, U+0020) followed by the digit 2 and the digit 9. Also, \n outside a character class is not guaranteed to match \x{0a}.
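
For anyone not on Perl, here is a rough Python sketch of the same pattern. It is my own translation, not something from the Perl docs. Python's `re` module has no \x{...} syntax, so I spell the two wide code points as \u2028 and \u2029, and I use a plain non-capturing group because atomic groups only arrived in Python 3.11. The function name `replace_line_breaks` and the default replacement of a single space are assumptions for illustration, since the question does not say what the line breaks should become.

```python
import re

# Translation of the Perl pattern quoted above (an assumption, not an official port):
#   \r\n?                      -> CR LF as one unit, or a lone CR
#   [\x0a-\x0c\x85\u2028\u2029] -> LF, VT, FF, NEL, LINE SEPARATOR, PARAGRAPH SEPARATOR
LINE_BREAK = re.compile(r"\r\n?|[\x0a-\x0c\x85\u2028\u2029]")


def replace_line_breaks(text: str, replacement: str = " ") -> str:
    """Replace every line terminator in `text` with `replacement`.

    `replacement` defaults to a single space purely for illustration.
    """
    return LINE_BREAK.sub(replacement, text)


if __name__ == "__main__":
    sample = "one\r\ntwo\rthree\nfour\u2028five"
    print(replace_line_breaks(sample))
```

Because \r\n? comes first in the alternation, a Windows-style CR LF pair is replaced once rather than twice. Note that this only works on `str` input; for `bytes` the file would have to be decoded first.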
