awk: Split on “\n”

柔情痞子 submitted on 2021-02-08 08:32:30

Question


I'm trying to process a log file in which entries are compressed into one line with the newline encoded as "\n". I want to keep everything up to the first "\n" and discard the rest. awk -F"\n" '{print $1}' file doesn't work, and neither does awk -F"\\n" '{print $1}' file. What's the correct form of this command?


Answer 1:


$ echo 'a\nb'
a\nb

$ echo 'a\nb' | awk -F'\\\\n' '{print $1}'
a

Here's why: Consider these uses of the above characters in regexp comparisons:

  • n = the literal character n ($0 ~ /n/)
  • \n = a literal newline character ($0 ~ /\n/)
  • \\ = a backslash when used in a regexp constant ($0 ~ /\\/)
  • \\\\ = a backslash when used in a dynamic regexp ($0 ~ "\\\\")

That last one is because a dynamic regexp is a string that gets parsed twice: once when the string is converted to a regexp, and again when it is used as that regexp. Since it gets parsed twice, every escape has to be doubled.

Since a field separator is basically a regexp (with a few twists), when you say -F "whatever" you are setting the FS variable to a dynamic regexp, so escapes have to be doubled.
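As a minimal sketch of the rules above, the following compares a regexp constant, a dynamic regexp, and -F on the same input. The variable names line and first are only illustrative, and printf is used instead of echo because some shells' echo expands \n itself.

```shell
# A line containing a literal backslash followed by n (not a real newline).
line='a\nb'

# Regexp constant: parsed once, so \\ stands for one literal backslash.
printf '%s\n' "$line" | awk '$0 ~ /\\n/ { print "constant matches" }'

# Dynamic regexp (a string): parsed twice, so the backslash is written four times.
printf '%s\n' "$line" | awk '$0 ~ "\\\\n" { print "dynamic matches" }'

# -F assigns FS, which is treated as a dynamic regexp, so the same doubling applies.
first=$(printf '%s\n' "$line" | awk -F'\\\\n' '{ print $1 }')
printf '%s\n' "$first"   # prints: a
```

Both match lines should print, and the last command should print only the text before the first \n.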




Answer 2:


Since you want to remove everything after \n, use

awk -F '\\\\n' '{print $1}' fileName

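This will look for the literal two-character sequence \n and print everything up to it. The four backslashes are reduced to two when the -F string is parsed, and those two then match a single literal backslash in the regexp; the n is not escaped and simply matches itself.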
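If the doubled escaping feels error-prone, one alternative (not taken from the answers above) is to skip regexps and search for the separator as a plain string with index(). This is a sketch only; first_segment is a hypothetical helper name. Inside an awk string, "\\n" is parsed once and yields a backslash followed by n.

```shell
# Hypothetical helper: print each input line up to the first literal \n.
first_segment() {
  awk '{
    i = index($0, "\\n")                     # position of backslash + n, 0 if absent
    print (i ? substr($0, 1, i - 1) : $0)    # keep the whole line when there is no \n
  }'
}

printf '%s\n' 'a\nb\nc' | first_segment      # prints: a
printf '%s\n' 'no separator' | first_segment # prints: no separator
```

Because index() does a literal string search, only one level of escaping is involved here.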



Source: https://stackoverflow.com/questions/43924863/awk-split-on-n
