awk: Split on “\n”

Question

I'm trying to process a log file in which entries are compressed into one line with the newline encoded as "\n". I want to keep everything up to the first "\n" and discard the rest. awk -F"\n" '{print $1}' file doesn't work, and neither does awk -F"\\n" '{print $1}' file. What's the correct form of this command?

Answer 1:

$ echo 'a\nb'
a\nb

$ echo 'a\nb' | awk -F'\\\\n' '{print $1}'
a

Here's why: Consider these uses of the above characters in regexp comparisons:

n = the literal character n ($0 ~ /n/)
\n = a literal newline character ($0 ~ /\n/)
\\ = a backslash when used in a regexp constant ($0 ~ /\\/)
\\\\ = a backslash when used in a dynamic regexp ($0 ~ "\\\\")

That last one is because a dynamic regexp is a string, and it gets parsed twice: once when the string is converted to a regexp, and again when that regexp is used. Every escape therefore has to be doubled.

A field separator is basically a regexp (with a few twists), so when you say -F "whatever" you are defining the FS variable as a dynamic regexp, and its escapes have to be doubled.
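To make the regexp constant versus dynamic regexp distinction concrete, here is a minimal sketch that truncates the same line both ways. The sample text and the variable names (line, via_sub, via_fs) are made up for illustration, and sub() with a regexp constant is an alternative approach, not the one used in the answer above.

```shell
# Sample line containing a literal backslash followed by n (not a real newline).
line='first entry\nsecond entry\nthird entry'

# Regexp constant: parsed once, so two backslashes match one literal backslash.
# sub() deletes everything from the first \n to the end of the line.
via_sub=$(printf '%s\n' "$line" | awk '{ sub(/\\n.*/, ""); print }')

# Dynamic regexp (FS is a string): parsed twice, so four backslashes are needed.
via_fs=$(printf '%s\n' "$line" | awk 'BEGIN { FS = "\\\\n" } { print $1 }')

printf '%s\n' "$via_sub" "$via_fs"
```

Both commands should print the text before the first \n. printf '%s\n' is used instead of echo because some shells' echo expands \n into a real newline.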

Answer 2:

Since you want to remove everything after \n, use

awk -F '\\\\n' '{print $1}' fileName

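This will look for the literal two-character sequence \n and print everything up to it. The four backslashes are reduced to a single literal backslash once awk has parsed the separator twice, and the n that follows is an ordinary character that needs no escaping.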
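As for why the two attempts in the question fail, the sketch below prints the argument that awk actually receives under each quoting style. It assumes a POSIX-style shell such as bash; the variable names are only for illustration.

```shell
# Capture the exact argument the shell hands to awk for each quoting style.
dq_one=$(printf '%s' -F"\n")      # double quotes, one backslash
dq_two=$(printf '%s' -F"\\n")     # double quotes, two backslashes
sq_four=$(printf '%s' -F'\\\\n')  # single quotes, four backslashes

printf '%s\n' "$dq_one" "$dq_two" "$sq_four"
```

Inside double quotes the shell turns \\ into \ and leaves \n alone, so the first two forms are identical by the time awk sees them. awk then reads \n as a real newline, and because input is already split into records on newlines, that separator never matches and $1 is the whole line. Only the single-quoted form delivers all four backslashes to awk.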

Source: https://stackoverflow.com/questions/43924863/awk-split-on-n

Tags

awk
