bash grep text within squared brackets

Submitted by 夏天 on 2020-03-19 06:48:07

Question


I am trying to grep some text from a log file in bash on Linux. The text is enclosed in square brackets.

e.g. in:

32432423 jkhkjh [234] hkjh32 2342342

I am searching for 234.

Usually this regular expression should find it:

 \[(.*?)\]

but it does not work with

|grep \[(.*?)\]

What is the correct way to do this regular expression search with grep?


Answer 1:


You can match the opening bracket and then discard it from the reported match with the \K escape sequence. Then match up to the closing bracket:

$ grep -Po '\[\K[^]]*' <<< "32432423 jkhkjh [234] hkjh32 2342342"
234

Note that you can omit -P (Perl-compatible regular expressions) by writing:

$ grep -o '\[.*]' <<< "32432423 jkhkjh [234] hkjh32 2342342"
[234]

However, as you can see, this also prints the brackets. That is why -P is useful: it supports \K as well as lookbehind and lookahead assertions.
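As a rough sketch of the lookbehind/lookahead alternative mentioned above, the same value can be extracted without \K. This assumes GNU grep built with PCRE support (the -P option); the variable names are only illustrative.

```shell
#!/usr/bin/env bash
# Assumption: GNU grep with -P (PCRE) support is available.
line='32432423 jkhkjh [234] hkjh32 2342342'

# (?<=\[) : lookbehind, requires a "[" immediately before the match
# [^]]*   : any run of characters other than "]"
# (?=])   : lookahead, requires a "]" immediately after the match
inner=$(grep -Po '(?<=\[)[^]]*(?=])' <<< "$line")

echo "$inner"
```

Neither assertion consumes any text, so only the part between the brackets is reported.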

You also use ? in your regexp. As you already know, *? makes the match non-greedy. Let's see an example:

$ grep -Po '\[.*?]' <<< "32432423 jkhkjh [23]4] hkjh32 2342342"
[23]
$ grep -Po '\[.*]' <<< "32432423 jkhkjh [23]4] hkjh32 2342342"
[23]4]

With .*?, the match in [23]4] stops at the first ], giving [23]. With plain .*, it runs up to the last ], giving [23]4]. This non-greedy behaviour only works with the -P option.




Answer 2:


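[ has special meaning to both the shell and grep, so you need to quote it twice. The backslashes prevent grep from treating the brackets as the delimiters of a bracket expression; quoting the entire pattern prevents the shell from trying to expand it as a glob pattern before passing it to grep. Note that the group ( ) and the non-greedy .*? are Perl-style syntax, so grep also needs the -P option; in grep's default basic regular expressions, ( ) and ? are literal characters.

... | grep -P '\[(.*?)\]'

In your attempt, the shell consumed the backslashes, which only served to make the shell treat the brackets literally, so the pattern was approximately equivalent to ... | grep '[(.*?)]', a bracket expression that matches a single character. The unquoted parentheses are also shell syntax, which is another reason to quote the whole pattern.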
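The quoting effect described above can be checked with a small sketch. Here show_arg is a hypothetical helper, not part of grep or bash, that prints its first argument exactly as a command would receive it; the sample strings are made up.

```shell
#!/usr/bin/env bash
# show_arg is a hypothetical helper: it prints its first argument verbatim.
show_arg() { printf '%s\n' "$1"; }

unquoted=$(show_arg \[234\])    # the shell removes the backslashes
quoted=$(show_arg '\[234\]')    # single quotes pass them through untouched

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

# Without the backslash, grep reads [234] as a bracket expression:
# it matches the single characters 2, 3 and 4 one at a time.
as_class=$(grep -o '[234]' <<< 'id [234]' | paste -sd ' ' -)
# With the backslash, the opening bracket is literal.
as_literal=$(grep -o '\[234]' <<< 'id [234]')

echo "bracket expression: $as_class"
echo "literal bracket:    $as_literal"
```

Printing the argument first is a quick way to see what pattern grep actually receives before debugging the regular expression itself.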




Answer 3:


I prefer \\[[^]]*] (that is: \\[ [^]]* ], i.e. anything but right square brackets, inside square brackets) over \\[.*] because of greediness. The pattern is unquoted here, so the shell turns \\[ into \[ before grep sees it:

$ grep -o \\[.*] <<<"[this] and that too]"
[this] and that too]

vs.

$ grep -o \\[[^]]*] <<<"[this] and that too]"
[this]

Then again, grep is not the tool for everything (its name comes from g/re/p, after all). If you just want what's inside the square brackets, I'd use sed for that:

$ sed 's/.*\[\([^]]*\)].*/\1/' foo
234

i.e. replace everything on the line with what's inside the parentheses.
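One caveat with the sed command above is that lines without brackets are printed unchanged. A minimal sketch of a variant that prints only the lines where the substitution succeeded, using sed's standard -n option and p flag; the sample log lines are made up for illustration.

```shell
#!/usr/bin/env bash
# log_lines is made-up sample data, not taken from the original question.
log_lines='32432423 jkhkjh [234] hkjh32 2342342
no brackets on this line
99 abc [error 7] xyz'

# -n suppresses sed's automatic printing; the trailing p prints a line
# only when the substitution was actually performed on it.
inside=$(sed -n 's/.*\[\([^]]*\)].*/\1/p' <<< "$log_lines")

echo "$inside"
```

Because the leading .* is greedy, a line with several bracketed values yields only the last one; grep -o is the better fit when every value is needed.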




Answer 4:


To grep all values between square brackets, including the brackets, you can use a POSIX BRE-based grep command like

grep -o '\[[^][]*]' file

The -o option makes grep output only the matched substrings, not whole lines. The \[[^][]*] pattern matches a [, then 0 or more occurrences of any character other than [ and ] (see the negated [^][]* bracket expression), and then a ].

If you need the value inside the square brackets without the brackets themselves, you can use a PCRE-based grep command like

grep -oP '\[\K[^][]*(?=])' file


The \[\K[^][]*(?=]) pattern matches

  • \[ - a [ char
  • \K - a match reset operator that discards the text matched so far from the match memory buffer
  • [^][]* - 0 or more chars other than ] and [
  • (?=]) - a positive lookahead that requires a ] char immediately to the right of the current location.
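As an illustrative sketch, the pattern explained above can be run against a file that has several bracketed values per line. This assumes GNU grep with -P support; the temporary file and its contents are invented for the example.

```shell
#!/usr/bin/env bash
# Assumption: GNU grep with -P (PCRE) support. The log content is made up.
logfile=$(mktemp)
printf '%s\n' 'a [234] b [567] c' 'no brackets here' 'x [z] y' > "$logfile"

# -o prints each match on its own line, so every bracketed value on a
# line is reported, and lines without brackets produce no output.
values=$(grep -oP '\[\K[^][]*(?=])' "$logfile")

echo "$values"

rm -f "$logfile"
```

Since the output is one value per line, it can be piped straight into tools such as sort or uniq -c for further processing.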


Source: https://stackoverflow.com/questions/39412636/bash-grep-text-within-squared-brackets
