Question
I am trying to grep some text from a log file in bash on Linux. The text is enclosed in square brackets.
For example, in:
32432423 jkhkjh [234] hkjh32 2342342
I am searching for 234.
Usually this regex should find it:
\[(.*?)\]
but it does not work with:
| grep \[(.*?)\]
What is the correct way to do this regular expression search with grep?
Answer 1:
You can match an opening bracket and then drop it from the match with the \K escape sequence. Then, match up to the closing bracket:
$ grep -Po '\[\K[^]]*' <<< "32432423 jkhkjh [234] hkjh32 2342342"
234
Note you can omit -P (Perl-compatible regular expressions) by saying:
$ grep -o '\[.*]' <<< "32432423 jkhkjh [234] hkjh32 2342342"
[234]
However, as you see, this also prints the brackets. That's why -P is useful: it lets you use \K, or a look-behind and a look-ahead, to keep the brackets out of the match.
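As a rough sketch of the look-behind and look-ahead form mentioned above (an alternative to \K, assuming GNU grep built with PCRE support; the variable names are only for illustration):

```shell
line='32432423 jkhkjh [234] hkjh32 2342342'

# (?<=\[) requires a "[" immediately before the match,
# (?=])   requires a "]" immediately after it.
# Neither bracket becomes part of the printed match.
inner=$(printf '%s\n' "$line" | grep -Po '(?<=\[)[^]]*(?=])')

printf '%s\n' "$inner"   # prints: 234
```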
You also mention ? in your regexp. As you already know, *? makes a regex match behave in a non-greedy way. Let's see an example:
$ grep -Po '\[.*?]' <<< "32432423 jkhkjh [23]4] hkjh32 2342342"
[23]
$ grep -Po '\[.*]' <<< "32432423 jkhkjh [23]4] hkjh32 2342342"
[23]4]
With .*?, in [23]4] it matches [23]. With just .*, it matches up to the last ], hence getting [23]4]. This behaviour only works with the -P option.
Answer 2:
[ has special meaning to both the shell and grep, so you need to quote it twice. The backslashes prevent grep from treating the brackets as the start of a bracket expression; quoting the entire thing prevents the shell from trying to expand the regular expression as a pattern before passing it to grep.
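... | grep -P '\[(.*?)\]'
(-P is needed as well, because in grep's default basic regular expressions the parentheses and the ? are literal characters rather than a group and a non-greedy modifier.)
In your attempt, the shell consumed the backslashes, using them to treat the following characters literally, before grep ever saw them, so the command was roughly equivalent to ... | grep '[(.*?)]'.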
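One way to see what the shell actually hands to a command is to print the argument with printf. The snippet below is only an illustration with a made-up pattern (parentheses are left out, since an unquoted parenthesis is a syntax error in the shell):

```shell
# Unquoted: the shell removes the backslashes before the command runs.
unquoted=$(printf '%s\n' \[abc\])

# Single-quoted: the pattern reaches the command unchanged.
quoted=$(printf '%s\n' '\[abc\]')

printf 'unquoted: %s\n' "$unquoted"   # prints: unquoted: [abc]
printf 'quoted:   %s\n' "$quoted"     # prints: quoted:   \[abc\]

# With the pattern quoted, grep sees \[ and matches a literal bracket.
matched=$(printf '%s\n' '32432423 jkhkjh [234] hkjh32 2342342' | grep -o '\[[^]]*]')
printf '%s\n' "$matched"              # prints: [234]
```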
Answer 3:
I prefer \\[[^]]*] (that's: \\[ [^]]* ], i.e. anything-but-right-square-brackets in square brackets) over \\[.*] because of greediness:
$ grep -o \\[.*] <<<"[this] and that too]"
[this] and that too]
vs.
$ grep -o \\[[^]]*] <<<"[this] and that too]"
[this]
Then again, grep is not the tool for everything (it was g/re/p after all). If you just want what's inside the square brackets, I'd use sed for that:
$ sed 's/.*\[\([^]]*\)].*/\1/' foo
234
i.e. replace-everything-with-what's-in-the-parentheses.
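A possible variation on the sed command above, in case the input also contains lines without brackets: with -n and the p flag, sed prints only the lines where the substitution succeeded. The function name extract_bracketed is hypothetical and the input is made up for the example:

```shell
# Print the bracketed text of each line that has one; stay silent otherwise.
extract_bracketed() {
    # -n suppresses automatic printing; the trailing p prints only
    # the lines on which the substitution was actually made.
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

result=$(printf '%s\n' 'no brackets here' '32432423 jkhkjh [234] hkjh32 2342342' | extract_bracketed)
printf '%s\n' "$result"   # prints: 234
```

Because the leading .* is greedy, a line with several bracketed values yields only the last one; grep -o is the better fit when every occurrence is wanted.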
Answer 4:
To grep all values between square brackets, including the brackets, you may use a POSIX BRE based grep command like
grep -o '\[[^][]*]' file
The -o option makes grep output matched substrings only, not whole lines, and the \[[^][]*] pattern matches a [, then 0 or more occurrences of any chars but [ and ] (see the negated [^][]* bracket expression), and then a ].
If you need to get the value inside square brackets, excluding the square brackets, you can use a PCRE regex based grep command like
grep -oP '\[\K[^][]*(?=])' file
The \[\K[^][]*(?=]) pattern matches:
\[ - a [ char
\K - a match reset operator that discards the text matched so far from the match memory buffer
[^][]* - 0 or more chars other than ] and [
(?=]) - a positive lookahead that requires a ] char immediately to the right of the current location.
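To illustrate how both commands behave when a line holds more than one bracketed value, here is a small sketch on made-up input (it assumes GNU grep with PCRE support for the -P variant; the sample text and variable names are not from the original question):

```shell
sample='a [1] b [22] c'

# BRE version: each match is printed on its own line, brackets included.
with_brackets=$(printf '%s\n' "$sample" | grep -o '\[[^][]*]')

# PCRE version: \K and the lookahead keep the brackets out of the output.
values=$(printf '%s\n' "$sample" | grep -oP '\[\K[^][]*(?=])')

printf '%s\n' "$with_brackets"
printf '%s\n' "$values"
```

With -o, each match is written on a separate line, so the results can be piped straight into tools such as sort or uniq.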
Source: https://stackoverflow.com/questions/39412636/bash-grep-text-within-squared-brackets