Question
I need to find files whose names start with three lowercase letters, but for some reason I'm getting undesired case-insensitive behavior. I'm using find with the -regex option, but it also matches files whose names start with capital letters.
$ find . -regextype posix-egrep -regex '.*/[a-z]{3}\w+\.abc'
./TTTxxx.abc
./tttyyy.abc
prints the same as:
$ find . -regextype posix-egrep -regex '.*/[A-Z]{3}\w+\.abc'
./TTTxxx.abc
./tttyyy.abc
If I use a single character instead of a character range, the match is case-sensitive and only the lowercase file is printed:
$ find . -regextype posix-egrep -regex '.*/[t]{3}\w+\.abc'
./tttyyy.abc
I've tried using different regextypes and the result is the same.
In addition, piping the results through egrep seems to work:
$ find . -regextype posix-egrep -regex '.*/.+\.abc' | egrep '/[a-z]\w+\.abc'
./tttyyy.abc
Why is find -regex case-insensitive when using a character range?
Note: I need to use find as I need the -exec option.
Many thanks.
Answer 1:
According to "Why does [A-Z] match lowercase letters in bash?", collation is the issue here:
Standard collations in locales such as en_US use the following order:
aAbBcC...xXyYzZ
Between a and z (in [a-z]) are ALL uppercase letters except Z. Between A and Z (in [A-Z]) are ALL lowercase letters except a.
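To see how collation changes ordering, it can help to sort a few mixed-case letters. This is only a rough sketch: it shows the C locale, where characters are ordered by byte value, so every uppercase letter comes before every lowercase one and a range such as a-z contains no capitals. The second command uses whatever locale the shell happens to have, so its output depends on the system and is not shown.

```shell
# Sort a few mixed-case letters under the C locale (plain byte order).
c_order=$(printf '%s\n' a A b B z Z | LC_ALL=C sort | tr -d '\n')
echo "$c_order"   # prints ABZabz

# For comparison: the same sort under the shell's current locale.
# The result is system-dependent, so no expected output is given here.
printf '%s\n' a A b B z Z | sort | tr -d '\n'; echo
```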
So you need to either list all lowercase letters explicitly or change the collation:
$ export LC_COLLATE=C
and then use the standard [a-z].
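As a minimal sketch of the collation fix, the locale can also be set for a single find invocation instead of being exported for the whole session. This assumes GNU find (for -regextype) and recreates the two sample file names from the question in a scratch directory. It uses LC_ALL rather than LC_COLLATE because LC_ALL, when set in the environment, overrides LC_COLLATE. The -exec echo is just a stand-in for whatever command is really needed.

```shell
# Recreate the two sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
touch "$tmpdir/TTTxxx.abc" "$tmpdir/tttyyy.abc"

# Force the C locale for this one command only, so [a-z] is a plain byte range.
# "echo" is a placeholder for the real -exec command.
matches=$(cd "$tmpdir" && LC_ALL=C find . -regextype posix-egrep \
    -regex '.*/[a-z]{3}\w+\.abc' -exec echo {} \;)
printf '%s\n' "$matches"   # prints ./tttyyy.abc

rm -rf "$tmpdir"
```

Setting the variable on the command line keeps the rest of the session, including sort order and messages, in the usual locale.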
The pattern [...]{3}\w+\.abc, where [...] is either [a-z] (with the C collation) or an explicit list of the lowercase letters, will then match only the filenames you want.
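The explicit-list alternative might look like the sketch below. It again assumes GNU find and the sample file names from the question. The variable name lower is made up for illustration. Because the bracket expression contains no range, it should not depend on the collation order.

```shell
# Recreate the two sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
touch "$tmpdir/TTTxxx.abc" "$tmpdir/tttyyy.abc"

# Hypothetical helper variable: every lowercase ASCII letter, listed explicitly.
lower='abcdefghijklmnopqrstuvwxyz'

# Double quotes let ${lower} expand; \w and \. are passed to find unchanged.
listed=$(cd "$tmpdir" && find . -regextype posix-egrep \
    -regex ".*/[${lower}]{3}\w+\.abc" -exec echo {} \;)
printf '%s\n' "$listed"   # prints ./tttyyy.abc

rm -rf "$tmpdir"
```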
Source: https://stackoverflow.com/questions/51727109/bash-find-using-regex-is-not-case-sensitive