Case-insensitive search & replace with sed

℡╲_俬逩灬. Submitted on 2019-11-27 04:01:47

To be clear: on macOS, as of Mojave (10.14), sed is the BSD implementation and does NOT support case-insensitive matching (hard to believe, but true). The formerly accepted answer, which itself shows a GNU sed command, gained that status because of the Perl-based solution mentioned in the comments.

To make that Perl solution work with foreign characters as well, via UTF-8, use something like:

perl -C -Mutf8 -pe 's/öœ/oo/i' <<< "FÖŒ" # -> "Foo"
  • -C turns on UTF-8 support for streams and files, assuming the current locale is UTF-8-based.
  • -Mutf8 tells Perl to interpret the source code as UTF-8 (in this case, the string passed to -pe); this is the shorter equivalent of the more verbose -e 'use utf8;'. (Thanks, Mark Reed.)

(Note that using awk is not an option either, as awk on macOS (i.e., BWK awk, a.k.a. BSD awk) appears to be completely unaware of locales altogether - its tolower() and toupper() functions ignore foreign characters (and sub() / gsub() don't have case-insensitivity flags to begin with).)

Wesley Rice

Editor's note: This solution doesn't work on macOS (out of the box), because it only applies to GNU sed, whereas macOS comes with BSD sed.

Capitalize the 'I'.

sed 's/foo/bar/I' file

Another workaround for sed on Mac OS X is to install gsed from MacPorts or Homebrew and then create the alias sed='gsed'.
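
In a script, where aliases are usually not expanded, one alternative is to probe which sed understands the I flag and call that one explicitly. The sketch below assumes that GNU sed, if installed, is reachable as either sed or gsed; the function name pick_sed is made up for this example.

```shell
# pick_sed: hypothetical helper that prints the name of the first sed
# on PATH accepting the I flag, or returns 1 if none does.
pick_sed() {
    for candidate in sed gsed; do
        if command -v "$candidate" >/dev/null 2>&1 &&
           [ "$(printf 'A\n' | "$candidate" 's/a/b/I' 2>/dev/null)" = "b" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

if SED=$(pick_sed); then
    printf 'FOO\n' | "$SED" 's/foo/bar/I'
else
    echo "no sed with case-insensitive matching found" >&2
fi
```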

user1307434

The Mac version of sed seems a bit limited. One way to work around this is to use a Linux container (via Docker) that has a usable version of sed:

cat your_file.txt | docker run -i busybox /bin/sed -r 's/[0-9]{4}/****/Ig'

The sed FAQ addresses the closely related case-insensitive search. It points out that a) many versions of sed support a flag for it, and b) it's awkward to do in sed, so you should rather use awk or Perl.

But to do it in POSIX sed, they suggest three options (adapted for substitution here):

  1. Convert to uppercase and store the original line in hold space; this won't work for substitutions, though, as the original content will be restored before printing, so it's only good for inserting or appending lines based on a case-insensitive match.

  2. Maybe the possibilities are limited to FOO, Foo and foo. These can be covered by

    s/FOO/bar/;s/[Ff]oo/bar/
    
  3. To search for all possible matches, one can use bracket expressions for each character:

    s/[Ff][Oo][Oo]/bar/
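
A rough sketch of options 1 and 3 using only POSIX features. The function names ci_print_foo and ci_pattern are invented for this illustration. ci_pattern assumes ASCII letters only and does not escape regex metacharacters in its input.

```shell
# Option 1 (search only): save the line, upper-case the working copy,
# test it, then restore and print the original line on a match.
ci_print_foo() {
    sed -n 'h; y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/; /FOO/{g;p;}'
}

# Option 3: build one bracket expression per letter, e.g. foo -> [Ff][Oo][Oo].
ci_pattern() {
    word=$1
    out=
    while [ -n "$word" ]; do
        rest=${word#?}            # everything after the first character
        ch=${word%"$rest"}        # the first character itself
        lower=$(printf '%s' "$ch" | tr '[:upper:]' '[:lower:]')
        upper=$(printf '%s' "$ch" | tr '[:lower:]' '[:upper:]')
        if [ "$lower" != "$upper" ]; then
            out="$out[$upper$lower]"
        else
            out="$out$ch"         # non-letters are copied unchanged
        fi
        word=$rest
    done
    printf '%s\n' "$out"
}

printf 'Foo\nbar\nFOO\n' | ci_print_foo                        # prints Foo and FOO
printf 'FOO fOo food\n' | sed "s/$(ci_pattern foo)/bar/g"      # prints: bar bar bard
```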
    

I had a similar need, and came up with the following.

This command simply finds all the files:

grep -i -l -r foo ./* 

This one excludes this_shell.sh (in case you put the command in a script called this_shell.sh), tees the output to the console so you can see what happened, and then uses sed on each file name found to replace the text foo with bar:

grep -i -l -r --exclude "this_shell.sh" foo ./* | tee  /dev/fd/2 | while read -r x; do sed -b -i 's/foo/bar/gi' "$x"; done 

I chose this method because I didn't like having the timestamps changed on files that were not modified. Feeding sed the grep result means only the files containing the target text are touched (which likely improves performance as well).

Be sure to back up your files and test before using. May not work in some environments for files with embedded spaces. (?)
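
If file names with embedded spaces are a concern, one possible variant lets find pass each name directly to grep and sed, so no name is ever split by the shell. This is a sketch rather than a drop-in replacement: it assumes GNU sed (for -i without a suffix and for the I flag), and the function name replace_in_tree is made up here.

```shell
# replace_in_tree DIR: hypothetical helper.
# For every regular file under DIR (except this_shell.sh) that contains
# "foo" in any case, print its name and replace foo with bar in place.
# Files without a match are never opened by sed, so their timestamps stay put.
replace_in_tree() {
    find "$1" -type f ! -name 'this_shell.sh' \
        -exec grep -qi 'foo' {} \; \
        -print \
        -exec sed -i 's/foo/bar/gI' {} \;
}
```

Calling replace_in_tree . from the top of the tree would correspond to the pipeline above.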

CBB

If you are doing pattern matching first, e.g.,

/pattern/s/xx/yy/g

then you want to put the I after the pattern:

/pattern/Is/xx/yy/g

Example:

echo Fred | sed '/fred/Is//willma/g'

returns willma; without the I, it returns the string untouched (Fred).
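
When the address and the s command use different patterns, the I after the address affects only which lines are selected; the pattern inside s stays case-sensitive unless it gets its own I flag. A small sketch with made-up input, assuming GNU sed (the function name is hypothetical):

```shell
# Select lines containing "error" in any case, then replace only
# lower-case xx on those lines.
fix_error_lines() {
    sed '/error/Is/xx/yy/g'
}

printf 'Error: xx XX\nok: xx\n' | fix_error_lines
# prints:
# Error: yy XX
# ok: xx
```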

Nishanth
sed 's/string1/string2/Ig'

Capital I is a flag that makes the search match the string regardless of case.
