Why doesn't this simple RegEx work with sed?

旧时模样 posted on 2021-01-29 17:41:50

Question


This is a really simple RegEx that isn't working, and I can't figure out why. According to this, it should work.

I'm on a Mac (OS X 10.8.2).

script.sh

#!/bin/bash
ZIP="software-1.3-licensetypeone.zip"
VERSION=$(sed 's/software-//g;s/-(licensetypeone|licensetypetwo).zip//g' <<< $ZIP)

echo $VERSION

terminal

$ sh script.sh
1.3-licensetypeone.zip

Answer 1:


The regex documentation for OS X 10.7.4 (which should also apply to the OP's 10.8.2) states in its last paragraph that

Obsolete (basic) regular expressions differ in several respects. | is an ordinary character and there is no equivalent for its functionality...

... The parentheses for nested subexpressions are \( and \)...

Without any options, sed uses basic regular expressions (BRE).

To use | in OS X or BSD's sed, you need to enable extended regular expressions (ERE) via the -E option, i.e.

sed -E 's/software-//g;s/-(licensetypeone|licensetypetwo)\.zip//g'

P.S. \| in BRE is a GNU extension.
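The difference is easy to see by running the same script in both modes. This is a minimal sketch, assuming a sed that accepts -E (BSD/OS X sed does, and so does recent GNU sed). The variable names bre_result and ere_result are illustrative only, and the dot is escaped here so that it matches only a literal dot.

```shell
ZIP="software-1.3-licensetypeone.zip"

# BRE (default): '(', '|' and ')' are ordinary characters, so the second
# substitution looks for a literal "(licensetypeone|licensetypetwo)" and fails.
bre_result=$(printf '%s\n' "$ZIP" | sed 's/software-//;s/-(licensetypeone|licensetypetwo)\.zip//')

# ERE (-E): '(...)' groups and '|' alternates, so the suffix is removed.
ere_result=$(printf '%s\n' "$ZIP" | sed -E 's/software-//;s/-(licensetypeone|licensetypetwo)\.zip//')

echo "BRE: $bre_result"   # BRE: 1.3-licensetypeone.zip
echo "ERE: $ere_result"   # ERE: 1.3
```

The BRE line reproduces the output from the question; only the -E flag differs between the two calls.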


Alternative ways to extract version number

  1. chop-chop (parameter expansion)

    VERSION=${ZIP#software-}
    VERSION=${VERSION%-license*.zip}
    
  2. sed

    VERSION=$(sed 's/software-\(.*\)-license.*/\1/' <<< "$ZIP")
    

    You don't necessarily have to match strings word-by-word with shell patterns or regex.
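As a rough illustration of that last point, the parameter-expansion approach can be wrapped in a small function that handles either license type without naming them. The function name extract_version and the second file name are assumptions made for this sketch, not part of the original answer.

```shell
# extract_version is a hypothetical helper name used only for this sketch.
# Note: it sets the global variable "name" as a side effect.
extract_version() {
    name=${1#software-}                     # strip the leading "software-"
    printf '%s\n' "${name%-license*.zip}"   # strip the trailing "-license...zip"
}

extract_version "software-1.3-licensetypeone.zip"     # prints 1.3
extract_version "software-2.0.1-licensetypetwo.zip"   # prints 2.0.1
```

Because the pattern only anchors on "-license" and ".zip", a new license type would not require a change to the function.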




Answer 2:


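By default, sed works with basic regular expressions. You have to backslash the parentheses and the vertical bar to make grouping and alternation work. Note that \| is a GNU extension, so this form works with GNU sed but not with the BSD sed that ships with OS X.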

sed 's/software-//g;s/-\(licensetypeone\|licensetypetwo\)\.zip//g'

Note that I backslashed the dot, too. Otherwise, it would have matched any character.
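The effect of the unescaped dot can be checked with an input that has some other character in place of the dot. This is a small sketch, assuming a sed that accepts -E (used here instead of \| so that it also runs on BSD sed). The input string and the variable names are made up for illustration.

```shell
# A made-up name with "_" where the dot should be (illustrative input only).
ODD="1.3-licensetypeone_zip"

# Unescaped dot: '.' matches any character, including "_", so the suffix is removed.
loose=$(printf '%s\n' "$ODD" | sed -E 's/-(licensetypeone|licensetypetwo).zip//')

# Escaped dot: '\.' matches only a literal ".", so nothing is removed.
strict=$(printf '%s\n' "$ODD" | sed -E 's/-(licensetypeone|licensetypetwo)\.zip//')

echo "loose:  $loose"    # loose:  1.3
echo "strict: $strict"   # strict: 1.3-licensetypeone_zip
```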




Answer 3:


You can do this in the shell without sed; parameter expansion suffices:

shopt -s extglob
ZIP="software-1.3-licensetypeone.zip"
tmp=${ZIP#software-}
VERSION=${tmp%-licensetype@(one|two).zip}

With a recent version of bash (which may not ship with OS X), you can use regular expressions:

if [[ $ZIP =~ software-([0-9.]+)-licensetype(one|two)\.zip ]]; then
    VERSION=${BASH_REMATCH[1]}
fi

or, if you just want the 2nd word in a hyphen-separated string:

VERSION=$(IFS=-; set -- $ZIP; echo $2)
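The field-splitting one-liner relies on the version itself containing no hyphen. The sketch below shows how it works and where it stops working. The name ZIP_BETA and its value are hypothetical and were chosen only to illustrate the caveat.

```shell
ZIP="software-1.3-licensetypeone.zip"

# The subshell keeps the IFS change and the positional parameters local.
# $ZIP is left unquoted on purpose so that it is split on "-".
VERSION=$(IFS=-; set -- $ZIP; echo "$2")
echo "$VERSION"   # 1.3

# Caveat: a hyphen inside the version creates an extra field, so only the
# part before that hyphen is returned. ZIP_BETA is a made-up example name.
ZIP_BETA="software-1.3-beta-licensetypeone.zip"
BETA=$(IFS=-; set -- $ZIP_BETA; echo "$2")
echo "$BETA"      # 1.3 (the "-beta" part is lost)
```

If versions may contain hyphens, the prefix and suffix stripping shown earlier in this answer is the safer choice.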



Answer 4:


$ man sed | grep "regexp-extended" -A2
       -r, --regexp-extended

              use extended regular expressions in the script.
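The -r flag shown above comes from the GNU sed man page, whereas the BSD sed on OS X spells the same option -E. For a script that has to run on both, one possible approach is to probe for the flag first. This is only a sketch: the variable name ERE_FLAG is an assumption for this example, and the dot is escaped so that it matches only a literal dot.

```shell
ZIP="software-1.3-licensetypeone.zip"

# Probe for -E first (BSD/OS X sed and recent GNU sed); fall back to GNU's -r.
if printf 'a\n' | sed -E 's/(a|b)/x/' >/dev/null 2>&1; then
    ERE_FLAG=-E
else
    ERE_FLAG=-r
fi

VERSION=$(printf '%s\n' "$ZIP" | sed "$ERE_FLAG" 's/software-//;s/-(licensetypeone|licensetypetwo)\.zip//')
echo "$VERSION"   # 1.3
```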


Source: https://stackoverflow.com/questions/13344049/why-doesnt-this-simple-regex-work-with-sed
