How can I match spaces with a regexp in Bash?

甜味超标 2020-12-03 13:15

I expect the code below to echo "yes", but it does not. For some reason it won't match the single quote. Why?

str="{templateUrl: '}"
regexp="templateU
3 Answers
  • 2020-12-03 13:45

    Get rid of the square brackets in the regular expression:

    regexp="templateUrl:\s*'"
    

    With the square brackets present, the \s is read literally: [\s] matches either a backslash or the letter s. Your intent is clearly to match whitespace, for which \s is shorthand, so the square brackets are not needed. Note that \s is a GNU extension rather than part of POSIX extended regular expressions, so it may not be available with every regex library.
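    To see the literal interpretation in isolation, here is a minimal sketch that tests three single-character strings against [\s]. The helper name check and the result variables are made up for illustration; the behaviour shown is what a POSIX-conforming regex library such as glibc should give.

```shell
#!/usr/bin/env bash
# Inside a POSIX bracket expression the backslash is an ordinary character,
# so [\s] is the two-member set: backslash and the letter s.
re='^[\s]$'

# check (hypothetical helper): prints "match" or "no match" for one string.
check() {
  if [[ $1 =~ $re ]]; then echo "match"; else echo "no match"; fi
}

r_s=$(check 's')    # the letter s
r_bs=$(check '\')   # a single backslash
r_sp=$(check ' ')   # a single space

echo "s: $r_s, backslash: $r_bs, space: $r_sp"
```

    The space is the one character that does not match, which is why the original pattern failed on the string from the question.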

    $ uname -a
    Linux noname 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:30:00 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
    $ bash --version
    GNU bash, version 4.3.11(1)-release (x86_64-pc-linux-gnu)
    Copyright (C) 2013 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    
    This is free software; you are free to change and redistribute it. 
    There is NO WARRANTY, to the extent permitted by law.
    $ cat test.sh
    str="{templateUrl: '}" 
    regexp="templateUrl:\s*'"
    
    if [[ $str =~ $regexp ]]; then
      echo "yes"
    else
      echo "no"
    fi
    $ bash test.sh
    yes 
    
  • 2020-12-03 13:48

    Replace:

    regexp="templateUrl:[\s]*'"
    

    With:

    regexp="templateUrl:[[:space:]]*'"
    

    According to man bash, the =~ operator supports "extended regular expressions" as defined in man 3 regex. man 3 regex says it supports the POSIX standard and refers the reader to man 7 regex. The POSIX standard supports [:space:] as the character class for whitespace.

    The GNU bash manual documents the supported character classes as follows:

    Within ‘[’ and ‘]’, character classes can be specified using the syntax [:class:], where class is one of the following classes defined in the POSIX standard:

    alnum alpha ascii blank cntrl digit graph lower print
    punct space upper word xdigit

    The only mention of \s that I found in the GNU bash documentation was for an unrelated use in prompts, such as PS1, not in regular expressions.

    The Meaning of *

    [[:space:]] will match exactly one white space character. [[:space:]]* will match zero or more white space characters.
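    A small sketch of the difference, using variations of the string from the question with zero, one, and three spaces after the colon. The helper matches and the variable names one and any are hypothetical, chosen only for this illustration.

```shell
#!/usr/bin/env bash
one="templateUrl:[[:space:]]'"    # exactly one whitespace character
any="templateUrl:[[:space:]]*'"   # zero or more whitespace characters

# matches (hypothetical helper): prints yes/no for string $1 against regex $2.
matches() {
  if [[ $1 =~ $2 ]]; then echo yes; else echo no; fi
}

for s in "templateUrl:'" "templateUrl: '" "templateUrl:   '"; do
  echo "[$s] one=$(matches "$s" "$one") any=$(matches "$s" "$any")"
done
```

    Only the string with exactly one space satisfies the pattern without the *, while all three satisfy the pattern with it.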

    The Difference Between space and blank

    POSIX regular expressions offer two classes of whitespace: [[:space:]] and [[:blank:]]:

    • [[:blank:]] means space and tab. This makes it similar to: [ \t].

    • [[:space:]], in addition to space and tab, includes newline, carriage return, form feed, and vertical tab. This makes it similar to: [ \t\n\r\f\v].
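    The two classes can be compared directly on a tab and a newline. This is a sketch only; the helper in_class is hypothetical and simply anchors a single-character test against the named class.

```shell
#!/usr/bin/env bash
tab=$'\t'
nl=$'\n'

# in_class (hypothetical helper): prints yes/no depending on whether the
# single character $1 belongs to the POSIX class named by $2.
in_class() {
  local re="^[[:$2:]]\$"
  if [[ $1 =~ $re ]]; then echo yes; else echo no; fi
}

echo "tab:     blank=$(in_class "$tab" blank) space=$(in_class "$tab" space)"
echo "newline: blank=$(in_class "$nl" blank) space=$(in_class "$nl" space)"
```

    The tab belongs to both classes, whereas the newline belongs only to [[:space:]], so [[:blank:]] is the safer choice when a match must stay on one line.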

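    A key advantage of the POSIX character classes is that they follow the current locale, so they can also cover whitespace characters outside ASCII, such as those found in Unicode text.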

  • 2020-12-03 14:06

    This should work:

    #!/bin/bash
    str="{templateUrl: '}"
    regexp="templateUrl:[[:space:]]*'"
    
    if [[ $str =~ $regexp ]]; then
      echo "yes"
    else
      echo "no"
    fi
    

    If you want to match zero or more whitespace characters, the * needs to be added after [[:space:]].
