Regex to match words and those with an apostrophe

天命终不由人 2020-12-09 05:46

Update: Following the comments about the ambiguity of my question, I've added more detail to it.

(Terminology: by words I am referring to

5 answers
  • 2020-12-09 05:48

    This works fine

     ('*)(?:'')*('?(?:\w+'?)+\w+('\b|'?[^']))(\1)
    

    It handles this data without problems:

        'bou
        it's
        persons'
        'open'
        open
        foo''bar
        ''foo
        bee''
        ''foo''
        '
        ''
    

    On this data you need to strip the results (remove the surrounding spaces from the matches):

        'bou it's persons' 'open' open foo''bar ''foo ''foo'' ' ''
    

    (Tested in The Regulator; the result is in $2.)

  • 2020-12-09 06:02

    Try using this:

    (?=.*\w)^(\w|')+$

    'bout     # pass
    it's      # pass
    persons'  # pass
    '         # fail
    ''        # fail
    

    Regex Explanation

    NODE      EXPLANATION
      (?=       look ahead to see if there is:
        .*        any character except \n (0 or more times
                  (matching the most amount possible))
        \w        word characters (a-z, A-Z, 0-9, _)
      )         end of look-ahead
      ^         the beginning of the string
      (         group and capture to \1 (1 or more times
                (matching the most amount possible)):
        \w        word characters (a-z, A-Z, 0-9, _)
       |         OR
        '         '\''
      )+        end of \1 (NOTE: because you're using a
                quantifier on this capture, only the LAST
                repetition of the captured pattern will be
                stored in \1)
      $         before an optional \n, and the end of the
                string
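The pass/fail table above can be reproduced with a short script. This is a minimal sketch in Python (the answer does not name a language, so that choice is an assumption), and `is_word` is a hypothetical helper name introduced only for illustration.

```python
import re

# Pattern copied from the answer: the lookahead demands at least one word
# character somewhere, then the whole string must be word characters or
# apostrophes only.
PATTERN = re.compile(r"(?=.*\w)^(\w|')+$")


def is_word(token: str) -> bool:
    """Hypothetical helper: True if the whole token matches the pattern."""
    return PATTERN.match(token) is not None


samples = ["'bout", "it's", "persons'", "'", "''"]
for token in samples:
    print(f"{token!r:12} {'pass' if is_word(token) else 'fail'}")
```

Because the pattern is anchored with ^ and $, it validates one token at a time; it is not meant for pulling several words out of a longer sentence.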
    
  • 2020-12-09 06:02

    I'm submitting this second answer because the question has changed quite a bit and my previous answer is no longer valid. If all the conditions are now listed, try this:

    (((?<!')')?\b[0-9A-Za-z]+\b('(?!'))?|\b[0-9A-Za-z]+('[0-9A-Za-z]+)*\b)
    
  • 2020-12-09 06:06
    /('\w+)|(\w+'\w+)|(\w+')|(\w+)/
    
    • '\w+ Matches a ' followed by one or more word characters, OR
    • \w+'\w+ Matches one or more word characters followed by a ' followed by one or more word characters, OR
    • \w+' Matches one or more word characters followed by a ', OR
    • \w+ Matches one or more word characters
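The four alternatives listed above can be tried on a sample sentence as follows. This is a rough sketch in Python (an assumed language; the answer only gives the pattern between slashes), and `find_words` is a hypothetical helper, not part of the answer.

```python
import re

# The four alternatives from the answer, tried left to right at each position.
PATTERN = re.compile(r"('\w+)|(\w+'\w+)|(\w+')|(\w+)")


def find_words(text: str) -> list[str]:
    """Hypothetical helper: return the full text of every match."""
    # finditer + group(0) is used because the pattern has capture groups;
    # re.findall would return one tuple of groups per match instead.
    return [m.group(0) for m in PATTERN.finditer(text)]


print(find_words("'bout it's persons' open"))
# → ["'bout", "it's", "persons'", 'open']
```

Since the alternatives are ordered, a token wrapped in apostrophes on both sides, such as 'open', is consumed by the first alternative as 'open and the closing apostrophe is left out of the match.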
  • 2020-12-09 06:12

    How about this?

    '?\b[0-9A-Za-z']+\b'?
    

    EDIT: the previous version doesn't include apostrophes on the sides.
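A quick way to check this pattern is to run it over a sentence with the apostrophe in different positions. The sketch below assumes Python, and `find_words` is a hypothetical helper name.

```python
import re

# Pattern copied from the answer: an optional leading apostrophe, a run of
# alphanumerics and apostrophes delimited by word boundaries, and an
# optional trailing apostrophe.
PATTERN = re.compile(r"'?\b[0-9A-Za-z']+\b'?")


def find_words(text: str) -> list[str]:
    """Hypothetical helper: return every match found in the text."""
    # The pattern has no capture groups, so findall returns whole matches.
    return PATTERN.findall(text)


print(find_words("'bout it's persons' open"))
# → ["'bout", "it's", "persons'", 'open']
print(find_words("' ''"))
# → []
```

The \b anchors are what reject a lone ' or '': there is no boundary between an apostrophe and whitespace or another apostrophe, so a run with no alphanumeric character cannot match.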
