Please help, my regular expression is not consistently producing the right group matches [duplicate]

Submitted by 孤者浪人 on 2021-01-26 02:08:15

Question


Update:

There were actually three problems. The first reply comment and the answer fix problem one, where the regular expression match/exec methods return NULL.

However, after I updated the code in this question, I am still not getting all of the matches that the online regular expression testers show, which is five for the same string, shown below. My code shows fewer than that (problem two), and one result shows the separating comma as part of the match, which should be excluded (problem three).

The latter two problems are due to my expression not doing what I need it to do, which is:

  1. Find all of the alphanumeric words that may also include spaces and/or parentheses.
  2. These 'words' are delimited by a comma and optional whitespace.
  3. The 'words' should be placed into the array of matches that the exec method returns, but the delimiting commas and any whitespace surrounding the 'words' should not be captured into the array.
  4. Spaces within a 'word' should not cause it to be split into two or more array elements, whereas a comma followed by any whitespace does separate the 'words'.

The 'words' will be placed into a database table and will be used to categorize related data, so having some 'words' with a comma and others without a comma is problematic.

Example:

This, is, a test(hi), and so, is this, => [This][is][a test(hi)][and so][is this].

Note, the ending comma is optional.
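
The four requirements above can be sketched as follows. This is only one possible reading of them, not the approach from the answer: the function name `extractWords` and the pattern are assumptions made for illustration. The pattern leaves spaces out of the character class and only allows them between two runs of word characters, so the commas and the whitespace around them are never part of a match.

```javascript
// Hypothetical helper: returns the comma-delimited 'words' of the input.
// A 'word' is one or more runs of letters, digits, or parentheses,
// joined by internal spaces. Commas and surrounding spaces are skipped.
function extractWords(input) {
  const pattern = /[A-Z0-9()]+(?: +[A-Z0-9()]+)*/gi;
  // matchAll yields one match object per word; m[0] is the whole match.
  return Array.from(input.matchAll(pattern), (m) => m[0]);
}

const sample = 'This, is, a test(hi), and so, is this,';
console.log('[' + extractWords(sample).join('][') + ']');
// prints "[This][is][a test(hi)][and so][is this]"
```

Because nothing in the pattern refers to the comma, the optional trailing comma needs no special handling.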

Here is what I get when I inspect the exec call in my javascript:

regexInfo: Array(2)
0: "This, "
1: "This"
groups: undefined
index: 0
input: "This, is, a test(hi), and so, is this,"
length: 2
__proto__: Array(0)
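
The output above is consistent with how `exec` works: each call returns a single match, with the whole match (including the comma) at index 0 and the first capture group at index 1. With the `g` flag set, the next call resumes from the regex's `lastIndex`. The sketch below loops until `exec` returns `null` and keeps only group 1. The helper name `collectGroups` is an assumption for illustration, and the `m` flag is dropped because the pattern has no `^` or `$` anchors for it to affect.

```javascript
// Hypothetical helper: call exec() repeatedly and collect capture group 1.
function collectGroups(pattern, flags, input) {
  const regex = new RegExp(pattern, flags);
  if (!regex.global) {
    // Without g, exec() always restarts at index 0 and the loop never ends.
    throw new Error('the g flag is required');
  }
  const words = [];
  let match;
  while ((match = regex.exec(input)) !== null) {
    if (match[0] === '') {
      regex.lastIndex++; // guard against zero-length matches
      continue;
    }
    words.push(match[1]); // group 1 excludes the trailing comma and spaces
  }
  return words;
}

const sample = 'This, is, a test(hi), and so, is this,';
console.log(collectGroups('([A-Z0-9 ()]+)(?:,\\s*)?', 'gi', sample));
// prints [ 'This', 'is', 'a test(hi)', 'and so', 'is this' ]
```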

Original Question:

Here is my regular expression:

/([A-Z0-9 ()]+)(?:,\s*)?/gmi/

I'm using it to check that the following value is composed of alphanumeric 'words' separated by a comma and an optional space, and does not contain any underscores:

This, is, a test(hi), and so, is this,

The problem is that when I use either the match or exec regular expression methods, they return NULL indicating no matches were found.

However, when I test the above expression and value in regexr.com, regex101, or regextester, they all show that the expression is valid and produces 5 group matches.

Why does the expression work in the online regular expression testers, but not in my code?

Here is my code:

// This function is called as an event handler and directly.

function validateTextArea( event ) {
  var textarea = ( ( ( this === window )
                     ? null
                     : this ) || event.target || event );

  var regexObject, regexInfo, hasError;

  // Test for empty, then spaces, then against the pattern ...

  if( ( textarea.innerHTML === '' ) && ( textarea.value === '' ) ) {

    textarea.setCustomValidity( '' ); // No value is allowed.

  }
  else if( ( textarea.innerHTML.trim() === '' ) &&
           ( textarea.value.trim() === '' ) )

    textarea.setCustomValidity( 'Space is|Spaces are not allowed.' );

  else {

    // ******************************************************************
    //
    // This block tests the value against the regular expression pattern.
    //
    // ******************************************************************

    regexObject = new RegExp( textarea.getAttribute( 'pattern' ),
                              textarea.getAttribute( 'flags' ) );

    document.querySelector( 'span#pattern' ).innerHTML =
      textarea.getAttribute( 'pattern' );

    document.querySelector( 'span#flags' ).innerHTML =
      textarea.getAttribute( 'flags' );

    // Using exec or match produces a NULL.

    regexInfo   = regexObject.exec( textarea.value );


    document.querySelector( 'span#regex' ).innerHTML =
      ( ( regexInfo === null )
        ? 'NULL'
        : '[' + regexInfo.join( '][' ) + ']' );

    hasError    = !( regexInfo );

    //
    // When there aren't any matches, set the textarea as invalid, and the
    // css rule, below, causing the textarea's value to show in red.
    //

    textarea.setCustomValidity( hasError
                                ? 'Please match the format requested.'
                                : '' );
    return !hasError;
  }

}

// Test the initial condition with whatever data was preloaded.

validateTextArea( document.querySelector( '#test' ) );

// Test the input/changes that the user makes to the textarea's contents.

document.querySelector( '#test' )
.addEventListener( 'input', validateTextArea, false );

document.querySelector( '#test' )
.addEventListener( 'change', validateTextArea, false );
<style>
:valid,
.ok {
  color: green;
}
:invalid,
.error {
  color: red;
}
</style>
<textarea id="test" name="test"
          pattern="([A-Z0-9 ()]+)(?:,\s*)?"
          flags="gmi">This, is, a test(hi), and so, is this,</textarea>
<button onclick="validateTextArea( document.querySelector( '#test' ) );">Test</button><br />
<br />
Pattern: <span id="pattern"></span><br />
Flags:  <span id="flags"></span><br />
Regular Expression Info: <span id="regex"></span><br />

Answer 1:


There are a number of issues relating to the fact that you're using the RegExp constructor rather than a RegExp literal.

/([A-Z0-9 ()]+)(?:,\s*)?/gmi/

Firstly, no RegExp pattern, whether literal or via constructor, has a slash after the flags.

Secondly, if you're using the constructor, you don't use the encasing slashes at all.

Thirdly, if you're using the constructor, the flags do not follow the pattern (as they do with literals); rather, they are fed as the second argument of the constructor.

Putting it all together:

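// The backslash is doubled because the pattern is a string literal here;
// a single '\s' would reach the constructor as a plain 's'.
mystring.match(new RegExp('([A-Z0-9 ()]+)(?:,\\s*)?', 'gmi'));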

...will produce the matches.
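
As a hedged illustration of the three points above, the sketch below builds the same pattern as a literal and through the constructor and compares the two. It also shows that a backslash inside a JavaScript string literal has to be doubled, and that `match` with the `g` flag returns whole matches while `matchAll` exposes the capture groups. The variable names are made up for this example.

```javascript
const sample = 'This, is, a test(hi), and so, is this,';

// Literal form: slashes around the pattern, flags after the closing slash.
const fromLiteral = /([A-Z0-9 ()]+)(?:,\s*)?/gi;

// Constructor form: no slashes, flags as the second argument, and the
// backslash doubled because the pattern is written as a string literal.
const fromConstructor = new RegExp('([A-Z0-9 ()]+)(?:,\\s*)?', 'gi');

// A single backslash is dropped by the string literal before RegExp sees it.
const unescaped = new RegExp(',\s*');

console.log(fromLiteral.source === fromConstructor.source); // true
console.log(unescaped.source); // ,s*

// match() with the g flag returns the whole matches, commas included.
const wholeMatches = sample.match(fromConstructor);

// matchAll() yields match objects, so group 1 can be picked out.
const groups = Array.from(sample.matchAll(fromConstructor), (m) => m[1]);

console.log(wholeMatches); // [ 'This, ', 'is, ', 'a test(hi), ', 'and so, ', 'is this,' ]
console.log(groups);       // [ 'This', 'is', 'a test(hi)', 'and so', 'is this' ]
```

When the pattern comes from an HTML attribute, as in the question, no doubling is needed, because `getAttribute` returns the attribute text as written.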



Source: https://stackoverflow.com/questions/65790433/please-help-my-regular-expression-is-not-consistently-producing-the-right-group
