What happens when I set the same variable to the same regex value in multiple statements?

青春惊慌失措 2020-12-20 05:29

Let's say I do this:

re = /cat/;
re = /cat/;

From reading Zakas' book about JavaScript, it seems that when executing the second line, no

3 answers
  •  粉色の甜心
    2020-12-20 06:31

    In modern JavaScript (ES5+), a regular expression literal is specified to produce a new RegExp instance each time it is evaluated. In ES3, each regular expression literal in the source (including literals with identical content) gets its own distinct RegExp object, created at parse time, and each “physical” literal always evaluates to that same instance.

    So, in both ES5 and ES3, the following code will assign distinct RegExp instances to re:

    re = /cat/;
    re = /cat/;
    

    However, if these lines are executed multiple times, ES3 will assign the same RegExp object on each line every time. In ES3, there will be exactly two RegExp instances in total, and the second one will always be the one assigned to re after those two lines execute. If you saved re to another variable after one run and then ran the lines again, you would see that re === savedCopy.

    In ES5, each execution will produce new instances. So each time those lines run, a new RegExp object will be created for the first line, and then another new RegExp object will be created and saved to the re variable for the second line. If you saved re to another variable after one run and then ran the lines again, you would see that re !== savedCopy.
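The savedCopy comparison described above can be sketched as follows. This is a minimal illustration that assumes an ES5-or-later engine; the function name assignTwice is hypothetical and exists only to let the two assignments run more than once.

```javascript
// Hypothetical helper: runs the two assignments from the question.
function assignTwice() {
    var re;
    re = /cat/; // first literal
    re = /cat/; // second literal, the one that ends up in re
    return re;
}

var savedCopy = assignTwice(); // result of the first run
var re = assignTwice();        // result of the second run

// Under ES5+ semantics each evaluation of a literal creates a new object,
// so the two runs yield different instances with the same pattern.
console.log(re === savedCopy);               // false
console.log(re.source === savedCopy.source); // true
```

Under the ES3 rules quoted below, the first comparison would instead be expected to log true, because the second literal would evaluate to the same object on every run.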

    Specs

    ECMAScript 3rd Edition (ECMA-262) § 7.8.5 (p. 20) states the following (emphasis added on pertinent text):

    7.8.5 Regular Expression Literals

    A regular expression literal is an input element that is converted to a RegExp object (section 15.10) when it is scanned. The object is created before evaluation of the containing program or function begins. Evaluation of the literal produces a reference to that object; it does not create a new object. Two regular expression literals in a program evaluate to regular expression objects that never compare as === to each other even if the two literals' contents are identical. A RegExp object may also be created at runtime by new RegExp (section 15.10.4) or calling the RegExp constructor as a function (section 15.10.3).

    ECMAScript 5.1 (ECMA-262) § 7.8.5 states the following (emphasis added on pertinent text):

    7.8.5 Regular Expression Literals

    A regular expression literal is an input element that is converted to a RegExp object (see 15.10) each time the literal is evaluated. Two regular expression literals in a program evaluate to regular expression objects that never compare as === to each other even if the two literals' contents are identical. A RegExp object may also be created at runtime by new RegExp (see 15.10.4) or calling the RegExp constructor as a function (15.10.3).

    This means that the behavior is specified differently between ES3 and ES5.1. Consider this code:

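    function getRegExp() {
        return /a/; // one literal, evaluated on every call
    }
    // ES3 semantics: true (same object each call); ES5+ semantics: false
    console.log(getRegExp() === getRegExp());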
    

    In ES3, that particular /a/ will always refer to the same RegExp instance and the log will output true because the RegExp is instantiated once “when it is scanned”. In ES5.1, every evaluation of /a/ will result in a new RegExp instance, meaning that creation of a new RegExp happens each time the code refers to it because the spec says that it is “converted to a RegExp object (see 15.10) each time the literal is evaluated”.

    Now consider this expression: /a/ !== /a/. In both ES3 and ES5, this expression will always evaluate to true because each distinct literal gets a distinct RegExp object. In ES5 this happens because each evaluation of a literal always results in a new object instance. In ES3 this happens because the spec says “Two regular expression literals in a program evaluate to regular expression objects that never compare as === to each other even if the two literals' contents are identical.”

    This change in behavior is documented as an incompatibility with ECMAScript 3rd Edition in ECMAScript 5.1 (ECMA-262) Annex E:

    Regular expression literals now return a unique object each time the literal is evaluated. This change is detectable by any programs that test the object identity of such literal values or that are sensitive to the shared side effects.
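One way to see the “shared side effects” mentioned in Annex E is through the lastIndex property of a literal that uses the g flag. The sketch below assumes an ES5-or-later engine; the function name firstDigit is hypothetical.

```javascript
// Hypothetical helper: finds the first digit using a literal with the g flag.
function firstDigit(text) {
    // ES5+: a fresh RegExp with lastIndex 0 is created on every call.
    var m = /\d/g.exec(text);
    return m ? m[0] : null;
}

console.log(firstDigit("a1b2")); // "1"
console.log(firstDigit("a1b2")); // "1"
```

Under ES3 semantics the literal would be a single shared object, so the second call would be expected to resume from the lastIndex left by the first call and return "2" instead.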

    Old code may have been written to rely on the ES3 behavior. This would allow a function to be called multiple times to incrementally walk through matches in a string when the expression was created with the g flag. This is similar to how, in C, the non-reentrant strtok() function works. If you want the same effect with ES5, you must manually store the RegExp instance in a variable and ensure that the variable has a long enough lifetime, since ES5 effectively gives you behavior like the reentrant strtok_r() function instead.
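Storing the instance in a longer-lived variable could look like the sketch below. It is only one possible arrangement, using a closure; the names makeWordWalker, nextWord, and wordRe are hypothetical.

```javascript
// Hypothetical helper: returns a function that yields one match per call,
// keeping its position in the RegExp object's lastIndex property.
function makeWordWalker(text) {
    var wordRe = /\w+/g; // evaluated once per makeWordWalker call, then reused
    return function nextWord() {
        var m = wordRe.exec(text); // resumes from wordRe.lastIndex
        return m ? m[0] : null;    // exec resets lastIndex to 0 when it fails
    };
}

var next = makeWordWalker("cat dog bird");
console.log(next()); // "cat"
console.log(next()); // "dog"
console.log(next()); // "bird"
console.log(next()); // null
```

Because the closure owns the RegExp, each walker keeps its own position, which is closer to strtok_r() than to strtok(): two walkers over different strings do not interfere with each other.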

    Optimization Bugs

    Supposedly, some JavaScript implementations have bugs in which RegExp object caching produces observable side effects that should be impossible. The observed behavior does not necessarily match either the ES3 or the ES5 specification. An example for Mozilla is given at the end of this post, with spoiler text and an explanation that the bug is not observable when debugging but is observable when the JavaScript runs in non-debug optimized mode. The blog author commented that the bug was still reproducible in stable Firefox as of 2017-03-08.
