#4078 — 21.2 RegExp: Incomplete compatibility for %RegExpPrototype%


js> "".match(RegExp.prototype)
uncaught exception: TypeError: function called on an incompatible object
js> "".split(RegExp.prototype)
uncaught exception: TypeError: function called on an incompatible object
js> "".replace(RegExp.prototype)
uncaught exception: TypeError: function called on an incompatible object

The TypeErrors are thrown because %RegExpPrototype% is not handled in the various RegExp flag accessors.
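To see how a particular engine handles the three calls above, a small probe along these lines may help. This is only a sketch: the helper name `probeRegExpPrototype` is hypothetical, and the outcome depends on the engine and the specification draft it implements, so no expected output is shown.

```javascript
// Hypothetical helper: tries each String.prototype method from the report
// with RegExp.prototype as the argument and records what happens.
function probeRegExpPrototype() {
  const calls = {
    match: () => "".match(RegExp.prototype),
    split: () => "".split(RegExp.prototype),
    replace: () => "".replace(RegExp.prototype),
  };
  const results = {};
  for (const [name, call] of Object.entries(calls)) {
    try {
      call();
      results[name] = "ok";
    } catch (e) {
      // The report observed TypeError; anything else is flagged separately.
      results[name] = e instanceof TypeError ? "TypeError" : "other error";
    }
  }
  return results;
}

console.log(probeRegExpPrototype());
```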


It is certainly possible to patch each accessor to handle %RegExpPrototype%, but I am starting to believe
that simply adding [[OriginalSource]] and [[OriginalFlags]] internal slots to %RegExpPrototype% is
a better solution. Therefore I'd like to propose the following changes:


1) Remove the special-casing of %RegExpPrototype% in RegExp.prototype.test( S ) step 5.
2) Add [[OriginalSource]] and [[OriginalFlags]] internal slots to %RegExpPrototype%. The slot values are the empty string.
3) Add an assertion to 21.2.3.2.2 RegExpInitialize as the first step: "1. Assert: obj has a [[RegExpMatcher]] internal slot.".
4) Handle %RegExpPrototype% in 21.2.5.2 RegExp.prototype.exec and 21.2.5.2.1 RegExpExec this way:

21.2.5.2 RegExp.prototype.exec ( string ) - Replace steps 3-4 with:
3. If R has [[OriginalSource]] and [[OriginalFlags]] internal slots, then
a. If R does not have a [[RegExpMatcher]] internal slot, then
i. Let R be RegExpCreate(R.[[OriginalSource]], R.[[OriginalFlags]]).
ii. ReturnIfAbrupt(R).
4. Else, throw a TypeError exception.

21.2.5.2.1 Runtime Semantics: RegExpExec ( R, S ) - Replace step 6 with:
6. If R has [[OriginalSource]] and [[OriginalFlags]] internal slots, then
a. If R does not have a [[RegExpMatcher]] internal slot, then
i. Let R be RegExpCreate(R.[[OriginalSource]], R.[[OriginalFlags]]).
ii. ReturnIfAbrupt(R).
7. Else, throw a TypeError exception.

5) Change 21.2.5 Properties of the RegExp Prototype Object, from:
> It is not a RegExp instance and does not have a [[RegExpMatcher]] internal slot or any of the other internal slots of RegExp instance objects.
to:
> It is not a RegExp instance and does not have a [[RegExpMatcher]] internal slot. However, it has [[OriginalSource]] and [[OriginalFlags]] internal slots.
> The value of the [[OriginalSource]] and [[OriginalFlags]] internal slots is the empty string.

6) Change 21.2.3.1 RegExp - Replace step 5 with:
5. If Type(pattern) is Object and pattern has [[OriginalSource]] and [[OriginalFlags]] internal slots, then

7) Change 7.2.8 IsRegExp - Replace step 5 with:
5. If argument has [[OriginalSource]] and [[OriginalFlags]] internal slots, return true.

8) Change B.2.5.1 RegExp.prototype.compile (pattern, flags ) - Replace step 3 with:
3. If Type(pattern) is Object and pattern has [[OriginalSource]] and [[OriginalFlags]] internal slots, then
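
The replacement steps proposed in item 4 for RegExp.prototype.exec can be modeled roughly in JavaScript. This is a toy sketch, not how any engine stores internal slots: the `slots` WeakMap, `regExpCreate`, `protoModel`, and `proposedExec` are all hypothetical names, and a real `RegExp` object stands in for the [[RegExpMatcher]] slot.

```javascript
// Toy model of internal slots: each object maps to a record of its slots.
const slots = new WeakMap();

// Stand-in for RegExpCreate: produces an object carrying all three slots.
function regExpCreate(source, flags) {
  const obj = {};
  slots.set(obj, {
    OriginalSource: source,
    OriginalFlags: flags,
    RegExpMatcher: new RegExp(source, flags), // assumption: models the matcher
  });
  return obj;
}

// Stand-in for %RegExpPrototype% under the proposal:
// [[OriginalSource]] and [[OriginalFlags]] are "", and there is no matcher.
const protoModel = {};
slots.set(protoModel, { OriginalSource: "", OriginalFlags: "" });

function proposedExec(R, string) {
  let s = slots.get(R);
  // Step 4: without both slots, throw a TypeError.
  if (s === undefined || !("OriginalSource" in s) || !("OriginalFlags" in s)) {
    throw new TypeError("function called on an incompatible object");
  }
  // Step 3.a: no matcher, so create a fresh RegExp from the stored slots.
  if (!("RegExpMatcher" in s)) {
    R = regExpCreate(s.OriginalSource, s.OriginalFlags);
    s = slots.get(R);
  }
  return s.RegExpMatcher.exec(String(string));
}
```

In this model, `proposedExec(protoModel, "abc")` behaves like an empty regular expression matching at index 0, while an object with no slot record is rejected with a TypeError.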


Legacy compat for these built-in prototypes is something we have been approaching on an asymptotically good-enough basis. It isn't clear whether using RegExp.prototype as the argument to those string methods is inside or beyond the good-enough line.

If we really found we needed to push legacy compatibility to include them it would probably be better to simply revert to %RegExpPrototype% being a fully configured RegExp instance with all the appropriately initialized internal slots.


I consider the current workarounds in RegExp.prototype.exec and RegExp.prototype.test [1] to be half-baked solutions. If compatibility is needed for %RegExpPrototype%, it should be present for every legacy API - that includes the String.prototype methods.

[1] RegExp.prototype.test doesn't even need to special-case %RegExpPrototype%!


(In reply to Allen Wirfs-Brock from comment #1)
> If we really found we needed to push legacy compatibility to include them it
> would probably be better to simply revert to %RegExpPrototype% being a fully
> configured RegExp instance with all the appropriately initialized internal
> slots.

That's the alternative solution. The only reason I'm proposing to add just [[OriginalSource]] and [[OriginalFlags]] is the RegExp.prototype.compile issue described in [2]. If %RegExpPrototype% is reverted to a normal RegExp instance, RegExp.prototype.compile needs to handle %RegExpPrototype% (for the current and foreign realms; looking at you, V8! [3]).

[2] https://mail.mozilla.org/pipermail/es-discuss/2015-February/041656.html
[3] https://github.com/v8/v8-git-mirror/blob/27a3879617245f347213a739eee3727e24a0c608/src/regexp.js#L63-L67


Cc'ing Kyle Simpson.
Maybe he can share some use cases where RegExp.prototype is used as a RegExp instance?


I think these changes should be reverted. We do not yet have compelling evidence that this will break enough sites to cause browser game theory to make it un-implementable.


As I've said on es-discuss, my primary usage pattern is, in order, `Function.prototype`, `Array.prototype`, and to a much lesser (but non-zero) extent, `RegExp.prototype`. I have never relied on `String.prototype` being a String, `Number.prototype` being a Number, or `Boolean.prototype` being a Boolean. `Object.prototype` is a non-issue since it was already a plain object, and I've never cared about its `instanceof` behavior.

We've already discussed `Function.prototype` and why it had to be rolled back. I also provided some evidence for `Array.prototype` being rolled back.

As for `RegExp.prototype`, the only place I ever used this was as a default value for a parameter to a function utility, something similar to this:


function lookForAllMatches(str,re) {
    str = str || "";
    re = re || RegExp.prototype;
    if (!re.global) re = new RegExp(re.source,"g");
    return str.match(re);
}


Could I have done just `/(?:)/` instead of `RegExp.prototype`? Of course. But there was no compelling reason back then, nor was there any sense or signal that such a fundamental and simple thing would ever be changed.
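
For illustration, the same utility with the `/(?:)/` default mentioned above might look like the sketch below. It keeps the original logic as-is, including the fact that rebuilding the regex with only "g" drops any other flags the caller passed; that is a property of the original snippet, not a recommendation.

```javascript
// Variant of lookForAllMatches that defaults to an empty regex literal
// instead of relying on RegExp.prototype being a RegExp instance.
function lookForAllMatches(str, re) {
  str = str || "";
  re = re || /(?:)/;
  // Force a global search; note this discards other flags such as "i".
  if (!re.global) re = new RegExp(re.source, "g");
  return str.match(re);
}

console.log(lookForAllMatches("banana", /a/)); // [ 'a', 'a', 'a' ]
```

With no `re` argument, the empty pattern matches at every position, so the result is an array of empty strings.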

That's it, that's my only evidence.

-------

That evidence is not, admittedly, very compelling. But I would submit that the evidence for the change (making RegExp.prototype not a regex) is equally uncompelling, perhaps even less so. I really can't imagine any scenarios where people will be confused by `RegExp.prototype` being `instanceof RegExp`. I think that's just academic fodder.

One other tiny point in favor is keeping consistency with `Function.prototype` and `Array.prototype`, since all 3 of these have at least some demonstrated utility as default "empties" of their respective types.

But to boil it all down, I won't lose any sleep if `RegExp.prototype` changes. I would be vehemently opposed (not that that matters) to `Function.prototype` or `Array.prototype` changing.


fixed in rev35

I reverted the %RegExpPrototype% test in the exec and test methods. I think they are too much of a hack to standardize.

So, for now at least, RegExp.prototype is an ordinary object that doesn't have any RegExp internal slots and gets no special legacy-compat recognition.

I think we may still end up reverting it to being a RegExp instance, but not in this draft.

