tc39 / proposal-shadowrealm

ECMAScript Proposal, specs, and reference implementation for Realms

Home Page: https://tc39.es/proposal-shadowrealm/

proposal-shadowrealm's Introduction

ECMAScript spec proposal for ShadowRealm API

Status

Champions

@dherman
@caridy
@erights
@leobalter
@rwaldron
@legendecas

Index

What are ShadowRealms?
API (TypeScript Format)
Presentations
History
Contributing
- Updating the spec text for this proposal

What are ShadowRealms?

A ShadowRealm is a distinct global environment with its own global object, containing its own intrinsics and built-ins (standard objects that are not bound to global variables, like the initial value of Object.prototype).

See more at the explainer document.

API (TypeScript Format)

declare class ShadowRealm {
    constructor();
    importValue(specifier: string, bindingName: string): Promise<PrimitiveValueOrCallable>;
    evaluate(sourceText: string): PrimitiveValueOrCallable;
}

See some examples in the Explainer file.
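As a minimal usage sketch of the declaration above, assuming a host that actually ships ShadowRealm (many do not yet): the helper name tryShadowRealm is purely illustrative, and it returns null when the constructor is missing. Only evaluate() is exercised; importValue() needs a real module specifier and is left out.

```javascript
// Hypothetical helper: feature-detects ShadowRealm before using it.
function tryShadowRealm() {
  if (typeof ShadowRealm !== "function") {
    return null; // host does not implement the proposal
  }
  const realm = new ShadowRealm();
  // Primitives cross the boundary as-is.
  const sum = realm.evaluate("1 + 2");
  // Callables cross the boundary as wrapped functions.
  const wrappedAdd = realm.evaluate("(a, b) => a + b");
  return { sum, viaWrapped: wrappedAdd(2, 3) };
}

const outcome = tryShadowRealm();
console.log(outcome === null ? "ShadowRealm is not available here" : outcome);
```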

Presentations

History

we moved on from the exposed globalThis model to a lean isolated realms API (see #289 and #291)
we worked on this during ES2015 time frame, so never went through stages process (ES6 Realm Objects proto-spec.pdf)
got punted to later (rightly so!)
goal of this proposal: resume work on this, reassert committee interest via advancing to stage 2
original idea from @dherman: What are Realms?

Contributing

Updating the spec text for this proposal

The source for the spec text is located in spec.html and is written in Ecmarkup.

When modifying the spec text, you should be able to build the HTML version by using the following command:

npm install
npm run build
open dist/index.html

Alternatively, you can use npm run watch.

proposal-shadowrealm's Issues

[shim] add shim/test to validate against existing Caja/SES tests

[shim] verify that all stdlibs are in place

[task] define API to control the global contour in realm

strawman:

for (let name of realm.globalContour.names) {
     // ...
}
realm.eval("class Foo {}");
console.log("Foo" in realm.globalContour.names); // yield "true"
realm.eval("let x = 12;");
realm.eval("x == 12"); // yield "true"
realm.globalContour.delete("x"); // should this be allowed?

other notes from @dherman:

globalContour is probably a very bad name.
delete is controversial, it goes against the invariants of a let declaration.

Does this proposal subsume the old one?

Does this proposal subsume the Realm API proposal at https://gist.github.com/dherman/7568885 ?

init() method

Rationale

From the strawman (from @dherman), this method was so that:

a) there could be a default init() method that _does_ populate the global object with the standard library
b) it's trivial to override to have an empty global object

In other words, it allows whitelisting, but it still has nice default behavior, e.g.:

when you say

const realm = new Realm();

you get a realm where e.g. realm.global.Array === realm.intrinsics.ArrayConstructor

when you say

class EmptyRealm extends Realm { init() { /* do nothing */ } };
const realm = new EmptyRealm();

you get a realm where e.g. !('Array' in realm.global)
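The default-plus-override behavior described in this issue can be sketched with ordinary classes. SketchRealm below is a stand-in, not the proposed Realm: its global and intrinsics are plain objects, and the ArrayConstructor key simply mirrors the example above.

```javascript
// Stand-in for the strawman Realm; everything here is an assumption for illustration.
class SketchRealm {
  constructor() {
    this.intrinsics = { ArrayConstructor: Array };
    this.global = {};
    // The constructor calls init(), so a subclass override takes effect.
    this.init();
  }
  // Default behavior: populate the global with the standard library.
  init() {
    this.global.Array = this.intrinsics.ArrayConstructor;
  }
}

// Overriding init() with a no-op leaves the global object empty.
class EmptySketchRealm extends SketchRealm {
  init() { /* do nothing */ }
}

const populated = new SketchRealm();
const empty = new EmptySketchRealm();
console.log(populated.global.Array === populated.intrinsics.ArrayConstructor);
console.log("Array" in empty.global);
```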

Other notes

This method is not supposed to be called on the realm object. Can this be implemented via inheritance? Or should this be a symbol or something?

The getters return new objects every time per spec

This seems bad. If that's the intention they should become functions, not getters, otherwise realm.intrinsics !== realm.intrinsics (and same for stdlib).
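A small sketch of the concern, using a hypothetical GetterRealm class whose intrinsics getter builds a fresh object on every access (an assumption that mimics the spec text being criticized, not the spec itself):

```javascript
// Hypothetical class: the getter allocates a new object each time it is read.
class GetterRealm {
  get intrinsics() {
    return { Array };
  }
}

const getterRealm = new GetterRealm();

// Two reads of the same getter are not identical.
const sameIdentity = getterRealm.intrinsics === getterRealm.intrinsics;

// Mutating one snapshot does not affect what the next read returns.
const snapshot = getterRealm.intrinsics;
snapshot.Array = class MyArray {};
const unaffected = getterRealm.intrinsics.Array === Array;

console.log(sameIdentity, unaffected);
```

Exposing the same data through a method call would at least make the allocation visible at the call site.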

[shim] fix typeof

create state machine to detect typeof on the proxy (with's context)

It seems that calling the has() trap followed by the get() trap is only possible when the evaluated code uses typeof. Based on this assumption (to be validated), we could simply have has() always return true and have the immediately following get() return undefined, in which case typeof will work just fine.

Ref: https://github.com/google/caja/wiki/SES#typeof-variable
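To make the trap sequence concrete, here is a rough sketch that records which traps fire when typeof is applied to a name inside a with block backed by a proxy. The helper makeTracingScope and the binding names are made up for the illustration. The sloppy-mode body is built with the Function constructor so the snippet also works from strict or module code.

```javascript
// Hypothetical helper: a proxy whose has() always answers true and whose
// get() falls through to the backing object (undefined for unknown names).
function makeTracingScope(bindings, trace) {
  return new Proxy(bindings, {
    has(target, key) {
      trace.push(`has:${String(key)}`);
      return true;
    },
    get(target, key) {
      trace.push(`get:${String(key)}`);
      return Reflect.get(target, key);
    },
  });
}

// Function bodies created this way are sloppy, so `with` is allowed.
const probe = new Function(
  "scope",
  "with (scope) { return [typeof missingName, typeof answer]; }"
);

const trace = [];
const typeofResults = probe(makeTracingScope({ answer: 42 }, trace));
console.log(typeofResults);
console.log(trace);
```

Because has() claims every name exists, the lookup never escapes to the outer scope, and typeof on an unknown name still yields "undefined" instead of throwing.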

Identity discontinuity with object and array literals.

In the evaluation iframe, we unfortunately create a new form of identity discontinuity when we inject the parent window's Object and Array, either by using the with statement or a const declaration. This is in addition to the known identity discontinuity in the main window:

// in the iframe
const Array = window.parent.Array;
[1,2,3] instanceof Array; // false
[1,2,3].constructor === Array; // false

// in the main window
evaluator("[1,2,3]") instanceof Array; // false
evaluator("[1,2,3]").constructor === Array; // false

Where evaluator() is some function defined in the iframe that performs the eval(). Also, checking the constructor is a popular technique used by libraries, for example in Facebook's Immutable.js:

function isPlainObj(value) {
  return value &&
    (value.constructor === Object || value.constructor === undefined);
}

See Immutable.js on GitHub

However, if we patch the iframe's Object and Array prototypes before shadowing them, then we regain the lost identity continuity in the iframe, and regain some in the main window:

// in the iframe
Array.prototype.constructor = window.parent.Array;
[1,2,3] instanceof Array; // false
[1,2,3].constructor === Array; // true

// in the main window
evaluator("[1,2,3]") instanceof Array; // false
evaluator("[1,2,3]").constructor === Array; // true

I was hopeful that this technique would also fix the instanceof operator in the main window (since by definition, instanceof tests the presence of constructor.prototype in the object's prototype chain). Obviously, we are not there yet!
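The asymmetry between the two checks can be reproduced in a single realm. In the sketch below, ParentArray and IframeArray are plain stand-in constructors (assumptions, not real cross-frame objects): patching prototype.constructor changes what the constructor check sees, but instanceof only looks for ParentArray.prototype on the prototype chain, which the patch does not touch.

```javascript
// Stand-ins for the parent window's Array and the iframe's own Array.
function ParentArray() {}
function IframeArray() {}

// Stand-in for an array literal created inside the iframe.
const literal = new IframeArray();

// The patch from the issue: point the iframe prototype's constructor at the parent.
IframeArray.prototype.constructor = ParentArray;

// The constructor check now passes...
const constructorMatches = literal.constructor === ParentArray;

// ...but instanceof still fails, because ParentArray.prototype is not
// anywhere on the prototype chain of `literal`.
const instanceofMatches = literal instanceof ParentArray;
const onChain = ParentArray.prototype.isPrototypeOf(literal);

console.log(constructorMatches, instanceofMatches, onChain);
```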

[task] formalization of this proposal

transfer repo to tc39 org
set up the build for spec text
add details about the polyfill
gh-pages for spec text

Should globals from 262's Annex B be part of the intrinsics/stdlib?

Limit execution time to prevent malicious code from eating up CPU

Are there any plans to limit CPU execution? This can take 2 forms:

Limit the execution time. If execution does not complete, simply kill it.
Throttle the CPU to lower priority.

Maybe both.

The use case this would address would be allowing users to upload JS plugins, templates, etc, into an existing system without degrading the performance of the system. Let's say a malicious user uploads some JS that just spins in an infinite loop and eats up CPU. Being able to mitigate against this kind of threat would make those kinds of features feasible.

evaluator for modules

New method Realm.prototype.evalModule which will be equivalent to Realm.prototype.eval, but evaluates a module source text. This new method should be promise based, returning a promise to a Module Namespace Object:

const r = new Realm();
const ns = await r.evalModule(`export const x = 1;`);
console.log(ns.x); // yield "1"

This new method is just a convenience since you can achieve similar results by using the new import() syntax:

const r = new Realm();
const ns = await r.eval(`import("something")`);
console.log(ns.x); // yield "1"

where "something" resolves to the source text export const x = 1;.

Additionally, we can bikeshed on the names, and get Realm.prototype.eval to align with the new method. @dherman suggested: Realm.prototype.evalScript(scriptText) and Realm.prototype.evalModule(sourceText) or just Realm.prototype.script(scriptText) and Realm.prototype.module(sourceText).

Description of Realms.

Besides having the spec here, it'd be nice to link to (or write directly) what a Realm is, so that the spec makes more sense to those not yet familiar with what Realms are.

I myself am not even sure what to Google, as "realm" is a fairly generic word. Even "programming realm" doesn't help (or Google can't guess what I'm looking for).

Constructor seems to enforce that the global not have the same realm as the new Realm

Although this is kind of sketchy since we don't have a notion of an object having a realm in ES (yet?), it still seems like it would prevent these realms from reflecting realistic situations. For example, on the web, the Window global object created during realm-creation time is created in the new Realm. Most obviously this means it has newRealm's Object.prototype in its prototype chain, but this also has subtle implications for various APIs.

I don't see a way to do that in this API. It seems like in this API,

newRealm.eval('this.__proto__ === Object.prototype')

will always be false, because the global object has to be passed in to the Realm constructor, and thus must be created before the realm exists, and thus cannot be the Realm's Object.prototype.

Note that in InitializeHostDefinedRealm there are explicit "callback" steps for the global object and global this binding in step 7 and 8. They aren't arguments, because that would prohibit all the interesting situations we encounter on the web and in Node. Instead they have to be "callbacks" that execute after the realm has already come into existence.

[shim] Feasibility of running tests from test-262 inside the shim

[shim] add example of a polyfilled realm instance

create a new realm
patch the realm with an existing polyfill (e.g.: Array.prototype.includes)
evaluate code that relies on the polyfilled feature

[shim][investigation] find a way to support global-contour for `let` and `const`

@directEval and @indirectEval need better names

These are correctly described in prose as "hooks" ("traps" would also be ok). However, the names "@directEval" and "@indirectEval" suggest that they are actually doing the evaluation, or that they are called in order to do the evaluation.

If we called these "directEvalHook" and "indirectEvalHook" (or "...Trap") then their purpose would be clearer. We would then not need the added clarity and inconvenience of making them symbols.

Path to Stage 4

Stage 4

Stage 3

Stage 2

receive developer feedback
committee approval
spec text written
spec reviewers selected

Stage 1

presentation by @dherman

[task] should the hooks be called before or after `HostEnsureCanCompileStrings`?

specify how to create a new realm that inherit all intrinsics/primordials from parent realm

This will allow us to create multiple realms that share the primordials while still creating their own global object and this value. This is somewhat equivalent to the previously discussed Scopes concept.

[shim] add support for node

should hook methods denoted by symbols be marked as non-writable when extracted from the instance?

The Realm.indirectEval and Realm.directEval hooks are cached during the constructor invocation, so successive attempts to change them have no effect. Should they be marked as non-writable?

const o = new Realm();
o[Realm.indirectEval] = function (x) { throw new Error('...') };
o.eval('1'); // will not throw
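The caching behavior behind this question can be sketched as follows. CachingRealm and the indirectEvalHook symbol are hypothetical stand-ins for Realm and Realm.indirectEval. The hook is read once in the constructor, so a later assignment on the instance is silently ignored.

```javascript
// Hypothetical symbol standing in for Realm.indirectEval.
const indirectEvalHook = Symbol("indirectEval");

class CachingRealm {
  #cachedIndirectEval;

  constructor() {
    // The hook is looked up once, at construction time.
    this.#cachedIndirectEval = this[indirectEvalHook];
  }

  // Default hook: returns the source text unchanged.
  [indirectEvalHook](sourceText) {
    return sourceText;
  }

  eval(sourceText) {
    const rewritten = this.#cachedIndirectEval.call(this, sourceText);
    return (0, eval)(rewritten); // indirect eval of the rewritten source
  }
}

const o = new CachingRealm();
o[indirectEvalHook] = function () { throw new Error("never reached"); };
const cachedResult = o.eval("1");
console.log(cachedResult);
```

Marking the property non-writable would turn the ignored assignment into an error in strict mode, which is arguably less surprising.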

[shim] Compliance with Annex B

[shim] evaluate whether or not strict mode has to be enforced

Need some way to evaluate code as global code

https://rawgit.com/caridy/proposal-realms/master/index.html#sec-realm.prototype.eval says:

Extensible web: This is the dynamic equivalent of a <script> in HTML.

But it is not. "eval" evaluates code "as eval code". Script tags evaluate code "as global code". The first cannot be used to safely emulate the second. We need to be able to do both.

See
https://github.com/FUDCo/proposal-frozen-realms/wiki/Differences-between-SES,-SES5,-and-ES5#top-level-declarations
tc39/proposal-ses#24

should `get intrinsics` be frozen?

let intrinsics = myRealmObj.intrinsics;
intrinsics.Array = MyArray;

intrinsics.Array assignment has no side effect on the realm obj.

Do we really need a default `global` per realm?

Notes from discussion (oct 18th):

In the current proposal creating a realm does not create a global, instead, globals are created when creating new scopes from a realm. There are two points of contention here:

function's constructor is shared between all scopes.
the intrinsics and stdlib of the realm can be invoked directly e.g.:

const r = new Realm();
r.intrinsics.eval("this");
r.stdlib.eval.value("this");
r.intrinsics.Function(...);

side note: invoking the function constructor does not imply creating a new global or altering the contour, but the function itself might attempt to access some globals.

proposal: during the discussion, another suggestion was to prevent access to Function constructor by walking the prototype chain, and only allow it via the global evaluation Function. obviously, we will have to evaluate how much of the existing code is relying on that, otherwise existing code will not work inside a realm.

[shim] verify that all intrinsics are in place

How to prevent sloppy mode evaluation inside a realm, or inside a scope?

At the moment, both evaluators (eval and Function) are not enforcing "strict mode". The realm's hooks might help here, but we need to provide a reliable way to do so. The reason why this is important is because sloppy mode will allow inspection of the caller, which is undesirable in most cases.

`realm.indirectEval()` vs `realm.directEval()` vs `realm.nonEval()` hooks

These 3 methods of realm objects are hooks into any evaluation process carried out by the realm, whether it is triggered by calling realm.eval() or by calling eval, or any reference to it, from the code evaluated inside the realm.

These 3 methods are not supposed to be called directly on the realm object. A few questions about this:

is the statement above true?
should we restrict the access of the realm object?
Is there a case where we want direct invocation of these 3 methods of the realm object?

The basic explanation of these 3 methods, and how they differ from realm.eval(), can be inferred from the following examples:

realm.eval(str);
// will call realm.indirectEval()

realm.eval("eval(str)");
// will call indirectEval(outer), directEval(inner)

realm.eval("(0,eval)(str)");
// will call indirectEval(outer), indirectEval(inner)

realm.eval("{ let eval = console.log; eval(str) }");
// will call indirectEval (outer), nonEval(inner)

[spec] Module Record can have an internal slot to store the result of HostResolveImportedModule() resolution

Dave came up with the idea of creating module records with a pre-linked resolution for each of its imports (linked during the import hook in the realm), in which case we don't have to call HostResolveImportedModule() when instantiating and evaluating module records created in a realm.

In 262, there is an invariant for the abstract operation HostResolveImportedModule():

Multiple different referencingModule, specifier pairs may map to the same Module Record instance. The actual mapping semantic is implementation-defined but typically a normalization process is applied to specifier as part of the mapping process. A typical normalization process would include actions such as alphabetic case folding and expansion of relative and abbreviated path specifiers.

We have two options:

the current invariant is no longer relevant if the resolved module record and its specifier are added to a new internal slot the first time it is resolved.
the invariant can be preserved, while the implementation can signal when to store the resolved module record and its specifier into the new internal slot.

Need better direct-eval test

Right now, one cannot set up a realm or lexical context in which (0,eval)(str) invokes a virtualized eval function -- by replacing or shadowing the global binding of eval -- and in which eval(str) performs a direct eval. The direct and indirect eval hooks are not an adequate substitute because they only enable virtualization by translation.

It should be possible to provide a completely different binding for the lexical eval variable, or even none at all, while still somehow stipulating that eval(str) in that context be a direct eval that also calls the direct eval hook first.

I don't think this question implies a revival of the nonEval hook idea. I think that was something else, though possibly related.

HTML misrenders

HTML at https://rawgit.com/caridy/proposal-realms/master/index.html misrenders for me as shown in the attachment.

[investigation] indirect vs direct eval cross realms

It is not clear from 262 what will happen when eval from another realm is invoked. This leads to the question: should it be considered an "indirect eval" or a "direct eval"? It seems that the answer is "indirect eval on the other realm", but it has to be clarified.

the nature of the realm's global object

Questions to be discussed with implementers:

Can the global object be a proxy?
Can two realms share the same global object?

[shim] implement evaluator for generator functions

Constructor design seems weird

The current design accepts (target, handler) and then, if both were supplied, uses them to create a proxy. If zero or one of them were supplied, it falls back to the default (an empty object).

As far as I can tell, this seems to be trying to enforce that the global object is a proxy. Why? Why not just allow any object to be passed in, including a proxy if you want, but an ordinary object if not?

scope vs realm

We still have a hard time drawing the line between these two concepts. Aside from that, the contour plays another important role in holding other bindings (let, const, etc.), and it is not very clear at the moment where these pieces fit.

[shim] add example of inheritance

Options

extend the RealmShim class
provide a custom init method
provide a custom intrinsics getter
provide a custom stdlibs getter
provide a custom eval method
provide a proxy to the global object

ES6 Realm proto-spec

Note that ES6 drafts included a Realm API for quite some time before it was dropped. The main reason for dropping it was that once the module loader API was dropped from ES6, there was concern that shipping the Realm API in ES6 might lead to problems when we wanted to reintroduce a loader API in a future ES edition.

The attachment contains the Realm API spec text that was removed from the ES6 draft.

ES6 Realm Objects proto-spec.pdf

How to get reference to current global realm

This would be very useful for certain security applications, to restrict or modify a function's behavior (e.g. like what weak references already require) if it's called from a different realm. Something like get Realm.current would be perfect for this application.

const myRealm = Realm.current
const theirRealm = new Realm()

myRealm !== theirRealm.eval("Realm.current")
theirRealm === theirRealm.eval("Realm.current")

Should we really include all the intrinsics?

There are a couple interesting problems with the current approach of copying all the intrinsics from table 7:

Membership in the list varies depending on whether 262, and other specs (such as HTML), need a copy of the intrinsic at some later time. For example, entries like ObjProto_toString and ObjProto_valueOf are not at all fundamental intrinsics (compared to e.g. Object.prototype.hasOwnProperty), but exist in the table simply because other specs need them. Similarly, we have a long-outstanding issue to formalize JSON-parsing/serializing in the web platform, which will likely lead to adding JSON_parse and JSON_serialize intrinsics. I don't think we've run into a case of the table shrinking over time yet, but that also seems plausible, e.g. if a spec is refactored to no longer need such a reference, then we might remove it from the table.
At least one upcoming intrinsic, %AsyncFromSyncIteratorPrototype%, is really just a spec device, which you cannot observe through the engine. Exposing it through Realm APIs would preclude implementations that never reify that prototype.

Neither of these is a showstopper, but they do imply to me that maybe copying table 7 is not the right move. Instead perhaps we should be looking for what the actual use cases for the intrinsics are and trying to craft something based on that. In particular I'm skeptical that intrinsics provides any value over stdlib.

Make the hooks constructor arguments instead of symbol properties

What is the use case for changing directEval/indirectEval (and maybe init??) after the fact? They're not protocols across many objects like most symbol-valued properties.

I'd suggest making them constructor arguments so only the creator of the Realm instance can customize directEval/eval behavior. (And maybe init, although I don't understand that. ~~I guess I'll file a separate issue for that.~~ I guess #10 is a reasonable explanation.)

Implementation of the Shim without `with` and `proxy`

Meeting Notes

Motivation

with + proxy seems to be very slow. Initial evaluation is fine, but the JIT can't really optimize things if the lookup is not deterministic. According to some research from JF (from Salesforce), it is multiple orders of magnitude slower, and degrades quickly for more complex code.

Proposal

function addReturnStatementForCompletionValueToSource(sourceText) {
    return 'return ' + sourceText; // this is dummy, but Mark can help to improve this
}

// build one of this per realm
var sourceOfEvaluator = `
(function() {
    var parent = null;
    var top = null;
    return function evaluatorFactoryForGlobal(window) {

        // ... begin of the taming process
        var addEventListener = window.addEventListener;
        // add a line like the above for `name in window`
        // or set it to undefined if you should not have access
        // ... end of the taming process

        function secureEval() {
            return eval('(function(eval){return(function(){ "use strict"; ' + 
               addReturnStatementForCompletionValueToSource(arguments[0]) +
            '})()})(secureEval);');
        }
        return secureEval;
    };
})();
`;

// secureEval is constructed per realm and cached
var secureEval = (0,eval)(sourceOfEvaluator)(window); // app level window
var secureEval2 = (0,eval)(sourceOfEvaluator)(secureWindow); // secureWindow is based on iframe

// regular invocation of eval on a system mode
console.log(secureEval('addEventListener'));
console.log(secureEval('eval("addEventListener")'));

// regular invocation of eval on a realm
console.log(secureEval2('addEventListener')); // yields a secure secureWindow.addEventListener
console.log(secureEval2('eval("addEventListener")')); // yields a secure secureWindow.addEventListener

Initial analysis reveals that this could be 35% slower than a regular evaluation, which is a considerable improvement over the multiple orders of magnitude we are seeing with the other technique.

Compromises

Still requires "strict mode" on all code evaluated in the realm.
In theory, no new variable should be added to the iframe's window ever because the reusability of the evaluator is critical.
Still direct eval is now available on all code evaluated in the realm.

Notes

AV: Decomposition? const { foo, bar, baz, ... } = newGlobal;
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Destructuring_assignment
AV: Inspection of global object with Reflect.ownKeys()
MM: need a good algo to extract all the names from window, and its prototype chain.
MM: with this new solution, we can break out of the eval call by using ;}}eval("game-over");, we need to figure out how to protect against this, but this should not be a deal breaker.
CP: we will try it with the extra protection of with + proxy and see whether it affects the JIT, while providing an extra belt-and-suspenders mechanism; JF can report the numbers next week.
Jorge: will help with the completion of the taming process for intrinsics.
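The break-out mentioned by MM comes from building the wrapper by string concatenation. The sketch below uses a simplified wrapper (naiveWrap, a made-up name, with one function layer instead of the two in the proposal) to show how source text that closes the wrapper early ends up running outside of it. The marker property name is arbitrary.

```javascript
// Simplified stand-in for the concatenated wrapper in the proposal above.
function naiveWrap(sourceText) {
  return '(function(){ "use strict"; return ' + sourceText + ' })()';
}

function evaluateWrapped(sourceText) {
  return (0, eval)(naiveWrap(sourceText)); // indirect eval, global scope
}

// Well-behaved source stays inside the strict wrapper.
const wellBehaved = evaluateWrapped("1 + 1");

// This source closes the wrapper early, runs a statement at the top level,
// then opens a new function so the trailing " })()" still parses.
const breakout = "0 })(); globalThis.__escapedWrapper = true; (function(){ return 0";
evaluateWrapped(breakout);

const escaped = globalThis.__escapedWrapper === true;
delete globalThis.__escapedWrapper; // clean up the marker

console.log(wellBehaved, escaped);
```

One possible line of defense, not discussed in the notes, would be to parse-check the source text on its own before concatenating it.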

Annexes

"The Eval that Men Do":

http://brrian.org/papers/ecoop2011-javascript-eval.pdf

Webkit Bug on Function constructor:

https://bugs.webkit.org/show_bug.cgi?id=106160

hook to resolve module when `import()` is called in a realm

Create a new Hook to intercept import() calls as a way to control what can be imported from within the realm.

This new hook called importHook, or something similar, should return a module record or a promise to a module record that is evaluated, or ready to be evaluated.

Open Question

Should this also return a namespace exotic object? E.g.: importing from the outer realm using import() and pipe that into the inner realm.

[shim] add example of a frozen realm

extend RealmShim
provide a new method called patch that will be called during the initialization phase
freeze intrinsics that cannot be polyfilled during the construction phase
provide a patch method that can mutate some intrinsics (e.g.: polyfill Array.prototype.includes)
call patch during the init phase
freeze the rest of the intrinsics after calling patch
evaluate code that validates the frozen realm
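The patch-then-freeze ordering in the steps above could look roughly like this. To avoid freezing the real built-ins of whatever environment runs the snippet, it works on a plain stand-in object. standInIntrinsics, patch, and deepFreeze are all hypothetical names, not part of RealmShim.

```javascript
// Recursively freeze an object graph reachable through own data properties.
function deepFreeze(value, seen = new Set()) {
  const isObjectLike =
    (typeof value === "object" && value !== null) || typeof value === "function";
  if (!isObjectLike || seen.has(value)) return value;
  seen.add(value);
  for (const key of Reflect.ownKeys(value)) {
    const descriptor = Object.getOwnPropertyDescriptor(value, key);
    if ("value" in descriptor) deepFreeze(descriptor.value, seen);
  }
  return Object.freeze(value);
}

// Stand-in for a realm's intrinsics (not the real Array.prototype).
const standInIntrinsics = {
  ArrayPrototype: { map: (list, fn) => list.map(fn) },
};

// Step 1: patch while the intrinsics are still mutable.
function patch(intrinsics) {
  intrinsics.ArrayPrototype.includes = (list, item) => list.indexOf(item) !== -1;
}
patch(standInIntrinsics);

// Step 2: freeze everything after patching.
deepFreeze(standInIntrinsics);

console.log(Object.isFrozen(standInIntrinsics.ArrayPrototype));
console.log(standInIntrinsics.ArrayPrototype.includes([1, 2, 3], 2));
```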

Informal meeting notes from breakout session about Realms

Attendees: @dherman @erights @wycats @caridy

Notes

The "Evaluator" concept to have a parametrized evaluator is eating up the Realm concept. It seems that we don't really need the two separate things.
@wycats recommendation to simplify the API by simply allowing the creation of new realms that can be initialized with the intrinsics from other realm seems to be an important and desired simplification. This is very similar to the initial proposal of spawning realms, but it is now more formalized.
We need more clarification between intrinsics and primordials, and codify that in the spec.
We need more discussion around sloppy vs strict evaluation invocation from realm, @wycats feels that concatenating "use strict" might be sufficient, and desirable because of the implications of evaluating sloppy code in strict mode, and the possible frustration of users when code misbehaves when forced to run in strict mode.
@erights is interested in exploring a little bit more the possibility of having an eval hook that can do the actual evaluation instead of just the program rewrite as it does today. It sounds like a very long shot though.

The formalization of the realm concept (tiling process from @dherman):

Realm's Configuration:

intrinsics/primordials
~~global object config~~
import hook
import.meta hook
eval hook
isDirectEval hook

Realm's Responsibilities:

Evaluate script
Evaluate Module

Realm's Features:

evaluate script in strict mode or sloppy mode
evaluate scripts as global script vs eval script

Examples

Some of the configuration

const r = new Realm({
  intrinsics: "inherit", // or maybe "parent"
  importHook: "inherit",
  importMetaHook: "inherit",
  evalHook: "inherit",
  isDirectEval: "inherit",
});

In the previous example, the Realm instance r is effectively a new scope: it inherits all the hooks and all the intrinsics from the parent realm, but creates a new global object and a new this value, allows evaluations bound to those values without introducing identity discontinuity, and delegates to the host behavior for import() calls.

Similarly, each of those configurations can be customized:

if intrinsics is not specified, it will create a new set of intrinsics
if importHook is not specified, it will throw on any import call. And if specified as a function, it will be used as the import hook as spec'd today.
if evalHook is not specified, it will throw on any direct eval invocation. And if specified as a function, it will be used as the eval hook as spec'd today.
if isDirectEval is not specified, it will run the default algo.
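The defaulting rules for importHook and evalHook listed above can be sketched as a small resolver. resolveHook and parentImportHook are hypothetical names, and the error messages are invented. Only the three cases from the notes ("inherit", a function, or unspecified) are taken from the discussion.

```javascript
// Hypothetical resolver for a single hook option.
function resolveHook(hookName, configured, parentHook) {
  if (configured === "inherit") {
    return parentHook; // delegate to the parent realm's behavior
  }
  if (typeof configured === "function") {
    return configured; // use the custom hook as given
  }
  if (configured === undefined) {
    // Unspecified: any use of the feature throws.
    return () => {
      throw new TypeError(`${hookName} is not configured for this realm`);
    };
  }
  throw new TypeError(`${hookName} must be "inherit", a function, or omitted`);
}

// Stand-in for whatever the parent realm would do on import().
const parentImportHook = (specifier) => `parent:${specifier}`;

const inheritedHook = resolveHook("importHook", "inherit", parentImportHook);
const missingHook = resolveHook("importHook", undefined, parentImportHook);

console.log(inheritedHook("./mod.js"));
try {
  missingHook("./mod.js");
} catch (error) {
  console.log(error.message);
}
```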

Open questions

what should be the behavior if importMetaHook is not specified?
can the global object value be proxified?
can the this value be proxified?

Look for names in WebIDL that are not valid JS identifiers

Issue 54 would benefit from knowing whether the default environment globally binds properties whose names are not valid identifiers.

Crawl WebIDL and compare against the ReservedWord production.
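A rough sketch of the comparison step, assuming the crawl has already produced a flat list of property names (the sample list below is invented for illustration). The reserved-word set follows the ReservedWord production, and the regular expression approximates IdentifierName using Unicode property escapes.

```javascript
// Words from the ECMAScript ReservedWord production.
const RESERVED_WORDS = new Set([
  "await", "break", "case", "catch", "class", "const", "continue", "debugger",
  "default", "delete", "do", "else", "enum", "export", "extends", "false",
  "finally", "for", "function", "if", "import", "in", "instanceof", "new",
  "null", "return", "super", "switch", "this", "throw", "true", "try",
  "typeof", "var", "void", "while", "with", "yield",
]);

// Approximation of IdentifierName (no support for unicode escape sequences).
const IDENTIFIER_NAME = /^[\p{ID_Start}$_][\p{ID_Continue}$\u200C\u200D]*$/u;

function isBindableIdentifier(name) {
  return IDENTIFIER_NAME.test(name) && !RESERVED_WORDS.has(name);
}

// Returns the names that could not be declared as variables.
function findUnbindableNames(names) {
  return names.filter((name) => !isBindableIdentifier(name));
}

// Invented sample input, not real crawl output.
const sampleNames = ["addEventListener", "default", "foo-bar", "0abc", "$ok"];
console.log(findUnbindableNames(sampleNames));
```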

Allow customizing global this separately from global object

This would be required for this to be usable in jsdom. Right now the spec always passes undefined as globalThis, thus making it equal to the global object.
