straker / html-tagged-template

Proposal to improve the DOM creation API so developers have a cleaner, simpler, and secure interface to DOM creation and manipulation.


html-tagged-template's Introduction

Proposal

Improve the DOM creation API so developers have a cleaner, simpler interface to DOM creation and manipulation.

Installing

npm install html-tagged-template

or with Bower

bower install html-tagged-template

Usage

let min = 0, max = 99, disabled = true;

// returns an <input> tag with all attributes set
// the use of ?= denotes an optional attribute which will only be added if the
// value is true
let el = html`<input type="number" min="${min}" max="${max}" name="number" id="number" class="number-input" disabled?="${disabled}"/>`;
document.body.appendChild(el);

// returns a DocumentFragment with two <tr> elements as children
let rows = html`<tr></tr><tr></tr>`;
document.body.appendChild(rows);

Optional Attributes

To add an attribute only when its value is true (such as disabled), use attrName?="${value}". If the value is true, the attribute is added to the output; otherwise it is omitted.
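
As a rough illustration of the rule above, the decision for a single attribute could look like the following. This is a sketch, not the library's code: `resolveAttribute` is a hypothetical helper, and two details are assumptions: any truthy value (not only `true`) keeps the attribute, and a kept optional attribute gets an empty string as its value.

```javascript
// Hypothetical helper (not part of html-tagged-template): decide what to do
// with one attribute once its substitution value is known.
function resolveAttribute(name, value) {
  if (name.endsWith('?')) {
    // Optional attribute: keep it (without the trailing "?") only when the
    // value is truthy; null signals that the attribute should be omitted.
    // Assumption: a kept optional attribute is written with an empty value.
    return value ? { name: name.slice(0, -1), value: '' } : null;
  }
  // Regular attribute: always kept, with the value coerced to a string.
  return { name: name, value: String(value) };
}

const kept = resolveAttribute('disabled?', true);     // { name: 'disabled', value: '' }
const omitted = resolveAttribute('disabled?', false); // null
const regular = resolveAttribute('min', 0);           // { name: 'min', value: '0' }
```

The result object maps directly onto a later `setAttribute(name, value)` call, and `null` means no call is made at all.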

Contributing

The only way this proposal will continue forward is with help from the community. If you would like to see the html function on the web, please upvote the proposal on the W3C DOM repo.

If you find a bug or an XSS case that should be handled, please submit an issue, or even better a PR with the relevant code to reproduce the error in the xss test.

Problem Space

The DOM creation API is cumbersome to work with. Creating a single element with several attributes requires several lines of code that repeat the same thing. The DOM selection API has received needed features that allow developers to do most DOM manipulation without needing a library. However, the DOM creation API still leaves something to be desired, which steers developers away from using it.

Below are just a few examples of how DOM creation requires multiple lines of code to accomplish simple tasks and how developers tend to work around the API to gain access to a much simpler interface.

/*
  Create a single element with attributes:
  <input type="number" min="0" max="99" name="number" id="number" class="number-input" disabled/>
*/
let input = document.createElement('input');
input.type = "number";
input.min = 0;
input.max = 99;
input.name = 'number';
input.id = 'number';
input.classList.add('number-input');
input.disabled = true;
document.body.appendChild(input);

// or the hacky way - create a throwaway parent node just to use innerHTML
let div = document.createElement('div');
div.innerHTML = '<input type="number" min="0" max="99" name="number" id="number" class="number-input" disabled/>';
document.body.appendChild(div.firstChild);


/*
   Create an element with child elements:
   <div class="container">
     <div class="row">
       <div class="col">
         <div>Hello</div>
       </div>
     </div>
   </div>
 */
// use document fragment to batch appendChild calls for good performance
let frag = document.createDocumentFragment();
let div = document.createElement('div');
div.classList.add('container');
frag.appendChild(div);

let row = document.createElement('div');
row.classList.add('row');
div.appendChild(row);

let col = document.createElement('div');
col.classList.add('col');
row.appendChild(col);

let child = document.createElement('div');
child.appendChild(document.createTextNode('Hello'));  // or child.textContent = 'Hello';
col.appendChild(child);
document.body.appendChild(frag);

// or the convenient way using innerHTML
let div = document.createElement('div');
div.classList.add('container');
div.innerHTML = '<div class="row"><div class="col"><div>Hello</div></div></div>';
document.body.appendChild(div);


/*
   Create sibling elements to be added to a parent element:
   <!-- before -->
   <ul id="list">
     <li>Car</li>
   </ul>

   <!-- after -->
   <ul id="list">
     <li>Car</li>
     <li>Plane</li>
     <li>Boat</li>
     <li>Bike</li>
   </ul>
 */
let frag = document.createDocumentFragment();
let li = document.createElement('li');
li.textContent = 'Plane';
frag.appendChild(li);

li = document.createElement('li');
li.textContent = 'Boat';
frag.appendChild(li);

li = document.createElement('li');
li.textContent = 'Bike';
frag.appendChild(li);
document.querySelector('#list').appendChild(frag);

// or if you have the ability to create it through a loop
let frag = document.createDocumentFragment();
['Plane', 'Boat', 'Bike'].forEach(function(item) {
  let li = document.createElement('li');
  li.textContent = item;
  frag.appendChild(li);
});
document.querySelector('#list').appendChild(frag);

Proposed Solution

We propose that a global tagged template string function called html provide the interface to accept template strings as input and return the parsed DOM elements.

let min = 0, max = 99, disabled = true, text = 'Hello';

// single element with attributes
html`<input type="number" min="${min}" max="${max}" name="number" id="number" class="number-input" disabled?="${disabled}"/>`;

// single element with child elements
html`<div class="container">
  <div class="row">
    <div class="col">
      <div>${text}</div>
    </div>
  </div>
</div>`;

// sibling elements
html`<li>Plane</li>
     <li>Boat</li>
     <li>Bike</li>`;

Easy to Use

This proposal wouldn't exist if creating the DOM was easy. Any improvement to the DOM creation API would essentially need to replace innerHTML with something better and just as easy (if not easier), otherwise developers will continue to use it to work around the API.

Proposed Solution

To solve this problem, we propose a new API that will allow developers to create single, sibling, or nested child nodes with a single function call. With the addition of template strings to ECMAScript 2015, we and others feel that they are the cleanest, simplest, and most intuitive interface for DOM creation.

ECMAScript 2015 also introduced tagged template strings, which allow a function to accept a template string as input and return DOM. Tagged template strings also have the advantage that they know where variables were used in the string and can apply security measures there to prevent XSS.

Secure

XSS attacks via string concatenation are among the most prevalent types of security threats the web development world faces. Tagged template strings provide a unique opportunity to make creating DOM much more secure than string concatenation ever could. Tagged template strings know exactly where the user substitution expressions are located in the string, enabling us to apply preventative measures to help ensure the resulting DOM is safe from XSS attacks.

Proposed Solution

There have been two proposed solutions for making template strings secure against XSS: E4H, championed by Ian Hickson (Hixie), and contextual auto escaping, championed by Mike Samuel.

E4H uses an AST to construct the DOM, ensuring that substitutions are made safe against element and attribute injection. Contextual auto escaping tries to understand the context of the attribute or element in the DOM and correctly escape the substitution based on its context.

We propose combining the best ideas from both E4H and contextual auto escaping and avoiding the problems that both encountered. First, the template string is sanitized by removing all substitution expressions (and all XSS attack vectors with them), while leaving placeholders in the resulting string that identify the substitution that belonged there. Next, the string is passed to an HTML template tag using innerHTML, which runs the string through the HTML parser and properly creates elements out of context (such as <tr> elements). Finally, all placeholders are identified and replaced with their substitution expression using the DOM APIs createElement, createTextNode, and setAttribute, with contextual auto escaping applied to prevent further XSS attack vectors.
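
The first of these steps (stripping the substitutions and leaving placeholders) can be sketched as a tag function. This is only an illustration: `toPlaceholderString` is a hypothetical name, and the `substitutionindex:N:` marker format is borrowed from the issue reports on this page rather than from the library source. The second and third steps need a browser DOM and are not shown.

```javascript
// Marker format assumed from the issue reports; the library's may differ.
const PLACEHOLDER_PREFIX = 'substitutionindex:';

// Step 1 only: join the static parts of the template and leave a numbered
// placeholder where each substitution expression used to be.
function toPlaceholderString(strings, ...values) {
  let markup = strings[0];
  for (let i = 0; i < values.length; i++) {
    markup += PLACEHOLDER_PREFIX + i + ':' + strings[i + 1];
  }
  // The values travel separately so they never pass through the HTML parser.
  return { markup: markup, values: values };
}

const xss = '"><script>alert(1)</script>';
const result = toPlaceholderString`<p title="${xss}">hi</p>`;
// result.markup is '<p title="substitutionindex:0:">hi</p>'
// result.values[0] still holds the untrusted string, untouched
```

Because the untrusted value is kept out of `markup`, whatever the parser builds from it contains only developer-written HTML plus inert markers.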


html-tagged-template's Issues

Comparison to hyperx

What's the difference between this and hyperx? (Apart from the "proposal that this be a standard".)

Should substitutionindex: really be deterministic?

I couldn't come up with a nice test case, but I'm concerned about an attacker supplying something like
<p>${xss}</p><a title="substitutionindex:0">yo</a>

Safe URLs deemed unsafe during attribute substitution

An element created with an interpolated URL:

let photo = 'https://example.com/example.png';
document.body.appendChild(html`<img src="${photo}">`);

produces:

<img src="#zXssPreventedz">

Unused parameter

Hello, on line 138, match is an unused parameter.

Why not just use this library?

Hey guys, forgive me if this is an obvious question, but why is this a "proposal" and not just a library? Can't developers just use this library and accomplish the "easier DOM element construction" you describe in the readme? It sounds like you're proposing it be native functionality within browsers, but why is that necessary? Is there something that you can't accomplish with tagged template literals? Or is it just not performant enough to do it that way?

Should we trim whitespace

New line characters at the start and end of the html string create text nodes. Should we just trim the string to remove these? I don't see any reason you would purposefully want those start and end text nodes myself.

// creates a whitespace-only text node, a p tag, and another whitespace-only text node
html`
<p>foo</p>
`;
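
One way to do this without touching whitespace inside the markup would be to trim only the outer edges of the template's static parts. A minimal sketch, with `trimTemplateEdges` and `capture` as hypothetical names that are not part of the library:

```javascript
// Hypothetical helper: strip leading whitespace from the first static part
// and trailing whitespace from the last one; inner whitespace is untouched.
function trimTemplateEdges(strings) {
  const parts = Array.from(strings); // copy, since the tag's array is frozen
  parts[0] = parts[0].replace(/^\s+/, '');
  const last = parts.length - 1;
  parts[last] = parts[last].replace(/\s+$/, '');
  return parts;
}

// Tag function used only to capture the static parts for this demo.
const capture = (strings) => trimTemplateEdges(strings);

const parts = capture`
<p>foo</p>
`;
// parts is ['<p>foo</p>'], so no leading or trailing text node would result
```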

Alternative

I proposed a method called String.substitute()

Which would work like the following:

var html = '<input value="${val}">'; // a plain string, so ${val} is not evaluated here

String.substitute(html, {
  val: 'hello'
}); // <input value="hello">

I think this would work much better, because this is not only for html, js now is on more and more platforms and not just the web.
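
For comparison, the behaviour being proposed can be approximated today with a small function. This is a sketch only: `substitute` is a hypothetical stand-in (no `String.substitute()` built-in exists), and it does plain text replacement with no escaping, so it does not address the XSS concerns this proposal is about.

```javascript
// Hypothetical stand-in for the proposed String.substitute().
// Replaces each ${key} with data[key]; unknown keys are left as-is.
function substitute(template, data) {
  return template.replace(/\$\{(\w+)\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(data, key) ? String(data[key]) : match
  );
}

// Note the single quotes: a backtick literal would interpolate immediately.
const markup = substitute('<input value="${val}">', { val: 'hello' });
// markup is '<input value="hello">'
```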

Now for actual repeated markup, there is the <template> element, which should be extended to do the following:

<template id="temp">
  <li>${text}</li>
</template>
<ul id="list"></ul>

var list = document.getElementById('list');
var temp = document.getElementById('temp');
var dataArr = [ { text: 'hello' }, { text: 'hi' }, { text: 'hey' } ];

dataArr.forEach(function(data) {
  var li = temp.parse(data); // returns a documentFragment
  list.appendChild(li);
});

I think it would be much better to have an element that handles all of this natively.

Overloaded return type seems problematic

Moving the discussion from whatwg/dom#150 (comment).

I see a few alternatives:

- Always return a DocumentFragment
- Always return a single node (throw if it's not a single node)
  - Optionally, add another helper (htmls??) that always returns an array of nodes
  - Or, make the other helper always return a DocumentFragment

Maybe it's not too bad to have the overloaded return type, since users are always in control of its usage and it should be pretty obvious what the return type is? That is, there's no way to feed user input into a template string, so it should be pretty obvious from source inspection whether you're going to get multiple nodes or one node.

On the other hand, maybe it's not obvious: html` <p>foo</p>` creates two nodes, due to the leading whitespace. Even worse,

html`
<p>foo</p>
`;

creates three nodes. Having that throw might be better than having it silently become an array.

do not reject data: protocol

Though I don't have a clear idea whether the data: protocol can be an XSS attack vector.
Maybe at least URIs starting with data:image/ should be allowed - they should be pretty harmless.
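
A protocol allow-list along these lines could be sketched as follows. Everything here is an assumption for illustration: `isAllowedUrl` and `SAFE_PROTOCOLS` are hypothetical names, the list of safe protocols is a guess, and whether data:image/ is harmless in every attribute context is exactly the open question of this issue.

```javascript
// Assumed allow-list; which protocols are actually safe is the open question.
const SAFE_PROTOCOLS = new Set(['http:', 'https:']);

// Hypothetical check (not the library's implementation).
function isAllowedUrl(value, base = 'https://example.com/') {
  let url;
  try {
    url = new URL(value, base); // relative URLs resolve against `base`
  } catch (err) {
    return false; // unparseable input is rejected
  }
  if (SAFE_PROTOCOLS.has(url.protocol)) {
    return true;
  }
  // Allow data: URIs only when the media type starts with "image/".
  return url.protocol === 'data:' && /^image\//i.test(url.pathname);
}

isAllowedUrl('https://example.com/example.png'); // true
isAllowedUrl('data:image/png;base64,AAAA');      // true
isAllowedUrl('data:text/html,<b>hi</b>');        // false
isAllowedUrl('javascript:alert(1)');             // false
```

Parsing with the standard `URL` constructor rather than matching the raw string avoids being fooled by leading whitespace or mixed-case schemes.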

Should we use the HTML parser

The HTML parser is difficult to work with. Creating nodes out of context (such as a <tr></tr>) results in no returned element, and the HTML parser generates unexpected results in many cases.

<image> create an "img" element, </br><br> creates two "br" elements, </p><p> creates more "p" elements than <p></p>, <table><input> creates two siblings while <table><input type=hidden> creates a parent/child relationship, <isindex prompt> creates half a dozen nodes including multiple text nodes and elements, though none of them with the tag name "isindex", and the output even has an attribute, though it's not called "prompt", <script> having all kinds of wacky interactions with the event loop, crazy things happening with association of form controls to form elements, elements being literally moved in the DOM as the DOM is created... the list of crazy behaviours is long and esoteric

However, if we were to avoid the HTML parser and instead try to use an AST to generate the DOM, we would probably have to recreate the HTML parser in some way, shape, or form to handle misnested tags (such as <b>A <span style="color:blue">B</b> C</span> outputting <b>A <span style="color:blue">B</span></b> C).

Both solutions have their benefits and problems, so the question becomes which is going to be easier to work with and generate the least amount of unexpected results for the user?

Libraries like jQuery use the HTML parser and have for years, so maybe we should as well, to match what developers are already used to?

Render Fails with value set to ""

I noticed a problem with your code. In your example you have var min = 0, max = 99, disabled = true, heading = 1; With this the number input renders as expected, disabled. However, if you change the disabled value to false: var min = 0, max = 99, disabled = false, heading = 1;, the function crashes and nothing loads. This is because you have the following:

document.body.appendChild(html`<input type="number" min="${min}" max="${max}" name="number" id="number" class="number-input" ${ (disabled ? 'disabled' : '') }/>`);

That last part, ${ (disabled ? 'disabled' : '') } returns an empty string when disabled is set to false. That causes a problem on line 333 where you try to set an attribute on the node:

(tag || node).setAttribute(name, value);

When the substitution evaluates to an empty string, as in the case above with disabled set to false, the method attempts to use that empty string as an attribute name, throwing an error. You could wrap that call in a test for a truthy attribute name like so:

// add the attribute to the new tag or replace it on the current node
// setAttribute() does not need to be escaped to prevent XSS since it does
// all of that for us
// @see https://www.mediawiki.org/wiki/DOM-based_XSS
if (tag || hasSubstitution) {
    if (name) {
        (tag || node).setAttribute(name, value);
    }
}

This solves the problem, but you should probably be checking the attribute earlier to make sure it isn't an empty value and skip it.

How should we handle DOM clobbering

Continued from #9

How should we deal with the issue of DOM clobbering? Currently it breaks the substitution logic of a node's attributes, and could potentially break setting or removing attributes as well. @mozfreddyb suggested that we could do something similar to DOMPurify and check that each property we need hasn't been clobbered. I'm not sure what to do with the clobbered node though, since the substitution logic probably couldn't be run safely on said node.

Windows 10 Firefox 44 shifts the attribute NamedNodeMap when using removeAttribute

We had a failing test that revealed this bug, and I just confirmed it. When we loop over the attributes NamedNodeMap looking for substitutions, we use removeAttribute when replacing a placeholder attribute name (substitutionindex:1:) with its substitution name (disabled). However, in Firefox 44, using removeAttribute modifies the attributes NamedNodeMap, shifting the attribute out of the map and shifting all indexes by 1. This modification causes the for loop to skip an index, which caused the test to fail.

@domenic, @annevk is this behavior correct, or is this a bug in Firefox 44 (didn't happen in Windows 7, Firefox 43)? Do we still need to work around the behavior, possibly just keeping a list of attributes to remove after the loop is over so we don't do it in the loop?

let min = 0, max = 99;
let node = html`<input min="${min}" type="number" max="${max}"/>`

for (var i = 0; i < node.attributes.length; i++) {
  console.log(node.attributes);  // => [ type="number", max="99", min="0"]; 
  node.removeAttribute('type'); 
  console.log(node.attributes);  // => [ max="99", min="0" ];
}
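
The workaround floated above (collect names during the loop, remove them afterwards) might look like this. A sketch under assumptions: `removeMatchingAttributes` and the `isPlaceholder` callback are hypothetical names, and `node` only needs an indexable `attributes` collection plus `removeAttribute`, as DOM elements provide.

```javascript
// Hypothetical helper: never mutate `node.attributes` while iterating it.
function removeMatchingAttributes(node, isPlaceholder) {
  const toRemove = [];
  // Pass 1: read-only walk over the (possibly live) attribute collection.
  for (let i = 0; i < node.attributes.length; i++) {
    const name = node.attributes[i].name;
    if (isPlaceholder(name)) {
      toRemove.push(name);
    }
  }
  // Pass 2: remove by name, so index shifts no longer matter.
  toRemove.forEach((name) => node.removeAttribute(name));
  return toRemove;
}
```

Because removal happens by name after the walk has finished, the result is the same whether or not the browser updates the collection on each `removeAttribute` call.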

allow sub templates ?

I just got this case:

let me = html`<p>some text${insertImage(opts)}</p>`

function insertImage(opts) {
  if (opts.src) return html`<img src=${opts.src} />`
  else return '';
}

which will serialize the img node to [object HTMLImageElement]. Can't it be properly HTML-serialized?

allow values to be array of Nodes (or text), fragments

html`<div class="field" title="${schema.description || ''}">
	<label>${schema.title}</label>
	<select name="${key}" class="ui dropdown">
		${schema.oneOf.map(getSelectOption)}
	</select>
</div>`;

where getSelectOption returns a node, so the map returns an array of nodes.
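
Supporting this would mean normalising each substitution value before it is inserted. The sketch below shows one possible shape, assuming a hypothetical `flattenSubstitution` helper. It duck-types nodes via `nodeType` so that it runs outside a browser (in a browser, `value instanceof Node` would be the more direct check), and dropping `null` and `undefined` is an assumption rather than established behaviour.

```javascript
// Hypothetical helper: turn one substitution value into a flat list of
// items. Each item is either a node-like object (inserted as-is) or a
// string (which would later become a text node).
function flattenSubstitution(value) {
  if (Array.isArray(value)) {
    // Arrays, including nested ones, are flattened recursively.
    return value.reduce(
      (items, entry) => items.concat(flattenSubstitution(entry)),
      []
    );
  }
  if (value === null || value === undefined) {
    return []; // assumption: missing values render nothing
  }
  const isNodeLike =
    typeof value === 'object' && typeof value.nodeType === 'number';
  return [isNodeLike ? value : String(value)];
}
```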

please don't make SUBSTITUION_INDEX look like a typo :)

This is hard on the heart of reviewers, thank you.

How should we allow safeHTML

Currently any variable is safely encoded when placed inside a text node. However, we should definitely have some way for the user to mark that the HTML is trusted and that it should be rendered as DOM elements rather than text nodes. However we decide to do it, an attacker shouldn't be able to mark their own string as trusted (e.g. using an expression to do so).
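
One pattern that meets this requirement is to track trusted values by object identity instead of by any property an attacker could imitate. The following is a sketch of that idea only: `trustHtml` and `isTrustedHtml` are hypothetical names, not an existing or planned API of this library.

```javascript
// Module-private registry: only values created by trustHtml() are in it.
const trustedValues = new WeakSet();

// Hypothetical API: wrap markup that the developer vouches for.
function trustHtml(markup) {
  const token = Object.freeze({ markup: String(markup) });
  trustedValues.add(token);
  return token;
}

// A plain string or a look-alike object never passes this check, because
// membership depends on identity, not on the shape of the value.
function isTrustedHtml(value) {
  return typeof value === 'object' && value !== null && trustedValues.has(value);
}

const safe = trustHtml('<b>bold</b>');
isTrustedHtml(safe);                      // true
isTrustedHtml({ markup: '<b>bold</b>' }); // false
isTrustedHtml('<b>bold</b>');             // false
```

Since user input normally arrives as strings or parsed JSON, it cannot produce an object that is already in the registry unless developer code passes it to `trustHtml` explicitly.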

Easier boolean attributes

From the usage example, it looks like boolean attributes currently have to be manually handled by outputting either their name or an empty string, using a ternary.

This is a bit clumsy. It would be a lot more aesthetically appealing if you could write something like:

const el = html`<video controls?={showControls} url={url}></video>`;

That is, if the attr name is followed by a ?= operator, its value is checked as a boolean, and the attribute is either included or omitted accordingly.

(Any similar sort of syntax would work, of course.)
