This specification defines an interoperable means for site developers to asynchronously transfer small HTTP data from the User Agent to a web server.
Latest draft @ https://w3c.github.io/beacon/
See also Web performance README
Beacon
Home Page: https://w3c.github.io/beacon/
License: Other
Grammar typo in Privacy and Security section:
https://w3c.github.io/beacon/#privacy-and-security
Spec text:
Compared to the alternatives, the sendBeacon() does apply two restrictions: there is no callback method, and the payload size can be restricted by the user agent. Otherwise, the sendBeacon() API is not subject to any additional restrictions.
This part reads poorly:
... the sendBeacon() does apply ...
I'd expect the following: (like the later sentence in the section)
... the sendBeacon() API does apply ...
Following the thread at
http://lists.w3.org/Archives/Public/public-web-perf/2014Jul/0109.html
It seems to me that the security and privacy considerations should say that Beacon doesn't add extra security and privacy considerations in addition to the ones associated with form submissions in HTML. It should point to http://www.w3.org/TR/html5/introduction.html#fingerprint .
I'd like to see tests for 8a6fbdc and bugs filed against browsers. It's important to make sure that's all implemented correctly.
If for some reason this needs to change, please escalate to Fetch.
[[
User agents MUST honor the HTTP headers (including, in particular, redirects and HTTP cookie headers),
This seems to be new in this version of the spec and I don't understand the reasoning behind it. Why MUST user agents honor all response headers? If (as I believe most user agents do) a user agent typically ignores Set-Cookie headers from different origins, is that user agent non-conformant with Beacon? This requirement seems unlikely to be followed, as it would introduce privacy risks.
]]
https://lists.w3.org/Archives/Public/public-web-perf/2014Jul/0109.html
from @npdoty
See also http://www.w3.org/2014/10/28-webperf-irc
and http://www.w3.org/2015/10/webperf-tpac2015-minutes#item204
From https://w3c.github.io/beacon/#sec-sendBeacon-method
The user agent may delay transmission of provided data to optimize network and energy efficiency - e.g. deliver immediately if the network is active, or wait until network interface is active. However, the user agent should not delay transmission indefinitely and ensure that pending transmissions are periodically flushed even if there is no other network activity.
From https://w3c.github.io/beacon/#h-privacy
Similarly, from the privacy perspective, the resulting requests are initiated immediately when the API is called, or upon a page visibility change, which restricts the exposed information (e.g. user's IP address) to existing lifecycle events accessible to the developers. However, user agents might consider alternative methods to surface such requests to provide transparency to users.
The first quote allows user agents to delay requests (as long as the request eventually arrives, "should not delay transmission indefinitely"). This is at odds with the claim in the privacy section, which states that the requests are immediately triggered and that therefore the window of activity is too narrow for information leakage to occur.
As currently written, the following scenario is possible:
privacy-is-for-wimps.example.com
This scenario is specifically relevant to sendBeacon because it is currently the only API that can send arbitrary data to a server even after the user has all reason to believe that they completely left the website.
This class of problems can be solved by exploring possible information leaks and specifying that the pending request queue should be flushed when these scenarios occur. For instance, the above scenario can be resolved by emptying the queue of pending requests when the network interface changes and/or rejecting the sendBeacon call when the network interface is down.
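The mitigation proposed above can be sketched as a small model of a user agent's pending-beacon queue. This is only an illustration under stated assumptions: the class name `PendingBeaconQueue`, its methods, and the `transmit` callback are hypothetical and do not come from the Beacon specification or any browser implementation.

```javascript
// Hypothetical model of a user agent's pending-beacon queue. None of these
// names come from the Beacon specification; they only illustrate the proposal.
class PendingBeaconQueue {
  constructor(transmit) {
    this.transmit = transmit; // function (url, data) that performs the send
    this.pending = [];
    this.online = true;
  }

  // Mirrors the sendBeacon() return value: false means "not queued".
  // While the network interface is down the call is rejected.
  enqueue(url, data) {
    if (!this.online) return false;
    this.pending.push({ url, data });
    return true;
  }

  // Called when the network interface changes. Pending beacons are dropped,
  // not sent, so they cannot reveal the new network location.
  onNetworkChange(isOnline) {
    this.pending.length = 0;
    this.online = isOnline;
  }

  // Periodic flush on the network the beacons were queued on.
  flush() {
    for (const { url, data } of this.pending.splice(0)) {
      this.transmit(url, data);
    }
  }
}
```

In this sketch a beacon queued on one network is never transmitted on another, which is the property the scenario above needs; the cost is that such beacons are silently lost.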
Our tests at web-platform-tests/wpt#4024 show that Worker is only exposed in Microsoft Edge. It seems like perhaps it should be limited to Window.
FYI, the document contains 2 broken fragments:
[[
https://fetch.spec.whatwg.org/#force-origin-header-flag (line 843)
https://fetch.spec.whatwg.org/#concept-request-context (line 845)
]]
Per https://bugzilla.mozilla.org/show_bug.cgi?id=1329298 it seems like implementations are not following Fetch for ArrayBuffer and instead use a different Content-Type which would also be problematic for cross-origin requests.
Recommendation:
As @jakearchibald pointed out elsewhere, the fact that navigator.sendBeacon() can in theory happen after a tab has been closed or a browser has restarted means that it can reveal your new location.
E.g., I visit evil.com, it uses navigator.sendBeacon(), I close the tab and then close my laptop, travel for twenty hours, open my laptop, the browser transmits the beacon, evil.com knows where I am. This is bad.
The privacy considerations need to say that if I switch networks (to the level of cell towers I suspect or some kind of language that works for that) the beacons are removed.
With ReSpec's xref, the dependencies section is no longer needed (or desirable).
Although Privacy and Security is marked as non-normative, this paragraph seems to use normative language:
Compared to the alternatives, the sendBeacon does apply two restrictions: there is no callback method, and the payload size may be restricted by the user agent. Otherwise, the sendBeacon API is not subject to any additional restrictions. The user agent should not skip or throttle processing of sendBeacon calls, as they may contain critical application state, events, and analytics data. Similarly, the user agent should not disable sendBeacon when in "incognito" or equivalent mode, both to avoid breaking the application and to avoid leaking that the user is in such mode.
If these are intended as normative, interoperability requirements, they should be marked as such. It seems like it might be more useful, rather than telling user agents that they can't disable Beacon in a private browsing mode, to simply describe the implications of disabling sendBeacon (potentially broken functionality and potential leakage of the fact that the user was browsing in such a mode).
Also, "incognito" seems specific to a single browser vendor. Maybe "a private browsing mode"?
I see that the Beacon specification indicates that the Fetch request used by sendBeacon() should follow redirects. Is this something that we know is required? Are people relying on this?
My understanding is that Beacon is supposed to be a very lightweight networking API. This is important given that beacon requests can outlive the page. In WebKit, one of the aspects of being lightweight would be to not require our WebProcess to stay alive. This means we would hand off the request to our NetworkProcess and that's it. The NetworkProcess would take things over from there.
However, support for redirects adds a lot of complexity because:
Doing all this while not requiring our WebProcess to stay alive is not trivial. This means that for us, sendBeacon() is not that lightweight anymore.
For this reason, I wanted to start a discussion here to get feedback from editors and other implementors on this aspect of the API.
[[
What are the security considerations of this document? Is there an origin-restriction on the POST URL? Should one be recommended? Does making background POST requests to other origins including sending credentials provide an increased risk of CSRF attacks? (Maybe this risk is identical to the existing risk of submitting POST forms to other origins.) Are cross-origin POST requests with credentials necessary to satisfy the purpose of the Beacon specification? If not, why add the attack surface? I understand the group has already discussed using POST vs. GET, even though this is a request that may be repeated under error conditions. But use of POST also expands the methods attackers have for conducting CSRF attacks, since many server operations will require POST.
The CORS specification is listed in the References, but doesn't seem to be referred to in the text of the specification. Are user agents intended to follow the CORS cross-origin request model when making a beacon request to a different origin? If so, is preflight required because of the non-simple Beacon-Age header?
If you haven't already, I suspect it would be worthwhile to follow up with the Web Security Interest Group or the Web Application Security Group to check with them about the potential CSRF threat and the use of CORS.
]]
https://lists.w3.org/Archives/Public/public-web-perf/2014Jul/0109.html
from @npdoty
See also http://www.w3.org/2014/10/28-webperf-irc
and http://www.w3.org/2015/10/webperf-tpac2015-minutes#item204
http://w3c-test.org/beacon/headers/header-referrer-no-referrer-when-downgrade.https.html
assert_equals: Correct referrer header result expected "" but got "http://w3c-test.org/beacon/headers/header-referrer-no-referrer-when-downgrade.https.html"
at Anonymous function (http://w3c-test.org/beacon/headers/header-referrer.js:13:7)
[[
What are the privacy considerations of this document? For example, do users want or expect their agents to communicate data after they leave a page or close a window? Does the API give users control over this functionality or will that be handled by UA/site implementations outside the protocol? Is there a recommendation for how this data is handled when a user has toggled a private browsing mode?
Perhaps more specifically: will users be able to inspect requests made after a page is unloaded? For this and other instances (perhaps this would also apply to some APIs around service workers) where UAs make requests not associated with a visible window or tab, what guidance can we give implementers on enabling transparency or control?
Perhaps a section dedicated to privacy and security considerations would be helpful.
]]
https://lists.w3.org/Archives/Public/public-web-perf/2014Jul/0109.html
from @npdoty
See also http://www.w3.org/2014/10/28-webperf-irc
and http://www.w3.org/2015/10/webperf-tpac2015-minutes#item204
Currently, we report data using new Image().src = 'xxx?a=b', so our company's data center only supports parameters passed via the GET method. Beacon currently only supports the POST method, so I can't use Beacon.
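For context, the image-based GET reporting described above might look like the following. This is a minimal sketch, not a recommendation: the helper names `buildReportUrl` and `reportViaGet` and the example endpoint are hypothetical, and the sketch assumes the endpoint has no existing query string.

```javascript
// Builds the GET URL used by image-based reporting.
// Assumes `endpoint` does not already contain a query string.
function buildReportUrl(endpoint, params) {
  const query = new URLSearchParams(params).toString();
  return query ? `${endpoint}?${query}` : endpoint;
}

// Hypothetical helper: issues the report as a GET via an image request,
// because sendBeacon() always uses POST.
function reportViaGet(endpoint, params) {
  const url = buildReportUrl(endpoint, params);
  if (typeof Image !== 'undefined') {
    new Image().src = url; // only runs in a browser
  }
  return url;
}
```

A collector that only parses GET query parameters can keep working with this pattern, but it does not get the delivery behaviour that sendBeacon() is meant to provide during page unload.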
We created a new HTTP header for Beacon-Age. We should consider registering it using
http://tools.ietf.org/html/rfc3864#section-4.3
If we want a permanent status, we will need to define the ABNF, etc.; see http://httpwg.github.io/specs/rfc7231.html#considerations.for.new.header.fields.
This is not a blocker for CR.
The user agent should restrict the maximum data size to ensure that beacon requests are able to complete quickly and in a timely manner.
The sendBeacon method returns true if the user agent is able to successfully queue the data for transfer. Otherwise it returns false.
(note) If the user agent limits the amount of data that can be queued to be sent using this API and the size of data causes that limit to be exceeded, this method returns false. A return value of true implies the browser has queued the data for transfer.
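From a page author's point of view, the return value described above is the only signal that a payload was refused. A minimal sketch of handling it follows; the helper names `queueBeacon` and `utf8ByteLength` and the `onRejected` callback are hypothetical, and `nav` is passed in only so the helper can be exercised outside a browser.

```javascript
// Size of a string payload in bytes once encoded as UTF-8.
function utf8ByteLength(text) {
  return new TextEncoder().encode(text).length;
}

// Hypothetical wrapper around sendBeacon(). `nav` is a navigator-like object.
// If the user agent refuses to queue the data (for example because a size
// limit was exceeded), `onRejected` decides what to do with the payload.
function queueBeacon(nav, url, payload, onRejected) {
  const queued = nav.sendBeacon(url, payload);
  if (!queued) {
    onRejected(url, payload);
  }
  return queued;
}

// Browser usage (the '/collect' endpoint is an assumption):
// queueBeacon(navigator, '/collect', data, (u, p) => console.warn('not queued', utf8ByteLength(p)));
```

A return value of true only means the data was queued; it says nothing about whether the server eventually received it.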
As of today, the spec does not specify a maximum data size limit. This was an intentional decision when we were iterating on the early drafts; we wanted to allow UAs to experiment and adjust the limits in the future as needed. Fast forward a few years...
Technically, all implementations are "spec compliant", because we did not spell out the limit or exact conditions for how it should be enforced. However, the differences are also causing failures in Chrome for the proposed web-platform-tests: web-platform-tests/wpt#4024 (comment). Some thoughts and options...
a) I don't think we should spec a hard limit; I think it still makes sense to allow UAs to adjust this if and when needed. However, as a matter of general guidance to developers, I think we probably should add a non-normative note in the doc indicating that multiple browsers picked 64KB for their initial implementation.
b) Given that we don't spec a hard limit, I don't think web-platform-tests should hardcode 64KB either. It's good to have tests for "super large payloads are not allowed", but perhaps we can just change that number to something much larger in our tests, e.g. >=1MB. That way, if an existing implementation decides to increase their limit, they're not immediately greeted with a bunch of test failures.
c) sendBeacon adoption has been growing steadily in Chrome, and our telemetry for quota exceeded shows that very few pages hit the 64KB quota; I've not heard developer complaints about it, so far at least. As such, I'm inclined to say that this behavior shouldn't be treated as a test failure either. It may be the case that it's a bit too restrictive (e.g. an SPA that sticks around for hours or days and uses sendBeacon under the hood), and we might want to revisit this behavior in the future (e.g. by raising the quota limit; only applying the quota requirement to beacons that are queued while the page is unloading; switching to the same per-beacon enforcement as Edge/FF; ...), but I think it's a reasonable implementation under current spec language and should be allowed, unless we can point to specific cases where it breaks. My proposal for this one is:
Additional notes and related discussions:
Ilya created a very useful test at:
http://output.jsbin.com/suxagi/latest/quiet
It demonstrates a cross-origin redirect encountering CORS in Firefox/Microsoft Edge, which is correct per the Fetch spec at the time this issue was created: HTTP Fetch 5.3, point 5.4 instructs a network error to occur per the current Beacon/Fetch specs.
Chrome does not follow the current specs and allows the redirect.
There are some large web properties evaluating a move to sendBeacon and this issue was encountered by them in private communication with Ilya and he raised it with each browser vendor. As it seems to require a spec update, I've moved the issue to the spec issues list.
The question for the specs is this:
Is the intent of the ACAO header to protect only the body? (Which seems to require a Beacon/Fetch spec update)
OR is it intended to protect redirects as well? (As written)
Should sendBeacon set the FetchRequest's cacheMode to "no-cache"?
Based on code introspection, Blink seems to bypass the cache by adding "max-age=0" Cache-control header on the request. I don't know about Gecko.
Seems to me that sendBeacon() should bypass the cache.
The introduction notes that Beacon will "ensure" that data is delivered to the destination. Is this what's intended? It seems to provide a higher level of guarantee than I expected, especially given that Beacon is explicitly noted as being de-prioritized in some cases for performance reasons. And for privacy reasons, I expect there may be some situations where the spec will state that a request shouldn't be delivered or where a user will choose not to deliver. Reliable delivery is important to encourage sites to use sendBeacon as opposed to painful onunload measures, but I thought we were noting this as best effort rather than a sure thing.
(Apologies for opening an issue on a small point, but I think it's important.)
In the beacon spec. in the introduction section in example 1 the coding line
if (document.visiblityState === 'hidden') contains a typo: visiblityState should read visibilityState
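With the spelling corrected, the pattern the example is aiming for can be sketched as below. The helper name `installHiddenBeacon`, the `/log` endpoint in the usage comment, and the `getData` callback are assumptions for illustration; `doc` and `nav` are passed in so the logic can be exercised outside a browser.

```javascript
// Sends a beacon whenever the document becomes hidden.
// `doc` is a document-like object and `nav` a navigator-like object.
function installHiddenBeacon(doc, nav, url, getData) {
  doc.addEventListener('visibilitychange', () => {
    // Note the spelling: visibilityState, not visiblityState.
    if (doc.visibilityState === 'hidden') {
      nav.sendBeacon(url, getData());
    }
  });
}

// Browser usage (endpoint and payload are assumptions):
// installHiddenBeacon(document, navigator, '/log', () => JSON.stringify({ t: Date.now() }));
```

With the misspelled property the comparison is against undefined, so the condition is never true and no beacon is sent, which is why the typo matters in example code.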
Sending a beacon should create a resource timing entry. To make that formal in the spec, the integration with FETCH needs to be clearer.
http://w3c-test.org/beacon/headers/header-content-type.html
In 'Test content-type header for a body string', the test seems to require that the performance entry for sendBeacon is NOT exposed to the performance timeline when the sendBeacon call is made.
In Microsoft Edge, the operation is so fast (or completing the sendBeacon call is so slow) that the entry IS exposed to the timeline.
The test should be updated to remove the line I've commented below:
function testContentTypeHeader(what, contentType, title) {
  function wait(ms) {
    return new Promise(resolve => step_timeout(resolve, ms));
  }
  promise_test(async t => {
    const id = self.token();
    const testUrl = new Request(RESOURCES_DIR + "content-type.py?cmd=put&id=" + id).url;
    assert_true(navigator.sendBeacon(testUrl, what), "SendBeacon Succeeded");
    assert_equals(performance.getEntriesByName(testUrl).length, 0); //REMOVE
Currently, the Beacon standard doesn't allow specifying HTTP headers in the request (except for Content-Type when the Blob API is used).
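The Content-Type exception mentioned above can be sketched as follows: the type given to a Blob is what ends up as the request's Content-Type. The helper name `makeJsonBeaconBody` and the `/collect` endpoint in the usage comment are hypothetical.

```javascript
// Wraps a value in a Blob whose type is used as the Content-Type of the
// beacon request. No other request header can be set this way.
function makeJsonBeaconBody(value) {
  return new Blob([JSON.stringify(value)], { type: 'application/json' });
}

// Browser usage (endpoint is an assumption):
// navigator.sendBeacon('/collect', makeJsonBeaconBody({ event: 'click' }));
```

Note that application/json is not a CORS-safelisted Content-Type, so behaviour for cross-origin beacons with such a Blob may differ between implementations.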
We'd like to specify additional headers in the HTTP beacon (for example, when using zero-rating to indicate to ISPs that a request should not count against a user's data transmission quota).
This would require updating:
httpHeaders to include object whose key-value pairs are included in the request
headerList logic to start out by populating list from httpHeaders, if non-null
Many applications need to report activity, state, and analytics data in response to user interactions and various app-specific events. Such requests are also typically delay-tolerant (as long as the delay is relatively small), because they are not fetching data required to update the UI, etc.
On mobile devices coalescing network access can have significant impact on improving battery life: waking up the radio incurs a lot of overhead regardless of size (due to timeout logic in the controller that keeps the radio active for some period of time), and coalescing multiple requests to fire at once would help amortize that cost and reduce overall energy footprint.
Implementing this kind of coalescing in app-space is hard: it requires explicit coordination between all actors that initiate fetches (third party scripts, iframes, etc); there isn't sufficient information to infer whether radio is active or not -- e.g. multiple apps/pages running on the device can't coordinate effectively.
Putting all of this together, it seems like it would be helpful to provide some form of a signal to the platform that a particular request is delay-tolerant (for some small delay value - e.g. ~60s) and may be coalesced with other requests. In turn, the platform could keep track of such requests, group them into batches, and periodically "flush them".
Perhaps, there should be a flag or attribute on Fetch API that allows us to mark a particular request as delay-tolerant?
This came up because enabling coalescing is one of the goals of Beacon API. But, at the same time, and for best results it should not be restricted to Beacon only. Further, in terms of layering, it seems like Fetch is the right place to define this type of functionality: it can observe all requests from all contexts, coalesce them effectively, flush them, etc.
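To make the batching idea concrete, here is a minimal app-space sketch of coalescing delay-tolerant reports. It is only an approximation within a single page: as noted above, a page cannot see radio state or coordinate with other pages, which is the argument for doing this in the platform. The class name `DelayTolerantBatcher`, the `send` callback, and the injectable `schedule` function are hypothetical.

```javascript
// Collects delay-tolerant items and hands them to `send` in one batch,
// at most `maxDelayMs` after the first item of the batch was added.
class DelayTolerantBatcher {
  constructor(send, maxDelayMs = 60000, schedule = (fn, ms) => setTimeout(fn, ms)) {
    this.send = send;           // function (items[]) that transmits one batch
    this.maxDelayMs = maxDelayMs;
    this.schedule = schedule;   // injectable timer, defaults to setTimeout
    this.items = [];
    this.timerSet = false;
  }

  add(item) {
    this.items.push(item);
    if (!this.timerSet) {
      this.timerSet = true;
      this.schedule(() => this.flush(), this.maxDelayMs);
    }
  }

  // Sends everything collected so far as a single batch, if any.
  flush() {
    this.timerSet = false;
    if (this.items.length === 0) return;
    this.send(this.items.splice(0));
  }
}

// Browser usage (endpoint is an assumption):
// const batcher = new DelayTolerantBatcher(
//   (items) => navigator.sendBeacon('/collect', JSON.stringify(items)));
```

The ~60s default mirrors the delay value suggested in the discussion; a platform-level mechanism could instead flush whenever the radio is already active.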
It would be interesting to consider an option for the beacon API where you could request the payload to be compressed before being sent.
With the sendBeacon() API today, if you want to do some sort of compression on the data before sending it, you would use the Compression Stream API, which is an async-only API.
Here's an example of doing this at e.g. page load time:
async function compressBlob(data) {
  const stream = new Response(data).body
    .pipeThrough(new CompressionStream('deflate'));
  return new Response(stream).arrayBuffer();
}
(async function() {
  var data = JSON.stringify(performance.getEntries());
  // this will send OK
  var dataGz = await compressBlob(data);
  navigator.sendBeacon('/beacon?load', dataGz);
}());
Unfortunately if you wanted to do this at pagehide/beforeunload/unload, you can't utilize the Compression Stream API since it is async. You would be waiting for the stream callback (await), but by then the page would be unloaded:
(async function() {
  window.addEventListener("pagehide", async function() {
    var data = JSON.stringify(performance.getEntries());
    // this will send OK
    navigator.sendBeacon('/beacon?before-await', data);
    // this will not send due to await
    var dataGz = await compressBlob(data);
    navigator.sendBeacon('/beacon?after-await', dataGz);
  });
}());
If we could ask the browser to compress the payload before sending, it could look something like this:
navigator.sendBeacon('/beacon?after-await', data, {
  compress: "deflate"
});
And we could easily get it compressed from unload-style events (assuming the browser handles that async compression and beaconing later).
[[
Some requirements are placed on "the User Agent" and others on "user agents"; consistency would be better.
Sections 1 and 4.1 (both Introductions) seem duplicative. In both cases, this sentence is first:
The Beacon specification defines an interface that web developers can use to asynchronously transfer small HTTP data from the User Agent to a web server.
Nothing in the specification limits the size of the data sent. In fact, analytics data aggregated over an entire session (since unloading seems like the primary use case) might be quite large. If the purpose is specific to small amounts of data -- which might itself address the basic privacy principle of data minimization -- then those requirements should be specified so that implementations have the same restrictions and sites are aware of them.
I think it would be more correct to refer to transferring data via HTTP rather than "HTTP data".
Web developers already have interfaces for asynchronously transferring data via HTTP. For example, XMLHttpRequest, as you note. Perhaps a better summary would be: "This specification defines an interface that web developers can use to asynchronously transfer data from the user agent to a web server during or after the unloading of a page."
]]
https://lists.w3.org/Archives/Public/public-web-perf/2014Jul/0109.html
from @npdoty
See also http://www.w3.org/2014/10/28-webperf-irc
and http://www.w3.org/2015/10/webperf-tpac2015-minutes#item204
As pointed out in #34 this section still assumes CORS throughout, while sometimes CORS is not used to avoid blocking on redirects.
From an older thread: https://lists.w3.org/Archives/Public/public-web-perf/2014Jan/0003.html ...
When using beacon against a cross origin, is it clear that we might need to do a preflight for content types that are not specified by CORS? For example, if you send an ArrayBuffer, we are probably going to need to preflight. This is clear when reading the CORS spec, but it isn't obviously clear from the beacon spec that there might be two HTTP requests for a beacon: one for the OPTIONS and one for the POST. I am not sure if we can or should relax this or not. Maybe we should be more explicit about which MIME types in the beacon specification don't cause the OPTIONS request.
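As a rough illustration of which MIME types avoid the OPTIONS request, the check below tests a Content-Type value against the three CORS-safelisted types. It is a simplified sketch: the function name `contentTypeNeedsPreflight` is hypothetical, and the full Fetch definition of a CORS-safelisted request-header includes further conditions (such as limits on the header value) that are not modelled here.

```javascript
// The three Content-Type essences that are CORS-safelisted.
const SAFELISTED_CONTENT_TYPES = new Set([
  'application/x-www-form-urlencoded',
  'multipart/form-data',
  'text/plain',
]);

// Simplified check: returns true when a cross-origin request carrying this
// Content-Type would need a CORS preflight because of the Content-Type alone.
function contentTypeNeedsPreflight(contentType) {
  if (!contentType) return false; // no Content-Type header to evaluate
  const essence = contentType.split(';')[0].trim().toLowerCase();
  return !SAFELISTED_CONTENT_TYPES.has(essence);
}
```

Under this simplified view, string and FormData payloads map to safelisted types, while a Blob typed as something like application/json would trigger the extra request.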
Comment from @bzbarsky: https://lists.w3.org/Archives/Public/public-web-perf/2014Jan/0008.html ...
The way Beacon currently invokes "make a cross-origin request" is weird: it doesn't pass the parameters that operation needs.
It's not clear to me what happens when "mime type" ends up null at the end of step 7 in the processing model. Nothing really seems to define the behavior in this situation; I suspect that in practice step 7 should ensure that this situation never happens (that is, that a mime type is always defined by the time that step finishes).
data parameter is allowed to be one of ArrayBufferView, Blob, DOMString, or FormData. In our Processing Model, we should make sure we always set the Content-Type...
Content-Type to?
type attribute is omitted, what should we set the Content-Type to?
We need to track w3ctag/design-reviews#23
Firefox does not comply with step 8 of the processing model algorithm, specifically:
[[
Append an Accept header with */* as the value to headerList.
]]
Instead, it uses its own default setting. Should we consider it a Firefox bug, or should we update the spec to allow for browser-specific/user settings?
https://fetch.spec.whatwg.org/#keep-alive-flag was apparently designed for sendBeacon but is not used here.
[needs pointer to those notes!]
If I understand the processing model [1] correctly:
As a result, my understanding would be that a CORS-preflight should be done. Based on my local testing, I don't think Blink does such preflight.
[1] https://w3c.github.io/beacon/#sec-processing-model
[2] https://fetch.spec.whatwg.org/#concept-bodyinit-extract
/cc @yutakahirano
https://w3c.github.io/beacon/#sec-processing-model
corsMode is declared in one of the substeps of step 4 but referenced from step 5. Instead, it should be declared at the top level and modified by step 4.
I also think step 5 should be part of step 4.
headerList should be initialized as an empty header list, not null.
So, it should be:
no-cors".
Content-Type/mimeType is not a CORS-safelisted request-header, set fetchMode to "cors".
Content-Type/mimeType to headerList.
A large number of existing RUM solutions depend on GET via XHR or the image tag. To ensure that sendBeacon can be used by these existing solutions, should we update it to support GET?
I think it is possible to add without significant change to the API surface if we mimic the Fetch WebIDL.
If thoughts are positive, I'll put together a pull request. Thoughts?
function trackClickAndForward() {
  navigator.sendBeacon("https://www.example.com/tracker");
  window.location.href = "https://www.example.com/homePage";
}
Is it guaranteed that sendBeacon's analytics request to 'www.example.com/tracker' will not include any cookies that may be set on the *.example.com domain from the "www.example.com/homePage" response?
Grammar typo in the note at the end of the Introduction.
https://w3c.github.io/beacon/#introduction
Developers should avoid relying on unload event because it will not fire whenever a page in background state (i.e. visiblityState equal to hidden) and the process is terminated by the mobile OS.
This part reads poorly:
... not fire whenever a page in background state ... and the process is terminated ...
This should probably read:
... not fire whenever a page is in a background state ... and the process is terminated ...
Typo in Processing Model. The Fetch step is missing a <dd> for origin's value, which should be origin.
https://w3c.github.io/beacon/#sec-processing-model
Spec markup:
<dl>
<dt><a data-link-type="dfn" href="https://fetch.spec.whatwg.org/#concept-request-method">method</a></dt>
<dd><code>POST</code></dd>
<dt><a href="#request-url">url</a></dt>
<dd><var>parsedUrl</var></dd>
<dt><a data-link-type="dfn" href="https://fetch.spec.whatwg.org/#concept-request-header-list">header list</a></dt>
<dd><var>headerList</var></dd>
<dt><a href="#request-origin">origin</a></dt>
<dt><a data-link-type="dfn" href="https://fetch.spec.whatwg.org/#keep-alive-flag">keep-alive flag</a></dt>
<dd><code>true</code></dd>
<dt><a data-link-type="dfn" href="https://fetch.spec.whatwg.org/#concept-request-body">body</a></dt>
<dd><var>transmittedData</var></dd>
<dt><a data-link-type="dfn" href="https://fetch.spec.whatwg.org/#concept-request-mode">mode</a></dt>
<dd><var>corsMode</var></dd>
<dt><a data-link-type="dfn" href="https://fetch.spec.whatwg.org/#concept-request-credentials-mode">credentials mode</a></dt>
<dd><i>include</i></dd>
</dl>
Namely:
<dt><a href="#request-origin">origin</a></dt>
Should be:
<dt><a href="#request-origin">origin</a></dt>
<dd><i>origin</i></dd>
Note that origin is defined at step 2.
The fetch spec defines the context "beacon" for this feature, but the spec currently uses "ping". I'm modifying Gecko to fetch the URL as "beacon".
cc @annevk
The sendBeacon() spec specifies that UAs must enforce a maximum data size, but it doesn't say what that data size is or how they should enforce it.
If someone were to write a new implementation of sendBeacon(), they'd want the information that's in #38 about the actual limits that are enforced -- otherwise sendBeacon() calls that sites expect to work might not work in that implementation.
I'd think that the information you need to build a working implementation ought to be in the spec.
Folks shouldn't use beacon with unload since it's not reliable. We should give some guidelines in the spec.
Discussion: #50 (comment)
http://w3c-test.org/beacon/beacon-navigate.html was broken by @Brandr0id when he added new tests. Opening this issue to track a fix.
[[
We should set the "Content-Type" header as an author header, and the "omit credentials mode" flag to "never" when invoking the fetch algorithm.
]]
http://lists.w3.org/Archives/Public/public-web-perf/2014Feb/0025.html