Comments (1)

tunetheweb avatar tunetheweb commented on July 22, 2024

As discussed on Slack, I think these limitations are fine and similar to what we had last year.

The problem with the script is that it's detecting every importScripts(), even inside libraries, like Workbox. The good thing is that since importScripts() is a worker method, it doesn't match client-side libraries, so we might be able to filter this better.

The intent of the importScripts query is to answer “How are developers using PWAs? Are they writing it from scratch or using libraries and, if so, which ones?” So if Workbox is using some other library then the page is (indirectly) using that too, so it's good to track. So I don't think this is an issue.

Yes there is a concern if this pattern was used:

if (some test) {
  importScripts(...)
}

And the page doesn't actually pass that test, so it never loads that library. But I'd imagine that's rare, and arguably the page is using that library even though it's not actually using it - if that contradictory message makes sense 😁
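To make the conditional case concrete, here is a minimal sketch of static text matching for importScripts() calls. It is not the actual pwa.js code: the helper name findImportScripts, the regular expressions, and the sample script are all assumptions for illustration. Because it only looks at the script text, it reports the import inside the if block whether or not that branch would ever run.

```javascript
// Hypothetical helper, NOT the actual pwa.js implementation.
// Returns every string literal passed to importScripts() in a script body.
function findImportScripts(source) {
  const urls = [];
  // Match each importScripts(...) call and capture its argument list.
  const callPattern = /importScripts\s*\(([^)]*)\)/g;
  // Match quoted string literals inside that argument list.
  const stringPattern = /(['"`])(.*?)\1/g;
  for (const call of source.matchAll(callPattern)) {
    for (const literal of call[1].matchAll(stringPattern)) {
      urls.push(literal[2]);
    }
  }
  return urls;
}

// Made-up service worker source with one unconditional and one conditional import.
const sample = `
  importScripts('https://example.com/workbox-sw.js');
  if (self.someTest) {
    importScripts("/optional-lib.js", '/other.js');
  }
`;

console.log(findImportScripts(sample));
// → [ 'https://example.com/workbox-sw.js', '/optional-lib.js', '/other.js' ]
```

A sketch like this cannot tell which branch executes, and it would miss URLs built from variables, which matches the limitation described above.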

Something similar happens with the service worker events and properties (output at $.data.runs[1].firstView.pwa.swEventListenersInfo and $.data.runs[1].firstView.pwa.swPropertiesInfo). This one is matching all events including those written inside libraries (e.g. Workbox's install) and, in many cases, the onmessage listener, which can be used in client-side libraries as well.

Again I think that's fine. Workbox in particular breaks its libraries into small files so we'll only find the ones for Workbox functionality we use - even if we don't use them all. And again arguably by importing that code we are "using" those events - or at least having access to use them. If we remove them from the web platform we'd likely need to rewrite this code so that's a "use" in my convoluted, contrived mind here to answer this concern 😁
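For illustration, a rough sketch of why onmessage shows up outside service workers. This is not the real pwa.js logic: the helper findEventListeners, the SW_EVENTS list, and both sample scripts are assumptions. It matches addEventListener('…') registrations and on<event> = assignments purely by text.

```javascript
// Hypothetical helper, NOT the actual pwa.js implementation.
// Event names of interest; this list is an assumption for the example.
const SW_EVENTS = ['install', 'activate', 'fetch', 'message', 'push', 'sync'];

function findEventListeners(source) {
  const found = new Set();
  // addEventListener('install', ...) style registrations.
  const listenerPattern = /addEventListener\s*\(\s*['"`](\w+)['"`]/g;
  // self.oninstall = ... or worker.onmessage = ... style assignments
  // (the lookahead skips == and === comparisons).
  const propertyPattern = /\bon(\w+)\s*=(?!=)/g;
  for (const pattern of [listenerPattern, propertyPattern]) {
    for (const match of source.matchAll(pattern)) {
      if (SW_EVENTS.includes(match[1])) found.add(match[1]);
    }
  }
  return [...found].sort();
}

// Made-up service worker script.
const swSource = `
  self.addEventListener('install', () => {});
  self.addEventListener("fetch", (e) => {});
  self.onmessage = (e) => {};
`;

// Made-up client-side script that talks to a plain web worker.
const clientSource = `
  const w = new Worker('w.js');
  w.onmessage = (e) => console.log(e.data);
`;

console.log(findEventListeners(swSource));     // → [ 'fetch', 'install', 'message' ]
console.log(findEventListeners(clientSource)); // → [ 'message' ]
```

The second result is the client-side match in question: text matching alone cannot tell a web worker's onmessage from a service worker's.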

We can see how many message events are logged for non-service worker pages after the run and decide whether this is a concern or not. Suspect it won't be.

Matching only importScripts() and service worker events inside the service worker file.

As Rick left in one comment in the script: "We should use serviceWorkerInitiatedURLs here SW detection but it has some false negatives".

I'm a bit concerned that these false negatives might actually be quite frequent (minification might be one cause, as Rick mentioned here).

Here are some example sites that have service workers, where the test returns an empty serviceWorkers field but has values for swEventListenersInfo:

Those are weird. Had a look at them myself and can't see how the service worker is registered! So not surprised the custom metric can't figure this out. I think we can accept the limitation of this for really obfuscated code.

As suggested on Slack, we can also look at sites that register service worker event listeners (e.g. install, fetch...etc.) and see how many pages with those don't have the serviceWorkers object defined to see the scale of the problem. Can then decide if we need to include those pages or, if it's small enough, to just ignore.
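A sketch of that check, under loud assumptions: the page records below are invented, and the shapes of serviceWorkers and swEventListenersInfo (objects keyed by script URL) are guesses for the example rather than the confirmed custom metric output. The real analysis would more likely be a query over the crawl results.

```javascript
// Hypothetical, simplified page records; the real output may be shaped differently.
const pages = [
  { url: 'https://a.example/',
    pwa: { serviceWorkers: { 'https://a.example/sw.js': '/' },
           swEventListenersInfo: { 'https://a.example/sw.js': ['install', 'fetch'] } } },
  { url: 'https://b.example/', // listeners found, but no service worker detected
    pwa: { serviceWorkers: {},
           swEventListenersInfo: { 'https://b.example/x.js': ['fetch'] } } },
  { url: 'https://c.example/', // only a message listener, likely client-side code
    pwa: { serviceWorkers: {},
           swEventListenersInfo: { 'https://c.example/app.js': ['message'] } } },
  { url: 'https://d.example/', pwa: { serviceWorkers: {}, swEventListenersInfo: {} } },
];

// True when the object exists and has at least one key.
function hasEntries(obj) {
  return obj != null && Object.keys(obj).length > 0;
}

// True when any script on the page registers one of the given events.
function listensFor(page, events) {
  const info = page.pwa.swEventListenersInfo || {};
  return Object.values(info).some((list) => list.some((e) => events.includes(e)));
}

// Counts pages with the given listeners, and how many of those lack serviceWorkers.
function countMissedServiceWorkers(allPages, events) {
  const withListeners = allPages.filter((p) => listensFor(p, events));
  const missed = withListeners.filter((p) => !hasEntries(p.pwa.serviceWorkers));
  return { withListeners: withListeners.length, missed: missed.length };
}

console.log(countMissedServiceWorkers(pages, ['install', 'activate', 'fetch']));
// → { withListeners: 2, missed: 1 }
```

Passing ['message'] instead would give a rough answer to the earlier question about message events on non-service worker pages.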

WPT also detected they were service worker calls as they are in blue. So could also ask @pmeenan how it does that and use that potentially?

So, on one hand, if we end up taking into account only the service worker URLs, we might be able to reduce the noise in the results, but, given that there are potentially so many false negatives, it might not be a good idea.

I think that because we are mostly looking for service worker specific methods and text, we should look at everything like we did last year otherwise we exclude too much from importScripts. I think the likelihood of false positives is reasonably small and the risk of false negatives if we limited just to the main service worker.js is much larger.

I'm pretty new to all this, and I wanted to see if we could get these things done by Monday, when the crawl takes place. Based on these early results, though, I think we might need to discuss some of the limitations of the pwa.js script a little bit more.

As discussed, I think your changes are good to propose in a PR now. Not sure when @rviscomi or @OBTo will get a chance to look at this since it's a weekend (and a holiday weekend at that!), but even if it doesn't make the start of the June crawl, we can still hopefully get it in for some of that crawl to give us enough to have a look at and make any amendments before the main July crawl we will use for the Web Almanac.

from legacy.httparchive.org.
