Comments (4)

Cherry commented on May 19, 2024

I've had to tackle something like this in the past for someone, with very specific caching rules for different files in a project, set by Gatsby. For example:

/* Cache-Control templates used for caching long-term, or instructing browsers not to cache */
// We use `immutable` for supported browsers so this is cached perpetually, since URLs are cache-busted
// Because of this, we can't just use the `browserTTL` option in kv-asset-handler, since it doesn't support this
const CACHE_CONTROL_FOREVER = 'public, max-age=31536000, immutable';

// We use `max-age=0` and `must-revalidate` instead of a blanket `no-cache` so the client can be smart about ETag/Last-Modified validation
const CACHE_CONTROL_NEVER = 'public, max-age=0, must-revalidate';

// ... more bootstrap, regex, etc.

async function handleEvent(event) {
  const url = new URL(event.request.url);

  // Setup default kv-asset-handler options for browser and edge-based caching
  // Cache everything on the edge for 2 days by default, because these will be cleared on deploy
  // But _don't_ cache on the browser at all - browserTTL not set, `cache-control` header not sent
  // These defaults are safe and won't cause any issues with caching or new deploys. We'll override specific headers below where we can
  let options = {
    cacheControl: {
      edgeTTL: 2 * 60 * 60 * 24, // 2 days,
      browserTTL: null,
    },
  };

  try {
    if (DEBUG) {
      // customize caching
      options.cacheControl = {
        bypassCache: true,
      };
    }

    const originalResponse = await getAssetFromKV(event, options);
    const response = new Response(originalResponse.body, originalResponse);

    // Gatsby has pretty strict documentation on the best way to cache aspects of the site
    // If these aren't followed, weird side-effects with caching can occur, especially with a service worker
    // Reference: http://gatsbyjs.org/docs/caching
    if (
      url.pathname.startsWith('/static') ||
      (url.pathname.match(MATCH_JS_AND_CSS) && url.pathname !== '/sw.js')
    ) {
      // Set longer cache time for any file in /static, and any JS/CSS assets. These filenames are always cache-busted
      // The only real exception to this is the `sw.js` file, since this file's contents can change without the filename itself changing
      // We should also not cache the `manifest.webmanifest` file long-term, to prevent any changes to this file (favicons, etc.) being cached,
      // so it is deliberately left out of the condition above and falls through to the safe defaults
      // Reference: https://www.gatsbyjs.org/docs/caching/#javascript-and-css and https://www.gatsbyjs.org/docs/caching/#static-files
      response.headers.set('cache-control', CACHE_CONTROL_FOREVER);
    } else if (url.pathname.match(MATCH_FEED)) {
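      // Set small browser cache for the RSS feed
      // `options` was already consumed by getAssetFromKV above, so changing it here has no effect; set the header directly
      response.headers.set('cache-control', `max-age=${60 * 60}`); // 1 hour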
    } else if (
      url.pathname.startsWith('/page-data') ||
      url.pathname === '/sw.js'
    ) {
      // Add Cache-Control header for page data and app data to instruct browsers to never cache, and always revalidate with the server
      // Also add this header for the `sw.js` as mentioned above
      // Reference: https://www.gatsbyjs.org/docs/caching/#page-data and https://www.gatsbyjs.org/docs/caching/#app-data
      response.headers.set('cache-control', CACHE_CONTROL_NEVER);
    } else if ((response.headers.get('content-type') || '').includes('text/html')) {
      // Add CSP header on HTML pages. This header isn't necessary on assets
      response.headers.set(
        // 'Content-Security-Policy-Report-Only',
        'Content-Security-Policy',
        url.hostname === 'staging.example.com'
          ? CSP_HEADERS_STAGE
          : CSP_HEADERS_PROD
      );
      // Add Cache-Control header for HTML pages to instruct browsers to never cache, and always revalidate with the server
      // Reference: https://www.gatsbyjs.org/docs/caching/#html
      response.headers.set('cache-control', CACHE_CONTROL_NEVER);
    }

    return response;
  } catch (e) {
    // if an error is thrown try to serve the asset at 404.html
    if (!DEBUG) {
      try {
        let notFoundResponse = await getAssetFromKV(event, {
          mapRequestToAsset: (req) =>
            new Request(`${new URL(req.url).origin}/404.html`, req),
        });

        return new Response(notFoundResponse.body, {
          ...notFoundResponse,
          status: 404,
        });
        // eslint-disable-next-line no-empty
      } catch (e) {}
    }

    return new Response(e.message || e.toString(), { status: 500 });
  }
}

It's not exactly the same as your use-case, but is similar in the sense of having very specific caching rules for very specific pages. I'm honestly not sure what the best solution here is, since it's so specific.

Netlify seems to offload this entirely to the user with a _headers file, whereas a lot of cache headers are abstracted by kv-asset-handler to suit the vast majority of static site use-cases. Perhaps per-MIME-type cache rules? Or perhaps just some extended documentation detailing caching and how best to handle custom caching with your own cache-control headers if you need to do something complex.


To solve your immediate use-case, you could do something similar and check the parts of the URL, or the content-type response for text/html, and then override the cache-control header there.
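
As a rough sketch of that idea (not part of kv-asset-handler): the helper name `cacheControlOverride` and the `/data/` prefix below are made up for illustration, and the function only decides which header value to use, so it can be dropped into whatever handler you already have.

```javascript
// Hypothetical helper: pick a Cache-Control value to override the default with,
// or return null to leave whatever getAssetFromKV produced untouched.
const NEVER_CACHE = 'public, max-age=0, must-revalidate';

function cacheControlOverride(pathname, contentType) {
  // Content-Type can be missing, and usually carries a charset suffix
  const type = (contentType || '').toLowerCase();
  if (type.startsWith('text/html')) {
    return NEVER_CACHE;
  }
  // Example of a URL-based rule; the '/data/' prefix is purely illustrative
  if (pathname.startsWith('/data/')) {
    return NEVER_CACHE;
  }
  return null;
}

// Inside the Worker, after copying the asset response into `response`:
//   const override = cacheControlOverride(url.pathname, response.headers.get('content-type'));
//   if (override) response.headers.set('cache-control', override);
```

Keeping the decision in a plain function means the rules can be checked without a Workers runtime.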

from workers-sdk.

EatonZ commented on May 19, 2024

Any thoughts on this?

ashleymichal commented on May 19, 2024

Perhaps we need to separate the browser TTL option for HTML files from the one for other assets?

EatonZ commented on May 19, 2024

Hi @Cherry, thank you for responding, and for your great code sample! It was really helpful.

First, regarding your comment on the CACHE_CONTROL_NEVER line, you can actually use no-cache to achieve the same result. Apparently Gatsby's docs were incorrect (see gatsbyjs/gatsby#18763). I believe no-cache is essentially a shorthand for public, max-age=0, must-revalidate. Feel free to correct me.

In relation to this issue, I see you are checking the asset response's Content-Type for HTML, instead of the file name like I am doing. That's a good idea, and cleans up my code nicely. browserTTL is basically unnecessary for me since I have more fine-grained Cache-Control values now.

My issue is ultimately solved, but I think there is room for improvement regarding Cache-Control. It seems like you could evolve browserTTL into something else to make it easier to construct a Cache-Control header. If you would consider improvements there, feel free to leave this issue open. Otherwise, if you're content with the way things work right now, you can close this issue.
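
To make the suggestion concrete, here is one possible shape for such a helper. This is only a sketch: `buildCacheControl` and its option names (`visibility`, `maxAge`, `mustRevalidate`, `immutable`) are hypothetical and do not exist in kv-asset-handler.

```javascript
// Hypothetical builder that turns a small options object into a Cache-Control value.
function buildCacheControl({
  visibility = 'public',
  maxAge,
  mustRevalidate = false,
  immutable = false,
} = {}) {
  const directives = [visibility];
  // Only emit max-age when a non-negative whole number of seconds is given
  if (Number.isInteger(maxAge) && maxAge >= 0) {
    directives.push(`max-age=${maxAge}`);
  }
  if (mustRevalidate) directives.push('must-revalidate');
  if (immutable) directives.push('immutable');
  return directives.join(', ');
}

buildCacheControl({ maxAge: 31536000, immutable: true });
// 'public, max-age=31536000, immutable'
buildCacheControl({ maxAge: 0, mustRevalidate: true });
// 'public, max-age=0, must-revalidate'
```

The two calls reproduce the CACHE_CONTROL_FOREVER and CACHE_CONTROL_NEVER strings from the sample above, which is roughly the range a single browserTTL number can't express.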
