Github nginx cache

This repo contains an nginx configuration tuned to sit in front of GitHub endpoints and cache their responses. GitHub does not count a conditional request against the rate limit when it is answered with 304 Not Modified. The proxy_cache_* nginx directives force nginx to revalidate cached content against the upstream server (in this case, GitHub). nginx performs that revalidation as a conditional request, so revalidating unchanged content does not consume API rate limit. This works for both authenticated and unauthenticated requests.

Here is an example of how rate limiting is mitigated for unauthenticated requests, comparing https://api.github.com with the cache running on http://localhost:8000.

Rate limiting example
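
A comparison like this can be reproduced by hand with curl. The following is a minimal sketch and not part of the repo: the helper names (header_value, conditional_status, rate_remaining) are hypothetical, and the network calls are wrapped in functions so nothing is sent until you call them. It assumes curl and awk are installed and that GitHub reports the remaining quota in the x-ratelimit-remaining response header.

```shell
#!/usr/bin/env bash

# Print the value of one response header (case-insensitive name) from raw
# HTTP headers read on stdin.
header_value() {
  tr -d '\r' | awk -v name="$1" '
    BEGIN { name = tolower(name) ":" }
    tolower($1) == name { sub(/^[^:]*:[ \t]*/, ""); print; exit }'
}

# Print the HTTP status of a conditional GET for URL $1 using ETag $2.
# A 304 means the resource is unchanged since that ETag was issued.
conditional_status() {
  curl -s -o /dev/null -w '%{http_code}\n' -H "If-None-Match: $2" "$1"
}

# Print the remaining quota that GitHub reports for unauthenticated requests.
rate_remaining() {
  curl -sI https://api.github.com/rate_limit | header_value x-ratelimit-remaining
}

# Example session (needs network access, so it is left commented out):
#   url=https://api.github.com/repos/azure/github-nginx-cache
#   etag=$(curl -sI "$url" | header_value etag)
#   rate_remaining
#   conditional_status "$url" "$etag"
#   rate_remaining
```

If the two rate_remaining values match around a 304 response, the conditional request was not counted, which is the behaviour the cache relies on.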

Quick Start

docker run -d -p 8000:80 azuredevx/github-nginx-cache
curl localhost:8000/api/repos/azure/github-nginx-cache

The github domains are mapped as follows:

GitHub URL                     Cache URL
api.github.com/*               localhost:8000/api/*
raw.githubusercontent.com/*    localhost:8000/raw/*
codeload.github.com/*          localhost:8000/codeload/*
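
The mapping above can be applied mechanically when pointing existing scripts at the cache. This is a small sketch under stated assumptions: to_cache_url is a hypothetical helper, not something the repo ships, and the default base URL matches the Quick Start port.

```shell
#!/usr/bin/env bash

# Rewrite a GitHub URL ($1) to its cache equivalent.
# $2 optionally overrides the cache base URL (default: http://localhost:8000).
to_cache_url() {
  local url="${1#*://}"                      # drop the scheme if present
  local base="${2:-http://localhost:8000}"
  case "$url" in
    api.github.com/*)
      printf '%s/api/%s\n' "$base" "${url#api.github.com/}" ;;
    raw.githubusercontent.com/*)
      printf '%s/raw/%s\n' "$base" "${url#raw.githubusercontent.com/}" ;;
    codeload.github.com/*)
      printf '%s/codeload/%s\n' "$base" "${url#codeload.github.com/}" ;;
    *)
      printf 'unmapped host: %s\n' "$1" >&2
      return 1 ;;
  esac
}
```

For example, `curl "$(to_cache_url https://api.github.com/repos/azure/github-nginx-cache)"` is equivalent to the Quick Start request.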

Develop

Build

docker build .

Debug

Fish

docker build -t custom-nginx . && docker run -it -p 8000:80 -v (pwd)/nginx-logs:/var/log/nginx custom-nginx
curl localhost:8000/health/alive

Bash

docker build -t custom-nginx . && docker run -it -p 8000:80 -v "$(pwd)/nginx-logs:/var/log/nginx" custom-nginx
curl localhost:8000/health/alive

Test

# Run image on localhost:8000
cd test
npm ci
npm run test

Implementation details

Github consistency

The cache is designed for the highest possible consistency with GitHub: it ignores any Cache-Control headers that GitHub sends and forces nginx to REVALIDATE on every request. A limitation in nginx means that the lowest value for the proxy_cache_valid directive is one second. As a result, two identical requests made within the same second will HIT (return the cached response without revalidating) rather than REVALIDATE.

Cache partitioning

The cache may be used by complex applications in which multiple app and OAuth tokens are used to access GitHub. The default behaviour in this case is to partition the cache by token. This means that a request with token A will not use any content cached by requests made with token B.

This behaviour exists for the following reasons:

  1. Security - there are edge cases in which using two tokens within one second of each other could leak a cached response to the second request, even if the second token is not allowed to access the resource.
  2. Preventing cache churn - if the cache were not partitioned, multiple requests to one API route with different tokens could cause cached entries to be evicted unnecessarily when those tokens have different access permissions.

There may be cases, however, where the client needs to override this behaviour. For example, a GitHub app token may expire every hour or so. With the default behaviour the cache would effectively reset every hour, which is not desirable.

When the X-Cache-Key header is set, the cache is partitioned on that arbitrary string rather than on the token.
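
The partition choice can be modelled in a few lines. This is an illustrative sketch only: the real selection happens inside the nginx configuration, and both partition_key and cached_get are hypothetical names introduced here. The `Authorization: token ...` form and the localhost:8000 base URL are assumptions taken from common GitHub usage and the Quick Start.

```shell
#!/usr/bin/env bash

# Model of the partition choice described above: prefer X-Cache-Key when the
# client sends one, otherwise fall back to the token.
# $1 = token from the Authorization header, $2 = X-Cache-Key value (optional)
partition_key() {
  if [ -n "${2:-}" ]; then
    printf '%s\n' "$2"
  else
    printf '%s\n' "${1:-}"
  fi
}

# Hypothetical client helper: GET an API path through the cache with a stable
# partition key, so that rotated tokens keep sharing the same cached entries.
# $1 = API path, $2 = token, $3 = cache key
cached_get() {
  curl -s -H "Authorization: token $2" -H "X-Cache-Key: $3" \
    "http://localhost:8000/api/$1"
}
```

A call such as `cached_get repos/azure/github-nginx-cache "$TOKEN" my-app` would keep using the `my-app` partition after `$TOKEN` is rotated. Tokens that share a key also share cached responses, so a shared key is only appropriate for tokens with the same access permissions.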

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.
