
goblet's Introduction

Goblet: Git caching proxy

Goblet is a Git proxy server that caches repositories for read access. Git clients can configure their repositories to use it as an HTTP proxy server, and Goblet serves git-fetch requests from its local cache whenever the cache can satisfy them.

In the Git protocol, the server builds a pack-file dynamically, based on the objects the client already has. This makes Git protocol responses hard to cache, because different clients need different responses. Goblet parses the content of each HTTP POST request and determines whether the request can be served from the local cache.
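To make the idea concrete, here is a minimal sketch of inspecting a protocol v2 request body. It is not Goblet's actual code: fetchRequest, parseV2Request, canServeFromCache and sampleFetchBody are hypothetical names, and the cache lookup is reduced to a callback. The sketch only relies on the pkt-line framing (a four-digit hex length followed by the payload, with 0000 as the flush packet and 0001 as the delimiter).

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// fetchRequest holds the parts of a request this sketch looks at (hypothetical type).
type fetchRequest struct {
	Command string
	Wants   []string
}

// parseV2Request reads pkt-lines until the flush packet ("0000").
func parseV2Request(r io.Reader) (fetchRequest, error) {
	var req fetchRequest
	br := bufio.NewReader(r)
	for {
		var hdr [4]byte
		if _, err := io.ReadFull(br, hdr[:]); err != nil {
			return req, err
		}
		n, err := strconv.ParseUint(string(hdr[:]), 16, 16)
		if err != nil {
			return req, fmt.Errorf("bad pkt-line length %q: %w", hdr[:], err)
		}
		if n == 0 { // flush packet: end of the request
			return req, nil
		}
		if n < 4 { // special packets such as the delimiter carry no payload
			continue
		}
		payload := make([]byte, n-4) // the length includes the 4 header bytes
		if _, err := io.ReadFull(br, payload); err != nil {
			return req, err
		}
		line := strings.TrimSuffix(string(payload), "\n")
		switch {
		case strings.HasPrefix(line, "command="):
			req.Command = strings.TrimPrefix(line, "command=")
		case strings.HasPrefix(line, "want "):
			req.Wants = append(req.Wants, strings.TrimPrefix(line, "want "))
		}
	}
}

// canServeFromCache is an assumed policy: only fetch requests whose wanted
// objects are all known locally are answered from the cache.
func canServeFromCache(req fetchRequest, has func(oid string) bool) bool {
	if req.Command != "fetch" {
		return false
	}
	for _, w := range req.Wants {
		if !has(w) {
			return false
		}
	}
	return true
}

// sampleFetchBody builds a small fetch request body for demonstration.
func sampleFetchBody(oid string) io.Reader {
	pkt := func(s string) string { return fmt.Sprintf("%04x%s", len(s)+4, s) }
	return strings.NewReader(pkt("command=fetch\n") + "0001" +
		pkt("want "+oid+"\n") + pkt("done\n") + "0000")
}

func main() {
	oid := strings.Repeat("a", 40)
	req, err := parseV2Request(sampleFetchBody(oid))
	if err != nil {
		panic(err)
	}
	cached := map[string]bool{oid: true}
	fmt.Println(req.Command, len(req.Wants),
		canServeFromCache(req, func(o string) bool { return cached[o] }))
}
```

A real proxy would also need to handle capability lines, other arguments such as have and done, and requests it cannot parse, typically by forwarding them upstream.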

Goblet was developed to reduce automation traffic to googlesource.com. It can be useful if you need to run a read-only Git mirroring server to offload traffic.

This is not an official Google product (i.e. a 20% project).

Usage

Goblet is intended to be used as a library, so you will need to write some glue code. This repository includes the glue code for googlesource.com. See the goblet-server and google directories.

Limitations

Note that Goblet forwards ls-refs traffic to the upstream server, so if the upstream server is down, Goblet is effectively down too. Goblet could be modified to keep serving while the upstream is down, but the current implementation does not do so.

goblet's People

Contributors

draftcode, sluongng

goblet's Issues

E2E Test fail

Expected Behavior

bazel test //... should succeed

Actual Behavior

        fetch_test.go:57: cannot execute a git command
            Error: exit status 128
            Args: []string{"/usr/local/bin/git", "-c", "http.extraHeader=Authorization: Bearer valid-server-auth-token", "push", "-f", "http://[::]:42989/", "master:master"}
            Output: fatal: unable to access 'http://[::]:42989/': URL using bad/illegal format or missing URL

Steps to Reproduce the Problem

  1. Start the project in a GitHub Codespace
  2. Install Bazelisk
  3. Run bazel test //...

Specifications

  • Version:
  • Platform:

Similar project discussion: git_cdn

Hello,

I wanted to say hi and discuss a project that my team at Renault has been working on in a similar timeframe:

https://gitlab.com/grouperenault/git_cdn

Our designs look very similar, and it makes sense to share experience.

git_cdn is written in Python asyncio, and we have not seen much of a performance bottleneck from Python itself.
Most of our load comes from the git upload-pack process.

To mitigate this load from upload-pack, we had to implement caching of the bigger upload-packs.
This is because our CI users do a lot of clones from scratch, for reasons that are hard to negotiate.

Our design also implements a cache for LFS files, which can be handy.

We do not support protocol v2 yet, and to my understanding goblet only supports v2.

We would be glad to have a discussion with you, either in this issue or in a live hangout session.

Upgrade go-git to V5

Expected Behavior

Use https://github.com/go-git/go-git

Actual Behavior

The project is using gopkg.in/src-d/go-git.v4 v4.13.1, which is fairly old.
src-d has shut down and the project has moved.

Several memory issues were fixed between v4 and v5, so upgrading is a good idea.

Steps to Reproduce the Problem

Specifications

  • Version:
  • Platform:

Fixing staticcheck linter issues

Expected Behavior

Current code should pass staticcheck linting

Actual Behavior

git_protocol_v2_handler.go:133:55: should use time.Since instead of time.Now().Sub (S1012)
managed_repository.go:118:41: should use time.Since instead of time.Now().Sub (S1012)
reporting.go:96:40: should use time.Since instead of time.Now().Sub (S1012)
reporting.go:148:9: infinite recursive call (SA5007)
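For illustration, the two kinds of findings above could be addressed along these lines. This is a minimal sketch rather than the actual Goblet code: elapsed, wrappedError and demo are hypothetical names, and the real cause of the SA5007 report in reporting.go may differ.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// elapsed reports how long ago start was.
// S1012 flags the longer spelling time.Now().Sub(start);
// time.Since(start) is the equivalent shorthand.
func elapsed(start time.Time) time.Duration {
	return time.Since(start)
}

// wrappedError is a hypothetical wrapper type used to illustrate SA5007.
type wrappedError struct{ err error }

// Error delegates to the wrapped error.
// The SA5007 pattern would be `return w.Error()`, which calls this same
// method again and never terminates.
func (w wrappedError) Error() string {
	return w.err.Error()
}

// demo exercises both helpers and returns values that are easy to check.
func demo() (bool, string) {
	start := time.Now().Add(-50 * time.Millisecond)
	w := wrappedError{err: errors.New("upstream unavailable")}
	return elapsed(start) >= 50*time.Millisecond, w.Error()
}

func main() {
	ok, msg := demo()
	fmt.Println(ok, msg)
}
```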

Steps to Reproduce the Problem

  1. Run staticcheck

Specifications

  • Version:
  • Platform:

Feature request: Implement Pack Object cache with a hook

When using goblet to serve a repository for a large concurrent CI setup, it might be desirable to use the uploadpack.packObjectsHook setting (https://git-scm.com/docs/git-config#Documentation/git-config.txt-uploadpackpackObjectsHook) to wrap git-pack-objects and cache the output per unique request.

Multiple requests (from CI workers) asking for the same combination of objects would trigger only a single git-pack-objects run, which writes simultaneously to the process stdout and to an on-disk cache directory. Subsequent requests can then be served from the cache directory instead of re-triggering git-pack-objects, which saves a lot of CPU time.
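A rough sketch of that behaviour, under stated assumptions: packCache, entry, cacheKey and demo are hypothetical names, the generate callback stands in for running git-pack-objects, and concurrent duplicates here wait for the first run to finish rather than streaming alongside it. Eviction and cleanup of old cache files are left out.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
	"sync"
)

// entry tracks one pack being generated, or already generated, for a key.
type entry struct {
	done chan struct{}
	err  error
}

// packCache deduplicates identical pack requests (hypothetical type).
type packCache struct {
	dir     string
	mu      sync.Mutex
	entries map[string]*entry
}

// cacheKey derives a file name from the raw request bytes.
func cacheKey(request []byte) string {
	sum := sha256.Sum256(request)
	return hex.EncodeToString(sum[:])
}

// Serve writes the pack for request to out, generating it at most once per key.
func (c *packCache) Serve(request []byte, out io.Writer, generate func(io.Writer) error) error {
	key := cacheKey(request)
	path := filepath.Join(c.dir, key)

	c.mu.Lock()
	e, found := c.entries[key]
	if !found {
		e = &entry{done: make(chan struct{})}
		c.entries[key] = e
	}
	c.mu.Unlock()

	if !found {
		// First caller: generate once, writing to the client and the cache file.
		e.err = c.fill(path, out, generate)
		if e.err != nil {
			c.mu.Lock()
			delete(c.entries, key) // let a later request retry
			c.mu.Unlock()
		}
		close(e.done)
		return e.err
	}

	// Later callers: wait for the first run, then replay the cached file.
	<-e.done
	if e.err != nil {
		return e.err
	}
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = io.Copy(out, f)
	return err
}

// fill runs generate, teeing its output to out and to a temp file that is
// renamed into place only after generation succeeds.
func (c *packCache) fill(path string, out io.Writer, generate func(io.Writer) error) error {
	tmp, err := os.CreateTemp(c.dir, "pack-*.tmp")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // fails harmlessly once the file has been renamed
	if err := generate(io.MultiWriter(out, tmp)); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// demo serves the same request twice and reports how often generate ran.
func demo() (runs int, first, second string, err error) {
	dir, err := os.MkdirTemp("", "packcache")
	if err != nil {
		return
	}
	defer os.RemoveAll(dir)
	c := &packCache{dir: dir, entries: map[string]*entry{}}
	gen := func(w io.Writer) error {
		runs++
		_, werr := io.WriteString(w, "PACK-bytes")
		return werr
	}
	var a, b strings.Builder
	if err = c.Serve([]byte("want abc"), &a, gen); err != nil {
		return
	}
	err = c.Serve([]byte("want abc"), &b, gen)
	return runs, a.String(), b.String(), err
}

func main() {
	runs, first, second, err := demo()
	fmt.Println(runs, first == second, err)
}
```

With an actual hook, the key would be derived from the arguments and stdin that Git passes to the hook, and generate would run the real git-pack-objects command.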

Prior art can be found in GitLab's Gitaly: https://gitlab.com/gitlab-org/gitaly/-/blob/master/doc/design_pack_objects_cache.md
