git-remote-subtree

An alternative to git submodules and subtrees. Subrepos appear as normal remotes which differ from your main repository only in the contents of one subdirectory.

The idea is to hide the functionality of git-subtree behind a protocol which works transparently, doesn't pollute the commit history of the repo, and doesn't interfere with normal git functionality. Just as a failing escalator becomes stairs, any issue with git-remote-subtree should yield a monorepo that continues to work perfectly, with no loss of data or commits.

Project status

There is no working implementation yet. git-remote-subtree currently just mirrors a remote repository in a hidden bare repo and performs pushes and pulls indirectly through it.

The next step is to add support for rewriting paths and basing the rewritten commits on top of a specific parent commit provided by the user. This will be sufficient to allow pulling as long as no new commits are added to the super repo, and because it's deterministic, it won't require scanning the super repo for matching commits.
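The determinism mentioned above follows from how git names objects: a commit id is a hash over the commit's tree, parents, author and committer lines, and message, so rewriting the same commit onto the same parent always yields the same id. The following is a minimal sketch of that property, assuming the SHA-1 object format; `git_object_id` and `commit_id` are hypothetical helper names, not part of git-remote-subtree.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """Hash an object the way git does (SHA-1 format): '<kind> <size>', NUL, body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def commit_id(tree: str, parents: list[str], author: str,
              committer: str, message: str) -> str:
    """Id of a commit object built from the given fields (illustrative helper)."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {committer}", "", message]
    return git_object_id("commit", "\n".join(lines).encode())


# The id of git's well-known empty tree.
EMPTY_TREE = git_object_id("tree", b"")  # 4b825dc642cb6eb9a060e54bf8d69288fbee4904
```

Because nothing but the commit's own fields enters the hash, two machines that rewrite the same sub repo commit onto the same user-supplied parent end up with identical ids without consulting each other.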

Proposed approach

Say the user sets up a remote like so:

git remote add subRepo subtree::subRepoDir::on::branch::from::http://example.com/normalRepo

This creates a git remote called subRepo which wraps a normal git repo at http://example.com/normalRepo. The ref subRepo/subBranch contains the same content as normalRepo/subBranch, but as if all of the files in normalRepo were moved into a single top-level directory subRepoDir/, and subRepoDir were grafted into the same tree as all the other content at superRepo/branch. If subRepoDir already exists in superRepo/branch, the effect is as if its existing contents were replaced by the contents of normalRepo/subBranch. If subRepoDir already exists in superRepo/branch and the contents are exactly the same, then subRepo/subBranch will have the same SHA as superRepo/branch, and pulling one into the other will be a no-op.
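For a URL of the form `<transport>::<address>`, git runs `git-remote-<transport>` and passes it the remote name and the address, so the helper would receive `subRepoDir::on::branch::from::http://example.com/normalRepo` as its second argument. A minimal parsing sketch follows; the `SubtreeAddress` type and `parse_address` function are hypothetical names, and the exact grammar of the address is an assumption based on the single example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubtreeAddress:
    subdir: str   # directory the sub repo is grafted into, e.g. "subRepoDir"
    branch: str   # the branch named after "on" in the remote URL
    url: str      # URL of the wrapped normal repo


def parse_address(address: str) -> SubtreeAddress:
    """Parse '<subdir>::on::<branch>::from::<url>' (assumed grammar)."""
    # Tolerate the full remote URL as well as the bare address.
    if address.startswith("subtree::"):
        address = address[len("subtree::"):]
    head, from_sep, url = address.partition("::from::")
    subdir, on_sep, branch = head.partition("::on::")
    if not (from_sep and on_sep and subdir and branch and url):
        raise ValueError(f"not a subtree address: {address!r}")
    return SubtreeAddress(subdir.strip("/"), branch, url)
```

Splitting on the literal `::on::` and `::from::` markers, rather than on every `::`, leaves the wrapped URL untouched even if it contains colons of its own.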

Similarly, pushing from superRepo/branch to subRepo/subBranch behaves as if the contents of subRepoDir were at the top level, and everything else was thrown away. If the contents of subRepoDir are the same as normalRepo/subBranch, then pushing is a no-op.
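The two directions can be pictured as a pair of path mappings. The sketch below models a tree as a flat dictionary from file path to blob id, which is a simplification of git's nested tree objects; `graft` and `extract` are hypothetical names used only for illustration.

```python
def graft(super_tree: dict[str, str], sub_tree: dict[str, str],
          subdir: str) -> dict[str, str]:
    """Fetch direction: replace the contents of subdir with the sub repo's files."""
    prefix = subdir.strip("/") + "/"
    # Keep everything outside subdir, drop whatever was inside it before.
    result = {p: b for p, b in super_tree.items() if not p.startswith(prefix)}
    # Move every sub repo file under subdir.
    result.update({prefix + p: b for p, b in sub_tree.items()})
    return result


def extract(super_tree: dict[str, str], subdir: str) -> dict[str, str]:
    """Push direction: lift the contents of subdir to the top level, drop the rest."""
    prefix = subdir.strip("/") + "/"
    return {p[len(prefix):]: b for p, b in super_tree.items()
            if p.startswith(prefix)}
```

In this model the no-op cases described above correspond to `graft` returning a tree equal to the one it was given, and `extract` returning a tree equal to the sub repo's current one.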

This setup means that the functionality of git-subtree can be implemented totally by pushing to and pulling from a subtree:: remote. Because all the magic is inside the remote helper, the main repo remains clean, and all other git functionality works just as you would expect.

This should be fairly simple to implement while preserving history. A hand-wavy algorithm for fetching is as follows:

  • Do a fetch on normalRepo and superRepo into our hidden repo.

  • See if the tree object of the oldest commit in normalRepo/subBranch is present in the local repo. If not, we know that we've never merged this branch in before, and we can skip some of the following work.

  • Walk backwards in the commit graph from normalRepo/subBranch until we find a tree object that's in the local repo. See if one of the associated commits in the local repo is the same in every other respect except for the parent commits and the fact that the tree is rewritten. If so, this is the last common commit. If not, keep walking backwards; if we run out of commits, we've never merged this branch in before, and in the steps below we can just start at the current commit in the local repo.

  • Create a temporary branch in our hidden repo pointing at the version of the common commit in our local repo.

  • Cherry-pick commits from our local repo until we either hit a commit that modifies the tree object for subRepoDir (which we won't cherry-pick), or we run out of commits.

  • Cherry-pick all of the commits from normalRepo/subBranch onto our temporary branch, with the tree rewritten appropriately. We know they'll apply cleanly because the state of subRepoDir on our temporary branch is clean with respect to normalRepo.

  • The temporary branch contains the data we'll return from the fetch. Repeat as necessary for the other branches.
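The search for the last common commit in the steps above can be sketched on an in-memory model. This is an illustration under stated assumptions rather than the project's implementation: the `Commit` type, its `subtree` and `metadata` fields, and `find_common_commit` are hypothetical, and the walk is a plain breadth-first traversal rather than git's own revision ordering.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    id: str
    # For a normalRepo commit: id of its root tree.
    # For a local (super repo) commit: id of the tree at subRepoDir.
    subtree: str
    parents: tuple[str, ...]
    # Everything compared besides tree and parents (author, dates, message).
    metadata: str


def find_common_commit(sub_commits: dict[str, Commit], sub_tip: str,
                       local_commits: dict[str, Commit]
                       ) -> Optional[tuple[str, str]]:
    """Return (sub commit id, matching local commit id), or None if never merged."""
    # Index local commits by the tree they hold at subRepoDir.
    by_tree: dict[str, list[Commit]] = {}
    for commit in local_commits.values():
        by_tree.setdefault(commit.subtree, []).append(commit)

    queue, seen = deque([sub_tip]), set()
    while queue:
        commit_id = queue.popleft()
        if commit_id in seen:
            continue
        seen.add(commit_id)
        sub = sub_commits[commit_id]
        # A shared tree alone is not enough; the rest of the commit must match.
        for candidate in by_tree.get(sub.subtree, []):
            if candidate.metadata == sub.metadata:
                return sub.id, candidate.id
        queue.extend(sub.parents)
    return None  # ran out of commits: the branch was never merged in
```

The tree index is the kind of information that could be cached between fetches to avoid rescanning the local repo each time.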

This is obviously quite expensive, so in practice we'll want to cache some information to speed this up.

Pushing is a bit simpler; once we find the common commit we just need to transform each commit in our local repo that touches subRepoDir into a corresponding commit on normalRepo/subBranch.
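A rough sketch of that transformation, assuming a linear history and the same flat path-to-blob model of a tree; `commits_to_push` and its `(message, tree)` tuples are hypothetical simplifications, not the helper's real data structures.

```python
def commits_to_push(local_history: list[tuple[str, dict[str, str]]],
                    subdir: str,
                    base_subtree: dict[str, str]
                    ) -> list[tuple[str, dict[str, str]]]:
    """Turn local commits made after the common commit into sub repo commits.

    local_history is ordered oldest first; base_subtree is the content of
    subdir at the common commit. Commits that leave subdir unchanged are skipped.
    """
    prefix = subdir.strip("/") + "/"
    result = []
    previous = base_subtree
    for message, tree in local_history:
        # Lift the contents of subdir to the top level and drop everything else.
        subtree = {p[len(prefix):]: b for p, b in tree.items()
                   if p.startswith(prefix)}
        if subtree != previous:
            result.append((message, subtree))
            previous = subtree
    return result
```

If no local commit touches subRepoDir, the result is empty and the push is a no-op, matching the behaviour described earlier.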

It might sound like there's a lot to implement, but git-subhistory in particular is fairly close to what's needed, and translating it into e.g. Python would get us 80% of the way there.

Resources and related work

How to Write a New Git Protocol

Mastering Git Subtrees

git-subtree docs

git-subhistory

git-subrepo

Which commit has this blob?
