
Comments (5)

newren commented on August 16, 2024

I am a bit reluctant to rewrite the history and then check whether it went well.

The check is really easy though. For any branch that you didn't want or expect to change, just check whether its commit hash afterward matches its commit hash before. Basically, save the output of git show-ref, filter the history, then re-run git show-ref and compare the two outputs.
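As a minimal sketch of that comparison: the helper below takes two saved `git show-ref` outputs and lists every ref whose hash differs. The name `changed_refs` and the snapshot file names are hypothetical, and the sketch assumes the "before" snapshot is non-empty.

```shell
# Hypothetical helper: compare two saved `git show-ref` outputs.
# Each input line has the form "<hash> <refname>".
# Prints one line per ref that was changed, added, or removed.
changed_refs() {
    awk '
        # First file: remember the hash recorded for each ref.
        NR == FNR { before[$2] = $1; next }
        # Second file: note the ref, then compare against the first file.
        { seen[$2] = 1 }
        !($2 in before)  { print "added " $2; next }
        before[$2] != $1 { print "changed " $2 }
        END {
            for (ref in before)
                if (!(ref in seen)) print "removed " ref
        }
    ' "$1" "$2" | sort
}
```

To use it, run `git show-ref > refs-before.txt` before the rewrite and `git show-ref > refs-after.txt` afterward, then call `changed_refs refs-before.txt refs-after.txt`. A branch you expected to stay untouched should not appear in the output.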

For example, when I tried git rebase --rebase-merges on this case, git rewrote the history of master even though master should theoretically remain untouched. But maybe this is not equivalent to the mechanism in filter-repo?

rebase uses an entirely unrelated mechanism. It operates on diffs between changes, allowing you to e.g. drop or modify the diff, but then runs the risk of conflicts as it attempts to apply future diffs (and for merges it re-does the merge, again risking more conflicts). If you tweak the diff, since it just applies more diffs for the remaining patches, you'll still see your changes at the end.

filter-repo uses fast-export and fast-import which treat every commit not as a diff but as a "use the same versions of most files from the parent commit, but make these five files have these exact contents". So, if you decide to drop a commit or tweak the contents of those new files in that commit, those changes will be reverted by the next commit in the stream that mentions that file because it's not applying a diff but a "make this file have these exact contents". So, filter-repo works well for things like removing a file entirely, but if you want to make any tweaks to any files you have to make the exact same tweak over and over for every single commit that touches that file. That means rebase is a much better tool if you want to just tweak what kind of changes were included in some old commit and have those changes propagate forward.

By the way, when re-writing all the history, how do you automate the merge resolution based on the existing merge commit? That’s clearly a missing piece in my rebase-merge method: I had to go through the same conflicts.

Yes, by rebase's design once you change an old commit you risk conflicts that you'll have to go through. filter-repo has a dumb "make-these-files-have-this-exact-content-and-use-these-parents" design that it got from fast-export and fast-import. As such, there is no "merging" that it needs to do. But it also means the two tools should be used in much different circumstances.

from git-filter-repo.

newren commented on August 16, 2024

Does the file in question ever appear in the master branch? While filter-repo rewrites all commits by default, it ends up writing the exact same commit when nothing changes. So, if your filter tries to remove a file that doesn't exist in a commit, and no parent commit had it either, the rewritten commit will be identical to the original, including its hash. This means that removing a file that only existed on one branch results in a rewritten history where every branch is identical to the original except the one branch that had that file in various commits of its history.

Of course, there are some weird types of things that are automatically modified, so you would want to double-check after a rewrite. These include super old commits that are somehow entirely missing an author, a non-UTF-8-encoded commit message, a bogus timezone recorded in the commit objects, and others mentioned under "Inherited Limitations" at https://htmlpreview.github.io/?https://github.com/newren/git-filter-repo/blob/docs/html/git-filter-repo.html#INTERNALS. It's super rare for a repository to have any of these, though.

If you do have some crazy special history, or you just want to be sure that filter-repo doesn't even try to touch any commits in the master branch, then you could use the --refs flag:
git filter-repo --refs ^master feature <filter rules and other args...>
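As a rough way to preview what such a restriction covers, and assuming the `--refs` arguments follow the usual git revision-range semantics, plain `git rev-list` can count the commits reachable from one branch but not from another. The helper name `commits_in_scope` is hypothetical, and the branch names are placeholders.

```shell
# Hypothetical helper: count commits reachable from branch $2 but not
# from branch $1, i.e. the range that `^$1 $2` selects.
# Run it from inside the repository you intend to filter.
commits_in_scope() {
    git rev-list --count "$2" "^$1"
}
```

For example, `commits_in_scope master feature` prints how many commits exist on feature that master cannot reach. Commits that master can reach are excluded from the range.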

But if you're just changing one branch, and in particular only the commits on it that are newer than some other branch, then git rebase, potentially with the --rebase-merges flag, is a good choice too.


adrienbernede commented on August 16, 2024

Thanks for the detailed answer.

The scheme was a simplified but equivalent version of the situation I was in: the faulty commits are localized and have no impact on master. In reality, though, I am talking about an open-source project with multiple branches in a merge-based workflow.

I am a bit reluctant to rewrite the history and then check whether it went well.

For example, when I tried git rebase --rebase-merges on this case, git rewrote the history of master even though master should theoretically remain untouched. But maybe this is not equivalent to the mechanism in filter-repo?

Anyway, I should do some tests.

By the way, when re-writing all the history, how do you automate the merge resolution based on the existing merge commit? That’s clearly a missing piece in my rebase-merge method: I had to go through the same conflicts.


adrienbernede commented on August 16, 2024

Thank you @newren!
That was super useful.


adrienbernede commented on August 16, 2024

For the record, it worked superbly with ^master.
Since git reads that as "ignore commits reachable from master", and with your explanations, I’m happy to say I understand why. That’s perfect!
Thank you again.

