bfg-repo-cleaner's Introduction

bfg-repo-cleaner's People

Contributors

alecthegeek avatar dwijnand avatar javabrett avatar peterdavehello avatar rtyley avatar

bfg-repo-cleaner's Issues

Replace large files with a text file

Thank you for the great tool.

I have been building a GitHub mirror of our Subversion repository. New development will still happen in Subversion, so I need to regularly remove the large files that creep into the source tree.

Instead of completely deleting large files from the repository, I would like to replace each one with a text message that says where the file can be found. For example:

bfg --strip-blobs-bigger-than 50M --replacement ../replace-message.txt .
cat repo/lib/large-binary.jar
"This file has been removed to keep the repository small. It can be obtained from the master Subversion repository at http://svn.example.com"

I tried to do this using the text substitution with a regex like ".*", but didn't have any luck. If there isn't a way to do this already, I expect it would be easy to add, since all the component functionality appears to already exist in the tool.

Bonus points for being able to insert a variable into the text such as the version number of the file.
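
A minimal sketch of the behaviour requested above, written in Python purely for illustration. It is not BFG code, and the `--replacement` option shown in the example is the proposal, not an existing flag. The function name `render_replacement` and the `$path` and `$size` placeholders are hypothetical.

```python
from string import Template


def render_replacement(template_text, path, size_bytes, limit_bytes):
    """Return replacement text for a blob over the limit, or None to keep it."""
    if size_bytes <= limit_bytes:
        return None  # small enough: leave the original file alone
    # safe_substitute leaves unknown $placeholders untouched instead of raising
    return Template(template_text).safe_substitute(path=path, size=size_bytes)


message = "File $path ($size bytes) was removed; fetch it from http://svn.example.com"
print(render_replacement(message, "lib/large-binary.jar", 60 * 1024 * 1024, 50 * 1024 * 1024))
```

Keeping the message in a template file would let the same text serve every stripped file while still naming the specific path.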

Protecting branches with a slash

I'd like to protect the tip of a branch which has a slash in its name: feature/worker-prototype.

When I run the BFG on the repository while protecting the branch, JGit doesn't like the "/" and throws an exception.

Exception in thread "main" java.lang.IllegalArgumentException: / is not permitted as a path 'segment' for this filesystem. Segment in question: 991564ae-feature/worker-prototype.csv. If you want to create a Path from a system dependent string then use fromString. If you want to create a child path use resolve instead of / to create the child path. It should be noted that the string after '/' must be a single segment but resolve accepts full strings. Examples: Path.fromString("c: \a\b") path / ("a/b/c", '/') path resolve "a\b\c"
at scalax.file.FileSystem.checkSegmentForSeparators(FileSystem.scala:280)
at scalax.file.defaultfs.DefaultPath.$div(DefaultPath.scala:44)
at com.madgag.git.bfg.cleaner.CLIReporter$$anonfun$reportProtectedCommitsAndTheirDirt$1.apply(Reporter.scala:122)
at com.madgag.git.bfg.cleaner.CLIReporter$$anonfun$reportProtectedCommitsAndTheirDirt$1.apply(Reporter.scala:108)
at scala.collection.immutable.List.foreach(List.scala:318)
at com.madgag.git.bfg.cleaner.CLIReporter.reportProtectedCommitsAndTheirDirt(Reporter.scala:107)
at com.madgag.git.bfg.cleaner.CLIReporter.reportObjectProtection(Reporter.scala:80)
at com.madgag.git.bfg.cleaner.RepoRewriter$.rewrite(RepoRewriter.scala:96)
at com.madgag.git.bfg.cli.Main$$anonfun$1.apply(Main.scala:59)
at com.madgag.git.bfg.cli.Main$$anonfun$1.apply(Main.scala:34)
at scala.Option.map(Option.scala:145)
at com.madgag.git.bfg.cli.Main$delayedInit$body.apply(Main.scala:33)
at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:71)
at scala.App$$anonfun$main$1.apply(App.scala:71)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
at scala.App$class.main(App.scala:71)
at com.madgag.git.bfg.cli.Main$.main(Main.scala:27)
at com.madgag.git.bfg.cli.Main.main(Main.scala)

Using BFG 1.11. Is there a fix / an easy workaround?
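
Judging by the exception message, a report file name built from the commit id and the ref name ("991564ae-feature/worker-prototype.csv") is being used as a single path segment. The Python sketch below shows one way such a name could be flattened before use. It is an illustration of the idea only, not the BFG's actual fix, and `ref_to_filename_segment` is a hypothetical helper.

```python
def ref_to_filename_segment(ref_name, replacement="-"):
    """Flatten a ref name such as 'feature/worker-prototype' into one path segment."""
    for separator in ("/", "\\"):
        ref_name = ref_name.replace(separator, replacement)
    return ref_name


print(ref_to_filename_segment("991564ae-feature/worker-prototype.csv"))
```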

Doesn't install

Trying to run the installer does nothing on my computer.

Mac OS X Mavericks 10.9, just updated my JRE:
$ java -version
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)

Double-clicking on the file seems to have little effect. If I try to run from the command line:
$ java -cp . bfg-1.11.1.jar
Error: Could not find or load main class bfg-1.11.1.jar

I tried downloading the source, but I have no idea what build system this thing uses.
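
For context on the error above: `java -cp . X` treats `X` as the name of a main class, whereas `java -jar X` runs the main class named in the jar's manifest, which is the form used in other reports on this page. A small Python sketch of building that command line follows; `bfg_command` and the `my-repo.git` argument are hypothetical names used only for illustration.

```python
def bfg_command(jar_path, *bfg_args):
    """Build an argv list that launches an executable jar via its manifest."""
    # "-jar" tells the JVM to read Main-Class from the jar's manifest,
    # instead of interpreting the jar file name as a class name.
    return ["java", "-jar", jar_path, *bfg_args]


print(" ".join(bfg_command("bfg-1.11.1.jar", "--strip-blobs-bigger-than", "50M", "my-repo.git")))
```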

Question on .REMOVED.git-id

After running the cleaner against my repository, most commits other than HEAD had a number of filename.REMOVED.git-id files in place of the original files. These were not the blobs I expected to be removed but instead small code and config files. Is this the expected behavior, or am I missing something in the configuration? Thanks much.

Ability to get mapping between old and new commit hashes

It would be an awesome feature addition if there was a way to get the mapping of the old and new commit hashes in order to use them to fix other systems (such as Gerrit) that keep their own reference to commit hashes.

Maybe an option that will output a file that has "old_hash new_hash" for each change that was made.
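
A minimal Python sketch of how a downstream system might consume such a file, assuming the format proposed above (one "old_hash new_hash" pair per line, separated by whitespace). The function name `parse_id_map` and the sample ids are hypothetical.

```python
def parse_id_map(text):
    """Parse lines of the form '<old_hash> <new_hash>' into a dict."""
    mapping = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        old_id, new_id = parts
        mapping[old_id] = new_id
    return mapping


sample = "aaaa1111 bbbb2222\ncccc3333 dddd4444\n"
id_map = parse_id_map(sample)
# Commits that were not rewritten are simply absent from the map.
print(id_map.get("cccc3333", "unchanged"))
print(id_map.get("eeee5555", "unchanged"))
```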

There's a way to do this with filter-branch right now, and it's the one blocker from my ability to use BFG for production purposes.

bfg hangs when pruning repo

I'm running bfg on a repo for the second time, and the program seems to be hanging on this part:

This repo has been processed by The BFG before! Will prune repo before proceeding - to avoid unnecessary cleaning work on unused objects...

I let it run for a full day; the process used 100% CPU the whole time but never got past this point. If I run git gc --prune=now --aggressive before running the bfg command, git gc takes about 5 minutes, but the bfg command still hangs afterwards.

How can I debug this? Any ideas as to what might be the issue?

Clarify terminology - clean vs dirty

Some confusion around whether a branch is 'clean' before or after the BFG has run :)

The design of the BFG is based around regarding the user as a reformed alcoholic: they've made some mistakes in the past, but now they've cleaned up their act. The current tips of their protected branches are assumed to be clean, but there are dirty objects in the history that need to be cleansed.

Deleting file from repository

Hi,

I want to use your tool to delete some data files that were committed by mistake. I have already deleted them in the last commit, but I need to remove them from the history. Using your sample command didn't do anything. I have the impression that the files must be present for the deletion to work; is this the case?

Thanks.

.REMOVED.git-id for protected files

On the home page, the documentation says

The BFG is about completely removing bad stuff from the history of your repo. If something questionable - like a 10MB file, when you're telling The BFG to strip out everything over 5MB - is in a protected commit, it won't be removed, and because it's still there, there's no point deleting it from earlier commits either. If you want the BFG to delete something you need to make sure your current commits are clean.

I experimented with my own repository as well as a very simple one. When I use --strip-blobs-bigger-than, the protected files in HEAD that exceed the size limit are replaced with .REMOVED.git-id files throughout the history, except in the HEAD commit itself.

My understanding of

there's no point deleting it from earlier commits either

is that those protected files should be left untouched throughout the history, and that only unprotected files should be replaced everywhere by such .REMOVED.git-id files.

To reproduce this issue (is this really an issue, or is it intended?), I created a small repo. Its git log --patch output is shown in https://gist.github.com/vtintillier/2f249fc420f24fdd3f77

Then I ran java -jar /c/Users/I051121/Downloads/bfg-1.11.7.jar --strip-blobs-bigger-than 5k and got the repo shown in https://gist.github.com/vtintillier/e021aad9e4efed8ef1fa

I would have expected y.txt to be left untouched in the whole history.

This is using version 1.11.7.

Update documentation to explain how to mirror-push up to a GitHub repo with pull-requests

The problem manifests like this:

$ git fetch -q && git push -q --mirror github
remote: error: hook declined to update refs/pull/1001/head
remote: error: hook declined to update refs/pull/1001/merge
(snip)
remote: error: hook declined to update refs/pull/957/head
remote: error: hook declined to update refs/pull/957/merge
To [email protected]:pdurbin/openscholar.git
 * [new branch]      1017 -> 1017
(snip)
 * [new branch]      origin/SCHOLAR-3.x-make-1072 -> origin/SCHOLAR-3.x-make-1072
 * [new tag]         SCHOLAR-2-0-BETA1 -> SCHOLAR-2-0-BETA1
(snip)
 * [new tag]         SCHOLAR-3.1.6 -> SCHOLAR-3.1.6
 ! [remote rejected] refs/pull/1001/head -> refs/pull/1001/head (hook declined)
 ! [remote rejected] refs/pull/1001/merge -> refs/pull/1001/merge (hook declined)
(snip)
 ! [remote rejected] refs/pull/957/head -> refs/pull/957/head (hook declined)
 ! [remote rejected] refs/pull/957/merge -> refs/pull/957/merge (hook declined)
error: failed to push some refs to '[email protected]:pdurbin/openscholar.git'

See also:

http://christoph.ruegg.name/blog/2013/1/26/git-howto-mirror-a-github-repository-without-pull-refs.html

Cleaning refs that are - effectively - from other people's repos might not be possible, but we could consider renaming the refs to give a "here's how you fix your pull request" ref.
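
The rejections above all concern refs under refs/pull/, which the remote refuses to update. One workaround discussed in the linked post is to leave those refs out of the push. The Python sketch below only illustrates the selection step: given a list of ref names, it keeps those outside the rejected namespace. `pushable_refs` is a hypothetical helper, and the default prefix is taken from the error output rather than from any documented list.

```python
def pushable_refs(refs, hidden_prefixes=("refs/pull/",)):
    """Return the refs that do not live under a namespace the remote rejects."""
    # str.startswith accepts a tuple, so several prefixes can be excluded at once
    return [ref for ref in refs if not ref.startswith(hidden_prefixes)]


refs = [
    "refs/heads/master",
    "refs/tags/SCHOLAR-3.1.6",
    "refs/pull/1001/head",
    "refs/pull/1001/merge",
]
print(pushable_refs(refs))
```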

NPE on --delete-folders , even on empty repo

gny@tharanga:/tmp$ git init blow
Initialized empty Git repository in /tmp/blow/.git/
gny@tharanga:/tmp$ cd blow/
/tmp/blow
gny@tharanga:/tmp/blow$ java -jar /pub/install/bfg/bfg-1.11.7.jar --delete-folders blah

Using repo : /tmp/blow/.git

Exception in thread "main" java.lang.NullPointerException
at org.eclipse.jgit.lib.ObjectIdOwnerMap.get(ObjectIdOwnerMap.java:131)
at org.eclipse.jgit.revwalk.RevWalk.parseAny(RevWalk.java:807)
at com.madgag.git.package$RichObjectId.asRevObject(package.scala:190)
at com.madgag.git.bfg.cleaner.protection.ProtectedObjectCensus$$anonfun$3.apply(ProtectedObjectCensus.scala:66)
at com.madgag.git.bfg.cleaner.protection.ProtectedObjectCensus$$anonfun$3.apply(ProtectedObjectCensus.scala:66)
at scala.collection.TraversableLike$$anonfun$groupBy$1.apply(TraversableLike.scala:332)
at scala.collection.TraversableLike$$anonfun$groupBy$1.apply(TraversableLike.scala:331)
at scala.collection.immutable.Set$Set1.foreach(Set.scala:83)
at scala.collection.TraversableLike$class.groupBy(TraversableLike.scala:331)
at scala.collection.AbstractTraversable.groupBy(Traversable.scala:104)
at com.madgag.git.bfg.cleaner.protection.ProtectedObjectCensus$.apply(ProtectedObjectCensus.scala:66)
at com.madgag.git.bfg.cli.CLIConfig.objectProtection$lzycompute(CLIConfig.scala:145)
at com.madgag.git.bfg.cli.CLIConfig.objectProtection(CLIConfig.scala:145)
at com.madgag.git.bfg.cli.Main$$anonfun$1.apply(Main.scala:57)
at com.madgag.git.bfg.cli.Main$$anonfun$1.apply(Main.scala:34)
at scala.Option.map(Option.scala:145)
at com.madgag.git.bfg.cli.Main$.delayedEndpoint$com$madgag$git$bfg$cli$Main$1(Main.scala:33)
at com.madgag.git.bfg.cli.Main$delayedInit$body.apply(Main.scala:27)
at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:383)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
at scala.App$class.main(App.scala:76)
at com.madgag.git.bfg.cli.Main$.main(Main.scala:27)
at com.madgag.git.bfg.cli.Main.main(Main.scala)

Ensure user can easily see usage

Running BFG without arguments should display usage, but right now, if you're not in a Git dir, it just crashes out with a massive exception.

Crash with LargeObjectException

Hello,

I tried your tool, but it crashes after a few minutes.
It looks like one commit exceeds the cache limit of 5 MB.

org.eclipse.jgit.errors.LargeObjectException$ExceedsLimit:

Here is where it is thrown:
http://grepcode.com/file/repo1.maven.org/maven2/com.madgag/org.eclipse.jgit/1.0.99.0.2-UNOFFICIAL-ROBERTO-RELEASE/org/eclipse/jgit/lib/ObjectLoader.java#191

And here is the 5 MB cache limit:
http://grepcode.com/file/repo1.maven.org/maven2/org.eclipse.jgit/org.eclipse.jgit/2.2.0.201212191850-r/org/eclipse/jgit/revwalk/RevWalk.java#867

Some information about the repository:
It's about 4 GB in size and the history spans around 10 years.

Allow disabling 'Former-commit-id' footer & 'yyy [formerly xxx]'

Need to think about the cryptographic security implications of these fields for the case where the user is filtering files to remove passwords/credentials.

If the former-object-id values are present, and an attacker wishes to establish what the '_REMOVED_' text originally was, they can run a brute force attack, creating a series of probable passwords and then calculating the tree hash for that tree, then the commit hash for that putative original commit (it may well be easy to determine the remainder of the original state for that commit object, chop off added footer, establish original parent commit id, etc). If that commit hash is identical, they have a strong contender for the original password.
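
To make the attack concrete, here is a simplified Python sketch at the blob level only. The paragraph above describes going on to the tree and commit hashes, which this sketch does not do. A Git blob id is the SHA-1 of the header "blob <length>\0" followed by the content, so anyone who knows an original object id can test guesses offline. The file layout (`password=<value>`), the candidate list and the helper names are all hypothetical.

```python
import hashlib


def git_blob_id(content):
    """Compute the Git object id of a blob with the given bytes content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def matching_candidates(candidates, known_id_prefix, render):
    """Return the guesses whose rendered file hashes to the known id prefix."""
    return [c for c in candidates if git_blob_id(render(c)).startswith(known_id_prefix)]


def render(password):
    # Hypothetical original file: one line holding the secret
    return b"password=" + password.encode("utf-8") + b"\n"


leaked_prefix = git_blob_id(render("hunter2"))[:8]  # stands in for a published former id
print(matching_candidates(["letmein", "hunter2", "123456"], leaked_prefix, render))
```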

Every hexadecimal digit of the former-commit-id we make available reduces the attacker's search space by a factor 16.
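
The arithmetic behind that factor, as a short Python sketch: each hexadecimal digit carries 4 bits, so n revealed digits let an attacker discard all but 1 in 16^n wrong guesses. The function name is hypothetical.

```python
def rejection_factor(revealed_hex_digits):
    """How many times smaller the set of surviving wrong guesses becomes."""
    return 16 ** revealed_hex_digits


for digits in (1, 8, 40):
    # 4 bits per hex digit; 40 digits is a full SHA-1 object id
    print(digits, "digits =", 4 * digits, "bits, factor", rejection_factor(digits))
```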

It's also perfectly true that all credentials should be changed anyway before publicising a repo, but obviously there's a risk that a user might not do so, so it's best not to leave this vulnerability.

Log output to file for later reference

From @adamnfish :

It would be awesome to be able to specify a file to log the before and after refs to! I realise you can just redirect the output to a file, but it is good to see it as well (especially if it'll be interactive at some point). Having an option for --ref-log-file for example would be great. There's been a couple of times today where something funny has happened and I've wanted to double-check the hashes for branches just to make sure someone didn't accidentally use an old git object somewhere.

Find big files only

It would be nice to add an option allowing the user to find all big files in a repository.
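
Until such an option exists, a listing can be approximated outside the BFG. The Python sketch below assumes its input is the text printed by `git cat-file --batch-check --batch-all-objects`, whose default line format is "<object id> <type> <size>". It reports blob ids and sizes only, not file paths. `big_blobs` and the sample ids are hypothetical.

```python
def big_blobs(batch_check_output, min_bytes):
    """Return (size, object_id) pairs for blobs of at least min_bytes, largest first."""
    found = []
    for line in batch_check_output.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        object_id, object_type, size = parts
        if object_type == "blob" and int(size) >= min_bytes:
            found.append((int(size), object_id))
    return sorted(found, reverse=True)


sample = """\
1111111111111111111111111111111111111111 blob 276088476
2222222222222222222222222222222222222222 commit 245
3333333333333333333333333333333333333333 blob 1200
4444444444444444444444444444444444444444 blob 12179517
"""
for size, object_id in big_blobs(sample, 10 * 1024 * 1024):
    print(size, object_id)
```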

bfg just outputs usage instructions

I have no idea how I am supposed to use this tool, since all it does is output usage instructions without specifying which parameters are wrong.

$ ls -la
total 272
drwxr-xr-x  9 gadelat gadelat   4096 Jun 12 23:46 .
drwxr-xr-x 10 gadelat gadelat   4096 Jun 12 23:42 ..
drwxr-xr-x  8 gadelat gadelat   4096 Jun 12 23:10 .git
-rw-r--r--  1 gadelat gadelat    572 Apr 23 12:29 ubytko.sublime-project
-rw-r--r--  1 gadelat gadelat  51193 Jun 12 22:38 ubytko.sublime-workspace
...

<repo> is optional, right?

$ bfg --delete-files ubytko.sublime-*
bfg 1.11.2
Usage: bfg [options] [<repo>]
...

Okay maybe it's not

$ bfg --delete-files ubytko.sublime-* .git
bfg 1.11.2
Usage: bfg [options] [<repo>]
...

Maybe it needs to be executed in git's server folder?

$ cd /home/git/
$ ls
ubytko.git
$ bfg --delete-files ubytko.sublime-* ubytko.git
bfg 1.11.2
Usage: bfg [options] [<repo>]
...

Maybe it needs quotes for glob?

$ bfg --delete-files 'ubytko.sublime-*' .git
bfg 1.11.2
Usage: bfg [options] [<repo>]
...

Just some examples, nothing works.

IOException: no such file or directory

Hello,
I've tried to use your BFG Repo cleaner on my git repository at OpenShift, to remove big .ear files. However, after running

$ git gc --prune=now --aggressive

and then

$ java -jar /tmp/bfg-1.11.0.jar --strip-blobs-bigger-than 10M protection.git/

unfortunately the application crashes with this stacktrace: http://pastebin.com/s41DmEm5

Am I missing something here or is this a bug in BFG? Anyway, thanks for this great application.

The delete-folders option hits an assertion

I just tried the new --delete-folders option while trying to salvage a hg -> git migration that is causing problems due to .git references buried deep in the commit history. Unfortunately I keep hitting an exception.

Any suggestions?

$ java -jar /Users/matej/Downloads/bfg-1.6.0.jar --delete-folders .git --no-blob-protection

Using repo : /Users/matej/Documents/Work/app

Found 0 objects to protect
Exception in thread "main" java.lang.AssertionError: assertion failed: Can't find any refs in repo at /Users/matej/Documents/Work/app
    at scala.Predef$.assert(Predef.scala:179)
    at com.madgag.git.bfg.cleaner.RepoRewriter$.rewrite(RepoRewriter.scala:67)
    at com.madgag.git.bfg.cli.Main$$anonfun$1.apply(Main.scala:57)
    at com.madgag.git.bfg.cli.Main$$anonfun$1.apply(Main.scala:35)
    at scala.Option.map(Option.scala:145)
    at com.madgag.git.bfg.cli.Main$delayedInit$body.apply(Main.scala:34)
    at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
    at scala.App$$anonfun$main$1.apply(App.scala:71)
    at scala.App$$anonfun$main$1.apply(App.scala:71)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
    at scala.App$class.main(App.scala:71)
    at com.madgag.git.bfg.cli.Main$.main(Main.scala:27)
    at com.madgag.git.bfg.cli.Main.main(Main.scala)

Reduce size of large commit messages

@chenri mentions in #20 that his repo contains a very large commit message:

due to someone repeating a short line a huge number of times (editing error?) so that a single msg line was 6Mb long. We never detected this until we tried bfg last week.

It would be nice if The BFG had the ability to somehow reduce the size of large commit messages - but how exactly would it do it? Some options:

  • Simply truncate the entire message after the first X KB?
  • Allow user to run a --replace-message-text option, similar to --replace-text? This would work for a repeated value on a single line, so long as the line wasn't too long, but how would it work on a commit message with 1 million lines, each distinct?
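
The first option could look roughly like the Python sketch below. It is an illustration of the idea, not BFG code: the message is cut at a byte budget without splitting a multi-byte UTF-8 character, and a marker is appended so readers can tell the text was shortened. The function name and the marker text are hypothetical.

```python
def truncate_message(message, max_bytes, marker="\n[truncated]"):
    """Cut a commit message to at most max_bytes of UTF-8, then append a marker."""
    raw = message.encode("utf-8")
    if len(raw) <= max_bytes:
        return message  # already within budget: leave untouched
    # errors="ignore" drops a multi-byte character cut in half at the boundary
    kept = raw[:max_bytes].decode("utf-8", errors="ignore")
    return kept + marker


long_message = "Fix build\n" + "oops " * 1000
print(truncate_message(long_message, 40))
```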

Ability to remove directories

I was wondering if bfg-repo-cleaner had the ability to remove entire directories of files. I noticed the -D option, but it specifically says it doesn't work with paths, so I don't think it will work. Would it be possible to add an option to give one or more directories?

Ability to suppress commit messages when pushing?

Would this be technically possible? (I suspect not).

The problem is that we have GitHub hooks that message the team on Slack and Trello, so when deleting a file and pushing the repo, we get all the hundreds of commits replayed at once, spamming all communication channels :)

Protecting all branches

I'd like to protect all branches of my repository without having to list them all. Is it possible to do something like pattern matching for branches (-p '*')?
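
As a stop-gap, the list could be generated rather than typed. The Python sketch below assumes that -p accepts a comma-separated list of ref names, and that the branch names have been obtained separately, for example from `git for-each-ref --format='%(refname:short)' refs/heads`. `protect_argument` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def protect_argument(branches, pattern="*"):
    """Join the branch names matching a glob pattern into one comma-separated value."""
    # fnmatchcase does not treat "/" specially, so "*" also matches "feature/x"
    return ",".join(name for name in branches if fnmatchcase(name, pattern))


branches = ["master", "feature/worker-prototype", "release-1.0"]
print(protect_argument(branches))
print(protect_argument(branches, "feature/*"))
```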

[rejected] failed to push to some refs

I followed the instructions correctly but then when I go to push I get this error. The repo was up to date beforehand.

! [rejected] master -> master (fetch first)
error: failed to push some refs to ''
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first integrate the remote changes
hint: (e.g., 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

Removed passwords added back into last commit.

I followed the steps to remove passwords using bfg-repo-cleaner. I then did a git log -p and saw the passwords added back into the files from which they were removed and those files amended to my last commit.

$ git log -p
...
diff --git a/flower_celery.sh b/flower_celery.sh
index bc681eb..c99b5cd 100755
--- a/flower_celery.sh
+++ b/flower_celery.sh
@@ -1,3 +1,3 @@
 #!/bin/bash
 # Start flower
-celery flower --address=127.0.0.1 --port=5555 --broker=_REMOVED_
+celery flower --address=127.0.0.1 --port=5555 --broker=amqp://

flower_celery.sh was not part of my last commit.

BFG appears to have fixed my repo but something is still wrong

I'm having some trouble with BFG that I'd appreciate some help with. I thought I had everything working right, but apparently I didn't. BFG appears to run fine on the mirror of my repo, removing a huge extraneous file that someone committed (and has since removed) and the local repo is now much smaller. But after doing the last step from the procedure in http://rtyley.github.io/bfg-repo-cleaner, the remote repo is still big (still has the oversize blob that I don't want). And I'm not sure what the problem could be. So any help would be appreciated.

Here's exactly what I did:

[05:46:13 tobradle]$  git clone --mirror https://todd.bradley%[email protected]/opc/s/opc_nir
Initialized empty Git repository in /home/tobradle/git/opc_nirvana.git/
Password:
remote: Counting objects: 5157, done
remote: Finding sources: 100% (5157/5157)
remote: Getting sizes: 100% (2764/2764)
remote: Compressing objects: 100% (2756/2756)
remote: Total 5157 (delta 2143), reused 4988 (delta 2088)
Receiving objects: 100% (5157/5157), 321.45 MiB | 10.43 MiB/s, done.
Resolving deltas: 100% (2257/2257), done.
[~/git]
[05:47:16 tobradle]$ java -jar bfg-1.11.1.jar --strip-biggest-blobs 1 opc_nirvana.git

Using repo : /home/tobradle/git/opc_nirvana.git

Scanning packfile for large blobs: 5157
Scanning packfile for large blobs completed in 192 ms.
Found 1 blob ids for large blobs - biggest=276088476 smallest=276088476
Total size (unpacked)=276088476
Found 263 objects to protect
Found 4 commit-pointing refs : HEAD, refs/heads/NIR-6, refs/heads/axonobjctrl, refs/heads/master

Protected commits
-----------------

These are your protected commits, and so their contents will NOT be altered:

 * commit 180c9457 (protected by 'HEAD')

Cleaning
--------

Found 307 commits
Cleaning commits:       100% (307/307)
Cleaning commits completed in 1,465 ms.

Updating 3 Refs
---------------

        Ref                      Before     After
        --------------------------------------------
        refs/heads/NIR-6       | 9841b605 | 5c7c946c
        refs/heads/axonobjctrl | 98dc2e8e | eb670d73
        refs/heads/master      | 180c9457 | 93f1363f

Updating references:    100% (3/3)
...Ref update completed in 58 ms.

Commit Tree-Dirt History
------------------------

        Earliest                                              Latest
        |                                                          |
        ..................................................DDmmmmmmmm

        D = dirty commits (file tree fixed)
        m = modified commits (commit message or parents changed)
        . = clean commits (no changes to file tree)

                                Before     After
        -------------------------------------------
        First modified commit | 0c0d6325 | e83b4edc
        Last dirty commit     | 48226765 | 6407065f


In total, 55 object ids were changed - a record of these will be written to:

        /home/tobradle/git/opc_nirvana.git.bfg-report/2014-02-21T05-50/object-id-map.old-new.txt

BFG run is complete!
[~/git]
[05:50:29 tobradle]$ cd opc_nirvana.git
[~/git/opc_nirvana.git]
[05:51:54 tobradle]$ git reflog expire --expire=now --all
[~/git/opc_nirvana.git]
[05:52:06 tobradle]$ git gc --prune=now --aggressive
Counting objects: 5157, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (3735/3735), done.
Writing objects: 100% (5157/5157), done.
Total 5157 (delta 2154), reused 0 (delta 0)
[~/git/opc_nirvana.git]
[05:52:20 tobradle]$ du
23      ./hooks
4       ./info
60346   ./objects/pack
3       ./objects/info
60358   ./objects
2       ./refs/tags
2       ./refs/heads
5       ./refs
2       ./branches
60396   .
[~/git/opc_nirvana.git]
[05:52:29 tobradle]$ git push
Password:
Counting objects: 5157, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (1581/1581), done.
Writing objects: 100% (5157/5157), 58.84 MiB | 13.40 MiB/s, done.
Total 5157 (delta 2154), reused 5157 (delta 2154)

remote: Resolving deltas: 100% (2154/2154)
remote: Updating references: 100% (3/3)

The push command says it wrote 58.84 MiB, which sounds right given the output from du (60396) above. But when I go to make a new clone from this, the new clone is 389 MiB, which seems to indicate the large blob that BFG removed is still there.
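
A quick check of the figures in the transcript, as a Python sketch. It assumes du reported 1 KiB blocks (the usual default on Linux), and it uses the unpacked blob size printed by the BFG, which is only an upper bound on what the blob occupies inside a pack. It shows that the cleaned repository and the removed blob roughly add up to the original 321.45 MiB clone; it does not explain the 389 MiB figure.

```python
MIB = 1024 * 1024


def bytes_to_mib(n_bytes):
    return n_bytes / MIB


def kib_to_mib(n_kib):
    return n_kib * 1024 / MIB


blob_mib = bytes_to_mib(276088476)  # "biggest=276088476" from the BFG output
cleaned_mib = kib_to_mib(60396)     # total from du, assumed to be in KiB
print(f"removed blob (unpacked): {blob_mib:.1f} MiB")
print(f"cleaned repo on disk:    {cleaned_mib:.1f} MiB")
print(f"sum:                     {blob_mib + cleaned_mib:.1f} MiB")
```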

Any suggestions?

(I don't see how I can label issues, but this is a question, not a bug, I'm pretty sure - speaking of which, if there is a better way to get help using BFG, please point me in the right direction)

Git push has some rejections

Hi,

I recently used BFG to remove some old files and change a few texts in my test files. The clean operations went all OK without a problem. When I tried to push the changes I had the following error message at the end:

{code}
error: failed to push some refs to '<MY_REPOSITORY>'
{code}

These are some of the rejected messages:

{code}
! [remote rejected] refs/pull/1/head -> refs/pull/1/head (deny updating a hidden ref)
! [remote rejected] refs/pull/1/merge -> refs/pull/1/merge (deny updating a hidden ref)
! [remote rejected] refs/pull/10/head -> refs/pull/10/head (deny updating a hidden ref)
! [remote rejected] refs/pull/10/merge -> refs/pull/10/merge (deny updating a hidden ref)
! [remote rejected] refs/pull/11/head -> refs/pull/11/head (deny updating a hidden ref)
{code}

Am I in trouble? Did I just mess up my repository? Is there any way to correct this error?

I would really appreciate any help, thanks.

Jgit MissingObjectException when deleting files

Hello Roberto,
I have an issue when trying the following command line

java -jar ../../talend-svn-git-migration/bfg-1.11.2.jar --delete-files '{*.jar,*.zip,*.war,*.jpg,*.png,*.jpeg}'

I have run bfg on it before and done the necessary cleaning that you mention on your web site:
$ git reflog expire --expire=now --all
$ git gc --prune=now --aggressive

but I get the following error :

This repo has been processed by The BFG before! Will prune repo before proceeding - to avoid unnecessary cleaning work on unused objects...
Exception in thread "main" org.eclipse.jgit.api.errors.JGitInternalException: Garbage collection failed.
        at org.eclipse.jgit.api.GarbageCollectCommand.call(GarbageCollectCommand.java:126)
        at com.madgag.git.bfg.cli.Main$$anonfun$1.apply(Main.scala:49)
        at com.madgag.git.bfg.cli.Main$$anonfun$1.apply(Main.scala:34)
        at scala.Option.map(Option.scala:145)
        at com.madgag.git.bfg.cli.Main$delayedInit$body.apply(Main.scala:33)
        at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
        at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
        at scala.App$$anonfun$main$1.apply(App.scala:71)
        at scala.App$$anonfun$main$1.apply(App.scala:71)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
        at scala.App$class.main(App.scala:71)
        at com.madgag.git.bfg.cli.Main$.main(Main.scala:27)
        at com.madgag.git.bfg.cli.Main.main(Main.scala)
Caused by: org.eclipse.jgit.errors.MissingObjectException: Missing unknown ad8d81b862528c5cacf247e2dc4d71e4ef1a3cb8
        at org.eclipse.jgit.internal.storage.file.WindowCursor.open(WindowCursor.java:148)
        at org.eclipse.jgit.lib.ObjectReader$1.open(ObjectReader.java:302)
        at org.eclipse.jgit.revwalk.RevWalk$2.next(RevWalk.java:921)
        at org.eclipse.jgit.internal.storage.pack.PackWriter.findObjectsToPack(PackWriter.java:1698)
        at org.eclipse.jgit.internal.storage.pack.PackWriter.preparePack(PackWriter.java:797)
        at org.eclipse.jgit.internal.storage.pack.PackWriter.preparePack(PackWriter.java:760)
        at org.eclipse.jgit.internal.storage.file.GC.writePack(GC.java:675)
        at org.eclipse.jgit.internal.storage.file.GC.repack(GC.java:531)
        at org.eclipse.jgit.internal.storage.file.GC.gc(GC.java:164)
        at org.eclipse.jgit.api.GarbageCollectCommand.call(GarbageCollectCommand.java:123)
        ... 13 more

MissingObjectException while listing dirty files

After running the following commands:

git clone http://git.chromium.org/chromium/src.git
java -jar ~/Downloads/bfg-1.9.0.jar --delete-files "*.{la,a,52,50,crx,xib,png,pdf,jpg,zip,jar,pdb,psd,jpeg,dylib,dll,DLL,exe,EXE,vcproj,ncb,so,sln,scons,nib,graffle,yuv,webm}" --delete-folders "{apple_webkit,bzip2,cygwin,cygwin_src,ceee,gnu,hyphen,o3d,findbugs,gles2_book,gles_book_examples,glew,gears,gnu,hunspell,boost,python_24,lighttpd,libunwind,layout_test_results,reference_build,scons,svn,initial,old,filter,profile_with_complex_theme,profile_with_default_theme,complex_theme,gtk_theme,custom_frame,custom_frame_gtk_theme,complex_theme,typical_history,layout_tests,MesaLib,mozilla}" .  2>&1 | tee bfg.log

I get the following stack trace:

Protected commits
-----------------

These are your latest commits, and so their contents will NOT be altered:

 * commit 92f06e9d (protected by 'HEAD') - contains 5539 dirty files : 
Exception in thread "main" org.eclipse.jgit.errors.MissingObjectException: Missing unknown 216cf5b1b3e97e285e0ee361f54e8f7a30c6af00
    at org.eclipse.jgit.internal.storage.file.WindowCursor.open(WindowCursor.java:148)
    at org.eclipse.jgit.lib.ObjectReader.open(ObjectReader.java:229)
    at com.madgag.git.package$RichObjectId.open(package.scala:127)
    at com.madgag.git.bfg.cleaner.protection.Reporter$.com$madgag$git$bfg$cleaner$protection$Reporter$$fileInfo$1(Reporter.scala:34)
    at com.madgag.git.bfg.cleaner.protection.Reporter$$anonfun$reportProtectedCommitsAndTheirDirt$1$$anonfun$apply$1.apply(Reporter.scala:48)
    at com.madgag.git.bfg.cleaner.protection.Reporter$$anonfun$reportProtectedCommitsAndTheirDirt$1$$anonfun$apply$1.apply(Reporter.scala:48)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.AbstractTraversable.map(Traversable.scala:105)
    at com.madgag.git.bfg.cleaner.protection.Reporter$$anonfun$reportProtectedCommitsAndTheirDirt$1.apply(Reporter.scala:48)
    at com.madgag.git.bfg.cleaner.protection.Reporter$$anonfun$reportProtectedCommitsAndTheirDirt$1.apply(Reporter.scala:37)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at com.madgag.git.bfg.cleaner.protection.Reporter$.reportProtectedCommitsAndTheirDirt(Reporter.scala:36)
    at com.madgag.git.bfg.cleaner.CLIReporter.reportObjectProtection(Reporter.scala:73)
    at com.madgag.git.bfg.cleaner.RepoRewriter$.rewrite(RepoRewriter.scala:96)
    at com.madgag.git.bfg.cli.Main$$anonfun$1.apply(Main.scala:59)
    at com.madgag.git.bfg.cli.Main$$anonfun$1.apply(Main.scala:34)
    at scala.Option.map(Option.scala:145)
    at com.madgag.git.bfg.cli.Main$delayedInit$body.apply(Main.scala:33)
    at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
    at scala.App$$anonfun$main$1.apply(App.scala:71)
    at scala.App$$anonfun$main$1.apply(App.scala:71)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
    at scala.App$class.main(App.scala:71)
    at com.madgag.git.bfg.cli.Main$.main(Main.scala:27)
    at com.madgag.git.bfg.cli.Main.main(Main.scala)

(This hasn't previously happened; it doesn't happen when I pull the "reference_build" directory out of the list of delete-folders.)

NullPointerException when running bfg -b 10M

Scanning packfile for large blobs: 1888
Scanning packfile for large blobs completed in 26 ms.
Found 25 blob ids for large blobs - biggest=33243704 smallest=12179517
Total size (unpacked)=652775371
Exception in thread "main" java.lang.NullPointerException
at org.eclipse.jgit.lib.ObjectIdOwnerMap.get(ObjectIdOwnerMap.java:131)
at org.eclipse.jgit.revwalk.RevWalk.parseAny(RevWalk.java:807)
at com.madgag.git.package$RichObjectId.asRevObject(package.scala:194)
at com.madgag.git.bfg.cleaner.protection.ProtectedObjectCensus$$anonfun$3.apply(ProtectedObjectCensus.scala:66)
at com.madgag.git.bfg.cleaner.protection.ProtectedObjectCensus$$anonfun$3.apply(ProtectedObjectCensus.scala:66)
at scala.collection.TraversableLike$$anonfun$groupBy$1.apply(TraversableLike.scala:332)
at scala.collection.TraversableLike$$anonfun$groupBy$1.apply(TraversableLike.scala:331)
at scala.collection.immutable.Set$Set1.foreach(Set.scala:83)
at scala.collection.TraversableLike$class.groupBy(TraversableLike.scala:331)
at scala.collection.AbstractTraversable.groupBy(Traversable.scala:104)
at com.madgag.git.bfg.cleaner.protection.ProtectedObjectCensus$.apply(ProtectedObjectCensus.scala:66)
at com.madgag.git.bfg.cli.CLIConfig.objectProtection$lzycompute(CLIConfig.scala:145)
at com.madgag.git.bfg.cli.CLIConfig.objectProtection(CLIConfig.scala:145)
at com.madgag.git.bfg.cli.Main$$anonfun$1.apply(Main.scala:57)
at com.madgag.git.bfg.cli.Main$$anonfun$1.apply(Main.scala:34)
at scala.Option.map(Option.scala:145)
at com.madgag.git.bfg.cli.Main$.delayedEndpoint$com$madgag$git$bfg$cli$Main$1(Main.scala:33)
at com.madgag.git.bfg.cli.Main$delayedInit$body.apply(Main.scala:27)
at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:383)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
at scala.App$class.main(App.scala:76)
at com.madgag.git.bfg.cli.Main$.main(Main.scala:27)
at com.madgag.git.bfg.cli.Main.main(Main.scala)

Feature request/idea: dry-run mode

To preview which files would be removed with -b, I first used a Perl script to list the files that would be deleted. However, it would be practical if BFG could be run in dry-run mode to show what the output would be, without actually making any changes to the repo.

Of course, it's also easy to just make another clone and do the test run there first. But if a dry run is easy to implement, why not?
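
For anyone wanting a rough preview in the meantime, here is a minimal sketch of the kind of external script mentioned above. It is not BFG functionality and it ignores BFG's protection of current commits. It assumes you have captured the output of `git rev-list --objects --all | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)'`; the function name `blobs_bigger_than` and the sample object ids are made up for illustration.

```python
def blobs_bigger_than(batch_check_lines, threshold_bytes):
    """Return (size, object_id, path) for every blob larger than threshold_bytes.

    Each input line is expected to look like:
        <type> <object id> <size in bytes> <path>
    """
    hits = []
    for line in batch_check_lines:
        # Split at most three times so paths containing spaces stay intact.
        parts = line.rstrip("\n").split(" ", 3)
        if len(parts) < 3 or parts[0] != "blob":
            continue  # skip commits, trees, tags and malformed lines
        object_id, size = parts[1], int(parts[2])
        path = parts[3] if len(parts) > 3 else ""
        if size > threshold_bytes:
            hits.append((size, object_id, path))
    return sorted(hits, reverse=True)  # biggest first


# Hypothetical sample input; real input would come from the git pipeline above.
sample = [
    "commit 1111111111111111111111111111111111111111 245 ",
    "blob 2222222222222222222222222222222222222222 33243704 assets/video.webm",
    "blob 3333333333333333333333333333333333333333 512 README",
]
for size, object_id, path in blobs_bigger_than(sample, 10 * 1024 * 1024):
    print(size, object_id[:8], path)
```

The threshold is given in plain bytes here; whether that matches exactly how the BFG interprets a value such as 10M is an assumption.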

Crash if run on the same repo twice (without an intervening 'git gc')

Due to object-id substitution and BFG's functional-recursion, running the BFG twice on the same repo (without cleaning out the dead objects using git-gc) will blow the stack.

  • Run a JGit gc prune first?
  • fix the blasted recursion?
  • Attempt to recognise 'formerly' commit message editions? - not a good idea
  • Tell the user not to do that
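
On the "fix the blasted recursion" option: one general technique is to replace call-stack recursion over the commit graph with an explicit work stack, so that the depth of history no longer limits the run. The sketch below illustrates that technique only; it is not BFG's code, and the `parents` mapping and `clean_one` callback are hypothetical stand-ins for the real commit model.

```python
def clean_all(tip, parents, clean_one):
    """Compute a cleaned id for `tip` and all its ancestors without recursion.

    parents:   dict mapping commit id -> list of parent ids (hypothetical model)
    clean_one: function(commit_id, cleaned_parent_ids) -> cleaned id
    """
    memo = {}
    stack = [tip]
    while stack:
        commit = stack[-1]
        if commit in memo:
            stack.pop()  # already cleaned via another path (e.g. a merge)
            continue
        pending = [p for p in parents[commit] if p not in memo]
        if pending:
            stack.extend(pending)  # clean the parents first, then revisit
            continue
        memo[commit] = clean_one(commit, [memo[p] for p in parents[commit]])
        stack.pop()
    return memo


# A linear history far deeper than Python's default recursion limit.
depth = 50_000
parents = {0: []}
parents.update({i: [i - 1] for i in range(1, depth)})
cleaned = clean_all(depth - 1, parents, lambda commit, cleaned_parents: ("clean", commit))
print(len(cleaned))
```

The memo doubles as the "already cleaned" cache, so a commit reachable through several children is still processed only once.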

StackOverFlow error after pruning.

After running bfg against the chromium source tree, I get the following output:

Updating 4 Refs
---------------

Ref                           Before     After   
-------------------------------------------------
refs/heads/master           | 92f06e9d | 5528900c
refs/remotes/origin/git-svn | 3f6fca23 | 8064c07d
refs/remotes/origin/lkgr    | 6f9f75f6 | ba2c28d2
refs/remotes/origin/master  | 92f06e9d | 5528900c

Exception in thread "main" java.lang.StackOverflowError
    at org.eclipse.jgit.revwalk.MergeBaseGenerator.carryOntoOne(MergeBaseGenerator.java:205)
    at org.eclipse.jgit.revwalk.MergeBaseGenerator.carryOntoHistory(MergeBaseGenerator.java:194)
    at org.eclipse.jgit.revwalk.MergeBaseGenerator.carryOntoHistory(MergeBaseGenerator.java:195)

(The same line repeats for the rest of the output.)

Both 92f06e9d and 5528900c appear to be valid refs, but the current master branch is still set at 92f06e9d at the time of the fatal exception. Not sure how to proceed with debugging.

(Context: Chromium's git repository is giant; it's about 2.1 GB, of which more than half is binary data. I maintain scripts based on git-filter-branch to prune the binary crud in our theoretically forthcoming move from svn to git, but they take days to run, so I'm seeing what else is available.)

Add ability to easily truncate a repository

Enhancement request: I have a need to truncate a repository before a specific commit. I have been looking around and there are a number of different and non-trivial ways to do this. It would be nice if BFG could provide a way to do this.

I would imagine the call would look something like:

bfg --truncate abf26bb

where the SHA hash identifies the commit that becomes the new root (expecting, of course, that its actual hash will most likely change after the rewrite).

NPE during cleanup of big blobs

Hi Roberto,
When removing blobs bigger than 1MB from my repo I get the following NPE.
I have not looked at your code, so I am not sure whether it comes from JGit or BFG.
The command line that produced this exception is at the bottom.

 [echo] Exception in thread "main" java.lang.NullPointerException
 [echo]     at org.eclipse.jgit.internal.storage.file.UnpackedObjectCache$Table.index(UnpackedObjectCache.java:146)
 [echo]     at org.eclipse.jgit.internal.storage.file.UnpackedObjectCache$Table.contains(UnpackedObjectCache.java:109)
 [echo]     at org.eclipse.jgit.internal.storage.file.UnpackedObjectCache.isUnpacked(UnpackedObjectCache.java:64)
 [echo]     at org.eclipse.jgit.internal.storage.file.ObjectDirectory.openObject1(ObjectDirectory.java:359)
 [echo]     at org.eclipse.jgit.internal.storage.file.FileObjectDatabase.openObjectImpl1(FileObjectDatabase.java:173)
 [echo]     at org.eclipse.jgit.internal.storage.file.FileObjectDatabase.openObject(FileObjectDatabase.java:158)
 [echo]     at org.eclipse.jgit.internal.storage.file.WindowCursor.open(WindowCursor.java:145)
 [echo]     at org.eclipse.jgit.treewalk.CanonicalTreeParser.reset(CanonicalTreeParser.java:201)
 [echo]     at org.eclipse.jgit.treewalk.TreeWalk.parserFor(TreeWalk.java:984)
 [echo]     at org.eclipse.jgit.treewalk.TreeWalk.addTree(TreeWalk.java:468)
 [echo]     at com.madgag.git.package$RichRevTree.walk(package.scala:68)
 [echo]     at com.madgag.git.package$.allBlobsUnder(package.scala:215)
 [echo]     at com.madgag.git.bfg.cleaner.protection.ProtectedObjectCensus$$anonfun$6.apply(ProtectedObjectCensus.scala:79)
 [echo]     at com.madgag.git.bfg.cleaner.protection.ProtectedObjectCensus$$anonfun$6.apply(ProtectedObjectCensus.scala:79)
 [echo]     at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
 [echo]     at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
 [echo]     at scala.collection.Iterator$class.foreach(Iterator.scala:727)
 [echo]     at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
 [echo]     at scala.collection.MapLike$DefaultKeySet.foreach(MapLike.scala:174)
 [echo]     at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)
 [echo]     at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105)
 [echo]     at com.madgag.git.bfg.cleaner.protection.ProtectedObjectCensus$.apply(ProtectedObjectCensus.scala:79)
 [echo]     at com.madgag.git.bfg.cli.CLIConfig.objectProtection$lzycompute(CLIConfig.scala:143)
 [echo]     at com.madgag.git.bfg.cli.CLIConfig.objectProtection(CLIConfig.scala:143)
 [echo]     at com.madgag.git.bfg.cli.Main$$anonfun$1.apply(Main.scala:57)
 [echo]     at com.madgag.git.bfg.cli.Main$$anonfun$1.apply(Main.scala:34)
 [echo]     at scala.Option.map(Option.scala:145)
 [echo]     at com.madgag.git.bfg.cli.Main$delayedInit$body.apply(Main.scala:33)
 [echo]     at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
 [echo]     at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
 [echo]     at scala.App$$anonfun$main$1.apply(App.scala:71)
 [echo]     at scala.App$$anonfun$main$1.apply(App.scala:71)
 [echo]     at scala.collection.immutable.List.foreach(List.scala:318)
 [echo]     at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
 [echo]     at scala.App$class.main(App.scala:71)
 [echo]     at com.madgag.git.bfg.cli.Main$.main(Main.scala:27)
 [echo]     at com.madgag.git.bfg.cli.Main.main(Main.scala)

Here is the command line used to launch BFG:

java -jar /home/talend/talend-svn-git-migration/bfg-1.11.1.jar --strip-blobs-bigger-than 1M --protect-blobs-from branch-5_0,branch-5_1,branch-5_2,branch-5_3,branch-5_4,master,release-5_0_0,release-5_0_1,release-5_0_2,release-5_0_3,release-5_1_0,release-5_1_1,release-5_1_2,release-5_1_3,release-5_2_0,release-5_2_1,release-5_2_2,release-5_2_3,release-5_3_0,release-5_3_1,release-5_3_2,release-5_4_0,release-5_4_1

Still clean commit message even if commit tree & parent commit ids are clean

So unfortunately I think Object Id Substitution makes this necessary.

Scenario : A repo with two branches operating side-by-side. One branch is completely clean (tree and parent-commit wise), but the other is dirty. On the 'clean' branch, a human writes a commit message referencing a commit from the dirty branch. This commit id must be updated, even though it is contained in the text of an ostensibly 'clean' commit on a clean branch.
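
To make the scenario concrete, here is a minimal sketch of rewriting commit ids that appear in message text. It is an illustration only, not BFG's implementation: the `old_to_new` mapping, the sample ids, the hex-id regex and the rule for abbreviated ids are all assumptions.

```python
import re

# Anything that looks like a full or abbreviated hex object id.
HEX_ID = re.compile(r"\b[0-9a-f]{7,40}\b")


def substitute_ids(message, old_to_new):
    """Replace any (possibly abbreviated) old commit id found in message."""
    def swap(match):
        found = match.group(0)
        candidates = [new for old, new in old_to_new.items() if old.startswith(found)]
        # Only rewrite when the abbreviation maps to exactly one old id.
        if len(candidates) == 1:
            return candidates[0][: len(found)]
        return found
    return HEX_ID.sub(swap, message)


# Hypothetical mapping: a commit on the dirty branch changed id when cleaned.
old_to_new = {"1a2b3c4d": "9f8e7d6c"}
message = "Port the fix from 1a2b3c4d on the other branch"
print(substitute_ids(message, old_to_new))
```

Because the message belongs to a commit whose tree and parents are untouched, the only way to find such references is to inspect the message of every commit, clean or not.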

Warn user if protected tips evaluate as dirty

(was "Explain presumption that current tip is sacred (protected)")

Names of protected tips should be printed at start of run.

If protected tips are dirty, it's very confusing for the user: they don't know why (some) stuff persists.

Possibly the run should be blocked if protected tips are dirty.

Ideally the BFG would detect that the current tips are dirty and explain what to do about it (present a diff?!) - @ntoll was not aware that he needed to clean his current tips before proceeding.

push denied after deleting files with bfg

After deleting select files from my repo with bfg, I'm denied when using git push.

The commands that I used (repo name changed to protect the innocent):

git clone --mirror /path/to/my/repo.git
bfg -D file1 repo.git
bfg -D file2 repo.git
bfg -D file3 repo.git
cd repo.git
git reflog expire --expire=now --all
git gc --prune=now --aggressive
git push

The output from 'git push'

Counting objects: 857, done.
Delta compression using up to 40 threads.
Compressing objects: 100% (280/280), done.
Writing objects: 100% (817/817), 6.25 MiB, done.
Total 817 (delta 513), reused 803 (delta 503)
remote: error: denying non-fast-forward refs/heads/develop (you should pull first)
remote: error: denying non-fast-forward refs/heads/feature/align_breakpoints (you should pull first)
remote: error: denying non-fast-forward refs/heads/master (you should pull first)
To /var/git/sharchaea/genome_align/
! [remote rejected] develop -> develop (non-fast-forward)
! [remote rejected] feature/align_breakpoints -> feature/align_breakpoints (non-fast-forward)
! [remote rejected] master -> master (non-fast-forward)
error: failed to push some refs to '/var/git/sharchaea/genome_align/'

Any idea what's going on?
Thanks

Getting a crash when running

Here is the command I ran:

java -jar ~/Downloads/bfg-1.11.0.jar --delete-folders .git --no-blob-protection .

Protected commits

You're not protecting any commits, which means the BFG will modify the contents of even current commits.

This isn't recommended - ideally, if your current commits are dirty, you should fix up your working copy and commit that, check that your build still works, and only then run the BFG to clean up your history.

Cleaning

Found 9099 commits
Cleaning commits: 94% (8554/9099)Exception in thread "main" com.google.common.util.concurrent.ExecutionError: com.google.common.util.concurrent.ExecutionError: com.google.common.util.concurrent.ExecutionError: com.google.common.util.concurrent.ExecutionError: com.google.common.util.conc

com.google.common.util.concurrent.ExecutionError: com.google.common.util.concurrent.ExecutionError: com.google.common.util.concurrent.ExecutionError: java.lang.StackOverflowError
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2261)
at com.google.common.cache.LocalCache.get(LocalCache.java:4000)
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4004)
at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
at com.madgag.git.bfg.MemoUtil$$anonfun$concurrentCleanerMemo$1$$anon$1.apply(memo.scala:56)
at com.madgag.git.bfg.cleaner.ObjectIdCleaner.apply(ObjectIdCleaner.scala:65)
at com.madgag.git.bfg.cleaner.RepoRewriter$$anonfun$clean$1$1$$anonfun$apply$mcV$sp$3.apply(RepoRewriter.scala:112)
at com.madgag.git.bfg.cleaner.RepoRewriter$$anonfun$clean$1$1$$anonfun$apply$mcV$sp$3.apply(RepoRewriter.scala:111)
at scala.collection.immutable.List.foreach(List.scala:318)
at com.madgag.git.bfg.cleaner.RepoRewriter$$anonfun$clean$1$1.apply$mcV$sp(RepoRewriter.scala:110)
at com.madgag.git.bfg.cleaner.RepoRewriter$$anonfun$clean$1$1.apply(RepoRewriter.scala:103)
at com.madgag.git.bfg.cleaner.RepoRewriter$$anonfun$clean$1$1.apply(RepoRewriter.scala:103)
at com.madgag.git.bfg.Timing$.measureTask(timing.scala:39)
at com.madgag.git.bfg.cleaner.RepoRewriter$.clean$1(RepoRewriter.scala:103)
at com.madgag.git.bfg.cleaner.RepoRewriter$.rewrite(RepoRewriter.scala:144)
at com.madgag.git.bfg.cli.Main$$anonfun$1.apply(Main.scala:59)
at com.madgag.git.bfg.cli.Main$$anonfun$1.apply(Main.scala:34)
at scala.Option.map(Option.scala:145)
at com.madgag.git.bfg.cli.Main$delayedInit$body.apply(Main.scala:33)
at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:71)
at scala.App$$anonfun$main$1.apply(App.scala:71)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
at scala.App$class.main(App.scala:71)
at com.madgag.git.bfg.cli.Main$.main(Main.scala:27)
at com.madgag.git.bfg.cli.Main.main(Main.scala)

strip-blobs-with-ids and .REMOVED.git-id

Running BFG with strip-blobs-with-ids does not seem to produce the "*.REMOVED.git-id" placeholder files. Apparently this is a feature (see #11 (comment)).

Would it be possible to have BFG produce the placeholder files even when using strip-blobs-with-ids?

Some background. I do not know what the official purpose of these placeholder files is but I have found them useful in three ways:

  1. In case a build of an old version fails due to an accidentally removed historical file, I can immediately see which file is missing and why.
  2. After running BFG, I can quickly verify that a given historical version (i.e. tag) includes all the relevant files (do a checkout followed by find . -name "*REMOVED*", for example).
  3. I may even list every file that ever existed in the repo and was consequently removed by BFG (e.g. git rev-list HEAD | xargs -L 1 git ls-tree -r | grep REMOVED).
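
The third use above can also be scripted. The following is a rough sketch, assuming the lines come from `git ls-tree -r` (format `<mode> <type> <object>`, a tab, then the path) and that placeholders carry the `.REMOVED.git-id` suffix mentioned in this issue; the function name and the sample lines are hypothetical.

```python
SUFFIX = ".REMOVED.git-id"


def removed_placeholders(ls_tree_lines):
    """Return the original paths of files that were replaced by placeholders."""
    originals = set()
    for line in ls_tree_lines:
        # The path is everything after the first tab on the line.
        _meta, _tab, path = line.rstrip("\n").partition("\t")
        if path.endswith(SUFFIX):
            originals.add(path[: -len(SUFFIX)])
    return sorted(originals)


# Hypothetical sample of `git ls-tree -r` output.
sample = [
    "100644 blob aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\tlib/big.jar.REMOVED.git-id",
    "100644 blob bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb\tsrc/Main.scala",
    "100644 blob aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\tlib/big.jar.REMOVED.git-id",
]
print(removed_placeholders(sample))
```

Collecting into a set means a placeholder that appears in many commits is reported once.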
