Comments (3)

derrickstolee commented on May 13, 2024

I just realized that the following code in revision.c means a -L query will always have some minimum amount of slowness:

	if (revs->line_level_traverse) {
		revs->limited = 1;
		revs->topo_order = 1;
	}

And the code in line-log.c for int line_log_filter(struct rev_info *rev) expects the commit list to be complete before scanning the contents.
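To make the cost of that concrete, here is a toy sketch (not Git's code) that counts how many commits a walk visits before it can show its first result. It assumes a linear history stored in an array; the function name and the `touches` array are hypothetical, invented only for illustration.

```c
#include <stddef.h>

/*
 * Toy model of a linear history, newest commit first.
 * touches[i] is nonzero if commit i changed the path of interest.
 * Returns how many commits are visited before the first result
 * can be shown.
 */
static size_t commits_visited_before_first_output(const int *touches,
						   size_t n, int limited)
{
	if (limited)
		return n;	/* the whole list is built before any output */

	for (size_t i = 0; i < n; i++)
		if (touches[i])
			return i + 1;	/* streaming: stop at the first hit */

	return n;	/* no hit: the walk reaches the root either way */
}
```

In this model a limited walk always pays for the full history, while a streaming walk pays only up to the first matching commit.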

So, taking stock on this problem we have a few things to think about:

  1. How can Bloom filters optimally interact with -L?
  2. Can we make the algorithm iterative instead of needing a full commit walk?
  3. Can we remove the rename detection by default? It will change some results (when there is a rename on the given file) but it is probably worth not downloading all the contents of EVERY changed blob in the history!
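For the first question, the general idea can be sketched as follows: keep a small Bloom filter of changed paths per commit, and run the expensive diff only when the filter cannot rule the path out. This is a minimal sketch under stated assumptions, not Git's implementation. The fixed filter size, the seeded FNV-1a hash, and every name below are hypothetical; Git's real changed-path filters use their own hash and sizing.

```c
#include <stddef.h>
#include <stdint.h>

#define FILTER_BITS 256u	/* hypothetical fixed filter size */
#define NUM_HASHES 3u		/* hypothetical number of probes per path */

struct path_filter {
	uint8_t bits[FILTER_BITS / 8];
};

/* Seeded FNV-1a, chosen only to keep the sketch short. */
static uint32_t hash_path(const char *path, uint32_t seed)
{
	uint32_t h = 2166136261u ^ seed;

	for (; *path; path++) {
		h ^= (uint8_t)*path;
		h *= 16777619u;
	}
	return h;
}

/* Record that a commit changed `path`. */
static void filter_add(struct path_filter *f, const char *path)
{
	for (uint32_t i = 0; i < NUM_HASHES; i++) {
		uint32_t bit = hash_path(path, i) % FILTER_BITS;

		f->bits[bit / 8] |= (uint8_t)(1u << (bit % 8));
	}
}

/* 0: `path` was definitely not added; 1: it may have been (run the diff). */
static int filter_maybe_contains(const struct path_filter *f, const char *path)
{
	for (uint32_t i = 0; i < NUM_HASHES; i++) {
		uint32_t bit = hash_path(path, i) % FILTER_BITS;

		if (!(f->bits[bit / 8] & (1u << (bit % 8))))
			return 0;
	}
	return 1;
}

/* Count the commits whose filter cannot rule out `path`. */
static size_t commits_needing_diff(const struct path_filter *filters,
				   size_t n, const char *path)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		count += (size_t)filter_maybe_contains(&filters[i], path);
	return count;
}
```

A Bloom filter never gives a false negative, so skipping a commit on a 0 answer is safe. A 1 answer can be a false positive, so the real diff is still needed to confirm the change. For a path touched only a handful of times in millions of commits, nearly every commit could be skipped this way.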

derrickstolee commented on May 13, 2024

Making progress here! See gitgitgadget#622 for more details.

derrickstolee commented on May 13, 2024

Some helpful progress is being made on-list: https://public-inbox.org/git/[email protected]/T/#m79ee9ae1d2696dc4c57f0d409d72949403ab84dc

Here are the results using a random path I picked out from the Windows
repo (it was only changed ~10 times in the 4.5 million commits):

Before:

real    2m7.308s
real    2m8.572s

With Patch 4:

real    0m38.628s
real    0m38.477s

With Patch 5:

real    0m24.685s
real    0m24.310s

For the specific file in the bug report from a real user, I got
these numbers:

real    0m32.293s (patch 4)
real    0m19.362s (patch 5)

When running without the patch, I had to kill the process after 55 minutes of waiting (and 20,000+ blob downloads). It appears that this somehow triggers rename detection, so the blob contents are being checked! A PerfView trace shows the following stack as the interesting one:

line_log_filter
+ queue_diffs
  + diffcore_std
    + diffcore_rename
      + diff_populate_filespec

The changes on-list involve not forcing the entire graph to be read, so those changes are orthogonal to #175.
