Comments (10)
Thanks @mookid, I tested it in dtolnay/trybuild#27 against some projects and it is definitely a lot better. I think it is still possible to do better. In my example above, diff -u <(echo '^^^^^^^^^^') <(echo '^^^^^^^^^^^^^^^'), I believe most people would perceive this as characters added to the end of the line rather than characters inserted at the beginning, i.e. treating the common prefix as shared makes more sense. But at this point I would use this if you released it.
You wrote in #37 (comment) that you are concerned about the performance. Can you give me an idea of how bad it is? What size of input are you testing on and how long does it take to compute the diff?
from diffr.
Another example that comes out wrong:
Again it would be better to prioritize the common prefix and highlight the 5 trailing adjacent ^ as added.
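For illustration, the prefix-first behaviour asked for here can be sketched in Rust as below. This is only a minimal sketch, not how diffr-lib works: the helper name `common_affixes` is hypothetical, and it covers only the simple case where the two lines differ by one contiguous change.

```rust
/// Hypothetical helper: returns (prefix_len, suffix_len) for two lines.
/// The common prefix is matched first, so any ambiguity is resolved in
/// favour of "bytes were added at the end" rather than at the beginning.
fn common_affixes(old: &[u8], new: &[u8]) -> (usize, usize) {
    // Length of the longest common prefix.
    let prefix = old
        .iter()
        .zip(new)
        .take_while(|(a, b)| a == b)
        .count();

    // Longest common suffix of what remains after the prefix, so the
    // prefix and suffix can never overlap.
    let suffix = old[prefix..]
        .iter()
        .rev()
        .zip(new[prefix..].iter().rev())
        .take_while(|(a, b)| a == b)
        .count();

    (prefix, suffix)
}
```

With the two caret lines from the first example, `common_affixes(b"^^^^^^^^^^", b"^^^^^^^^^^^^^^^")` returns `(10, 0)`: the 10 leading `^` are shared and the 5 trailing ones are the added part.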
Hi @dtolnay
Thanks for the suggestion and the examples. I will take a look soon.
Greedily expanding the snakes as much as possible is never a pessimization, but can also miss global extrema.
In other words, given the result yielded by diffr-lib:
-[ note: ]AAA
+[ note:] BBB[ ]CCC
the second snake can be extended one byte to the left (reducing the count of snakes by one) and this is always a win:
+[ note: ]BBB CCC
An exact solution that minimizes the number of snakes projecting to a given LCS would of course be better. But it does not help in the second case: in the given example, as in your proposal, the number of snakes is 1.
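The greedy extension described above could look roughly like the following Rust sketch. It is an illustration under assumptions, not diffr-lib's actual code: the function name `extend_shared_right` and the representation of shared segments as half-open byte ranges over one side of the diff are hypothetical. Whenever the byte just after a shared segment equals the first byte of the next shared segment, that byte is moved from the next segment to the current one, so the concatenated shared text (the LCS) stays the same while segments drift together and merge.

```rust
/// Hypothetical sketch. `segs` lists the shared parts of `text` as sorted,
/// non-overlapping half-open ranges (start, end). Returns an equivalent
/// list in which each segment has been greedily grown to the right.
fn extend_shared_right(text: &[u8], segs: &[(usize, usize)]) -> Vec<(usize, usize)> {
    let mut out: Vec<(usize, usize)> = Vec::new();
    for &(mut start, end) in segs {
        if start == end {
            continue; // ignore empty segments
        }
        if let Some(cur) = out.last_mut() {
            // Move matching bytes from the front of this segment to the
            // end of the previous one; the shared text is unchanged.
            while start < end && cur.1 < start && text[cur.1] == text[start] {
                cur.1 += 1;
                start += 1;
            }
            if start == end {
                continue; // this segment was fully absorbed
            }
            if cur.1 == start {
                cur.1 = end; // the two segments now touch: merge them
                continue;
            }
        }
        out.push((start, end));
    }
    out
}
```

On the added line of the example, `extend_shared_right(b" note: BBB CCC", &[(0, 6), (10, 11)])` yields `vec![(0, 7)]`, which corresponds to `+[ note: ]BBB CCC`. Being greedy, it only looks at neighbouring segments and can miss a better global arrangement.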
Hi @dtolnay,
I am working on an improvement; after computing the longest common subsequence, I need to figure out the "best" partition of both parts of the diff (i.e., the one minimizing the number of different segments). I'll let you know here when I have some code that you can test.
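To make the quantity being minimized concrete, here is a small Rust sketch of how the number of segments on one side of a diff could be counted. The names `shared_mask` and `count_runs` and the boolean-mask representation are assumptions made for illustration only; they are not taken from diffr-lib.

```rust
/// Hypothetical helper: marks the bytes covered by the half-open
/// ranges in `segs` as shared (true); everything else is changed (false).
fn shared_mask(len: usize, segs: &[(usize, usize)]) -> Vec<bool> {
    let mut mask = vec![false; len];
    for &(start, end) in segs {
        for flag in &mut mask[start..end] {
            *flag = true;
        }
    }
    mask
}

/// Counts maximal runs of equal flags, i.e. the number of alternating
/// shared/changed segments that would be highlighted separately.
fn count_runs(shared: &[bool]) -> usize {
    if shared.is_empty() {
        return 0;
    }
    // One run to start with, plus one for every shared/changed boundary.
    1 + shared.windows(2).filter(|w| w[0] != w[1]).count()
}
```

For the 14-byte added line ` note: BBB CCC` from the earlier example, the partition `[ note:] BBB[ ]CCC` gives `count_runs(&shared_mask(14, &[(0, 6), (10, 11)])) == 4`, while `[ note: ]BBB CCC` gives `count_runs(&shared_mask(14, &[(0, 7)])) == 2`, so the second partition is preferred.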
@dtolnay I wrote some code that corresponds to the spec I have in mind; please let me know if the result corresponds to something like what you are looking for.
The worst-case scenario I have seen is the following diff:
git/git@786dabe
for which the perf goes from ~13s to ~55s on my dev machine, which is pretty bad. In the worst case, both postprocessing steps take roughly as long as the Myers algorithm.
I can still improve the postprocessing algorithm, but I think there will still be bad scenarios with that design. The alternative would be to tweak the Myers algorithm to yield the best split at the same time, but it's not easy.
About prioritizing the prefix or the suffix, it should be easy to do.
Thanks, that sounds like it wouldn't be a problem for my use case since the diffs I deal with are going to be ~40 lines max. I wasn't sure if you had some kind of quintic loop going on where a 40-line diff could take multiple seconds. But obviously if you can improve the performance that would be great!
Thanks for your work on this!
Ok, good to know! I don't think it will be a problem for that use case.
I published diffr-lib 0.1.3. Please let me know how it works for you!