Comments (14)
I agree completely. Furthermore, the Qt library results in a significant
slowdown of the code. Using real arrays would be so much faster than QLists of
QStrings. However, the person who translated DMP into C++ was using Qt in his
project, so that's what he used. Using Qt also made the translation easier,
since Qt closely mimics the Java data structures. I've got a long-term goal of
removing the Qt dependency, but there are a lot of higher-priority items before
I get to that. If you or someone else wants to take a shot at removing Qt, I'd
be very grateful.
Just removing Qt from diff_map() would approximately double the speed of
differencing. I've already taken care of match_bitap(), which was the other
pain point.
Minor note: In the Java, C++ and C# versions a diff is represented as a linked
list of diffs, whereas in the Python and JavaScript versions a diff is
represented by an array of diffs. This leads to slightly different algorithms
when traversing a diff (such as in the cleanup functions). If Qt were removed
from the C++ version, I suspect that switching from linked lists to arrays
would be more efficient.
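To illustrate the linked-list versus array point, here is a minimal C++ sketch
of one cleanup-style pass (merging adjacent diffs that share an operation)
written both ways. The names Op, Diff, mergeAdjacentList and
mergeAdjacentVector are hypothetical and heavily simplified; they are not the
library's actual classes or functions, and the real cleanup routines do more.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified diff record (an assumption for this sketch).
enum class Op { Delete, Insert, Equal };

struct Diff {
    Op op;
    std::string text;
};

// Linked-list style: merge neighbours with the same operation by erasing
// nodes in place. Erasing a std::list node is O(1) and leaves the other
// iterators valid, so the traversal can edit as it goes.
void mergeAdjacentList(std::list<Diff>& diffs) {
    if (diffs.empty()) return;
    auto prev = diffs.begin();
    auto cur = std::next(prev);
    while (cur != diffs.end()) {
        if (cur->op == prev->op) {
            prev->text += cur->text;
            cur = diffs.erase(cur);  // iterator to the node after the erased one
        } else {
            prev = cur;
            ++cur;
        }
    }
}

// Array style: erasing from the middle of a vector is O(n), so the pass
// compacts with a read index and a write index, then trims the tail once.
void mergeAdjacentVector(std::vector<Diff>& diffs) {
    if (diffs.empty()) return;
    std::size_t write = 0;
    for (std::size_t read = 1; read < diffs.size(); ++read) {
        if (diffs[read].op == diffs[write].op) {
            diffs[write].text += diffs[read].text;
        } else {
            ++write;
            if (write != read) diffs[write] = std::move(diffs[read]);
        }
    }
    assert(write < diffs.size());
    diffs.erase(diffs.begin() + static_cast<std::ptrdiff_t>(write) + 1,
                diffs.end());
}
```

Both functions produce the same merged sequence. The vector version touches
contiguous memory and allocates no list nodes, which is the kind of saving the
comment is speculating about; whether it wins in practice would need measuring.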
Original comment by [email protected]
on 9 Sep 2009 at 6:50
- Added labels: Performance
from google-diff-match-patch.
Please find attached my version, which uses only the standard C++ library.
In addition to removing the Qt dependency, I made several other
modifications:
1) diff_linesToChars/diff_charsToLines use an array of the pointers to the
substrings in the source strings instead of an array of the substrings.
2) I noticed that compilers (at least Microsoft's one) generate much better
code for the functions returning containers by value if the functions do not
contain multiple return statements. This was the reason for a dummy loop in
diff_compute.
3) static modifiers were added to the member functions that did not use
diff_match_patch values or other non-static members. A check for the unlimited
time was moved from diff_halfMatch to diff_compute that receives the deadline
argument. Because of that, a check for optimal no halfmatch does not work and
was turned off.
4) A test case for the diff_bisect timeout was quick enough to complete before
the clocks were able to move forward. I added a loop waiting for the first tick
before this test.
The code was tested with MSVC++ 2008 and GNU G++ 4.4.5.
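As a rough illustration of item 1, the sketch below encodes each line of a
text as an index into a table of unique lines, where the table holds
pointer-plus-length references into the source string instead of substring
copies. This is only a sketch under stated assumptions: LineRef, LineRefLess,
LineIndex and linesToIndexes are hypothetical names, and the attached code's
actual diff_linesToChars may be organized differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical reference to a line inside a source string (no copy is made).
struct LineRef {
    const char* data;
    std::size_t size;
};

// Orders LineRefs by the characters they point at, so std::map can
// recognize two equal lines that live at different addresses.
struct LineRefLess {
    bool operator()(const LineRef& a, const LineRef& b) const {
        return std::lexicographical_compare(a.data, a.data + a.size,
                                            b.data, b.data + b.size);
    }
};

typedef std::map<LineRef, std::size_t, LineRefLess> LineIndex;

// Splits `text` into lines (keeping the trailing '\n') and returns one index
// per line. New unique lines are appended to `lines`; `seen` maps each line
// to its index. `text` must outlive `lines` and `seen`, because both only
// point into it.
std::vector<std::size_t> linesToIndexes(const std::string& text,
                                        std::vector<LineRef>& lines,
                                        LineIndex& seen) {
    std::vector<std::size_t> encoded;
    std::size_t start = 0;
    while (start < text.size()) {
        std::size_t end = text.find('\n', start);
        end = (end == std::string::npos) ? text.size() : end + 1;
        assert(end > start);  // every iteration consumes at least one char
        LineRef ref{text.data() + start, end - start};
        LineIndex::iterator it = seen.find(ref);
        if (it == seen.end()) {
            it = seen.emplace(ref, lines.size()).first;
            lines.push_back(ref);
        }
        encoded.push_back(it->second);
        start = end;
    }
    return encoded;
}
```

Calling linesToIndexes on both texts with the same `lines` and `seen` gives the
two index sequences a shared numbering, so they can be diffed line by line.
The trade-off is a lifetime constraint: the source strings have to stay alive
and unmodified for as long as the table is in use.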
Original comment by [email protected]
on 16 Jan 2011 at 9:40
Wow, thank you! I'll start work on submitting this code next week.
Original comment by [email protected]
on 19 Jan 2011 at 12:39
Here is an update. After a little thought, I converted the whole
diff_match_patch class to a template, with all character type dependencies
moved to a separate traits class. This allows the use of any string type that
provides the std::basic_string interface, whether derived from the standard
string types or custom. For example, I tried speedtest with strings of 8-bit
and 32-bit chars, both without any problems (not checked scrupulously, but at
least both returned the same number of diffs as the 16-bit version).
Another small modification is that output diff lists are passed by reference to
the private functions to avoid depending on compiler optimization. MSVC, which
I mostly use in my projects, seems to get lost within complex functions and
does some crazy copy-construction work if the results are returned by value,
which is rather costly with the STL containers.
All recent changes have been implemented, including a fix for issue #40.
Please take a look.
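For readers unfamiliar with the pattern described in this comment, here is a
minimal sketch of a class template parameterized on the string type, with the
character-specific pieces isolated in a traits class and a container result
passed out by reference. Everything here (sketch_traits, text_tools_sketch,
common_prefix, split_lines) is hypothetical and invented for illustration; it
is not the interface of the attached diff_match_patch.h.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical traits: the only place that knows about concrete char types.
template <class CharT> struct sketch_traits;
template <> struct sketch_traits<char> {
    static char eol() { return '\n'; }
};
template <> struct sketch_traits<wchar_t> {
    static wchar_t eol() { return L'\n'; }
};
template <> struct sketch_traits<char32_t> {
    static char32_t eol() { return U'\n'; }
};

// Works with any String that offers the std::basic_string interface
// (value_type, size, operator[], find, substr, npos).
template <class String,
          class Traits = sketch_traits<typename String::value_type> >
class text_tools_sketch {
public:
    // Length of the common prefix of a and b.
    static std::size_t common_prefix(const String& a, const String& b) {
        const std::size_t n = a.size() < b.size() ? a.size() : b.size();
        std::size_t i = 0;
        while (i < n && a[i] == b[i]) ++i;
        return i;
    }

    // The result is written into `out` instead of being returned by value,
    // so no copy happens even if the compiler does not elide the return.
    static void split_lines(const String& text, std::vector<String>& out) {
        out.clear();
        std::size_t start = 0;
        while (start < text.size()) {
            std::size_t end = text.find(Traits::eol(), start);
            end = (end == String::npos) ? text.size() : end + 1;
            assert(end > start);  // each pass consumes at least one char
            out.push_back(text.substr(start, end - start));
            start = end;
        }
    }
};
```

Supporting another character type then only requires a new sketch_traits
specialization; the algorithms themselves stay untouched. With compilers that
implement move semantics and return-value optimization reliably, returning the
vector by value is usually just as cheap, so the by-reference style is mainly a
defensive choice for older toolchains.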
Original comment by [email protected]
on 23 Jan 2011 at 7:12
Hi snhere,
Nice work on the STL port. With your update, I think you accidentally left out
diff_match_patch.cpp.
Original comment by [email protected]
on 3 Jun 2011 at 2:23
Any update on the STL port of this library?
Original comment by [email protected]
on 20 Jun 2011 at 10:23
The update didn't miss diff_match_patch.cpp; it turned diff_match_patch into a
header-only library. Everything is in diff_match_patch.h.
Original comment by [email protected]
on 22 Jun 2011 at 7:31
I was trying to convert the code to a Java generic list comparison. There are
a few areas left (13 errors). Some are the same. I hope you would take a
look. Since you're familiar with the code, maybe it'll take 5 to 10 minutes,
or at least provide some comments on how to deal with some of them.
Original comment by [email protected]
on 6 Aug 2011 at 12:40
Question: is it possible to use string instead of wstring?
Every time I try "diff_match_patch<string> dmp;"
I get errors about the string type I'm using.
Thanks.
Original comment by [email protected]
on 29 Nov 2011 at 8:19
Any updates on this?
Original comment by [email protected]
on 3 Aug 2012 at 7:35
[deleted comment]
Thanks so much for your work, Sergey.
I wonder why this code is not presented as the official C++ version.
#10, updates to what? Is the awesome header-only implementation not working
for you?
Original comment by [email protected]
on 26 Jun 2013 at 10:04
I am wondering why you went with wchar_t as opposed to just doing UTF-8 with
std::string throughout.
Original comment by [email protected]
on 3 Jul 2013 at 2:12
I have put the good work from Sergey Nozhenko on GitHub and added some tweaks
to support std::string in addition to std::wstring. There are now test
harnesses for both types of strings.
Here is the link to the repository:
https://github.com/leutloff/diff-match-patch-cpp-stl
Original comment by [email protected]
on 18 Jul 2013 at 8:49