diff-match-patch's Introduction

Diff-Match-Patch

The Diff Match and Patch libraries offer robust algorithms to perform the operations required for synchronizing plain text:

  • compute a character-based diff of two texts
  • perform a fuzzy match of a given string
  • apply a list of patches.

This is a PHP port of Google's diff-match-patch library.

Diff

Compare two plain texts and efficiently return an array of differences. The diff works at the character level, but you can tune it to compute word-based or line-based diffs.

Usage:

<?php

use DiffMatchPatch\DiffMatchPatch;

$text1 = "The quick brown fox jumps over the lazy dog.";
$text2 = "That quick brown fox jumped over a lazy dog.";
$dmp = new DiffMatchPatch();
$diffs = $dmp->diff_main($text1, $text2, false);
var_dump($diffs);

Returns:

array(
    array(DiffMatchPatch::DIFF_EQUAL, "Th"),
    array(DiffMatchPatch::DIFF_DELETE, "e"),
    array(DiffMatchPatch::DIFF_INSERT, "at"),
    array(DiffMatchPatch::DIFF_EQUAL, " quick brown fox jump"),
    array(DiffMatchPatch::DIFF_DELETE, "s"),
    array(DiffMatchPatch::DIFF_INSERT, "ed"),
    array(DiffMatchPatch::DIFF_EQUAL, " over "),
    array(DiffMatchPatch::DIFF_DELETE, "the"),
    array(DiffMatchPatch::DIFF_INSERT, "a"),
    array(DiffMatchPatch::DIFF_EQUAL, " lazy dog."),
)
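One common way to get a line-based diff out of a character-based one is to map every distinct line to a single character, diff the encoded strings, and then map the characters back to lines. The sketch below only illustrates that encoding step in plain PHP. The function name `linesToChars` and its return shape are assumptions made for this example, not the documented API of this port.

```php
<?php

/**
 * Hypothetical helper (not the library's API): encode each distinct line
 * of the given texts as one Unicode character.
 *
 * @param string[] $texts
 * @return array{0: string[], 1: array<int, string>} encoded texts and a code point => line map
 */
function linesToChars(array $texts): array
{
    $lineToCode = []; // line => code point
    $codeToLine = []; // code point => line
    $encoded = [];

    foreach ($texts as $text) {
        $chars = '';
        // Split after every "\n" so each line keeps its line ending.
        foreach (preg_split('/(?<=\n)/', $text, -1, PREG_SPLIT_NO_EMPTY) as $line) {
            if (!isset($lineToCode[$line])) {
                // Start at U+0100 to stay clear of ASCII and control characters.
                // This simple scheme only suits a few thousand distinct lines.
                $code = 0x100 + count($lineToCode);
                $lineToCode[$line] = $code;
                $codeToLine[$code] = $line;
            }
            $chars .= mb_chr($lineToCode[$line], 'UTF-8');
        }
        $encoded[] = $chars;
    }

    return [$encoded, $codeToLine];
}

[$encoded, $codeToLine] = linesToChars([
    "alpha\nbeta\ngamma\n",
    "alpha\ngamma\ndelta\n",
]);
```

The two encoded strings could then be passed to `diff_main()`, and each character in the resulting diff replaced by its line via `$codeToLine`.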

Match

Given a search string, find its best fuzzy match in a plain text near the given location. Weighted for both accuracy and location.

Usage:

<?php

use DiffMatchPatch\DiffMatchPatch;

$dmp = new DiffMatchPatch();
$text = "The quick brown fox jumps over the lazy fox.";
$pos = $dmp->match_main($text, "fox", 0); // Returns 16
$pos = $dmp->match_main($text, "fox", 40); // Returns 40
$pos = $dmp->match_main($text, "jmps"); // Returns 20
$pos = $dmp->match_main($text, "jmped"); // Returns -1
$dmp->Match_Threshold = 0.7; // Loosen the threshold so weaker matches are accepted
$pos = $dmp->match_main($text, "jmped"); // Returns 20

Patch

Apply a list of patches in a Unidiff-like format to plain text. Patches are applied on a best-effort basis, even when the underlying text doesn't match exactly.

Usage:

<?php

use DiffMatchPatch\DiffMatchPatch;

$dmp = new DiffMatchPatch();
$patches = $dmp->patch_make("The quick brown fox jumps over the lazy dog.", "That quick brown fox jumped over a lazy dog.");
// @@ -1,11 +1,12 @@
//  Th
// -e
// +at
//   quick b
// @@ -22,18 +22,17 @@
//  jump
// -s
// +ed
//   over
// -the
// +a
//   laz
$result = $dmp->patch_apply($patches, "The quick red rabbit jumps over the tired tiger.");
var_dump($result);

Returns:

array(
    "That quick red rabbit jumped over a tired tiger.",
    array (
        true,
        true,
    ),
);

API

This library is currently available in several languages. Regardless of language, each library uses the same API and the same functionality.

Algorithms

This library implements Myers' diff algorithm, which is generally considered to be the best general-purpose diff. A layer of pre-diff speedups and post-diff cleanups surrounds the diff algorithm, improving both performance and output quality.

This library also implements a Bitap matching algorithm at the heart of a flexible matching and patching strategy.
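To give a feel for the bit-parallel idea behind Bitap, here is a minimal exact-match (shift-and) sketch in plain PHP. It is a simplified illustration only: the function name `bitapSearch` is made up for this example, it works on bytes, and it does not implement the error tolerance or location weighting that fuzzy matching requires.

```php
<?php

/**
 * Hypothetical exact-match Bitap (shift-and) search over bytes.
 * Returns the offset of the first occurrence of $pattern, or -1.
 */
function bitapSearch(string $text, string $pattern): int
{
    $m = strlen($pattern);
    if ($m === 0) {
        return 0;
    }
    // One bit per pattern position; keep clear of the integer sign bit.
    if ($m > PHP_INT_SIZE * 8 - 2) {
        throw new InvalidArgumentException('Pattern too long for this sketch');
    }

    // For every byte of the pattern, set bit i where pattern[i] equals that byte.
    $masks = [];
    for ($i = 0; $i < $m; $i++) {
        $byte = $pattern[$i];
        $masks[$byte] = ($masks[$byte] ?? 0) | (1 << $i);
    }

    $state = 0;                // bit i set: pattern[0..i] matches the text ending here
    $accept = 1 << ($m - 1);   // bit that signals a full match
    $n = strlen($text);

    for ($j = 0; $j < $n; $j++) {
        $state = (($state << 1) | 1) & ($masks[$text[$j]] ?? 0);
        if ($state & $accept) {
            return $j - $m + 1;
        }
    }

    return -1;
}

$text = "The quick brown fox jumps over the lazy fox.";
$foxAt = bitapSearch($text, "fox");     // 16
$jumpsAt = bitapSearch($text, "jumps"); // 20
$catAt = bitapSearch($text, "cat");     // -1
```

The fuzzy variant keeps one such state per allowed error count, which is what lets a misspelled pattern still match.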

Requirements

Installation

composer require yetanotherape/diff-match-patch

License

Diff-Match-Patch is licensed under the Apache License 2.0. See the LICENSE file for details.

diff-match-patch's People

Contributors

laurent22, mbaynton, ordago, yetanotherape

diff-match-patch's Issues

Patch works incorrectly with non-Latin characters

String patches

Array
        (
            [0] => @@ -0,0 +1,18 @@
+%D0%A1%D0%BF%D0%B0%D1%81%D0%B5%D0%BD%D0%B8%D0%B5 %D1%83%D1%82%D0%BE%D0%BF%D0%B0%D1%8E%D1%89%D0%B8%D1%85

            [1] => @@ -11,8 +11,17 @@
 %D1%82%D0%BE%D0%BF%D0%B0%D1%8E%D1%89%D0%B8%D1%85
+ %D0%B4%D0%B5%D0%BB%D0%BE %D1%80%D1%83%D0%BA

            [2] => @@ -0,0 +1,43 @@
+%D0%A1%D0%BF%D0%B0%D1%81%D0%B5%D0%BD%D0%B8%D0%B5 %D1%83%D1%82%D0%BE%D0%BF%D0%B0%D1%8E%D1%89%D0%B8%D1%85 %D0%B4%D0%B5%D0%BB%D0%BE %D1%80%D1%83%D0%BA %D1%81%D0%B0%D0%BC%D0%B8%D1%85 %D1%83%D1%82%D0%BE%D0%BF%D0%B0%D1%8E%D1%89%D0%B8%D1%85
        )

Array patches, converted with patch_fromText()

Array
(
    [0] => DiffMatchPatch\PatchObject Object
        (
            [changes:protected] => Array
                (
                    [0] => Array
                        (
                            [0] => 1
                            [1] => Спасение утопающих
                        )

                )

            [start1:protected] => 0
            [start2:protected] => 0
            [length1:protected] => 0
            [length2:protected] => 18
        )

)
Array
(
    [0] => DiffMatchPatch\PatchObject Object
        (
            [changes:protected] => Array
                (
                    [0] => Array
                        (
                            [0] => 0
                            [1] => топающих
                        )

                    [1] => Array
                        (
                            [0] => 1
                            [1] =>  дело рук
                        )

                )

            [start1:protected] => 10
            [start2:protected] => 10
            [length1:protected] => 8
            [length2:protected] => 17
        )

)
Array
(
    [0] => DiffMatchPatch\PatchObject Object
        (
            [changes:protected] => Array
                (
                    [0] => Array
                        (
                            [0] => 1
                            [1] => Спасение утопающих дело рук самих утопающих
                        )

                )

            [start1:protected] => 0
            [start2:protected] => 0
            [length1:protected] => 0
            [length2:protected] => 43
        )

)

Results of patch_apply()

Array
        (
            [0] => Спасение утопающих
            [1] => Спасение утопающих дело рук
            [2] => Спасение утопающих дело рук самих утопающихСпасение утопающих дело рук
        )

Results should be

Array
        (
            [0] => Спасение утопающих
            [1] => Спасение утопающих дело рук
            [2] => Спасение утопающих дело рук самих утопающих
        )

Is this done and usable?

I'm asking this as the README still says,

NOTE: This is alpha software and is under development.

But at the same time, the README has not been updated in the last 2 years, whereas I can see a lot of other commits.

What is the state of the project? If not in a good condition, any alternative for google-diff-match-patch in php?

Remove UCS-2LE encoding/decoding?

I've got the following example:

use DiffMatchPatch\DiffMatchPatch;
$dmp = new DiffMatchPatch();
$diff = $dmp->patch_make('car', 'car 🚘');
var_dump($dmp->patch_toText($diff));
var_dump($dmp->patch_apply($diff, 'car'));

Which results in this exception:

Fatal error: Uncaught iconv(): Detected an illegal character in input string

/mnt/d/Web/www/joplin/vendor/symfony/phpunit-bridge/DeprecationErrorHandler.php:73
/mnt/d/Web/www/joplin/vendor/yetanotherape/diff-match-patch/src/Diff.php:971
/mnt/d/Web/www/joplin/vendor/yetanotherape/diff-match-patch/src/Patch.php:301
/mnt/d/Web/www/joplin/vendor/yetanotherape/diff-match-patch/src/DiffMatchPatch.php:270

So it's throwing an exception here when encoding the string to UCS-2LE with iconv:

$text2 = iconv($prevInternalEncoding, 'UCS-2LE', $text2);

If I comment this line and the line that encodes back to the original encoding:

$change[1] = iconv('UCS-2LE', $prevInternalEncoding, $change[1]);

then it works fine:

string(36) "@@ -1,3 +1,5 @@
 car
+ %F0%9F%9A%98
"
array(2) {
  [0]=>
  string(8) "car 🚘"
  [1]=>
  array(1) {
    [0]=>
    bool(true)
  }
}

So I'm wondering: what is the purpose of this encoding/decoding? Could it be removed, or could it be skipped if the input string is UTF-8 (which I assume is a commonly used format)? That would allow the lib to be compatible with emojis and other valid UTF-8 strings.
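A likely reason the conversion fails, offered as a sketch rather than a statement about the library's internals: UCS-2 stores every character in exactly one 16-bit unit, so it cannot represent code points above U+FFFF, and the emoji in the example is one of them. The snippet below uses only mbstring functions to show that the character lies outside the Basic Multilingual Plane and needs two 16-bit units (a surrogate pair) in UTF-16.

```php
<?php

$emoji = "\u{1F698}"; // the car emoji from the example above

// Code point of the character; anything above 0xFFFF is outside the BMP.
$codePoint = mb_ord($emoji, 'UTF-8');
$outsideBmp = $codePoint > 0xFFFF;

// UTF-16 can encode it, but only as a surrogate pair (two 16-bit units).
$utf16 = mb_convert_encoding($emoji, 'UTF-16LE', 'UTF-8');
$units = intdiv(strlen($utf16), 2);

printf("U+%X outside BMP: %s, UTF-16 units: %d\n",
    $codePoint, $outsideBmp ? 'yes' : 'no', $units);
```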

Multibyte characters lead to incorrect diffs on specific server

I currently run into problems when trying to diff strings which contain multibyte characters.

I created a minimal example which shows the problem:

<!DOCTYPE html>
<meta charset=utf8>
<?php
require_once 'src/DiffMatchPatch/DiffMatchPatch.php';
require_once 'src/DiffMatchPatch/DiffToolkit.php';
require_once 'src/DiffMatchPatch/Diff.php';
require_once 'src/DiffMatchPatch/Match.php';
require_once 'src/DiffMatchPatch/Patch.php';
require_once 'src/DiffMatchPatch/PatchObject.php';
require_once 'src/DiffMatchPatch/Utils.php';

use DiffMatchPatch\Diff;

$string1 = 'abc „def“ ghi';
$string2 = 'abc  bla „def“ ghi';

$diff = new Diff($string1, $string2);

echo sprintf(
    '<h2>Text 1</h2><pre>%s</pre><h2>Text 2</h2><pre>%s</pre><h2>Delta</h2><pre>%s</pre>',
    $diff->text1(),
    $diff->text2(),
    $diff->toDelta()
);
?>

On my local machine this produces the intended result:
[screenshot: local-result]

But when I upload it to a production server I work with, the result looks like this:
[screenshot: production-result]

The production server runs PHP Version 5.3.3-7+squeeze25. Here is an excerpt of what are probably the most relevant parts of phpinfo() for this issue:
[screenshot: phpinfo]

Do you have any idea where this problem comes from and especially what I could do to solve this?

Double call to mb_strlen

Hi there!

There's an issue in DiffToolkit where it calls mb_strlen(mb_strlen($longtext)). This produces an error on newer PHP versions (using declare(strict_types=1)) because mb_strlen must receive a string as its argument.

After changing it all tests keep passing

if (mb_strlen($longtext) < 4 || mb_strlen($shorttext) * 2 < mb_strlen(mb_strlen($longtext))) {
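The effect of the nested call can be reproduced in isolation. The following sketch uses a made-up sample string and is independent of the library: with strict_types enabled, passing the integer result of mb_strlen() back into mb_strlen() raises a TypeError, whereas a single call returns the intended length. Without strict types, the integer would be silently cast to a string and the call would return the number of digits instead.

```php
<?php
declare(strict_types=1);

$longtext = 'abcdefghij'; // sample value, 10 characters

try {
    // The inner call returns int(10); the outer call expects a string.
    mb_strlen(mb_strlen($longtext));
    $threw = false;
} catch (\TypeError $e) {
    $threw = true;
}

// Intended behaviour: measure the text once.
$length = mb_strlen($longtext);
```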

Here's the previous phpunit log

PhpUnit Log

 fede@desktop ~/Codes/diff-match-patch ~ $ vendor/phpunit/phpunit/phpunit
//PHPUnit 9.1.3 by Sebastian Bergmann and contributors.

Runtime:       PHP 7.4.5
Configuration: /home/fede/Codes/diff-match-patch/phpunit.xml.dist

.E.......EEE..........E.............EEEE.Elapsed time: 0.008
Memory usage: 0.144
.Elapsed time: 0.158
Memory usage: 0
..                      44 / 44 (100%)

Time: 00:00.315, Memory: 8.00 MB

There were 9 errors:

1) DiffMatchPatch\DiffMatchPatchTest::testDiffMain
TypeError: Argument 4 passed to DiffMatchPatch\Diff::compute() must be of the type int, float given, called in /home/fede/Codes/diff-match-patch/src/Diff.php on line 991

/home/fede/Codes/diff-match-patch/src/Diff.php:1029
/home/fede/Codes/diff-match-patch/src/Diff.php:991
/home/fede/Codes/diff-match-patch/src/DiffMatchPatch.php:174
/home/fede/Codes/diff-match-patch/tests/DiffMatchPatchTest.php:79

2) DiffMatchPatch\DiffMatchPatchTest::testPatchMake
TypeError: Argument 4 passed to DiffMatchPatch\Diff::compute() must be of the type int, float given, called in /home/fede/Codes/diff-match-patch/src/Diff.php on line 991

/home/fede/Codes/diff-match-patch/src/Diff.php:1029
/home/fede/Codes/diff-match-patch/src/Diff.php:991
/home/fede/Codes/diff-match-patch/src/Patch.php:293
/home/fede/Codes/diff-match-patch/src/DiffMatchPatch.php:262
/home/fede/Codes/diff-match-patch/tests/DiffMatchPatchTest.php:164

3) DiffMatchPatch\DiffMatchPatchTest::testPatchApply
TypeError: Argument 4 passed to DiffMatchPatch\Diff::compute() must be of the type int, float given, called in /home/fede/Codes/diff-match-patch/src/Diff.php on line 991

/home/fede/Codes/diff-match-patch/src/Diff.php:1029
/home/fede/Codes/diff-match-patch/src/Diff.php:991
/home/fede/Codes/diff-match-patch/src/Patch.php:293
/home/fede/Codes/diff-match-patch/src/DiffMatchPatch.php:262
/home/fede/Codes/diff-match-patch/tests/DiffMatchPatchTest.php:176
/home/fede/Codes/diff-match-patch/tests/DiffMatchPatchTest.php:188

4) DiffMatchPatch\DiffMatchPatchTest::testPatchApply_2
TypeError: Argument 4 passed to DiffMatchPatch\Diff::compute() must be of the type int, float given, called in /home/fede/Codes/diff-match-patch/src/Diff.php on line 991

/home/fede/Codes/diff-match-patch/src/Diff.php:1029
/home/fede/Codes/diff-match-patch/src/Diff.php:991
/home/fede/Codes/diff-match-patch/src/Patch.php:293
/home/fede/Codes/diff-match-patch/src/DiffMatchPatch.php:262
/home/fede/Codes/diff-match-patch/tests/DiffMatchPatchTest.php:176
/home/fede/Codes/diff-match-patch/tests/DiffMatchPatchTest.php:196

5) DiffMatchPatch\DiffTest::testMain
TypeError: Argument 4 passed to DiffMatchPatch\Diff::compute() must be of the type int, float given, called in /home/fede/Codes/diff-match-patch/src/Diff.php on line 991

/home/fede/Codes/diff-match-patch/src/Diff.php:1029
/home/fede/Codes/diff-match-patch/src/Diff.php:991
/home/fede/Codes/diff-match-patch/tests/DiffTest.php:729

6) DiffMatchPatch\PatchTest::testMake
TypeError: Argument 4 passed to DiffMatchPatch\Diff::compute() must be of the type int, float given, called in /home/fede/Codes/diff-match-patch/src/Diff.php on line 991

/home/fede/Codes/diff-match-patch/src/Diff.php:1029
/home/fede/Codes/diff-match-patch/src/Diff.php:991
/home/fede/Codes/diff-match-patch/src/Patch.php:293
/home/fede/Codes/diff-match-patch/tests/PatchTest.php:157

7) DiffMatchPatch\PatchTest::testSplitMax
TypeError: Argument 4 passed to DiffMatchPatch\Diff::compute() must be of the type int, float given, called in /home/fede/Codes/diff-match-patch/src/Diff.php on line 991

/home/fede/Codes/diff-match-patch/src/Diff.php:1029
/home/fede/Codes/diff-match-patch/src/Diff.php:991
/home/fede/Codes/diff-match-patch/src/Patch.php:293
/home/fede/Codes/diff-match-patch/tests/PatchTest.php:214

8) DiffMatchPatch\PatchTest::testAddPadding
TypeError: Argument 4 passed to DiffMatchPatch\Diff::compute() must be of the type int, float given, called in /home/fede/Codes/diff-match-patch/src/Diff.php on line 991

/home/fede/Codes/diff-match-patch/src/Diff.php:1029
/home/fede/Codes/diff-match-patch/src/Diff.php:991
/home/fede/Codes/diff-match-patch/src/Patch.php:293
/home/fede/Codes/diff-match-patch/tests/PatchTest.php:247

9) DiffMatchPatch\PatchTest::testApply
TypeError: Argument 4 passed to DiffMatchPatch\Diff::compute() must be of the type int, float given, called in /home/fede/Codes/diff-match-patch/src/Diff.php on line 991

/home/fede/Codes/diff-match-patch/src/Diff.php:1029
/home/fede/Codes/diff-match-patch/src/Diff.php:991
/home/fede/Codes/diff-match-patch/src/Patch.php:293
/home/fede/Codes/diff-match-patch/tests/PatchTest.php:288

Regards!

Upgrade to PHP 8

The class name Match collides with match, which is a reserved keyword as of PHP 8.

Request help to include in a Silex project

So this isn't an issue per se, because I've tested the code below in a test file. I am having trouble calling the class in a Silex project. I've added diff-match-patch to the composer.json file, ran composer update, and it shows up in autoload_namespaces.php.

When the code is run I receive the response below saying the class cannot be found.

BlobController.php

$dmp = new DiffMatchPatch\DiffMatchPatch();
$diff = $dmp->diff_match($data, $sourceFile, false);
$patch = $dmp->patch_make($diff);

composer.json
"yetanotherape/diff-match-patch": "*"

autoload_namespaces.php

// autoload_namespaces.php generated by Composer

$vendorDir = dirname(dirname(__FILE__));
$baseDir = dirname($vendorDir);

return array(
    'Twig_' => array($vendorDir . '/twig/twig/lib'),
    'Symfony\\Component\\Yaml\\' => array($vendorDir . '/symfony/yaml'),
    'Symfony\\Component\\Routing\\' => array($vendorDir . '/symfony/routing'),
    'Symfony\\Component\\Process\\' => array($vendorDir . '/symfony/process'),
    'Symfony\\Component\\HttpKernel\\' => array($vendorDir . '/symfony/http-kernel'),
    'Symfony\\Component\\HttpFoundation\\' => array($vendorDir . '/symfony/http-foundation'),
    'Symfony\\Component\\Finder' => array($vendorDir . '/symfony/finder'),
    'Symfony\\Component\\Filesystem\\' => array($vendorDir . '/symfony/filesystem'),
    'Symfony\\Component\\EventDispatcher\\' => array($vendorDir . '/symfony/event-dispatcher'),
    'Symfony\\Component\\DomCrawler\\' => array($vendorDir . '/symfony/dom-crawler'),
    'Symfony\\Component\\Debug\\' => array($vendorDir . '/symfony/debug'),
    'Symfony\\Component\\CssSelector\\' => array($vendorDir . '/symfony/css-selector'),
    'Symfony\\Component\\BrowserKit\\' => array($vendorDir . '/symfony/browser-kit'),
    'Symfony\\Bridge\\Twig\\' => array($vendorDir . '/symfony/twig-bridge'),
    'Silex' => array($vendorDir . '/silex/silex/src'),
    'Psr\\Log\\' => array($vendorDir . '/psr/log'),
    'Pimple' => array($vendorDir . '/pimple/pimple/lib'),
    'Gitter' => array($vendorDir . '/klaussilveira/gitter/lib'),
    'GitList' => array($baseDir . '/src'),
    'DiffMatchPatch' => array($vendorDir . '/yetanotherape/diff-match-patch/src'),
);

Response
Fatal error: Class 'GitList\Controller\DiffMatchPatch' not found in /Projects/gitlist/src/GitList/Controller/BlobController.php on line 149
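The error message suggests that the class name was resolved relative to the controller's namespace (GitList\Controller) instead of the global one. The sketch below shows how PHP resolves such names. It assumes the controller file declares `namespace GitList\Controller;`, and it uses a local stand-in class (`StandIn`, hypothetical) aliased to the library's class name so that it runs without Composer.

```php
<?php
namespace GitList\Controller;

// Stand-in for the real library class, only so this sketch is self-contained.
class StandIn {}
\class_alias(StandIn::class, 'DiffMatchPatch\DiffMatchPatch');

// Without a leading backslash the name is resolved relative to the
// current namespace, which points at a class that does not exist.
$relative = DiffMatchPatch\DiffMatchPatch::class;

// A leading backslash makes the name fully qualified.
$absolute = \DiffMatchPatch\DiffMatchPatch::class;
$dmp = new \DiffMatchPatch\DiffMatchPatch();

echo $relative, "\n", $absolute, "\n";
```

An alternative with the same effect is to add `use DiffMatchPatch\DiffMatchPatch;` at the top of the file, as the README examples do, and then write `new DiffMatchPatch()`.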

Poke the CI server?

PHP 7 is stable, guessing Travis should be able to build now, maybe HHVM too?
