
Comments (21)

oliver-moran commented on August 19, 2024

I've added a comparison tool to the latest release: Jimp.diff(image1, image2, threshold).

This returns an object with a percent property (the percentage of pixels that differ) and an image property, which is a Jimp image showing the differences.

The threshold argument is optional; it ranges from 0 to 1 and defaults to 0.1.

from jimp.

oliver-moran commented on August 19, 2024

OK. After a bit of research, it turns out that what you described is essentially pHash: http://www.phash.org/

I'll implement that algorithm for the sake of using a known standard:
http://www.hackerfactor.com/blog/?/archives/432-Looks-Like-It.html

oliver-moran commented on August 19, 2024

I don't know of any - but if there's an algorithm in another language I'm sure it could be ported.

oliver-moran commented on August 19, 2024

There are some ideas and projects linked from here that might help you: http://stackoverflow.com/questions/843972/image-comparison-fast-algorithm

They might provide the basis for a Jimp module. Let me know how you get on.

marcolino commented on August 19, 2024

Thanks! I'll give it a shot ASAP...

oliver-moran commented on August 19, 2024

This library seems to do the job: https://github.com/mapbox/pixelmatch

I'll integrate it with Jimp.

marcolino commented on August 19, 2024

Great! I will test it ASAP!

marcolino commented on August 19, 2024

Would it be possible to add a Jimp.signature(image) method too, returning an image signature (which you are probably already using internally)? That would make it easier for users to build a database of images with their signatures, and later tell whether a new image is already in the database simply by comparing its signature with the others, without having to call Jimp.diff(newImage, image, threshold) for each image in the database. Of course, the signatures should be comparable with a fast Hamming distance function.
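
A minimal sketch of the lookup described in this request, assuming each signature is stored as a binary string of equal length. The names hammingDistance and findSimilar and the sample database are hypothetical and are not part of Jimp.

```javascript
// Hypothetical helper: counts the positions at which two equal-length
// binary strings (e.g. "1011...") differ.
function hammingDistance(a, b) {
    if (a.length !== b.length) {
        throw new Error("signatures must have the same length");
    }
    var count = 0;
    for (var i = 0; i < a.length; i++) {
        if (a[i] !== b[i]) count++;
    }
    return count;
}

// Hypothetical lookup: returns the stored entries whose signature is
// within maxDistance bits of the new signature.
function findSimilar(database, signature, maxDistance) {
    return database.filter(function (entry) {
        return hammingDistance(entry.signature, signature) <= maxDistance;
    });
}

// Made-up 8-bit signatures, for illustration only
var database = [
    { name: "a.png", signature: "11110000" },
    { name: "b.png", signature: "00001111" }
];

var matches = findSimilar(database, "11110001", 1);
console.log(matches.map(function (m) { return m.name; })); // [ 'a.png' ]
```

This is still a linear scan over the database, but each comparison is a short string walk rather than a full pixel diff.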

oliver-moran commented on August 19, 2024

You can use SHA1 to generate a "signature" for an image like this:

var Jimp = require("jimp");
var SHA1 = require('node-sha1');

Jimp.read("./lenna.png").then(function(image){
    var clone = image.clone(); // create a perfect copy

    var sig1 = SHA1(image.bitmap.data);
    var sig2 = SHA1(clone.bitmap.data);

    image.greyscale(); // make a change
    var sig3 = SHA1(image.bitmap.data);

    console.log(sig1); // f41bcd3f25e0616ee83a12b338088edf5491d057
    console.log(sig2); // f41bcd3f25e0616ee83a12b338088edf5491d057
    console.log(sig3); // 819904226cf1bd99f67454ed11dc1270619a6aff

    console.log(sig1 == sig2); // true
    console.log(sig1 == sig3); // false
});

See here: https://www.npmjs.com/package/node-sha1

marcolino commented on August 19, 2024

Nooo, I mean "comparable" signatures, to find similar images too, not
only identical ones (I don't know if it's possible).
What I'm looking for is a way to tell (in polynomial time) whether a
new image is similar to another image in a corpus of (already
processed) images...

Marco Solari
System Analyst, Software Engineer and IT Consultant at Koinè Sistemi Torino


oliver-moran commented on August 19, 2024

Ah, OK.

Short answer: No.

Longer answer: Jimp loads the pixel data of an image and provides ways for
you to read and modify the pixel data. There's no method in Jimp (as yet) to
generate a "signature" like you are looking for - but there are all the
tools you would need to do so.

The only question is: what is the algorithm to generate such a "signature"?

marcolino commented on August 19, 2024

Short answer: :-)

  • resize the image to 10x10 pixels
  • convert it to greyscale
  • calculate the mean color of all 100 pixels
  • build a 10x10 matrix of bytes with the mean color [0x00 - 0xff] for each
    pixel.
  • build a 10x10 matrix of bytes with '1' value if the corresponding pixel
    color is above the mean color, '0' otherwise
  • print this last matrix as a string representing the image signature
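
The steps above can be sketched as follows. This assumes the image has already been resized to 10x10 and converted to greyscale (with Jimp, presumably via resize and greyscale, then reading bitmap.data), so the input is just an array of 100 grey values. The function name averageSignature is hypothetical and the input is synthetic.

```javascript
// Assumes `grey` is an array of 100 grey values (0-255), one per pixel
// of a 10x10 greyscale image, in row-by-row order.
function averageSignature(grey) {
    // mean grey value over all pixels
    var sum = 0;
    for (var i = 0; i < grey.length; i++) sum += grey[i];
    var mean = sum / grey.length;

    // one bit per pixel: "1" if above the mean, "0" otherwise
    var bits = "";
    for (var j = 0; j < grey.length; j++) {
        bits += grey[j] > mean ? "1" : "0";
    }
    return bits;
}

// Synthetic example: top half dark, bottom half bright
var grey = [];
for (var p = 0; p < 100; p++) grey.push(p < 50 ? 20 : 220);

var signature = averageSignature(grey);
console.log(signature.length); // 100
```

For this synthetic input the mean is 120, so the signature is fifty "0" characters followed by fifty "1" characters.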

Of course you'll be able to find/invent a much cooler algorithm, but I
used this one back in the venerable old days of PHP and found it quite
effective... :-)


oliver-moran commented on August 19, 2024

Funnily, I had something similar in mind. Cool, I'll do up a version over the weekend.

What you end up with from your algorithm is a very big binary number (100 digits long). Would you mind if I converted that to hex or base 64 (still as a string because JS can't handle anything that big)?
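
A minimal sketch of the string-based conversion discussed here, assuming the signature arrives as a string of "0" and "1" characters. The name bitsToHex is hypothetical and this is not necessarily how Jimp does it internally.

```javascript
// Pads the bit string on the left to a multiple of 4, then converts each
// group of 4 bits into one hex digit. Everything stays a string, so the
// length is not limited by the precision of JS numbers.
function bitsToHex(bits) {
    var padded = bits;
    while (padded.length % 4 !== 0) padded = "0" + padded;

    var hex = "";
    for (var i = 0; i < padded.length; i += 4) {
        hex += parseInt(padded.substring(i, i + 4), 2).toString(16);
    }
    return hex;
}

console.log(bitsToHex("11110000")); // f0
console.log(bitsToHex("101"));      // 5
```

A 100-bit signature becomes a 25-character hex string this way.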

marcolino commented on August 19, 2024

Yes, great!

marcolino commented on August 19, 2024

Any news? :-))

oliver-moran commented on August 19, 2024

Yes. Works a charm. I'll publish an update tonight or tomorrow.


marcolino commented on August 19, 2024

I see you have already committed and updated the docs! My compliments!
One question: I don't understand this sentence of yours:

Using a mix of hammering distance and pixel diffing to comare images taking 0.15 as a cut off point, ...

(note the small comare -> compare error)
I don't understand what the threshold parameter in the diff function is for...

oliver-moran commented on August 19, 2024

I'm going to publish this now.

The pixel comparison (diffing) tool looks very effective. It uses the PixelMatch library. The library's GitHub page points to papers on the algorithm used.

I did a test that tried to match 120 PNGs against 120 corresponding JPEGs (saved at 60 quality).

Using both PixelMatch and pHash:

Correct: 14253 (99%)
False positives: 147 (1%)
False negatives: 0 (0%)
Total: 14400 (100%)

Using just pHash:

Correct: 14208 (99%)
False positives: 146 (1%)
False negatives: 46 (38%)
Total: 14400 (100%)

PixelMatch thus prevented a whole load of false negatives, although it introduced one false positive.

However, I can imagine other circumstances where pHash is stronger than PixelMatch.
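
A quick check of the arithmetic in the results above. The assumption here is that the totals count all 120 x 120 = 14400 pairings, and that the 38% for false negatives is measured against the 120 true pairs rather than against all 14400 pairings. That reading is an inference from the numbers, not something stated in the thread.

```javascript
var pairings = 120 * 120; // every PNG compared with every JPEG
var truePairs = 120;      // each PNG has exactly one corresponding JPEG

// rounds n / total to a whole percentage
function pct(n, total) {
    return Math.round(n / total * 100);
}

console.log(pairings);             // 14400
console.log(14208 + 146 + 46);     // 14400
console.log(pct(14208, pairings)); // 99
console.log(pct(146, pairings));   // 1
console.log(pct(46, truePairs));   // 38
console.log(pct(46, pairings));    // 0
```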

marcolino commented on August 19, 2024

I have to find duplicates in a bunch of images, where some images differ by a semi-transparent text label, or because they are cropped slightly differently, or because they have a small frame border, or because they are vertically flipped (mirrored)...

Which algorithm would you suggest in this (almost desperate) situation? Would I be better off with a combination of them? I suppose for the mirroring issue I'll have to compute the distance for both the image and its mirrored version, and keep the lower value... But what about the other situations?
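
A rough sketch of the "keep the lower value" idea for mirrored images, working directly on signatures instead of pixels. It assumes a simple above/below-mean signature laid out row by row, where a vertical flip of the image roughly corresponds to reversing the order of the rows. That assumption would not carry over to a DCT-based hash such as pHash. All function names are hypothetical.

```javascript
// Counts differing positions between two equal-length bit strings.
function hammingDistance(a, b) {
    var count = 0;
    for (var i = 0; i < a.length; i++) {
        if (a[i] !== b[i]) count++;
    }
    return count;
}

// Reverses the order of the rows in a row-by-row bit string.
function flipRows(bits, width) {
    var rows = [];
    for (var i = 0; i < bits.length; i += width) {
        rows.push(bits.substring(i, i + width));
    }
    return rows.reverse().join("");
}

// Distance to the signature as-is or to its flipped version, whichever is lower.
function mirrorAwareDistance(a, b, width) {
    return Math.min(
        hammingDistance(a, b),
        hammingDistance(a, flipRows(b, width))
    );
}

var original = "110100"; // 3 rows x 2 columns: 11 / 01 / 00
var flipped = "000111";  // same rows in reverse order: 00 / 01 / 11

console.log(hammingDistance(original, flipped));        // 4
console.log(mirrorAwareDistance(original, flipped, 2)); // 0
```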

Unfortunately, I don't yet fully understand the meaning of the 'cut-off' point parameter of the diff() function: it returns a percent difference, right? So what is the threshold for?

oliver-moran commented on August 19, 2024

I'm brand new to this. Reading this thread from top to bottom, you can read my entire knowledge of image comparison. So I can't help you much more.

If I were to begin tackling your problem, I'd do something like this:

// crop out the border, semi-transparent labels, etc.
haystack[i].crop(50, 50, haystack[i].bitmap.width - 100, haystack[i].bitmap.height - 100);

// do the same on the source image (so that you are comparing like for like)
// (crop the needle only once, outside any loop over the haystack)
needle.crop(50, 50, needle.bitmap.width - 100, needle.bitmap.height - 100);

// make a clone of the needle you are searching for and flip it vertically, you'll search for this too
var needle_v = needle.clone().mirror(false, true);

// resize clones of the needles to the haystack image's size before diffing
// (cloning leaves the needles themselves untouched)
var w = haystack[i].bitmap.width;
var h = haystack[i].bitmap.height;

// now look for matches on all variants
if (Jimp.distance(haystack[i], needle) < 0.15
 || Jimp.diff(haystack[i], needle.clone().resize(w, h)).percent < 0.15
 || Jimp.distance(haystack[i], needle_v) < 0.15
 || Jimp.diff(haystack[i], needle_v.clone().resize(w, h)).percent < 0.15
) {
    // images match
} else {
    // not a match
}

By the "cut-off point" I mean the point at which I consider a whole image to be a match or not. If PixelMatch or pHash comes back at less than 15% (0.15) difference then I consider it a match.

The "threshold" argument for PixelMatch relates to PixelMatch's internal workings: it is the point at which PixelMatch considers individual pixels to be matched.
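
To make the distinction concrete, here is a deliberately simplified stand-in for a pixel diff. It is not PixelMatch's real color metric; it only compares grey values. The function name diffPercent and the sample data are hypothetical. The threshold decides whether one pixel counts as different, while the cut-off decides whether the whole image counts as a match.

```javascript
// Simplified stand-in for a pixel diff (NOT PixelMatch's actual metric).
// `a` and `b` are equal-length arrays of grey values (0-255).
function diffPercent(a, b, threshold) {
    var different = 0;
    for (var i = 0; i < a.length; i++) {
        // per-pixel decision: how far apart may two pixels be and still "match"?
        if (Math.abs(a[i] - b[i]) / 255 > threshold) different++;
    }
    return different / a.length; // share of differing pixels, from 0 to 1
}

var a = [10, 10, 200, 200];
var b = [12, 10, 200, 90];

var percent = diffPercent(a, b, 0.1); // threshold: applies to each pixel
var isMatch = percent < 0.15;         // cut-off: applies to the whole image

console.log(percent); // 0.25
console.log(isMatch); // false
```

Only the last pixel differs by more than the threshold, so one pixel in four (0.25) is counted as different, which is above the 0.15 cut-off.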

marcolino commented on August 19, 2024

Thanks a million!

About cropping, since it may have been applied to just one image and not the other, I suppose I'll have to find some 'smart' auto-crop feature... (detecting similar values in border pixels...).

About the text labels, I suppose I'll have to ignore them and raise the thresholds a bit, since a label may be placed in the middle of the image, across its whole width (like a watermark...). I hope I won't suffer much (too many false positives) from raising the threshold, since my images have a big variance...

About mirroring, yes, I was thinking of something exactly like what you suggest...

One minor suggestion: I cloned jimp and tested using pngjs2 in place of pngjs (currently deprecated); the tests pass, so I suppose you can use it safely...

About the PixelMatch threshold: yes, I was just peeking at the sources and had just understood what you explain (sorry for asking... :-().
