js-crc32's Introduction

crc32

Standard CRC-32 algorithm implementation in JS (for the browser and nodejs). Emphasis on correctness, performance, and IE6+ support.

Installation

With a node package manager like npm:

$ npm i --save https://cdn.sheetjs.com/crc-32-latest/crc-32-latest.tgz

When installed globally, npm installs a script crc32 that computes the checksum for a specified file or standard input.

Hosted versions are available at https://cdn.sheetjs.com/.

Integration

Using NodeJS or a bundler with require:

var CRC32 = require("crc-32");

Using NodeJS or a bundler with import:

import { bstr, buf, str } from "crc-32";

In the browser, the crc32.js script can be loaded directly:

<script src="crc32.js"></script>

The browser script exposes a variable CRC32.

The script will manipulate module.exports if available. This is not always desirable. To prevent the behavior, define DO_NOT_EXPORT_CRC.

CRC32C (Castagnoli)

The module and CDNs also include a parallel script for CRC32C calculations.

Using NodeJS or a bundler:

var CRC32C = require("crc-32/crc32c");

Using NodeJS or a bundler with import:

import { bstr, buf, str } from "crc-32/crc32c";

In the browser, the crc32c.js script can be loaded directly:

<script src="crc32c.js"></script>

The browser script exposes a variable CRC32C.

The script will manipulate module.exports if available. This is not always desirable. To prevent the behavior, define DO_NOT_EXPORT_CRC.

Usage

In all cases, the relevant function takes an argument representing data and an optional second argument representing the starting "seed" (for rolling CRC).

The return value is a signed 32-bit integer!

  • CRC32.buf(byte array or buffer[, seed]) assumes the argument is a sequence of 8-bit unsigned integers (nodejs Buffer, Uint8Array or array of bytes).

  • CRC32.bstr(binary string[, seed]) assumes the argument is a binary string where byte i is the low byte of the UCS-2 char: str.charCodeAt(i) & 0xFF

  • CRC32.str(string[, seed]) assumes the argument is a standard JS string and calculates the hash of the UTF-8 encoding.
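To make the difference between the binary-string and UTF-8 interpretations concrete, the following sketch derives the byte sequences that would be hashed for the same string. The helper names lowBytes and utf8Bytes are hypothetical and not part of the library; TextEncoder is assumed to be available (modern browsers and recent NodeJS).

```javascript
// Hypothetical helpers (not part of crc-32) that illustrate the two byte views.

// Binary-string view: keep only the low byte of each UTF-16 code unit.
function lowBytes(s) {
  var out = [];
  for (var i = 0; i < s.length; i++) out.push(s.charCodeAt(i) & 0xFF);
  return out;
}

// UTF-8 view: encode the string as UTF-8 bytes.
function utf8Bytes(s) {
  return Array.from(new TextEncoder().encode(s));
}

console.log(lowBytes("\u2603"));   // [ 3 ]
console.log(utf8Bytes("\u2603"));  // [ 226, 152, 131 ]

// For pure ASCII input both views agree, so str and bstr see the same bytes.
console.log(lowBytes("SheetJS").join() === utf8Bytes("SheetJS").join()); // true
```

The two views only diverge once a character falls outside the ASCII range.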

For example:

// var CRC32 = require('crc-32');               // uncomment this line if in node
CRC32.str("SheetJS")                            // -1647298270
CRC32.bstr("SheetJS")                           // -1647298270
CRC32.buf([ 83, 104, 101, 101, 116, 74, 83 ])   // -1647298270

crc32 = CRC32.buf([83, 104])                    // -1826163454  "Sh"
crc32 = CRC32.str("eet", crc32)                 //  1191034598  "Sheet"
CRC32.bstr("JS", crc32)                         // -1647298270  "SheetJS"

[CRC32.str("\u2603"),  CRC32.str("\u0003")]     // [ -1743909036,  1259060791 ]
[CRC32.bstr("\u2603"), CRC32.bstr("\u0003")]    // [  1259060791,  1259060791 ]
[CRC32.buf([0x2603]),  CRC32.buf([0x0003])]     // [  1259060791,  1259060791 ]

// var CRC32C = require('crc-32/crc32c');       // uncomment this line if in node
CRC32C.str("SheetJS")                           // -284764294
CRC32C.bstr("SheetJS")                          // -284764294
CRC32C.buf([ 83, 104, 101, 101, 116, 74, 83 ])  // -284764294

crc32c = CRC32C.buf([83, 104])                  // -297065629   "Sh"
crc32c = CRC32C.str("eet", crc32c)              //  1241364256  "Sheet"
CRC32C.bstr("JS", crc32c)                       // -284764294   "SheetJS"

[CRC32C.str("\u2603"),  CRC32C.str("\u0003")]   // [  1253703093,  1093509285 ]
[CRC32C.bstr("\u2603"), CRC32C.bstr("\u0003")]  // [  1093509285,  1093509285 ]
[CRC32C.buf([0x2603]),  CRC32C.buf([0x0003])]   // [  1093509285,  1093509285 ]
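The seed behavior in the examples above can be reproduced with a small table-driven CRC-32. This is a minimal sketch, not the library's source: makeTable and crc32Bytes are hypothetical names, and the code assumes the standard reflected polynomial 0xEDB88320.

```javascript
// Build the 256-entry lookup table for the reflected polynomial 0xEDB88320.
function makeTable() {
  var table = new Array(256);
  for (var n = 0; n < 256; n++) {
    var c = n;
    for (var k = 0; k < 8; k++) {
      c = (c & 1) ? (0xEDB88320 ^ (c >>> 1)) : (c >>> 1);
    }
    table[n] = c | 0; // store as a signed 32-bit integer
  }
  return table;
}
var TABLE = makeTable();

// Hypothetical helper: CRC-32 of a byte array, continuing from an optional seed.
function crc32Bytes(bytes, seed) {
  var crc = (seed | 0) ^ -1;            // undo the final XOR of the previous chunk
  for (var i = 0; i < bytes.length; i++) {
    crc = (crc >>> 8) ^ TABLE[(crc ^ bytes[i]) & 0xFF];
  }
  return crc ^ -1;                      // final XOR; result is signed
}

var whole = crc32Bytes([83, 104, 101, 101, 116, 74, 83]); // "SheetJS"
var part = crc32Bytes([83, 104]);                         // "Sh"
part = crc32Bytes([101, 101, 116], part);                 // "Sheet"
part = crc32Bytes([74, 83], part);                        // "SheetJS"
console.log(whole === part); // true
```

Feeding the previous result back in as the seed gives the same checksum as hashing the concatenated data in one call, which is what makes chunked processing possible.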

Best Practices

Even though the initial seed is optional, for performance reasons it is highly recommended to explicitly pass the default seed 0.

In NodeJS with the native Buffer implementation, it is oftentimes faster to convert binary strings with Buffer.from(bstr, "binary") first:

/* Frequently slower in NodeJS */
crc32 = CRC32.bstr(bstr, 0);
/* Frequently faster in NodeJS */
crc32 = CRC32.buf(Buffer.from(bstr, "binary"), 0);

This does not apply to browser Buffer shims, and thus is not implemented in the library directly.
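As a quick sanity check that the two calls above see the same bytes, the following NodeJS sketch compares Buffer.from(bstr, "binary") against the low-byte rule described for CRC32.bstr. The helper name sameBytes is hypothetical, and the check only covers code units up to 0xFF; behavior for larger code units is worth verifying separately.

```javascript
// NodeJS only: relies on the built-in Buffer.
// Hypothetical helper: does the "binary" Buffer match charCodeAt(i) & 0xFF?
function sameBytes(bstr) {
  var buf = Buffer.from(bstr, "binary");
  if (buf.length !== bstr.length) return false;
  for (var i = 0; i < bstr.length; i++) {
    if (buf[i] !== (bstr.charCodeAt(i) & 0xFF)) return false;
  }
  return true;
}

console.log(sameBytes("SheetJS"));       // true
console.log(sameBytes("\x00\x80\xFF"));  // true
```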

Signed Integers

Unlike most CRC32 implementations, this library returns signed 32-bit integers. This is for performance reasons. Standard JS operators can convert between signed and unsigned 32-bit integers:

CRC32.str("SheetJS")                            // -1647298270 (signed)
CRC32.str("SheetJS") >>> 0                      //  2647669026 (unsigned)
(CRC32.str("SheetJS")>>>0).toString(16)         //  "9dd03922" (hex)

(2647669026 | 0)                                // -1647298270

  • x >>> 0 converts a number value to unsigned 32-bit integer.

  • x | 0 converts a number value to signed 32-bit integer.
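When the checksum is displayed or compared against other tools, a fixed-width hex string is often more convenient. The helpers below are a small sketch built on the operators above; toUnsigned and toHex are hypothetical names, not part of the library.

```javascript
// Hypothetical helpers; not part of the crc-32 API.

// Reinterpret a signed 32-bit result as an unsigned 32-bit integer.
function toUnsigned(crc) {
  return crc >>> 0;
}

// Format a CRC as 8 lowercase hex digits, left-padded with zeros.
function toHex(crc) {
  var hex = (crc >>> 0).toString(16);
  return ("00000000" + hex).slice(-8);
}

console.log(toUnsigned(-1647298270)); // 2647669026
console.log(toHex(-1647298270));      // 9dd03922
console.log(toHex(1));                // 00000001
```

Padding matters because toString(16) drops leading zeros, so small checksums would otherwise produce fewer than 8 digits.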

Testing

make test will run the nodejs-based test.

To run the in-browser tests, run a local server and go to the ctest directory. make ctestserv will start a python SimpleHTTPServer server on port 8000.

To update the browser artifacts, run make ctest.

To generate the bits file, use the crc32 function from python zlib:

>>> from zlib import crc32
>>> x="foo bar baz٪☃🍣"
>>> crc32(x)
1531648243
>>> crc32(x+x)
-218791105
>>> crc32(x+x+x)
1834240887

The included crc32.njs script can process files or standard input:

$ echo "this is a test" > t.txt
$ bin/crc32.njs t.txt
1912935186

For comparison, the included crc32.py script uses python zlib:

$ bin/crc32.py t.txt
1912935186

On OSX the command cksum generates unsigned CRC-32 with Algorithm 3:

$ cksum -o 3 < IE8.Win7.For.Windows.VMware.zip
1891069052 4161613172
$ crc32 --unsigned ~/Downloads/IE8.Win7.For.Windows.VMware.zip
1891069052

Performance

make perf will run algorithmic performance tests (which should justify certain decisions in the code).

The adler-32 project has more performance notes.

License

Please consult the attached LICENSE file for details. All rights not explicitly granted by the Apache 2.0 license are reserved by the Original Author.

js-crc32's Issues

How can I get the hex value of CRC32

I want to get the hex value of CRC32, but I cannot, because the result is always decimal. Is the only option to convert decimal to hex manually?

Incorrect CRC-32 number returned when String or BinaryString contains an integer

If you pass in "123456789" (remove double-quotes) into CRC32.bstr() or CRC32.str(), the returned item should be 3421780262 (number) and CBF43926 (hex). Please see https://www.lammertbies.nl/comm/info/crc-calculation and https://www.libcrc.org for reference.

Furthermore if any part of the string contains an integer, the CRC32 calculation returned is incorrect.

Please check into this. I have added a Jest based unit test to show the results.
crc32.spec.ts.zip

CRC32 slower than MD5

What am I missing here?

    var buf = Buffer.from(data);

    let start = Date.now();
    var checksum = crc32.buf(buf);
    console.log(checksum);
    console.log(Date.now() - start);


    var hash = crypto.createHash("md5");
    hash.setEncoding("hex");
    
    start = Date.now();
    hash.write(buf);
    hash.end();
    var checksum = hash.read()
    console.log(checksum)
    console.log(Date.now() - start);

CRC is consistently over 10ms while MD5 is consistently under 5ms.

v1.2.3 not published to npm?

I can see that this repo and the CDN contain a version 1.2.3.

Unfortunately it seems this version is not available in the npm registry. Is this intentional?

Support Castagnoli CRC32

https://stackoverflow.com/questions/26429360/crc32-vs-crc32c

We would need to replace each instance of -306674912 in the signed_crc_table function. Which means we would need to decide on a way to switch between the two constants. (Default would be -306674912 so it won't be a breaking change)

The new constant for Castagnoli is -2097792136

> 0x82F63B78 << 0
-2097792136

Please let me know:

  1. If you would be interested in adding this?
  2. Would you be comfortable with me creating a pull request for this?

If yes for both, I'll go ahead and make a proposal/PR. Thanks.
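A sketch of the proposal above: build the lookup table from a polynomial parameter so the same loop serves both variants. The function names are hypothetical and this is not the library's implementation; it assumes the usual reflected, bit-at-a-time table construction.

```javascript
var CRC32_POLY  = 0xEDB88320 | 0; // -306674912
var CRC32C_POLY = 0x82F63B78 | 0; // -2097792136 (Castagnoli)

// Build the 256-entry table for a reflected polynomial.
function makeTable(poly) {
  var table = new Array(256);
  for (var n = 0; n < 256; n++) {
    var c = n;
    for (var k = 0; k < 8; k++) {
      c = (c & 1) ? (poly ^ (c >>> 1)) : (c >>> 1);
    }
    table[n] = c | 0;
  }
  return table;
}

// Hypothetical factory: returns a (bytes, seed) -> signed CRC function.
function makeCrc(poly) {
  var table = makeTable(poly);
  return function (bytes, seed) {
    var crc = (seed | 0) ^ -1;
    for (var i = 0; i < bytes.length; i++) {
      crc = (crc >>> 8) ^ table[(crc ^ bytes[i]) & 0xFF];
    }
    return crc ^ -1;
  };
}

var crc32 = makeCrc(CRC32_POLY);
var crc32c = makeCrc(CRC32C_POLY);

var check = [49, 50, 51, 52, 53, 54, 55, 56, 57]; // ASCII "123456789"
console.log((crc32(check) >>> 0).toString(16));   // cbf43926
console.log((crc32c(check) >>> 0).toString(16));  // e3069283
```

Keeping the default polynomial at -306674912 would preserve existing behavior, as the proposal notes.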

Can this be used to generate unsigned ints?

Hey there,

I'm currently using your module like so:

    var bsonObjectid = require('bson-objectid');
    var CRC32 = require('crc-32')

    var MySchemaId = bsonObjectid().str
    var sales_order_id = CRC32.str(MySchemaId)

But I'm noticing sometimes the sales_order_id is negative. Is there a way to get the CRC32 module to only output unique positive values?

Thanks so much,
Thomas

Result is a Signed 32bit value, is this known and wanted?

I just spent a good hour trying to figure out why the results of the JS implementation did not match other implementations (I've specifically tested it against Ruby's Zlib wrapper) and then I understood: all the bitwise operators in JavaScript treat the variables as 32-bit signed values. This is not a problem per se, as the bitwise operations are usually not dependent on the signs, but unfortunately the output is also 32-bit signed. I've looked at various implementations now and it seems to me that usually the output value of the CRC process should be an unsigned 32-bit value.

Here's a simple test case (the numbers are not very relevant, it just takes a certain combination to trigger the sign problem):

# ruby example
a  =[ 68,69,77,79,220,187,0,0,0,0,0,0,0,0,0,
0,68,101,109,111,32,83,101,115,115,105,
111,110,32,32,32,32,32,32,32,32,32,32,32,
32,32,32,32,32,32,32,32,32,158,50,0,0,0,0,
0,0,0,0,0,0,0,7,0,0,0,3,0,0,0,3,0,0,1,1,0,0,
15,102,0,0,72,13,0,0,69,13,0,0,77,13,0,0,65,
13,0,0,0,0,0,0,0,0,0,0,0,96,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0]

a_string = a.map(&:chr).join('')

Zlib::crc32(a_string)
=> 2795502179

// JS example
a  =[ 68,69,77,79,220,187,0,0,0,0,0,0,0,0,0,
0,68,101,109,111,32,83,101,115,115,105,
111,110,32,32,32,32,32,32,32,32,32,32,32,
32,32,32,32,32,32,32,32,32,158,50,0,0,0,0,
0,0,0,0,0,0,0,7,0,0,0,3,0,0,0,3,0,0,1,1,0,0,
15,102,0,0,72,13,0,0,69,13,0,0,77,13,0,0,65,
13,0,0,0,0,0,0,0,0,0,0,0,96,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0];

CRC.buf(a)
=> -527573939

If you convert this value to an unsigned value (Stack Overflow never disappoints) by CRC.buf(a) >>> 0, you get the correct result.

I've also checked this against the crc32 utility of Mac OS X by converting the numbers into a text file and the result matches.

I'll do a pull request if this makes things easier.

Cannot convert the CRC value of a large byte array to an unsigned integer

I have a byte array that I am trying to compute a CRC on. I get a signed integer, as the library describes, but when I try to convert the signed integer to an unsigned integer I get the same value.

Here is an example where I pass in an array of integers and get crcValue = 1070418610 in the console:

var crcValue = CRC32.buf([0, 128, 0, 1, 35, 91, 123, 34, 73, 115, 69, 109, 112, 116, 121, 34, 58, 102, 97, 108, 115, 101, 44, 34, 88, 34, 58, 50, 56, 50, 44, 34, 89, 34, 58, 55, 52, 53, 125, 93]); console.log('crcValue', crcValue);

Here I try to convert from a signed to an unsigned 32-bit integer and get the same value, 1070418610, back:

var crcValue = CRC32.buf([0, 128, 0, 1, 35, 91, 123, 34, 73, 115, 69, 109, 112, 116, 121, 34, 58, 102, 97, 108, 115, 101, 44, 34, 88, 34, 58, 50, 56, 50, 44, 34, 89, 34, 58, 55, 52, 53, 125, 93]) >>> 0; console.log('crcValue', crcValue);

But with a small array it works. This returns -1950775789:

var crcValue = CRC32.buf([0, 1, 2, 3]); console.log('crcValue', crcValue);

And this returns 2344191507:

var crcValue = CRC32.buf([0, 1, 2, 3]) >>> 0; console.log('crcValue', crcValue);

What is wrong with the big array?
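A likely explanation, offered as a sketch rather than a confirmed diagnosis: x >>> 0 only changes values that are negative as signed 32-bit integers. 1070418610 is below 2^31, so it is already non-negative and the conversion returns it unchanged; the array size is not the deciding factor. The helper name toUnsigned is hypothetical.

```javascript
// Hypothetical helper: reinterpret a signed 32-bit value as unsigned.
function toUnsigned(x) {
  return x >>> 0;
}

// Already non-negative: the signed and unsigned readings are identical.
console.log(toUnsigned(1070418610));        // 1070418610
console.log(1070418610 < Math.pow(2, 31));  // true

// Negative: the unsigned reading is the value plus 2^32.
console.log(toUnsigned(-1950775789));       // 2344191507
console.log(-1950775789 + Math.pow(2, 32)); // 2344191507
```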

Incorrect crc returned for a surrogate character

The crc for surrogate characters is not correctly computed.
Here is an example with the U+24B62 character (𤭢):

CRC32.str('𤭢')

Result: -40863161
Expected: 1512193127

Function related to this issue:

function crc32_str(str) {
    for(var crc = -1, i = 0, L=str.length, c, d; i < L;) {
        c = str.charCodeAt(i++);
        if(c < 0x80) {
            crc = (crc >>> 8) ^ table[(crc ^ c) & 0xFF];
        } else if(c < 0x800) {
            crc = (crc >>> 8) ^ table[(crc ^ (192|((c>>6)&31))) & 0xFF];
            crc = (crc >>> 8) ^ table[(crc ^ (128|(c&63))) & 0xFF];
        } else if(c >= 0xD800 && c < 0xE000) {
            c = (c&1023)+64; d = str.charCodeAt(i++) & 1023;
            crc = (crc >>> 8) ^ table[(crc ^ (240|((c>>8)&7))) & 0xFF];
            crc = (crc >>> 8) ^ table[(crc ^ (128|((c>>2)&63))) & 0xFF];
            crc = (crc >>> 8) ^ table[(crc ^ (128|((d>>6)&15)|(c&3))) & 0xFF];
            crc = (crc >>> 8) ^ table[(crc ^ (128|(d&63))) & 0xFF];
        } else {
            crc = (crc >>> 8) ^ table[(crc ^ (224|((c>>12)&15))) & 0xFF];
            crc = (crc >>> 8) ^ table[(crc ^ (128|((c>>6)&63))) & 0xFF];
            crc = (crc >>> 8) ^ table[(crc ^ (128|(c&63))) & 0xFF];
        }
    }
    return crc ^ -1;
}

Fix:

function crc32_str(str) {
    for(var crc = -1, i = 0, L=str.length, c; i < L;) {
        c = str.charCodeAt(i++);
        if(c < 0x80) {
            crc = (crc >>> 8) ^ table[(crc ^ c) & 0xFF];
        } else if(c < 0x800) {
            crc = (crc >>> 8) ^ table[(crc ^ (192|((c>>6)&31))) & 0xFF];
            crc = (crc >>> 8) ^ table[(crc ^ (128|(c&63))) & 0xFF];
        } else if(c >= 0xD800 && c < 0xE000) {
            c = (((c&1023) << 10)|((str.charCodeAt(i++) & 1023))) + 0x10000;
            crc = (crc >>> 8) ^ table[(crc ^ (240|(c>>18))) & 0xFF];
            crc = (crc >>> 8) ^ table[(crc ^ (128|((c>>12)&63))) & 0xFF];
            crc = (crc >>> 8) ^ table[(crc ^ (128|((c>>6)&63))) & 0xFF];
            crc = (crc >>> 8) ^ table[(crc ^ (128|(c&63))) & 0xFF];
        } else {
            crc = (crc >>> 8) ^ table[(crc ^ (224|((c>>12)&15))) & 0xFF];
            crc = (crc >>> 8) ^ table[(crc ^ (128|((c>>6)&63))) & 0xFF];
            crc = (crc >>> 8) ^ table[(crc ^ (128|(c&63))) & 0xFF];
        }
    }
    return crc ^ -1;
}
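The surrogate branch of the fix can be checked in isolation against a standard UTF-8 encoder. This is a minimal sketch: surrogatePairToUtf8 is a hypothetical helper that mirrors the bit manipulation in the fix, and TextEncoder is assumed to be available.

```javascript
// Hypothetical helper: UTF-8 bytes for a surrogate pair.
// hi and lo are the two UTF-16 code units of the pair.
function surrogatePairToUtf8(hi, lo) {
  // Recombine the pair into a single code point (same formula as the fix).
  var c = (((hi & 1023) << 10) | (lo & 1023)) + 0x10000;
  return [
    240 | (c >> 18),
    128 | ((c >> 12) & 63),
    128 | ((c >> 6) & 63),
    128 | (c & 63)
  ];
}

var s = "\uD852\uDF62"; // U+24B62
var bytes = surrogatePairToUtf8(s.charCodeAt(0), s.charCodeAt(1));
console.log(bytes.map(function (b) { return b.toString(16); }).join(" ")); // f0 a4 ad a2

// Compare with the platform's UTF-8 encoder.
var expected = Array.from(new TextEncoder().encode(s));
console.log(bytes.join() === expected.join()); // true
```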

CRC32 function produces incorrect values by default

It appears that this library isn't producing expected results out of the box:

const crc32 = require("crc-32")
console.log(crc32.str('SheetJS').toString(16));
>
"-622fc6de"

However:

const crc32 = require("crc-32")
console.log((crc32.str('SheetJS') >>> 0).toString(16));
>
"9dd03922"

Same thing in Ruby:

require 'zlib'
puts(Zlib::crc32('SheetJS').to_s(16))
>
9dd03922

Same thing in Rust:

fn main() {
    let checksum = crc32fast::hash(b"SheetJS");
    println!("{:x}", checksum);
}
>
9dd03922

null values

Hello! Please excuse the English; it was translated from Spanish by Google Translate.

I am using the following example https://github.com/SheetJSDev/js-xlsx-demo. Everything works fine, but it fails when I send a null value.
I made a small change in the function "sheet_from_array_of_arrays", but it would be good if the script handled this itself.

function sheet_from_array_of_arrays(data) {
  var row_num = data.length;
  var keys = Object.keys(data[0])
  var col_num = keys.length;

  var ws = {};
  var range = { s: { c: 0, r: 0 }, e: { c: col_num, r: row_num } };
  ws['!ref'] = XLSX.utils.encode_range(range);

  for (var R = 0; R < row_num; R++) {
    for (var C = 0; C < col_num; C++) {
      var cell_ref = XLSX.utils.encode_cell({ c: C, r: R });
      var cell = { v: data[R][keys[C]] };
      cell.t = 's';
      // My two cents (start)
      if (cell.v == null) {
        cell.v = '';
      }
      // My two cents (end)
      ws[cell_ref] = cell;
    }
  }
  return ws;
}

CRC sometimes returns negative value

I know that this problem has already been raised more than once.
As an example, take a PNG file, in which the checksum is calculated using exactly the same algorithm. Given this data:

[0x49, 0x48, 0x44, 0x52, 0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x00, 0x20, 0x08, 0x02, 0x00, 0x00, 0x00]

In the file, this data has the following CRC: FC18EDA3

The same algorithm returns a negative value: -3E7125D

If you look at other implementations of this algorithm, you will notice that they all return an unsigned value. Even in Python, this problem has already been fixed:

from zlib import crc32

crc32(b"SheetJS") # 2647669026

For example, unsigned numbers are used here so that the algorithm works correctly:
http://stigge.org/martin/pub/SAR-PR-2006-05.pdf

Maybe it's worth using >>> 0 to work correctly?
Or make a separate function for unsigned calculation🤔
