atoi_simd's People

Contributors

orlp, rodmitry, shnatsel, steffahn


atoi_simd's Issues

Parsing very large `i32` and `u32` returns an incorrect number instead of an error

The following test case causes an integer overflow:

#[test]
fn overflow() {
    let input = [55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 56, 55, 55, 55, 55, 55, 55, 55, 55, 56, 45];
    crate::parse::<i32>(&input);
}

It happens in this line:

res = res * 10 + digit;

Integer overflow panics in debug builds by default and wraps silently in release builds. If you are confident this overflow is benign, use wrapping_add and wrapping_mul. If it needs to be handled for correctness, use checked_mul and checked_add (checked_mul looks like it would be more efficient than the overflow! macro, but I'm not willing to claim that without benchmarks).
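A minimal sketch of the checked variant, assuming a plain scalar loop over ASCII digits. The function name accumulate_checked is hypothetical and not part of atoi_simd; it only illustrates how checked_mul and checked_add turn the overflow into a recoverable None.

```rust
/// Accumulates ASCII digits into an i32.
/// Returns None on overflow or on any non-digit byte.
/// Hypothetical helper for illustration; not the crate's actual code.
/// Note: an empty slice yields Some(0) in this sketch.
pub fn accumulate_checked(digits: &[u8]) -> Option<i32> {
    let mut res: i32 = 0;
    for &b in digits {
        let digit = match b {
            b'0'..=b'9' => (b - b'0') as i32,
            _ => return None,
        };
        // checked_mul / checked_add return None instead of panicking
        // (debug) or wrapping (release) when the result does not fit.
        res = res.checked_mul(10)?.checked_add(digit)?;
    }
    Some(res)
}
```

With this shape, a digit string that is too long for i32 yields None rather than a wrapped value.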

Out-of-bounds Read

This bounds check is wrong:

let mut val: u64 = unsafe { read_unaligned(s.get_safe_unchecked(8..).as_ptr().cast()) };

It should be

let mut val: u64 = unsafe { read_unaligned(s.get_safe_unchecked(8..16).as_ptr().cast()) };

Applying this change causes a number of tests to fail.

This issue can also be observed using Miri. If I run cargo +nightly miri test without the .cargo/config in this repo, I get at least these two UB reports:

196 |             let val: u64 = unsafe { read_unaligned(s.get_safe_unchecked(8..).as_ptr().cast()) };
    |                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ using uninitialized data, but this operation requires initialized memory
    |
   --> src/fallback.rs:145:41
    |
145 |             let mut val: u64 = unsafe { read_unaligned(s.get_safe_unchecked(8..).as_ptr().cast()) };
    |                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ memory access failed: alloc29045 has size 9, so pointer to 8 bytes starting at offset 8 is out-of-bounds
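For comparison, here is a sketch of a fully bounds-checked read of the same 8 bytes using only safe code. The name read_second_chunk is hypothetical, and returning None for short inputs is an assumption; the crate would still need a separate path for inputs shorter than 16 bytes, which is presumably why tests fail after the change above.

```rust
/// Reads bytes 8..16 of `s` as a native-endian u64 (the same byte order
/// read_unaligned would observe), or returns None when `s` has fewer
/// than 16 bytes. Hypothetical helper for illustration only.
pub fn read_second_chunk(s: &[u8]) -> Option<u64> {
    // `get` performs the bounds check and never reads past the slice.
    let src = s.get(8..16)?;
    let mut chunk = [0u8; 8];
    chunk.copy_from_slice(src);
    Some(u64::from_ne_bytes(chunk))
}
```

A 9-byte input, like the allocation in the second Miri report, gives None here instead of reading out of bounds.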

Some `u128` and `i128` values are parsed incorrectly instead of returning an error

The number "707071770707000177170017011770740070701" (or, in &[u8] notation, [55, 48, 55, 48, 55, 49, 55, 55, 48, 55, 48, 55, 48, 48, 48, 49, 55, 55, 49, 55, 48, 48, 49, 55, 48, 49, 49, 55, 55, 48, 55, 52, 48, 48, 55, 48, 55, 48, 49]) is parsed as 26507036865123250243267796907203647789 by atoi_simd. The standard library correctly returns an error in this case.

It only happens with AVX enabled, so it is likely a symptom of #6.

Found using cargo fuzz.
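The expected behaviour can be pinned down against the standard library. This is a small sketch; overflows_u128 is a hypothetical helper name, and it only uses str::parse and std::num::IntErrorKind.

```rust
use std::num::IntErrorKind;

/// Returns true when the standard library rejects `s` because the value
/// is too large for u128. Hypothetical helper for illustration only.
pub fn overflows_u128(s: &str) -> bool {
    match s.parse::<u128>() {
        Err(e) => *e.kind() == IntErrorKind::PosOverflow,
        Ok(_) => false,
    }
}
```

The 39-digit number above starts with 7, while u128::MAX starts with 3 and has the same number of digits, so the standard library reports a positive overflow for it.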

Parsing numbers starting with 0 is not supported

Add the following test to the codebase:

#[test]
fn test_parse_i64_fuzzed() {
    let input = b"02221222122210221223";
    let result = parse::<i64>(input).unwrap();
    assert_eq!(result, 02221222122210221223);
}

Run it with SIMD disabled (with the .cargo/config.toml file deleted) and you'll see that it fails. Meanwhile the standard library parses this number correctly.

This bug was discovered using cargo-fuzz. You can find a guide to it here and the fuzzing setup I used here. I will contribute the fuzzing setup to your repository eventually.
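The kind of comparison a fuzz target makes can be sketched as a differential check against the standard library. This is not the fuzzing setup mentioned above; agrees_with_std and the closure-based parser parameter are assumptions made for illustration.

```rust
/// Returns true when `parse` and the standard library agree on `input`:
/// both succeed with the same value, or both fail.
/// Hypothetical helper; a real harness would call the crate's parser here.
pub fn agrees_with_std<F>(input: &[u8], parse: F) -> bool
where
    F: Fn(&[u8]) -> Option<i64>,
{
    // Reference result: interpret the bytes as UTF-8 and use str::parse.
    let expected = std::str::from_utf8(input)
        .ok()
        .and_then(|s| s.parse::<i64>().ok());
    parse(input) == expected
}
```

A parser that rejects b"02221222122210221223" fails this check, because the standard library returns 2221222122210221223 for it.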

Parsing `i128` and `u128` may cause integer overflow when AVX is used

The following test case causes a panic in debug mode when AVX is in use:

#[test]
fn overflow_u128() {
    let input = [54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 50, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 51, 54, 54, 54];
    crate::parse::<u128>(&input);
}

The overflow happens on this line:

((*arr.get_safe_unchecked(0) as u128 * 10_000_000_000_000_000

See #5 for potential solutions.
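As a rough sketch of the checked approach for this case, assuming the value is assembled from 16-digit chunks (suggested by the 10_000_000_000_000_000 multiplier above): combine_chunks and CHUNK_BASE are hypothetical names, and the real code may be structured differently.

```rust
/// 10^16: the weight of one 16-digit chunk (assumed from the multiplier
/// in the overflowing line).
const CHUNK_BASE: u128 = 10_000_000_000_000_000;

/// Combines chunks, most significant first, into a u128.
/// Each chunk is assumed to be below 10^16 (not checked here).
/// Returns None if the combined value does not fit in u128.
pub fn combine_chunks(chunks: &[u64]) -> Option<u128> {
    chunks.iter().try_fold(0u128, |acc, &c| {
        acc.checked_mul(CHUNK_BASE)?.checked_add(c as u128)
    })
}
```

With 39 digits split as 7 + 16 + 16, a leading chunk that is too large makes the last multiplication return None instead of panicking or wrapping.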

parse_skipped can't parse zeros

parse_skipped strips all leading zeros. When the input consists only of zeros, nothing is left to parse, so atoi_simd::parse_skipped(b"0") returns Err(Empty).

Some test cases that should all parse to zero:

assert_eq!(parse_skipped::<i8>(b"0"), Ok(0));
assert_eq!(parse_skipped::<i8>(b"-0"), Ok(0));
assert_eq!(parse_skipped::<i8>(b"-0000000000000000000000000000"), Ok(0));
assert_eq!(parse_skipped::<i8>(b"+0"), Ok(0));
assert_eq!(parse_skipped::<i8>(b"+0000000000000000000000000000"), Ok(0));
assert_eq!(parse_skipped::<u8>(b"0"), Ok(0));
assert_eq!(parse_skipped::<u8>(b"+0"), Ok(0));
assert_eq!(parse_skipped::<u8>(b"+0000000000000000000000000000"), Ok(0));

The same cases should be repeated for i16, i32, i64, i128 and u16, u32, u64, u128.
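One possible fix is to keep the last zero when the digits are all zeros. The sketch below assumes the sign has already been removed; strip_leading_zeros is a hypothetical helper, not the crate's actual function.

```rust
/// Strips leading ASCII zeros but keeps a single "0" when the input
/// consists only of zeros. An empty input stays empty.
/// Hypothetical helper; assumes any sign has already been removed.
pub fn strip_leading_zeros(digits: &[u8]) -> &[u8] {
    match digits.iter().position(|&b| b != b'0') {
        // A non-zero byte exists: drop everything before it.
        Some(first_nonzero) => &digits[first_nonzero..],
        // Only zeros: keep the final one so the parser still sees a digit.
        None if !digits.is_empty() => &digits[digits.len() - 1..],
        // Empty input: nothing to keep.
        None => digits,
    }
}
```

With this, b"0" and b"0000000000000000000000000000" both reduce to b"0", so the parser receives a digit and can return Ok(0).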
