
memchr's Issues

Compilation Issue on rust 1.41.0

error[E0428]: the name `imp` is defined multiple times
   --> .cargo/registry/src/github.com-1ecc6299db9ec823/memchr-2.3.1/src/lib.rs:148:5
    |
139 |     fn imp(n1: u8, haystack: &[u8]) -> Option<usize> {
    |     ------------------------------------------------ previous definition of the value `imp` here
...
148 |     fn imp(n1: u8, haystack: &[u8]) -> Option<usize> {
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `imp` redefined here
    |
    = note: `imp` must be defined only once in the value namespace of this block

Plans for searching for 4 or 5 bytes or for u16?

This isn't quite a feature request yet, just a question at this point:

Is providing memchr4 considered to be within scope for this crate? What about analogs of the byte-searching functions that search for up to 4 u16 values in a &[u16]?

(Context: Escaping a string according to HTML.)

What about up to 5? (Context: the Data state of the HTML parsing algorithm taking into account CR and LF for end-of-line handling.)
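
For concreteness, here is a minimal scalar sketch of the semantics being asked about; memchr4_naive is a hypothetical name and not an existing API in this crate:

fn memchr4_naive(n1: u8, n2: u8, n3: u8, n4: u8, haystack: &[u8]) -> Option<usize> {
    // Index of the first byte equal to any of the four needles.
    haystack
        .iter()
        .position(|&b| b == n1 || b == n2 || b == n3 || b == n4)
}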

Memchr causes LLVM ERROR: Do not know how to split this operator's operand!

Minimal reproduction

Memchr 2.2 and up don't compile on the x86_64-unknown-uefi target using the latest nightly, currently rustc 1.47.0-nightly (e15510ca3 2020-08-20), with "LLVM ERROR: Do not know how to split this operator's operand!"

memchr 2.1 and below compile, however.

Related:

Issue #57 is related, but that was a year ago and involved a custom target, whereas this is an upstream one. Sorry if this is a duplicate; I don't know where else this should be reported.

drop libc and use vendor intrinsics

I've spent the evening thinking about this, and I think it's possible to do and match glibc's performance, but I haven't experimented yet. I'm going to take a crack at this soon.

Consider implementing memchr with arbitrary char predicate

Could be something like this:

trait CharPred: CharPredSecret {}

struct CharEq(u8); impl CharPred for CharEq {}
struct CharLtSigned(i8); impl CharPred for CharLtSigned {}
struct CharLtUnsigned(u8); impl CharPred for CharLtUnsigned {}
struct CharGtSigned(i8); impl CharPred for CharGtSigned {}
struct CharGtUnsigned(u8); impl CharPred for CharGtUnsigned {}

struct CharOr<A: CharPred, B: CharPred>(A, B);
impl<A: CharPred, B: CharPred> CharPred for CharOr<A, B> {}

fn memchr_pred(needle: impl CharPred, haystack: &[u8]) -> Option<usize> { ... }

This could be useful in scenarios like this:

fn need_escape(s: &str) -> bool {
  let pred = CharOr(
    CharOr(CharEq(b'"'), CharEq(b'\'')),
    CharOr(CharEq(b'\\'), CharLtSigned(32)),
  );
  memchr::memchr_pred(pred, s.as_bytes()).is_some()
}

fn c_escape(s: &str) -> String {
  if !need_escape(s) { return s.to_owned(); }
  // else slow iteration over character
}
// trait not exposed to user
trait CharPredSecret {
  fn eval<V: Vector>(self, arg: V) -> V;
}

impl CharPredSecret for CharEq {
  fn eval<V: Vector>(self, arg: V) -> V {
    // compiler should be smart enough to move splat out of the loop
    let c = V::splat(self.0);
    V::cmpeq(arg, c)
  }
}

The ifunc macro cannot be used in such a memchr_pred function because there are no generic statics, but that is probably OK; it should at least be better than the non-SIMD version.

Potential use of `core::hint::unreachable_unchecked` to avoid bounds checks for all users

Consider the following code. It's used as a wrapper in one of my projects to avoid bounds checks in safe code.

#[inline]
fn chr(s: &[u8], b: u8) -> Option<usize> {
    memchr::memchr(b, s).map(|i| {
        if i >= s.len() {
            unsafe { core::hint::unreachable_unchecked() }
        }
        i
    })
}
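
As a usage example of the wrapper above (not from the original report): the hint lets subsequent slicing be done in safe code, and the compiler should be able to elide its bounds checks.

fn split_at_byte(s: &[u8], b: u8) -> Option<(&[u8], &[u8])> {
    // Because of the hint in `chr`, the compiler knows `i < s.len()` and
    // should elide the bounds checks in these slice operations.
    chr(s, b).map(|i| (&s[..i], &s[i..]))
}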

I wonder if it's practical and sound to insert unreachable hints into the memchr crate, so that all of its users could get an increase in performance. It's just a tentative suggestion that needs more discussion :)

approaching 1.0

This crate provides a small, reasonably well-defined API and probably won't ever see any breaking changes. While it may still need new additions or improved platform support, I expect it won't require backwards-incompatible changes. Therefore, I propose cutting a 1.0 release in the next few weeks.

cc @bluss @nicokoch

Support miri

The crate does not seem to work well with miri at the moment for multiple reasons. One reason is that the crate uses the x86 / C implementations over the fallback implementation, which miri doesn't support. But even when cfg'ing out those implementations via cfg(miri), the fallback implementation uses a lot of bit math on pointers, which miri also doesn't like.

Off by one index in Memchr iterator

It seems surprising to me that when using the Memchr iterator, the position that is returned when calling .next() is the position of the needle + 1. Looking at the tests of the crate, this is expected behavior. Could you please explain why it is this way? I could then make a PR with additional documentation and an example to show usage of the Memchr (and Memchr2/3) iterator.

Runtime configuration of byte frequency table used to classify rare bytes

memchr implements a generic SIMD accelerated search that is ideal for implementing something like CheatEngine where you scan the memory of an executable process to aid reverse engineering. This process involves repeatedly scanning for possibly millions of small values (u16, i32, ...) in the memory of that process. The user might have information ahead of time about the frequency distribution of bytes in the memory being scanned, which may vary wildly between executables. The user might also be able to control the program to ensure certain rare bytes appear in the program's memory at certain times.

There is an issue that prevents memchr from performing optimally when scanning binary executables: the byte frequency table. The core algorithm is based on detecting rare bytes at specific positions in the haystack (the prefilter) and then testing those candidate matches to check if the needle has been found. As mentioned in the incredibly detailed comments, the performance of this algorithm is highly dependent on the byte frequency table used to determine what is a rare byte. While the table that is included in memchr is optimal for the majority of cases, there are some specific data types that have very different byte frequency distributions, which causes memchr to perform worse on those inputs than it otherwise might with a different byte frequency table.

To illustrate this point, consider the following byte frequency ranks, where "ideal" is the ideal rank for an x86 binary and a higher rank means a more common byte:

byte   memchr rank   ideal rank
\x00   55            255
\xdd   255           0
\x8b   80            186
H      150           254

Now, consider scanning for the needle H\x00\xdd\x8b in an x86 binary. memchr would identify \x00 and \x8b as the rarest bytes, when they are in fact common bytes. Even if memchr considered \x00 to be a frequent byte via configuration, it would still choose H and \x8b as the rarest bytes, which are both much more common than \xdd, the only actually rare byte. This would result in a lot of unnecessary false positives, decreasing the throughput. This is a simple case, but it is easy to extend this idea to many other pathological input sequences that defeat the default frequency table, and might also reasonably appear in an executable or be scanned for by a user.

Now consider a haystack that contains HHH\x00\xdd\x8b. The user might know in advance that searching for HHH\x00 and searching for H\x00\xdd\x8b will both return a single unique match, the sub-slice that was mentioned earlier (the exact indices are not identical but that is not the point). The user might also know that \xdd is a very rare byte in their dataset. The user should be able to choose scanning for \xdd instead of a more common byte to speed up their searches. I cannot imagine how to support something like this without providing the user a mechanism for customizing the byte frequency table.

The proposed solution is to allow the user to specify the byte frequency table at runtime by modifying the memchr::memmem::rarebytes::rank function. Currently, this function reads from the global byte frequency table.

My first idea was to create an enum that can be provided to a FinderBuilder and then forwarded to RareNeedleBytes to choose the table:

enum ByteFrequencies<'a> {
    Default,
    Custom(&'a [u8; 256]),   
}

This enum can be stored in the NeedleInfo struct and used at runtime to determine which byte frequency table to use. However, this introduces the lifetime 'a, which may or may not be the same as the needle ('n) and haystack ('h) lifetimes that are stored in related structs. Considering lifetime 'a to be separate and different requires the public API of Finder to be changed to add this lifetime.

  • ByteFrequencies -> NeedleInfo -> Searcher -> Finder

I believe that the extra lifetime might make life more difficult for the compiler, which may be why I observed a small but noticeable (around 10%) impact on the performance of constructing a Finder with the default frequency table on my local machine.

Also, by introducing a new member on the struct NeedleInfo, the size/alignment properties of Finder, Searcher and NeedleInfo changed, which also might be the reason for the performance impact I observed. (if this sounds crazy to anyone I suggest you take a look at the wonderful performance talk by the legend Emery Berger titled 'Performance Matters' for more details https://www.youtube.com/watch?v=r-TLSBdHe1A).

An idea to remove the generic lifetime from ByteFrequencies:

enum ByteFrequencies {
    Default,
    Custom(&'static [u8; 256]),   
}

However, I believe this static API is logically inconsistent with the FinderBuilder API. You can construct millions of unique Finders at runtime and then discard them later, but the same cannot be said for static arrays.

Also, the user might want to perform analysis of their specific corpus at runtime to generate a specialized byte frequency table (like 'pre-training'). This is a very interesting use case in the context of the analysis of binary executables, as there is a lot of information that can only be obtained at runtime and can be useful in optimizing many kinds of searches. Forcing the user to use a static byte frequency table would necessarily prevent this use-case.
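
As a sketch of that "pre-training" idea (hypothetical code, not part of memchr), a rank table could be derived from a sample of the corpus at runtime and then handed to something like the Custom variant discussed above:

fn rank_table_from_sample(sample: &[u8]) -> [u8; 256] {
    // Count occurrences of each byte value in the sample.
    let mut counts = [0u64; 256];
    for &b in sample {
        counts[b as usize] += 1;
    }
    // Sort byte values by frequency and assign ranks by position, so that
    // 0 is the rarest byte and 255 is the most common one.
    let mut order: Vec<u8> = (0u8..=255).collect();
    order.sort_by_key(|&b| counts[b as usize]);
    let mut ranks = [0u8; 256];
    for (rank, &b) in order.iter().enumerate() {
        ranks[b as usize] = rank as u8;
    }
    ranks
}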

Another idea to remove the generic lifetime and also allow runtime generation of the byte frequency table:

enum ByteFrequencies {
    Default,
    Custom(Box<[u8; 256]>),
}

However, an issue with this approach is that the ByteFrequencies enum has a size of 16 bytes which is mostly wasted. Another issue is that it seems that conceptually we should be passing around some kind of reference to a byte table that can be reused, instead of copying the table for each construction, but that ultimately depends on benchmarks. Also, now the standard library and memory allocation are required for an operation that is unrelated to both of those things (Rc, Arc and others have similar issues).

I also tried storing the byte table inline, but this had disastrous results on performance. This is probably because this extra storage pushed important members on related structs into new cache lines, which affected subsequent operations on these members.

enum ByteFrequencies {
    Default,
    Custom([u8; 256]),
}

One thing I have not tried yet but might be interesting is trying to re-organize the members and memory layout of any struct that stores a ByteFrequencies object. This might allow using an inline byte frequency table for example, but would likely result in breaking changes to the layout of public structs in memchr. Even just introducing the ByteFrequencies object already changes the memory layout of certain structs, which I am not sure about whether it is something undesirable or not.

All of this culminated in the pull request I submitted, but I realize now it is better to just lay it all out here and figure out the best path forward together. I appreciate any feedback you may have on these suggestions.

P.S. I think memchr is an incredible library and the code quality and detail of documentation definitely helped me greatly in understanding the internals and even being able to suggest this in the first place, so kudos.

Poor performance on Zen 1/Threadripper due to loop unrolling

I have noticed that the SSE2 implementation of memchr in this crate unrolls the loop 4x. Unfortunately, this seems to lead to a significant performance drop on processors on the Zen 1 architecture. Benchmarked on a TR 1950x, I see about 50-60% better performance compared to this crate when avoiding loop unrolling altogether. Below is a benchmark demonstrating this. The memchr implementation in the repository linked below is written in such a way so that you just have to change one constant (UNROLL_SIZE) to change the amount of loop unrolling that the function uses for the main loop.

https://github.com/redzic/memchr-demonstration

Just clone the repository and run cargo run --release to run the benchmark.

Increasing UNROLL_SIZE leads to worse performance on my TR 1950x, with 4x unrolling being basically the same speed as this crate, which makes sense. However, with UNROLL_SIZE = 8, the custom implementation again spikes to being about 10% faster than this crate (i.e., this crate's SSE2 memchr is about 90.9% as fast).

Would it be possible to tune the unroll factor, or possibly even do something similar to OpenBLAS, so that we query information about the CPU such as cache size or even exact CPU model, and dispatch code accordingly? Perhaps this functionality could be implemented behind some kind of feature flag.
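
To make the dispatch idea concrete, here is a hedged sketch; everything in it (the vendor heuristic, the returned factors) is hypothetical and not memchr's actual code:

#[cfg(target_arch = "x86_64")]
fn pick_unroll_factor() -> usize {
    use core::arch::x86_64::__cpuid;
    // Leaf 0 returns the CPU vendor string in EBX, EDX, ECX
    // ("AuthenticAMD", "GenuineIntel", ...).
    let id = unsafe { __cpuid(0) };
    let mut vendor = [0u8; 12];
    vendor[0..4].copy_from_slice(&id.ebx.to_le_bytes());
    vendor[4..8].copy_from_slice(&id.edx.to_le_bytes());
    vendor[8..12].copy_from_slice(&id.ecx.to_le_bytes());
    if &vendor == b"AuthenticAMD" {
        // Hypothetical choice: less unrolling on AMD parts like Zen 1.
        2
    } else {
        4
    }
}

A real tuning decision would need the family/model information from CPUID leaf 1 to single out Zen 1 specifically, which is part of why a feature flag might be a simpler starting point.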

Clarity with regard to `--no-default-features` leading to `fallback::memchr`

First and foremost, thank you for the crate Andrew.

I’ve been employing it for a while, and I’ve recently made one of the libraries I’m working on no_std. Coincidentally, while doing so, I altered my benchmarking habits and never caught the fact that --no-default-features leads to fallback::memchr (because std::is_x86_feature_detected is unavailable); I only discovered the change via cargo asm.

What do you think of extending the README’s no_std section to mention the otherwise “silent” requirement?

(I am now using std_detect with an extern no_std memchr. Considering you work on that too, I assume you’ll want it to be stable before adopting it.)

memchr 2.4.0 msrv breaks nom 5.1.2 msrv of 1.37

See rust-bakery/nom#1313.

nom 5.1 depends on `memchr = "^2.0"`. It has a documented msrv of 1.37.

Since 2.4.0 is supposed to be semver-compatible, cargo will select memchr 2.4.0, which does not build on 1.37.

I understand that balancing msrv vs semver can be a drag on maintainer productivity, so I have no real expectations here other than documenting that it happened.

Is there a way to also support memrchr?

I was wondering if it would be possible for this library to also support memrchr?
I can see this might be difficult because memrchr is not POSIX-defined functionality.

And another (kinda related) question: Does this crate work on all platforms (especially windows)?
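
For reference, a naive (non-vectorized) reverse search can be written in a couple of lines; this is just a sketch of the requested semantics, not an implementation proposal:

fn memrchr_naive(needle: u8, haystack: &[u8]) -> Option<usize> {
    // Index of the last occurrence of `needle`, scanning from the back.
    haystack.iter().rposition(|&b| b == needle)
}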

Provide lines-iterator

I guess this came up before because it's so obvious: Provide an iterator akin to std::str::lines, but using memchr to search for line endings. The impl in core::memchr came up in my benchmarks and it turns out the code below is 2x to 3x faster than what std::str::lines() does, both in synthetic and real-life code.

pub fn lines(inp: &[u8]) -> impl Iterator<Item = &[u8]> {
    let mut inp = inp;
    std::iter::from_fn(move || {
        if inp.is_empty() {
            return None;
        }
        let ending = memchr::memchr(b'\n', inp).unwrap_or(inp.len() - 1) + 1;
        let (mut line, rest) = inp.split_at(ending);
        inp = rest;
        if let Some(b'\n') = line.last() {
            line = &line[..line.len() - 1];
            if let Some(b'\r') = line.last() {
                line = &line[..line.len() - 1];
            }
        }
        Some(line)
    })
}

pub fn str_lines(inp: &str) -> impl Iterator<Item = &str> {
    lines(inp.as_bytes()).map(|sl| unsafe { std::str::from_utf8_unchecked(sl) })
}

Is the performance difference (on my machine!) reason enough to include it in memchr? Feel free if so :-)

Error when building `memchr` with custom rustflags `-static`

error: failed to run custom build command for `memchr v2.4.1`

Caused by:
  could not execute process `/mnt/c/Users/asaff/Documents/Dev/Github/tool/target/debug/build/memchr-b9330b1f01949571/build-script-build` (never executed)

Caused by:
  No such file or directory (os error 2)

config:

[build]
rustflags = ["-C", "link-arg=-nostdlib", "-C", "link-arg=-static", "-C", "relocation-model=pic"]

When compiling with only link-arg=-nostdlib I get the following crash:

error: failed to run custom build command for `memchr v2.4.1`

Caused by:
  process didn't exit successfully: `/mnt/c/Users/asaff/Documents/Dev/Github/tool/target/debug/build/memchr-b9330b1f01949571/build-script-build` (signal: 11, SIGSEGV: invalid memory reference)
warning: build failed, waiting for other jobs to finish...
error: build failed

Thanks!

I was working on something that can parse ninja.build files. It used to be twice as slow as ninja itself, but after replacing just a few lines with memchr and memchr3 from this crate, it is now almost twice as fast as ninja.

Thanks! :)

BUG: Cannot compile with newest version of XCode Command Line Tools

I am on macOS Big Sur 11.1 and just updated to the newest version of XCode Command Line Tools. After I did (and I assume it is something to do with that, because everything with memchr worked fine before that), I got the following error message when I tried to cargo check anything that had a transitive dependency on memchr.

error: linking with `cc` failed: exit code: 1
  |
  = note: "cc" "-m64" "-arch" "x86_64" "-L" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.0.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.1.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.10.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.11.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.12.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.13.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.14.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.15.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.2.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.3.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.4.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.5.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.6.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.7.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.8.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.9.rcgu.o" "-o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.14i7c91is1si092v.rcgu.o" "-Wl,-dead_strip" "-nodefaultlibs" "-L" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/deps" "-L" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib" 
"/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libstd-cf45c391193686b0.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libpanic_unwind-bfb82cdc97bd35ea.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libobject-0e543fa90fe41090.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libaddr2line-f50981f4143e4c69.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libgimli-bbe9b2276f9fe948.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/librustc_demangle-c04e87d408a5de4c.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libhashbrown-3865f13d7ece40bb.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/librustc_std_workspace_alloc-83f3487f53b2e684.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libunwind-518f93c579715cca.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libcfg_if-ab0ea20e972aeb4f.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/liblibc-50e4694516c58a71.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/liballoc-8171c7b795c55f62.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/librustc_std_workspace_core-8357f853e5f39333.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libcore-80c77ff1434731cf.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libcompiler_builtins-8c8eeab435e54e85.rlib" "-lSystem" "-lresolv" "-lc" "-lm"
  = note: xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun


error: aborting due to previous error

error: could not compile `memchr`

Any advice on how to fix it?

Error building for arm64 on amd64 with docker

👋 Hi!

I'm working on docker-activity and memchr is one of my dependencies.
I'm building docker-activity with docker buildx in order to have a single image for both platforms, and I end up with some weird behavior. I'm not sure whether to open this issue in your repo, in buildx, or even in qemu, but I'll try here.

When I build on a real arm64 machine (RPi4), the image builds perfectly, but when I use docker buildx build --platform linux/arm64 I end up with this issue, apparently due to memchr.

#18 33.61    Compiling futures-core v0.3.18
#18 35.36 error: could not compile `memchr` due to previous error
#18 35.37 warning: build failed, waiting for other jobs to finish...
#18 43.90 error: linking with `cc` failed: exit status: 1
#18 43.90   |
#18 43.90   = note: "cc" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/self-contained/crt1.o" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/self-contained/crti.o" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/self-contained/crtbegin.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.0.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.1.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.10.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.11.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.12.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.13.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.14.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.15.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.2.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.3.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.4.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.5.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.6.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.7.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.8.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.9.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.25mgmn3yz3lxkcgb.rcgu.o" "-Wl,--as-needed" "-L" "/code/target/release/deps" "-L" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib" "-Wl,--start-group" "-Wl,-Bstatic" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libstd-bb69598673ac6378.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libpanic_unwind-347c34ae82bb4da0.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libminiz_oxide-86fc36b502bfb8aa.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libadler-cb14375f652e6e86.rlib" 
"/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libobject-9e87208331b99476.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libmemchr-ebe0ff89d9e37134.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libaddr2line-b0f16d22595fdd3b.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libgimli-57bd3e568b1b69be.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libstd_detect-d2296608bd767c8a.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/librustc_demangle-e4d26fe9e39d3be6.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libhashbrown-8322f07825c42064.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/librustc_std_workspace_alloc-403fa8d4a1124a0d.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libunwind-29e90d90171d4117.rlib" "-lunwind" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libcfg_if-ed66653f82293f20.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/liblibc-ad350ff50825d4f2.rlib" "-lc" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/liballoc-98d6df8d800ab2ff.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/librustc_std_workspace_core-e0db88e40d9c7e0b.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libcore-fcedc0d4b8cb02ca.rlib" "-Wl,--end-group" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libcompiler_builtins-0c2242734ae54219.rlib" "-Wl,-Bdynamic" "-Wl,--eh-frame-hdr" "-Wl,-znoexecstack" "-nostartfiles" "-L" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib" "-L" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/self-contained" "-o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d" "-Wl,--gc-sections" "-static" "-no-pie" "-Wl,-zrelro,-znow" "-nodefaultlibs" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/self-contained/crtend.o" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/self-contained/crtn.o"
#18 43.90   = note:
#18 43.90
#18 46.48 error: build failed
------
Dockerfile:16
--------------------
  14 |     COPY src/exporter /code/src/exporter
  15 |     COPY src/format /code/src/format
  16 | >>> RUN cargo build --release --offline
  17 |
  18 |     FROM alpine
--------------------
error: failed to solve: process "/dev/.buildkit_qemu_emulator /bin/sh -c cargo build --release --offline" did not complete successfully: exit code: 101

Do you have any idea where that could come from? If you want, here is the Dockerfile I use.

Performance on 64-bit arch

I'm writing a parser and looking for 4 needles. I tried adjusting the bithacks to analyze 8 bytes at a time since I'm using a 64-bit arch. I expected memchr to be faster, but I am finding that is not true. Am I doing something wrong? The benchmarks are below, using rustc 1.31.0-nightly (fc403ad98 2018-09-30). I got similar results on both macOS and Linux.

test tests::bench_hasvalue                   ... bench:           2 ns/iter (+/- 0)
test tests::bench_memchr                     ... bench:          13 ns/iter (+/- 1)

The code is the following.

extern crate byteorder;
extern crate memchr;

use self::byteorder::{ByteOrder, NativeEndian};

// Taken from "Determine if a word has a zero byte" at http://graphics.stanford.edu/~seander/bithacks.html
// and adjusted for 64-bits and Rust complaining of overflow.
fn haszero(x: u64) -> bool {
    (x.wrapping_sub(0x0101_0101_0101_0101) & !x & 0x8080_8080_8080_8080) != 0
}

// Taken from "Determine if a word has a byte equal to n" at http://graphics.stanford.edu/~seander/bithacks.html
// and adjusted for 64-bits.
fn hasvalue(haystack: &[u8], needles: &[u8]) -> bool {
    let x = NativeEndian::read_u64(haystack);
    let y = !0 as u64 / 255 as u64;
    for c in needles {
        if haszero(x ^ (y * u64::from(*c))) {
            return true;
        }
    }
    false
}

extern crate test;

#[cfg(test)]
mod tests {
    use self::test::Bencher;
    use super::*;

    #[bench]
    fn bench_hasvalue(b: &mut Bencher) {
        b.iter(|| {
            let haystack = test::black_box(b"01234567");
            hasvalue(haystack, &[b'a', b'b', b'c', b'd'])
        });
    }

    #[bench]
    fn bench_memchr(b: &mut Bencher) {
        b.iter(|| {
            let haystack = test::black_box(b"01234567");
            memchr::memchr(b'a', haystack).is_none()
                && memchr::memchr3(b'b', b'c', b'd', haystack).is_none()
        });
    }
}

Revisit Windows performance

I think it would be great if someone with access to a windows environment could run benchmarks to revisit the built-in vs this crate's fallback implementation of memchr.

port the memchr implementations to "generic SIMD" code

When I wrote the new memmem implementation earlier this year, one thing I did was write the implementation as something that was generic over the vector type:

/// # Safety
///
/// Since this is meant to be used with vector functions, callers need to
/// specialize this inside of a function with a `target_feature` attribute.
/// Therefore, callers must ensure that whatever target feature is being used
/// supports the vector functions that this function is specialized for. (For
/// the specific vector functions used, see the Vector trait implementations.)
#[inline(always)]
pub(crate) unsafe fn fwd_find<V: Vector>(
    fwd: &Forward,
    haystack: &[u8],
    needle: &[u8],
) -> Option<usize> {

where an example of it being called, e.g. for AVX2, is:

genericsimd::Forward::new(ninfo, needle).map(Forward)

So basically, the idea here is, you write the nasty SIMD code once, and then write some trivial shims for each target feature you want to support.

The actual use of SIMD in this crate is reasonably simple, so it turns out that the trait defining the API of a vector is quite small:

pub(crate) trait Vector: Copy + core::fmt::Debug {
    /// _mm_set1_epi8 or _mm256_set1_epi8
    unsafe fn splat(byte: u8) -> Self;
    /// _mm_loadu_si128 or _mm256_loadu_si256
    unsafe fn load_unaligned(data: *const u8) -> Self;
    /// _mm_movemask_epi8 or _mm256_movemask_epi8
    unsafe fn movemask(self) -> u32;
    /// _mm_cmpeq_epi8 or _mm256_cmpeq_epi8
    unsafe fn cmpeq(self, vector2: Self) -> Self;
    /// _mm_and_si128 or _mm256_and_si256
    unsafe fn and(self, vector2: Self) -> Self;
}
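
For illustration, a 128-bit SSE2 shim for that trait might look roughly like this (a hedged sketch, not the crate's actual code; it assumes the Vector trait above is in scope in the parent module):

// SSE2 is always available on x86_64, so these intrinsics can be used
// directly; an AVX2 shim would do the same for __m256i and the _mm256_*
// intrinsics inside #[target_feature(enable = "avx2")] callers, as the
// safety comment above describes.
#[cfg(target_arch = "x86_64")]
mod sse2_shim {
    use core::arch::x86_64::*;

    use super::Vector;

    impl Vector for __m128i {
        #[inline(always)]
        unsafe fn splat(byte: u8) -> Self {
            _mm_set1_epi8(byte as i8)
        }
        #[inline(always)]
        unsafe fn load_unaligned(data: *const u8) -> Self {
            _mm_loadu_si128(data as *const __m128i)
        }
        #[inline(always)]
        unsafe fn movemask(self) -> u32 {
            _mm_movemask_epi8(self) as u32
        }
        #[inline(always)]
        unsafe fn cmpeq(self, vector2: Self) -> Self {
            _mm_cmpeq_epi8(self, vector2)
        }
        #[inline(always)]
        unsafe fn and(self, vector2: Self) -> Self {
            _mm_and_si128(self, vector2)
        }
    }
}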

OK, so what's this issue about? I think ideally, we would push the Vector trait up a level in the module hierarchy, port the existing x86 SIMD memchr implementation to a "generic" version, and then replace the existing implementations with shims that call out to the generic version.

This will hopefully let us easily add a WASM implementation of memchr, but adding other implementations in the future would be good too once more intrinsics (e.g., for ARM) are added to std.

(One wonders whether we should just wait for portable SIMD to land in std, but I don't know when that will happen.)

could we have a UTF-8 memchr?

This could branch out to {memchr, memchr2, memchr3} depending on the first 4 bits of the needle: char argument (i.e., on how many bytes its UTF-8 encoding takes). Alas, we have no memchr4 function.
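
Independently of whether a memchr4 ever exists, one workable shape for this (a hypothetical sketch, not an existing API) is to encode the char to UTF-8, use plain memchr for the one-byte case, and otherwise key on one byte of the encoding and verify the full sequence:

fn memchr_char(needle: char, haystack: &[u8]) -> Option<usize> {
    let mut buf = [0u8; 4];
    let enc = needle.encode_utf8(&mut buf).as_bytes();
    match enc.len() {
        1 => memchr::memchr(enc[0], haystack),
        n => {
            // Scan for the last byte of the encoding, then check whether
            // the preceding bytes complete the needle.
            let mut start = 0;
            while let Some(i) = memchr::memchr(enc[n - 1], &haystack[start..]) {
                let end = start + i + 1;
                if end >= n && &haystack[end - n..end] == enc {
                    return Some(end - n);
                }
                start = end;
            }
            None
        }
    }
}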

memchr 2.5.0 fails to compile on Android

After upgrading to 2.5.0, I'm getting build errors on Android like this:

[CONTEXT] stderr: error[E0531]: cannot find tuple struct or tuple variant `GenericSIMD128` in this scope
[CONTEXT]    --> third-party/rust/vendor/memchr-2.5.0/src/memmem/mod.rs:885:13
[CONTEXT]     |
[CONTEXT] 885 |             GenericSIMD128(gs) => GenericSIMD128(gs),
[CONTEXT]     |             ^^^^^^^^^^^^^^ not found in this scope
[CONTEXT] 

It appears that the GenericSIMD128 enum variant is defined with cfg target_arch = "x86_64" or memchr_runtime_wasm128, but then it is used in the code without a cfg check limiting it to those platforms, causing it to fail to compile.

error[E0428]: the name `imp` is defined multiple times

[192.168.18.146] out: error[E0428]: the name `imp` is defined multiple times
[192.168.18.146] out: --> /home/aram/.cargo/registry/src/github.com-1ecc6299db9ec823/memchr-2.3.1/src/lib.rs:148:5
[192.168.18.146] out: |
[192.168.18.146] out: 139 | fn imp(n1: u8, haystack: &[u8]) -> Option<usize> {
[192.168.18.146] out: | ------------------------------------------------ previous definition of the value `imp` here
[192.168.18.146] out: ...
[192.168.18.146] out: 148 | fn imp(n1: u8, haystack: &[u8]) -> Option<usize> {
[192.168.18.146] out: | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `imp` redefined here
[192.168.18.146] out: |
[192.168.18.146] out: = note: `imp` must be defined only once in the value namespace of this block

Seems #[cfg(all(target_arch = "x86_64", memchr_runtime_simd, not(miri)))] and #[cfg(all(memchr_libc, not(all(target_arch = "x86_64", memchr_runtime_simd, miri))))] are not mutually exclusive.

Duplicate lang items with 2 or more crates that depend on this in Cargo.toml and -Zbuild-std

To reproduce the error,

  1. Create a lib crate
  2. Add 2 or more crates that depend on memchr to the Cargo.toml. For me, it would be nom = "7.0" and object = "0.26"
  3. Execute cargo build -Z build-std=core,compiler_builtins,alloc -Z build-std-features=compiler-builtins-mem --target x86_64-unknown-linux-gnu

And there will be thousands of lines of cargo output about duplicate lang items and the like.

My rust version:
nightly-x86_64-unknown-linux-gnu (default)
rustc 1.56.0-nightly (b03ccace5 2021-08-24)

Compiler error when building on custom x86_64 no_std target

I ran into this error when trying to add the cstr_core crate to my no_std OS project, which depends on memchr and disables its default features.

Building in debug mode:

LLVM ERROR: Do not know how to split this operator's operand!

Building in release mode:

error: Could not compile `memchr`.

Caused by:
  process didn't exit successfully: `rustc --crate-name memchr /home/kevin/.cargo/registry/src/github.com-1ecc6299db9ec823/memchr-2.2.1/src/lib.rs --color always --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 -C metadata=ebf07e953eacc227 -C extra-filename=-ebf07e953eacc227 --out-dir /home/kevin/my_os/target/x86_64-my_os/release/deps --target x86_64-my_os -L dependency=/home/kevin/my_os/target/x86_64-my_os/release/deps -L dependency=/home/kevin/my_os/target/release/deps --cap-lints allow --emit=obj -C debuginfo=2 -C code-model=large -C relocation-model=static -D unused-must-use -Z merge-functions=disabled -Z share-generics=no --sysroot /home/kevin/.xargo --cfg memchr_runtime_simd --cfg memchr_runtime_sse2 --cfg memchr_runtime_sse42 --cfg memchr_runtime_avx` (signal: 11, SIGSEGV: invalid memory reference)

For reference and in case it matters, my compiler target .json file is:

{
  "llvm-target": "x86_64-unknown-none-gnu",
  "data-layout": "e-m:e-i64:64-f80:128-n8:16:32:64-S128",
  "linker-flavor": "gcc",
  "target-endian": "little",
  "target-pointer-width": "64",
  "target-c-int-width": "32",
  "arch": "x86_64",
  "os": "none",
  "features": "-mmx,-sse,+soft-float",
  "disable-redzone": true,
  "panic": "abort"
}

My rustc version is rustc 1.38.0-nightly (78ca1bda3 2019-07-08).

I've never encountered an error like this before, so I'm not sure what else to say. If more information is needed, I am happy to provide it.

The next function in memchr*_iter doesn't seem to compute efficiently.

I am trying to make an efficient csv parser using memchr.

I want to find all the commas when there is a line like the one below.
a1,b1,c1,d2
In this case, I want to use the code below.

let sep_iter = memchr3_iter(col_sep, row_sep, b'\'', &buffer[..]);
...
loop {
    let next_sep_pos_wrap = self.sep_iter.next();
    .....
}

When sep_iter.next() is called again, it seems like the result that was already computed could be reused, but the implementation appears to recompute from the current position.
When sep_iter.next() runs for the first time, the remaining match positions already seem to be stored in the other bits of the computed mask, but it appears that only trailing_zeros is used.
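
A minimal sketch of the caching being asked about (hypothetical, not the crate's internals): keep the movemask from one SIMD block and pop successive matches out of it with trailing_zeros, only rescanning when the mask runs out.

struct BlockMatches {
    base: usize, // offset of the current 16-byte block in the haystack
    mask: u32,   // one bit per matching byte within that block
}

impl Iterator for BlockMatches {
    type Item = usize;

    fn next(&mut self) -> Option<usize> {
        if self.mask == 0 {
            // A real iterator would load and scan the next block here.
            return None;
        }
        let bit = self.mask.trailing_zeros() as usize;
        self.mask &= self.mask - 1; // clear the lowest set bit
        Some(self.base + bit)
    }
}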

Changelog

It would be nice if there were somewhere that listed changes between different releases. Currently it seems the only way to do that is by comparing tags, but that's not such a great experience. I know it's a lot of work to make a log for 20 releases, but I think it's worth having.

Failure to build on Apple Silicon M1

Using my site code I attempted to build memchr using cargo build. The build.rs file fails to run with the following error:

$ cargo build
   Compiling proc-macro2 v1.0.24
   Compiling libc v0.2.73
   Compiling syn v1.0.48
   Compiling memchr v2.3.3
   Compiling log v0.4.11
   Compiling bitflags v1.2.1
   Compiling ryu v1.0.5
   Compiling serde_derive v1.0.117
error: failed to run custom build command for `memchr v2.3.3`

Caused by:
  process didn't exit successfully: `/Users/cadey/Code/site/target/debug/build/memchr-08053740b12295b3/build-script-build` (signal: 9, SIGKILL: kill)
warning: build failed, waiting for other jobs to finish...
error: build failed

I assume this may be a rustc bug as the error in question is identical to the error you get when you attempt to run an unsigned binary on an M1 Mac. I attempted to sign the binary manually with the codesign tool and it failed with this error:

$ codesign -s - /Users/cadey/Code/site/target/debug/build/memchr-08053740b12295b3/build-script-build
/Users/cadey/Code/site/target/debug/build/memchr-08053740b12295b3/build-script-build: replacing existing signature
/Users/cadey/Code/site/target/debug/build/memchr-08053740b12295b3/build-script-build: the codesign_allocate helper tool cannot be found or used

Should I file this as a rustc bug?

Undefined symbols ld error when enabling `use_std`

Not sure if this is an issue with memchr, nom, elastic-rs/elastic (where I am using nom) or even std/core/compiler but I thought I would start here...

I am getting undefined symbols ld errors for various symbols in core when I enable the memchr/use_std feature of nom; see elastic-rs/elastic/pull/389 for a little more background + error logs, as well as the two linked Travis builds.

I have reproduced it on macOS 10.14 & 10.15 and Ubuntu 19.04 (and for the sake of completeness; various Linux via Docker) with rustc 1.38.0 (625451e37 2019-09-23), 1.39.0-beta.6 (224f0bc90 2019-10-15), 1.40.0-nightly (4a8c5b20c 2019-10-23)—and a few other nightlies—and when cross compiling to x86_64-unknown-linux-musl from macOS and Linux hosts.

If you think this doesn't belong here, please let me know where you think I should file this. I can also upload the current Cargo.lock if that would help.

PS thanks for all your awesome work—I have been using rg practically daily for years and love it.

Purpose of linking against 'std'

What's the purpose of linking against the standard library? Everything that this crate needs from std is provided in core, which in turn is re-exported by std.

The libc crate can also be used as a dependency even in a #![no_std] environment; however, this crate only enables it when use_std is enabled.

For crate authors who want to utilize memchr and provide #![no_std], they're left with the fallback implementation. This is slower on macOS than using libc::memchr.

Edit: I didn't try testing with enabling the libc feature by itself. Apologies 😅

`memrchr` implementations may conflict with stacked borrows

The reverse search implementations (memrchr) seem illegal under stacked borrows. They all follow the same pattern, so here I'll only annotate one. It retrieves a raw pointer to the end of the haystack from a reference to an empty slice, but then uses that pointer to iterate backwards by offsetting it with negative indices. Under strict rules, that pointer would however only be valid for access to the bytes that the reference covered from which it was cast, i.e. a zero-length slice at the end.

To my understanding, this is very likely illegal but not yet caught by Miri, since it does not strictly track the source of raw pointers. @RalfJung might be able to provide more definitive insights.

Relevant code (inserted comments marked as // !):

pub fn memrchr(n1: u8, haystack: &[u8]) -> Option<usize> {
    // [...]
    let start_ptr = haystack.as_ptr();
    // ! This pointer only covers the same slice that the reference does.
    // ! Would need to create these manually from offsetting the start pointer
    // ! which covers the whole array.
    let end_ptr = haystack[haystack.len()..].as_ptr();
    let mut ptr = end_ptr;

    unsafe {
        // [...]
        ptr = (end_ptr as usize & !align) as *const u8;
        // [...]
        while loop_size == LOOP_SIZE && ptr >= ptr_add(start_ptr, loop_size) {
            // [...]
            // ! These are outside the bounds of the reference from which ptr was created.
            let a = *(ptr_sub(ptr, 2 * USIZE_BYTES) as *const usize);
            let b = *(ptr_sub(ptr, 1 * USIZE_BYTES) as *const usize);
            // [...]
            ptr = ptr_sub(ptr, loop_size);
        }
        // [...]
    }
}

Library code reduced to that version of memrchr.

The fix is simple: create the pointer by manually offsetting haystack.as_ptr(), which is valid for the whole haystack. I also don't expect any miscompilation.
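
A sketch of that fix, assuming the surrounding code shown above:

// Derive the end pointer by offsetting the start pointer, which is valid
// for the whole haystack, instead of casting a zero-length tail slice.
let start_ptr = haystack.as_ptr();
let end_ptr = unsafe { start_ptr.add(haystack.len()) };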

Feature request: no-std + alloc

I was looking through the code of this crate. I have a need for something like this on a no-std + alloc target, but it seems several features (such as using Cow from alloc) are missing. That should be possible to support.

memchr2_iter and memchr3_iter do not properly advance slice position

Hi @BurntSushi I'm the author of Artichoke Ruby. We met on Twitter.

Playground: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=7f18fe26ae4413e95f1e350d77f86b28

The iter_next! macro hard codes how much to advance the haystack position by.

https://github.com/BurntSushi/rust-memchr/blob/1ec5ecce03c220c762dd9a8b08f7a3d95522b765/src/iter.rs#L17

This means that the memchr*_iter functions scan incorrectly when searching for more than one byte. For example, this code outputs 2 when it should output 1:

extern crate memchr; // 2.2.1

fn main() {
    let haystack = b"abcdefghijklmnopqrstuvwxyz";
    println!("{}", memchr::memchr2_iter(b'a', b'b', haystack.as_ref()).count());
}

Provide const implementation

As const evaluation is slowly becoming powerful enough for wide use, memchr not providing a const version is becoming an issue.

For example, in Amanieu/cstr_core#25 (an embedded version of cstr), it would be convenient to construct a &str out of const array data taken from C in mixed C-and-Rust environments. The workaround there is to have its own simple-stupid-but-const memchr implementation and dispatch through const_eval_select between that and an actually runtime-friendly memchr from this crate.

In the long run, it would be great if this crate would just provide its memchr as const -- obviously that's not gonna fly any time soon (especially considering the MSRV), but there could be steps:

  • This crate could provide a memchr_const function (feel free to take the one from there if you like, it's under a different license but I wrote it and hereby also license it under this crate's license).
  • This crate could provide an auto-dispatching memchr under a nightly-only feature gate.

The full solution (memchr "just" being const) is likely to be tricky, because while a "regular" memchr can be const on stable in the foreseeable future, I don't expect that to happen for vectorized or even libc-calling versions. Thus, once any of those optimizations are on, dispatch between the runtime-optimized and the const version will still need to happen, and the only way to do that currently, AFAIK, is through const_eval_select, for which there are no stabilization plans. This issue could be a use case to start stabilizing const_eval_select, or to explore alternative avenues.
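
For reference, the "simple-stupid-but-const" flavor described above can be written on reasonably recent stable Rust along these lines (a sketch, not this crate's code):

const fn memchr_const(needle: u8, haystack: &[u8]) -> Option<usize> {
    // A plain linear scan: slow compared to the vectorized paths, but
    // usable in const contexts.
    let mut i = 0;
    while i < haystack.len() {
        if haystack[i] == needle {
            return Some(i);
        }
        i += 1;
    }
    None
}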

Use of `AtomicPtr` in `unsafe_ifunc` prevents memchr from being inlined when compiled with avx enabled

is_x86_feature_detected! resolves to just a return true when -C target-cpu or -C target-feature is set to a value that enables the feature. When using just a simple is_x86_feature_detected! and -C target-cpu=native (or whatever), the compiler can inline the function and completely avoid the machinery of the atomic operations and calling a function pointer. However, when using AtomicPtr, it is impossible for the compiler to inline the function at all.

It would be great if there was some way to automatically disable the runtime feature detection if avx (or whatever the corresponding CPU feature set is) is already enabled at compile-time.
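
One shape this could take (a hedged sketch; memchr_avx2 and memchr_ifunc are hypothetical stand-ins, not real memchr internals) is to bypass the function-pointer machinery entirely whenever the feature is statically enabled:

// Hypothetical stand-ins for the real routines (not memchr's actual API).
unsafe fn memchr_avx2(n1: u8, haystack: &[u8]) -> Option<usize> {
    // ... the AVX2-specialized search would live here ...
    haystack.iter().position(|&b| b == n1)
}

fn memchr_ifunc(n1: u8, haystack: &[u8]) -> Option<usize> {
    // ... the AtomicPtr-based runtime-dispatching path would live here ...
    haystack.iter().position(|&b| b == n1)
}

// When AVX2 is guaranteed at compile time (-C target-feature=+avx2 or a
// suitable -C target-cpu), call the specialized routine directly so it can
// be inlined; no atomic load, no function pointer.
#[cfg(target_feature = "avx2")]
#[inline(always)]
pub fn memchr_dispatch(n1: u8, haystack: &[u8]) -> Option<usize> {
    // Safety: the avx2 target feature is statically enabled by the cfg.
    unsafe { memchr_avx2(n1, haystack) }
}

// Otherwise keep the runtime-detecting path.
#[cfg(not(target_feature = "avx2"))]
pub fn memchr_dispatch(n1: u8, haystack: &[u8]) -> Option<usize> {
    memchr_ifunc(n1, haystack)
}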

Tests fail with --no-default-features [published memchr 2.3.2]

Just hit this doing various Linux vendor-side testing. It's probably not important, but it would be nice to be able to ensure this crate works in this configuration.

cargo +stable test --no-default-features
error[E0433]: failed to resolve: maybe a missing crate `std`?
 --> src/tests/mod.rs:1:5
  |
1 | use std::iter::repeat;
  |     ^^^ maybe a missing crate `std`?

error: cannot find macro `eprintln` in this scope
 --> src/tests/mod.rs:9:5
  |
9 |     eprintln!("LITTLE ENDIAN");
  |     ^^^^^^^^

error[E0433]: failed to resolve: use of undeclared type or module `Vec`
   --> src/tests/iter.rs:167:27
    |
167 |     let mut found_front = Vec::new();
    |                           ^^^ use of undeclared type or module `Vec`

error[E0433]: failed to resolve: use of undeclared type or module `Vec`
   --> src/tests/iter.rs:168:26
    |
168 |     let mut found_back = Vec::new();
    |                          ^^^ use of undeclared type or module `Vec`

error[E0433]: failed to resolve: use of undeclared type or module `Box`
   --> src/tests/iter.rs:201:5
    |
201 |     Box::new(it)
    |     ^^^ use of undeclared type or module `Box`

error[E0433]: failed to resolve: use of undeclared type or module `Box`
   --> src/tests/iter.rs:214:5
    |
214 |     Box::new(it)
    |     ^^^ use of undeclared type or module `Box`

error[E0433]: failed to resolve: use of undeclared type or module `Box`
   --> src/tests/iter.rs:228:5
    |
228 |     Box::new(it)
    |     ^^^ use of undeclared type or module `Box`

error[E0433]: failed to resolve: use of undeclared type or module `Vec`
  --> src/tests/mod.rs:20:21
   |
20 |     let mut tests = Vec::new();
   |                     ^^^ use of undeclared type or module `Vec`

error[E0433]: failed to resolve: use of undeclared type or module `Vec`
   --> src/tests/mod.rs:295:24
    |
295 |         let mut more = Vec::new();
    |                        ^^^ use of undeclared type or module `Vec`

error[E0412]: cannot find type `Vec` in this scope
  --> src/tests/iter.rs:52:27
   |
52 |         needle: u8, data: Vec<u8>, take_side: Vec<bool>
   |                           ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
  --> src/tests/iter.rs:52:47
   |
52 |         needle: u8, data: Vec<u8>, take_side: Vec<bool>
   |                                               ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
  --> src/tests/iter.rs:66:41
   |
66 |         needle1: u8, needle2: u8, data: Vec<u8>, take_side: Vec<bool>
   |                                         ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
  --> src/tests/iter.rs:66:61
   |
66 |         needle1: u8, needle2: u8, data: Vec<u8>, take_side: Vec<bool>
   |                                                             ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
  --> src/tests/iter.rs:81:15
   |
81 |         data: Vec<u8>, take_side: Vec<bool>
   |               ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
  --> src/tests/iter.rs:81:35
   |
81 |         data: Vec<u8>, take_side: Vec<bool>
   |                                   ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
  --> src/tests/iter.rs:97:30
   |
97 |     fn qc_memchr1_iter(data: Vec<u8>) -> bool {
   |                              ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/iter.rs:103:34
    |
103 |     fn qc_memchr1_rev_iter(data: Vec<u8>) -> bool {
    |                                  ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/iter.rs:109:30
    |
109 |     fn qc_memchr2_iter(data: Vec<u8>) -> bool {
    |                              ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/iter.rs:116:34
    |
116 |     fn qc_memchr2_rev_iter(data: Vec<u8>) -> bool {
    |                                  ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/iter.rs:123:30
    |
123 |     fn qc_memchr3_iter(data: Vec<u8>) -> bool {
    |                              ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/iter.rs:131:34
    |
131 |     fn qc_memchr3_rev_iter(data: Vec<u8>) -> bool {
    |                                  ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/iter.rs:139:40
    |
139 |     fn qc_memchr1_iter_size_hint(data: Vec<u8>) -> bool {
    |                                        ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/iter.rs:162:58
    |
162 | fn double_ended_take<I, J>(mut iter: I, take_side: J) -> Vec<I::Item>
    |                                                          ^^^ not found in this scope

error[E0412]: cannot find type `Box` in this scope
   --> src/tests/iter.rs:195:6
    |
195 | ) -> Box<dyn DoubleEndedIterator<Item = usize> + 'a> {
    |      ^^^ not found in this scope

error[E0412]: cannot find type `Box` in this scope
   --> src/tests/iter.rs:208:6
    |
208 | ) -> Box<dyn DoubleEndedIterator<Item = usize> + 'a> {
    |      ^^^ not found in this scope

error[E0412]: cannot find type `Box` in this scope
   --> src/tests/iter.rs:222:6
    |
222 | ) -> Box<dyn DoubleEndedIterator<Item = usize> + 'a> {
    |      ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
  --> src/tests/memchr.rs:92:49
   |
92 |     fn qc_memchr1_matches_naive(n1: u8, corpus: Vec<u8>) -> bool {
   |                                                 ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
  --> src/tests/memchr.rs:98:57
   |
98 |     fn qc_memchr2_matches_naive(n1: u8, n2: u8, corpus: Vec<u8>) -> bool {
   |                                                         ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/memchr.rs:106:17
    |
106 |         corpus: Vec<u8>
    |                 ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/memchr.rs:113:50
    |
113 |     fn qc_memrchr1_matches_naive(n1: u8, corpus: Vec<u8>) -> bool {
    |                                                  ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/memchr.rs:119:58
    |
119 |     fn qc_memrchr2_matches_naive(n1: u8, n2: u8, corpus: Vec<u8>) -> bool {
    |                                                          ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/memchr.rs:127:17
    |
127 |         corpus: Vec<u8>
    |                 ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
  --> src/tests/mod.rs:19:22
   |
19 | fn memchr_tests() -> Vec<MemchrTest> {
   |                      ^^^ not found in this scope

error[E0412]: cannot find type `String` in this scope
   --> src/tests/mod.rs:144:13
    |
144 |     corpus: String,
    |             ^^^^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/mod.rs:152:14
    |
152 |     needles: Vec<u8>,
    |              ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/mod.rs:154:16
    |
154 |     positions: Vec<usize>,
    |                ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/mod.rs:276:26
    |
276 |             it.collect::<Vec<usize>>(),
    |                          ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/mod.rs:278:63
    |
278 |             self.needles.iter().map(|&b| b as char).collect::<Vec<char>>(),
    |                                                               ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/mod.rs:294:25
    |
294 |     fn expand(&self) -> Vec<MemchrTest> {
    |                         ^^^ not found in this scope

error[E0412]: cannot find type `String` in this scope
   --> src/tests/mod.rs:300:33
    |
300 |             let mut new_corpus: String = repeat('%').take(i).collect();
    |                                 ^^^^^^ not found in this scope

error[E0425]: cannot find function `repeat` in this scope
   --> src/tests/mod.rs:300:42
    |
300 |             let mut new_corpus: String = repeat('%').take(i).collect();
    |                                          ^^^^^^ not found in this scope
    |
help: possible candidate is found in another module, you can import it into scope
    |
1   | use core::iter::repeat;
    |

error[E0412]: cannot find type `String` in this scope
   --> src/tests/mod.rs:309:26
    |
309 |             let padding: String = repeat('%').take(i).collect();
    |                          ^^^^^^ not found in this scope

error[E0425]: cannot find function `repeat` in this scope
   --> src/tests/mod.rs:309:35
    |
309 |             let padding: String = repeat('%').take(i).collect();
    |                                   ^^^^^^ not found in this scope
    |
help: possible candidate is found in another module, you can import it into scope
    |
1   | use core::iter::repeat;
    |

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/mod.rs:330:47
    |
330 |     fn needles(&self, count: usize) -> Option<Vec<u8>> {
    |                                               ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/mod.rs:348:57
    |
348 |     fn positions(&self, align: usize, reverse: bool) -> Vec<usize> {
    |                                                         ^^^ not found in this scope

error[E0599]: no method named `to_string` found for type `&'static str` in the current scope
  --> src/tests/mod.rs:28:36
   |
28 |             corpus: statict.corpus.to_string(),
   |                                    ^^^^^^^^^ method not found in `&'static str`
   |
   = help: items from traits can only be used if the trait is in scope
   = note: the following trait is implemented but not in scope; perhaps add a `use` for it:
           `use alloc::string::ToString;`

error: aborting due to 46 previous errors

Some errors have detailed explanations: E0412, E0425, E0433, E0599.
For more information about an error, try `rustc --explain E0412`.
error: could not compile `memchr`.

To learn more, run the command again with --verbose.
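
A presumed fix, inferred purely from the error messages above (an assumption, not a confirmed patch): in a no_std configuration the test modules would need explicit imports for the liballoc items the std prelude normally provides, with extern crate alloc; declared at the crate root. Roughly:

extern crate alloc;

// Pull in the alloc/core items the tests rely on.
use alloc::boxed::Box;
use alloc::string::{String, ToString};
use alloc::vec::Vec;
use core::iter::repeat;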

Use rustc_layout_scalar_valid_range_end(usize::MAX - 1) for the index

From rust-lang/rust#73139 (comment):

The index is always less than the length. So even if the length is usize::MAX, the index will be at most MAX - 1 and so cannot overflow.

It would be great to tell rustc this is the case by using rustc_layout_scalar_valid_range_end(usize::MAX - 1). That would allow storing Option<index> in usize instead of needing an extra bit, which in some cases could double the size of the struct due to alignment requirements.

Failing that (since rustc_layout_scalar_valid_range_end is unstable and likely will never be stabilized), would it be possible to document that the index is always less than usize::MAX?
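
A minimal sketch of a stable workaround (an assumption for illustration, not part of memchr's API): the same niche can be had today by storing the index XOR'd with usize::MAX inside a NonZeroUsize, so that usize::MAX itself becomes the forbidden value and Option of the wrapper stays pointer-sized.

use core::num::NonZeroUsize;

/// Hypothetical wrapper: holds any usize except usize::MAX.
#[derive(Clone, Copy, Debug)]
struct NonMaxUsize(NonZeroUsize);

impl NonMaxUsize {
    fn new(i: usize) -> Option<Self> {
        // i == usize::MAX maps to 0, which NonZeroUsize rejects.
        NonZeroUsize::new(i ^ usize::MAX).map(NonMaxUsize)
    }

    fn get(self) -> usize {
        self.0.get() ^ usize::MAX
    }
}

fn main() {
    // The niche makes Option<NonMaxUsize> the same size as usize.
    assert_eq!(
        core::mem::size_of::<Option<NonMaxUsize>>(),
        core::mem::size_of::<usize>(),
    );
    let idx = NonMaxUsize::new(42).unwrap();
    assert_eq!(idx.get(), 42);
}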

Support memchr4, memchr5, etc.

I'm writing a routine that escapes HTML special characters. To do that, I have to search for five different characters (&, <, >, ', ") simultaneously. I can do this using two calls to memchr2 or memchr3, but that doesn't seem elegant. It would be nice if there was a function that could do this search in one go.

From the implementation side: the PCMPESTRI instruction in SSE4.2 supports searching for up to 16 different needles in parallel. It would be nice if we could expose this somehow.
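
In the meantime, a workaround sketch (an assumption built only on the crate's existing memchr2/memchr3, not a proposed API): combine two calls and take the earliest hit.

/// Hypothetical helper: first position of any of five needles.
fn memchr5(
    n1: u8, n2: u8, n3: u8, n4: u8, n5: u8,
    haystack: &[u8],
) -> Option<usize> {
    match (
        memchr::memchr3(n1, n2, n3, haystack),
        memchr::memchr2(n4, n5, haystack),
    ) {
        // Both halves matched: the earlier index wins.
        (Some(a), Some(b)) => Some(a.min(b)),
        // At most one half matched: take whichever is Some.
        (a, b) => a.or(b),
    }
}

fn main() {
    // HTML-escaping use case: find the first of `&`, `<`, `>`, `'`, `"`.
    let haystack = b"hello <world> & 'friends'";
    assert_eq!(memchr5(b'&', b'<', b'>', b'\'', b'"', haystack), Some(6));
}

This still scans the haystack up to twice, which is exactly the inelegance the request is about; a native memchr5 could do it in a single pass.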

[Windows 10] Failed to run custom build command for 'memchr v2.4.1'

Hi, whenever I try to install distant or build it from source, the process fails while compiling memchr.

$ cargo build --release
  Downloaded anyhow v1.0.44
  Downloaded polling v2.1.0
  Downloaded tokio-util v0.6.8
  Downloaded instant v0.1.11
  Downloaded libc v0.2.103
  Downloaded openssl-src v111.16.0+1.1.1l
  Downloaded blocking v1.0.2
  Downloaded proc-macro2 v1.0.29
  Downloaded async-process v1.2.0
  Downloaded serde_json v1.0.68
  Downloaded mio v0.7.13
  Downloaded openssl-sys v0.9.67
  Downloaded structopt v0.3.23
  Downloaded tokio-macros v1.4.1
  Downloaded thiserror v1.0.29
  Downloaded whoami v1.1.5
  Downloaded slab v0.4.4
  Downloaded pkg-config v0.3.20
  Downloaded thiserror-impl v1.0.29
  Downloaded cc v1.0.71
  Downloaded tokio v1.12.0
  Downloaded syn v1.0.80
  Downloaded half v1.7.1
  Downloaded structopt-derive v0.4.16
  Downloaded zeroize v1.4.2
  Downloaded 25 crates (7.1 MB) in 7.25s (largest was `openssl-src` at 5.1 MB)
   Compiling winapi v0.3.9
   Compiling proc-macro2 v1.0.29
   Compiling autocfg v1.0.1
   Compiling unicode-xid v0.2.2
   Compiling syn v1.0.80
   Compiling cfg-if v1.0.0
   Compiling memchr v2.4.1
   Compiling libc v0.2.103
   Compiling futures-core v0.3.17
   Compiling cc v1.0.71
   Compiling version_check v0.9.3
   Compiling pin-project-lite v0.2.7
   Compiling log v0.4.14
   Compiling futures-io v0.3.17
   Compiling once_cell v1.8.0
   Compiling vcpkg v0.2.15
   Compiling pkg-config v0.3.20
   Compiling cache-padded v1.1.1
   Compiling typenum v1.14.0
   Compiling parking_lot_core v0.8.5
   Compiling parking v2.0.0
   Compiling fastrand v1.5.0
   Compiling slab v0.4.4
   Compiling waker-fn v1.1.0
   Compiling event-listener v2.5.1
   Compiling lazy_static v1.4.0
   Compiling scopeguard v1.1.0
   Compiling smallvec v1.7.0
   Compiling async-task v4.0.3
   Compiling proc-macro-hack v0.5.19
   Compiling ntapi v0.3.6
   Compiling bitflags v1.3.2
   Compiling unicode-segmentation v1.8.0
   Compiling futures-sink v0.3.17
   Compiling atomic-waker v1.0.0
   Compiling proc-macro-nested v0.1.7
   Compiling serde_derive v1.0.130
   Compiling futures-channel v0.3.17
   Compiling futures-task v0.3.17
   Compiling serde v1.0.130
   Compiling anyhow v1.0.44
   Compiling bytes v1.1.0
   Compiling unicode-width v0.1.9
   Compiling regex-syntax v0.6.25
   Compiling pin-utils v0.1.0
   Compiling ryu v1.0.5
   Compiling cpufeatures v0.2.1
   Compiling subtle v2.4.1
   Compiling serde_json v1.0.68
   Compiling zeroize v1.4.2
   Compiling camino v1.0.5
   Compiling ppv-lite86 v0.2.10
   Compiling strsim v0.8.0
   Compiling opaque-debug v0.3.0
   Compiling regex-automata v0.1.10
   Compiling shell-words v1.0.0
   Compiling convert_case v0.4.0
   Compiling itoa v0.4.8
   Compiling half v1.7.1
   Compiling base64 v0.13.0
   Compiling whoami v1.1.5
   Compiling hex v0.4.3
   Compiling yansi v0.5.0
   Compiling glob v0.3.0
   Compiling instant v0.1.11
   Compiling getrandom v0.2.3
   Compiling futures-macro v0.3.17
error: failed to run custom build command for `memchr v2.4.1`

Caused by:
  could not execute process `C:\Users\modzmi01\Documents\projects\2021\distant\target\release\build\memchr-34b704a4017ecdea\build-script-build` (never executed)

Caused by:
  Access is denied. (os error 5)
warning: build failed, waiting for other jobs to finish...
error: build failed
$ cargo --version
cargo 1.56.0 (4ed5d137b 2021-10-04)


$ rustc --version
rustc 1.56.1 (59eed8a2a 2021-11-01)

I have tried running the command in PowerShell with admin rights, but I get the same error. I don't know any Rust; is there something I can do to get more details about which command from build.rs is causing the issue?
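
A hedged suggestion, not a confirmed fix: running cargo in very verbose mode prints each build-script invocation together with its output, which usually narrows down which step Windows is refusing to run.

$ cargo build --release -vv

The "Access is denied. (os error 5)" combined with "never executed" means Windows blocked the freshly compiled build-script-build executable before it ran at all, which is commonly caused by antivirus software quarantining or locking it rather than by anything inside memchr's build.rs.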

kaspersky antivirus check

At build time, the antivirus reports that it has found a virus:
VHO:Trojan-Banker.Win32.ClipBanker.gen
