
memchr's Introduction

memchr

This library provides heavily optimized routines for string search primitives.


Dual-licensed under MIT or the UNLICENSE.

Documentation

https://docs.rs/memchr

Overview

  • The top-level module provides routines for searching for 1, 2 or 3 bytes in the forward or reverse direction. When searching for more than one byte, positions are considered a match if the byte at that position matches any of the bytes.
  • The memmem sub-module provides forward and reverse substring search routines.

In all such cases, routines operate on &[u8] without regard to encoding. This is exactly what you want when searching either UTF-8 or arbitrary bytes.
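
For example, a quick demonstration of both the top-level byte search and the memmem sub-module (positions shown are for this particular haystack):

use memchr::{memchr, memmem};

fn main() {
    let haystack = b"the quick brown fox";
    // Single byte search in the forward direction.
    assert_eq!(memchr(b'q', haystack), Some(4));
    // Substring search via the memmem sub-module.
    assert_eq!(memmem::find(haystack, b"brown"), Some(10));
}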

Compiling without the standard library

memchr links to the standard library by default, but you can disable the std feature if you want to use it in a #![no_std] crate:

[dependencies]
memchr = { version = "2", default-features = false }

On x86_64 platforms, when the std feature is disabled, the SSE2 accelerated implementations will be used. When std is enabled, AVX2 accelerated implementations will be used if the CPU is determined to support it at runtime.

SIMD accelerated routines are also available on the wasm32 and aarch64 targets. The std feature is not required to use them.

When a SIMD version is not available, then this crate falls back to SWAR techniques.

Minimum Rust version policy

This crate's minimum supported rustc version is 1.61.0.

The current policy is that the minimum Rust version required to use this crate can be increased in minor version updates. For example, if crate 1.0 requires Rust 1.20.0, then crate 1.0.z for all values of z will also require Rust 1.20.0 or newer. However, crate 1.y for y > 0 may require a newer minimum version of Rust.

In general, this crate will be conservative with respect to the minimum supported version of Rust.

Testing strategy

Given the complexity of the code in this crate, along with the pervasive use of unsafe, this crate has an extensive testing strategy. It combines multiple approaches:

  • Hand-written tests.
  • Exhaustive-style testing meant to exercise all possible branching and offset calculations.
  • Property based testing through quickcheck.
  • Fuzz testing through cargo fuzz.
  • A huge suite of benchmarks that are also run as tests. Benchmarks always confirm that the expected result occurs.

Improvements to the testing infrastructure are very welcome.
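
As an illustration of the property-based approach mentioned above, a minimal quickcheck-style test might compare memchr against a naive scan. This is a sketch assuming quickcheck as a dev-dependency, not the crate's actual test code:

use quickcheck::quickcheck;

// Naive reference implementation to compare against.
fn naive_memchr(needle: u8, haystack: &[u8]) -> Option<usize> {
    haystack.iter().position(|&b| b == needle)
}

quickcheck! {
    // The optimized search must agree with the naive scan on arbitrary inputs.
    fn prop_matches_naive(needle: u8, haystack: Vec<u8>) -> bool {
        memchr::memchr(needle, &haystack) == naive_memchr(needle, &haystack)
    }
}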

Algorithms used

At time of writing, this crate's implementation of substring search actually has a few different algorithms to choose from depending on the situation.

  • For very small haystacks, Rabin-Karp is used to reduce latency. Rabin-Karp has very small overhead and can often complete before other searchers have even been constructed. (A minimal sketch follows this list.)
  • For small needles, a variant of the "Generic SIMD" algorithm is used. Instead of using the first and last bytes, a heuristic is used to select bytes based on a background distribution of byte frequencies.
  • In all other cases, Two-Way is used. If possible, a prefilter based on the "Generic SIMD" algorithm described above is used to find candidates quickly. A dynamic heuristic is used to detect when the prefilter is ineffective, and if so, it is disabled.
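
For reference, the Rabin-Karp step in the first bullet boils down to a rolling hash used as a cheap filter, with every candidate verified by a direct comparison. A minimal, unoptimized sketch, illustrative only and not this crate's tuned implementation:

fn rabin_karp(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    if needle.is_empty() {
        return Some(0);
    }
    if needle.len() > haystack.len() {
        return None;
    }
    const B: u32 = 256;
    // B^(needle.len() - 1), wrapping, used to remove the outgoing byte.
    let mut pow: u32 = 1;
    for _ in 1..needle.len() {
        pow = pow.wrapping_mul(B);
    }
    let hash = |s: &[u8]| s.iter().fold(0u32, |h, &b| h.wrapping_mul(B).wrapping_add(b as u32));
    let needle_hash = hash(needle);
    let mut window_hash = hash(&haystack[..needle.len()]);
    let mut i = 0;
    loop {
        // Hash equality is only a hint; always verify the candidate.
        if window_hash == needle_hash && &haystack[i..i + needle.len()] == needle {
            return Some(i);
        }
        if i + needle.len() >= haystack.len() {
            return None;
        }
        // Roll the hash: drop haystack[i], add haystack[i + needle.len()].
        window_hash = window_hash
            .wrapping_sub((haystack[i] as u32).wrapping_mul(pow))
            .wrapping_mul(B)
            .wrapping_add(haystack[i + needle.len()] as u32);
        i += 1;
    }
}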

Why is the standard library's substring search so much slower?

We'll start by establishing what the difference in performance actually is. There are two relevant benchmark classes to consider: prebuilt and oneshot. The prebuilt benchmarks are designed to measure, to the extent possible, search time only. That is, the benchmark first builds a searcher and then tracks only the time spent using it:

$ rebar rank benchmarks/record/x86_64/2023-08-26.csv --intersection -e memchr/memmem/prebuilt -e std/memmem/prebuilt
Engine                       Version                   Geometric mean of speed ratios  Benchmark count
------                       -------                   ------------------------------  ---------------
rust/memchr/memmem/prebuilt  2.5.0                     1.03                            53
rust/std/memmem/prebuilt     1.73.0-nightly 180dffba1  6.50                            53

Conversely, the oneshot benchmark class measures the time it takes to both build the searcher and use it:

$ rebar rank benchmarks/record/x86_64/2023-08-26.csv --intersection -e memchr/memmem/oneshot -e std/memmem/oneshot
Engine                      Version                   Geometric mean of speed ratios  Benchmark count
------                      -------                   ------------------------------  ---------------
rust/memchr/memmem/oneshot  2.5.0                     1.04                            53
rust/std/memmem/oneshot     1.73.0-nightly 180dffba1  5.26                            53

NOTE: Replace rebar rank with rebar cmp in the above commands to explore the specific benchmarks and their differences.

So in both cases, this crate is quite a bit faster over a broad sampling of benchmarks regardless of whether you measure only search time or search time plus construction time. The difference is a little smaller when you include construction time in your measurements.

These two different types of benchmark classes make for a nice segue into one reason why the standard library's substring search can be slower: API design. In the standard library, the only APIs available require you to re-construct the searcher for every search. While you can benefit from building a searcher once and iterating over all matches in a single string, you cannot reuse that searcher to search other strings. This comes up when, for example, searching a file one line at a time: you need to re-build the searcher for every line searched, and this can really matter.
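
To make the contrast concrete, here is a minimal sketch of amortizing construction with this crate's memmem::Finder (the function and its names are illustrative, not an API of the crate):

use memchr::memmem;

fn count_lines_containing(needle: &str, lines: &[&str]) -> usize {
    // Build the searcher once...
    let finder = memmem::Finder::new(needle);
    // ...then reuse it for every line, paying construction cost only once.
    lines
        .iter()
        .filter(|line| finder.find(line.as_bytes()).is_some())
        .count()
}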

NOTE: The prebuilt benchmark for the standard library can't actually avoid measuring searcher construction at some level, because there is no API for it. Instead, the benchmark consists of building the searcher once and then finding all matches in a single string via an iterator. This tends to approximate a benchmark where searcher construction isn't measured, but it isn't perfect. While this means the comparison is not strictly apples-to-apples, it does reflect what is maximally possible with the standard library, and thus reflects the best that one could do in a real world scenario.

While there is more to the story than just API design here, it's important to point out that even if the standard library's substring search were a precise clone of this crate internally, it would still be at a disadvantage in some workloads because of its API. (The same also applies to C's standard library memmem function. There is no way to amortize construction of the searcher. You need to pay for it on every call.)

The other reason for the difference in performance is that the standard library has trouble using SIMD. In particular, substring search is implemented in the core library, where platform specific code generally can't exist. That's an issue because in order to utilize SIMD beyond SSE2 while maintaining portable binaries, one needs to use dynamic CPU feature detection, and that in turn requires platform specific code. While there is an RFC for enabling target feature detection in core, it doesn't yet exist.

The bottom line here is that core's substring search implementation is limited to making use of SSE2, but not AVX.

Still though, this crate does accelerate substring search even when only SSE2 is available. The standard library could therefore adopt the techniques in this crate just for SSE2. The reason why that hasn't happened yet isn't totally clear to me. It likely needs a champion to push it through. The standard library tends to be more conservative in these things. With that said, the standard library does use some SSE2 acceleration on x86-64 added in this PR. However, at the time of writing, it is only used for short needles and doesn't use the frequency based heuristics found in this crate.

NOTE: Another thing worth mentioning is that the standard library's substring search routine requires that both the needle and haystack have type &str. Unless you can assume that your data is valid UTF-8, building a &str will come with the overhead of UTF-8 validation. This may in turn result in overall slower searching depending on your workload. In contrast, the memchr crate permits both the needle and the haystack to have type &[u8], where &[u8] can be created from a &str with zero cost. Therefore, the substring search in this crate is strictly more flexible than what the standard library provides.
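
A short sketch of what that zero-cost flexibility looks like in practice (the wrapper function here is illustrative):

use memchr::memmem;

// `&str::as_bytes` is free, so UTF-8 strings and raw bytes share one search path.
fn find_in_str(haystack: &str, needle: &str) -> Option<usize> {
    memmem::find(haystack.as_bytes(), needle.as_bytes())
}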

memchr's People

Contributors

alexcrichton, atouchet, bluss, burntsushi, cbreeden, cholcombe973, dflemstr, glandium, hdhoang, heroickatora, hywan, ignatenkobrain, jethrogb, ldesgoui, lucretiel, marcusgrass, mbrubeck, mgeisler, michalsrb, mkroening, nicokoch, oliver-piorun, stuartcarnie, tafia, taiki-e, thebluematt, timotree3, tshepang, vmx, waywardmonkeys


memchr's Issues

Use of `AtomicPtr` in `unsafe_ifunc` prevents memchr from being inlined when compiled with avx enabled

is_x86_feature_detected! resolves to just a constant true when -C target-cpu or -C target-feature is set to a value that enables the feature. When using just a simple is_x86_feature_detected! and -C target-cpu=native (or whatever), the compiler can inline the function and completely avoid the machinery of the atomic operations and calling a function pointer. However, when using AtomicPtr, it is impossible for the compiler to inline the function at all.

It would be great if there was some way to automatically disable the runtime feature detection if avx (or whatever the corresponding CPU feature set is) is already enabled at compile-time.

port the memchr implementations to "generic SIMD" code

When I wrote the new memmem implementation earlier this year, one thing I did was write the implementation as something that was generic over the vector type:

/// # Safety
///
/// Since this is meant to be used with vector functions, callers need to
/// specialize this inside of a function with a `target_feature` attribute.
/// Therefore, callers must ensure that whatever target feature is being used
/// supports the vector functions that this function is specialized for. (For
/// the specific vector functions used, see the Vector trait implementations.)
#[inline(always)]
pub(crate) unsafe fn fwd_find<V: Vector>(
    fwd: &Forward,
    haystack: &[u8],
    needle: &[u8],
) -> Option<usize> {

where an example of it being called, e.g. for AVX2, is:

genericsimd::Forward::new(ninfo, needle).map(Forward)

So basically, the idea here is, you write the nasty SIMD code once, and then write some trivial shims for each target feature you want to support.

The actual use of SIMD in this crate is reasonably simple, so it turns out that the trait defining the API of a vector is quite small:

pub(crate) trait Vector: Copy + core::fmt::Debug {
    /// _mm_set1_epi8 or _mm256_set1_epi8
    unsafe fn splat(byte: u8) -> Self;
    /// _mm_loadu_si128 or _mm256_loadu_si256
    unsafe fn load_unaligned(data: *const u8) -> Self;
    /// _mm_movemask_epi8 or _mm256_movemask_epi8
    unsafe fn movemask(self) -> u32;
    /// _mm_cmpeq_epi8 or _mm256_cmpeq_epi8
    unsafe fn cmpeq(self, vector2: Self) -> Self;
    /// _mm_and_si128 or _mm256_and_si256
    unsafe fn and(self, vector2: Self) -> Self;
}
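
As a sketch of how such a trait gets used (building on the trait above; this is not code from the crate), a single-register probe might report the offset of the first matching byte within one vector load. Callers must uphold the safety contract described earlier, i.e. target_feature specialization and a full vector-width read being in bounds:

#[inline(always)]
pub(crate) unsafe fn first_match_in_chunk<V: Vector>(
    needle: u8,
    data: *const u8,
) -> Option<usize> {
    // SAFETY: the caller guarantees `data` is valid for a full vector-width
    // read and that this runs under the matching `target_feature`.
    let mask = unsafe {
        // Broadcast the needle into every lane, load one unaligned chunk,
        // compare lane-wise, and collapse the result into a bitmask.
        let splat = V::splat(needle);
        let chunk = V::load_unaligned(data);
        chunk.cmpeq(splat).movemask()
    };
    if mask == 0 {
        None
    } else {
        Some(mask.trailing_zeros() as usize)
    }
}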

OK, so what's this issue about? I think ideally, we would push the Vector trait up a level in the module hierarchy, port the existing x86 SIMD memchr implementation to a "generic" version, and then replace the existing implementations with shims that call out to the generic version.

This will hopefully let us easily add a WASM implementation of memchr, but adding other implementations in the future would be good too once more intrinsics (e.g., for ARM) are added to std.

(One wonders whether we should just wait for portable SIMD to land in std, but I don't know when that will happen.)

Support memchr4, memchr5, etc.

I'm writing a routine that escapes HTML special characters. To do that, I have to search for five different characters (&, <, >, ', ") simultaneously. I can do this using two calls to memchr2 or memchr3, but that doesn't seem elegant. It would be nice if there was a function that could do this search in one go.

From the implementation side: the PCMPESTRI instruction in SSE4.2 supports searching for up to 16 different needles in parallel. It would be nice if we could expose this somehow.
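
For what it's worth, the two-call workaround mentioned above can be written as a small helper (a sketch, not an API provided by this crate):

use memchr::{memchr2, memchr3};

// First position of any of the five HTML-special characters, or None.
fn first_html_special(haystack: &[u8]) -> Option<usize> {
    let a = memchr3(b'&', b'<', b'>', haystack);
    let b = memchr2(b'\'', b'"', haystack);
    match (a, b) {
        (Some(i), Some(j)) => Some(i.min(j)),
        (x, y) => x.or(y),
    }
}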

Memchr causes LLVM ERROR: Do not know how to split this operator's operand!

Minimal reproduction

Memchr 2.2 and up don't compile on the x86_64-unknown-uefi target using the latest nightly, currently rustc 1.47.0-nightly (e15510ca3 2020-08-20), with "LLVM ERROR: Do not know how to split this operator's operand!"

memchr 2.1 and below compile, however.

Related:

Issue #57, but that was a year ago and with a custom target, whereas this is an upstream target. Sorry if this is a duplicate; I didn't know where else to report it.

Error building for arm64 on amd64 with docker

👋 Hi!

I'm working on docker-activity and memchr is one of my dependencies.
I'm building docker-activity with docker buildx in order to have a single image for both platforms, and I'm running into some weird behavior. I'm not sure whether to open this issue in your repo, in buildx, or even in qemu, but I'll try here.

When I build on a real arm64 machine (RPi4), the image builds perfectly, but when I use docker buildx build --platform linux/arm64 I end up with this issue, apparently due to memchr.

#18 33.61    Compiling futures-core v0.3.18
#18 35.36 error: could not compile `memchr` due to previous error
#18 35.37 warning: build failed, waiting for other jobs to finish...
#18 43.90 error: linking with `cc` failed: exit status: 1
#18 43.90   |
#18 43.90   = note: "cc" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/self-contained/crt1.o" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/self-contained/crti.o" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/self-contained/crtbegin.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.0.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.1.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.10.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.11.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.12.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.13.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.14.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.15.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.2.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.3.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.4.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.5.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.6.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.7.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.8.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.build_script_build.9009e48d-cgu.9.rcgu.o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d.25mgmn3yz3lxkcgb.rcgu.o" "-Wl,--as-needed" "-L" "/code/target/release/deps" "-L" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib" "-Wl,--start-group" "-Wl,-Bstatic" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libstd-bb69598673ac6378.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libpanic_unwind-347c34ae82bb4da0.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libminiz_oxide-86fc36b502bfb8aa.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libadler-cb14375f652e6e86.rlib" 
"/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libobject-9e87208331b99476.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libmemchr-ebe0ff89d9e37134.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libaddr2line-b0f16d22595fdd3b.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libgimli-57bd3e568b1b69be.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libstd_detect-d2296608bd767c8a.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/librustc_demangle-e4d26fe9e39d3be6.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libhashbrown-8322f07825c42064.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/librustc_std_workspace_alloc-403fa8d4a1124a0d.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libunwind-29e90d90171d4117.rlib" "-lunwind" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libcfg_if-ed66653f82293f20.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/liblibc-ad350ff50825d4f2.rlib" "-lc" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/liballoc-98d6df8d800ab2ff.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/librustc_std_workspace_core-e0db88e40d9c7e0b.rlib" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libcore-fcedc0d4b8cb02ca.rlib" "-Wl,--end-group" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/libcompiler_builtins-0c2242734ae54219.rlib" "-Wl,-Bdynamic" "-Wl,--eh-frame-hdr" "-Wl,-znoexecstack" "-nostartfiles" "-L" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib" "-L" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/self-contained" "-o" "/code/target/release/build/futures-core-34f97c22b245c83d/build_script_build-34f97c22b245c83d" "-Wl,--gc-sections" "-static" "-no-pie" "-Wl,-zrelro,-znow" "-nodefaultlibs" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/self-contained/crtend.o" "/usr/local/rustup/toolchains/nightly-aarch64-unknown-linux-musl/lib/rustlib/aarch64-unknown-linux-musl/lib/self-contained/crtn.o"
#18 43.90   = note:
#18 43.90
#18 46.48 error: build failed
------
Dockerfile:16
--------------------
  14 |     COPY src/exporter /code/src/exporter
  15 |     COPY src/format /code/src/format
  16 | >>> RUN cargo build --release --offline
  17 |
  18 |     FROM alpine
--------------------
error: failed to solve: process "/dev/.buildkit_qemu_emulator /bin/sh -c cargo build --release --offline" did not complete successfully: exit code: 101

Do you have any idea where that could come from? If you want, here is the Dockerfile I use.

Is there a way to also support memrchr?

I was wondering if it would be possible for this library to also support memrchr?
I can see this might be difficult because memrchr is not POSIX-defined functionality.

And another (kinda related) question: Does this crate work on all platforms (especially windows)?

Compilation Issue on rust 1.41.0

error[E0428]: the name `imp` is defined multiple times
   --> .cargo/registry/src/github.com-1ecc6299db9ec823/memchr-2.3.1/src/lib.rs:148:5
    |
139 |     fn imp(n1: u8, haystack: &[u8]) -> Option<usize> {
    |     ------------------------------------------------ previous definition of the value `imp` here
...
148 |     fn imp(n1: u8, haystack: &[u8]) -> Option<usize> {
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `imp` redefined here
    |
    = note: `imp` must be defined only once in the value namespace of this block

Duplicate lang items with 2 or more crates that depend on this in Cargo.toml and -Zbuild-std

To reproduce the error,

  1. Create a lib crate
  2. Add 2 or more crates that depend on memchr to the Cargo.toml. For me, that would be nom = "7.0" and object = "0.26"
  3. Execute cargo build -Z build-std=core,compiler_builtins,alloc -Z build-std-features=compiler-builtins-mem --target x86_64-unknown-linux-gnu

Cargo then emits thousands of lines of output complaining about duplicate lang items and the like.

My rust version:
nightly-x86_64-unknown-linux-gnu (default)
rustc 1.56.0-nightly (b03ccace5 2021-08-24)

Changelog

It would be nice if there were somewhere that listed changes between different releases. Currently it seems the only way to do that is by comparing tags, but that's not such a great experience. I know it's a lot of work to make a log for 20 releases, but I think it's worth having.

Plans for searching for 4 or 5 bytes or for u16?

This isn't quite a feature request at this time, just a question at this point:

Is providing memchr4 considered to be within scope for this crate? What about analogs of the byte searching functions that search up to 4 u16 values in a &[u16]?

(Context: Escaping a string according to HTML.)

What about up to 5? (Context: the Data state of the HTML parsing algorithm taking into account CR and LF for end-of-line handling.)

[Windows 10] Failed to run custom build command for 'memchr v2.4.1'

Hi, whenever I try to install distant or build it from source, the process fails on memchr compilation.

$ cargo build --release
  Downloaded anyhow v1.0.44
  Downloaded polling v2.1.0
  Downloaded tokio-util v0.6.8
  Downloaded instant v0.1.11
  Downloaded libc v0.2.103
  Downloaded openssl-src v111.16.0+1.1.1l
  Downloaded blocking v1.0.2
  Downloaded proc-macro2 v1.0.29
  Downloaded async-process v1.2.0
  Downloaded serde_json v1.0.68
  Downloaded mio v0.7.13
  Downloaded openssl-sys v0.9.67
  Downloaded structopt v0.3.23
  Downloaded tokio-macros v1.4.1
  Downloaded thiserror v1.0.29
  Downloaded whoami v1.1.5
  Downloaded slab v0.4.4
  Downloaded pkg-config v0.3.20
  Downloaded thiserror-impl v1.0.29
  Downloaded cc v1.0.71
  Downloaded tokio v1.12.0
  Downloaded syn v1.0.80
  Downloaded half v1.7.1
  Downloaded structopt-derive v0.4.16
  Downloaded zeroize v1.4.2
  Downloaded 25 crates (7.1 MB) in 7.25s (largest was `openssl-src` at 5.1 MB)
   Compiling winapi v0.3.9
   Compiling proc-macro2 v1.0.29
   Compiling autocfg v1.0.1
   Compiling unicode-xid v0.2.2
   Compiling syn v1.0.80
   Compiling cfg-if v1.0.0
   Compiling memchr v2.4.1
   Compiling libc v0.2.103
   Compiling futures-core v0.3.17
   Compiling cc v1.0.71
   Compiling version_check v0.9.3
   Compiling pin-project-lite v0.2.7
   Compiling log v0.4.14
   Compiling futures-io v0.3.17
   Compiling once_cell v1.8.0
   Compiling vcpkg v0.2.15
   Compiling pkg-config v0.3.20
   Compiling cache-padded v1.1.1
   Compiling typenum v1.14.0
   Compiling parking_lot_core v0.8.5
   Compiling parking v2.0.0
   Compiling fastrand v1.5.0
   Compiling slab v0.4.4
   Compiling waker-fn v1.1.0
   Compiling event-listener v2.5.1
   Compiling lazy_static v1.4.0
   Compiling scopeguard v1.1.0
   Compiling smallvec v1.7.0
   Compiling async-task v4.0.3
   Compiling proc-macro-hack v0.5.19
   Compiling ntapi v0.3.6
   Compiling bitflags v1.3.2
   Compiling unicode-segmentation v1.8.0
   Compiling futures-sink v0.3.17
   Compiling atomic-waker v1.0.0
   Compiling proc-macro-nested v0.1.7
   Compiling serde_derive v1.0.130
   Compiling futures-channel v0.3.17
   Compiling futures-task v0.3.17
   Compiling serde v1.0.130
   Compiling anyhow v1.0.44
   Compiling bytes v1.1.0
   Compiling unicode-width v0.1.9
   Compiling regex-syntax v0.6.25
   Compiling pin-utils v0.1.0
   Compiling ryu v1.0.5
   Compiling cpufeatures v0.2.1
   Compiling subtle v2.4.1
   Compiling serde_json v1.0.68
   Compiling zeroize v1.4.2
   Compiling camino v1.0.5
   Compiling ppv-lite86 v0.2.10
   Compiling strsim v0.8.0
   Compiling opaque-debug v0.3.0
   Compiling regex-automata v0.1.10
   Compiling shell-words v1.0.0
   Compiling convert_case v0.4.0
   Compiling itoa v0.4.8
   Compiling half v1.7.1
   Compiling base64 v0.13.0
   Compiling whoami v1.1.5
   Compiling hex v0.4.3
   Compiling yansi v0.5.0
   Compiling glob v0.3.0
   Compiling instant v0.1.11
   Compiling getrandom v0.2.3
   Compiling futures-macro v0.3.17
error: failed to run custom build command for `memchr v2.4.1`

Caused by:
  could not execute process `C:\Users\modzmi01\Documents\projects\2021\distant\target\release\build\memchr-34b704a4017ecdea\build-script-build` (never executed)

Caused by:
  Access is denied. (os error 5)
warning: build failed, waiting for other jobs to finish...
error: build failed
$ cargo --version
cargo 1.56.0 (4ed5d137b 2021-10-04)


$ rustc --version
rustc 1.56.1 (59eed8a2a 2021-11-01)

I have tried running the command in PowerShell with admin rights but I get the same error. I don't know any Rust; is there something I can do to get more details about which command from build.rs is causing the issue?

Compiler error when building on custom x86_64 no_std target

I ran into this error when trying to add the cstr_core crate to my no_std OS project, which depends on memchr and disables its default features.

Building in debug mode:

LLVM ERROR: Do not know how to split this operator's operand!

Building in release mode:

error: Could not compile `memchr`.

Caused by:
  process didn't exit successfully: `rustc --crate-name memchr /home/kevin/.cargo/registry/src/github.com-1ecc6299db9ec823/memchr-2.2.1/src/lib.rs --color always --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 -C metadata=ebf07e953eacc227 -C extra-filename=-ebf07e953eacc227 --out-dir /home/kevin/my_os/target/x86_64-my_os/release/deps --target x86_64-my_os -L dependency=/home/kevin/my_os/target/x86_64-my_os/release/deps -L dependency=/home/kevin/my_os/target/release/deps --cap-lints allow --emit=obj -C debuginfo=2 -C code-model=large -C relocation-model=static -D unused-must-use -Z merge-functions=disabled -Z share-generics=no --sysroot /home/kevin/.xargo --cfg memchr_runtime_simd --cfg memchr_runtime_sse2 --cfg memchr_runtime_sse42 --cfg memchr_runtime_avx` (signal: 11, SIGSEGV: invalid memory reference)

For reference and in case it matters, my compiler target .json file is:

{
  "llvm-target": "x86_64-unknown-none-gnu",
  "data-layout": "e-m:e-i64:64-f80:128-n8:16:32:64-S128",
  "linker-flavor": "gcc",
  "target-endian": "little",
  "target-pointer-width": "64",
  "target-c-int-width": "32",
  "arch": "x86_64",
  "os": "none",
  "features": "-mmx,-sse,+soft-float",
  "disable-redzone": true,
  "panic": "abort"
}

My rustc version is rustc 1.38.0-nightly (78ca1bda3 2019-07-08).

I've never encountered an error like this before, so I'm not sure what else to say. If more information is needed, I am happy to provide it.

Error when building `memchr` with custom rustflags `-static`

error: failed to run custom build command for `memchr v2.4.1`

Caused by:
  could not execute process `/mnt/c/Users/asaff/Documents/Dev/Github/tool/target/debug/build/memchr-b9330b1f01949571/build-script-build` (never executed)

Caused by:
  No such file or directory (os error 2)

config:

[build]
rustflags = ["-C", "link-arg=-nostdlib", "-C", "link-arg=-static", "-C", "relocation-model=pic"]

When compiling only with link-arg=-nostdlib I get the following crash:

error: failed to run custom build command for `memchr v2.4.1`

Caused by:
  process didn't exit successfully: `/mnt/c/Users/asaff/Documents/Dev/Github/tool/target/debug/build/memchr-b9330b1f01949571/build-script-build` (signal: 11, SIGSEGV: invalid memory reference)
warning: build failed, waiting for other jobs to finish...
error: build failed

The next function in memchr*_iter doesn't seem to compute efficiently.

I am trying to make an efficient csv parser using memchr.

I want to search for all commas when there is a line like the one below.
a1,b1,c1,d2
In this case, I want to use the code below.

let sep_iter = memchr3_iter(col_sep, row_sep, b'\'', &buffer[..]);
...
loop {
    let next_sep_pos_wrap = self.sep_iter.next();
    .....
}

When sep_iter.next() is called, it seems like the already-computed results could be reused, but it appears to recalculate from the beginning. When sep_iter.next() runs for the first time, the positions of the later matches already seem to be stored in the other bits of the mask, but currently it seems to only use trailing_zeros.

memchr2_iter and memchr3_iter do not properly advance slice position

Hi @BurntSushi I'm the author of Artichoke Ruby. We met on Twitter.

Playground: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=7f18fe26ae4413e95f1e350d77f86b28

The iter_next! macro hard codes how much to advance the haystack position by.

https://github.com/BurntSushi/rust-memchr/blob/1ec5ecce03c220c762dd9a8b08f7a3d95522b765/src/iter.rs#L17

This means that memchr*_iter functions on more than one byte incorrectly scan. For example, this code outputs 2 when it should output 1:

extern crate memchr; // 2.2.1

fn main() {
    let haystack = b"abcdefghijklmnopqrstuvwxyz";
    println!("{}", memchr::memchr2_iter(b'a', b'b', haystack.as_ref()).count());
}

Tests fail with --no-default-features [published memchr 2.3.2]

Just hit this doing various linux-vendor side testing. It's probably not important, but it would be nice to be able to ensure this crate works in this configuration.

cargo +stable test --no-default-features
error[E0433]: failed to resolve: maybe a missing crate `std`?
 --> src/tests/mod.rs:1:5
  |
1 | use std::iter::repeat;
  |     ^^^ maybe a missing crate `std`?

error: cannot find macro `eprintln` in this scope
 --> src/tests/mod.rs:9:5
  |
9 |     eprintln!("LITTLE ENDIAN");
  |     ^^^^^^^^

error[E0433]: failed to resolve: use of undeclared type or module `Vec`
   --> src/tests/iter.rs:167:27
    |
167 |     let mut found_front = Vec::new();
    |                           ^^^ use of undeclared type or module `Vec`

error[E0433]: failed to resolve: use of undeclared type or module `Vec`
   --> src/tests/iter.rs:168:26
    |
168 |     let mut found_back = Vec::new();
    |                          ^^^ use of undeclared type or module `Vec`

error[E0433]: failed to resolve: use of undeclared type or module `Box`
   --> src/tests/iter.rs:201:5
    |
201 |     Box::new(it)
    |     ^^^ use of undeclared type or module `Box`

error[E0433]: failed to resolve: use of undeclared type or module `Box`
   --> src/tests/iter.rs:214:5
    |
214 |     Box::new(it)
    |     ^^^ use of undeclared type or module `Box`

error[E0433]: failed to resolve: use of undeclared type or module `Box`
   --> src/tests/iter.rs:228:5
    |
228 |     Box::new(it)
    |     ^^^ use of undeclared type or module `Box`

error[E0433]: failed to resolve: use of undeclared type or module `Vec`
  --> src/tests/mod.rs:20:21
   |
20 |     let mut tests = Vec::new();
   |                     ^^^ use of undeclared type or module `Vec`

error[E0433]: failed to resolve: use of undeclared type or module `Vec`
   --> src/tests/mod.rs:295:24
    |
295 |         let mut more = Vec::new();
    |                        ^^^ use of undeclared type or module `Vec`

error[E0412]: cannot find type `Vec` in this scope
  --> src/tests/iter.rs:52:27
   |
52 |         needle: u8, data: Vec<u8>, take_side: Vec<bool>
   |                           ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
  --> src/tests/iter.rs:52:47
   |
52 |         needle: u8, data: Vec<u8>, take_side: Vec<bool>
   |                                               ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
  --> src/tests/iter.rs:66:41
   |
66 |         needle1: u8, needle2: u8, data: Vec<u8>, take_side: Vec<bool>
   |                                         ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
  --> src/tests/iter.rs:66:61
   |
66 |         needle1: u8, needle2: u8, data: Vec<u8>, take_side: Vec<bool>
   |                                                             ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
  --> src/tests/iter.rs:81:15
   |
81 |         data: Vec<u8>, take_side: Vec<bool>
   |               ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
  --> src/tests/iter.rs:81:35
   |
81 |         data: Vec<u8>, take_side: Vec<bool>
   |                                   ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
  --> src/tests/iter.rs:97:30
   |
97 |     fn qc_memchr1_iter(data: Vec<u8>) -> bool {
   |                              ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/iter.rs:103:34
    |
103 |     fn qc_memchr1_rev_iter(data: Vec<u8>) -> bool {
    |                                  ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/iter.rs:109:30
    |
109 |     fn qc_memchr2_iter(data: Vec<u8>) -> bool {
    |                              ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/iter.rs:116:34
    |
116 |     fn qc_memchr2_rev_iter(data: Vec<u8>) -> bool {
    |                                  ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/iter.rs:123:30
    |
123 |     fn qc_memchr3_iter(data: Vec<u8>) -> bool {
    |                              ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/iter.rs:131:34
    |
131 |     fn qc_memchr3_rev_iter(data: Vec<u8>) -> bool {
    |                                  ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/iter.rs:139:40
    |
139 |     fn qc_memchr1_iter_size_hint(data: Vec<u8>) -> bool {
    |                                        ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/iter.rs:162:58
    |
162 | fn double_ended_take<I, J>(mut iter: I, take_side: J) -> Vec<I::Item>
    |                                                          ^^^ not found in this scope

error[E0412]: cannot find type `Box` in this scope
   --> src/tests/iter.rs:195:6
    |
195 | ) -> Box<dyn DoubleEndedIterator<Item = usize> + 'a> {
    |      ^^^ not found in this scope

error[E0412]: cannot find type `Box` in this scope
   --> src/tests/iter.rs:208:6
    |
208 | ) -> Box<dyn DoubleEndedIterator<Item = usize> + 'a> {
    |      ^^^ not found in this scope

error[E0412]: cannot find type `Box` in this scope
   --> src/tests/iter.rs:222:6
    |
222 | ) -> Box<dyn DoubleEndedIterator<Item = usize> + 'a> {
    |      ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
  --> src/tests/memchr.rs:92:49
   |
92 |     fn qc_memchr1_matches_naive(n1: u8, corpus: Vec<u8>) -> bool {
   |                                                 ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
  --> src/tests/memchr.rs:98:57
   |
98 |     fn qc_memchr2_matches_naive(n1: u8, n2: u8, corpus: Vec<u8>) -> bool {
   |                                                         ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/memchr.rs:106:17
    |
106 |         corpus: Vec<u8>
    |                 ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/memchr.rs:113:50
    |
113 |     fn qc_memrchr1_matches_naive(n1: u8, corpus: Vec<u8>) -> bool {
    |                                                  ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/memchr.rs:119:58
    |
119 |     fn qc_memrchr2_matches_naive(n1: u8, n2: u8, corpus: Vec<u8>) -> bool {
    |                                                          ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/memchr.rs:127:17
    |
127 |         corpus: Vec<u8>
    |                 ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
  --> src/tests/mod.rs:19:22
   |
19 | fn memchr_tests() -> Vec<MemchrTest> {
   |                      ^^^ not found in this scope

error[E0412]: cannot find type `String` in this scope
   --> src/tests/mod.rs:144:13
    |
144 |     corpus: String,
    |             ^^^^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/mod.rs:152:14
    |
152 |     needles: Vec<u8>,
    |              ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/mod.rs:154:16
    |
154 |     positions: Vec<usize>,
    |                ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/mod.rs:276:26
    |
276 |             it.collect::<Vec<usize>>(),
    |                          ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/mod.rs:278:63
    |
278 |             self.needles.iter().map(|&b| b as char).collect::<Vec<char>>(),
    |                                                               ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/mod.rs:294:25
    |
294 |     fn expand(&self) -> Vec<MemchrTest> {
    |                         ^^^ not found in this scope

error[E0412]: cannot find type `String` in this scope
   --> src/tests/mod.rs:300:33
    |
300 |             let mut new_corpus: String = repeat('%').take(i).collect();
    |                                 ^^^^^^ not found in this scope

error[E0425]: cannot find function `repeat` in this scope
   --> src/tests/mod.rs:300:42
    |
300 |             let mut new_corpus: String = repeat('%').take(i).collect();
    |                                          ^^^^^^ not found in this scope
    |
help: possible candidate is found in another module, you can import it into scope
    |
1   | use core::iter::repeat;
    |

error[E0412]: cannot find type `String` in this scope
   --> src/tests/mod.rs:309:26
    |
309 |             let padding: String = repeat('%').take(i).collect();
    |                          ^^^^^^ not found in this scope

error[E0425]: cannot find function `repeat` in this scope
   --> src/tests/mod.rs:309:35
    |
309 |             let padding: String = repeat('%').take(i).collect();
    |                                   ^^^^^^ not found in this scope
    |
help: possible candidate is found in another module, you can import it into scope
    |
1   | use core::iter::repeat;
    |

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/mod.rs:330:47
    |
330 |     fn needles(&self, count: usize) -> Option<Vec<u8>> {
    |                                               ^^^ not found in this scope

error[E0412]: cannot find type `Vec` in this scope
   --> src/tests/mod.rs:348:57
    |
348 |     fn positions(&self, align: usize, reverse: bool) -> Vec<usize> {
    |                                                         ^^^ not found in this scope

error[E0599]: no method named `to_string` found for type `&'static str` in the current scope
  --> src/tests/mod.rs:28:36
   |
28 |             corpus: statict.corpus.to_string(),
   |                                    ^^^^^^^^^ method not found in `&'static str`
   |
   = help: items from traits can only be used if the trait is in scope
   = note: the following trait is implemented but not in scope; perhaps add a `use` for it:
           `use alloc::string::ToString;`

error: aborting due to 46 previous errors

Some errors have detailed explanations: E0412, E0425, E0433, E0599.
For more information about an error, try `rustc --explain E0412`.
error: could not compile `memchr`.

To learn more, run the command again with --verbose.

Feature request: no-std + alloc

I was looking through the code of this crate. I have a need for something like this on a no-std + alloc target, but it seems several features (such as using Cow from alloc) are missing. That should be possible to support.

memchr 2.5.0 fails to compile on Android

After upgrading to 2.5.0, I'm getting build errors on Android like this:

[CONTEXT] stderr: error[E0531]: cannot find tuple struct or tuple variant `GenericSIMD128` in this scope
[CONTEXT]    --> third-party/rust/vendor/memchr-2.5.0/src/memmem/mod.rs:885:13
[CONTEXT]     |
[CONTEXT] 885 |             GenericSIMD128(gs) => GenericSIMD128(gs),
[CONTEXT]     |             ^^^^^^^^^^^^^^ not found in this scope
[CONTEXT] 

It appears that the GenericSIMD128 enum variant is defined with cfg target_arch = "x86_64" or memchr_runtime_wasm128, but then it is used in the code without a cfg check limiting it to those platforms, causing it to fail to compile.

approaching 1.0

This crate provides a small reasonably well defined API and probably won't ever see any breaking changes. While it may still need new additions or improved platform support, I expect it won't require backwards incompatible changes. Therefore, I propose cutting a 1.0 release in the next few weeks.

cc @bluss @nicokoch

`memrchr` implementations may conflict with stacked borrows

The reverse search implementations (memrchr) seem illegal under stacked borrows. They all follow the same pattern, so here I'll only annotate one. It retrieves a raw pointer to the end of the haystack from a reference to an empty slice, but then uses that pointer to iterate backwards by offsetting it with negative indices. Under strict rules, that pointer would however only be valid for access to the bytes that the reference covered from which it was cast, i.e. a zero-length slice at the end.

To my understanding, this is very likely illegal but not yet caught by MIRI since it does not strictly track the source of raw pointers. @RalfJung might be able to provide more definitive insights.

Relevant code (inserted comments marked as // !):

pub fn memrchr(n1: u8, haystack: &[u8]) -> Option<usize> {
    // [...]
    let start_ptr = haystack.as_ptr();
    // ! This pointer only covers the same slice that the reference does.
    // ! Would need to create these manually from offsetting the start pointer
    // ! which covers the whole array.
    let end_ptr = haystack[haystack.len()..].as_ptr();
    let mut ptr = end_ptr;

    unsafe {
        // [...]
        ptr = (end_ptr as usize & !align) as *const u8;
        // [...]
        while loop_size == LOOP_SIZE && ptr >= ptr_add(start_ptr, loop_size) {
            // [...]
            // ! These are outside the bounds of the reference from which ptr was created.
            let a = *(ptr_sub(ptr, 2 * USIZE_BYTES) as *const usize);
            let b = *(ptr_sub(ptr, 1 * USIZE_BYTES) as *const usize);
            // [...]
            ptr = ptr_sub(ptr, loop_size);
        }
        // [...]
    }
}

Library code reduced to that version of memrchr.

The fix is simple: create ptr by manually offsetting haystack.as_ptr(), which is valid for the whole haystack. I also don't expect any miscompilation.
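
A sketch of that fix (not the crate's actual patch):

fn haystack_bounds(haystack: &[u8]) -> (*const u8, *const u8) {
    let start_ptr = haystack.as_ptr();
    // Derive the end pointer by offsetting the start pointer, which is valid
    // for the entire haystack, instead of taking a pointer from an empty tail
    // sub-slice.
    // SAFETY: the offset is `haystack.len()`, i.e. at most one past the end
    // of the same allocation.
    let end_ptr = unsafe { start_ptr.add(haystack.len()) };
    (start_ptr, end_ptr)
}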

Performance on 64-bit arch

I'm writing a parser and looking for 4 needles. I tried adjusting the bithacks to analyze 8 bytes at a time since I'm using a 64-bit arch. I expected memchr to be faster, but I am finding that is not true. Am I doing something wrong? The benchmarks are the following using rustc 1.31.0-nightly (fc403ad98 2018-09-30). I got similar results on both macOS and Linux.

test tests::bench_hasvalue                   ... bench:           2 ns/iter (+/- 0)
test tests::bench_memchr                     ... bench:          13 ns/iter (+/- 1)

The code is the following.

extern crate byteorder;
extern crate memchr;

use self::byteorder::{ByteOrder, NativeEndian};

// Taken from "Determine if a word has a zero byte" at http://graphics.stanford.edu/~seander/bithacks.html
// and adjusted for 64-bits and Rust complaining of overflow.
fn haszero(x: u64) -> bool {
    (x.wrapping_sub(0x0101_0101_0101_0101) & !x & 0x8080_8080_8080_8080) != 0
}

// Taken from "Determine if a word has a byte equal to n" at http://graphics.stanford.edu/~seander/bithacks.html
// and adjusted for 64-bits.
fn hasvalue(haystack: &[u8], needles: &[u8]) -> bool {
    let x = NativeEndian::read_u64(haystack);
    let y = !0 as u64 / 255 as u64;
    for c in needles {
        if haszero(x ^ (y * u64::from(*c))) {
            return true;
        }
    }
    false
}

extern crate test;

#[cfg(test)]
mod tests {
    use self::test::Bencher;
    use super::*;

    #[bench]
    fn bench_hasvalue(b: &mut Bencher) {
        b.iter(|| {
            let haystack = test::black_box(b"01234567");
            hasvalue(haystack, &[b'a', b'b', b'c', b'd'])
        });
    }

    #[bench]
    fn bench_memchr(b: &mut Bencher) {
        b.iter(|| {
            let haystack = test::black_box(b"01234567");
            memchr::memchr(b'a', haystack).is_none()
                && memchr::memchr3(b'b', b'c', b'd', haystack).is_none()
        });
    }
}

kaspersky antivirus check

At the time of building, the antivirus thinks it has found a virus:
VHO:Trojan-Banker.Win32.ClipBanker.gen

Off by one index in Memchr iterator

It seems surprising to me that when using the Memchr iterator, the position that is returned when calling .next() is the position of the needle + 1. Looking at the tests of the crate, this is expected behavior. Could you please explain why it is this way? I could then make a PR with additional documentation and an example to show usage of the Memchr (and Memchr2/3) iterator.

Clarity with regard to `--no-default-features` leading to `fallback::memchr`

First and foremost, thank you for the crate Andrew.

I've been using it for a while, and I recently made one of the libraries I'm working on no_std. Coincidentally, I changed my benchmarking habits while doing so, and I never noticed that --no-default-features leads to fallback::memchr (due to std::is_x86_feature_detected); I only discovered the change via cargo asm.

What do you think of extending the README's no_std section to mention the otherwise "silent" requirement?

(I am now using std_detect with an extern no_std memchr. Considering you work on that too, I assume you’ll want it to be stable before adopting it.)

Consider implementing memchr with arbitrary char predicate

Could be something like this:

trait CharPred: CharPredSecret {}

struct CharEq(u8); impl CharPred for CharEq {}
struct CharLtSigned(i8); impl CharPred for CharLtSigned {}
struct CharLtUnsigned(u8); impl CharPred for CharLtUnsigned {}
struct CharGtSigned(i8); impl CharPred for CharGtSigned {}
struct CharGtUnsigned(u8); impl CharPred for CharGtUnsigned {}

struct CharOr(CharPred, CharPred);
impl CharPred for CharOr {}

fn memchr_pred(needle: impl CharPred, haystack: &[u8]) -> Option<usize> { ... }

This could be useful in scenarios like this:

fn need_escape(s: &str) -> bool {
  let pred = CharOr(
    CharOr(CharEq(b'"'), CharEq(b'\"')),
    CharOr(CharEq(b'\\'), CharLtSigned(32)),
  );
  memchr::char_pred(pred, s.as_bytes()).is_none()
}

fn c_escape(s: &str) -> String {
  if !need_escape(s) { return s.to_owned(); }
  // else slow iteration over character
}
// trait not exposed to user
trait CharPredSecret {
  fn eval<V: Vector>(self, arg: V) -> V;
}

impl CharPredSecret for CharEq {
  fn eval<V: Vector>(self, arg: V) {
    // compiler should be smart enough to move splat out of the loop
    let c = V::splat(self.0);
    V::cmpeq(arg, c)
  }
}

The ifunc macro cannot be used in such a memchr_pred function because there are no generic statics, but that is probably OK; it should at least be better than the non-SIMD version.

memchr 2.4.0 msrv breaks nom 5.1.2 msrv of 1.37

See rust-bakery/nom#1313.

nom 5.1 depends on `memchr = "^2.0"`. It has a documented msrv of 1.37.

Since 2.4.0 is supposed to be semver-compatible, cargo will select memchr 2.4.0, which does not build on 1.37.

I understand that balancing msrv vs semver can be a drag on maintainer productivity, so I have no real expectations here other than documenting that it happened.

Undefined symbols ld error when enabling `use_std`

Not sure if this is an issue with memchr, nom, elastic-rs/elastic (where I am using nom) or even std/core/compiler but I thought I would start here...

I am getting undefined symbols ld errors to various symbols in core when I enable the memchr/use_std feature of nom, see elastic-rs/elastic/pull/389 for a little more background + error logs and this or this Travis build.

I have reproduced it on macOS 10.14 & 10.15 and Ubuntu 19.04 (and for the sake of completeness; various Linux via Docker) with rustc 1.38.0 (625451e37 2019-09-23), 1.39.0-beta.6 (224f0bc90 2019-10-15), 1.40.0-nightly (4a8c5b20c 2019-10-23)—and a few other nightlies—and when cross compiling to x86_64-unknown-linux-musl from macOS and Linux hosts.

If you think this doesn't belong here, please let me know where you think I should file this. I can also upload the current Cargo.lock if that would help.

PS thanks for all your awesome work—I have been using rg practically daily for years and love it.

error[E0428]: the name `imp` is defined multiple times

[192.168.18.146] out: error[E0428]: the name imp is defined multiple times
[192.168.18.146] out: --> /home/aram/.cargo/registry/src/github.com-1ecc6299db9ec823/memchr-2.3.1/src/lib.rs:148:5
[192.168.18.146] out: |
[192.168.18.146] out: 139 | fn imp(n1: u8, haystack: &[u8]) -> Option {
[192.168.18.146] out: | ------------------------------------------------ previous definition of the value imp here
[192.168.18.146] out: ...
[192.168.18.146] out: 148 | fn imp(n1: u8, haystack: &[u8]) -> Option {
[192.168.18.146] out: | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ imp redefined here
[192.168.18.146] out: |
[192.168.18.146] out: = note: imp must be defined only once in the value namespace of this block

Seems #[cfg(all(target_arch = "x86_64", memchr_runtime_simd, not(miri)))] and #[cfg(all(memchr_libc, not(all(target_arch = "x86_64", memchr_runtime_simd, miri))))] are not mutually exclusive.

Purpose of linking against 'std'

What's the purpose of linking against the standard library? Everything that this crate needs from std is provided in core, which in turn is re-exported by std.

The libc crate can also be used as a dependency even in a #![no_std] environment; however, this crate only enables it when use_std is enabled.

For crate authors who want to utilize memchr and provide #![no_std], they're left with the fallback implementation. This is slower on macOS than using libc::memchr.

Edit: I didn't try testing with enabling the libc feature by itself. Apologies 😅

Poor performance on Zen 1/Threadripper due to loop unrolling

I have noticed that the SSE2 implementation of memchr in this crate unrolls the loop 4x. Unfortunately, this seems to lead to a significant performance drop on processors on the Zen 1 architecture. Benchmarked on a TR 1950x, I see about 50-60% better performance compared to this crate when avoiding loop unrolling altogether. Below is a benchmark demonstrating this. The memchr implementation in the repository linked below is written in such a way so that you just have to change one constant (UNROLL_SIZE) to change the amount of loop unrolling that the function uses for the main loop.

https://github.com/redzic/memchr-demonstration

Just clone the repository and run cargo run --release to run the benchmark.

Increasing UNROLL_SIZE leads to worse performance on my TR1950x, with 4x unrolling being basically the same speed as this crate which makes sense. However, when using UNROLL_SIZE = 8, the performance difference between this crate's implementation and the custom implementation spikes again to being about 10% faster than this crate (i.e., this crate's SSE2 memchr is 90.9% as fast).

Would it be possible to tune the unroll factor, or possibly even do something similar to OpenBLAS, so that we query information about the CPU such as cache size or even exact CPU model, and dispatch code accordingly? Perhaps this functionality could be implemented behind some kind of feature flag.

Support miri

The crate does not seem to work well with miri at the moment for multiple reasons. One reason is that the crate uses the x86 / C implementations over the fallback implementation, which miri doesn't support. But even when cfg'ing out those implementations via cfg(miri), the fallback implementation uses a lot of bit math on pointers, which miri also doesn't like.

Use rustc_layout_scalar_valid_range_end(usize::MAX - 1) for the index

From rust-lang/rust#73139 (comment):

The index is always less than the length. So even if the length is usize::MAX, the index will be at most MAX - 1 and so cannot overflow.

It would be great to tell rustc this is the case by using rustc_layout_scalar_valid_range_end(usize::MAX - 1). That would allow storing Option<index> in usize instead of needing an extra bit, which in some cases could double the size of the struct due to alignment requirements.

Failing that (since rustc_layout_scalar_valid_range_end is unstable and likely will never be stabilized), would it be possible to document that the index is always less than usize::MAX?
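
To illustrate the size argument (this is not the crate's code): on a 64-bit target, Option<usize> takes two words, while a type with a forbidden value packs the None case into the same word. NonZeroUsize, storing index + 1, is used here as a stable stand-in for the unstable attribute; the +1 encoding is valid precisely because the index is always less than usize::MAX.

use std::num::NonZeroUsize;

fn main() {
    // Two words: usize has no niche, so the discriminant needs its own word.
    assert_eq!(std::mem::size_of::<Option<usize>>(), 16);
    // One word: the zero value acts as the niche for None.
    assert_eq!(std::mem::size_of::<Option<NonZeroUsize>>(), 8);
}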

Provide const implementation

As const evaluation is slowly becoming powerful enough for wide use, memchr not providing a const version is slowly becoming an issue.

For example, in Amanieu/cstr_core#25 (an embedded version of cstr), it would be convenient to construct a &str out of const array data taken from C in mixed C-and-Rust environments. The workaround there is to have one's own simple-but-const memchr implementation and dispatch through const_eval_select between that and an actually runtime-friendly memchr from this crate.

On the long run, it would be great if this crate would just provide its memchr as const -- obviously that's not gonna fly any time soon (especially considering the MSRV), but there could be steps:

  • This crate could provide a memchr_const function (feel free to take the one from there if you like, it's under a different license but I wrote it and hereby also license it under this crate's license).
  • This crate could provide an auto-dispatching memchr under a nightly-only feature gate.

The full solution (memchr "just" being const) is likely to be tricky: while a "regular" memchr can be const on stable in the foreseeable future, I don't expect that to happen for the vectorized or libc-calling versions. Thus, once any of those optimizations are enabled, dispatch between the runtime-optimized version and the const version will still need to happen, and the only way to do that currently, AFAIK, is through const_eval_select, for which there are no stabilization plans. This issue could be a use case to start stabilizing const_eval_select, or to explore alternative avenues.
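
For concreteness, the simple-stupid-but-const implementation mentioned in the first bullet could look something like this naive sketch (not this crate's optimized code):

// A naive, const-compatible forward search; fine for const contexts,
// nowhere near the runtime-optimized routines.
pub const fn memchr_const(needle: u8, haystack: &[u8]) -> Option<usize> {
    let mut i = 0;
    while i < haystack.len() {
        if haystack[i] == needle {
            return Some(i);
        }
        i += 1;
    }
    None
}

// Usable in const contexts, e.g.:
// const SLASH: Option<usize> = memchr_const(b'/', b"a/b");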

Runtime configuration of byte frequency table used to classify rare bytes

memchr implements a generic SIMD accelerated search that is ideal for implementing something like CheatEngine where you scan the memory of an executable process to aid reverse engineering. This process involves repeatedly scanning for possibly millions of small values (u16, i32, ...) in the memory of that process. The user might have information ahead of time about the frequency distribution of bytes in the memory being scanned, which may vary wildly between executables. The user might also be able to control the program to ensure certain rare bytes appear in the program's memory at certain times.

There is an issue that prevents memchr from performing optimally when scanning binary executables: the byte frequency table. The core algorithm is based on detecting rare bytes at specific positions in the haystack (the prefilter) and then testing those candidates to check whether the needle has been found. As mentioned in the incredibly detailed comments, the performance of this algorithm is highly dependent on the byte frequency table used to decide what counts as a rare byte. While the table included in memchr is near-optimal for the majority of cases, some specific data types have very different byte frequency distributions, which causes memchr to perform worse on those inputs than it could with a different byte frequency table.

To illustrate this point, consider the following byte frequencies (where ideal is the ideal frequency for an x86 binary):

byte   memchr   ideal
\x00   55       255
\xdd   255      0
\x8b   80       186
H      150      254

Now, consider scanning for the needle H\x00\xdd\x8b in an x86 binary. memchr would identify \x00 and \x8b as the rarest bytes, when they are in fact common bytes. Even if memchr considered \x00 to be a frequent byte via configuration, it would still choose H and \x8b as the rarest bytes, which are both much more common than \xdd, the only actually rare byte. This would result in a lot of unnecessary false positives, decreasing the throughput. This is a simple case, but it is easy to extend this idea to many other pathological input sequences that defeat the default frequency table, and might also reasonably appear in an executable or be scanned for by a user.

Now consider a haystack that contains HHH\x00\xdd\x8b. The user might know in advance that searching for HHH\x00 and searching for H\x00\xdd\x8b will both return a single unique match, the sub-slice that was mentioned earlier (the exact indices are not identical but that is not the point). The user might also know that \xdd is a very rare byte in their dataset. The user should be able to choose scanning for \xdd instead of a more common byte to speed up their searches. I cannot imagine how to support something like this without providing the user a mechanism for customizing the byte frequency table.

The proposed solution is to allow the user to specify the byte frequency table at runtime by modifying the memchr::memmem::rarebytes::rank function. Currently, this function reads from the global byte frequency table.

My first idea was to create an enum that can be provided to a FinderBuilder and then forwarded to RareNeedleBytes to choose the table:

enum ByteFrequencies<'a> {
    Default,
    Custom(&'a [u8; 256]),   
}

This enum can be stored in the NeedleInfo struct and used at runtime to determine which byte frequency table to use. However, this introduces the lifetime 'a, which may or may not be the same as the needle ('n) and haystack ('h) lifetimes that are stored in related structs. Considering lifetime 'a to be separate and different requires the public API of Finder to be changed to add this lifetime.

  • ByteFrequencies -> NeedleInfo -> Searcher -> Finder

I believe that the extra lifetime might make life more difficult for the compiler, which may be why I observed a small but noticeable (around 10%) impact on the performance of constructing a Finder with the default frequency table on my local machine.

Also, introducing a new member on the NeedleInfo struct changes the size/alignment properties of Finder, Searcher and NeedleInfo, which might also explain the performance impact I observed (if this sounds crazy to anyone, I suggest watching the wonderful talk by the legend Emery Berger titled 'Performance Matters' for more details: https://www.youtube.com/watch?v=r-TLSBdHe1A).

An idea to remove the generic lifetime from ByteFrequencies:

enum ByteFrequencies {
    Default,
    Custom(&'static [u8; 256]),   
}

However, I believe this static API is logically inconsistent with the FinderBuilder API. You can construct millions of unique Finders at runtime and then discard them later, but the same cannot be said for static arrays.

Also, the user might want to perform analysis of their specific corpus at runtime to generate a specialized byte frequency table (like 'pre-training'). This is a very interesting use case in the context of the analysis of binary executables, as there is a lot of information that can only be obtained at runtime and can be useful in optimizing many kinds of searches. Forcing the user to use a static byte frequency table would necessarily prevent this use-case.
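
To make the 'pre-training' idea concrete, here is a rough sketch (none of this is existing memchr API) of deriving a table from a sample corpus at runtime, using the same 0 = rare, 255 = common scale as the table above:

// Hypothetical helper: rank each byte value 0..=255 by how often it appears
// in a sample of the data that will later be searched.
fn build_frequency_table(corpus: &[u8]) -> [u8; 256] {
    let mut counts = [0u64; 256];
    for &b in corpus {
        counts[b as usize] += 1;
    }
    // Scale counts so the most common byte gets 255 and absent bytes get 0.
    let max = counts.iter().copied().max().unwrap_or(0).max(1);
    let mut table = [0u8; 256];
    for (i, &c) in counts.iter().enumerate() {
        table[i] = ((c * 255) / max) as u8;
    }
    table
}

A Custom(...) variant (whichever representation is chosen) could then carry such a table, or a reference to one.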

Another idea to remove the generic lifetime and also allow runtime generation of the byte frequency table:

enum ByteFrequencies {
    Default,
    Custom(Box<[u8; 256]>),
}

However, an issue with this approach is that the ByteFrequencies enum has a size of 16 bytes, most of which is wasted. Another issue is that conceptually we should probably be passing around some kind of reference to a byte table that can be reused, rather than copying the table for each construction, though that ultimately depends on benchmarks. Also, memory allocation is now required for an operation that has nothing to do with allocation (Rc, Arc and others have similar issues).

I also tried storing the byte table inline, but this had disastrous results on performance. This is probably because this extra storage pushed important members on related structs into new cache lines, which affected subsequent operations on these members.

enum ByteFrequencies {
    Default,
    Custom([u8; 256]),
}

One thing I have not tried yet, but which might be interesting, is re-organizing the members and memory layout of any struct that stores a ByteFrequencies object. This might allow using an inline byte frequency table, for example, but would likely result in breaking changes to the layout of public structs in memchr. Even just introducing the ByteFrequencies object already changes the memory layout of certain structs, and I am not sure whether that is undesirable or not.

All of this culminated in the pull request I submitted, but I realize now it is better to just lay it all out here and figure out the best path forward together. I appreciate any feedback you may have on these suggestions.

P.S. I think memchr is an incredible library and the code quality and detail of documentation definitely helped me greatly in understanding the internals and even being able to suggest this in the first place, so kudos.

Failure to build on Apple Silicon M1

Using my site code I attempted to build memchr using cargo build. The build.rs file fails to run with the following error:

$ cargo build
   Compiling proc-macro2 v1.0.24
   Compiling libc v0.2.73
   Compiling syn v1.0.48
   Compiling memchr v2.3.3
   Compiling log v0.4.11
   Compiling bitflags v1.2.1
   Compiling ryu v1.0.5
   Compiling serde_derive v1.0.117
error: failed to run custom build command for `memchr v2.3.3`

Caused by:
  process didn't exit successfully: `/Users/cadey/Code/site/target/debug/build/memchr-08053740b12295b3/build-script-build` (signal: 9, SIGKILL: kill)
warning: build failed, waiting for other jobs to finish...
error: build failed

I assume this may be a rustc bug as the error in question is identical to the error you get when you attempt to run an unsigned binary on an M1 Mac. I attempted to sign the binary manually with the codesign tool and it failed with this error:

$ codesign -s - /Users/cadey/Code/site/target/debug/build/memchr-08053740b12295b3/build-script-build
/Users/cadey/Code/site/target/debug/build/memchr-08053740b12295b3/build-script-build: replacing existing signature
/Users/cadey/Code/site/target/debug/build/memchr-08053740b12295b3/build-script-build: the codesign_allocate helper tool cannot be found or used

Should I file this as a rustc bug?

could we have a UTF-8 memchr?

This could branch out to {memchr, memchr2, memchr3} depending on the encoded length of the needle: char argument (a char takes 1 to 4 bytes in UTF-8). Alas, we have no memchr4 function.
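
In the meantime, a user-side sketch (memchr_char is my own name, not part of this crate) is to memchr for the leading byte of the char's UTF-8 encoding and then verify the full sequence:

// Find the byte offset of the first occurrence of `needle` in `haystack`,
// using memchr on the leading UTF-8 byte and verifying the full encoding.
fn memchr_char(needle: char, haystack: &str) -> Option<usize> {
    let mut buf = [0u8; 4];
    let needle_bytes = needle.encode_utf8(&mut buf).as_bytes();
    let bytes = haystack.as_bytes();
    let mut offset = 0;
    while let Some(i) = memchr::memchr(needle_bytes[0], &bytes[offset..]) {
        let start = offset + i;
        if bytes[start..].starts_with(needle_bytes) {
            return Some(start);
        }
        // False positive on the leading byte; resume just past it.
        offset = start + 1;
    }
    None
}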

Revisit Windows performance

I think it would be great if someone with access to a Windows environment could run benchmarks to revisit the built-in memchr vs this crate's fallback implementation.

Provide lines-iterator

I guess this came up before because it's so obvious: provide an iterator akin to std::str::lines, but using memchr to search for line endings. The impl in core::memchr came up in my benchmarks, and it turns out the code below is 2x to 3x faster than what std::str::lines() does, in both synthetic and real-life code.

pub fn lines(inp: &[u8]) -> impl Iterator<Item = &[u8]> {
    let mut inp = inp;
    std::iter::from_fn(move || {
        if inp.is_empty() {
            return None;
        }
        // Find the end of the current line (or take the rest of the input).
        let ending = memchr::memchr(b'\n', inp).unwrap_or(inp.len() - 1) + 1;
        let (mut line, rest) = inp.split_at(ending);
        inp = rest;
        // Strip a trailing "\n" or "\r\n".
        if let Some(b'\n') = line.last() {
            line = &line[..line.len() - 1];
            if let Some(b'\r') = line.last() {
                line = &line[..line.len() - 1];
            }
        }
        Some(line)
    })
}

pub fn str_lines(inp: &str) -> impl Iterator<Item = &str> {
    lines(inp.as_bytes()).map(|sl| unsafe { std::str::from_utf8_unchecked(sl) })
}

Is the performance difference (on my machine!) reason enough to include it in memchr? Feel free if so :-)

BUG: Cannot compile with newest version of XCode Command Line Tools

I am on macOS Big Sur 11.1 and just updated to the newest version of XCode Command Line Tools. After I did (and I assume it is something to do with that, because everything with memchr worked fine before that), I got the following error message when I tried to cargo check anything that had a transitive dependency on memchr.

error: linking with `cc` failed: exit code: 1
  |
  = note: "cc" "-m64" "-arch" "x86_64" "-L" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.0.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.1.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.10.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.11.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.12.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.13.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.14.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.15.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.2.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.3.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.4.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.5.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.6.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.7.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.8.rcgu.o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.build_script_build.55pqr4uh-cgu.9.rcgu.o" "-o" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/build/memchr-23bc3b7c5319ae9c/build_script_build-23bc3b7c5319ae9c.14i7c91is1si092v.rcgu.o" "-Wl,-dead_strip" "-nodefaultlibs" "-L" "/Users/cadenhaustein/MEGA/Coding_Projects/hedgehog/target/debug/deps" "-L" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib" 
"/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libstd-cf45c391193686b0.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libpanic_unwind-bfb82cdc97bd35ea.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libobject-0e543fa90fe41090.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libaddr2line-f50981f4143e4c69.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libgimli-bbe9b2276f9fe948.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/librustc_demangle-c04e87d408a5de4c.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libhashbrown-3865f13d7ece40bb.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/librustc_std_workspace_alloc-83f3487f53b2e684.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libunwind-518f93c579715cca.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libcfg_if-ab0ea20e972aeb4f.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/liblibc-50e4694516c58a71.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/liballoc-8171c7b795c55f62.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/librustc_std_workspace_core-8357f853e5f39333.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libcore-80c77ff1434731cf.rlib" "/Users/cadenhaustein/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libcompiler_builtins-8c8eeab435e54e85.rlib" "-lSystem" "-lresolv" "-lc" "-lm"
  = note: xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun


error: aborting due to previous error

error: could not compile `memchr`

Any advice on how to fix it?

Thanks!

I was working on something that can parse ninja.build files. It used to be twice as slow as ninja itself, but after replacing just a few lines with memchr and memchr3 from this crate, it is now almost twice as fast as ninja.

Thanks! :)

Potential use of `core::hint::unreachable_unchecked` to avoid bounds checks for all users

Consider the following code. It's used as a wrapper in one of my projects to avoid bounds checks in safe code.

#[inline]
fn chr(s: &[u8], b: u8) -> Option<usize> {
    memchr::memchr(b, s).map(|i| {
        if i >= s.len() {
            unsafe { core::hint::unreachable_unchecked() }
        }
        i
    })
}

I wonder if it's practical and sound to insert such unreachable hints into the memchr crate itself, so that all its users could get a performance increase. It's just a tentative suggestion that needs more discussion :)

drop libc and use vendor intrinsics

I've spent the evening thinking about this, and I think it's possible to do and match glibc's performance, but I haven't experimented yet. I'm going to take a crack at this soon.
