Comments (11)
Indeed:
#[inline]. This suggests that the function should be inlined, including across crate boundaries.
from simdutf8.
@lemire That is what I thought, but benchmarks proved otherwise. I will dig up the branch, run the benchmark again and post the comparison here.
Note also that you could just fall back on a scalar implementation when the platform is not supported. :-)
That is what it does already. It works similarly to simdjson, as far as I understand it. The function pointer is initialized to the get_fastest() function in validate_utf8_basic(). On the first invocation it checks which CPU features are supported and then replaces the function pointer to itself with the fastest available implementation for further invocations.
On architectures for which there currently is no SIMD implementation (e.g. ARM right now), the fallback method is compiled in.
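The self-replacing function pointer described above can be sketched roughly as follows. This is a minimal illustration, not the crate's actual code: the names `VALIDATE`, `select_fastest`, `detect_then_validate`, and `validate_avx2_placeholder` are hypothetical, and both "implementations" simply delegate to `std::str::from_utf8` as stand-ins for real scalar and SIMD kernels.

```rust
use std::mem;
use std::sync::atomic::{AtomicPtr, Ordering};

type ValidateFn = fn(&[u8]) -> bool;

// Stand-in for the scalar fallback (hypothetical; delegates to std).
fn validate_scalar(input: &[u8]) -> bool {
    std::str::from_utf8(input).is_ok()
}

// Stand-in for a SIMD kernel (hypothetical; a real one would use intrinsics).
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn validate_avx2_placeholder(input: &[u8]) -> bool {
    std::str::from_utf8(input).is_ok()
}

// Picks the best implementation for the CPU we are running on.
fn select_fastest() -> ValidateFn {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::is_x86_feature_detected!("avx2") {
            return validate_avx2_placeholder;
        }
    }
    // No SIMD implementation for this platform: use the fallback.
    validate_scalar
}

// Initial target of the pointer: detect once, overwrite the pointer
// with the chosen implementation, then handle the current call.
fn detect_then_validate(input: &[u8]) -> bool {
    let fastest = select_fastest();
    VALIDATE.store(fastest as *mut (), Ordering::Relaxed);
    fastest(input)
}

// The function pointer, stored type-erased so it can live in an atomic.
static VALIDATE: AtomicPtr<()> =
    AtomicPtr::new(detect_then_validate as ValidateFn as *mut ());

pub fn validate(input: &[u8]) -> bool {
    let raw = VALIDATE.load(Ordering::Relaxed);
    // SAFETY: only values cast from `ValidateFn` are ever stored in VALIDATE.
    let f: ValidateFn = unsafe { mem::transmute::<*mut (), ValidateFn>(raw) };
    f(input)
}
```

After the first call, every later call costs one relaxed atomic load plus an indirect call. Because the call is indirect, the compiler cannot inline the selected implementation into the caller, which matters for very short inputs.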
So the following advice might not be needed...
If there is no native implementation for your platform (yet), use the standard library instead.
Correct, though there is a slight performance penalty with the default LTO setting, as Rust does not inline function calls across crate boundaries while it does inline function calls to the std library.
Ah. So that explains why you would see weak performance on short strings.
Any way to lift this 'no inlining' limitation? It seems quite substantial.
Compiling with lto = "fat" or lto = "thin" enables cross-crate inlining. It is just not the default.
I will benchmark delegating the validation of byte sequences shorter than 64 bytes to the std library again with LTO enabled.
You might be able to get away with just slapping a #[inline] on the pub fns to hint on inlining across crate boundaries.
You are both right of course, not sure how I missed that. I have added the #[inline] attribute and will benchmark that and calling std::str::from_utf8() for small strings next.
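For illustration, the two ideas discussed here (an #[inline] hint on the public entry point, and handing short inputs to the standard library) might be combined along these lines. This is only a sketch under assumptions: `validate_utf8`, `validate_long`, and `SHORT_INPUT_THRESHOLD` are hypothetical names, the 64-byte cutoff is taken from the earlier comment rather than from any measurement, and `validate_long` is a placeholder for the SIMD path.

```rust
// Cutoff below which inputs are handed to std (64 bytes, per the thread).
const SHORT_INPUT_THRESHOLD: usize = 64;

// `#[inline]` makes the body available to other crates, so the length
// check and the std call can be inlined at the call site even without LTO.
#[inline]
pub fn validate_utf8(input: &[u8]) -> bool {
    if input.len() < SHORT_INPUT_THRESHOLD {
        // Short input: the standard library call is cheap and inlinable.
        return std::str::from_utf8(input).is_ok();
    }
    validate_long(input)
}

// Placeholder for the SIMD path; kept out of line so that only the small
// wrapper above gets duplicated into callers.
#[inline(never)]
fn validate_long(input: &[u8]) -> bool {
    std::str::from_utf8(input).is_ok()
}
```

Whether this actually helps depends on the workload, so it would need to be confirmed by the benchmarks mentioned above.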
Fixed in v0.1.1.