ByteDance's next-generation universal high-performance Parquet Reader.
The Bolt Parquet Reader is a native Parquet reader written in Rust.
Its design supports streaming reads: a large batch is read as a sequence of smaller batches, which reduces peak memory usage. As a result, more reads can run in parallel within the same memory budget.
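The streaming idea can be illustrated with a minimal sketch. It is not the Bolt Parquet Reader API: `StreamingReader`, `read_batch`, and `sum_streaming` are hypothetical names, and the "decode" step is simulated (row `i` has value `i * 2`) in place of real Parquet page decoding. The point it shows is that only one small batch of decoded values is held in memory at a time.

```rust
/// Hypothetical column source; stands in for decoding Parquet pages.
/// This is an illustration only, not the Bolt Parquet Reader API.
struct StreamingReader {
    total_rows: usize,
    next_row: usize,
}

impl StreamingReader {
    fn new(total_rows: usize) -> Self {
        Self { total_rows, next_row: 0 }
    }

    /// Decodes up to `max_rows` values into `buf`, reusing its allocation.
    /// Returns the number of rows produced; 0 means nothing more was read.
    fn read_batch(&mut self, buf: &mut Vec<i64>, max_rows: usize) -> usize {
        buf.clear();
        let end = (self.next_row + max_rows).min(self.total_rows);
        // Simulated decode: the value of row i is i * 2.
        buf.extend((self.next_row..end).map(|i| i as i64 * 2));
        let produced = end - self.next_row;
        self.next_row = end;
        produced
    }
}

/// Sums the column while holding at most `batch_size` decoded values.
/// Returns the sum and the largest number of rows held at once.
fn sum_streaming(total_rows: usize, batch_size: usize) -> (i64, usize) {
    let mut reader = StreamingReader::new(total_rows);
    let mut buf = Vec::with_capacity(batch_size);
    let mut sum = 0i64;
    let mut peak_rows = 0usize;
    while reader.read_batch(&mut buf, batch_size) > 0 {
        peak_rows = peak_rows.max(buf.len());
        sum += buf.iter().sum::<i64>();
    }
    (sum, peak_rows)
}
```

Calling `sum_streaming(10, 4)` processes the ten rows in batches of 4, 4, and 2, so the buffer never holds more than four values, whereas a non-streaming read would materialize all ten at once.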
Moreover, the Bolt Parquet Reader is designed with sophisticated filter push-down strategies and range selectivity operations. Because reads proceed over consecutive row ranges, this feature reduces unnecessary branching operations.
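One way to picture range-based selection is the sketch below. It is an assumption about the general technique, not the reader's actual implementation: `mask_to_ranges` and `gather_ranges` are hypothetical helpers. A filter result is collapsed into consecutive row ranges once, and the selected values are then copied range by range instead of testing a condition for every row.

```rust
use std::ops::Range;

/// Collapses a per-row selection mask into consecutive half-open row ranges.
/// Hypothetical helper for illustration; not part of the Bolt API.
fn mask_to_ranges(mask: &[bool]) -> Vec<Range<usize>> {
    let mut ranges = Vec::new();
    let mut start: Option<usize> = None;
    for (i, &keep) in mask.iter().enumerate() {
        match (keep, start) {
            // A selected row opens a new range.
            (true, None) => start = Some(i),
            // The first rejected row closes the open range.
            (false, Some(s)) => {
                ranges.push(s..i);
                start = None;
            }
            _ => {}
        }
    }
    // Close a range that runs to the end of the mask.
    if let Some(s) = start {
        ranges.push(s..mask.len());
    }
    ranges
}

/// Copies the selected rows one range at a time.
/// Each range is a single bulk copy with no per-row branch.
fn gather_ranges(values: &[i64], ranges: &[Range<usize>]) -> Vec<i64> {
    let total: usize = ranges.iter().map(|r| r.len()).sum();
    let mut out = Vec::with_capacity(total);
    for r in ranges {
        out.extend_from_slice(&values[r.clone()]);
    }
    out
}
```

In this sketch, building the ranges still inspects each row once. The benefit comes when the same ranges are reused across several columns, or when they can be derived from coarser information, so the per-row check is not repeated during the copy.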
This project is under active development. You are more than welcome to contribute.
git clone https://github.com/bytedance/bolt-parquet-reader.git
# We recommend using this toolchain version to enable zero-copy features.
rustup install nightly-2023-11-13
cargo +nightly-2023-11-13 fmt --all -- --check
cargo +nightly-2023-11-13 build --package bolt-parquet-reader --lib
cargo +nightly-2023-11-13 test --verbose
cargo +nightly-2023-11-13 clippy --verbose
The Bolt Parquet Reader is licensed under Apache 2.0.
During development, we referred extensively to the Rust Arrow 2 implementation and would like to express our appreciation to its authors.