Comments (6)
Mine is a private cluster deployment, on disk. Will this matter? Does it affect the application's ability to read the file?
from orc.
Hi, @liangrui1988. Please refer to the ORC spec on our official website.
Here is the formula we use. Historically, padding exists simply to align with the underlying file system's block size. If you are using S3, your program will not read that part at all, so there is no impact on modern cloud infrastructure.
orc/java/tools/src/java/org/apache/orc/tools/FileDump.java
Lines 771 to 780 in 6b053d4
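The referenced snippet is not reproduced in this thread. As a rough illustration only, and assuming padding is counted as the gap between the end of one stripe and the start of the next, a padding percentage could be computed along these lines. The `Stripe` class, the method names, and the sample offsets are hypothetical stand-ins, not the ORC API or the actual `FileDump` code.

```java
import java.util.Arrays;
import java.util.List;

public class PaddingSketch {
    /** Hypothetical stand-in for a stripe's position in a file (not the ORC API). */
    static final class Stripe {
        final long offset; // byte offset where the stripe starts
        final long length; // total stripe length in bytes

        Stripe(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /** Sums the gaps between the end of each stripe and the start of the next one. */
    static long totalPaddingBytes(List<Stripe> stripes) {
        long padded = 0;
        for (int i = 1; i < stripes.size(); i++) {
            Stripe prev = stripes.get(i - 1);
            padded += stripes.get(i).offset - (prev.offset + prev.length);
        }
        return padded;
    }

    /** Padding as a percentage of the file length; 0 for an empty file. */
    static double paddingPercent(long paddedBytes, long fileLength) {
        return fileLength == 0 ? 0.0 : 100.0 * paddedBytes / fileLength;
    }

    public static void main(String[] args) {
        // Made-up layout: three stripes with unused gaps between them.
        List<Stripe> stripes = Arrays.asList(
                new Stripe(3, 100),
                new Stripe(128, 100),
                new Stripe(256, 50));
        long padded = totalPaddingBytes(stripes);
        System.out.println("padded bytes: " + padded);
        System.out.println("padding percent: " + paddingPercent(padded, 400));
    }
}
```

The gaps are bytes that a reader skips over when it seeks to each stripe's offset, which is consistent with the point above that padding is not read.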
Does reading and writing with different versions of ORC matter? For example, writing files with 1.5.12 and reading them with 1.5.5? Thank you.
Sorry, but why don't you test on your actual private cluster? In unknown cases it's really up to you. We have no recommendations for private cluster deployments because we don't know what you are using or what you are referring to, @liangrui1988.
This was caused by an ORC parameter, not by the cluster. The writer was reducing the ORC stripe size; I restored it to 256MB.
Could you provide a reproducible example with the latest Apache ORC 1.7.x, @liangrui1988?
FYI, Apache ORC 1.5.x is EOL and 1.6.x will reach EOL soon.