Describe the bug We've identified some cases with long strings co


[BUG] Prevent silent fallback to uncompressed when writing parquet files with ZSTD compression about cudf HOT 2 CLOSED

GregoryKimball commented on May 24, 2024 4


from cudf.

Comments (2)

GregoryKimball commented on May 24, 2024 3

Hello @mhaseeb123, after some investigation with @vuule and @etseidl, we think that a good option here could be changing the libcudf and cuDF-python default dictionary_policy to ADAPTIVE. Would you please create a draft PR to change the default? I would like to request evaluation by Spark in the next week or two (FYI @revans2 and @nvdbaranec)


vuule commented on May 24, 2024 1

That was fast!

Should we adjust the behavior of ADAPTIVE to be less restrictive?
One proposal that came out of discussion with Ed is to match the compression block limit if it exists (if not, behavior is the same as ALWAYS) and to follow the user-specified limit if it's set. Just defaulting to ADAPTIVE and keeping a hard-coded limit might lead to larger files, because we would give up on dictionaries even when they don't interfere with compression.
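
To make the proposal concrete, here is a minimal sketch of the decision it describes. This is not libcudf code: `DictionaryPolicy`, `use_dictionary`, and all parameter names are hypothetical, and giving the user-specified limit precedence over the compression block limit is my reading of the comment rather than something it states.

```python
from enum import Enum
from typing import Optional


class DictionaryPolicy(Enum):
    # Hypothetical stand-in for the NEVER / ADAPTIVE / ALWAYS policies discussed above.
    NEVER = 0
    ADAPTIVE = 1
    ALWAYS = 2


def use_dictionary(
    policy: DictionaryPolicy,
    dictionary_size: int,
    compression_block_limit: Optional[int] = None,
    user_limit: Optional[int] = None,
) -> bool:
    """Decide whether to keep dictionary encoding for a column chunk.

    Assumption: a user-specified limit, when set, overrides the limit
    imposed by the compression codec's block size.
    """
    if policy is DictionaryPolicy.NEVER:
        return False
    if policy is DictionaryPolicy.ALWAYS:
        return True

    # ADAPTIVE: prefer the user's limit, else the compressor's block limit.
    limit = user_limit if user_limit is not None else compression_block_limit
    if limit is None:
        # No limit from either source: behave the same as ALWAYS.
        return True
    return dictionary_size <= limit
```

With this shape, an ADAPTIVE writer only drops the dictionary when it would exceed a limit that actually applies, instead of comparing against a fixed hard-coded size.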

