Comments (9)
Nice :)
In theory LZMA2 only supports a dictionary of up to 1.5 GB, but in practice the decoder can handle more than this. There may be decoders that balk at 2 GB. I'll try to find out whether raising the limit will cause any trouble.
from fast-lzma2.
Thanks, Conor. 👍🏻
With only 16 GB of RAM to test on, memory pressure was too high for extensive testing of the 2 GB dictionary (although decoding with e.g. standard p7zip 16.02 worked flawlessly). Running with a 1.5 GB dictionary now (I don't know how to set that limit in the source, as it's defined in powers of two). Raising match-finder cycles did not improve compression, so my max-compression command currently looks like this:
7za a -mx -myx -ms1024t -mqs -m0=flzma2:a3:d1536m:fb273:mc64:mt16
I haven't figured out exactly how compression level (x), compression analysis (yx), and compression mode (a) influence each other in flzma2. Your source comments mention 3 compression modes, each with 10 different compression levels, and a plethora of other parameters that appear to be unalterable via command-line switches.
While we're at it, what would the absolute fastest compression setting be? I'm currently using
7za a -mx1 -m0=flzma2:a0
(which runs north of 200 GB/h, by the way), but I have a feeling there's room for further improvement: CPU load doesn't come close to 100% at that setting, and it's not disk-speed bound either.
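For scale, the quoted throughput can be sanity-checked with a quick conversion (assuming decimal gigabytes; the figure is the commenter's, the arithmetic is just illustration):

```python
# Back-of-envelope: convert the observed ~200 GB/h into MB/s.
gb_per_hour = 200
mb_per_second = gb_per_hour * 1000 / 3600
print(f"{mb_per_second:.1f} MB/s")  # prints "55.6 MB/s"
```

At roughly 55 MB/s a modern drive is nowhere near saturated, which is consistent with the observation that the run is not disk-bound.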
The 10 compression levels are made by combining a mode setting with a number of other settings, so there aren't 10 per mode. The problem with levels is that the best combination of settings depends on the type of data, so results may not be consistent when comparing levels 1-10.
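The idea that each level bundles a mode with several other settings can be sketched roughly like this. All names and values here are invented for illustration; they are not taken from the Fast LZMA2 source:

```python
# Hypothetical level table: each level fixes a strategy plus other knobs,
# so levels are combinations of settings, not "10 per mode".
LEVELS = {
    1: {"strategy": 0, "dict_log": 20, "fast_bytes": 32},
    5: {"strategy": 1, "dict_log": 24, "fast_bytes": 64},
    9: {"strategy": 3, "dict_log": 27, "fast_bytes": 273},
}

def params_for(level: int) -> dict:
    # Pick the nearest defined level at or below the requested one.
    key = max(k for k in LEVELS if k <= level)
    return LEVELS[key]
```

Because the bundled settings interact differently with different data, two adjacent levels can rank differently depending on the corpus, which is why comparisons across levels 1-10 may be inconsistent.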
Fast compression will probably never compare well with hashing strategies, because the algorithm has no advantage on small dictionaries. Also, the initial step at depth 0 is single-threaded.
From studying the source and some initial trial and error, I've learned that -m0=flzma2:a(0-3) sets the compression strategy, the dictionary is indeed limited to 1024m (why?!?), fast bytes top out at 273, and match-finder cycles go up to mc64. Compression levels seem to be set via the standard -mx(0-9) switches, analysis levels -myx(0-9) appear unchanged(?), and the same seems to be the case for literal context, literal position, and position bits. I'll study the source further for additional parameters to fiddle with.
On a side note: Wow, this thing is fast!!! Minimal memory use and very good compression.
Amazing job, Conor! Thank you for this gift to the world! =)
Thanks for your comments :)
You must be referring to the 7-zip-zstd implementation. Yes, there isn't much documentation that I recall. The Fast LZMA2 library documentation combined with the source for the 7-zip interface should cover everything. The 1024 MB dictionary limit is a legacy of configuration code from Zstandard, which accepts only logarithmic sizes. I have updated the Fast LZMA2 library to fix this, but FL2_DICTSIZE_MAX still limits it to 1024 MB. This needs to be fixed and tested.
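The consequence of logarithmic sizing can be shown in a few lines. The function name below is invented for illustration; the point (only powers of two are expressible, so a 2 GiB dictionary needs one more bit of log than the 1024 MiB cap allows) follows from the comment above:

```python
MiB = 1 << 20

def dict_size_from_log(dict_log: int) -> int:
    # Zstd-style logarithmic sizing: the size is 2**dict_log, so the
    # available steps jump straight from 1024 MiB to 2048 MiB with
    # nothing in between (e.g. no 1536 MiB).
    return 1 << dict_log

print(dict_size_from_log(30) // MiB)  # 1024 -- the current cap
print(dict_size_from_log(31) // MiB)  # 2048 -- what a 2 GiB dictionary needs
```

This also explains the earlier remark that the limit is "2^n defined": under a log-based scheme there is no way to request an intermediate size like 1.5 GiB.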
I'm using your excellent library inside the p7zip dev branch (https://github.com/szcnick/p7zip). Had to wrangle with the source a bit, but finally got it to compile on macOS. I am truly amazed at the speed gains and almost laughably low memory requirements for multithreading. Beautiful.
Oh, and yes - the dictionary limit is the only thing holding it back. I'm maxing out the settings with a 1 GB dictionary on 16 threads here on my machine, and it's barely using 7 GB of memory. A dictionary size of 2 GB would be perfect to compensate for the slightly lower compression ratio (compared to a 1.5 GB dictionary in memory-munching "slow" LZMA).
So, just for the fun of it, I compiled again with the dictionary limit raised to 2 GB (no other modifications). On a 20 GB corpus, the compression ratio improved significantly: archive size went from 5.45 GB (with the 1 GB dictionary) to 5.18 GB (with 2 GB). Compression time went up from 29 to 35 minutes, and memory usage from 7 GB to 14 GB. Will try increasing radix cycles next. :)
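Putting the reported figures side by side (simple arithmetic on the numbers above, nothing more):

```python
corpus_gb = 20.0
size_1g, size_2g = 5.45, 5.18  # archive sizes reported above, in GB
time_1g, time_2g = 29, 35      # compression times, in minutes

ratio_1g = corpus_gb / size_1g            # compression ratio, 1 GB dictionary
ratio_2g = corpus_gb / size_2g            # compression ratio, 2 GB dictionary
saving = (size_1g - size_2g) / size_1g    # relative archive-size reduction
slowdown = (time_2g - time_1g) / time_1g  # relative time increase

print(f"{ratio_1g:.2f}x -> {ratio_2g:.2f}x, "
      f"{saving:.1%} smaller, {slowdown:.1%} slower")
# prints "3.67x -> 3.86x, 5.0% smaller, 20.7% slower"
```

Memory use also scaled roughly linearly with dictionary size here (7 GB with a 1 GB dictionary, 14 GB with 2 GB), consistent with the earlier observation about low multithreading overhead.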
Still unclear on some things: does mode (a0-3) override level (x0-10), or is it the other way around? Does analysis level (yx) have any effect at all? I'm getting inconclusive results here, so I'm a bit confused.