Comments (13)

fdegros commented on May 3, 2024

Trying to reproduce with a recent mount-zip 1.0.13 and libzip 1.10.1.

$ mount-zip --version
mount-zip version: 1.0.13
libzip version: 1.10.1
FUSE library version: 2.9.9
fusermount3 version: 3.14.0
using FUSE kernel interface version 7.19

I downloaded the file android-ndk-r26b-linux.zip.

$ ls -lh
total 639M
-rw-r--r-- 1 francois francois 639M Apr 18 18:32 android-ndk-r26b-linux.zip

For comparison, unzipping this ZIP with unzip takes 27 seconds on my computer. The unzipped archive contains 526 directories and 7925 files for a total of 16 GB.

$ time unzip -d out android-ndk-r26b-linux.zip
...
real    0m26.618s
user    0m21.912s
sys     0m4.273s

$ tree -a --du -h out
...
  16G used in 526 directories, 7925 files

$ rm -r out

Mounting the archive with mount-zip only takes 0.11 seconds.

$ time mount-zip android-ndk-r26b-linux.zip mnt

real    0m0.114s
user    0m0.062s
sys     0m0.053s

$ tree -a --du -h mnt
...
  16G used in 526 directories, 7925 files

Copying all the files from the mounted ZIP using cp -R takes 16 seconds. This recursive copy opens, decompresses and copies every single file from the mounted ZIP, so it exercises the whole FUSE + mount-zip + libzip stack. Surprisingly, this is faster than extracting the archive with unzip.

$ time cp -R mnt out

real    0m16.286s
user    0m0.219s
sys     0m3.473s

$ tree -a --du -h out
...
  16G used in 526 directories, 7925 files

$ rm -r out

$ umount mnt

So, I don't know why you observed such slow access times. One hypothesis is that your access pattern repeatedly decompresses a big file in the archive. If this is the case, it might be beneficial to use the --precache option with mount-zip. This option decompresses every single file up front at mount time, which takes about 16 seconds here.

$ time mount-zip --precache android-ndk-r26b-linux.zip mnt
...
real    0m15.889s
user    0m11.559s
sys     0m2.928s

After that, copying all the files from the mounted ZIP only takes 10 seconds. And every access pattern should be equally fast, since there is no decompression involved in the process anymore.

$ time cp -R mnt out

real    0m10.142s
user    0m0.196s
sys     0m3.743s

$ tree -a --du -h out
...
  16G used in 526 directories, 7925 files

from mount-zip.

fdegros commented on May 3, 2024

You are using an "old" version of mount-zip (1.0.7), which does not have the --precache option. This option was added in version 1.0.8.

I'm going to close this bug as "resolved", since I guess that the --precache option might solve your issue. Please reopen if you're still seeing slow access patterns with a recent version of mount-zip and while using the --precache option.

tenzap commented on May 3, 2024

I can't use a more recent version, because libzip is not recent enough in Debian. (See also #20.)

What does precache do? Does it store the data somewhere and as a consequence take some disk space? The reason I don't uncompress the zip is that there isn't enough disk space to do so, so if precache does that, I can't use it.

tenzap commented on May 3, 2024

My access pattern is building a large application where the toolchain (clang, cmake, binutils, sysroot, libs, includes...) is accessed through fuse-zip/mount-zip.

tenzap commented on May 3, 2024

I just tried with the master branch (i.e. 1.0.13 with support for older libzip) + libzip 1.7.3, and without --precache it is still much slower than fuse-zip.

Since fuse-zip & mount-zip both use libzip 1.7.3, slowness seems to come from mount-zip, and not from the use of an old version of libzip.

precache is not an alternative because I don't want to consume disk space and it looks like precache uncompresses the zip to disk.
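
To gauge whether a disk-backed cache would even fit, one could compare the uncompressed size of the archive with the free space in the cache directory. This is only a rough sketch: the helper name fits_in_dir is hypothetical, and it assumes that the cache needs about as much room as the uncompressed data and that it lives in $TMPDIR (or /tmp), neither of which is confirmed at this point in the thread.

```shell
# Hypothetical helper: succeed if directory $2 has at least $1 KiB available.
# Uses POSIX df (-P: portable one-line-per-filesystem output, -k: 1024-byte units).
fits_in_dir() {
  needed_kib=$1
  avail_kib=$(df -Pk "$2" | awk 'NR == 2 { print $4 }')
  [ "$avail_kib" -ge "$needed_kib" ]
}

# The archive discussed above unpacks to roughly 16 GB,
# i.e. about 16 * 1024 * 1024 KiB.
if fits_in_dir $((16 * 1024 * 1024)) "${TMPDIR:-/tmp}"; then
  echo "enough free space for a full on-disk cache"
else
  echo "not enough free space for a full on-disk cache"
fi
```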

fdegros commented on May 3, 2024

See the discussion on bug #20.

It seems that mount-zip fails to create a cache file with the O_TMPFILE flag in the tmp dir when the underlying filesystem is overlayfs. This impedes the caching mechanism and results in poor performance when faced with non-sequential access to the contained files. See also the documentation.

This also explains why the other ZIP mounter fuse-zip does not exhibit this performance degradation, since it caches all the uncompressed data in memory.

I can think of several solutions or workarounds:

  1. Use a suitable filesystem to host the tmp dir, such as ext2, ext3, ext4 or tmpfs.
  2. Modify the cache file creation code in mount-zip to avoid using O_TMPFILE.
  3. Modify the cache file creation code in mount-zip to use an anonymous in-memory file created by memfd_create, possibly as a backup solution if the cache file cannot be created in the tmp dir.
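
For workaround 1, a minimal shell sketch of checking which filesystem hosts the tmp dir before mounting. It assumes a stat that supports -f and -c (as GNU coreutils does); the helper name fs_type is made up for illustration. The --cache=DIR usage in the final comment is an assumption too, so check mount-zip --help before relying on it.

```shell
# Hypothetical helper: print the type of the filesystem hosting a directory.
# -f queries the filesystem rather than the file; -c %T prints its type name.
fs_type() {
  stat -f -c %T "$1"
}

# Assumption: the cache file is created in $TMPDIR, or in /tmp if unset.
cache_dir="${TMPDIR:-/tmp}"
echo "Cache dir $cache_dir is on: $(fs_type "$cache_dir")"

# If this reports an overlay filesystem, a directory on tmpfs or ext4 may
# behave better. Note that tmpfs keeps its data in RAM. For example:
#   mount-zip --cache=/dev/shm android-ndk-r26b-linux.zip mnt   # assumed usage
```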

tenzap commented on May 3, 2024

Performance is now satisfactory with latest master in my setup, whether I use --precache or not. Thank you.
I also updated the Launchpad PPA, which now includes all your recent changes.

fdegros commented on May 3, 2024

Can you please check again with the latest changes at commit 87c5d16?

tenzap commented on May 3, 2024

It is not that easy to check every little change. :/

tenzap commented on May 3, 2024

However, a quick check (without --precache) on my system, where it used to be slow, looks fine.

And with precache:

mount-zip: The filesystem of '/tmp' does not support O_TMPFILE
mount-zip: Created cache file '/tmp/AWWJ3o'

With --precache it could be slower than without it, but I'm not sure.

tenzap commented on May 3, 2024

How is one supposed to invoke mount-zip to have "Using memory cache"? without --cache, or with --cache=? Maybe this should be documented somewhere.

fdegros commented on May 3, 2024

Thanks for the verification. It looks good.

> How is one supposed to invoke mount-zip to have "Using memory cache"? without --cache, or with --cache=?

Yes, that's right. You can use --cache= at the moment in order to experiment with the memory cache.

However, this feels a bit like an undocumented hack. I'm thinking about adding a separate and properly documented command-line option for that. Maybe something like --memcache.

fdegros commented on May 3, 2024

I added the --memcache option. Feel free to experiment with it.
