
Comments (4)

marius-enlock commented on September 27, 2024

Ah, the problem is only present in the MINGW64 terminal that came installed with Git.
On cmd, PowerShell, and the Visual Studio Code terminal, the speed is high for memfs.

Edit: I was wrong in this comment. The problem still persists in PowerShell and the Visual Studio Code terminal, with both memfs and my custom file system.

The problem seems to be with dd, since cp on memfs is fast in both the Visual Studio Code terminal and PowerShell.

from cgofuse.

marius-enlock commented on September 27, 2024

I can see that on Windows cp uses a block size of 1 MiB. With memfs, dd is faster when it also uses a 1 MiB block size, but it is still slower than cp:

PS C:\Users\ThrowAway\go\bin\t1> dd if=C:\Users\ThrowAway\Desktop\1GFile of=1GFile status=progress bs=1048576
318767104 bytes (319 MB, 304 MiB) copied, 12 s, 26.5 MB/s
309+0 records in
309+0 records out
324009984 bytes (324 MB, 309 MiB) copied, 12.496 s, 25.9 MB/s


billziss-gh commented on September 27, 2024

My first recommendation in such cases is to make sure that you are using -o FileInfoTimeout=-1, which enables kernel caching. But since you are running your tests against the WinFsp MEMFS (which enables file caching by default), the problem must be something else.

I am not certain why dd would be slower than cp. I note that both programs use the POSIX API presented by Cygwin rather than the native Windows file API. There may be some issue in Cygwin's interpretation of POSIX APIs as Windows file APIs that causes the discrepancy. Or it might simply be that dd does something that is cheap on POSIX but expensive on Windows.

One way to explore the differences is to use FileSpy. This can be used to track all file system operations as posted to the WinFsp FSD (File System Driver). So you could run your experiment with dd and FileSpy enabled, and you could also run your experiment with cp and FileSpy enabled. We might then be able to tell why dd is slower by seeing how it is accessing the file system.


marius-enlock commented on September 27, 2024

This is somewhat unrelated to memfs, but optimizing my custom file system for a file block size of 16777216 bytes (16 MiB) yields a copy speed of 500 MB/s on Windows (compared to 40 MB/s when optimized for a 128 KiB file block size), which I am satisfied with for my use case.

Note that on Linux (Ubuntu) the optimum size in my experiments is 128 KiB; larger sizes degrade the cp speed (the 16777216-byte size above yields 117 MB/s, compared to 500 MB/s with 128 KiB). On macOS the best size is the same as on Windows, 16777216 bytes, which yields 1400 MB/s.
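
For illustration, the per-platform sizes reported above could be selected at run time roughly as follows. This is only a sketch: the function name preferredBlockSize and the fallback for other platforms are assumptions, and the numbers come from one set of experiments on one custom file system.

```go
package main

import (
	"fmt"
	"runtime"
)

// preferredBlockSize returns the file block size in bytes that performed best
// in the experiments described in this comment, keyed by a GOOS value.
// Hypothetical helper; the default case is an untested assumption.
func preferredBlockSize(goos string) int {
	switch goos {
	case "windows", "darwin":
		return 16 << 20 // 16777216 bytes (16 MiB)
	case "linux":
		return 128 << 10 // 131072 bytes (128 KiB)
	default:
		return 128 << 10 // assumption: fall back to the smaller size
	}
}

func main() {
	fmt.Printf("%s: block size %d bytes\n", runtime.GOOS, preferredBlockSize(runtime.GOOS))
}
```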

Using -o FileInfoTimeout=-1 just gave me "permission denied" at the start of the dd command: the file was created but 0 bytes were copied. I did measure the wall-clock time of my Getattr, though. Most calls return in 0 ns and a few take a few hundred nanoseconds, totaling less than 0.2 seconds over the copy of the 1GFile.
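
The kind of wall-clock measurement described above can be sketched in Go as below. This is not the code used in the experiment: callTimer and the getattr stand-in are hypothetical names, and a real cgofuse handler has a different signature. A reading of 0 ns most likely means the call finished within the resolution of the clock rather than taking no time at all.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// callTimer accumulates a call count and total wall-clock time.
// It is safe for concurrent use, since handlers may run on several threads.
type callTimer struct {
	calls atomic.Int64
	nanos atomic.Int64
}

// observe records one call that began at start.
func (t *callTimer) observe(start time.Time) {
	t.calls.Add(1)
	t.nanos.Add(int64(time.Since(start)))
}

// totals returns the number of calls and the accumulated duration.
func (t *callTimer) totals() (int64, time.Duration) {
	return t.calls.Load(), time.Duration(t.nanos.Load())
}

var getattrTimer callTimer

// getattr is a hypothetical stand-in for a file system's Getattr handler.
func getattr(path string) int {
	// time.Now() is evaluated when defer is executed, so it marks the start.
	defer getattrTimer.observe(time.Now())
	_ = path // a real handler would look up the file's attributes here
	return 0
}

func main() {
	for i := 0; i < 1000; i++ {
		getattr("/1GFile")
	}
	calls, total := getattrTimer.totals()
	fmt.Printf("Getattr: %d calls, %v total\n", calls, total)
}
```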

For memfs:

PS C:\Users\ThrowAway\go\bin\t1>  Measure-Command { cp C:\Users\ThrowAway\Desktop\1GFile 1GFile}


Days              : 0
Hours             : 0
Seconds           : 0
Milliseconds      : 483

There is also an increase in speed for dd with this block size:

PS C:\Users\ThrowAway\go\bin\t1> dd if=C:\Users\ThrowAway\Desktop\1GFile of=1GFile status=progress bs=16777216
1056964608 bytes (1.1 GB, 1008 MiB) copied, 8 s, 132 MB/s
64+0 records in
64+0 records out

I find the discrepancy between dd and cp interesting, but I will take a break from investigating the issue. I haven't looked at FileSpy yet.

