
Comments (44)

fhanau commented on August 27, 2024 2

I investigated this using LLDB and found the underlying cause – this has been resolved in ec10745. The infinite loop could happen when ECT uses a fast path designed for long runs of repeated bytes. In the fast path, the number of repeated bytes is counted and the repeated section is skipped to avoid updating the match finder at each position. The bug was that the match finder was not re-initialized properly afterwards, which left its binary structure corrupted and caused the infinite loop. I have published a new release (version 0.9.5) that contains the bug fix.

Thank you all for your input in investigating this! Feel free to reopen the issue if you still experience infinite loops, although they can sometimes be confused with ECT just running slowly when using a large number of iterations or aggressive filter options. The change above appears to fix the issue on my side, but it is difficult to rule out that there are still related problems due to its non-deterministic nature.
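
For readers unfamiliar with this class of bug, here is a toy sketch in Python of a run-skipping fast path. It is not ECT's code: the hash-chain finder, the names `ToyMatchFinder`, `scan` and `skip`, and the 8-byte run threshold are all assumptions made for illustration. It only shows how a finder that is not re-synchronised after a skipped run ends up disagreeing with the encoder about the current position; how the real structure became corrupted is not modelled here.

```python
def run_length(data: bytes, i: int) -> int:
    """Count how many consecutive bytes starting at i equal data[i]."""
    j = i
    while j < len(data) and data[j] == data[i]:
        j += 1
    return j - i


class ToyMatchFinder:
    """Toy hash chain, NOT ECT's real data structure.

    head[key] is the most recent position of a 3-byte key and
    prev[p] is the previous position that had the same key (-1 if none).
    """

    def __init__(self, data: bytes):
        self.data = data
        self.head = {}
        self.prev = [-1] * len(data)
        self.pos = 0  # position the finder believes the encoder is at

    def insert(self) -> None:
        """Record the key at the finder's current position and advance by one."""
        key = self.data[self.pos:self.pos + 3]
        self.prev[self.pos] = self.head.get(key, -1)
        self.head[key] = self.pos
        self.pos += 1

    def skip(self, n: int) -> None:
        """Re-synchronise after the encoder jumped over n bytes."""
        self.pos += n


def scan(data: bytes, resync: bool, min_run: int = 8):
    """Walk data, taking the fast path for runs of at least min_run bytes."""
    finder = ToyMatchFinder(data)
    i = 0
    while i < len(data):
        n = run_length(data, i)
        if n >= min_run:
            i += n  # fast path: skip the run without per-position updates
            if resync:
                finder.skip(n)  # the bookkeeping step this sketch is about
        else:
            finder.insert()
            i += 1
    return finder, i


data = b"abcabc" + b"\x00" * 20 + b"abcabc"
synced, end = scan(data, resync=True)
stale, _ = scan(data, resync=False)
print(synced.pos == end, synced.head[b"abc"])  # finder agrees with the encoder
print(stale.pos == end, stale.head[b"abc"])    # finder is 20 bytes behind
```

With `resync=False` the toy finder keeps inserting keys for bytes inside the skipped run while the encoder is already past it, so its tables describe the wrong part of the input.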

from efficient-compression-tool.

krishty commented on August 27, 2024 1

FWIW, I noticed ECT hanging on two PNGs of mine as well, and this problem surfaced last summer. I too use high optimization settings with multiple block-splitting passes (-30060).

I found one of the affected PNGs and I’m trying to reproduce the hang right now with the latest build. If successful, I will attach the image tomorrow.

krishty commented on August 27, 2024 1

Sorry, can’t reproduce my earlier issues. I clearly remember ECT 0.9.3 being stuck on the files for 20+ days without even reaching genetic filtering, but now both 0.9.3 and 0.9.4 complete the files within half an hour. Please disregard my report.

fhanau commented on August 27, 2024 1

I'll take a look in the coming days, but this might still be hard to track down. Knowing that it happens starting with 0.9.2 is certainly useful though.

fhanau commented on August 27, 2024 1

I did some testing with ASan and UBSan but wasn't able to find anything immediately suspicious. Looking at the code itself, 89d5622 is perhaps the most likely change set that could have introduced issues. I'll look at it in depth later on; if I find a potential cause, I'll try making a new binary so you can test whether that fixes the issues on your data set.

fhanau commented on August 27, 2024 1

Created a binary that disables GetMatches2() and related changes. I can't point to any specific reason why it would be responsible, but it looks like the most likely candidate.
ect.exe.gz
Code is available on the freeze-debug branch.
Let me know if you are still having issues with it; if not, that would really narrow down where the issue is happening.

fhanau commented on August 27, 2024 1

The problem is likely in the program logic itself – changing permissions or running in a VM will not affect this.

Since the given changes do not solve the problem, it'll be difficult for me to track it down without some local debugging. If you are comfortable sharing the data set, reproducing it locally within GDB should reveal where the issue occurs; otherwise I could run the program with ASan/UBSan for longer periods.
That being said, I am relatively busy with other things currently, so it might take a while for me to test and try to reproduce the issue.

MegaByte commented on August 27, 2024 1

I had the problem on both Intel and ARM Macs, so I don't think that makes a difference.

ts1985 commented on August 27, 2024 1

I also had the issue with an Intel CPU, so it's not a processor-related issue.

ObiWanCeleri commented on August 27, 2024

Note: Using ECT 0.9.1 in FileOptimizer actually resolves the issue, so this is likely an issue with ECT.

fhanau commented on August 27, 2024

I can't think of anything specific that might be causing this – it might be a multithreading issue, but ECT doesn't do anything specific to adapt to system load; that's handled by the OS.

Can you try to provide a minimal reproducible example? That will make this easier to track down. For example, check whether the issue still happens without --allfilters, without setting a custom mode, without the multithreading flags, when tested on a subset of files...
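
One way to organise that isolation is to generate the command-line variants mechanically, dropping one flag at a time. The Python sketch below is only an illustration: the helper name `drop_one_at_a_time` is made up, the flag list is the FileOptimizer command line reported elsewhere in this thread, and keeping `-strip` in every variant is an arbitrary choice.

```python
def drop_one_at_a_time(flags, keep=()):
    """Return (dropped_flag, remaining_flags) pairs, one per removable flag."""
    variants = []
    for flag in flags:
        if flag in keep:
            continue  # flags listed in keep are present in every variant
        variants.append((flag, [f for f in flags if f != flag]))
    return variants


# Flags as reported in this thread for the failing runs.
full = ["--mt-deflate", "--mt-file", "--allfilters", "-strip", "-90032"]

for dropped, remaining in drop_one_at_a_time(full, keep=("-strip",)):
    print(f"without {dropped}: ECT.exe {' '.join(remaining)} <file>")
```

Because the hang is intermittent, each variant probably needs to run over the whole file set more than once before a flag can be ruled out.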

ObiWanCeleri commented on August 27, 2024

The command line is not my doing; it's been crafted by the FileOptimizer team.
I will however try to isolate the problem as per your instructions. This might take some time since the problem seems to be random. Lastly, since ECT goes into an infinite loop there is no actual error number to report.

ObiWanCeleri commented on August 27, 2024

Will be doing a test tonight. Since the issue occurs on what seems like random files (never the same one), I've elected to test the same set of files (there are 3375 of them, so there's no lack of different types). Note: they are all MAME screenshots, to be more precise.

ObiWanCeleri commented on August 27, 2024

Sorry I was a bit busy. I'm running the batch files now. Will post the results tomorrow.
Here's the (DOS) batch file for the first run:

@Echo off
REM ECT but without --allfilters
SET count=1
FOR /F %%G in ('dir *.PNG /b') DO (CALL :Fonction "%%G")
Exit

:Fonction
Echo ==== %count% : %1% ====
Echo ==== %count% : %1% ==== >> ECT.LOG
ECT.exe --mt-deflate --mt-file -strip -90032 %1% >> ECT.LOG
set /a count+=1

Other batch files will use variations on line #10 as per your request

ObiWanCeleri commented on August 27, 2024

Note: The batch file prints the filename because ECT doesn't. Might I suggest a "-verbose n" switch?

ObiWanCeleri commented on August 27, 2024

Preliminary results, out of a set of 3375 .PNG files generated by MAME (Titles)

Original command line
--mt-deflate --mt-file --allfilters -strip -90032

removing the --allfilters argument
--mt-deflate --mt-file -strip -90032
--> Stopped at file #1177

removing the -90032 custom argument
--mt-deflate --mt-file --allfilters -strip
--> Completed all 3375 files, very quickly

Removing the --mt-deflate argument is a killer for speed. I will post the results as soon as possible.

So far the -90032 argument seems to be the issue; I'm assuming it means "compress level 9, repeat 32 times"?

fhanau commented on August 27, 2024

> So far the -90032 argument seems to be the issue; I'm assuming it means "compress level 9, repeat 32 times"?

It does not. Among other things, it causes data to be re-compressed 9 times to optimize block splitting, which is very slow and not recommended for most use cases. Removing multithreading is bound to be slow with these options.

Based on the above it sounds like you don't think it's caused by any specific file, but can you check if it hangs when just running on file #1177?

Otherwise, try running with the -9 flag; if that works, the feature mentioned above may be responsible for the issues you are encountering.

ObiWanCeleri commented on August 27, 2024

All tests done with ECT 0.9.4 BTW
Ok finished testing.

Removing the --mt-deflate and --mt-file arguments
--allfilters -strip -90032
--> Stopped at file #149

The culprit is -90032 fer sure.
Please note ... it was working fine in 0.9.1 but doesn't as of 0.9.2 (I will double-check that too)

Tell you what. I'm going to do the same tests with ECT 0.9.1 just to make sure.
I'll post the results here.

BTW I am keeping a log but I get no error message. When ECT goes off the rails, it keeps on eating cycles but doesn't output any result (that I can see).

QUESTIONS :
1 : I looked at the documentation and it says "consult performance.html" which is nowhere to be found?
2 : There is no mention in the documentation of the --mt-file switch. What does it do?
3 : If -90032 repeats 9 times what is the 0032 for?

ObiWanCeleri commented on August 27, 2024

Testing ECT 0.9.1 now with the following batch file:

@Echo off
REM ECT original command line from FileOptimizer
SET count=1
FOR /F %%G in ('dir *.PNG /b') DO (CALL :Fonction "%%G")
Exit

:Fonction
Echo ==== %count% : %1% ====
Echo ==== %count% : %1% ==== >> ECT.LOG
Echo [%DATE%  %TIME:~0,8%]
Echo [%DATE%  %TIME:~0,8%] >> ECT.LOG
ECT091.exe --mt-deflate --mt-file --allfilters -strip -90032 %1% >> ECT.LOG
set /a count+=1

This will at least tell you where the issue IS NOT.

ObiWanCeleri commented on August 27, 2024

I'm confirming here that I had an infinite loop with ECT 0.9.2.
Same 3375 .PNG files.
This one went into an infinite loop on file #23.

ObiWanCeleri commented on August 27, 2024

So far ECT 0.9.1 is going strong, up to file 314 of 3375. It's going to take a while. Will let the program run overnight.

ObiWanCeleri commented on August 27, 2024

ECT 0.9.1 processed all 3375 files with no issue.
Using command line ECT091.exe --mt-deflate --mt-file --allfilters -strip -90032 %1%

So let's recap:
Symptom: recent versions of ECT will eventually try to compress the same file forever, requiring a break. It seems to happen randomly and is not caused by one specific file.

  • Issue affects ECT 0.9.2, 0.9.3 and 0.9.4
  • Seems to affect specific people, so it might be CPU-dependent (I'm using an AMD Ryzen 5 2400G CPU)
  • Caused by the customization string (the switch with only numbers) (-90032)
  • ECT 0.9.1 works absolutely fine, so the temporary solution is to go back to that version
  • Not caused by FileOptimizer, since all my current tests are done with a DOS batch file
  • Not caused by multi-threading
  • Not caused by the --allfilters switch
  • Tested on a pack of 3375 files, so results are pretty definitive
  • Machine used to test is rock solid and the RAM was tested with Memtest86+, so it's not a RAM issue
  • ECT never outputs an error message, so ECT doesn't crash per se
  • Test system has 32GB of RAM and 8 threads, which should be plenty
  • No thermal issues; the system is water cooled

@fhanau Let me know if there is anything I can do on my side to help narrow down the issue even further. I'm available if you want me to run some code and output some extra info in a log file. At this point, this is in your hands.

ObiWanCeleri commented on August 27, 2024

@fhanau Thank you so much for your time! If there's anything I can do please let me know.

fhanau commented on August 27, 2024

Looked at the commit in detail and the newly introduced GetMatches2() in LzFind.c looks like the most likely culprit. Finding the exact cause will be difficult for me without being able to reproduce the issue though – what I can do is disable the function on a branch and provide a binary for that. If it turns out to fix it on your side, I can remove it for now so there's a stable version.

ObiWanCeleri commented on August 27, 2024

I'm on it.
I'm stuck at home for a few days because of a nasty neck ache (torticollis?) so I should be good to start the batch file.

ObiWanCeleri commented on August 27, 2024

Test started. 2.5% done.
We'll know by tonight if the test is conclusive.
Thank you!

ObiWanCeleri commented on August 27, 2024

@fhanau ECT went into an infinite loop at 4.4% (file #149).
Has been on the same file for an hour and a half. Every other file is compressed in 1-5 minutes.

Here is - again - the batch file used

@Echo off
REM ECT original command line from FileOptimizer
SET count=1
FOR /F %%G in ('dir *.PNG /b') DO (CALL :Fonction "%%G")
Exit

:Fonction
Echo ==== %count% : %1% ====
Echo ==== %count% : %1% ==== >> ECT.LOG
Echo [%DATE%  %TIME:~0,8%]
Echo [%DATE%  %TIME:~0,8%] >> ECT.LOG
ect.beta.exe --mt-deflate --mt-file --allfilters -strip -90032 %1% >> ECT.LOG
set /a count+=1

Will restart the compression with the same original files to make sure it wasn't a fluke, but as mentioned previously it's not file-dependent (as strange as it might seem).

ObiWanCeleri commented on August 27, 2024

2nd attempt went into an infinite loop at file #107.
Both passes were run with an elevated command prompt.
If you want me to run another version I'm still good to go.
I'm also open to run a version that creates a log file and then send you the results.

Last but not least, I have two virtual machine systems: VMware and VirtualBox. Would it help any if I ran ECT in such a system? I'm assuming both virtual machines just call the processor the same way the command prompt does?

ObiWanCeleri commented on August 27, 2024

Note: there is no error message in the Windows log system, in case anyone asks.

ObiWanCeleri commented on August 27, 2024

I've never used GDB (actually never heard of it!) but I'm willing to give it a shot if it can help you.
I'm an old timer (my first computer was an Atari 800) so I should be able to get it to work after fiddling around a bit :)
Note: for the kick of it, I did actually try the same script and ECT version on a virtual machine ... and ECT got into a loop at file #184.
Oh well. Was worth a try ...

tssajo commented on August 27, 2024

@fhanau Could this be the reason for the infinite loop? Please see: tjko/jpegoptim#148

Forking a process with open file streams requires that the stream be fflushed before the fork (even read only streams). Otherwise when the child exit()s it will clobber the seek position of the stream in the parent process, often resulting in an infinite loop as the end of the file is never found.

Technical details at:
https://stackoverflow.com/questions/50110992/why-does-forking-my-process-cause-the-file-to-be-read-infinitely/50112169#50112169

fhanau commented on August 27, 2024

I don't think so – ECT doesn't fork(), parallelism is based on pthread and C++ std::thread.

hyperjuni commented on August 27, 2024

I've been having the exact same issue described here: since about last summer, files randomly get stuck forever on ECT when using FileOptimizer. I went straight back to 0.9.1, since the update seemed to be around the date it started, and indeed with 0.9.1 it doesn't happen.

Just from looking at the changes in 0.9.2, could it perhaps be related to ECT not being able to write the file properly? I noticed that it would briefly show the optimized file size in the FileOptimizer window before getting stuck, but the size of the actual file doesn't change after it freezes up like that.

I usually compress multiple tiny files (<10 KB), so I can fairly quickly reproduce the issue and help with testing if needed :)

fhanau commented on August 27, 2024

> I usually compress multiple tiny files (<10kb), so I can fairly quickly reproduce the issue and help with testing if needed :)

That would be really helpful – if you can provide a file (or a small set of files) that reliably causes ECT to freeze that will make investigating much easier. My latest theory is that the issue is related to the match cache leading to issues when running ECT with several iterations, but it's hard to test without being able to reproduce the issue.

hyperjuni commented on August 27, 2024

You can find a huge amount of tiny PNG files in one of my repositories:
https://github.com/hyperjuni/Neki

Download it, filter out the PNGs and try using the latest FileOptimizer to compress them.
They're already compressed, but it doesn't matter - ECT will still freeze up after a few files if you try to re-compress them.

Sorry for the delayed reply, didn't notice the notification :(

MegaByte commented on August 27, 2024

I have a file that sometimes goes into a loop with just -9, and I think -9xxxx just makes it more likely to hit the problem.

fhanau commented on August 27, 2024

> I have a file that sometimes goes into a loop with just -9, and I think -9xxxx just makes it more likely to hit the problem.

Sorry for the delayed response – if you can send me the file, I'll try reproducing it.

MegaByte commented on August 27, 2024

xpromo-top-generic
Unfortunately, it's not guaranteed and may take many tries to run into the problem. I started to try to track it down and got as far as isolating it to somewhere in the area of ZopfliCalculateBlockSize / ZopfliCopyLZ77Store / CopyStats. It could happen during initial iterations or ultra paths.

Omniflux commented on August 27, 2024

When this occurs, active memory utilization (as reported by Task Manager) stops changing, despite CPU usage remaining high. When ECT is working correctly, the reported active memory utilization is very volatile.

I use this as an indicator to kill the process, and can then restart it. About half the time it will complete the file on the next try, but sometimes it takes 4-5 attempts.
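
The kill-and-restart routine above can be automated. The Python sketch below is a rough illustration with one deliberate substitution: it uses a plain wall-clock timeout rather than the memory-utilization signal described above, because a timeout needs only the standard library. The function name `run_with_retries` and its defaults are made up for this example.

```python
import subprocess


def run_with_retries(cmd, timeout_s, max_attempts=5):
    """Run cmd; if it exceeds timeout_s, kill it and try again.

    Returns (attempt_number, exit_code) for the first attempt that finishes,
    or None if every attempt timed out.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            # subprocess.run kills the child itself when the timeout expires.
            result = subprocess.run(cmd, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            continue  # treat the timeout as a hang and retry
        return attempt, result.returncode
    return None
```

A call might look like `run_with_retries(["ECT.exe", "-9", "image.png"], timeout_s=600)`, where the file name and the 600-second limit are placeholders. Since slow runs with many iterations can be mistaken for hangs, the timeout should be set well above the normal per-file time.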

ObiWanCeleri commented on August 27, 2024

Hey there.
A little update.
I'm planning an upgrade from a Ryzen 5 2400G to a Ryzen 7 5800X (whoohooo!).
If it is an architecture issue then this might make a difference.
Also the 2400G has integrated graphics and the 5800X does not. The 2400G has 4 cores and the 5800X has eight.
From what I can see, not everyone is having the issue, so my natural guess is that the problem is CPU-related.
Worst-case scenario, ECT gets stuck faster.
Back in about a week with some data.

ObiWanCeleri commented on August 27, 2024

So I successfully upgraded my CPU and all it did was make ECT lock up faster.
If anyone wants to try a bunch of PNGs, there are some interesting sets here:
https://pleasuredome.github.io/pleasuredome/mame/index.html
Download file called "Set: [MAME - Update EXTRAs (v0.260 to v0.261)]"
And use the included .ZIP file called "snap.zip"
Plenty of PNGs to compress in there.

fhanau commented on August 27, 2024

Thank you all for the input! After some trial and error I was able to reproduce ECT being stuck on @hyperjuni's image set – ECT was stuck on one file for 6 minutes when compressing the image set before I suspended it, even though compressing it alone takes a mere 10 seconds. Other files seem to be affected as well.
Unfortunately, this does not happen deterministically (ECT processes the data set successfully on some runs) and I have not been able to reproduce it in lldb, which could make debugging much easier. I'll see if ASAN finds any issues on the test set next, but this will take some more time to figure out.

ObiWanCeleri commented on August 27, 2024

> Thank you all for the input! After some trial and error I was able to reproduce ECT being stuck on @hyperjuni's image set – ECT was stuck on one file for 6 minutes when compressing the image set before I suspended it, even though compressing it alone takes a mere 10 seconds. Other files seem to be affected as well. Unfortunately, this does not happen deterministically (ECT processes the data set successfully on some runs) and I have not been able to reproduce it in lldb, which could make debugging much easier. I'll see if ASAN finds any issues on the test set next, but this will take some more time to figure out.

Hey Fhanau, thanks for all your work.
Really really really appreciated :)
Please celebrate what you want for the new year ...
and eternal cheers to the great floppy in the sky and/or the flying spaghetti monster! (also: whatever you put your trust in)
