
Comments (10)

hrj commented on May 27, 2024

I just realized that the original jar files didn't compress their contents and hence the huge difference in sizes. Presumably, this was done to avoid decompression overhead.

However, in my tests, the time taken to run doppio with the compressed jar is almost the same as (and sometimes slightly less than) with the non-compressed jar. I tested with Chromium 47 and Firefox 43 on x86-64, served from localhost. The speed improvement should be even greater when loading over an external network.
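A comparison of this kind can be sketched outside the browser. The snippet below is only an illustration, not the test that was run: it uses Python's `zipfile` module (a jar is a zip archive) to build the same archive once with stored entries and once with deflated entries, then times reading every entry back. The entry names and contents are hypothetical stand-ins for class files, and timings from Python say nothing definite about doppio's JavaScript zip implementation.

```python
import io
import time
import zipfile


def build_jar(entries, compression):
    """Build an in-memory jar (a zip archive) using the given compression method."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=compression) as jar:
        for name, data in entries.items():
            jar.writestr(name, data)
    return buf.getvalue()


def read_all(jar_bytes):
    """Read every entry back; return (contents, elapsed seconds)."""
    start = time.perf_counter()
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        contents = {name: jar.read(name) for name in jar.namelist()}
    return contents, time.perf_counter() - start


# Hypothetical stand-in for class files: 200 entries of repetitive bytes.
entries = {
    f"pkg/Class{i}.class": b"\xca\xfe\xba\xbe" * 1024 + bytes([i % 251]) * 1024
    for i in range(200)
}

stored_jar = build_jar(entries, zipfile.ZIP_STORED)      # no compression
deflated_jar = build_jar(entries, zipfile.ZIP_DEFLATED)  # per-entry deflate

stored_contents, stored_seconds = read_all(stored_jar)
deflated_contents, deflated_seconds = read_all(deflated_jar)

print(f"stored:   {len(stored_jar)} bytes, read in {stored_seconds:.4f}s")
print(f"deflated: {len(deflated_jar)} bytes, read in {deflated_seconds:.4f}s")
```

The sizes are deterministic, but the timings vary between runs and machines, so a real benchmark would repeat the measurement several times and use actual JCL class files.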

from doppio_jcl.

jvilk commented on May 27, 2024

If you use gzip compression on the entire thing, it goes down to ~37MB total:

https://github.com/plasma-umass/doppio_jcl/releases/tag/v3.2

> Presumably, this was done to avoid decompression overhead.

Yup. The official JCL isn't compressed either for the same reason. I noticed considerable overhead in my tests and the unit tests.

I figured most DoppioJVM users would set up their site to do something like the following the first time a new visitor loads it:

  1. Download a gzip-compressed JCL (~30MB) (could directly use the GitHub release URL).
  2. Ungzip + untar the JCL in JavaScript, and store the files in IndexedDB/browser-local storage.

Users benefit from a faster JCL and no compression once the loading phase completes.
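The two steps above can be sketched in a language-neutral way. The snippet below is a minimal illustration in Python, not DoppioJVM code: `make_tar_gz` builds a small hypothetical archive in memory in place of the real download, and a plain dict stands in for IndexedDB. In the browser the same steps would need a JavaScript gunzip/untar routine and a real IndexedDB (or BrowserFS) store.

```python
import gzip
import io
import tarfile


def make_tar_gz(files):
    """Build a gzip-compressed tar archive in memory (stand-in for the download)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def unpack_into_store(archive_bytes, store):
    """Gunzip, then untar, writing each regular file into a key-value store."""
    raw_tar = gzip.decompress(archive_bytes)           # step: ungzip
    with tarfile.open(fileobj=io.BytesIO(raw_tar), mode="r:") as tar:
        for member in tar:                             # step: untar
            if member.isfile():
                store[member.name] = tar.extractfile(member).read()
    return store


# Hypothetical file names and contents, used only for illustration.
files = {
    "java/lang/Object.class": b"\xca\xfe\xba\xbe" + b"\x00" * 60,
    "java/lang/String.class": b"\xca\xfe\xba\xbe" + b"\x01" * 90,
}
archive = make_tar_gz(files)

local_store = {}  # stand-in for IndexedDB / browser-local storage
unpack_into_store(archive, local_store)
print(sorted(local_store))
```

After this one-time unpacking, later reads hit the uncompressed copies in the store, which is the point of the scheme: the decompression cost is paid once rather than on every class load.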

I'm planning to do this whenever I get around to releasing a new demo.

Do you think that's a better solution? What benchmarks did you run / what numbers did you get? I'm curious. Compression only has an impact on JVM startup, when the JVM is decompressing + loading class files / resources.


hrj commented on May 27, 2024

> I noticed considerable overhead in my tests and the unit tests.

Out of curiosity, what hardware was it on? An Atom or ARM CPU might give different results than a Core i7.

Regardless of that, if it causes significant overhead on any architecture, then zipping the jars is probably not a good idea.

> Do you think that's a better solution?

From a performance perspective, yes, that sounds good. Though I am not sure about it from other angles (if any).

> What benchmarks did you run / what numbers did you get? I'm curious. Compression only has an impact on JVM startup, when the JVM is decompressing + loading class files / resources.

Nothing very fancy. I just loaded doppio, ran a "Hello world" app, and measured the time from start to finish. I will try to upload the test and numbers somewhere.


jvilk commented on May 27, 2024

It was a MacBook Pro, 1.5 GHz Core i5 with 16GB of RAM.

I haven't benchmarked since I revamped the zip file native methods, so maybe it has changed? I honestly don't have the time right now to check.


hrj commented on May 27, 2024

Quick question: would it be possible to use the pack200 format for rt.jar?

pack200 compresses rt.jar to 17MB, and gzip compresses it further to 7.3MB! This would not only save network transfer time; the gzip decompression can also be done natively by the browser.

If it sounds feasible, I can try to create a benchmark, a pack200 indexer, etc.


jvilk commented on May 27, 2024

@hrj from my understanding and experimentation (and I could be wrong! I would like to be wrong), gzip decompression in the browser only works on text data. I couldn't find a way to trigger the browser to gunzip binary data.

I suspect you would have to gzip base64'd data, and then ask the browser to un-base64 it... which would be terrible for 60MB of data. :(
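For a sense of scale, base64 encodes every 3 bytes as 4 characters, so the encoded form is about a third larger than the input. The arithmetic below is a small sketch; reading "60MB" as 60 × 1024 × 1024 bytes is an assumption made only for the calculation.

```python
import base64
import math


def base64_size(num_bytes):
    """Length of the padded base64 encoding of num_bytes bytes of input."""
    return 4 * math.ceil(num_bytes / 3)


jcl_bytes = 60 * 1024 * 1024          # assumed reading of "60MB"
encoded_bytes = base64_size(jcl_bytes)

print(f"raw:     {jcl_bytes / (1024 * 1024):.0f} MiB")
print(f"base64:  {encoded_bytes / (1024 * 1024):.0f} MiB")

# Sanity check of the formula against the standard library on a small input.
sample = b"x" * 1000
assert len(base64.b64encode(sample)) == base64_size(len(sample))
```

Gzip would recover much of that expansion on the wire, but the decoded text would still have to be held in memory and converted back to binary on the client.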

But if I'm wrong, and that works, that would be really, really nice. In the worst case, we could use a JS gunzip program, and simply gunzip to IndexedDB using BrowserFS as a one-time setup cost.


jvilk commented on May 27, 2024

Also, thanks for the link to pack200. I'd never heard of it!


jvilk commented on May 27, 2024

Or, rather, I guess it's done by MIME type. I was looking into how I could get this working on GitHub Pages. Now that I look at it again, maybe I could trick GitHub into thinking the file is something it should compress. *shrugs*


hrj commented on May 27, 2024

@jvilk browsers can handle encoded (compressed) binary data. This is managed via HTTP headers: the client sends the Accept-Encoding request header, and the server sets the Content-Encoding response header when the response is encoded.

Most browsers support the gzip and deflate encodings, so that's not a problem. On the server side, it depends on the specific server. Most automatically configured hosts (like GitHub (backed by S3)) serve compressible static files with compression automatically. On manually configured servers, you may need to tweak some options. Some servers also look for a precompressed file by extension: if there is a file called xyz.gz and a request for xyz comes in, the xyz.gz file is served with Content-Encoding set to gzip.

Anyway, on the HTTP side, this is a very well established protocol; it should work fine.
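The header exchange described above can be modelled in a few lines. This is a simplified sketch, not any particular server's behaviour: `serve` and `fetch` are hypothetical helpers, the `files` dict stands in for a static file directory, and the negotiation ignores q-values (a real server would honour `gzip;q=0`). The payload is deliberately binary to show that the mechanism does not depend on the content being text.

```python
import gzip


def serve(path, accept_encoding, files):
    """Return (body, headers), preferring a precompressed '<path>.gz' sibling."""
    # Simplified parsing: keep the encoding names, drop any ';q=...' parameters.
    accepted = {
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
        if token.strip()
    }
    headers = {"Vary": "Accept-Encoding"}
    gz_path = path + ".gz"
    if "gzip" in accepted and gz_path in files:
        headers["Content-Encoding"] = "gzip"
        return files[gz_path], headers
    return files[path], headers


def fetch(path, files):
    """Client side: advertise gzip and decode the body if the server used it."""
    body, headers = serve(path, "gzip, deflate", files)
    if headers.get("Content-Encoding") == "gzip":
        body = gzip.decompress(body)
    return body


# Binary payload (every byte value), standing in for a jar file.
payload = bytes(range(256)) * 1024
files = {"rt.jar": payload, "rt.jar.gz": gzip.compress(payload)}

received = fetch("rt.jar", files)
print(len(files["rt.jar.gz"]), "bytes on the wire,", len(received), "bytes after decoding")
```

In a browser the decoding step in `fetch` happens inside the network stack, so page scripts only ever see the decoded bytes.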

My question is more on the doppio side: can a pack200 file be added to the system class path? Will it clash with some JVM standard? (If it doesn't clash, why doesn't the regular JVM use it?)


hrj commented on May 27, 2024

Also note that gzip compression over HTTP will work with the .jar file as well: rt.jar.gz is 21MB. So the gzip idea is orthogonal to the pack200 idea. It's just that pack200 is optimised for .class files and for being combined with gzip, and the uncompressed pack200 file is smaller than the uncompressed jar, which makes it a sweet deal overall.

