Comments (10)
I just realized that the original jar files didn't compress their contents and hence the huge difference in sizes. Presumably, this was done to avoid decompression overhead.
However, in my tests, the time taken to run doppio with a compressed jar is almost the same as (and sometimes slightly less than) with the non-compressed jar. I tested with Chromium 47 and Firefox 43 on x86-64, served from localhost. The speed improvement should be even greater when loading over an external network.
from doppio_jcl.
If you use gzip compression on the entire thing, it goes down to ~37MB total:
https://github.com/plasma-umass/doppio_jcl/releases/tag/v3.2
Presumably, this was done to avoid decompression overhead.
Yup. The official JCL isn't compressed either for the same reason. I noticed considerable overhead in my tests and the unit tests.
I figured most users would do something like the following the first time a new visitor loads a site using DoppioJVM:
- Download a gzip-compressed JCL (~30MB) (could directly use the GitHub release URL).
- Ungzip + untar the JCL in JavaScript, and store the files in IndexedDB/browser-local storage.
Visitors then benefit from a faster, uncompressed JCL once the loading phase completes.
I'm planning to do this whenever I get around to releasing a new demo.
Do you think that's a better solution? What benchmarks did you run / what numbers did you get? I'm curious. Compression only has an impact on JVM startup, when the JVM is decompressing + loading class files / resources.
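To make the untar step in the list above concrete, here is a minimal sketch of reading regular files out of an already-gunzipped tar archive. It is not DoppioJVM or BrowserFS code; `TarEntry`, `readField`, and `parseTar` are hypothetical names, and the gunzip step and the IndexedDB write are left out. It assumes plain ustar headers and ignores the ustar name prefix field and long-name extensions, so paths longer than 100 bytes would need extra handling.

```typescript
// Hypothetical sketch: extract regular files from an uncompressed tar archive.
interface TarEntry {
  name: string;
  data: Uint8Array;
}

const BLOCK_SIZE = 512; // tar headers and file data are padded to 512-byte blocks

// Read a NUL-terminated ASCII field from a header block.
function readField(buf: Uint8Array, offset: number, length: number): string {
  let out = "";
  for (let i = offset; i < offset + length; i++) {
    const code = buf[i] ?? 0;
    if (code === 0) break;
    out += String.fromCharCode(code);
  }
  return out;
}

function parseTar(archive: Uint8Array): TarEntry[] {
  const entries: TarEntry[] = [];
  let offset = 0;
  while (offset + BLOCK_SIZE <= archive.length) {
    const name = readField(archive, offset, 100); // name: bytes 0..99
    if (name === "") break; // an all-zero header block marks the end of the archive
    // size: bytes 124..135, stored as an octal ASCII string
    const size = parseInt(readField(archive, offset + 124, 12).trim(), 8) || 0;
    const typeFlag = readField(archive, offset + 156, 1); // "0" or NUL means regular file
    const dataStart = offset + BLOCK_SIZE;
    if (typeFlag === "0" || typeFlag === "") {
      entries.push({ name, data: archive.slice(dataStart, dataStart + size) });
    }
    // Skip the file data, rounded up to a whole number of blocks.
    offset = dataStart + Math.ceil(size / BLOCK_SIZE) * BLOCK_SIZE;
  }
  return entries;
}
```

Each returned entry could then be written to IndexedDB (or whatever store the site uses) under its `name`, so later visits skip both the download and the unpacking.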
I noticed considerable overhead in my tests and the unit tests.
Out of curiosity, what hardware was it on? An Atom or ARM CPU might give different results than a Core i7.
Regardless of that, if it causes significant overhead on any architecture, then zipping the jars is probably not a good idea.
Do you think that's a better solution?
From a performance perspective, yes, that sounds good. Though I am not sure about it from other angles (if any).
What benchmarks did you run / what numbers did you get? I'm curious. Compression only has an impact on JVM startup, when the JVM is decompressing + loading class files / resources.
Nothing very fancy. I just loaded doppio, ran a "Hello world" app, and measured the time from start to finish. I will try to upload the test and numbers somewhere.
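A start-to-finish measurement like this can be made a little more repeatable by timing several runs and comparing medians. The sketch below is only an illustration of that idea, not the test that was actually run; `median` and `timeRuns` are hypothetical helper names, and the `run` callback stands in for whatever loads doppio and runs the "Hello world" app.

```typescript
// Median of a list of samples; less sensitive to one-off outliers than the mean.
function median(samples: number[]): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = samples.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? (sorted[mid] ?? 0)
    : ((sorted[mid - 1] ?? 0) + (sorted[mid] ?? 0)) / 2;
}

// Call `run` n times in sequence and return each elapsed wall-clock time in milliseconds.
async function timeRuns(run: () => Promise<void>, n: number): Promise<number[]> {
  const elapsed: number[] = [];
  for (let i = 0; i < n; i++) {
    const start = Date.now();
    await run();
    elapsed.push(Date.now() - start);
  }
  return elapsed;
}
```

Running `timeRuns` once against the compressed jar and once against the non-compressed jar, then comparing the two `median` values, would give numbers that are easier to share and reproduce.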
It was a MacBook Pro, 1.5 GHz Core i5 with 16GB of RAM.
I haven't benchmarked since I revamped the zip file native methods, so maybe it has changed? I honestly don't have the time right now to check.
Quick question: Would it be possible to use the pack200 format for the rt.jar?
pack200 compresses the rt.jar to 17MB, and gzip compresses it further to 7.3MB! This will not only save network transfer time; the gzip decompression can also be done natively by the browser.
If it sounds feasible, I can try to create a benchmark, a pack200 indexer, etc.
@hrj from my understanding and experimentation (and I could be wrong! I would like to be wrong), gzip decompression in the browser only works on text data. I couldn't find a way to trigger the browser to gunzip binary data.
I suspect you would have to gzip base64'd data, and then ask the browser to un-base64 it... which would be terrible for 60MB of data. :(
But if I'm wrong, and that works, that would be really, really nice. In the worst case, we could use a JS gunzip program, and simply gunzip to IndexedDB using BrowserFS as a one-time setup cost.
Also, thanks for the link to pack200. I'd never heard of it!
Or, rather, I guess it's done by MIME type. I was looking into how I could get this working on GitHub Pages. Now that I look at it again, maybe I could trick GitHub into thinking the file is something it should compress. shrugs
@jvilk browsers can handle encoded (compressed) binary data. This is managed via HTTP headers: the client sends the accept-encoding request header, and the server sets the content-encoding response header when the response is encoded.
Most browsers support the gzip and deflate encodings, so that's not a problem. On the server side, it depends on the specific server. In most automatically configured servers (like GitHub (backed by S3)), static files that are compressible are automatically served with compression. In manually configured servers, you may need to tweak some options. Sometimes, these servers detect the extension and serve accordingly. So, if there is a file called xyz.gz, and a request for xyz comes through, then the xyz.gz file is served with content-encoding set to gzip.
Anyway, on the HTTP side, this is a very well established protocol; it should work fine.
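The header exchange and the xyz / xyz.gz lookup described above can be sketched roughly as follows. This is an illustration of the idea only, not the behaviour of any particular server; `parseAcceptEncoding` and `chooseVariant` are hypothetical names, and the list of files on disk is passed in as a plain array to keep the example self-contained.

```typescript
// Parse an Accept-Encoding header value into a map of coding name -> q-value.
function parseAcceptEncoding(header: string): Map<string, number> {
  const prefs = new Map<string, number>();
  for (const part of header.split(",")) {
    const [rawName, ...params] = part.trim().split(";");
    const name = (rawName ?? "").trim().toLowerCase();
    if (name === "") continue;
    let q = 1; // a coding listed without a q parameter defaults to q=1
    for (const param of params) {
      const [key, value] = param.trim().split("=");
      if (key === "q" && value !== undefined) {
        const parsed = parseFloat(value);
        if (!isNaN(parsed)) q = parsed;
      }
    }
    prefs.set(name, q);
  }
  return prefs;
}

// Decide which file to serve for `path`, and which Content-Encoding to set (null = none).
function chooseVariant(
  path: string,
  acceptEncoding: string,
  filesOnDisk: string[]
): { file: string; contentEncoding: string | null } {
  const prefs = parseAcceptEncoding(acceptEncoding);
  // gzip is acceptable if it is listed (or covered by "*") with a q-value above 0.
  const gzipAccepted = (prefs.get("gzip") ?? prefs.get("*") ?? 0) > 0;
  const gzPath = path + ".gz";
  if (gzipAccepted && filesOnDisk.indexOf(gzPath) !== -1) {
    return { file: gzPath, contentEncoding: "gzip" };
  }
  return { file: path, contentEncoding: null };
}
```

With `rt.jar` and `rt.jar.gz` both on disk, a request for `rt.jar` carrying `accept-encoding: gzip, deflate` would be answered from `rt.jar.gz` with `content-encoding: gzip`, and the browser would hand the decoded binary data to the page.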
My question is more on the doppio side: can a pack200 file be added to the system class path? Will it clash with some JVM standard? (If it doesn't clash, why doesn't the regular JVM use it?)
Also note that gzip compression in HTTP will work with the .jar file as well. rt.jar.gz is 21MB. So, the gzip idea is orthogonal to the pack200 idea. It's just that pack200 is optimised for the combination of .class files and gzip, and the uncompressed pack200 file is smaller than the uncompressed jar, which makes it a sweet deal overall.