Comments (5)
Sounds plausible, I'll have a look. Thanks!
from java-merge-sort.
Quick comment on memory size calculations: the assumption is that most memory is usually used not by the Object[] itself but by the entries it contains. The heuristics are approximate, of course.
from java-merge-sort.
You are correct, but the estimated size can be grossly off. We originally estimated 24 bytes/object, then sat down and really looked at it (this was a good reference: http://www.ibm.com/developerworks/java/library/j-codetoheap/index.html) and realised that an Object wrapping a String and a long actually adds up to 106 bytes or so. We were 4x wrong! That makes java-merge-sort try to fit 4x as many objects into the buffer.
If we combine that silly original estimate with the fact that we were running 64-bit Java 6 U31, which used Compressed OOPs, and then had to downgrade to a JVM predating Compressed OOPs support (long story on the why, irrelevant here), plus the maybe-doubling of the Object[] above, our 2 GB heap pretty much vanished! :)
The 2nd copy of the array is still a problem, I believe, because the user of the library will allocate memory based on their understanding of the JVM environment they're running in. If the library then doubles it, that may well be a penalty they can't afford (particularly if their object size calcs are pretty dodgy.. :) )
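To make the arithmetic above concrete, here is a minimal sketch of how a low per-entry estimate inflates real heap use. The class and method names and the 512 MB budget are hypothetical and are not part of java-merge-sort; only the 24-byte and 106-byte figures come from the comment above.

```java
public class BufferEstimate {
    /** Number of entries a sorter would buffer if it trusts the per-entry size estimate. */
    public static long entriesForBudget(long budgetBytes, long estimatedBytesPerEntry) {
        return budgetBytes / estimatedBytesPerEntry;
    }

    /** Heap actually consumed by that many entries at their real size. */
    public static long actualBytes(long entries, long realBytesPerEntry) {
        return entries * realBytesPerEntry;
    }

    public static void main(String[] args) {
        long budget = 512L * 1024 * 1024; // hypothetical sort buffer budget
        long estimated = 24;              // original (too low) estimate, bytes per entry
        long real = 106;                  // approximate real size, bytes per entry

        long entries = entriesForBudget(budget, estimated);
        long used = actualBytes(entries, real);
        System.out.printf("entries buffered: %d%n", entries);
        System.out.printf("heap actually needed: %d MB (%.1fx the budget)%n",
                used / (1024 * 1024), (double) used / budget);
    }
}
```

The overshoot factor is simply real size divided by estimated size, so it does not depend on the budget chosen here.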
from java-merge-sort.
Yes, forgot to mention that the estimate obviously has to be close enough -- and Strings are expensive for multiple reasons.
I also think you may well be right that there is unintended reference retention, and that would be good to eliminate.
Especially if there are duplicate Strings from previous rounds.
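As an illustration of the kind of reference retention discussed here, the sketch below clears a reused buffer after each round so entries from earlier rounds can be garbage collected. It is an assumption about the general pattern, not the library's actual code; `BufferReuse` and `writeRound` are hypothetical names, and a `StringBuilder` stands in for the temp-file writer.

```java
import java.util.Arrays;

public class BufferReuse {
    /**
     * Sorts the first 'count' slots of a reused buffer, writes them out,
     * then nulls those slots so the array no longer keeps the entries reachable.
     */
    public static void writeRound(String[] buffer, int count, StringBuilder out) {
        Arrays.sort(buffer, 0, count);
        for (int i = 0; i < count; i++) {
            out.append(buffer[i]).append('\n');
        }
        // Without this, the previous round's Strings stay referenced
        // until the next round happens to overwrite the same slots.
        Arrays.fill(buffer, 0, count, null);
    }

    public static void main(String[] args) {
        String[] buffer = new String[4];
        buffer[0] = "pear";
        buffer[1] = "apple";
        StringBuilder out = new StringBuilder();
        writeRound(buffer, 2, out);
        System.out.print(out);
    }
}
```

Clearing matters most when a later round fills fewer slots than an earlier one, since the unused tail would otherwise pin old entries for the rest of the sort.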
from java-merge-sort.
Ok, yes. Some things were not getting cleared as they should have been, including one you pointed out. I added two smaller clean-up fixes; I hope those help a bit too.
I released a new version (well, actually two, after finding one more thing to fix). Could you see if 0.7.1 behaves any better for you?
Further suggestions for improvements are gladly accepted. :-)
from java-merge-sort.