Comments (4)
Oh. That's obviously not intentional.
Thank you for reporting this!
I'll be happy to fix that. Do you happen to have a simple reproduction? I assume you encountered this with some real code, so anything you have would help. I will need a regression test at any rate, to ensure it won't break again.
from java-merge-sort.
Reproduction is very simple: create a file with at least one line of length 40000 (or anything > 2 x 16000).
example.txt
The attached example file has a single line composed of 16000 '0's, 16000 '1's and 15990 '2's.
When you sort this file with TextFileSorter, which calls RawTextLineReader (and _readNextSlow), the result is a file with a single line containing only the '0's and '2's. The '1's have been discarded because they occupy the second 16000-character block.
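For anyone without the attachment, a file of the shape described above can be regenerated with a few lines of Java. This is only a sketch: the class and method names (`LongLineRepro`, `buildLine`, `writeExample`, `isIntact`) are made up for illustration and are not part of java-merge-sort, and the sorting step itself is left as a comment because it depends on the library version in use.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of java-merge-sort.
class LongLineRepro {
    // Returns a string made of `count` copies of `c`.
    static String repeat(char c, int count) {
        char[] buf = new char[count];
        Arrays.fill(buf, c);
        return new String(buf);
    }

    // One line as described in the report: 16000 '0's, 16000 '1's, 15990 '2's.
    static String buildLine() {
        return repeat('0', 16000) + repeat('1', 16000) + repeat('2', 15990);
    }

    // Writes the single long line, terminated by a line feed, to `target`.
    static Path writeExample(Path target) throws IOException {
        Files.write(target, (buildLine() + "\n").getBytes(StandardCharsets.US_ASCII));
        return target;
    }

    // True if `file` holds exactly one line and no characters were dropped from it.
    static boolean isIntact(Path file) throws IOException {
        List<String> lines = Files.readAllLines(file, StandardCharsets.US_ASCII);
        return lines.size() == 1 && lines.get(0).equals(buildLine());
    }

    public static void main(String[] args) throws IOException {
        Path input = writeExample(Files.createTempFile("example", ".txt"));
        // Sort `input` with TextFileSorter at this point, then call isIntact()
        // on the sorted output instead of on the input.
        System.out.println("line length: " + buildLine().length());
        System.out.println("input intact: " + isIntact(input));
    }
}
```

With the behaviour reported here, running `isIntact` on the sorted output would be expected to return false, since the middle block of '1's goes missing.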
How embarrassing. :)
Thank you for reporting the issue; I added a simple test, fixed the issue, and will release 1.0.1 next.
It should be available via Maven Central within a couple of hours.
Thanks for the quick fix.
Related Issues (13)
- Add helper methods for calculating approximate object mem usage
- loss data after sort
- ENHANCEMENT: Performance ideas: pipelining and compression
- Temp files that result from two-phase merge are not deleted
- Parallelize main memory sort phase
- copyright clarity - license boilerplate issue
- Increase JDK baseline from Java 6 to Java 8
- Use PriorityQueue for k-way merge (perf)
- Possible 'overallocation' of memory
- RawTextLineReader doesn't skip line feeds correctly on Windows
- Sorter#_merge() not closing streams correctly.
- Sort/Merge process that removes duplicates