Comments (3)
@spichardo thanks for pointing out the efficiency issue from string concatenations.
As I mentioned in the File Exchange discussions, we have been focusing on optimizing loadjson because it was the (much) slower of the two. We haven't paid much attention to savejson because most of the data I tested were simple data structures.
I was able to download and run your test cases, and I confirm that the provided savejson_fastfile script is about 2x faster than the latest savejson in the git repository (45 s vs 92 s on an i7 Ubuntu box with an SSD). However, the implementation in savejson_fastfile has limitations: it requires users to provide a file and writes to disk with file I/O, and it cannot return JSON strings directly in memory.
Since the key cause of the performance difference is the string concatenation, I was able to update savejson to combine strings more efficiently. Here is my latest commit:
In this update, I replaced all sprintf()-based string concatenations with string cell operations, and flattened the string cell once at the end of each sub-function. This change essentially eliminates the overhead due to string concatenations (except for matdata2json, because we have a user-specified printing format).
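To illustrate the idea, here is a minimal sketch of the two patterns. It is not the actual savejson code: the variable names (txt_slow, parts, txt_fast) and the toy data are made up for this example, and the slow loop uses plain bracket concatenation as a stand-in for the sprintf()-based appends.

```matlab
% Hypothetical illustration only; not taken from savejson.
n = 5000;

% Slow pattern: grow one char array on every iteration.
% Each append copies everything built so far, so the total
% cost grows roughly quadratically with the output length.
txt_slow = '';
for i = 1:n
    txt_slow = [txt_slow, sprintf('%d,', i)]; %#ok<AGROW>
end

% Faster pattern: store each piece in a preallocated cell array
% and flatten the cell once at the end.
parts = cell(1, n);
for i = 1:n
    parts{i} = sprintf('%d,', i);
end
txt_fast = [parts{:}];   % single concatenation of all pieces
```

Both loops produce the same string; only the second avoids re-copying the growing buffer on every iteration.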
With this updated savejson, I was able to cut the run time for your benchmark from 92 seconds to 39 seconds, ~15% faster than savejson_fastfile. The outputs are identical (except for a pair of empty square brackets due to other prior changes).
Can you try out the new savejson and let me know if this works for you?
Again, thank you for helping make savejson faster!
Excellent, I just tested it, and using cell concatenation followed by flattening definitely gives a huge boost. Funny, because I should have thought of that myself first, since I used cell arrays precisely to allow dynamic growth with minimal impact on performance. I was purely focused on output files, which is why I thought of simply redirecting the output to files.
On my machine with dual Xeon processors, the updated savejson takes 48.3 s vs 53 s with savejson_fastfile. More importantly, when doubling the cell array to 2000, the run time scales linearly: 96 s with savejson vs 104.9 s with savejson_fastfile.
Thanks a lot for taking a look at this.
Cheers
Terrific, thanks for confirming.
I am closing this issue for now. I am pretty sure there are places in savejson that can be further accelerated. Patches and suggestions are always welcome!