Comments (12)

ptaoussanis commented on September 27, 2024

Hi Vadali,

You may want to look at the freeze-to-out! and thaw-from-in! utils; they support streaming.

from nippy.

shlomiv commented on September 27, 2024

Hey, thanks for your response!

I did use freeze-to-out!, and I tried digging in further. Regardless of the type of stream I feed into it, it seems to fail on the serializable? call.

Here is the error message:

Exception in thread "main" java.lang.OutOfMemoryError: Requested array size exceeds VM limit, compiling:(/tmp/form-init5631194680796807197.clj:1:72)
at clojure.lang.Compiler.load(Compiler.java:7142)
at clojure.lang.Compiler.loadFile(Compiler.java:7086)
at clojure.main$load_script.invoke(main.clj:274)
at clojure.main$init_opt.invoke(main.clj:279)
at clojure.main$initialize.invoke(main.clj:307)
at clojure.main$null_opt.invoke(main.clj:342)
at clojure.main$main.doInvoke(main.clj:420)
at clojure.lang.RestFn.invoke(RestFn.java:421)
at clojure.lang.Var.invoke(Var.java:383)
at clojure.lang.AFn.applyToHelper(AFn.java:156)
at clojure.lang.Var.applyTo(Var.java:700)
at clojure.main.main(main.java:37)
Caused by: java.lang.OutOfMemoryError: Requested array size exceeds VM limit
at java.util.Arrays.copyOf(Arrays.java:2271)
at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1876)
at java.io.ObjectOutputStream$BlockDataOutputStream.setBlockDataMode(ObjectOutputStream.java:1785)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1188)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
at taoensso.nippy.utils$fn__905.invoke(utils.clj:34)
at taoensso.nippy.utils$memoize_type_test$fn__895$test__896.invoke(utils.clj:18)
at taoensso.nippy.utils$memoize_type_test$fn__895$fn__898.invoke(utils.clj:23)
at clojure.lang.Delay.deref(Delay.java:37)
at clojure.core$deref.invoke(core.clj:2200)
at taoensso.nippy.utils$memoize_type_test$fn__895.invoke(utils.clj:23)
at taoensso.nippy$fn__1223.invoke(nippy.clj:284) <--- I think it starts here
at taoensso.nippy$fn__860$G__855__867.invoke(nippy.clj:108)
at taoensso.nippy$fn__1034.invoke(nippy.clj:217)
at taoensso.nippy$fn__860$G__855__867.invoke(nippy.clj:108)
at taoensso.nippy$freeze_to_out_BANG_.doInvoke(nippy.clj:318)
at clojure.lang.RestFn.invoke(RestFn.java:425)
at reverse_index.core$store_to_file.invoke(core.clj:94)
at reverse_index.core$_main.invoke(core.clj:115)
at clojure.lang.Var.invoke(Var.java:394)
at user$eval5.invoke(form-init5631194680796807197.clj:1)
at clojure.lang.Compiler.eval(Compiler.java:6703)
at clojure.lang.Compiler.eval(Compiler.java:6693)
at clojure.lang.Compiler.load(Compiler.java:7130)
at clojure.lang.Compiler.loadFile(Compiler.java:7086)
at clojure.main$load_script.invoke(main.clj:274)
at clojure.main$init_opt.invoke(main.clj:279)
at clojure.main$initialize.invoke(main.clj:307)
at clojure.main$null_opt.invoke(main.clj:342)

Do you think this is something I am doing wrong?

This is the code I am trying to run:

(defn store-to-file [idx out-filename]
   (with-open [w (FileOutputStream. out-filename)]
      (nippy/freeze-to-out! (DataOutputStream. w) idx)))

Thanks!

ptaoussanis commented on September 27, 2024

Non-Clojure data types that implement Serializable are currently subject to a de/serialization test, which tries to serialize the whole object into an in-memory buffer.
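
To illustrate why such a test is costly for large objects, here is a minimal sketch of an in-memory serialization check of the kind visible in the stack trace (ObjectOutputStream writing into a ByteArrayOutputStream). It is not Nippy's actual implementation, and the helper name serialized-size is hypothetical. The point is that the entire serialized form has to fit in one byte array, which is what fails for a very large object.

```clojure
(import '(java.io ByteArrayOutputStream ObjectOutputStream))

(defn serialized-size
  "Hypothetical helper: serializes `x` completely into an in-memory buffer
  using Java serialization and returns the number of bytes produced.
  The whole serialized form is held in a single byte array."
  [x]
  (let [baos (ByteArrayOutputStream.)]
    (with-open [oos (ObjectOutputStream. baos)]
      (.writeObject oos x))
    (.size baos)))

;; A 1000-element int array needs at least 4000 bytes plus stream headers.
(serialized-size (int-array 1000))
```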

Is it necessary that your data structure contain large non-Clojure data types?

shlomiv commented on September 27, 2024

Hmm, I see. My datatype contains lots of native int arrays. I tried using Clojure datatypes, but processing time was much longer. Is there a way around it? If not, no worries :)

ptaoussanis commented on September 27, 2024

> hmmm i see.. my datatype contains lots of native int arrays,

If you have only a few non-Clojure types, and they're relatively simple, then you can use extend-freeze and extend-thaw to write de/serializers for them. An int array would be particularly easy: you could use something like (doseq [i <array>] (.writeInt out i)), etc.

That way Nippy will immediately recognize the datatype and it won't try to fall back to the Serializable interface.
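
A minimal sketch of the int-array idea, using only java.io interop. The helper names write-int-array! and read-int-array are hypothetical; the assumption is that their bodies could be called from extend-freeze and extend-thaw handlers, whose exact arguments depend on the Nippy version. A length prefix is written first so the reader knows how many ints to expect.

```clojure
(import '(java.io ByteArrayInputStream ByteArrayOutputStream
                  DataInputStream DataOutputStream))

(defn write-int-array!
  "Hypothetical helper: writes the length of `arr`, then each element,
  to a java.io.DataOutput."
  [^java.io.DataOutput out ^ints arr]
  (.writeInt out (alength arr))
  (doseq [i arr]
    (.writeInt out (int i))))

(defn read-int-array
  "Hypothetical helper: reads a length prefix, then that many ints,
  from a java.io.DataInput and returns a fresh int array."
  [^java.io.DataInput in]
  (let [n   (.readInt in)
        arr (int-array n)]
    (dotimes [i n]
      (aset arr i (.readInt in)))
    arr))

;; Round trip through an in-memory buffer.
(let [baos (ByteArrayOutputStream.)]
  (write-int-array! (DataOutputStream. baos) (int-array [1 2 3]))
  (vec (read-int-array
        (DataInputStream. (ByteArrayInputStream. (.toByteArray baos))))))
;; => [1 2 3]
```

Because each int is written straight to the output, nothing larger than the array itself has to be held in memory.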

Does that help / make sense?

shlomiv commented on September 27, 2024

Yes, indeed.
I'll give this a shot tomorrow, since it's 5:30 am here :P
Thanks!

shlomiv commented on September 27, 2024

Well, seriously, who could sleep when there's a serializer to write? ;)
It works perfectly now, thank you!

ptaoussanis commented on September 27, 2024

:-) Awesome, happy to hear that!

shlomiv commented on September 27, 2024

I have another question for you:

I didn't test it, but some people swear that using FileChannel would be a whole lot faster than DataOutputStream. Did you ever consider using it?

I am only asking because when working with really large objects, writing and reading times become important.
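
For reference, a minimal sketch of what writing an int array through a FileChannel could look like. The function name write-ints-via-channel! is hypothetical, and this makes no claim about being faster than DataOutputStream; that would need benchmarks. It assumes the array is small enough to fit in a single heap ByteBuffer, and it writes big-endian ints, the same byte order DataOutputStream uses.

```clojure
(import '(java.io FileOutputStream)
        '(java.nio ByteBuffer))

(defn write-ints-via-channel!
  "Hypothetical helper: copies `arr` into a ByteBuffer and writes the
  buffer to the file at `path` through the stream's FileChannel."
  [^String path ^ints arr]
  (let [buf (ByteBuffer/allocate (int (* 4 (alength arr))))]
    ;; The IntBuffer view has its own position, so `buf` stays at
    ;; position 0 with its limit at capacity and needs no flip.
    (.put (.asIntBuffer buf) arr)
    (with-open [fos (FileOutputStream. path)
                ch  (.getChannel fos)]
      ;; A single write call may not drain the buffer, so loop.
      (while (.hasRemaining buf)
        (.write ch buf)))))

;; Usage: writes 3 ints (12 bytes) to a temporary file.
(let [f (java.io.File/createTempFile "ints" ".bin")]
  (write-ints-via-channel! (.getPath f) (int-array [7 8 9]))
  (.length f))
```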

Thanks!

ptaoussanis commented on September 27, 2024

I'm not familiar with FileChannel, sorry. I remember looking into the nio Channel stuff way-back-when and coming to the conclusion then that for Nippy's particular use-case, it was somehow inferior (either inappropriate somehow, or no faster - can't recall).

Would be open to an issue/PR that explores this more fully, but it'd need at least a working proof-of-concept and comparison benchmarks.

Cheers! :-)

mpenet commented on September 27, 2024

Are you talking about #8? I guess it really depends on the shape/size of the data. Maybe my PR had some flaw(s) too.

ptaoussanis commented on September 27, 2024

Ahh, thanks Max! Yeah, that's what I had in mind - couldn't remember if it was something on here, or just something I'd experimented with privately.

I'm quite happy with the performance of Nippy at the moment for my own purposes, and have my hands full with work - but if either of you felt like resurrecting that PR to see how the numbers look against the current version, I'd be happy to take a fresh look.

I'm due to release v2.7.0 final soon, so this is something we could potentially get in the pipeline for v2.8.0.

Cheers! :-)
