Comments (17)

zer0n commented on June 11, 2024

It looks like the backward pass is implemented here.

If you don't implement full forward/backward for new layers, how does Torch know what to compute? As far as I know, it doesn't have auto-diff.

Please correct if I'm missing something. Thanks @soumith !

soumith commented on June 11, 2024

It's because he manually manages his RNN (a design choice for his project) instead of adding it to a standard container.
If he used standard nn or nngraph containers, he would only need a single :backward call that handles everything for him, as in the sketch below.
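
A minimal sketch of what that looks like with a standard nn container; the layer sizes here are arbitrary, purely for illustration:

```lua
require 'torch'
require 'nn'

-- A small toy network assembled in a standard container.
local model = nn.Sequential()
   :add(nn.Linear(10, 20))
   :add(nn.Tanh())
   :add(nn.Linear(20, 2))

local x = torch.randn(10)
local out = model:forward(x)

-- One :backward call propagates gradients through every layer and
-- accumulates gradWeight/gradBias in each Linear module.
local gradInput = model:backward(x, torch.ones(2))
```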

soumith commented on June 11, 2024

The only difference between Theano's autograd and Torch's autograd used to be the granularity at which autograd was done. Theano does it at the individual-operation level; Torch does it at the level of nn.* operations. Torch now has a package that does autograd similarly to Theano: https://github.com/twitter/torch-autograd. It takes an arbitrary function of forward operations and generates the backward function.
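
A minimal sketch of that style, assuming torch-autograd's callable-module API (the function and parameter names here are illustrative, not the package's):

```lua
require 'torch'
local grad = require 'autograd'

-- An arbitrary forward function built from plain torch.* operations.
local function loss(params, x)
   local h = torch.tanh(torch.cmul(params.w, x) + params.b)
   return torch.sum(h)
end

-- autograd generates the backward function from the forward code.
local dloss = grad(loss)

local params = { w = torch.randn(5), b = torch.randn(5) }
local grads, l = dloss(params, torch.randn(5))
-- grads.w and grads.b now hold the gradients of the loss
```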

zer0n commented on June 11, 2024

OK, I'll withhold my comment about Torch's RNNs until I've examined them further. I think we both agree that Theano and Torch do autograd at different granularities. Since Theano does it at the operation level, it gives higher flexibility, although the network definition can be more verbose.

I'm aware of autograd, but it's an extension (which I haven't looked into yet). As stated in the abstract, I only compared the current state of the libraries.

soumith commented on June 11, 2024

@zer0n if you "only compared the current state of the libraries", then Torch is an ndarray library and nothing more. nn is an extension, cutorch is an extension, nngraph is an extension, and so on. Torch is nothing without its ecosystem, and neither is Caffe; ignoring those packages just because they are shipped in a modular way makes for a fairly unfair comparison.

zer0n commented on June 11, 2024

I hear you. I'll dig into autograd and update later. Give me time.

pranv commented on June 11, 2024

You should also give Torch some points for having stateful RNNs. They're sort of hard to do in Theano, and popular Theano frameworks don't support them fully.
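
For context, a minimal sketch of what "stateful" means here, using plain nn modules (sizes and names are illustrative): the hidden state lives with the caller and persists across calls instead of being reset for every sequence.

```lua
require 'torch'
require 'nn'

-- One step of a simple recurrent cell: h_t = tanh(W * [x_t, h_{t-1}]).
local inputSize, hiddenSize = 128, 64
local cell = nn.Sequential()
   :add(nn.JoinTable(1))                        -- concatenate {x_t, h_{t-1}}
   :add(nn.Linear(inputSize + hiddenSize, hiddenSize))
   :add(nn.Tanh())

-- The caller owns the hidden state, so it survives across calls.
local h = torch.zeros(hiddenSize)
local function step(x)
   h = cell:forward({x, h}):clone()             -- carry state to the next step
   return h
end

step(torch.randn(inputSize))                    -- state now reflects input 1
step(torch.randn(inputSize))                    -- ...and inputs 1 and 2
```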

zer0n commented on June 11, 2024

@soumith I've been running this example of torch-autograd. The performance is horrendous. I haven't dug into the code, but it seems that torch-autograd doesn't do symbolic differentiation a priori. Can you verify the performance on your system?

Mine takes about an hour just to get through one epoch. No GPU (I'm testing on a VM), but it's eating all the CPU cores.

soumith commented on June 11, 2024

@zer0n just ran it. Are you looking at performance wrt speed or accuracy?

soumith commented on June 11, 2024

It seems to be taking 2.2 s per 1000 samples, quite horribly slow. The example is also using batch size = 1 on a tiny network (3 layers, 16 feature maps). I don't know what realistic benchmark can be run with that. Modifying it to use:

  • mini-batch inputs
  • a slightly larger network

should increase the throughput by a non-trivial amount.

Modifying the existing network to use mini-batch size 100 brings it to 0.5 s per 1000 samples.
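
A hedged sketch of that kind of mini-batch change (a toy linear model, not the actual example's network; all names are illustrative):

```lua
require 'torch'
local grad = require 'autograd'

-- Toy batched loss: mean squared error of a linear model over a mini-batch.
local function loss(params, x, y)
   local pred = x * params.W                    -- (100 x 10) batch of scores
   return torch.sum(torch.pow(pred - y, 2)) / x:size(1)
end

local dloss = grad(loss)
local params = { W = torch.randn(784, 10) }

-- One step over a mini-batch of 100 samples instead of a single sample.
local x = torch.randn(100, 784)
local y = torch.randn(100, 10)
local grads = dloss(params, x, y)
params.W:add(-0.01, grads.W)                    -- plain SGD update
```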

> I haven't dug into the code but it seems that torch autograd doesn't do symbolic differentiation a priori.

By "a priori", do you mean compiling the backward graph beforehand? No, it doesn't do that. It runs backward on the fly.

zer0n commented on June 11, 2024

Yeah, I saw that it was using mini-batch size 1 as well.

Yes, I meant that it doesn't compile the backprop beforehand. So I don't see how it is competitive with the symbolic-differentiation approach.

soumith commented on June 11, 2024

Hmm, I don't quite understand. It is doing symbolic differentiation. It can do it at the granularity of torch.* operations (which are arbitrary ndarray operations) or at the granularity of nn.* operations (which have compiled backward calls already written in the nn package).

I hope I didn't misunderstand "I don't see how it is competitive with the symbolic differentiation approach."
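
To make the two granularities concrete, a rough sketch (the autograd call is an assumption based on the package's callable-module usage; the nn.Tanh forward/backward pair is standard nn):

```lua
require 'torch'
require 'nn'
local grad = require 'autograd'

-- (a) torch.* granularity: autograd records every tensor operation and
-- differentiates the chain op by op.
local function f(params, x)
   return torch.sum(torch.tanh(torch.cmul(params.w, x)))
end
local df = grad(f)
local grads = df({ w = torch.randn(5) }, torch.randn(5))

-- (b) nn.* granularity: a whole module is one unit whose backward is the
-- hand-written implementation shipped in the nn package.
local m = nn.Tanh()
local y = m:forward(torch.randn(5))
local gradInput = m:backward(torch.randn(5), torch.ones(5))
```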

soumith commented on June 11, 2024

It doesn't optimize its symbolic graph by fusing symbols together, etc., if that's what you mean. In that sense, yeah, it doesn't have those optimizations yet, by the looks of it.

zer0n commented on June 11, 2024

@soumith I didn't even mean that level of optimization yet. The lack of compilation means that many computational steps are repeated on every pass over the network (this and this).

They also highlighted those performance issues here, in the Performance section.

soumith commented on June 11, 2024

@zer0n yeah, it seems to suffer perf issues from being an early release, just like TF. Coincidentally, they seem to have JUST overhauled the library with better perf and debugging. Maybe it's a rapid work in progress: https://twitter.com/clmt/status/666348268327186433

zer0n commented on June 11, 2024

FYI, I've just revamped the review. Please check if my comments on Torch are fair.

soumith commented on June 11, 2024

@zer0n the review overall looks correct now, not just for Torch but also for TensorFlow and Theano. Thanks a lot for revamping it, and thanks for spending all that time on it.
