Comments (7)

tbekolay commented on June 12, 2024

I think I've tracked this down to a bug in Nengo OCL's implementation of the Reset op: it assumes that the value being reset is a vector, when it can be any shape. In the learning rule stuff, we reset a few ndarrays with 2 dimensions. I'll see if I can fix it.

from nengo-ocl.

tbekolay commented on June 12, 2024

A bit more detail on this... the way Nengo OCL implements Reset is to see it as a special case of the monolithic MultiProdUpdate op. I know that originally, RaggedArrays and making as many things as possible a MultiProdUpdate was the Big Idea behind how this would be super fast. At this point, though, we've done separate implementations of several ops, so I wonder if it's worth trying to write specific implementations of the other ops that map onto MultiProdUpdate to see if that ends up making things faster or slower. I can say with some confidence that it would definitely make things more readable ;)

For reference, MultiProdUpdate is used to implement Nengo's Reset, Copy and DotInc. It seems to me that dedicated implementations of Reset and Copy would likely be faster than MultiProdUpdate, but I'm sure there are weird OCL reasons why that might not be true.

tbekolay commented on June 12, 2024

Last update before I leave this alone for the day. Here's the test function I'm using that's failing:

import nengo
from nengo.builder.signal import Signal
from nengo.builder.operator import Reset
import numpy as np
import nengo_ocl

def test_reset(rng):
    model = nengo.builder.Model(dt=0.001)
    sig = Signal(rng.rand(2, 3))
    model.add_op(Reset(sig))

    sim = nengo_ocl.Simulator(None, model=model)
    print(sim.signals[sig])
    sim.step()
    print(sim.signals[sig])
    assert np.allclose(sim.signals[sig], 0.0)

Which prints

[[ 0.89384794  0.42370555  0.52896756]
 [ 0.98856562  0.26257017  0.55930668]]
[[ 0.          0.          0.52896756]
 [ 0.98856562  0.26257017  0.55930668]]

The kernel being executed is returned from many_dots_impl. It only makes 2 work items for the signal, which is shape (2, 3). Either it should be making 6 work items (one for each element) or each work item should be going across a whole row, rather than a single item.

Note that if you set the shape to be (5,), it creates five work items.
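To make the symptom concrete, here is a plain-Python emulation of what the output above suggests is happening. It is only a sketch, not the Nengo OCL kernel: `emulate_reset` is a hypothetical helper, the values are rounded from the printout, and it assumes the launch size comes from the first dimension of the shape while each work item writes a single element of the flat buffer.

```python
# Plain-Python emulation of the symptom; this is NOT the Nengo OCL kernel.
# Assumption: the launch size is taken from shape[0], and each work item
# writes exactly one element of the flat (row-major) buffer.

def emulate_reset(flat, n_work_items, value=0.0):
    """Run `n_work_items` emulated work items, each writing one element."""
    out = list(flat)
    for gid in range(n_work_items):  # gid plays the role of get_global_id(0)
        out[gid] = value
    return out

shape = (2, 3)
# Row-major contents of the (2, 3) signal, rounded from the printout above.
flat = [0.89, 0.42, 0.53, 0.99, 0.26, 0.56]

buggy = emulate_reset(flat, n_work_items=shape[0])             # 2 work items
fixed = emulate_reset(flat, n_work_items=shape[0] * shape[1])  # 6 work items

print(buggy)  # only the first two elements are reset
print(fixed)  # every element is reset
```

With two work items only the first two entries of the first row are zeroed, which matches the failing test; with one work item per element the whole signal is reset. For a shape of (5,) both launch sizes coincide, which would explain why vectors work.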

hunse commented on June 12, 2024

So the reason Reset and Copy are put into MultiProdUpdate is that then they can all be done with the same kernel, along with the actual GEMV operations. Currently, Nengo OCL is only set up to have one kernel running at a time, so if we made separate kernels the Copy kernel would have to wait for the Reset kernel, etc. This could result in much lower occupancy than having everything as one kernel. That said, there is a tradeoff: specialized kernels should be at least a bit faster individually, so that could make up in part for the loss in occupancy. The real effect on speed is going to depend on the device and on the model.

Speed questions aside, we obviously have to fix the PES problem somehow. The first thing I would do is add a check in MultiProdUpdate to make sure that no matrices are being passed in, since the kernels aren't built for that. Then, the easiest thing would be just to separate out all the matrix resets, and do them with a separate kernel (I'm sure PyOpenCL has all the functions we need, so we shouldn't need a custom kernel). I think resets never depend on any other ops, so maybe it is better in this case to do them separately from the MultiProdUpdates, all together, at the beginning of the step. I would keep copies as part of MultiProdUpdates for now, especially since that's not relevant for this issue.
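A rough sketch of that plan, in plain Python so it runs without a device. Everything here is hypothetical: `ResetInfo`, `split_resets` and `check_no_matrices` are illustrative names, not part of Nengo OCL, and the real ops carry signals rather than bare shapes. On the device, the matrix resets might be done with something like `pyopencl.array.Array.fill`, though nothing above settles which function would be used.

```python
from dataclasses import dataclass


@dataclass
class ResetInfo:
    """Hypothetical stand-in for a Reset op: a label, a shape and a value."""
    name: str
    shape: tuple
    value: float = 0.0


def split_resets(resets):
    """Separate resets of scalars/vectors from resets of matrices."""
    vector_resets = [op for op in resets if len(op.shape) <= 1]
    matrix_resets = [op for op in resets if len(op.shape) > 1]
    return vector_resets, matrix_resets


def check_no_matrices(ops):
    """Guard for the vector-only path: fail loudly instead of resetting partially."""
    for op in ops:
        if len(op.shape) > 1:
            raise ValueError(f"{op.name}: shape {op.shape} is not a vector")


resets = [ResetInfo("decoded", (5,)), ResetInfo("pes_delta", (2, 3))]
vector_resets, matrix_resets = split_resets(resets)

check_no_matrices(vector_resets)  # fine: only vectors reach the shared kernel
print([op.name for op in matrix_resets])  # handled separately at the start of the step
```

The guard turns a silent partial reset into an immediate error, and the split keeps the vector resets in the shared kernel while the matrix resets are handled on their own.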

tbekolay commented on June 12, 2024

That makes sense! I can definitely see one complicated kernel being faster than a bunch of simple ones.

Your plan makes sense to me. Resets should be independent (we only allow one set op per signal and it's a set). And yeah, copies aren't a problem right now, though I recommend that we test whether copies work with matrices; I started a little test_operators set of tests that I'll get to a PR-worthy state hopefully this weekend, and add that test in there.

hunse commented on June 12, 2024

@tbekolay: Can you put those tests up anyway, so I can use them?

tbekolay commented on June 12, 2024

Done; in the test_ops branch.
