Comments (7)
I think I've tracked this down to a bug in Nengo OCL's implementation of the Reset op: it assumes that the value being reset is a vector, when it can be any shape. In the learning rule code, we reset a few 2-dimensional ndarrays. I'll see if I can fix it.
from nengo-ocl.
A bit more detail on this... the way Nengo OCL implements Reset is to see it as a special case of the monolithic MultiProdUpdate op. I know that originally, RaggedArrays and making as many things as possible a MultiProdUpdate was the Big Idea behind how this would be super fast. At this point, though, we've done separate implementations of several ops, so I wonder if it's worth trying to write specific implementations of the other ops that map onto MultiProdUpdate to see if that ends up making things faster or slower. I can say with some confidence that it would definitely make things more readable ;)
For reference, MultiProdUpdate is used to implement Nengo's Reset, Copy and DotInc. It seems to me that dedicated implementations of Reset and Copy would likely be faster than MultiProdUpdate, but I'm sure there are weird OCL reasons why that might not be true.
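To make the "special case" idea concrete, here is a rough pure-Python sketch. It assumes the general update has the form y <- gamma + beta * y + sum_i A_i x_i; that form, the function name multi_prod_update and its parameters are illustrative assumptions, not Nengo OCL's actual code. Under that assumption, Reset, Copy and DotInc differ only in which terms are switched on.

```python
def multi_prod_update(y, beta=1.0, gamma=0.0, products=()):
    """Return gamma + beta * y + sum(A @ x), with vectors stored as lists.

    Hypothetical helper for illustration only. Each entry of `products`
    is a pair (A, x), where A is a scalar or a list of rows.
    """
    out = [gamma + beta * yi for yi in y]
    for A, x in products:
        if isinstance(A, (int, float)):
            # Scalar A: elementwise scaling of x.
            for i, xi in enumerate(x):
                out[i] += A * xi
        else:
            # Matrix A: ordinary matrix-vector product.
            for i, row in enumerate(A):
                out[i] += sum(a * xj for a, xj in zip(row, x))
    return out


y = [1.0, 2.0]

# Reset: drop the old value (beta=0) and write a constant (gamma).
reset = multi_prod_update(y, beta=0.0, gamma=0.0)

# Copy: drop the old value and add 1 * src.
copy = multi_prod_update(y, beta=0.0, products=[(1.0, [5.0, 6.0])])

# DotInc: keep the old value (beta=1) and add A @ x.
dot_inc = multi_prod_update(
    y, beta=1.0, products=[([[1.0, 0.0], [0.0, 2.0]], [3.0, 4.0])]
)
```

Note that everything here is a vector; nothing in this form says what should happen when y is a matrix.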
Last update before I leave this alone for the day. Here's the test function I'm using that's failing:
import nengo
from nengo.builder.signal import Signal
from nengo.builder.operator import Reset
import numpy as np
import nengo_ocl


def test_reset(rng):
    # Build a model containing a single Reset op on a 2-D signal
    model = nengo.builder.Model(dt=0.001)
    sig = Signal(rng.rand(2, 3))
    model.add_op(Reset(sig))

    sim = nengo_ocl.Simulator(None, model=model)
    print(sim.signals[sig])
    sim.step()
    print(sim.signals[sig])
    # After one step, every element should have been reset to zero
    assert np.allclose(sim.signals[sig], 0.0)
Which prints
[[ 0.89384794 0.42370555 0.52896756]
[ 0.98856562 0.26257017 0.55930668]]
[[ 0. 0. 0.52896756]
[ 0.98856562 0.26257017 0.55930668]]
The kernel being executed is returned from many_dots_impl. It only makes 2 work items for the signal, which is shape (2, 3). Either it should be making 6 work items (one for each element) or each work item should be going across a whole row, rather than a single item.
Note that if you set the shape to be (5,), it creates five work items.
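The failure pattern can be reproduced with a small pure-Python emulation. This is only a sketch of the suspected behaviour, not the kernel returned by many_dots_impl: the helper name reset_per_work_item is made up, and it assumes each work item writes exactly one element of the flattened buffer.

```python
def reset_per_work_item(buf, n_items, value=0.0):
    """Emulate a kernel in which work item `gid` writes only buf[gid].

    Hypothetical stand-in for the real kernel, for illustration only.
    """
    out = list(buf)
    for gid in range(n_items):
        out[gid] = value
    return out


shape = (2, 3)
# Row-major flattening of a (2, 3) signal (values rounded from the output above).
flat = [0.89, 0.42, 0.53, 0.99, 0.26, 0.56]

# Launching shape[0] work items only clears the first two elements.
buggy = reset_per_work_item(flat, n_items=shape[0])

# Launching one work item per element clears the whole signal.
fixed = reset_per_work_item(flat, n_items=shape[0] * shape[1])
```

With shape (5,), shape[0] already equals the number of elements, which would explain why the vector case works.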
So the reason Reset and Copy are put into MultiProdUpdate is that then they can all be done with the same kernel, along with the actual GEMV operations. Currently, Nengo OCL is only set up to have one kernel running at a time, so if we made separate kernels the Copy kernel would have to wait for the Reset kernel, etc. This could result in much lower occupancy than having everything as one kernel. That said, there is a tradeoff: specialized kernels should be at least a bit faster individually, so that could make up in part for the loss in occupancy. The real effect on speed is going to depend on the device and on the model.
Speed questions aside, we obviously have to fix the PES problem somehow. The first thing I would do is add a check in MultiProdUpdate to make sure that no matrices are being passed in, since the kernels aren't built for that. Then, the easiest thing would be just to separate out all the matrix resets, and do them with a separate kernel (I'm sure PyOpenCL has all the functions we need, so we shouldn't need a custom kernel). I think resets never depend on any other ops, so maybe it is better in this case to do them separately from the MultiProdUpdates, all together, at the beginning of the step. I would keep copies as part of MultiProdUpdates for now, especially since that's not relevant for this issue.
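As a rough sketch of that plan (all names here are hypothetical, and nothing below touches the real operator classes): partition the resets by dimensionality, send the vector ones to MultiProdUpdate as before, and handle the matrix ones in a separate pass at the start of the step. On the device, the separate pass could presumably use something like a PyOpenCL array fill; that part is an assumption and is not shown.

```python
def split_resets(resets):
    """Partition reset specs into vector resets and matrix resets.

    `resets` is a list of (name, shape, value) tuples; this structure
    is made up for illustration.
    """
    vectors, matrices = [], []
    for name, shape, value in resets:
        (vectors if len(shape) <= 1 else matrices).append((name, shape, value))
    return vectors, matrices


def check_no_matrices(specs):
    """Guard for the vector-only path: refuse anything with ndim > 1."""
    for name, shape, _ in specs:
        if len(shape) > 1:
            raise ValueError(
                "%s has shape %r; only vectors are supported here" % (name, shape)
            )


resets = [
    ("decoded_error", (3,), 0.0),
    ("pes_delta", (2, 3), 0.0),  # the kind of 2-D reset that triggers the bug
]
vector_resets, matrix_resets = split_resets(resets)
check_no_matrices(vector_resets)  # passes: only 1-D signals remain
```

Calling check_no_matrices(resets) on the unsplit list raises a ValueError, which is the kind of early failure the proposed check is meant to give.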
That makes sense! I can definitely see one complicated kernel being faster than a bunch of simple ones.
Your plan makes sense to me. Resets should be independent (we only allow one set op per signal and it's a set). And yeah, copies aren't a problem right now, though I recommend that we test whether copies work with matrices; I started a little test_operators set of tests that I'll hopefully get to a PR-worthy state this weekend, and I'll add that test in there.
@tbekolay: Can you put those tests up anyway, so I can use them?
Done; in the test_ops branch.