Hi there, I'm attempting to run an ONNX model containing a Loop operation. Howeve

UnimplementedOp: Loop about tract HOT 9 CLOSED

sonos commented on September 27, 2024

UnimplementedOp: Loop

from tract.

Comments (9)

Steake commented on September 27, 2024

Any updates or insight on this?

bio43 commented on September 27, 2024

Here's an example ONNX file containing a loop:
loop.zip
(had to ZIP it since GitHub wouldn't allow me to attach a .onnx file directly)

kali commented on September 27, 2024

Ha, Loop is a bit more challenging than the Cast we just dealt with.

And there is a serious issue with it, the output size of the scan output is (at least in some cases) not known at network analysis time.

Implementing a Loop that works on cases where the output size is known (or where the scan outputs can be eliminated because only the final value is actually used) is a bit of work already, stealing/reusing code done for the Scan op, but is feasible. I would not necessarily recommend you jump into it right away... the Scan code is a bit difficult.

But implementing Loop in the general case requires finding a way to deal with variable-size tensors in tract. We actually already have something like that for a specific case (the streaming/pulsing) but it would need to be generalized here.

So, if you want to help, what can we do?
1/ Try to see what modeling patterns lead to the use of Loop. Can it be rewritten in terms of Scan? If yes, we can make the model work by rewriting it, without changing tract.
2/ Figure out
a) if the scan outputs are actually used
b) if the output size can be inferred at network analysis time (like, the loop is actually a trivial for loop in disguise)
3/ Make test cases (I see you sent me something already), preferably with output data, either in a form similar to the onnx test cases or in the form we use in tract (model.onnx + io.npz file with inner tensor names matching the input and output names in the model).
4/ Implement Loop for the easy case.
5/ Decide if/how to deal with the general case.
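
Point 2a above can be checked mechanically. The following is a minimal sketch, not part of tract: the helper name `unused_scan_outputs` and the example node names are hypothetical. It relies only on the ONNX Loop convention that the inputs are the trip count, the condition, then N loop-carried values, and that the outputs are the N final values followed by the scan outputs. It only looks at the top-level graph, so consumers hidden inside nested subgraphs would be missed.

```python
from types import SimpleNamespace


def unused_scan_outputs(nodes, graph_output_names):
    """Map each Loop node name to the scan outputs that nothing consumes.

    `nodes` is any iterable of objects exposing `name`, `op_type`, `input`
    and `output`, which is the shape of ONNX NodeProto objects.
    """
    nodes = list(nodes)
    # A tensor is "used" if it is a graph output or an input of any node.
    consumed = set(graph_output_names)
    for node in nodes:
        consumed.update(node.input)

    report = {}
    for node in nodes:
        if node.op_type != "Loop":
            continue
        # Loop inputs: trip count, condition, then N loop-carried values.
        n_carried = len(node.input) - 2
        # Loop outputs: N final values first, then the scan outputs.
        scan_outputs = list(node.output)[n_carried:]
        report[node.name] = [o for o in scan_outputs if o and o not in consumed]
    return report


# Hypothetical two-node graph: a Loop with one carried value and one scan output.
example_nodes = [
    SimpleNamespace(name="loop0", op_type="Loop",
                    input=["trip_count", "cond", "state_init"],
                    output=["state_final", "scan0"]),
    SimpleNamespace(name="id0", op_type="Identity",
                    input=["state_final"], output=["y"]),
]
report = unused_scan_outputs(example_nodes, ["y"])
print(report)
```

With the `onnx` Python package, the same function should accept `model.graph.node` and `[o.name for o in model.graph.output]` from a model loaded with `onnx.load`. An empty list for a Loop node means all of its scan outputs are consumed somewhere.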

bio43 commented on September 27, 2024

@kali ah, yes, I was afraid this was not a straightforward fix. Thanks for the detailed reply.

1) My model uses a conditional random field, which during decode applies Viterbi: https://github.com/kmkurn/pytorch-crf/blob/master/torchcrf/__init__.py#L283 . Essentially it loops through a (variable-length) sequence, at each time step calculating a value based on the previous time step. I can't really figure out how that would be done without a for-loop. As far as I know, PyTorch doesn't support the Scan operation and can't export it, so that's not an option for me.
2) The output size is very predictable (though it depends on the input sequence length, which is only known at runtime).
3) The test case I sent you is the simplest model using a loop I could think of. Here's the PyTorch code for it:

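import torch

class LoopModel(torch.nn.Module):
    def forward(self, x):
        # Start from three zeros; only the length of x is used, not its values.
        res = torch.zeros(3)
        # The trip count depends on the runtime length of x.
        for i in range(x.size(0)):
            # Concatenating res with itself doubles its length on every iteration.
            res = torch.cat((res, res))
        return res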

It simply takes in a 1-dimensional array of size X and returns a 1-dimensional array of all zeros. Because torch.cat((res, res)) doubles the result on each iteration, the output has size 3 * 2^X.
4) Did the above help clarify if this is "the easy case"? :)
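
To make the Viterbi recurrence described above concrete, here is a minimal pure-Python sketch of a decode loop. It is not the pytorch-crf implementation (which also handles start/end transitions, masking and batching); the function name and the toy scores are assumptions for illustration. The point is that each step's scores depend on the previous step's scores, and the number of steps equals the sequence length.

```python
def viterbi_decode(emissions, transitions):
    """Return the best tag path.

    emissions: list of T lists, each holding one score per tag.
    transitions[i][j]: score of moving from tag i to tag j.
    """
    num_tags = len(emissions[0])
    score = list(emissions[0])  # loop-carried state
    backpointers = []

    # One iteration per remaining time step: the trip count is only
    # known once the sequence length is known.
    for step_scores in emissions[1:]:
        next_score, pointers = [], []
        for j in range(num_tags):
            best_prev = max(range(num_tags),
                            key=lambda i: score[i] + transitions[i][j])
            next_score.append(score[best_prev] + transitions[best_prev][j]
                              + step_scores[j])
            pointers.append(best_prev)
        score = next_score
        backpointers.append(pointers)

    # Walk the backpointers from the best final tag to recover the path.
    path = [max(range(num_tags), key=lambda i: score[i])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


emissions = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
free_switching = [[0.0, 0.0], [0.0, 0.0]]
costly_switching = [[0.0, -5.0], [-5.0, 0.0]]
print(viterbi_decode(emissions, free_switching))
print(viterbi_decode(emissions, costly_switching))
```

The per-step score list has a fixed size (one entry per tag), while the list of backpointers grows with the sequence, which is the part whose size is only known at runtime.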

kali commented on September 27, 2024

So... I think we are in a somewhat "middle" case :)

This actually seems expressible in terms of Scan with a streaming dimension, a use case that tract has good support for, as our use cases revolve around real-time voice.

I think the best way forward is to test this: you can get your model to run if you "fix" it by rewriting the Loop as a Scan in a post-processor. You can do it in Python: open the protobuf model file and adjust the offending node. I am not sure if you need to do this directly with a protobuf API or if some higher-level ONNX manipulation libs may help. Then use tract streaming / pulsing mode (I will help; the streaming mode works well but this part of the API is a proper minefield).

It would confirm your model stays in the category of what we can compute without digging a huge hole in tract, leaving the more generic Loop case out of scope.

The next step would then be to try to have tract detecting the pattern and doing the transformation on its own, so you (or anybody else later on) would no longer have to "fix" their model.

How does that sound?
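
As a rough illustration of why a recurrence over a sequence fits "Scan with a streaming dimension", the sketch below uses plain Python only. It is neither tract's API nor the ONNX Scan operator; `scan`, `running_max_step` and the pulse size are hypothetical names and values. It shows that feeding the sequence in fixed-size pulses while carrying the state across pulses gives the same outputs as scanning the whole sequence at once.

```python
def scan(step, state, inputs):
    """Apply `step` to each input in order, threading the state through."""
    outputs = []
    for x in inputs:
        state, y = step(state, x)
        outputs.append(y)
    return state, outputs


def running_max_step(state, x):
    # The new state depends only on the previous state and the current input.
    new_state = max(state, x)
    return new_state, new_state


sequence = [3, 1, 4, 1, 5, 9, 2, 6]

# Whole sequence in one call.
_, whole = scan(running_max_step, float("-inf"), sequence)

# Same computation in pulses of 3 elements, carrying the state between pulses.
pulse = 3
state, streamed = float("-inf"), []
for start in range(0, len(sequence), pulse):
    state, chunk_out = scan(running_max_step, state, sequence[start:start + pulse])
    streamed.extend(chunk_out)

print(whole)
print(streamed)
```

Each pulse produces exactly as many outputs as it receives inputs, so the per-pulse shapes are fixed even though the total sequence length is not known in advance.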

bio43 commented on September 27, 2024

Hi @kali ,
I looked into this, but the nodes that PyTorch generates for Loop are not easily detectable by a specific pattern (the loop is over an integer, but this integer does not easily map to the length of a tensor). FYI, I'll put this on ice for now and try to find a workaround.

kali commented on September 27, 2024

Closing for now.