🐛 Describe the bug Looks like it's dispatching to efficient atten

`test_dummy_mha_with_nt_cuda` fails on `sm70`, `sm75`

eqy commented on July 18, 2024

`test_dummy_mha_with_nt_cuda` fails on `sm70`, `sm75`

Comments (4)

drisspg commented on July 18, 2024

This is weird. Is this only happening under some compile context?

jbschlosser commented on July 18, 2024

@drisspg yes, this test covers torch.compile() behavior with NJT + SDPA in a way that emulates what FIRST is doing

drisspg commented on July 18, 2024

Hmm @danthe3rd do you know whether, when max_seq_len > sum(seq_len), it is possible to iterate into bad memory? I think max_seq_len sets a max iteration bound, but there are still checks to ensure that the current token indexes are valid, right?
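
To make the question concrete, here is a minimal pure-Python sketch of the pattern being asked about: a per-sequence loop bounded by `max_seq_len`, with a separate guard on each token index. It is not the actual kernel code, and the function name `visited_token_indices` and the `offsets` layout (cumulative sequence starts) are assumptions for illustration only.

```python
def visited_token_indices(offsets, max_seq_len):
    """Emulate a loop bounded by max_seq_len that also guards each token index.

    offsets: hypothetical cumulative sequence starts, e.g. [0, 2, 5] for
             two sequences of length 2 and 3 packed into 5 tokens.
    """
    visited = []
    for b in range(len(offsets) - 1):
        start, end = offsets[b], offsets[b + 1]
        for i in range(max_seq_len):  # iteration bound only
            token = start + i
            if token >= end:  # validity guard: stop at the end of this sequence
                break
            visited.append(token)
    return visited


offsets = [0, 2, 5]  # sum(seq_len) == 5
indices = visited_token_indices(offsets, max_seq_len=8)  # 8 > sum(seq_len)
```

If the guard holds as in this sketch, an oversized `max_seq_len` only loosens the iteration bound and never touches an index past the packed buffer. Whether the real kernel guards every access this way is exactly what is being asked.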

danthe3rd commented on July 18, 2024

Probably you can remove that check. It does not make sense to have max_seq_len > sum(seq_len) though, as the longest sequence is always bounded by the sum, but it should be supported by the kernel I guess?
This is code I wrote some time ago, so I don't have the context off the top of my head.
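
The "bounded by the sum" remark can be sketched in a few lines of plain Python. The helper names `seq_len_bounds` and `is_tight_max_seq_len` are hypothetical and not part of PyTorch. The sketch only illustrates that, for non-negative lengths, the true longest sequence never exceeds the total, so a declared `max_seq_len` above the sum is a loose upper bound rather than a meaningful length.

```python
def seq_len_bounds(seq_lens):
    """Return (longest, total) for a list of non-negative sequence lengths."""
    longest = max(seq_lens, default=0)
    total = sum(seq_lens)
    return longest, total


def is_tight_max_seq_len(seq_lens, max_seq_len):
    """True if the declared max_seq_len lies between the longest length and the sum."""
    longest, total = seq_len_bounds(seq_lens)
    return longest <= max_seq_len <= total


seq_lens = [2, 3]
longest, total = seq_len_bounds(seq_lens)
```

Here `is_tight_max_seq_len(seq_lens, 8)` is false because 8 exceeds the sum, which is the situation the check in question rejects.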
