Comments (6)

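Z-yq commented on June 8, 2024

Does this problem always occur, or only occasionally with a few particular samples?

indices[1] = [1 , 776, 22] does not index into shape [4, 776, 23]
From this, it looks more like the network output no longer matches the batch of the original data.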

from tensorflowasr.

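phecda-xu commented on June 8, 2024

> Does this problem always occur, or only occasionally with a few particular samples?
>
> indices[1] = [1 , 776, 22] does not index into shape [4, 776, 23]
> From this, it looks more like the network output no longer matches the batch of the original data.

I think it always occurs. I have tried many times and it fails every time, and the data shape at the point of failure is different each time, so it cannot be just a few occasional samples. (The same data trains the network you provide without any problem.)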

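As for your remark that the network output does not seem to match the batch of the original data: what you are seeing is actually the internal computation of tf.scatter_nd.
It first builds indices =
[[0 397 12]
[1 776 22]
[2 377 12]
[3 776 22]]

[1 , 776, 22] is one row of indices. When tf.scatter_nd runs, it uses the index [1 , 776, 22] to write a 0 value into a tensor. The time index should be 775 (776 - 1), but the generated value is 776 (777 - 1), which is out of bounds. (With mel_layer it is 777 - 1; without it, 776 - 1.)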
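
To make the out-of-bounds row concrete, here is a minimal pure-Python sketch of the bounds check that the error message describes. It does not call TensorFlow; `find_out_of_bounds` is a hypothetical helper written only for illustration, and the indices and shape are the ones quoted in this thread.

```python
from typing import List, Sequence, Tuple


def find_out_of_bounds(
    indices: Sequence[Sequence[int]], shape: Sequence[int]
) -> List[Tuple[int, List[int]]]:
    """Return (row, index) pairs whose index does not fit inside `shape`.

    Hypothetical helper: it only mirrors the kind of check that the
    reported error describes, it is not TensorFlow's implementation.
    """
    bad = []
    for row, index in enumerate(indices):
        # A coordinate is valid only if 0 <= i < dim for every axis.
        if any(i < 0 or i >= dim for i, dim in zip(index, shape)):
            bad.append((row, list(index)))
    return bad


# Values taken from the thread: the time axis has size 776,
# so the largest valid time index is 775.
indices = [[0, 397, 12], [1, 776, 22], [2, 377, 12], [3, 776, 22]]
shape = [4, 776, 23]

bad_rows = find_out_of_bounds(indices, shape)
for row, index in bad_rows:
    print(f"indices[{row}] = {index} does not index into shape {shape}")
```

Rows 1 and 3 both use time index 776, which is one past the end of an axis of size 776. Only row 1 appears in the reported error, presumably because the first offending row is the one that gets reported.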

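I have not yet worked through the full rnnt-loss computation, so I do not understand the purpose of this step, but I am confident that this is where the error occurs.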

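I also suspect that any network without frame reduction (a pure RNN, for example) will hit this. My TDNN takes N frames of acoustic features and outputs N frames of high-dimensional features, so the frame count is unchanged, and with mel_layer it goes out of bounds by exactly one frame. Networks that do reduce the frame count (CRNN, CNN, and so on) probably will not hit this problem, because the number of high-dimensional feature frames is always lower than the number of acoustic feature frames, so the one extra frame from mel_layer does no harm.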
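
The frame-count argument above can be sketched as plain arithmetic. This is only an illustration under loud assumptions: both the encoder output length and the declared length are taken to be floor-divided by the reduction factor, which the thread does not confirm, and `declared_last_index` and `overflows` are hypothetical names.

```python
def declared_last_index(declared_frames: int, reduction_factor: int) -> int:
    """Last time index implied by the declared length (assumed floor division)."""
    return declared_frames // reduction_factor - 1


def overflows(actual_frames: int, declared_frames: int, reduction_factor: int) -> bool:
    """True if the declared last index falls outside the encoder output.

    Assumption: the encoder emits actual_frames // reduction_factor frames.
    """
    output_frames = actual_frames // reduction_factor
    return declared_last_index(declared_frames, reduction_factor) >= output_frames


# Thread values: 776 real frames, 777 declared when mel_layer is used.
no_reduction = overflows(776, 777, 1)    # last index 776 vs 776 output frames
with_reduction = overflows(776, 777, 4)  # last index 193 vs 194 output frames

# Hypothetical lengths, chosen to show the extra frame crossing a multiple of 4.
edge_case = overflows(779, 780, 4)       # last index 194 vs 194 output frames

print(no_reduction, with_reduction, edge_case)
```

Under these assumptions the unreduced case overflows and the reduced case does not, which matches the suspicion above. The last line is a caveat: with floor division, an extra frame can still overflow when it pushes the declared length across a multiple of the reduction factor, so "probably will not" is the right strength for the claim.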

Z-yq commented on June 8, 2024

Did you reset self.time_reduction_factor in the model?
And is it consistent with the value in am_data.yml?

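phecda-xu commented on June 8, 2024

> Did you reset self.time_reduction_factor in the model?
> And is it consistent with the value in am_data.yml?

Yes, both are set to 1. In fact, the network does not use this parameter internally.
I have uploaded my network and config, tdnn, to my fork of the repository, so you can try it.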

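Z-yq commented on June 8, 2024

This is very interesting. On my side, the error is caused by the dilation in the convolutions; once I remove the dilation, everything works.
I do not have time to track this down for now.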

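phecda-xu commented on June 8, 2024

> This is very interesting. On my side, the error is caused by the dilation in the convolutions; once I remove the dilation, everything works.
> I do not have time to track this down for now.

Oh right, I forgot to mention: this requires tensorflow 2.3 or later. Dilated convolution has problems in 2.2 and earlier. Let's leave it here for now and come back to it when you have time. Thanks!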
