
Comments (11)

tz301 commented on September 21, 2024

@tz301 Thanks, will update the documents.

@pkufool I have reproduced the results, excellent!

One piece of advice: the valid loss is missing. The data per epoch is so small that the valid loss computation is skipped. In my reproduction, I changed the valid log interval to 1000, and the valid loss now shows up on TensorBoard (see below). The CER is nearly the same (4.24%) as on master (4.26%).
[image: TensorBoard plot of the valid loss]
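
A minimal sketch of why the valid loss can disappear, assuming the training loop triggers validation only when the per-epoch batch index reaches a multiple of the interval. The function name, the check itself, and all numbers except the interval of 1000 are hypothetical illustrations, not taken from the icefall code.

```python
def validation_steps(batches_per_epoch, valid_interval, num_epochs=1):
    """Return the (epoch, batch_idx) pairs at which validation would run.

    Assumption: validation is triggered by the batch index within the
    current epoch, so the counter restarts at 0 every epoch.
    """
    steps = []
    for epoch in range(num_epochs):
        for batch_idx in range(batches_per_epoch):
            if batch_idx > 0 and batch_idx % valid_interval == 0:
                steps.append((epoch, batch_idx))
    return steps


# Hypothetical epoch size of 2500 batches.
# An interval larger than the epoch never fires, so no valid loss is logged.
never = validation_steps(batches_per_epoch=2500, valid_interval=3000, num_epochs=2)

# An interval of 1000 fires twice per epoch.
twice = validation_steps(batches_per_epoch=2500, valid_interval=1000, num_epochs=2)

print(len(never), len(twice))
```

Under this assumption, any interval smaller than the number of batches per epoch is enough to get a valid loss curve.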

from icefall.

csukuangfj commented on September 21, 2024

There is an ongoing PR for the AISHELL dataset. Will be ready soon, I think. @pkufool

pingfengluo commented on September 21, 2024

there is a PR #123

liyongze commented on September 21, 2024

there is a PR #123

thanks~

liyongze commented on September 21, 2024

There is an ongoing PR for the AISHELL dataset. Will be ready soon, I think. @pkufool

thanks~

csukuangfj commented on September 21, 2024

It has been merged. You can find it at https://github.com/k2-fsa/icefall/tree/master/egs

tz301 commented on September 21, 2024

That's very cool work.

When I read the aishell document and reproduced it, I found that my att loss has an obvious mismatch (much higher) with that in the tutorial docs (the ctc loss is OK). I found that this was caused by the change to the LabelSmoothing algorithm in #109. Maybe @pkufool can help update the docs? I was a little confused at first glance, so other new users may be too.

Also, it would be great to have some scripts that handle both Chinese and English. Maybe combine BPE for English and chars for Chinese? Chinese mixed with English (code-switching) is so common today!

csukuangfj commented on September 21, 2024

When I read the aishell document and reproduced it, I found that my att loss has an obvious mismatch (much higher) with that in the tutorial docs (the ctc loss is OK)

Please see #107

Before the change, the label smoothing loss had an additional term

q * log(q)

which is negative as q is in the range (0, 1); therefore the loss before the change is smaller, but it should not affect your final WER.
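
To make the effect of that term concrete, here is a small sketch in plain Python. It assumes a smoothed target distribution q that puts 1 - eps on the true class and spreads eps evenly over the others; that smoothing scheme, the function names, and the example logits are illustrative assumptions and not necessarily what icefall implements. The version with the extra term is sum(q * log(q)) - sum(q * log(p)), and the version without it is -sum(q * log(p)).

```python
import math


def smoothed_targets(num_classes, target, eps):
    """Label-smoothed target distribution q (assumed scheme, see lead-in)."""
    off_value = eps / (num_classes - 1)
    return [1.0 - eps if i == target else off_value for i in range(num_classes)]


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum_exp for x in logits]


def loss_without_term(q, log_p):
    """-sum(q * log(p)): the loss without the q * log(q) term."""
    return -sum(qi * lpi for qi, lpi in zip(q, log_p))


def loss_with_term(q, log_p):
    """sum(q * log(q)) - sum(q * log(p)): the loss with the extra term."""
    return sum(qi * (math.log(qi) - lpi) for qi, lpi in zip(q, log_p) if qi > 0)


logits = [2.0, 0.5, -1.0, 0.1]  # hypothetical model output for one token
q = smoothed_targets(num_classes=4, target=0, eps=0.1)
log_p = log_softmax(logits)

extra_term = sum(qi * math.log(qi) for qi in q)  # negative for q in (0, 1)
new_loss = loss_without_term(q, log_p)
old_loss = loss_with_term(q, log_p)

print(f"extra term: {extra_term:.4f}")
print(f"with term: {old_loss:.4f}  without term: {new_loss:.4f}")
```

The extra term depends only on q, not on the model output, so the two losses differ by a constant offset and have the same gradients. That is consistent with the reported loss values changing while the final WER does not.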

pkufool commented on September 21, 2024

@tz301 Thanks, will update the documents.

pkufool commented on September 21, 2024

Also, it would be great to have some scripts that handle both Chinese and English. Maybe combine BPE for English and chars for Chinese? Chinese mixed with English (code-switching) is so common today!

Thanks, it is a good idea; we will consider adding a code-switch recipe. You are very welcome to contribute this recipe as well, if you have time.
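
As a rough illustration of the "char for Chinese, BPE for English" idea, the sketch below splits mixed text into single Chinese characters and English words. It is only a toy under stated assumptions: the function name and regex are hypothetical, only the basic CJK Unified Ideographs block is matched, and `english_tokenizer` is a placeholder hook where a trained BPE model would be plugged in. No such recipe exists in this thread.

```python
import re

# One CJK ideograph (basic block only), or a run of ASCII letters/digits/apostrophes.
_TOKEN_RE = re.compile(r"[\u4e00-\u9fff]|[A-Za-z0-9']+")


def tokenize_code_switch(text, english_tokenizer=None):
    """Split mixed Chinese/English text into modeling units.

    Chinese is emitted one character per token. English words are passed to
    `english_tokenizer` (a hypothetical callable returning a list of subword
    strings); when it is None, the lowercased word is kept whole.
    """
    tokens = []
    for piece in _TOKEN_RE.findall(text):
        if "\u4e00" <= piece[0] <= "\u9fff":
            tokens.append(piece)  # a single Chinese character
        elif english_tokenizer is not None:
            tokens.extend(english_tokenizer(piece))
        else:
            tokens.append(piece.lower())
    return tokens


print(tokenize_code_switch("我想听 Taylor Swift 的歌"))
```

Punctuation and anything outside the two patterns is silently dropped here; a real recipe would need an explicit policy for those symbols.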
