Comments (11)

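LauraGPT commented on May 28, 2024

@Langgz I'm working on the following: my local machine reads from the microphone, which captures speech in real time, and the captured audio is sent over websockets to the ASR server. Because our ASR model is offline, I currently use 20 s as the limit: once more than 20 s has accumulated, the audio is passed to the VAD for detection, and if it contains a speech segment, the whole segment is sent to the ASR for decoding. Are there any better suggestions?

In the March release we will publish a streaming VAD to replace webrtcvad, along with a streaming paraformer. You can try again then.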

from funasr.

LauraGPT commented on May 28, 2024

You need to add an external VAD model to filter out silence, so that only valid audio is fed into the ASR model for computation.

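The filtering step described in the comment above can be sketched roughly as follows. This is a minimal illustration, not the FunASR or webrtcvad API: `energy_is_speech` is a hypothetical placeholder that treats any sufficiently loud frame as speech, and the 16-bit mono PCM format and the threshold of 500 are assumptions. An energy threshold cannot tell speech from loud noise, so in practice a real VAD model would be passed in as `is_speech`.

```python
import math
import struct
from typing import Callable, Iterable, List


def frame_rms(frame: bytes) -> float:
    """RMS amplitude of one frame of 16-bit little-endian mono PCM (assumed format)."""
    n = len(frame) // 2
    if n == 0:
        return 0.0
    samples = struct.unpack(f"<{n}h", frame[: n * 2])
    return math.sqrt(sum(s * s for s in samples) / n)


def energy_is_speech(frame: bytes, threshold: float = 500.0) -> bool:
    """Placeholder decision rule (assumption): a loud enough frame counts as speech."""
    return frame_rms(frame) >= threshold


def keep_speech(
    frames: Iterable[bytes],
    is_speech: Callable[[bytes], bool] = energy_is_speech,
) -> List[bytes]:
    """Return only the frames the VAD accepts; these are what the ASR would receive."""
    return [f for f in frames if is_speech(f)]
```

Keeping the decision rule as a plain callable means the gate itself does not change when the placeholder is swapped for a model-based VAD.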

v-yunbin commented on May 28, 2024

If there are background voices, it probably can't filter those out either, right?

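LauraGPT commented on May 28, 2024

Right. But with a VAD in place, if the input is only background voices, the model will transcribe that background speech as text rather than output meaningless text. What you are seeing now is meaningless output caused by environmental noise, and the VAD can filter that out.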

v-yunbin commented on May 28, 2024

Adding a VAD doesn't seem to help.

LauraGPT commented on May 28, 2024

Are you using webrtcvad or our VAD? https://www.modelscope.cn/models?page=1&tasks=voice-activity-detection&type=audio

v-yunbin commented on May 28, 2024

I have used both webrtcvad and Alibaba's VAD.

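v-yunbin commented on May 28, 2024

@Langgz I'm working on the following: my local machine reads from the microphone, which captures speech in real time, and the captured audio is sent over websockets to the ASR server. Because our ASR model is offline, I currently use 20 s as the limit: once more than 20 s has accumulated, the audio is passed to the VAD for detection, and if it contains a speech segment, the whole segment is sent to the ASR for decoding. Are there any better suggestions?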

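The server-side buffering described in this comment might look something like the sketch below. It is one reading of the description, not the poster's actual code: `WindowedRecognizer`, its `vad` and `asr` callables, and the 16 kHz 16-bit mono PCM format are all assumptions made for illustration. Here `vad` is expected to return `(start_byte, end_byte)` pairs for the speech it finds in a window, and each such segment is decoded separately.

```python
from typing import Callable, List, Tuple

SAMPLE_RATE = 16000    # assumed sample rate in Hz
BYTES_PER_SAMPLE = 2   # assumed 16-bit mono PCM
WINDOW_SECONDS = 20    # the 20 s limit mentioned in the comment

Segments = List[Tuple[int, int]]


class WindowedRecognizer:
    """Collects streamed audio and decodes it one fixed-size window at a time."""

    def __init__(
        self,
        vad: Callable[[bytes], Segments],   # hypothetical: byte ranges of speech
        asr: Callable[[bytes], str],        # hypothetical: offline decoder
        window_bytes: int = SAMPLE_RATE * BYTES_PER_SAMPLE * WINDOW_SECONDS,
    ) -> None:
        self._vad = vad
        self._asr = asr
        self._window_bytes = window_bytes
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> List[str]:
        """Add one received chunk; return transcripts for any completed windows."""
        self._buffer.extend(chunk)
        results: List[str] = []
        while len(self._buffer) >= self._window_bytes:
            window = bytes(self._buffer[: self._window_bytes])
            del self._buffer[: self._window_bytes]
            # Windows with no detected speech produce no ASR calls at all.
            for start, end in self._vad(window):
                results.append(self._asr(window[start:end]))
        return results
```

In a websocket server, `feed` would be called once per received binary message. One known weakness of a fixed window is that speech can be cut at the 20 s boundary, which is part of what a streaming VAD would avoid.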

v-yunbin commented on May 28, 2024

Solved. @Langgz

zhuhao528 commented on May 28, 2024

Is there a streaming VAD available now?

MooWeii commented on May 28, 2024

If one becomes available, please let me know.
