stream-whisper's Introduction

Simulated real-time speech transcription with Faster-whisper

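Usage

1. Separate server and client

Suitable when the GPU is in the cloud.

Server

The server receives the audio data sent by the client, runs speech recognition on it, and returns the recognition result to the client.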

git clone https://github.com/ultrasev/stream-whisper
apt -y install libcublas11
cd stream-whisper
pip3 install -r requirements.txt

Notes:

  • libcublas11 is an NVIDIA CUDA Toolkit dependency; install it only if you need to use the CUDA Toolkit.
  • @muzian666 points out that the aioredis package does not yet support Python 3.11; Python 3.8 to 3.10 is recommended.
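The version note above can be turned into an early check. The following is a minimal sketch; the function name is_supported and the warning text are illustrative and not part of the project.

```python
import sys

# Range taken from the note above: Python 3.8 through 3.10.
MIN_VERSION = (3, 8)
MAX_VERSION = (3, 10)


def is_supported(version=None):
    """Return True if (major, minor) falls inside the recommended range."""
    major, minor = (version or sys.version_info)[:2]
    return MIN_VERSION <= (major, minor) <= MAX_VERSION


if __name__ == "__main__":
    if not is_supported():
        print("Warning: Python %d.%d is outside the recommended 3.8 to 3.10 range"
              % sys.version_info[:2])
```

Running this before installing the requirements makes an unsupported interpreter visible before aioredis fails at import time.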

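Change REDIS_SERVER in the .env file to your own Redis address, then run python3 -m src.server to start the server. On the first run the speech recognition model is downloaded from Hugging Face, which takes a while. Access to Hugging Face is restricted by the firewall, so downloads are very slow; using a proxy is recommended.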
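To illustrate how a REDIS_SERVER entry in .env could be picked up, here is a minimal standard-library sketch. The helper names read_env_file and redis_server are hypothetical, and the project may load the file differently (for example through a dotenv package); the format of the address value is not specified in this README.

```python
import os


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Drop surrounding whitespace and optional quotes around the value.
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def redis_server(path=".env"):
    """Prefer the process environment, then fall back to the .env file."""
    value = os.getenv("REDIS_SERVER")
    if not value and os.path.exists(path):
        value = read_env_file(path).get("REDIS_SERVER")
    if not value:
        raise RuntimeError("REDIS_SERVER is not set; edit .env first")
    return value
```

Failing early with a clear message is easier to diagnose than a connection error raised later inside the Redis client.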

Client

The client records audio, sends the audio data to the server, and receives the recognition results that the server returns.

git clone https://github.com/ultrasev/stream-whisper
apt -y install portaudio19-dev
cd stream-whisper
pip3 install -r requirements.txt

Note:

  • portaudio19-dev is a dependency of pyaudio; skip it if it is already installed on your system.

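As with the server, change REDIS_SERVER in the .env file to your own Redis address, then run python3 -m src.client on your local machine to start the client. Before running it, test the microphone and confirm that it records properly.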
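The division of labour between client and server can be pictured as two queues: one carrying audio chunks to the server and one carrying text back. The sketch below is only an illustration under stated assumptions: in-process queue.Queue objects stand in for Redis, fake_transcribe stands in for the speech model, and every name is hypothetical rather than taken from the project's code.

```python
import queue
import threading

audio_q = queue.Queue()   # stand-in for the client -> server channel
text_q = queue.Queue()    # stand-in for the server -> client channel
STOP = None               # sentinel telling the server loop to exit


def fake_transcribe(chunk: bytes) -> str:
    """Placeholder for the speech model: just reports the chunk size."""
    return f"<{len(chunk)} bytes of audio>"


def server_loop():
    """Receive audio chunks, 'recognize' them, and send the text back."""
    while True:
        chunk = audio_q.get()
        if chunk is STOP:
            break
        text_q.put(fake_transcribe(chunk))


def client_send(chunks):
    """Send recorded chunks and collect one result per chunk."""
    for chunk in chunks:
        audio_q.put(chunk)
    return [text_q.get(timeout=5) for _ in chunks]


worker = threading.Thread(target=server_loop, daemon=True)
worker.start()
results = client_send([b"\x00" * 320, b"\x00" * 640])
audio_q.put(STOP)
worker.join()
print(results)  # ['<320 bytes of audio>', '<640 bytes of audio>']
```

With a real broker in the middle, the two loops can run on different machines, which is the point of the split deployment.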

2. Run directly on a local machine

If you have a local GPU, you can run src/local_deploy.py directly, which runs both the server and the client on the local machine.

git clone https://github.com/ultrasev/stream-whisper
apt -y install portaudio19-dev libcublas11
cd stream-whisper
pip3 install -r requirements.txt
python3 src/local_deploy.py

Deploy your own whisper transcription service with one Docker command

docker run -d --name whisper \
    -e MODEL \
    -p 8000:8000 ghcr.io/ultrasev/whisper

The API is compatible with the OpenAI API specification, so it can be called directly with the OpenAI SDK.

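from openai import OpenAI

# The SDK requires an API key (argument or OPENAI_API_KEY); the value below is
# a placeholder, since this README does not say whether the server checks it.
client = OpenAI(base_url="http://localhost:8000", api_key="placeholder")

# Open the audio file in binary mode and send it for transcription.
with open("/path/to/file/audio.mp3", "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )
print(transcription.text)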

stream-whisper's People

Contributors

ultrasev, sniperm99

stream-whisper's Issues

Question about Kaggle

Why is there no Accelerator box on the right for switching to GPU acceleration?

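A small suggestion about runtime behavior

While running the program I noticed that these situations can occur during recording:
1. I have not decided what to say yet, but silence has already been detected, so the program moves on.
2. I start speaking later than silence detection kicks in, so silence is detected first and the program moves on.
It would be better to add a content check after start recording: if no content is detected within xx seconds, then run silence detection. This avoids the recording routine and silence detection being triggered repeatedly.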
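One possible reading of this suggestion, sketched as a small state machine: leading silence is ignored until speech starts (with a timeout), and end-of-utterance silence detection only begins afterwards. This is a hypothetical illustration, not the project's code; the function name, the parameters, and the assumption that each frame arrives with an is_speech flag (for example from a VAD) are all made up for the example.

```python
def record_until_silence(frames, max_wait_frames=100, end_silence_frames=20):
    """Collect frames, ignoring silence that occurs before speech starts.

    frames: iterable of (audio_bytes, is_speech) pairs.
    max_wait_frames: give up if no speech starts within this many frames.
    end_silence_frames: consecutive silent frames that end the utterance.
    Returns the list of kept audio frames (empty if nobody spoke).
    """
    kept = []
    started = False
    waited = 0
    silent_run = 0
    for audio, is_speech in frames:
        if not started:
            if is_speech:
                started = True
                kept.append(audio)
            else:
                waited += 1
                if waited >= max_wait_frames:
                    return []  # timeout: nothing was said
            continue
        # Speech has started: only now does trailing silence count.
        kept.append(audio)
        silent_run = 0 if is_speech else silent_run + 1
        if silent_run >= end_silence_frames:
            break
    return kept
```

With this structure a pause before the first word no longer ends the recording; only silence that follows actual speech does.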

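The code has no .env file

Question 1: I don't know where to find the .env file.

Question 2: You said the Redis address is not required. What is it used for, then, and how can it be replaced?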

Question about running the code

apt -y install portaudio19-dev
How do I run this Linux command in VS Code on Windows? I want to run the client code in VS Code.

Could not load library libcudnn_ops_infer

Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory

ldconfig -p | grep libcudnn
libcudnn_ops.so.9 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_ops.so.9
libcudnn_ops.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_ops.so
libcudnn_heuristic.so.9 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_heuristic.so.9
libcudnn_heuristic.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_heuristic.so
libcudnn_graph.so.9 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_graph.so.9
libcudnn_graph.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_graph.so
libcudnn_engines_runtime_compiled.so.9 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_engines_runtime_compiled.so.9
libcudnn_engines_runtime_compiled.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_engines_runtime_compiled.so
libcudnn_engines_precompiled.so.9 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_engines_precompiled.so.9
libcudnn_engines_precompiled.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_engines_precompiled.so
libcudnn_cnn.so.9 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_cnn.so.9
libcudnn_cnn.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_cnn.so
libcudnn_adv.so.9 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_adv.so.9
libcudnn_adv.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_adv.so
libcudnn.so.9 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn.so.9
libcudnn.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn.so

How do I downgrade this?
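The listing above contains only cuDNN sonames ending in .so.9, while the error asks for libcudnn_ops_infer.so.8. A small sketch that extracts the versioned sonames from ldconfig -p output makes such a mismatch easy to see; the function name cudnn_major_versions is illustrative, and the sketch only diagnoses the mismatch without changing the installation.

```python
import re


def cudnn_major_versions(ldconfig_output: str) -> set:
    """Collect the major versions of libcudnn*.so.N entries in `ldconfig -p` output."""
    pattern = re.compile(r"^\s*libcudnn\w*\.so\.(\d+)\b", re.MULTILINE)
    return {int(m) for m in pattern.findall(ldconfig_output)}


sample = """\
libcudnn_ops.so.9 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_ops.so.9
libcudnn_ops.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_ops.so
libcudnn.so.9 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn.so.9
"""
installed = cudnn_major_versions(sample)
print(installed)        # {9}
print(8 in installed)   # False
```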

RuntimeError: Library libcublas.so.12 is not found or cannot be loaded

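The server always reports this error at runtime.
It seems the server also needs CUDA?
This does not appear to be in requirements.txt.
I tried to install it myself, but I'm not sure I can solve it.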

(py310) root@autodl-container-a62d4480bf-a8c3ae0a:~/autodl-tmp/whisper_osu/stream-whisper# python -m src.server

INFO:root:Model loaded
INFO:faster_whisper:Processing audio with duration 00:17.130
Traceback (most recent call last):
File "/root/miniconda3/envs/py310/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/root/miniconda3/envs/py310/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/root/autodl-tmp/whisper_osu/stream-whisper/src/server.py", line 62, in
asyncio.run(main())
File "/root/miniconda3/envs/py310/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/root/miniconda3/envs/py310/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
return future.result()
File "/root/autodl-tmp/whisper_osu/stream-whisper/src/server.py", line 58, in main
await asyncio.gather(transcribe())
File "/root/autodl-tmp/whisper_osu/stream-whisper/src/server.py", line 51, in transcribe
text, _period = await asyncformer(b_transcribe)
File "/root/autodl-tmp/whisper_osu/stream-whisper/src/utils.py", line 10, in asyncformer
return await loop.run_in_executor(pool, sync_func, *args, **kwargs)
File "/root/miniconda3/envs/py310/lib/python3.10/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/root/autodl-tmp/whisper_osu/stream-whisper/src/server.py", line 26, in b_transcribe
segments, info = model.transcribe("chunk.mp3",
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/faster_whisper/transcribe.py", line 344, in transcribe
encoder_output = self.encode(segment)
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/faster_whisper/transcribe.py", line 767, in encode
return self.model.encode(features, to_cpu=to_cpu)
RuntimeError: Library libcublas.so.12 is not found or cannot be loaded
(py310) root@autodl-container-a62d4480bf-a8c3ae0a:~/autodl-tmp/whisper_osu/stream-whisper#
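The traceback shows the loader failing on libcublas.so.12, whereas the install steps in this README use the libcublas11 package. A quick way to see which version the dynamic loader can actually open, before starting the server, is a ctypes probe. This is a generic sketch; the helper name can_load is hypothetical.

```python
import ctypes


def can_load(soname: str) -> bool:
    """Return True if the dynamic loader can open the given shared library."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        return False
    return True


for name in ("libcublas.so.11", "libcublas.so.12"):
    print(name, "loadable" if can_load(name) else "not found")
```

If only one of the two is loadable, the installed CUDA libraries and the version expected by the installed packages differ.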

Hello, I got the error below while installing webrtcvad. I'm on Windows; how can I solve it?

(base) D:\work>pip install webrtcvad==2.0.10 -i https://pypi.tuna.tsinghua.edu.cn/simple/ --trusted-host pypi.tuna.tsinghua.edu.cn
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple/
Collecting webrtcvad==2.0.10
Using cached https://pypi.tuna.tsinghua.edu.cn/packages/89/34/e2de2d97f3288512b9ea56f92e7452f8207eb5a0096500badf9dfd48f5e6/webrtcvad-2.0.10.tar.gz (66 kB)
Preparing metadata (setup.py) ... done
Building wheels for collected packages: webrtcvad
Building wheel for webrtcvad (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [9 lines of output]
running bdist_wheel
running build
running build_py
creating build
creating build\lib.win-amd64-cpython-39
copying webrtcvad.py -> build\lib.win-amd64-cpython-39
running build_ext
building '_webrtcvad' extension
error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for webrtcvad
Running setup.py clean for webrtcvad
Failed to build webrtcvad
Installing collected packages: webrtcvad
Running setup.py install for webrtcvad ... error
error: subprocess-exited-with-error

× Running setup.py install for webrtcvad did not run successfully.
│ exit code: 1
╰─> [22 lines of output]
running install
C:\Users\JUN\AppData\Roaming\Python\Python39\site-packages\setuptools_distutils\cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
!!

          ********************************************************************************
          Please avoid running ``setup.py`` directly.
          Instead, use pypa/build, pypa/installer or other
          standards-based tools.
 
          See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
          ********************************************************************************
 
  !!
    self.initialize_options()
  running build

Error while installing packages

While installing the libraries, I got this error:
Building wheel for webrtcvad (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [9 lines of output]
running bdist_wheel
running build
running build_py
creating build
creating build\lib.win-amd64-cpython-39
copying webrtcvad.py -> build\lib.win-amd64-cpython-39
running build_ext
building '_webrtcvad' extension
error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for webrtcvad
Running setup.py clean for webrtcvad
Failed to build webrtcvad
ERROR: Could not build wheels for webrtcvad, which is required to install pyproject.toml-based projects

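Python environment dependency and model deletion questions

1. Could not load library cudnn_ops_infer64_8.dll. Error code 126
Please make sure cudnn_ops_infer64_8.dll is in your library path!
The tutorials I found online on adding this dependency are vague; could you give some pointers?

2. I previously downloaded the large v3 model to my local computer. How do I delete it?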

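Error in the README description

As the title says: REDIS_URL in .env, as described in the README, should be renamed to REDIS_SERVER, or os.getenv('REDIS_SERVER') in src/config.py should be changed to os.getenv('REDIS_URL'); otherwise it keeps reporting that the URL cannot be found.

Also, I suggest adding a Python version requirement. The aioredis package still does not support Python 3.11; using Python 3.11 raises duplicate base class TimeoutError. Python 3.9 or 3.10 is recommended.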
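A name mismatch like this one can also be tolerated in code by accepting either variable. The sketch below is hypothetical: redis_address is not a function in the project, and it only illustrates the fallback idea.

```python
import os


def redis_address(environ=os.environ):
    """Accept either variable name so that .env and the code cannot disagree."""
    for name in ("REDIS_SERVER", "REDIS_URL"):
        value = environ.get(name)
        if value:
            return value
    raise KeyError("neither REDIS_SERVER nor REDIS_URL is set")
```

REDIS_SERVER is checked first, so setups that already use that name behave as before.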

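Question about the data transfer protocol

Is audio data the only thing the project transfers? Does transferring data through Redis, as the author mentioned, have any particular advantage? I haven't used it, so I'm not sure. Could other protocols such as sockets or MQTT handle the project's data transfer?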

remove redis dependency in client

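@muzian666 points out that the aioredis package does not yet support Python 3.11; Python 3.8 to 3.10 is recommended.
Change REDIS_SERVER in the .env file to your own Redis address,

Why does the client need Redis?

And Python 3.11 is still not supported.

That makes the client hard to deploy on a Mac.

Please improve this.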

Hello, after copying the requirements file to Colab and running !pip3 install -r /content/requirements.txt, I got the following error. Please help, thank you.

Building wheels for collected packages: PyAudio, webrtcvad, faster-whisper
error: subprocess-exited-with-error

× Building wheel for PyAudio (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
Building wheel for PyAudio (pyproject.toml) ... error
ERROR: Failed building wheel for PyAudio
Building wheel for webrtcvad (setup.py) ... done
Created wheel for webrtcvad: filename=webrtcvad-2.0.10-cp310-cp310-linux_x86_64.whl size=73465 sha256=f69cc565de2fb14351556dbae3035362e38df81be565e7495380f7cd7a117f89
Stored in directory: /root/.cache/pip/wheels/2a/2b/84/ac7bacfe8c68a87c1ee3dd3c66818a54c71599abf308e8eb35
Building wheel for faster-whisper (setup.py) ... done
Created wheel for faster-whisper: filename=faster_whisper-0.10.0-py3-none-any.whl size=1539726 sha256=b9e57aeb0f7a805f9530a5b9e5fbe8208d0633f1329158df96eb8836f3962378
Stored in directory: /root/.cache/pip/wheels/b3/4e/9a/bd36d2645cb73f909a3a65a2e317fec5c6a79c8121ab9eb42f
Successfully built webrtcvad faster-whisper
Failed to build PyAudio
ERROR: Could not build wheels for PyAudio, which is required to install pyproject.toml-based projects

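Client startup problem

After starting the client, the following error appears. Is this because there is no sound card?
image
I checked, and a sound card is indeed present.
image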

Kaggle environment setup and library installation problem

The following error appeared at runtime; it seems to be Library libcublas.so.11 is not found or cannot be loaded

I had previously run sudo apt-get install libcublas-11-0, but it had no effect.

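Problem with the .env environment

When I access the variables in the .env file in VS Code, an error is shown, whereas the same access works in PyCharm.
In VS Code, using the txt format causes no error, but using the env format does.
Is it because VS Code is missing some plugin for env files?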
