ultrasev / stream-whisper

A pseudo-real-time speech transcription service based on faster-whisper

Python 98.77% Dockerfile 1.23%

stream-whisper's Introduction

Simulating real-time speech transcription with faster-whisper

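Usage

1. Separate server and client

Suitable when the GPU is in the cloud.

Server

Receives the audio data sent by the client, runs speech recognition, and returns the recognition results to the client.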

git clone https://github.com/ultrasev/stream-whisper
apt -y install libcublas11
cd stream-whisper
pip3 install -r requirements.txt

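Notes:

libcublas11 is the cuBLAS library from the NVIDIA CUDA Toolkit; install it if you want to run on a GPU through CUDA.
As pointed out by @muzian666, the aioredis package does not yet support Python 3.11; Python 3.8 to 3.10 is recommended.

Change REDIS_SERVER in the .env file to your own Redis address, then run python3 -m src.server to start the server. On the first run, the speech recognition model is downloaded from Hugging Face, which takes a while. Access to Hugging Face is restricted by the firewall and downloads are slow, so using a proxy is recommended.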
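
As a rough illustration of the configuration step above, the sketch below reads a plain KEY=VALUE .env text and resolves the Redis address. The helper names parse_env_text and resolve_redis_server are hypothetical and not part of the repository, and the fallback to REDIS_URL is an assumption for .env files that use that variable name instead.

```python
def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_redis_server(env):
    """Return the Redis address, preferring REDIS_SERVER over REDIS_URL."""
    for key in ("REDIS_SERVER", "REDIS_URL"):  # REDIS_URL fallback is an assumption
        if env.get(key):
            return env[key]
    raise KeyError("set REDIS_SERVER in .env")


if __name__ == "__main__":
    # Hypothetical .env contents, for illustration only.
    sample = "# local settings\nREDIS_SERVER=redis://127.0.0.1:6379/0\n"
    print(resolve_redis_server(parse_env_text(sample)))
```

Failing early with a clear message when neither variable is set is usually easier to debug than a connection error later on.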

Client

Records audio, sends the audio data to the server, and receives the recognition results that the server returns.

git clone https://github.com/ultrasev/stream-whisper
apt -y install portaudio19-dev
cd stream-whisper
pip3 install -r requirements.txt

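Notes:

portaudio19-dev is a dependency of pyaudio; skip it if it is already installed on your system.

Again, change REDIS_SERVER in the .env file to your own Redis address, then run python3 -m src.client on your local machine to start the client. Before running, test that your microphone works and can record properly.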
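
One way to test the microphone before starting the client is to record a couple of seconds and look at the signal level. The sketch below is an assumption about how such a check could look, not code from the repository: record uses the PyAudio calls for a mono 16-bit stream, and rms_int16 computes the root-mean-square level of the captured samples. The sample rate, chunk size and function names are illustrative.

```python
import array
import math


def rms_int16(pcm_bytes):
    """Root-mean-square level of 16-bit PCM samples in native byte order."""
    samples = array.array("h")
    # Ignore a trailing odd byte so the buffer is a whole number of samples.
    samples.frombytes(pcm_bytes[: len(pcm_bytes) // 2 * 2])
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def record(seconds=2, rate=16000, chunk=1024):
    """Record mono 16-bit audio from the default input device via PyAudio."""
    import pyaudio  # imported lazily so rms_int16 works without PyAudio

    pa = pyaudio.PyAudio()
    stream = pa.open(format=pyaudio.paInt16, channels=1, rate=rate,
                     input=True, frames_per_buffer=chunk)
    try:
        frames = [stream.read(chunk) for _ in range(int(rate / chunk * seconds))]
    finally:
        stream.stop_stream()
        stream.close()
        pa.terminate()
    return b"".join(frames)
```

Calling rms_int16(record()) while speaking should give a clearly higher value than in a quiet room; a value near zero in both cases suggests the wrong input device or a muted microphone.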

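2. Run locally

If you have a local GPU, you can run src/local_deploy.py directly, which runs both the server and the client on the local machine.

git clone https://github.com/ultrasev/stream-whisper
apt -y install portaudio19-dev libcublas11
cd stream-whisper
pip3 install -r requirements.txt
python3 src/local_deploy.py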

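Deploy your own whisper transcription service with Docker in one command

docker run -d --name whisper \
    -e MODEL \
    -p 8000:8000 ghcr.io/ultrasev/whisper

The API is compatible with the OpenAI API specification, so you can call it directly with the OpenAI SDK.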

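from openai import OpenAI

# Point the SDK at the local whisper service instead of api.openai.com.
# The SDK also requires an API key: set OPENAI_API_KEY in the environment
# (any placeholder value, assuming the local service does not validate it).
client = OpenAI(base_url="http://localhost:8000")

# Open the audio file in binary mode and close it once the request finishes.
with open("/path/to/file/audio.mp3", "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )
print(transcription.text)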

stream-whisper's Issues

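Could you describe the local deployment process in detail?

I'm on Windows 11 with a 4090 GPU and installed Linux through WSL. The commands in the usage instructions run into all kinds of problems. Could you describe the process in more detail?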

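A small suggestion about runtime behavior

While running the program, I noticed the following situations during audio recording:
1. I haven't decided what to say yet, but silence has already been detected, so the program moves on.
2. I start speaking later than silence detection kicks in, so silence is detected first and the program moves on.
It would be better to add a step after "start recording" that checks for speech content, and to run silence detection only if no content is detected within xx seconds. This avoids the recording routine and silence detection running redundantly.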
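
One reading of this suggestion is a small two-phase state machine over per-frame voice-activity flags: first wait for speech to start (with a timeout), and only then count trailing silence. The sketch below assumes the flags come from a voice-activity detector such as webrtcvad; the function name and both thresholds are hypothetical, not values from the repository.

```python
def segment_end(voiced_flags, max_wait_frames=100, silence_frames=20):
    """Return the frame index at which recording should stop, or None.

    voiced_flags: iterable of booleans, one per audio frame (True = speech).
    Phase 1: wait up to max_wait_frames for the first voiced frame.
    Phase 2: once speech has started, stop after silence_frames
             consecutive unvoiced frames.
    """
    started = False
    silent_run = 0
    for i, voiced in enumerate(voiced_flags):
        if not started:
            if voiced:
                started = True
            elif i + 1 >= max_wait_frames:
                return i  # nobody spoke within the waiting window
            continue
        # Speech has started: reset the counter on speech, extend it on silence.
        silent_run = 0 if voiced else silent_run + 1
        if silent_run >= silence_frames:
            return i
    return None  # not enough frames yet to decide
```

With this split, a pause before the first word no longer ends the segment, because silence is only counted after speech has been observed.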

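There is no .env file in the code

Question 1: I don't know where to find the .env file.

Question 2: You said the Redis address is not required. Then what is it used for, and how can it be replaced?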

Question about running the code

apt -y install portaudio19-dev
How do I run this Linux command in VS Code on Windows? I want to run the client code in VS Code.

Could not load library libcudnn_ops_infer

Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory

ldconfig -p | grep libcudnn
libcudnn_ops.so.9 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_ops.so.9
libcudnn_ops.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_ops.so
libcudnn_heuristic.so.9 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_heuristic.so.9
libcudnn_heuristic.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_heuristic.so
libcudnn_graph.so.9 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_graph.so.9
libcudnn_graph.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_graph.so
libcudnn_engines_runtime_compiled.so.9 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_engines_runtime_compiled.so.9
libcudnn_engines_runtime_compiled.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_engines_runtime_compiled.so
libcudnn_engines_precompiled.so.9 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_engines_precompiled.so.9
libcudnn_engines_precompiled.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_engines_precompiled.so
libcudnn_cnn.so.9 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_cnn.so.9
libcudnn_cnn.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_cnn.so
libcudnn_adv.so.9 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_adv.so.9
libcudnn_adv.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn_adv.so
libcudnn.so.9 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn.so.9
libcudnn.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libcudnn.so

How do I downgrade this?
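
The listing above contains only cuDNN 9 libraries (libcudnn_ops.so.9 and so on), while the error asks for libcudnn_ops_infer.so.8, which belongs to cuDNN 8. Before and after changing the installation, a quick probe with ctypes shows which of the two the dynamic loader can actually open. This is a diagnostic sketch only; the function name loadable is hypothetical.

```python
import ctypes


def loadable(names):
    """Map each shared-library name to True if the dynamic loader can open it."""
    result = {}
    for name in names:
        try:
            ctypes.CDLL(name)
            result[name] = True
        except OSError:
            # Raised when the library is missing or one of its dependencies is.
            result[name] = False
    return result


if __name__ == "__main__":
    wanted = ["libcudnn_ops_infer.so.8", "libcudnn_ops.so.9"]
    for name, ok in loadable(wanted).items():
        print(name, "found" if ok else "missing")
```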

RuntimeError: Library libcublas.so.12 is not found or cannot be loaded

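The server always reports this error at runtime.
Does the server also need CUDA?
It doesn't seem to be listed in requirements.txt.
I tried installing it myself, but I'm not sure I can solve this.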

(py310) root@autodl-container-a62d4480bf-a8c3ae0a:~/autodl-tmp/whisper_osu/stream-whisper# python -m src.server

INFO:root:Model loaded
INFO:faster_whisper:Processing audio with duration 00:17.130
Traceback (most recent call last):
File "/root/miniconda3/envs/py310/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/root/miniconda3/envs/py310/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/root/autodl-tmp/whisper_osu/stream-whisper/src/server.py", line 62, in <module>
asyncio.run(main())
File "/root/miniconda3/envs/py310/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/root/miniconda3/envs/py310/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
return future.result()
File "/root/autodl-tmp/whisper_osu/stream-whisper/src/server.py", line 58, in main
await asyncio.gather(transcribe())
File "/root/autodl-tmp/whisper_osu/stream-whisper/src/server.py", line 51, in transcribe
text, _period = await asyncformer(b_transcribe)
File "/root/autodl-tmp/whisper_osu/stream-whisper/src/utils.py", line 10, in asyncformer
return await loop.run_in_executor(pool, sync_func, *args, **kwargs)
File "/root/miniconda3/envs/py310/lib/python3.10/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/root/autodl-tmp/whisper_osu/stream-whisper/src/server.py", line 26, in b_transcribe
segments, info = model.transcribe("chunk.mp3",
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/faster_whisper/transcribe.py", line 344, in transcribe
encoder_output = self.encode(segment)
File "/root/miniconda3/envs/py310/lib/python3.10/site-packages/faster_whisper/transcribe.py", line 767, in encode
return self.model.encode(features, to_cpu=to_cpu)
RuntimeError: Library libcublas.so.12 is not found or cannot be loaded
(py310) root@autodl-container-a62d4480bf-a8c3ae0a:~/autodl-tmp/whisper_osu/stream-whisper#
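
The traceback shows the installed faster-whisper backend looking for libcublas.so.12, which suggests it was built against CUDA 12 rather than the CUDA 11 library installed by the README's libcublas11 step. Until the matching library is available, one workaround is to fall back to the CPU. The sketch below is an assumption, not repository code: pick_device is a hypothetical helper, and the compute types are common choices for faster-whisper rather than values taken from the project.

```python
import ctypes


def pick_device(probe=ctypes.CDLL, library="libcublas.so.12"):
    """Return (device, compute_type), falling back to CPU if cuBLAS is missing.

    probe is injectable so the decision logic can be exercised without a GPU.
    """
    try:
        probe(library)
    except OSError:
        return "cpu", "int8"
    return "cuda", "float16"


# Intended use (requires faster-whisper; the model name is only an example):
#   from faster_whisper import WhisperModel
#   device, compute_type = pick_device()
#   model = WhisperModel("large-v3", device=device, compute_type=compute_type)
```

Running on the CPU is much slower, so this is mainly useful for confirming that the rest of the pipeline works while the CUDA libraries are being sorted out.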

Hello, I got the error below while installing webrtcvad. I'm on Windows. How can I fix it?

(base) D:\work>pip install webrtcvad==2.0.10 -i https://pypi.tuna.tsinghua.edu.cn/simple/ --trusted-host pypi.tuna.tsinghua.edu.cn
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple/
Collecting webrtcvad==2.0.10
Using cached https://pypi.tuna.tsinghua.edu.cn/packages/89/34/e2de2d97f3288512b9ea56f92e7452f8207eb5a0096500badf9dfd48f5e6/webrtcvad-2.0.10.tar.gz (66 kB)
Preparing metadata (setup.py) ... done
Building wheels for collected packages: webrtcvad
Building wheel for webrtcvad (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [9 lines of output]
running bdist_wheel
running build
running build_py
creating build
creating build\lib.win-amd64-cpython-39
copying webrtcvad.py -> build\lib.win-amd64-cpython-39
running build_ext
building '_webrtcvad' extension
error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for webrtcvad
Running setup.py clean for webrtcvad
Failed to build webrtcvad
Installing collected packages: webrtcvad
Running setup.py install for webrtcvad ... error
error: subprocess-exited-with-error

× Running setup.py install for webrtcvad did not run successfully.
│ exit code: 1
╰─> [22 lines of output]
running install
C:\Users\JUN\AppData\Roaming\Python\Python39\site-packages\setuptools_distutils\cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
!!

          ********************************************************************************
          Please avoid running ``setup.py`` directly.
          Instead, use pypa/build, pypa/installer or other
          standards-based tools.
 
          See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
          ********************************************************************************
 
  !!
    self.initialize_options()
  running build

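Hello, could the client be exposed as an HTTP GET or POST request so that MCUs such as the ESP32 or STM32 can call it? Wouldn't that be more interesting? Response speed would improve and the load on the MCU would be reduced. What do you think?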

Error while downloading packages

While downloading the libraries, the following error occurred:
Building wheel for webrtcvad (setup.py) ... error
error: subprocess-exited-with-error

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for webrtcvad
Running setup.py clean for webrtcvad
Failed to build webrtcvad
ERROR: Could not build wheels for webrtcvad, which is required to install pyproject.toml-based projects

Program exits abnormally

The program crashes at runtime. My CUDA version is 11.7.

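Python environment dependency issue and how to delete a model

1. Could not load library cudnn_ops_infer64_8.dll. Error code 126
Please make sure cudnn_ops_infer64_8.dll is in your library path!
The tutorials I found online on adding this dependency are vague. Could you give some pointers?

2. I previously downloaded the large-v3 model to my local machine. How do I delete it?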
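
For the second question, models fetched from Hugging Face are normally stored in the hub cache, by default under ~/.cache/huggingface/hub in directories named models--<owner>--<name>. The sketch below lists and removes the directories whose name contains a keyword. It assumes the default cache layout (the HF_HUB_CACHE variable or a custom download directory would change the location), and the function names are hypothetical.

```python
import os
import shutil
from pathlib import Path


def hub_cache_dir():
    """Default Hugging Face hub cache, honouring HF_HOME if it is set."""
    hf_home = os.environ.get("HF_HOME")
    base = Path(hf_home) if hf_home else Path.home() / ".cache" / "huggingface"
    return base / "hub"


def find_model_dirs(cache_dir, keyword="large-v3"):
    """List cached model directories whose name contains the keyword."""
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return []
    return sorted(p for p in cache_dir.glob("models--*")
                  if p.is_dir() and keyword in p.name)


def delete_model_dirs(cache_dir, keyword="large-v3"):
    """Delete matching directories and return the paths that were removed."""
    removed = find_model_dirs(cache_dir, keyword)
    for path in removed:
        shutil.rmtree(path)
    return removed
```

Calling find_model_dirs(hub_cache_dir()) first shows what would be deleted before delete_model_dirs is run.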

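Does running locally require a network connection?

When running locally, the model is downloaded automatically on the first run, but afterwards the program cannot run without a network connection.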
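
Once the model is in the local cache, it can usually be loaded without contacting the Hugging Face Hub. The sketch below builds keyword arguments for faster_whisper.WhisperModel with local_files_only enabled and also sets the HF_HUB_OFFLINE environment variable. The helper name offline_kwargs is hypothetical, and whether this resolves the issue for this project's scripts is an assumption.

```python
import os


def offline_kwargs(model="large-v3", offline=True):
    """Build keyword arguments for faster_whisper.WhisperModel.

    With offline=True the model is resolved from the local cache only,
    so no download from the Hugging Face Hub is attempted.
    """
    if offline:
        # huggingface_hub reads this at import time, so call this helper
        # before importing faster_whisper for the variable to take effect.
        os.environ["HF_HUB_OFFLINE"] = "1"
    return {"model_size_or_path": model, "local_files_only": offline}


# Intended use (requires faster-whisper and a previously downloaded model):
#   kwargs = offline_kwargs()
#   from faster_whisper import WhisperModel
#   model = WhisperModel(**kwargs)
```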

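Error in the README description

As the title says: REDIS_URL in .env should be renamed to REDIS_SERVER as the README states, or os.getenv('REDIS_SERVER') in src/config.py should be changed to os.getenv('REDIS_URL'); otherwise the program keeps reporting that the URL cannot be found.

Also, I suggest adding a Python version requirement. The aioredis package does not yet support Python 3.11; using Python 3.11 triggers "duplicate base class TimeoutError". Python 3.9 or 3.10 is recommended.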
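
A version requirement like this could be enforced with a short check at startup. The sketch below is a suggestion, not repository code: the inclusive range 3.8 to 3.10 is taken from the README note, and python_supported is a hypothetical name.

```python
import sys

SUPPORTED = ((3, 8), (3, 10))  # inclusive range suggested in the README


def python_supported(version=None):
    """True if the (major, minor) version lies within the suggested range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return SUPPORTED[0] <= major_minor <= SUPPORTED[1]


if __name__ == "__main__":
    if not python_supported():
        print("aioredis may fail to import on this Python; use 3.8 to 3.10")
```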

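Question about the data transport protocol

Is audio data the only thing this project transmits? Does transferring data through Redis, as the author mentions, have any particular advantage? I haven't used it and am not familiar with it. Could other protocols, such as sockets or MQTT, handle the data transfer in this project?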
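
To make the comparison concrete: a raw TCP socket is a byte stream, so sending audio chunks over it requires explicit message framing, whereas Redis already delivers discrete messages. The sketch below shows length-prefixed framing over a socket. It is an illustration of the alternative being asked about, not how the project transfers data, and all function names are hypothetical.

```python
import socket
import struct


def send_chunk(sock, payload):
    """Send one audio chunk prefixed with its length (4-byte big-endian)."""
    sock.sendall(struct.pack(">I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, raising if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        part = sock.recv(n - len(buf))
        if not part:
            raise ConnectionError("socket closed mid-message")
        buf.extend(part)
    return bytes(buf)


def recv_chunk(sock):
    """Receive one length-prefixed chunk written by send_chunk."""
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return recv_exact(sock, length)


if __name__ == "__main__":
    # A connected socket pair stands in for client and server.
    client, server = socket.socketpair()
    send_chunk(client, b"\x00\x01" * 160)
    print(len(recv_chunk(server)), "bytes received")
    client.close()
    server.close()
```

A direct socket removes the dependency on a Redis server, at the cost of handling framing, reconnection and buffering in the application itself.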

remove redis dependency in client

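As pointed out by @muzian666, the aioredis package does not yet support Python 3.11; Python 3.8 to 3.10 is recommended.
Change REDIS_SERVER in the .env file to your own Redis address,

Why does the client need Redis?

And Python 3.11 still isn't supported.

This makes the client hard to deploy on a Mac.

Please improve this.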

Hello, after copying the requirements file to Colab, I ran !pip3 install -r /content/requirements.txt and got the following error. Please help, thank you.

Building wheels for collected packages: PyAudio, webrtcvad, faster-whisper
error: subprocess-exited-with-error

× Building wheel for PyAudio (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
Building wheel for PyAudio (pyproject.toml) ... error
ERROR: Failed building wheel for PyAudio
Building wheel for webrtcvad (setup.py) ... done
Created wheel for webrtcvad: filename=webrtcvad-2.0.10-cp310-cp310-linux_x86_64.whl size=73465 sha256=f69cc565de2fb14351556dbae3035362e38df81be565e7495380f7cd7a117f89
Stored in directory: /root/.cache/pip/wheels/2a/2b/84/ac7bacfe8c68a87c1ee3dd3c66818a54c71599abf308e8eb35
Building wheel for faster-whisper (setup.py) ... done
Created wheel for faster-whisper: filename=faster_whisper-0.10.0-py3-none-any.whl size=1539726 sha256=b9e57aeb0f7a805f9530a5b9e5fbe8208d0633f1329158df96eb8836f3962378
Stored in directory: /root/.cache/pip/wheels/b3/4e/9a/bd36d2645cb73f909a3a65a2e317fec5c6a79c8121ab9eb42f
Successfully built webrtcvad faster-whisper
Failed to build PyAudio
ERROR: Could not build wheels for PyAudio, which is required to install pyproject.toml-based projects