This project is a Windows GUI application that subtitles system audio in soft real time using Whisper running on Ryzen AI.
- Clone the repository
- Build the frontend: `.\buildFrontend.ps1`
- Install all backend dependencies (instructions are in the backend README)
- Run the GUI: `cd .\backend`, then `python start_gui.py`
- The application is currently unable to transcribe audio in any language other than English due to model limitations.
- Subtitle transcription for speaker loopback input is less accurate than for direct microphone input.
- After clicking Stop Transcription, the application may become very slow due to ongoing backend transcription jobs. The application will return to normal speed after a short delay.
  - Note: The download subtitles button remains functional and works independently of the backend.
- The application may crash if the user opens the settings window, selects the "Pin window to always stay on top" option, and then closes the settings window too quickly (i.e., in under 2 seconds).
  - This happens because the settings window has not finished loading the Settings JavaScript-Python API bridge before it is closed.
  - This issue can be mitigated by waiting at least 3 seconds after selecting the "Pin window to always stay on top" option before closing the settings window.
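
The settings-window mitigation above amounts to "do not close until the bridge has loaded, or until about 3 seconds have passed". A minimal sketch of that idea follows. It is not the project's code: `BridgeGate`, `mark_ready`, and `close_when_ready` are hypothetical names, and how the real bridge signals readiness is an assumption left to the caller.

```python
import threading


class BridgeGate:
    """Hypothetical helper: defers closing a window until its
    JavaScript-Python bridge has loaded, or a timeout expires."""

    def __init__(self, timeout_s: float = 3.0):
        # 3.0 s mirrors the manual wait suggested in the known issue above.
        self._ready = threading.Event()
        self._timeout_s = timeout_s

    def mark_ready(self) -> None:
        # Assumption: the caller invokes this once the bridge has loaded.
        self._ready.set()

    def close_when_ready(self, close_fn) -> bool:
        # Wait until the bridge is ready or the timeout expires, then close.
        # Returns True if the bridge was ready, False if the wait timed out.
        was_ready = self._ready.wait(self._timeout_s)
        close_fn()
        return was_ready
```

Because `close_when_ready` blocks while it waits, it would likely need to run on a worker thread rather than the GUI thread.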
This application is built on top of the public domain work done at https://github.com/davabase/whisper_real_time.