This repository contains a simple Python application for speech recognition and text-to-speech (TTS) using Streamlit and the Whisper library.
The application provides two main functionalities:
- Speech Recognition: It allows users to upload an audio file or record audio directly through the app and transcribe the speech into text.
- Text-to-Speech: It generates synthetic speech from text.
Requirements:
- Python 3.x
- Streamlit
- openai-whisper
- PyAudio
- wave (part of the Python standard library)
- Clone the repository:

      git clone https://github.com/HimanshuMohanty-Git24/Basic-Speech-Recognition-TTS.git
      cd Basic-Speech-Recognition-TTS
- Install dependencies:

      pip install -r requirements.txt
- Run the application:

      streamlit run app.py
- Choose an option:
  - Upload Audio: Upload an audio file in WAV, MP3, or FLAC format for speech recognition.
  - Record Audio: Record audio directly through the app using your device's microphone.
- View the transcript or synthesized speech based on your selection.
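
The upload path described in the steps above might look roughly like the sketch below. It is not taken from app.py: the function names and the `recorded`/`uploaded` labels are hypothetical. The extension list and the model names (base for recorded audio, medium for uploads) come from this README. The `whisper.load_model` and `model.transcribe` calls are from the openai-whisper package, which also needs ffmpeg installed to decode audio.

```python
from pathlib import Path

# Upload formats listed in this README.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".flac"}

# Model sizes named in this README; the dictionary keys are hypothetical labels.
MODEL_FOR_SOURCE = {"recorded": "base", "uploaded": "medium"}


def is_supported_audio(filename: str) -> bool:
    """Return True if the file extension is one of the supported formats."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def transcribe(path: str, source: str = "uploaded") -> str:
    """Transcribe an audio file with the Whisper model chosen for its source."""
    if not is_supported_audio(path):
        raise ValueError(f"Unsupported audio format: {path}")

    # Imported lazily so the helpers above work without Whisper installed.
    import whisper

    model = whisper.load_model(MODEL_FOR_SOURCE[source])
    result = model.transcribe(path)  # returns a dict that includes a "text" key
    return result["text"].strip()
```

Calling `transcribe("sample.mp3")` would load the medium model on first use, which can take a while because the weights are downloaded and cached. An app would probably load each model once and reuse it rather than reloading it on every request.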
- The application uses two pre-trained Whisper models: the base model for recorded audio and the medium model for uploaded audio. The medium model is larger and slower than the base model but generally more accurate.
- Ensure that you have a working microphone if you choose to record audio through the app.
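
For the recording path, a minimal sketch of capturing microphone audio with PyAudio and saving it with the standard-library wave module could look like this. The sample rate, chunk size, and function names are assumptions for illustration, not values read from app.py.

```python
import wave

SAMPLE_RATE = 16000  # Hz; assumed value
CHANNELS = 1         # mono; assumed value
SAMPLE_WIDTH = 2     # bytes per sample (16-bit PCM)
CHUNK = 1024         # frames read per call; assumed value


def save_wav(path, frames, sample_rate=SAMPLE_RATE,
             channels=CHANNELS, sample_width=SAMPLE_WIDTH):
    """Write a list of raw PCM byte chunks to a WAV file."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)
        wf.setframerate(sample_rate)
        wf.writeframes(b"".join(frames))


def record_frames(seconds: float) -> list:
    """Record from the default microphone and return raw PCM chunks."""
    # Imported lazily so save_wav works without PyAudio installed.
    import pyaudio

    pa = pyaudio.PyAudio()
    try:
        stream = pa.open(format=pyaudio.paInt16, channels=CHANNELS,
                         rate=SAMPLE_RATE, input=True,
                         frames_per_buffer=CHUNK)
        try:
            n_chunks = int(SAMPLE_RATE / CHUNK * seconds)
            return [stream.read(CHUNK) for _ in range(n_chunks)]
        finally:
            stream.stop_stream()
            stream.close()
    finally:
        pa.terminate()
```

For example, `save_wav("recording.wav", record_frames(5))` would capture about five seconds of audio and write it to a WAV file that Whisper can read. If no input device is available, `pa.open` raises an error, which is why a working microphone is required.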
For any issues or suggestions, please feel free to open an issue on GitHub.