This repository contains code and resources for asking questions about your audio data, combining OpenAI's Whisper API with ChromaDB and LangChain. The demo shows how to transcribe audio into natural-language text with the Whisper API, split that text into chunks, and embed the chunks with OpenAI's embeddings model. The embeddings are stored and queried in a ChromaDB database, and LangChain is used to answer questions over the vectorized data.
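The chunk-and-embed pipeline described above can be sketched roughly as follows. The `chunk_text` helper and its chunk sizes are illustrative, not taken from this repo, and the LangChain class locations assume the classic `langchain` package (they have moved between modules across versions):

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> List[str]:
    """Split text into overlapping chunks so each one fits within the
    embedding model's context window (sizes here are illustrative)."""
    chunks, start = [], 0
    step = chunk_size - overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks


if __name__ == "__main__":
    # These imports come from requirements.txt; class locations vary
    # between LangChain releases, so adjust to your installed version.
    from langchain.chains import RetrievalQA
    from langchain.embeddings import OpenAIEmbeddings
    from langchain.llms import OpenAI
    from langchain.vectorstores import Chroma

    text = open("text_files/sample.txt").read()
    # Embed the chunks with OpenAI embeddings and store them in Chroma.
    store = Chroma.from_texts(chunk_text(text), OpenAIEmbeddings())
    qa = RetrievalQA.from_chain_type(llm=OpenAI(), retriever=store.as_retriever())
    print(qa.run("What is this recording about?"))
```

The overlap between adjacent chunks helps the retriever find passages whose answer spans a chunk boundary.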
To get started with the demo, you will need Python installed on your machine (this project was built with Python 3.8).
- Clone this repo.
- Install the required Python packages:
  `pip install -r requirements.txt`
- If you get any ffmpeg-related errors, you may need to install it (on macOS):
  `brew install ffmpeg`
- Create a file named `.env` in the project root and set your key:
  `OPENAI_API_KEY={YOUR_OPENAI_API_KEY}`
- Put your audio files in the `/audio_files` folder. Audio downloaded from WhatsApp Web is supported.
- Run `python whisper.py` to generate a `.txt` file for each of your audio files.
- In line 51 of `ask_the_audio.py`, change `text_files/sample.txt` to the name of the file you'd like to chat about.
- Now you can run `python ask_the_audio.py`, which starts a chatbot that can answer questions over the `.txt` file.
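For reference, the transcription step that `whisper.py` performs could look roughly like this. This is a sketch against the current OpenAI Python SDK (`client.audio.transcriptions.create` with the `whisper-1` model); the actual script, folder names, and output location may differ:

```python
from pathlib import Path


def output_name(audio_file: Path) -> str:
    """Derive the .txt filename for a transcribed audio file."""
    return audio_file.stem + ".txt"


if __name__ == "__main__":
    # Third-party import from requirements.txt; the client reads
    # OPENAI_API_KEY from the environment (loaded from .env).
    from openai import OpenAI

    client = OpenAI()
    out_dir = Path("text_files")
    out_dir.mkdir(exist_ok=True)

    # Transcribe every file in /audio_files into a matching .txt file.
    for audio_file in sorted(Path("audio_files").iterdir()):
        with audio_file.open("rb") as f:
            transcript = client.audio.transcriptions.create(
                model="whisper-1", file=f
            )
        (out_dir / output_name(audio_file)).write_text(transcript.text)
```

Each audio file keeps its base name, so `audio_files/note.opus` would produce `text_files/note.txt`.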