License: MIT License

WhisperAudioTranscriber

WhisperAudioTranscriber is an asynchronous audio recording and transcription tool written in Python. It sends recorded audio to OpenAI's Whisper model through the Hugging Face API. The tool captures audio input, transcribes or translates it in real time, and handles audio streams with asynchronous programming.

Features:

  • Asynchronous Audio Recording: Captures audio from your device's microphone using PyAudio and handles the streams asynchronously, so recording does not block other operations.
  • Transcription and Translation: Integrates with the Whisper model to provide real-time transcription and optional translation. Note: the translation feature does not currently work properly and needs additional configuration and testing.
  • Robust Error Handling: Retries API requests after errors such as the model still loading, so the tool keeps working through API downtime or slow responses.
  • Secure Authentication: Reads the API token from an environment variable, which keeps it out of the source code and makes configuration easy across environments.
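
The error-handling and authentication features above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the environment variable name HF_API_TOKEN, the ModelLoadingError exception, and the send_request callable are all hypothetical stand-ins for whatever speech.py really uses.

```python
import asyncio
import os


class ModelLoadingError(Exception):
    """Hypothetical error: the remote model is still loading."""

    def __init__(self, estimated_time: float = 1.0):
        super().__init__(f"model loading, retry in {estimated_time}s")
        self.estimated_time = estimated_time


def read_token(var_name: str = "HF_API_TOKEN") -> str:
    """Read the API token from the environment (variable name is an assumption)."""
    token = os.environ.get(var_name)
    if not token:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return token


async def transcribe_with_retry(send_request, audio: bytes,
                                max_attempts: int = 5,
                                max_wait: float = 30.0) -> str:
    """Await send_request(audio), retrying while the model reports it is loading."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return await send_request(audio)
        except ModelLoadingError as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Wait for the suggested time, capped at max_wait seconds.
            await asyncio.sleep(min(exc.estimated_time, max_wait))
```

Passing the request function in as a parameter keeps the retry logic independent of any particular HTTP client, which also makes it easy to exercise without network access.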

Usage:

The tool requires an output filename and accepts an optional language parameter for translation. Recording is interruptible: press Ctrl+C to stop it gracefully.
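
One way to make an asynchronous recording loop stop cleanly on Ctrl+C is to have the signal set an asyncio.Event instead of raising an exception. The sketch below assumes Python 3.9 or later (for asyncio.to_thread); read_chunk is a hypothetical callable standing in for a blocking PyAudio stream read, and none of these names come from speech.py itself.

```python
import asyncio
import signal


async def record_until_stopped(read_chunk, stop: asyncio.Event) -> bytes:
    """Collect audio chunks until the stop event is set."""
    frames = []
    while not stop.is_set():
        # Run the blocking read in a worker thread so the event loop stays free.
        frames.append(await asyncio.to_thread(read_chunk))
    return b"".join(frames)


async def main(read_chunk) -> bytes:
    stop = asyncio.Event()
    loop = asyncio.get_running_loop()
    try:
        # Ctrl+C sets the event, letting the loop finish its current chunk.
        loop.add_signal_handler(signal.SIGINT, stop.set)
    except NotImplementedError:
        # Some event loops (for example on Windows) do not support signal
        # handlers; Ctrl+C then raises KeyboardInterrupt as usual.
        pass
    return await record_until_stopped(read_chunk, stop)
```

Because the loop only checks the event between chunks, the recording always ends on a chunk boundary and the collected frames can be written to the output file afterwards.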

To use this script, run it from the command line with the following options:

usage: speech.py [-h] -f FILENAME [-l LANGUAGE]

Async audio recorder.

options:
  -h, --help            show this help message and exit
  -f FILENAME, --filename FILENAME
                        Filename to save the recording.
  -l LANGUAGE, --language LANGUAGE
                        Language for audio translation, please use a two char country code like "en" (optional, does not work properly).
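
For reference, a parser that produces a command-line interface like the one above could be built with the standard argparse module. This is a reconstruction from the help text, not the script's actual source; the function name build_parser is hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the documented options (reconstructed, not original)."""
    parser = argparse.ArgumentParser(
        prog="speech.py", description="Async audio recorder."
    )
    # -f is mandatory: the recording needs a destination file.
    parser.add_argument(
        "-f", "--filename", required=True,
        help="Filename to save the recording.",
    )
    # -l is optional; it stays None when the flag is omitted.
    parser.add_argument(
        "-l", "--language", default=None,
        help='Language for audio translation, please use a two char '
             'country code like "en" (optional, does not work properly).',
    )
    return parser


args = build_parser().parse_args(["-f", "output.wav", "-l", "en"])
print(args.filename, args.language)
```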

Example Command

python speech.py -f output.wav -l en

This command starts the asynchronous audio recorder, saves the recording to output.wav, and attempts to translate the audio to English. Because translation does not yet work properly, the translated output may be unreliable.

License

This project is released under the MIT License. For more details, see the LICENSE file in this repository.
