imranq / tarteel-musullah

Detect Qur'an audio and match to ayats in real-time

Home Page: https://ayatfinder.info

Jupyter Notebook 94.67% HTML 1.69% Python 2.97% CSS 0.67%

tarteel-musullah's Introduction

Set up

Install requirements

Download Whisper Tarteel model:

https://huggingface.co/tarteel-ai/whisper-base-ar-quran

Install Python requirements:

pip install -r requirements.txt

Run Flask Server:

python3 local_app.py

Alternatively, if you have an OpenAI API key, you can run inference with the hosted version of Whisper: set your environment variable using this guide, then run python3 hosted_app.py

Navigate to http://127.0.0.1:5000 to use the app!

Todo

Split up audio into chunks for processing in parallel (every 3 seconds)
- Add an event / session / recording model to keep track of audio clips and join them if necessary
- On the HTML page, maybe add separate tracks per ayat
- Remove background noise and parse out non-Arabic sections
Merge outputs together into a summary of what was recited
Add UI/UX for easy use
Set up server
Change from a passage-based algorithm to a word-based algorithm to account for repetitions
Quantize the model: https://pytorch.org/tutorials/intermediate/dynamic_quantization_bert_tutorial.html#evaluate-the-inference-accuracy-and-time and https://huggingface.co/docs/optimum/concept_guides/quantization
Try out Faster-Whisper (https://github.com/guillaumekln/faster-whisper)
A word-based approach could start with string matching on different roots for bi-grams, tri-grams, etc.
Deal with a tree of expanding notes of roots
Stop recording after a certain amount of silence (1 minute or so)
Fix the issue when users repeat a verse
- Potential solution: Remove repetitious phrases
Look into switching to Streamlit
Add tests and componentize functions
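
As a rough illustration of the first Todo item (splitting audio into chunks of about 3 seconds), here is a minimal sketch. It is not code from this repo: the function name split_into_chunks, the 16 kHz sample rate, and the plain list of samples are all assumptions chosen for the example.

```python
from typing import List, Sequence


def split_into_chunks(
    samples: Sequence[float],
    sample_rate: int,
    chunk_seconds: float = 3.0,
) -> List[Sequence[float]]:
    """Split raw audio samples into consecutive fixed-length chunks.

    The last chunk may be shorter than chunk_seconds.
    """
    chunk_size = int(sample_rate * chunk_seconds)
    if chunk_size <= 0:
        raise ValueError("sample_rate * chunk_seconds must be at least 1 sample")
    return [
        samples[start:start + chunk_size]
        for start in range(0, len(samples), chunk_size)
    ]


# Hypothetical input: 7 seconds of silence at an assumed 16 kHz sample rate.
SAMPLE_RATE = 16000
audio = [0.0] * (SAMPLE_RATE * 7)
chunks = split_into_chunks(audio, SAMPLE_RATE)
chunk_lengths = [len(chunk) for chunk in chunks]
```

Each chunk could then be handed to a separate worker for transcription, with the results joined in order afterwards. Hard cuts every 3 seconds can split a word in half, so overlapping the chunks slightly may be worth trying.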

Algorithmic approaches

Use the Tarteel AI Whisper model to transcribe speech to Arabic text
Once the transcript is retrieved, we use rapidfuzz for the string matching against ayats
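
The matching step described above can be sketched roughly as follows. This is an illustration under assumptions, not the repo's actual code: the AYAT table, the similarity and best_match helpers, and the sample transcript are hypothetical, and the snippet falls back to the standard library's difflib when rapidfuzz is not installed.

```python
from difflib import SequenceMatcher

try:
    from rapidfuzz import fuzz

    def similarity(a: str, b: str) -> float:
        """Similarity score from 0 to 100 using rapidfuzz."""
        return fuzz.ratio(a, b)

except ImportError:

    def similarity(a: str, b: str) -> float:
        """Fallback score from 0 to 100 using the standard library."""
        return 100.0 * SequenceMatcher(None, a, b).ratio()


# Hypothetical lookup table: the first ayats of Al-Fatiha without diacritics.
AYAT = {
    "1:1": "بسم الله الرحمن الرحيم",
    "1:2": "الحمد لله رب العالمين",
    "1:3": "الرحمن الرحيم",
    "1:4": "مالك يوم الدين",
}


def best_match(transcript: str, ayat: dict = AYAT) -> tuple:
    """Return the (key, score) of the ayat most similar to the transcript."""
    key = max(ayat, key=lambda k: similarity(transcript, ayat[k]))
    return key, similarity(transcript, ayat[key])


# Assumed transcript with one missing letter, as a transcription slip might produce.
match_key, match_score = best_match("الحمد لله رب العلمين")
```

Scoring the whole transcript against each ayat is the passage-based approach; a recitation that repeats or spans several ayats would need a different scorer or the word-based approach listed in the Todo.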
