akras14 / speech-to-text

Example transcribing audio file (speech) to text with Google Cloud Speech API and Python

Python 100.00%

speech-to-text's Issues

[HELP], Want to recognize the voice

I have used the Google Cloud Speech-to-Text API, which works well, but I need to show the speaker just above each line. Suppose I have an audio recording with 4 people in it. I want each person's label to appear just before his or her text, like this:
Person1:
Here is the text of person1.
Person2:
Here is the text of person2.
Person1:
Here is another line of text from person1.
Person3:
Here is the text of person3.
Can anyone tell me how I can get the speaker along with the text using the Google API?
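
One possible direction, offered as a sketch rather than a confirmed answer: Google Cloud Speech-to-Text has a speaker diarization feature which, as far as I know, labels each recognized word with a numeric speaker tag (`word` and `speaker_tag` in the google-cloud-speech Python client). Assuming the words and tags have already been extracted into `(speaker_tag, word)` pairs, the layout asked for above could be produced like this. `format_by_speaker` and the sample data are hypothetical.

```python
from itertools import groupby


def format_by_speaker(tagged_words):
    """Turn (speaker_tag, word) pairs into 'PersonN:' blocks, one per speaker turn."""
    lines = []
    # groupby merges consecutive words that share a speaker tag into one turn.
    for tag, group in groupby(tagged_words, key=lambda pair: pair[0]):
        lines.append(f"Person{tag}:")
        lines.append(" ".join(word for _, word in group))
    return "\n".join(lines)


# Made-up diarized words, in the order they were spoken.
sample = [(1, "Hello"), (1, "there."), (2, "Hi!"), (1, "How"), (1, "are"), (1, "you?")]
print(format_by_speaker(sample))
```

A speaker who talks again after someone else gets a new block, which matches the Person1 / Person2 / Person1 pattern in the question.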

Speech to text

Can anyone tell me what is the reason of this issue, please.

I am a beginner and could not find anything on Google. The WAV files are in Turkish. I don't know whether that is related, but it might be.
Thank you for your time.
Ali

`Traceback (most recent call last):
  File "fast.py", line 28, in <module>
    all_text = pool.map(transcribe, enumerate(files))
  File "C:\Users\ASUS-25\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "C:\Users\ASUS-25\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\pool.py", line 768, in get
    raise self.value
  File "C:\Users\ASUS-25\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "C:\Users\ASUS-25\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\pool.py", line 48, in mapstar
    return list(map(*args))
  File "fast.py", line 21, in transcribe
    text = r.recognize_google_cloud(audio, credentials_json=GOOGLE_CLOUD_SPEECH_CREDENTIALS)
  File "C:\Users\ASUS-25\AppData\Local\Programs\Python\Python38-32\lib\site-packages\speech_recognition\__init__.py", line 937, in recognize_google_cloud
    if "results" not in response or len(response["results"]) == 0: raise UnknownValueError()
speech_recognition.UnknownValueError

C:\Users\ASUS-25\Google Drive\Work\speech-to-text-master>`
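
The last frames of the traceback show `UnknownValueError` being raised when the API response contains no results, meaning no transcription came back for that chunk. Because the exception escapes `transcribe`, it aborts the whole `pool.map` call. Below is a minimal sketch of catching it per chunk. `transcribe_safely`, the placeholder text, and the stand-in recognizer are assumptions for illustration. In the real script the call would be `r.recognize_google_cloud(...)` and the error class `speech_recognition.UnknownValueError`. Since the audio is Turkish, it may also be worth checking the `language` argument of `recognize_google_cloud`, which defaults to `en-US` as far as I know.

```python
def transcribe_safely(recognize, audio, no_speech_errors, placeholder="[unintelligible]"):
    """Call recognize(audio); return a placeholder if the recognizer found no speech."""
    try:
        return recognize(audio)
    except no_speech_errors:
        # One failed chunk no longer aborts the whole batch.
        return placeholder


# Stand-ins so the sketch runs without the speech_recognition package.
class FakeUnknownValueError(Exception):
    pass


def fake_recognize(audio):
    if not audio:
        raise FakeUnknownValueError()
    return audio.upper()


results = [transcribe_safely(fake_recognize, a, FakeUnknownValueError) for a in ["merhaba", ""]]
print(results)
```

The wrapper would be called from inside the top-level `transcribe` function, so it does not change how `pool.map` is used.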

ValueError: Audio file could not be read as PCM WAV, AIFF/AIFF-C, or Native FLAC

Has anyone encountered this ValueError even though the audio file is a PCM WAV? Any idea how to solve it?
ValueError: Audio file could not be read as PCM WAV, AIFF/AIFF-C, or Native FLAC.

I ran fast.py with some sample WAV files and it worked perfectly. But when I tested it with audio files I collected from a website, I got the ValueError, even though the output of the soxi command reports the file as 16-bit signed integer PCM.

I then re-ran the sample WAV files that had previously worked, but received the same error message.

Audio files I collected from a website
I downloaded Amazon's audio (https://www.youtube.com/watch?v=CxK1VhtJlNQ), converted it to a WAV file with a 16 kHz sample rate and 1 channel, and split it into small pieces with py-webrtcvad.

soxi chunk-02.wav
Input File : 'chunk-02.wav'
Channels : 1
Sample Rate : 16000
Precision : 16-bit
Duration : 00:00:03.03 = 48480 samples ~ 227.25 CDDA sectors
File Size : 97.0k
Bit Rate : 256k
Sample Encoding: 16-bit Signed Integer PCM
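
Since the previously working samples now fail too, the cause may lie outside the files themselves. One way to check a file independently of soxi is Python's standard `wave` module, which (as far as I know) speech_recognition also tries first when opening a file. A minimal sketch follows. `describe_wav` is a hypothetical helper, and the silent demo file exists only so the example runs on its own.

```python
import os
import tempfile
import wave


def describe_wav(path):
    """Read basic header fields; wave.open raises wave.Error if it cannot parse the file."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_rate": wav.getframerate(),
            "sample_width_bytes": wav.getsampwidth(),
            "frames": wav.getnframes(),
        }


# Build a 0.1 s silent, 16 kHz, mono, 16-bit file to run the check against.
path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(path, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(16000)
    wav.writeframes(b"\x00\x00" * 1600)

info = describe_wav(path)
print(info)
```

If `describe_wav("chunk-02.wav")` succeeds while fast.py still fails, the problem is more likely in how the script locates or opens the files than in their encoding.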

Can we add more configs from google cloud speech ?

Hey akras,

Is it possible to set options such as enable_automatic_punctuation=True?

I tried adding this and several other config options to your code, but they do not work. Could you answer this? Thanks :)
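
For context, `enable_automatic_punctuation` is a field of `RecognitionConfig` in the Google Cloud Speech-to-Text API. Whether the `recognize_google_cloud` wrapper used by this repo forwards such options depends on the installed speech_recognition version, so one option is to call the google-cloud-speech client directly. The sketch below only builds the config as a plain dict. `build_recognition_config` is a hypothetical helper, and it assumes WAV input whose header lets the API infer encoding and sample rate.

```python
def build_recognition_config(language_code="en-US", punctuation=True):
    """Return recognition options keyed by RecognitionConfig field names."""
    return {
        "language_code": language_code,
        # Ask the API to insert punctuation into the transcript.
        "enable_automatic_punctuation": punctuation,
    }


config = build_recognition_config(language_code="en-US")
print(config)

# Assumption: with google-cloud-speech installed and credentials configured,
# a dict like this can be passed as the `config` argument of
# SpeechClient().recognize(config=..., audio=...).
```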

Not actually an issue

I have set up your library locally and it works like a charm. Thank you for the good work! I'm trying to integrate this library with PHP but couldn't get it to produce results in that case. The script is saved in a folder named speech_to_text, and I'm trying to execute it using PHP's shell_exec. The code I run is `$output = shell_exec("/usr/bin/python3 /var/www/html/speech_to_text/slow.py $directory");` and I have modified the slow.py file in the following way:

https://gist.github.com/khanof89/1c97f178dace3712991d114f95a3da2c

the following is the output I get:

foldername /var/www/html/podcasts-manage/storage/episode-2-of-the-awesome-mypodcast-a5dc/
['genevieve1.wav']
for f in tqdm /var/www/html/podcasts-manage/storage/episode-2-of-the-awesome-mypodcast-a5dc/genevieve1.wav
name /var/www/html/podcasts-manage/storage/episode-2-of-the-awesome-mypodcast-a5dc/genevieve1.wav
inside source
done source
credentials {
"type": "service_account",
"project_id": "api-project-11111111111",
"private_key_id": "private_key_id_goes_here",
"private_key": "-----BEGIN PRIVATE KEY-----\nMY_PRIVATE_KEY_GOES_HERE\n-----END PRIVATE KEY-----\n",
"client_email": "[email protected]",
"client_id": "1111111111111",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://accounts.google.com/o/oauth2/token",
"auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
"client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/analytics%40api-project-792103813257.iam.gserviceaccount.com"
}

exception in text=r.recognize

Because I am posting this as a public issue, I have replaced many values from my Google credentials file above, but in the actual file they are intact. Please help.
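
The output above ends with a fixed message, "exception in text=r.recognize", which suggests the modified script catches the exception without showing what it was. PHP's `shell_exec` also returns only standard output, so anything Python writes to standard error is lost unless the command redirects it (for example with `2>&1`). A minimal sketch of printing the full traceback to standard output follows. `run_and_report` and the failing step are hypothetical, and the error in the demo is made up rather than a diagnosis.

```python
import traceback


def run_and_report(step_name, func, *args):
    """Run func(*args); on failure print the full traceback to stdout and return None."""
    try:
        return func(*args)
    except Exception:
        # stdout is what a caller such as PHP's shell_exec captures by default.
        print(f"exception in {step_name}:")
        print(traceback.format_exc())
        return None


def failing_step():
    # Made-up failure, only to demonstrate the report format.
    raise PermissionError("cannot read credentials file")


result = run_and_report("recognize", failing_step)
```

With the exception type and message visible in `$output`, it becomes possible to tell a credentials problem from a file-permission or network problem.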

How to make this work with Android ?

I have imported this project but am not able to run it on Android.
How can I do that?
