Comments (2)
Hi @nishachalingal, you can download a pre-trained model and extract it to the directory ./benchmarks/${dataset}/models, which is the default setting. Alternatively, you can put the model in a different directory. In that case, you also need to change model_path and model_conf accordingly in the configuration file in the configs folder.
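If you go with a custom directory, the config change could be scripted roughly like this. This is only a sketch: it assumes the config is a standard .ini file that configparser can read, and it looks for the model_path and model_conf keys in every section because the section name is not stated in this thread. The helper name set_model_paths is made up for illustration and is not part of the repository.

```python
import configparser


def set_model_paths(config_file, model_path, model_conf):
    """Point model_path and model_conf in an .ini config at a custom location.

    Returns the (section, key) pairs that were updated.
    """
    # interpolation=None keeps "%" characters in paths from being interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    # Read explicitly as UTF-8 so the result does not depend on the OS locale.
    with open(config_file, encoding="utf-8") as f:
        parser.read_file(f)

    updated = []
    new_values = (("model_path", model_path), ("model_conf", model_conf))
    # The section holding these keys is an unknown here, so check all of them.
    for section in parser.sections():
        for key, value in new_values:
            if parser.has_option(section, key):
                parser.set(section, key, str(value))
                updated.append((section, key))

    # Note: configparser drops comments from the file when writing it back.
    with open(config_file, "w", encoding="utf-8") as f:
        parser.write(f)
    return updated
```

For example, set_model_paths("configs/LRS3_V_WER19.1.ini", "D:/models/model.pth", "D:/models/model.json") would rewrite both keys, where the two D:/ paths are placeholders for wherever you extracted the model. Editing the two lines by hand works just as well and keeps any comments in the file.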
from visual_speech_recognition_for_multiple_languages.
Hey there.
I have already installed the requirements, the model, and the language model.
I wonder which video format is required. I'm using mp4 files and I get the following error:
(autoavsr) PS D:\Visual_Speech_Recognition_for_Multiple_Languages> python .\infer.py config_filename=.\configs\LRS3_V_WER19.1.ini data_filename=.\Grabacion.mp4 detector=mediapipe
Error executing job with overrides: ['config_filename=.\\configs\\LRS3_V_WER19.1.ini', 'data_filename=.\\Grabacion.mp4', 'detector=mediapipe']
Traceback (most recent call last):
File ".\infer.py", line 15, in main
output = InferencePipeline(cfg.config_filename, device=device, detector=cfg.detector, face_track=True)(cfg.data_filename, cfg.landmarks_filename)
File "D:\Visual_Speech_Recognition_for_Multiple_Languages\pipelines\pipeline.py", line 45, in __init__
self.model = AVSR(modality, model_path, model_conf, rnnlm, rnnlm_conf, penalty, ctc_weight, lm_weight, beam_size, device)
File "D:\Visual_Speech_Recognition_for_Multiple_Languages\pipelines\model.py", line 43, in __init__
self.token_list = ['<blank>'] + [word.split()[0] for word in open(file_path).read().splitlines()] + ['<eos>']
File "C:\Users\peduz\miniconda3\envs\autoavsr\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 4416: character maps to <undefined>
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
Can you help me?
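Going by the traceback, the failure does not seem to come from the mp4 file at all: it is raised in pipelines\model.py while reading a text file with open(file_path), which on Windows falls back to the cp1252 codec. Byte 0x81 is undefined in cp1252 but is a normal continuation byte in UTF-8 (for instance, the SentencePiece marker "▁", U+2581, is encoded as E2 96 81). A likely fix, assuming the token file is UTF-8, is to pass the encoding explicitly. The function name read_token_list below is only for illustration; the list comprehension mirrors line 43 from the traceback.

```python
def read_token_list(file_path):
    """Build the token list from a units file, decoding it as UTF-8.

    Assumption: the file is UTF-8 encoded, which the traceback suggests
    but does not prove.
    """
    # Without encoding=..., open() uses the locale's preferred encoding,
    # which is cp1252 on many Windows installations.
    with open(file_path, encoding="utf-8") as f:
        lines = f.read().splitlines()
    # Same construction as in the traceback: first field of each line.
    return ["<blank>"] + [word.split()[0] for word in lines] + ["<eos>"]
```

If you would rather not touch the source, enabling Python's UTF-8 mode before running infer.py should have the same effect on Python 3.7 or later, e.g. $env:PYTHONUTF8 = "1" in PowerShell.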