Comments (7)
The GRID Corpus, which is the dataset I used, has the following structure in its alignment files:
0 23750 sil
23750 29500 bin
29500 34000 blue
34000 35500 at
35500 41000 f
41000 47250 two
47250 53000 now
53000 74500 sil
The first number is the timestamp of the beginning of the word (or silence period, denoted with sil) and the second number is the timestamp of the end.
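To make the format concrete, here is a minimal sketch of reading such a file into (start, end, word) entries. The names `AlignEntry` and `parse_align` are made up for illustration and are not part of the lipnet code; the timestamps are kept as plain integers since the unit is not stated here.

```python
from typing import List, NamedTuple


class AlignEntry(NamedTuple):
    """One line of an align file (hypothetical helper type)."""
    start: int  # timestamp where the word or silence begins
    end: int    # timestamp where it ends
    word: str   # the word, or "sil" for a silence period


def parse_align(text: str) -> List[AlignEntry]:
    """Parse align-file text with one 'start end word' triple per line."""
    entries = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        start, end, word = parts
        entries.append(AlignEntry(int(start), int(end), word))
    return entries


# The example alignment quoted above.
sample = """0 23750 sil
23750 29500 bin
29500 34000 blue
34000 35500 at
35500 41000 f
41000 47250 two
47250 53000 now
53000 74500 sil
"""

entries = parse_align(sample)
for entry in entries:
    print(entry.start, entry.end, entry.word)
```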
If your dataset is small enough, you could go over it manually and write down the timestamps and transcription for each video. Otherwise you could use one of the available automatic speech recognition libraries, for example pocketsphinx.
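If you go the manual route, something like the sketch below could turn your notes into the same layout as the GRID aligns. `format_align` is a hypothetical helper, not something from lipnet, and the entries are just the first three lines of the example above.

```python
def format_align(entries):
    """Render (start, end, word) tuples as align-file text, one per line."""
    lines = []
    for start, end, word in entries:
        if end < start:
            raise ValueError(f"end before start for {word!r}")
        lines.append(f"{start} {end} {word}")
    return "\n".join(lines) + "\n"


# Hand-written annotations for one video (values taken from the example above).
manual = [
    (0, 23750, "sil"),
    (23750, 29500, "bin"),
    (29500, 34000, "blue"),
]

align_text = format_align(manual)
print(align_text, end="")
```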
My particular implementation of lipnet does not use the timestamps, just the text. But you might want to experiment with the times too.
Hope this helped a bit!
from lipnet.
Thanks for the suggestion. What do you mean by just the text? Are your align files different from the GRID corpus align files?
from lipnet.
No, they are the same aligns as in the GRID Corpus, but I just extract the words and ignore the numbers before handing that data to the model.
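Roughly, the idea looks like the sketch below. This is not the actual lipnet code; `words_from_align` is a name made up for illustration, and dropping the sil entries is an assumption (controlled by the `drop_silence` flag) rather than something stated in this thread.

```python
def words_from_align(text, drop_silence=True):
    """Return only the words of an align file, ignoring both timestamps."""
    words = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        word = parts[2]  # third column is the word; the numbers are ignored
        if drop_silence and word == "sil":
            continue  # assumption: silence markers are not part of the sentence
        words.append(word)
    return " ".join(words)


sample = """0 23750 sil
23750 29500 bin
29500 34000 blue
34000 35500 at
35500 41000 f
41000 47250 two
47250 53000 now
53000 74500 sil
"""

sentence = words_from_align(sample)
print(sentence)  # bin blue at f two now
```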
from lipnet.
OK thanks for the help. :)
from lipnet.