Comments (5)
Thanks for reporting a very important bug. I will replace it with a comma so that I can parse the file as CSV. If you have any other suggestions, I would highly appreciate them.
In #3, I just meant that I mostly run this in Python 2, as I use Google Cloud ML. I intend it to be fully compatible with both versions of Python. If you have any difficulty running it with Python 3, please report it here.
I will make it Windows-compatible in a few days, as I am busy tuning the model for better performance and releasing a pre-trained model.
from deepsphinx.
Hi,
About the input line: the transcription may contain a comma, so it can be a problem
unless the transcription is enclosed in quotes (which also works for CSV).
The backslash in the path is not such a huge problem if you only take the first three fields
and treat the rest as a single field. You could also use a "|".
I agree that a csv-like solution, however, is the best :-)
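To make the two options above concrete, here is a minimal Python 3 sketch. The four-field layout (id, audio path, duration, transcription) and the helper names `parse_csv_line` and `split_keep_rest` are assumptions for illustration only; they are not taken from the deepsphinx code.

```python
import csv


def parse_csv_line(line):
    """Parse one CSV line; a quoted field may contain commas."""
    return next(csv.reader([line]))


def split_keep_rest(line, sep="|", leading_fields=3):
    """Split off the first `leading_fields` fields and keep the rest as one field."""
    return line.split(sep, leading_fields)


# Hypothetical layout: id, audio path, duration, transcription.
csv_line = r'utt1,C:\data\a1.wav,3.2,"hello, world"'
csv_fields = parse_csv_line(csv_line)

# Same record with "|" as the separator; extra separators stay in the last field.
pipe_line = r"utt1|C:\data\a1.wav|3.2|hello, world | again"
pipe_fields = split_keep_rest(pipe_line)
```

With the default `csv` dialect, backslashes are not treated as escape characters, so a Windows path passes through unchanged.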
The difficulty with Python 3 is bug #3 - if I understood correctly.
Could you please provide an estimate of how much training data is required?
For example, because no dictionary is involved, does it require more data than pocketsphinx
to reach comparable performance? (Not sure if it is worth opening another ticket for that.)
Looking forward to a Windows-compatible version, thank you very much!
Yuval
from deepsphinx.
I fixed #3 in 5ed88ec; that's why I closed it.
I am using around 80 hours of training data and there is mild overfitting, but even after training for a long time, the validation accuracy does not decrease for the (to be updated) default model, without any dropout.
from deepsphinx.
I forgot to mention escaping commas, but I was thinking exactly the same thing and will use one of the available CSV parsers :-).
I designed the transcription format in haste for temporary use and have stuck with it ever since. That's why I haven't used the more obvious way of doing it. Thanks again for noticing it.
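As a rough sketch of what relying on a standard CSV parser could look like (Python 3, with a hypothetical two-column layout of audio path and transcription, not the actual deepsphinx format), `csv.writer` adds the quoting automatically and `csv.reader` undoes it:

```python
import csv
import io

# Hypothetical rows: audio path, transcription.
rows = [
    [r"C:\data\a1.wav", "hello, world"],
    ["data/a2.wav", 'she said "stop", twice'],
]

# csv.writer quotes any field containing a comma or a double quote.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)

# Reading the text back restores the original fields.
buffer.seek(0)
parsed = list(csv.reader(buffer))
```

When writing to a real file instead of `io.StringIO`, the `csv` documentation recommends opening it with `newline=""`.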
from deepsphinx.
It works under Windows as well, great!
Yuval
from deepsphinx.