Comments (17)
I see!
I did that, here's how it looks now:
That was done instantly so I think something else is missing. Also can't find that .vtt file.
from subplz.
I thought it was RAM not VRAM, so yeah it didn't work even with "tiny" I guess I need to buy a GPU!
But really, thanks a lot for all the help, I learned a ton about ubuntu/python. I wanted to use this with jpdb's mpv plugin, which has color coded words based on my account's known words. It'd be amazing to mine as if it was anime, but for audiobooks.
Ah that's a good point - yes it's VRAM though the error from the OS doesn't really indicate that. For Japanese learning we might share some resources for this on TMW or check out my discord for more updates.
Yeah, I'm happy to help. Sorry it's not clearer. You're the first person besides me to use this. So first off, treat all the installs as if they were inside Ubuntu. You don't need to install ffmpeg on Windows, but you do need to in Ubuntu. Then it should just be on the path.
Next, try running the split.sh command to get the m4b split into parts. Then you can run split_run, which should run without any memory issues if your m4b got cut into chapters.
The run command you had should work too, if you don't feel like splitting and have enough RAM to spare, though I haven't been using it myself lately. Try converting your path to a WSL path, so something like "/mnt/c/documents/name", and using that instead of the wslpath -a command, which only has the purpose of converting a Windows path to a Unix path.
I'm mobile right now, but if any of that doesn't make sense, let me know what you can try and what happens and I'll try to make clearer docs based on your pain points
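For Windows drive paths, the conversion that wslpath -a performs can be sketched in Python as below. This is only an illustration: windows_to_wsl is a hypothetical helper, not part of this project, and it assumes the default WSL mount point /mnt and a path that starts with a drive letter.

```python
from pathlib import PurePosixPath, PureWindowsPath


def windows_to_wsl(path: str) -> str:
    """Convert a Windows drive path to its usual WSL location under /mnt.

    Hypothetical helper for illustration; assumes the default /mnt mount point.
    """
    win = PureWindowsPath(path)
    # Only plain drive paths such as C:\... are handled here.
    if len(win.drive) != 2 or win.drive[1] != ":":
        raise ValueError(f"not a drive-letter path: {path!r}")
    drive_letter = win.drive[0].lower()
    # parts[0] is the anchor (for example 'C:\\'); the rest are folder names.
    return str(PurePosixPath("/mnt", drive_letter, *win.parts[1:]))


if __name__ == "__main__":
    print(windows_to_wsl(r"C:\documents\name"))
```

The resulting string is what you would pass to the script directly, instead of calling wslpath each time.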
This is what you meant, right? I feel like I'm missing something basic about ubuntu. All the installs I did were by command line on ubuntu, not windows. So ffmpeg, python, pip, stable-ts, were all installed on ubuntu wsl. Again, I didn't do step 4.
I get this if I type stable-ts, not sure if it helps
Ah, I think I see your issue.
Your path is ./run.sh, which means you are saying ./ is your current directory and that it has run.sh inside it. Since you showed your ls command, we can see you are not inside the repo!
Try running the command from inside this repo, so cd ~/thisProject first; run.sh should be somewhere on that path.
I can clean up the docs to clarify this but you should download the project with git.
git clone https://github.com/kanjieater/AudiobookTextSync.git
Then cd AudiobookTextSync
You should then have a folder with this stuff in it when you ls
Go ahead and git pull again. I've updated the readme. You should now be using ./run.sh for split or unsplit m4b's, and they should just work now. Let me know how it goes, and thanks for being patient with the troubleshooting.
In addition, there is now an anki command documented that can turn your m4b into an anki deck and import it for you.
Hey, thanks for the answer!!
I'm making progress now.
I did the git pull again (had to delete the old folder first) and I was getting some errors at first, like: ./split.sh: line 5: cd: too many arguments. I then pasted the audiobook folder inside AudiobookTextSync and used the absolute path for ubuntu:
powershell: \wsl.localhost\Ubuntu\home\pacote\AudiobookTextSync> wsl wslpath -a name
result: /home/pacote/AudiobookTextSync/name
Then it worked! Now I have the files split, but I'm getting an error when trying to ./run.sh:
Edit: I tried changing the code from run.sh line 17-19-20 from "python" to "python3" just to see what happens, then I get this instead:
I don't know if that helps!
I also tried rerunning pip install stable-ts just in case that's what the error meant, but it still happens even after rerunning it.
Good to hear you are making progress! Try these things
- You need to be on python 3.9.9. python --version should say 3.9.9.
- Then install your dependencies, stable-ts from pip as
pip install git+https://github.com/jianfch/stable-ts.git
pip install -r requirements.txt
- Don't change the script; get python on your path as python 3. (If step one is working, nothing to do here.)
If you do all of those things it should be working 👍
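The checklist above can be sketched as a small Python check. This is not part of the project: version_problem and missing_modules are hypothetical helper names, and the module name stable_whisper is an assumption about what the stable-ts package installs.

```python
import importlib.util
import sys

# Version named in this thread; adjust if the project's requirement changes.
REQUIRED_VERSION = (3, 9, 9)


def version_problem(version_info, required=REQUIRED_VERSION):
    """Return a message if the interpreter version differs, else None."""
    found = tuple(version_info[:3])
    if found != tuple(required):
        return "expected Python %s, found %s" % (
            ".".join(map(str, required)),
            ".".join(map(str, found)),
        )
    return None


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    problem = version_problem(sys.version_info)
    print(problem or "Python version looks right")
    # 'stable_whisper' is assumed to be the import name of stable-ts.
    missing = missing_modules(["stable_whisper"])
    print("missing modules:", missing or "none")
```

Run it with the same python command that run.sh uses, so that it checks the interpreter the script will actually pick up.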
Did as you said and it's running, but it's been like this for a while. Is that common? There's no .srt file in the folder yet, so it's probably not finished, and my PC is also still running hard.
Also I saw the read me and it's much easier to understand now.
Edit: might've been my fault, here's what I got after the program ended:
./run.sh: line 17: 838 Killed python3 split_run.py "$FOLDER"
Forgot to change the python thing, will try a guide using an alias.
Edit2: added alias python=python3 to .bashrc. Now "python" gives me python 3.9.9, but when I try "./run.sh" I still get the same: line 17: python: command not found
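A likely explanation for this last error: bash only expands aliases in interactive shells by default, so an alias in .bashrc is not visible to a script such as run.sh. The script needs a real executable (or symlink) named python on PATH. A minimal sketch to see what a script would find, where find_on_path is a hypothetical helper name:

```python
import shutil


def find_on_path(names):
    """Map each command name to the executable found on PATH, or None."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # shutil.which ignores shell aliases, just like a non-interactive script.
    for name, location in find_on_path(["python", "python3"]).items():
        print(name, "->", location or "not found on PATH")
```

If python shows as not found while python3 resolves, the alias is the only thing making "python" work in your terminal.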
I'm not sure which step you were at when it was slow. Which step did it seem like? Here is what exists now:
1.1 (not pushed yet) Filter down audio to improve future results - slow, and probably not heavy on the GPU. Heavier on CPU
1.2 split_run & stable-ts: Starts off heavy on CPU & RAM to identify the audio spectrum
1.3 stable-ts: GPU heavy & long part, where it tries to transcribe a text from the audio. This is the long progress bar part
1.4 Merge vtt's for split subs
2.1 Split the script
2.2 match the script to the generated transcription to get good timestamps
Any idea where you were at in there?
Since the progress bar was completed I'm guessing 1.4+
That's all I got at the end. My laptop was running really hard there, after the progress bar hit 100.
Here's the folder just in case:
Uh oh, you ran out of memory! Unfortunately that's step 1.3. If it won't work for you there, I don't have a software solution. You could change split_run.py from using the large-v2 model to the medium model, but the performance of the model will be affected.
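As a rough guide for picking a smaller model, here is a sketch that chooses the largest Whisper model expected to fit in a given amount of memory. The figures are the approximate "required VRAM" values listed in the Whisper README and are assumptions here, since real usage varies with the audio and with stable-ts; largest_model_that_fits is a hypothetical helper, not something the project provides.

```python
# Approximate memory needs in GB, ordered from smallest to largest model.
# Assumed from the Whisper README; treat them as ballpark values only.
APPROX_VRAM_GB = [
    ("tiny", 1),
    ("base", 1),
    ("small", 2),
    ("medium", 5),
    ("large-v2", 10),
]


def largest_model_that_fits(available_gb):
    """Return the largest model whose approximate need fits, or None."""
    fitting = [name for name, need in APPROX_VRAM_GB if need <= available_gb]
    return fitting[-1] if fitting else None


if __name__ == "__main__":
    for gb in (0.5, 2, 6, 12):
        print(gb, "GB ->", largest_model_that_fits(gb))
```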
I see! I'll try again while closing other apps and monitor it with task manager
You can modify this line in split_run.py to any of the models. It may get overwritten in future git pulls though. You can use any that whisper supports, I think. Try a smaller one, but smaller means less training, which means less accurate results.
The other solution would be to find a way to cut your files up even smaller. I don't currently have a solution for that, though. You would have to make sure not to cut anywhere besides silent parts, and ideally not in the middle of a sentence.
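To illustrate the "only cut at silent parts" idea, here is a minimal sketch that scans a list of audio samples and proposes cut points in the middle of long quiet stretches. It is not part of this project: find_silent_cut_points, the threshold, and the minimum silence length are all assumptions, and it cannot tell whether a pause falls between sentences. ffmpeg's silencedetect filter is an existing tool for the same job.

```python
def find_silent_cut_points(samples, sample_rate, threshold=0.01, min_silence_s=0.5):
    """Return times (seconds) at the middle of each long-enough quiet run.

    samples: amplitudes in the range -1.0..1.0 (assumed already decoded).
    Silence that runs to the very end of the audio is ignored.
    """
    min_len = int(min_silence_s * sample_rate)
    cut_points = []
    run_start = None  # index where the current quiet run began
    for i, value in enumerate(samples):
        if abs(value) < threshold:
            if run_start is None:
                run_start = i
        else:
            # A loud sample ends the quiet run; keep it if it was long enough.
            if run_start is not None and i - run_start >= min_len:
                cut_points.append((run_start + i) / 2 / sample_rate)
            run_start = None
    return cut_points


if __name__ == "__main__":
    # One second of sound, one second of silence, one second of sound at 10 Hz.
    demo = [0.5] * 10 + [0.0] * 10 + [0.5] * 10
    print(find_silent_cut_points(demo, sample_rate=10))
```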
Just want to correct something I said above.
After running a few tests, it seems that medium actually tends to outperform large, due to how stable-ts works. jianfch/stable-ts#80 (comment)
Tiny also seems to perform almost as well, to the point that I might leave that as the default.
If you have new questions or give it a go again in the future, feel free to open a new issue.