Need some help setting it up about subplz HOT 17 CLOSED

Whitesttax commented on August 26, 2024

Need some help setting it up

from subplz.

Comments (17)

Whitesttax commented on August 26, 2024 1

I see!
I did that, here's how it looks now:

That was done instantly so I think something else is missing. Also can't find that .vtt file.

from subplz.

kanjieater commented on August 26, 2024 1

> I thought it was RAM not VRAM, so yeah it didn't work even with "tiny" I guess I need to buy a GPU!
>
> But really, thanks a lot for all the help, I learned a ton about ubuntu/python. I wanted to use this with jpdb's mpv plugin, which has color coded words based on my account's known words. It'd be amazing to mine as if it was anime, but for audiobooks.

Ah, that's a good point. Yes, it's VRAM, though the error from the OS doesn't really indicate that. For Japanese learning, we might share some resources for this on TMW, or you can check out my Discord for more updates.

from subplz.

kanjieater commented on August 26, 2024

Yeah, I'm happy to help. Sorry it's not clearer. You're the first person besides me to use this. So first off, treat all the installs as if they were inside Ubuntu. You don't need to install ffmpeg on Windows, but you do need to install it in Ubuntu. Then it should just be on the path.

Next, try running the split.sh command to get the m4b split into parts. Then you can run split_run which should run without any memory issues if your m4b got cut into chapters.

The run command you had should work too, if you don't feel like splitting and have enough RAM to spare, though I haven't been using it myself lately. Try converting your path to a WSL path, so something like "/mnt/c/documents/name", and use that instead of the wslpath -a command, whose only purpose is to convert a Windows path to a Unix path.
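A minimal sketch of that path conversion, for drive-letter paths only. It assumes the default WSL setup where Windows drives are mounted under /mnt; the function name windows_to_wsl_path is hypothetical and is not part of this project or of WSL. For anything else (UNC paths, custom mount roots), wslpath remains the tool to use.

```python
import re


def windows_to_wsl_path(win_path: str) -> str:
    """Convert a drive-letter Windows path to its /mnt/<drive>/ form.

    Assumes the default WSL automount root (/mnt). Hypothetical helper,
    not the actual wslpath implementation.
    """
    match = re.match(r"^([A-Za-z]):[\\/]*(.*)$", win_path)
    if match is None:
        raise ValueError(f"not a drive-letter path: {win_path!r}")
    drive = match.group(1).lower()
    # Windows separators become forward slashes.
    rest = match.group(2).replace("\\", "/")
    if not rest:
        return f"/mnt/{drive}"
    return f"/mnt/{drive}/{rest}".rstrip("/")


if __name__ == "__main__":
    print(windows_to_wsl_path(r"C:\documents\name"))
```

The result can then be passed to the script directly as the folder argument.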

I'm mobile right now, but if any of that doesn't make sense, let me know what you can try and what happens and I'll try to make clearer docs based on your pain points

from subplz.

Whitesttax commented on August 26, 2024

This is what you meant, right? I feel like I'm missing something basic about ubuntu. All the installs I did were by command line on ubuntu, not windows. So ffmpeg, python, pip, stable-ts, were all installed on ubuntu wsl. Again, I didn't do step 4.

I get this if I type stable-ts, not sure if it helps

from subplz.

kanjieater commented on August 26, 2024

Ah I think I see your issue.

Your path is ./run.sh, which says that ./ is your current directory and that it has run.sh inside it. Since you showed your ls output, we can see you are not inside the repo!

Try running the command from inside this repo.
So cd ~/thisProject first; run.sh should then be in your current directory.

from subplz.

kanjieater commented on August 26, 2024

I can clean up the docs to clarify this but you should download the project with git.
git clone https://github.com/kanjieater/AudiobookTextSync.git
Then cd AudiobookTextSync

You should then have a folder with this stuff in it when you ls

from subplz.

kanjieater commented on August 26, 2024

Go ahead and git pull again. I've updated the readme. You should now be using ./run.sh for split or unsplit m4bs, and they should just work now. Let me know how it goes, and thanks for being patient with the troubleshooting.

In addition, there is now an anki command documented that can turn your m4b into an Anki deck and import it for you.

from subplz.

Whitesttax commented on August 26, 2024

Hey, thanks for the answer!!
I'm making progress now.
I did the git pull again (had to delete the old folder first) and I was getting some errors at first, like: ./split.sh: line 5: cd: too many arguments. I then pasted the audiobook folder inside AudiobookTextSync and used the absolute path for ubuntu:
powershell: \\wsl.localhost\Ubuntu\home\pacote\AudiobookTextSync> wsl wslpath -a name
result: /home/pacote/AudiobookTextSync/name
Then it worked! Now I have the files split, but I'm getting an error when trying to ./run.sh:

Edit: I tried changing the code in run.sh on lines 17, 19, and 20 from "python" to "python3" just to see what happens, and then I get this instead:

I don't know if that helps!

I also tried rerunning pip install stable-ts just in case that's what the error meant, but it still happens even after rerunning it.

from subplz.

kanjieater commented on August 26, 2024

Good to hear you are making progress! Try these things:

1. You need to be on Python 3.9.9. python --version should say 3.9.9.
2. Then install your dependencies, stable-ts from pip:
   pip install git+https://github.com/jianfch/stable-ts.git
   pip install -r requirements.txt
3. Don't change the script; get python on your path as Python 3. (If step one is working, there is nothing to do here.)

If you do all of those things it should be working 👍
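The version and PATH checks in the steps above can be sketched in Python, as a rough aid rather than part of the project. The helper names version_matches and find_on_path are hypothetical. Note that shutil.which only searches PATH, which is also what a script like run.sh sees: a shell alias is normally not expanded inside a non-interactive script, so an alias alone will not make python resolve there.

```python
import shutil
import sys

REQUIRED = (3, 9, 9)  # the version named in the steps above


def version_matches(version_info=sys.version_info, required=REQUIRED) -> bool:
    """Return True when major, minor, and micro all match the required version."""
    return tuple(version_info[:3]) == tuple(required)


def find_on_path(command: str = "python"):
    """Return the full path of `command` if it is on PATH, else None.

    Shell aliases are not considered; only real executables on PATH count.
    """
    return shutil.which(command)


if __name__ == "__main__":
    print("running:", ".".join(str(part) for part in sys.version_info[:3]))
    print("matches 3.9.9:", version_matches())
    print("python on PATH:", find_on_path("python"))
```

If the last line prints None, the name python does not resolve for scripts, even if typing python in an interactive shell works.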

from subplz.

Whitesttax commented on August 26, 2024

Did as you said and it's running, but it's been like this for a while. Is that common? There is no .srt file in the folder yet, so it's probably not finished, and my PC is also still running hard.

Also, I saw the README and it's much easier to understand now.

Edit: might've been my fault, here's what I got after the program ended:
./run.sh: line 17: 838 Killed python3 split_run.py "$FOLDER"

Forgot to change the python thing, will try a guide using an alias.
Edit2: added the alias python=python3 to .bashrc. Now "python" gives me Python 3.9.9, but when I try ./run.sh I still get the same error: line 17: python: command not found

from subplz.

kanjieater commented on August 26, 2024

I'm not sure which step you were at when it was slow. Which step did it seem like? Here is what exists now:

1.1 (not pushed yet) Filter down the audio to improve later results: slow, and probably not heavy on CPU or GPU; if anything, heavier on CPU
1.2 split_run & stable-ts: starts off heavy on CPU & RAM to identify the audio spectrum
1.3 stable-ts: the GPU-heavy and long part, where it tries to transcribe text from the audio. This is the long progress bar part
1.4 Merge the vtt files for split subs
2.1 Split the script
2.2 Match the script to the generated transcription to get good timestamps

Any idea where you were at in there?

from subplz.

Whitesttax commented on August 26, 2024

Since the progress bar was completed I'm guessing 1.4+
That's all I got at the end. My laptop was running really hard there, after the progress bar hit 100.

Here's the folder just in case:

from subplz.

kanjieater commented on August 26, 2024

Uh oh, you ran out of memory! Unfortunately, that's step 1.3. If it won't work for you there, I don't have a software solution. You could change split_run.py from using the large-v2 model to the medium model, but the model's performance will be affected.
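As a rough illustration of that trade-off, here is a sketch that picks the largest model expected to fit in a given amount of free VRAM. The memory figures are approximate values of the kind listed in the Whisper README and should be treated as assumptions, not measurements; pick_model and MODEL_VRAM_GB are hypothetical names and not part of split_run.py. A later comment in this thread also notes that a larger model is not always the more accurate choice with stable-ts.

```python
from typing import Optional

# Approximate VRAM needs in GB, largest model first.
# Assumed, rough figures; actual usage depends on the setup.
MODEL_VRAM_GB = [
    ("large-v2", 10.0),
    ("medium", 5.0),
    ("small", 2.0),
    ("base", 1.0),
    ("tiny", 1.0),
]


def pick_model(free_vram_gb: float) -> Optional[str]:
    """Return the largest model whose assumed VRAM need fits, or None."""
    for name, needed_gb in MODEL_VRAM_GB:
        if free_vram_gb >= needed_gb:
            return name
    return None


if __name__ == "__main__":
    for vram in (12, 6, 1.5, 0.5):
        print(vram, "GB ->", pick_model(vram))
```

A result of None means that, under these assumed figures, none of the models is expected to fit on the GPU.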

from subplz.

Whitesttax commented on August 26, 2024

I see! I'll try again while closing other apps and monitor it with task manager

from subplz.

kanjieater commented on August 26, 2024

You can modify this line to any of the models in split_run.py. It may get overwritten in future git pulls, though. I think you can use any model that Whisper supports. Try a smaller one, but a smaller model has fewer parameters, which generally means less accurate results.

The other solution would be to find a way to cut your files up even smaller. I do not have a solution for that though currently. You would have to make sure to not cut up anywhere besides silent parts, and ideally not between a sentence.
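The idea of cutting only at silent parts can be sketched as follows. This is not something the project provides; it is a minimal illustration that assumes the audio has already been decoded into a list of amplitude samples between -1.0 and 1.0. The names find_silences and cut_points, the threshold, and the minimum silence length are all hypothetical choices. It cannot tell whether a silence falls between sentences, so longer silences are only a rough proxy for that.

```python
from typing import List, Sequence, Tuple


def find_silences(
    samples: Sequence[float],
    sample_rate: int,
    threshold: float = 0.01,
    min_silence_s: float = 0.5,
) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds of runs quieter than `threshold`
    that last at least `min_silence_s` seconds."""
    min_len = int(min_silence_s * sample_rate)
    silences = []
    start = None  # index where the current quiet run began, if any
    for i, value in enumerate(samples):
        if abs(value) < threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                silences.append((start / sample_rate, i / sample_rate))
            start = None
    # Handle a quiet run that extends to the end of the audio.
    if start is not None and len(samples) - start >= min_len:
        silences.append((start / sample_rate, len(samples) / sample_rate))
    return silences


def cut_points(silences: List[Tuple[float, float]]) -> List[float]:
    """Place one candidate cut in the middle of each silent span."""
    return [(start + end) / 2 for start, end in silences]
```

Cutting in the middle of a silent span leaves some quiet padding on both sides of the cut, which reduces the chance of clipping the start or end of a word.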

from subplz.

Whitesttax commented on August 26, 2024

I thought it was RAM not VRAM, so yeah it didn't work even with "tiny"

I guess I need to buy a GPU!

But really, thanks a lot for all the help, I learned a ton about ubuntu/python.
I wanted to use this with jpdb's mpv plugin, which has color coded words based on my account's known words. It'd be amazing to mine as if it was anime, but for audiobooks.

from subplz.

kanjieater commented on August 26, 2024

Just want to correct something I said above.

After running a few tests, it seems that medium actually tends to outperform large because of how stable-ts works. jianfch/stable-ts#80 (comment)

Tiny also seems to perform almost as well, to the point that I might make it the default.

If you have new questions or give it a go again in the future, feel free to open a new issue.

from subplz.
