
Comments (17)

Whitesttax avatar Whitesttax commented on August 26, 2024

I see!
I did that, here's how it looks now:
image

That was done instantly so I think something else is missing. Also can't find that .vtt file.

from subplz.

kanjieater avatar kanjieater commented on August 26, 2024

> I thought it was RAM not VRAM, so yeah it didn't work even with "tiny". I guess I need to buy a GPU!
>
> But really, thanks a lot for all the help, I learned a ton about ubuntu/python. I wanted to use this with jpdb's mpv plugin, which has color coded words based on my account's known words. It'd be amazing to mine as if it was anime, but for audiobooks.

Ah, that's a good point. Yes, it's VRAM, though the error from the OS doesn't really indicate that. For Japanese learning, we might share some resources for this on TMW, or you can check out my discord for more updates.


kanjieater avatar kanjieater commented on August 26, 2024

Yeah, I'm happy to help. Sorry it's not clearer. You're the first person besides me to use this. So first off, treat all the installs like they were inside Ubuntu. So you don't need to install ffmpeg on Windows, but you do need to in Ubuntu. Then it should just be on the path.

Next, try running the split.sh command to get the m4b split into parts. Then you can run split_run which should run without any memory issues if your m4b got cut into chapters.

The run command you had should work too, if you don't feel like splitting and have enough RAM to spare. Though I haven't been using it myself lately. Try converting your path to WSL form, so something like "/mnt/c/documents/name", and using that instead of the wslpath -a command, which only has the purpose of converting a Windows path to a Unix path.
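As an illustration of the path conversion described above, here is a minimal Python sketch. The helper name `windows_to_wsl` is hypothetical, and it assumes the default WSL layout where drive `C:` is mounted at `/mnt/c`. Inside WSL, `wslpath` remains the authoritative tool; this only shows what the result should look like.

```python
from pathlib import PureWindowsPath


def windows_to_wsl(path: str) -> str:
    """Convert a drive-letter Windows path to its /mnt/<drive>/ form under WSL."""
    win = PureWindowsPath(path)
    drive = win.drive  # e.g. "C:"
    if len(drive) != 2 or drive[1] != ":":
        raise ValueError(f"expected a drive-letter path, got {path!r}")
    # win.parts[0] is the anchor ("C:\\"); the rest are the path components.
    return "/".join(["/mnt", drive[0].lower(), *win.parts[1:]])


print(windows_to_wsl(r"C:\documents\name"))  # prints /mnt/c/documents/name
```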

I'm mobile right now, but if any of that doesn't make sense, let me know what you can try and what happens and I'll try to make clearer docs based on your pain points


Whitesttax avatar Whitesttax commented on August 26, 2024

This is what you meant, right? I feel like I'm missing something basic about ubuntu. All the installs I did were by command line on ubuntu, not windows. So ffmpeg, python, pip, stable-ts, were all installed on ubuntu wsl. Again, I didn't do step 4.

image

I get this if I type stable-ts, not sure if it helps
image


kanjieater avatar kanjieater commented on August 26, 2024

Ah I think I see your issue.
image
Your path is ./run.sh which means you are saying ./ is your current directory, and it has run.sh inside it. Since you showed your ls command, we can see you are not inside the repo!

Try running the command from inside this repo.
So after something like cd ~/thisProject, run.sh should be in your current directory.


kanjieater avatar kanjieater commented on August 26, 2024

I can clean up the docs to clarify this but you should download the project with git.
git clone https://github.com/kanjieater/AudiobookTextSync.git
Then cd AudiobookTextSync

You should then have a folder with this stuff in it when you ls
image


kanjieater avatar kanjieater commented on August 26, 2024

Go ahead and git pull again. I've updated the readme. You should now be using ./run.sh for split or unsplit m4b's, and they should just work now. Let me know how it goes, and thanks for being patient with the troubleshooting.

In addition, there is now an anki command documented that can turn your m4b into an anki deck and import it for you.


Whitesttax avatar Whitesttax commented on August 26, 2024

Hey, thanks for the answer!!
I'm making progress now.
I did the git pull again (had to delete the old folder first) and I was getting some errors at first, like: ./split.sh: line 5: cd: too many arguments. I then pasted the audiobook folder inside AudiobookTextSync and used the absolute path for ubuntu:
powershell: \wsl.localhost\Ubuntu\home\pacote\AudiobookTextSync> wsl wslpath -a name
result: /home/pacote/AudiobookTextSync/name
Then it worked! Now I have the files split, but I'm getting an error when trying to ./run.sh:
image

Edit: I tried changing the code from run.sh line 17-19-20 from "python" to "python3" just to see what happens, then I get this instead:
image
I don't know if that helps!

I also tried rerunning pip install stable-ts just in case that's what the error meant, but it still happens even after rerunning it.
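One common cause of bash's `cd: too many arguments` error is an unquoted path that contains a space; whether that was the cause in this case is an assumption, since the real folder name isn't shown. The sketch below uses Python's `shlex` module, which follows shell-like splitting rules, to show how a hypothetical folder name breaks into two arguments unless it is quoted.

```python
import shlex

folder = "/home/user/My Audiobook"  # hypothetical folder name containing a space

# Unquoted, the shell sees two separate arguments after `cd`.
unquoted = shlex.split(f"cd {folder}")
# Quoted, the folder stays a single argument.
quoted = shlex.split(f"cd {shlex.quote(folder)}")

print(unquoted)  # ['cd', '/home/user/My', 'Audiobook']
print(quoted)    # ['cd', '/home/user/My Audiobook']
```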


kanjieater avatar kanjieater commented on August 26, 2024

Good to hear you are making progress! Try these things

  1. You need to be on python 3.9.9. python --version should say 3.9.9
  2. Then install your dependencies, stable-ts from pip as pip install git+https://github.com/jianfch/stable-ts.git
  3. pip install -r requirements.txt
  4. Don't change the script; instead, make sure python on your path resolves to Python 3. (If step one is working, there's nothing to do here)

If you do all of those things it should be working 👍
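The version requirement in step 1 can also be checked from inside the interpreter. This is a small sketch; `matches_version` is a hypothetical helper, and the expected version is simply the 3.9.9 quoted above. Saving it to a file and running it with `python` shows which interpreter that command actually starts.

```python
import sys


def matches_version(actual, expected=(3, 9, 9)):
    """Return True when the first three version fields equal `expected`."""
    return tuple(actual[:3]) == tuple(expected)


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if matches_version(sys.version_info) else "mismatch"
    print(f"python {found}: {status}")
```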


Whitesttax avatar Whitesttax commented on August 26, 2024

Did as you said and it's running, but it's like this for a while, is that common? There's no .srt file in the folder yet so it's probably not finished, my PC is also still running hard.
image

Also I saw the readme and it's much easier to understand now.

Edit: might've been my fault, here's what I got after the program ended:
./run.sh: line 17: 838 Killed python3 split_run.py "$FOLDER"

Forgot to change the python thing, will try a guide using an alias.
Edit2: added python=python3 to .bashrc, now "python" gives me python 3.9.9 but when I try to "./run.sh" I still get the same, line 17: python: command not found
image
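A likely explanation for the behaviour above: bash aliases defined in .bashrc are expanded only in interactive shells, so a script such as run.sh does not see them and needs a real `python` executable on the PATH. The sketch below uses `shutil.which`, which searches the PATH and ignores aliases, to show what a script would find. `resolve_commands` is a hypothetical helper name.

```python
import shutil


def resolve_commands(names):
    """Map each command name to the executable found on PATH, or None."""
    return {name: shutil.which(name) for name in names}


for name, location in resolve_commands(["python", "python3"]).items():
    print(f"{name}: {location or 'not found on PATH'}")
```

If `python` shows as not found while `python3` resolves, the alias is masking a missing executable, and something like a symlink named `python` on the PATH would be needed instead.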


kanjieater avatar kanjieater commented on August 26, 2024

I'm not sure which step you were at when it was slow. Which step did it seem like? Here's what exists now:

1.1 (not pushed yet) Filter down audio to improve future results - slow, and probably not heavy on CPU or GPU, though heavier on CPU of the two
1.2 split_run & stable-ts: Starts off heavy on CPU & RAM to identify the audio spectrum
1.3 stable-ts: GPU heavy & long part, where it tries to transcribe a text from the audio. This is the long progress bar part
1.4 Merge vtt's for split subs
2.1 Split the script
2.2 match the script to the generated transcription to get good timestamps
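To make step 1.4 more concrete: when subtitles are generated per split part, each part's timestamps start at zero, so merging generally means shifting every later part by the total duration of the parts before it. The sketch below shows only that shifting step and is not the project's actual implementation. `shift_timestamps` is a hypothetical helper, it handles only the `HH:MM:SS.mmm` timestamp form, and it would also rewrite any matching text inside a cue.

```python
import re

TIMESTAMP = re.compile(r"(\d{2,}):(\d{2}):(\d{2})\.(\d{3})")


def shift_timestamps(vtt_text: str, offset_seconds: float) -> str:
    """Shift every HH:MM:SS.mmm timestamp in vtt_text by offset_seconds."""
    offset_ms = round(offset_seconds * 1000)

    def shift(match):
        h, m, s, ms = (int(group) for group in match.groups())
        total = ((h * 60 + m) * 60 + s) * 1000 + ms + offset_ms
        total = max(total, 0)  # clamp so a negative offset never goes below zero
        h, rem = divmod(total, 3_600_000)
        m, rem = divmod(rem, 60_000)
        s, ms = divmod(rem, 1000)
        return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"

    return TIMESTAMP.sub(shift, vtt_text)


cue = "00:00:01.500 --> 00:00:03.000"
print(shift_timestamps(cue, 3600.25))  # 01:00:01.750 --> 01:00:03.250
```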

Any idea where you were at in there?


Whitesttax avatar Whitesttax commented on August 26, 2024

Since the progress bar was completed I'm guessing 1.4+
That's all I got at the end. My laptop was running really hard there, after the progress bar hit 100.
image
Here's the folder just in case:
image
image


kanjieater avatar kanjieater commented on August 26, 2024

Uh oh, you ran out of memory! Unfortunately that's step 1.3. If it won't work for you there, I don't have a software solution. You could change split_run.py from using the large-v2 model to the medium model, but the performance of the model will be affected.
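A bare `Killed` from bash usually means the process received SIGKILL, which on Linux is often the kernel's out-of-memory killer. As a rough pre-flight check, the sketch below reads how much system memory Ubuntu (or WSL) reports as available. The helper names are hypothetical, and it measures system RAM only; it says nothing about GPU memory.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: kibibytes}."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def available_gib(meminfo_text: str) -> float:
    """Return the MemAvailable field converted from KiB to GiB."""
    return parse_meminfo(meminfo_text)["MemAvailable"] / (1024 * 1024)


if __name__ == "__main__":
    with open("/proc/meminfo") as handle:  # Linux/WSL only
        print(f"{available_gib(handle.read()):.1f} GiB available")
```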


Whitesttax avatar Whitesttax commented on August 26, 2024

I see! I'll try again while closing other apps and monitor it with task manager


kanjieater avatar kanjieater commented on August 26, 2024

image
You can modify this line to any of the models in split_run.py. It may get overwritten in future git pulls though. You can use any that whisper supports I think. Try a smaller one, but smaller means less training which means less accurate results.
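As a rough guide to choosing a smaller model, the sketch below picks the largest Whisper model whose approximate memory need fits a given budget. The figures are the approximate VRAM requirements listed in the openai/whisper README (with large-v2 assumed to match the README's "large" figure) and should be read as guidance only. `pick_model` is a hypothetical helper, not part of split_run.py.

```python
# Approximate VRAM needs in GB, as listed in the openai/whisper README;
# treat them as rough guidance, not guarantees.
APPROX_VRAM_GB = [
    ("large-v2", 10),
    ("medium", 5),
    ("small", 2),
    ("base", 1),
    ("tiny", 1),
]


def pick_model(free_vram_gb: float) -> str:
    """Return the largest model whose approximate requirement fits."""
    for name, needed in APPROX_VRAM_GB:
        if free_vram_gb >= needed:
            return name
    return "tiny"  # smallest option; may still fail with very little memory


print(pick_model(6))  # medium
```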

The other solution would be to find a way to cut your files up even smaller. I do not have a solution for that though currently. You would have to make sure to not cut up anywhere besides silent parts, and ideally not between a sentence.
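One possible starting point for finding safe cut points is ffmpeg's `silencedetect` audio filter, which logs `silence_start` and `silence_end` times when run as, for example, `ffmpeg -i input.m4b -af silencedetect=noise=-30dB:d=0.5 -f null -`. The threshold and duration there are arbitrary assumptions, as is the file name. The sketch below parses that log output into candidate cut points at the middle of each silence; `silence_midpoints` is a hypothetical helper. A silent gap does not guarantee a sentence boundary, so the results would still need checking.

```python
import re

SILENCE_START = re.compile(r"silence_start:\s*(-?\d+(?:\.\d+)?)")
SILENCE_END = re.compile(r"silence_end:\s*(-?\d+(?:\.\d+)?)")


def silence_midpoints(log_text: str) -> list:
    """Return the midpoint (seconds) of each silence in silencedetect output."""
    starts = [float(value) for value in SILENCE_START.findall(log_text)]
    ends = [float(value) for value in SILENCE_END.findall(log_text)]
    # zip drops a trailing silence_start that has no matching silence_end.
    return [(start + end) / 2 for start, end in zip(starts, ends)]


sample_log = """
[silencedetect @ 0x1] silence_start: 12.5
[silencedetect @ 0x1] silence_end: 14.0 | silence_duration: 1.5
[silencedetect @ 0x1] silence_start: 30.0
[silencedetect @ 0x1] silence_end: 31.0 | silence_duration: 1.0
"""
print(silence_midpoints(sample_log))  # [13.25, 30.5]
```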


Whitesttax avatar Whitesttax commented on August 26, 2024

I thought it was RAM not VRAM, so yeah it didn't work even with "tiny"
image
I guess I need to buy a GPU!

But really, thanks a lot for all the help, I learned a ton about ubuntu/python.
I wanted to use this with jpdb's mpv plugin, which has color coded words based on my account's known words. It'd be amazing to mine as if it was anime, but for audiobooks.


kanjieater avatar kanjieater commented on August 26, 2024

Just want to correct something I said above.

After running a few tests it seems that Medium tends to outperform large actually due to how stable-ts works. jianfch/stable-ts#80 (comment)

Tiny also performs almost as well, it seems, to the point that I might leave that as the default.

If you have new questions or give it a go again in the future, feel free to open a new issue.

