
Comments (5)

bardisty commented on July 30, 2024

There should be no leftover files unless you modified the rclone_command variable from move to copy:

-rclone_command="move"
+rclone_command="copy"

If you haven't already, try running the script with debug=true. This should tell you what the culprit is. If it doesn't, please post the output here so I can take a look.

As for the .webm.parts, I'd guess youtube-dl hasn't completed those downloads yet, or failed to for some reason. youtube-dl should continue downloading them the next time you run ytdlrc. If it doesn't, check whether their IDs are in the archive.list; if so, delete them and run ytdlrc again.
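The archive lookup described above can be sketched in shell. This assumes youtube-dl's default archive format of one `youtube VIDEO_ID` entry per line and GNU sed's -i for in-place editing; the paths and the second ID are invented for the example:

```shell
# Check whether a leftover download's ID is in the archive, and clear it
# so youtube-dl will retry the download on the next run.
archive=$(mktemp)   # stand-in for the real archive.list
printf 'youtube gFoq8-Xszjs\nyoutube dQw4w9WgXcQ\n' > "$archive"

id="gFoq8-Xszjs"    # ID taken from a leftover gFoq8-Xszjs.webm.part
if grep -q "^youtube ${id}\$" "$archive"; then
  # Entry exists: remove it so the ID is no longer considered "done"
  sed -i "/^youtube ${id}\$/d" "$archive"
fi
```

Anchoring the pattern with ^ and $ avoids accidentally matching an ID that happens to be a substring of another.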

from ytdlrc.

bardisty commented on July 30, 2024

I've added some checks to prevent the script from running if the rclone config isn't found or if the specified remote has any issues (not found, unauthorized, etc.). If either of these are the culprit the latest version should let you know.

To update:

  • Navigate to where you downloaded the repository
  • If you haven't already:
    • git add ytdlrc
    • git commit -m "Modify options to my taste"
  • git pull (and resolve any merge conflicts)


bardisty commented on July 30, 2024

Just realized you meant it's only failing to upload the metadata, not everything. This is partly due to youtube-dl's --exec option: it only uploads the downloaded video on completion, not any metadata. Metadata is uploaded/moved after all videos have been downloaded and moved to the rclone remote. If the channel is large enough (and the VPS disk small enough), the metadata alone might fill the disk before the script finishes processing all the videos.

For a channel with ~450 videos, the metadata (.description, .info.json, .jpg) should come out to roughly 120-130MB: the .jpg files account for about 80MB of that, .info.json ~45MB, and .description ~350KB. I'm guessing your disk filled up primarily due to those .webm.part files (I've yet to encounter this myself; curious what's causing it).

A couple of ways you can try to work around this:

  • If your disk is really small and you don't care about thumbnail images, you could remove the --write-thumbnail line from the download_all_the_things() function (ytdlrc, line 258 at commit 3d6b2fc):

        --write-thumbnail \

    If the channel has thousands of videos, this may save enough space for ytdlrc to finish processing the entire channel.
  • Utilize youtube-dl's --datebefore or --dateafter options. E.g., add --datebefore 20150101 to the download_all_the_things() function (I recommend putting it after the --continue option). This would download only videos uploaded before Jan 1st, 2015. Once that run completes, change it to 20160101 and run it again, incrementing the date by one year each time until all videos are processed, then remove the line.
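The incrementing-date idea above can be sketched as a small loop. This only computes the --datebefore value for each pass; the start and end years are examples, and wiring the flag into the download_all_the_things() function is left to the reader:

```shell
# Build the --datebefore flag for each yearly pass (2015..2018 here).
year=2015
last=2018
flags=""
while [ "$year" -le "$last" ]; do
  flags="$flags --datebefore ${year}0101"
  year=$((year + 1))
done
echo "passes:$flags"
```

Each pass bounds how much the stage directory can grow before ytdlrc moves the batch to the rclone remote.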

As for cleaning up the mess so ytdlrc can continue:

  • Grab the youtube IDs of the incomplete downloads in case you need to remove them from the archive.list: ls *.webm.part
  • Search the archive.list for those IDs and remove any entries you find
  • Free up some disk space by deleting the .webm.part files
  • Run ytdlrc with debug=true
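The cleanup steps above can be sketched end to end. mktemp stands in for the real stage directory, the file/ID names are examples, and GNU sed is assumed for in-place editing:

```shell
# Set up a throwaway stage directory with one stuck download.
stage=$(mktemp -d)
archive="$stage/archive.list"
touch "$stage/gFoq8-Xszjs.webm.part"
printf 'youtube gFoq8-Xszjs\nyoutube dQw4w9WgXcQ\n' > "$archive"

# For each partial download: derive the ID from the file name, drop it
# from the archive so youtube-dl retries it, then delete the .part file
# to free the disk space.
for part in "$stage"/*.webm.part; do
  id=$(basename "$part" .webm.part)
  sed -i "/^youtube ${id}\$/d" "$archive"
  rm -f "$part"
done
```

On a default install the real paths would be under /opt/ytdlrc; adjust to wherever your stage directory and archive.list live.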


if1mad commented on July 30, 2024

Thank you for the very in-depth reply!

After some testing and thinking, I believe I've figured out the cause: I was running rclone with the custom flag --drive-chunk-size=128M and ran out of RAM at some point, which cancelled the rclone upload while ytdlrc continued working its way down the list. After this repeated enough times, my drive became full!

Totally my fault, I must admit!

Something I noticed when looking through the limited screen session log is that ytdlrc will continue working through the list even with a full disk:


[download] Downloading video 605 of 661
[youtube] gFoq8-Xszjs: Downloading webpage
[youtube] gFoq8-Xszjs: Downloading video info webpage
WARNING: unable to extract uploader nickname
WARNING: video doesn't have subtitles
[youtube] gFoq8-Xszjs: Looking for automatic captions
WARNING: Couldn't find automatic captions for gFoq8-Xszjs
ERROR: unable to create directory [Errno 28] No space left on device: '/opt/ytdlrc/stage/FailArmy'
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 1751, in ensure_dir_exists
    os.makedirs(dn)
  File "/usr/lib/python2.7/os.py", line 157, in makedirs
    mkdir(name, mode)
OSError: [Errno 28] No space left on device: '/opt/ytdlrc/stage/FailArmy'


I have a suggestion, if you don't mind: a minimum-free-disk-space check before each youtube-dl download is started.

Thank you for creating this wonderful script!


bardisty commented on July 30, 2024

Suggestions are more than welcome!

A free space check before each download is a possibility if I can work out a sane way to add it.
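For reference, one possible shape for such a check, sketched with POSIX df (min_free_kb is an invented threshold, not an existing ytdlrc option):

```shell
# df -P prints available space in 1K blocks in column 4; -P also
# guarantees one line per filesystem, so the awk column index is stable.
min_free_kb=512000   # require ~500MB free before starting a download
free_kb=$(df -Pk . | awk 'NR==2 {print $4}')

if [ "$free_kb" -ge "$min_free_kb" ]; then
  echo "enough space: ${free_kb}K free"
else
  echo "low space: ${free_kb}K free, skipping download" >&2
fi
```

The "." would be the stage directory in practice, so the check measures the filesystem the downloads actually land on.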

What you can do in the meantime is tell youtube-dl to abort if it encounters any errors by removing the --ignore-errors flag:

(ytdlrc, line 252 at commit 3d6b2fc)

    --ignore-errors \

The downside to this is that some videos tend to be unavailable (usually due to copyright) and youtube-dl treats these as errors, so it will skip the rest of the channel even if your VPS has plenty of space left. If you opt to go this route, you'll need to check your syslog periodically, or run ytdlrc manually now and then, to see if any videos are causing youtube-dl to abort; if so, manually add their youtube IDs to the archive.list so youtube-dl can continue with the rest of the channel.
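Appending an ID by hand is a one-liner in youtube-dl's archive format of one `youtube VIDEO_ID` entry per line (the ID and temp file here are invented for the example):

```shell
# Mark a video as "already downloaded" so youtube-dl skips past it.
archive=$(mktemp)   # stand-in for the real archive.list
id="abc123xyz00"    # the offending video's youtube ID

# Only append if the entry isn't already there, to avoid duplicates.
grep -q "^youtube ${id}\$" "$archive" || echo "youtube ${id}" >> "$archive"
```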

FWIW, it should be fairly safe to keep --ignore-errors. I've been running ytdlrc on 15 channels for over a year now, and the only time I've run out of disk space (on a tiny 10GB partition) was when one of the channels uploaded a 24-hour-long 4K video. Of course, YMMV.

