lbryio / ytsync
An overly complex tool to mirror youtube content to LBRY
Home Page: https://lbry.com/youtube
License: MIT License
New error as of a few weeks ago, affecting 30-50 videos per day:
```
download error: ERROR: unable to download video data: <urlopen error [Errno 110] Connection timed out>
```
We want to support publishing to channels that are transferred to their owners so that they don't have the burden of doing it themselves yet still own the content.
Right now the channel won't get reprocessed, and live streams/premieres won't be uploaded if they failed because the stream was not yet live or was still in progress. The only time they are reprocessed is when a new video is published.
This could be done as part of #25
@kauffj commented on Tue May 15 2018
Some publishers do not wish to retain copyright or wish to publish under creative commons or similar.
If this information is available via YouTube, we should match it.
If not, we should ask but provide a sane default (e.g. it could be on the lbry.io/youtube/token status page).
Steps:
Read status.license from the YouTube API (https://developers.google.com/youtube/v3/docs/videos#status.license).
If the license is creativeCommon, publish under the Creative Commons license (check which one YouTube uses and use the same).
If it is youtube, do whatever we do now.

This still happens for some reason, and other times it's blocked: https://lbry.tv/@Crypto-Tips:b/will-alts-outperform-btc-gen-z-rebellion:1
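A sketch of that license mapping (the claim license strings here are assumptions, not what ytsync actually writes):

```go
package main

import "fmt"

// mapLicense translates the YouTube API's status.license value into the
// license string attached to the LBRY claim. YouTube only distinguishes
// "youtube" (standard license) and "creativeCommon". The returned claim
// strings are illustrative placeholders.
func mapLicense(ytLicense string) string {
	switch ytLicense {
	case "creativeCommon":
		// YouTube's Creative Commons upload option is Attribution (CC BY).
		return "Creative Commons Attribution (CC BY)"
	default:
		// "youtube" or anything unexpected: keep current behavior.
		return "Copyrighted (contact publisher)"
	}
}

func main() {
	fmt.Println(mapLicense("creativeCommon"))
	fmt.Println(mapLicense("youtube"))
}
```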
This would involve:
If there are circular references in video descriptions, I do not see a way to do this without performing an additional edit on some publishes.
If a video is premiered or live streamed, it ends up in a failed status with "download error: no compatible format available for this video". I'm guessing they come in through the API but aren't actually available yet. We need to reprocess these automatically, or skip them until they are available (I think the API should be able to tell us).
Some examples:
https://www.youtube.com/watch?v=drmXDZ7Jad4 (premiered)
https://www.youtube.com/watch?v=7fd_LLYIauM (live)
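A minimal pre-check along those lines, assuming we key off the snippet.liveBroadcastContent field the YouTube Data API returns ("none", "live", or "upcoming"):

```go
package main

import "fmt"

// shouldDefer reports whether a video should be skipped for now and
// reprocessed later, based on snippet.liveBroadcastContent from the
// YouTube Data API. Only "none" means the VOD is actually downloadable;
// "live" and "upcoming" cover in-progress streams and premieres.
func shouldDefer(liveBroadcastContent string) bool {
	switch liveBroadcastContent {
	case "live", "upcoming":
		return true // stream in progress or premiere not yet available
	default:
		return false
	}
}

func main() {
	fmt.Println(shouldDefer("upcoming")) // premiere: defer
	fmt.Println(shouldDefer("none"))     // normal VOD: process now
}
```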
See this page as an example.
```
mogrify -trim -bordercolor black -fuzz x%
```
This ImageMagick command may get you most of the way there, but it's worth running it over a decent sample set to ensure it doesn't fail on images that are already pretty black.
There are a bunch of videos in failed state with "publish error: Error in daemon: Not enough funds to cover this transaction." / "download error: unexpected EOF" (not sure why this one is happening)
At least for popular channels (either by YT status, IAPI subs, or both)
For some time period over the last two weeks, videos from ytsync
were malformed, reducing their streamability.
Requires similar issues to be completed on types/lbrysdk as #18
Not sure if we want to start publishing with new metadata as soon as it's available or only once the full set is ready (if we publish immediately, we'll need to update the claims later with the remaining metadata).
From the download file, grab the filename and file size to populate name/size in lbryio/types#9.
Grab video length from the YouTube api (or from lbry-sdk process if that's easier) to populate length in lbryio/types#9
Grab published date from YouTube api to populate releaseTime in lbryio/types#13
Grab category/tags from YouTube api to populate in lbryio/types#15 or lbryio/types#16
Grab language from YouTube api and populate language on claim metadata
Research other relevant youtube API data that may require new types/sdk entries.
Store any of these fields in the synced videos table as needed
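The steps above could be collected into something like this before writing to the synced videos table (field names are illustrative, not the real schema):

```go
package main

import (
	"fmt"
	"time"
)

// videoMetadata gathers the fields the steps above pull from the download
// file and the YouTube API before publishing. This is a sketch; the real
// shapes live in the lbryio/types protos and the synced videos table.
type videoMetadata struct {
	FileName    string        // from the downloaded file
	FileSize    int64         // bytes, from the downloaded file
	Duration    time.Duration // from the YouTube API or lbry-sdk
	ReleaseTime time.Time     // YouTube publishedAt -> releaseTime
	Tags        []string      // YouTube category/tags
	Language    string        // YouTube default language
}

func main() {
	m := videoMetadata{
		FileName: "some-video.mp4",
		FileSize: 1 << 20,
		Language: "en",
	}
	fmt.Printf("%s (%d bytes, lang=%s)\n", m.FileName, m.FileSize, m.Language)
}
```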
YT sync channels go into failed mode on API hiccups or for other temporary reasons. We should retry these failed channels automatically.
videos published before transfer are not marked as transferred, which causes those videos to fail during transfer. This happens if there are new videos after we got an address/pubkey
the support list needs to use the default account ID, otherwise it will try to abandon supports for transferred videos and fail to sync
if the process breaks / needs a restart, supports need to be resent manually
571b35609b90cffceafee38242a6c48f9499c276 had duplicate synced_videos, so the counts were wrong
when we try to delete some content after transfer, things fail - https://lbryians.slack.com/archives/D5W0D8ZJP/p1569617590246000
As part of creator partnerships @robvsmith is forming, creators are going to be embedding LBRY URLs directly in YouTube descriptions.
Preferably, creators would be able to know the LBRY URL of that video at the time of publishing to YouTube (and thus before picked up by sync).
Potential solutions to this should be discussed and reviewed before implementation.
Support transfer of claims and channels for YouTubers who have requested their content to be sent to their local wallets.
Main features:
@lyoshenka commented on Fri Mar 16 2018
I believe he has about 2 or 3 channels from old syncs - we need to fix this up.
See https://github.com/lbryio/lbry-redux/blob/362d764c4c0de23032b6871b4d54207dc548028c/src/lbryURI.js#L7 for what the app / sdk are using
@nikooo777 how feasible would it be to allow @robvsmith and other growth team members to edit YouTube publishes and channels? Ideally via the same UI in app.
Right now we have a pull-type service where the channel needs to be processed to determine if there are newly published videos. It can take anywhere from 1 minute to 24 hours.
@tzarebczan commented on Tue Jul 24 2018
In order to support existing LBRY creators who migrate to YouTube Sync (non-public/internal api issue - https://github.com/lbryio/internal-apis/issues/471), we would need to capture additional data like their channel signing certificate (needs to be transferred/stored securely) and wallet address to start publishing their content automatically.
@alyssaoc we may want to create an epic (or turn this issue into an epic) if and when we are ready to spec it out.
Looks like a bunch of youtube syncs were set to abandoned status prematurely because they were not rewards approved. Not sure how to best handle this yet - but we may have some YT sync users that get set to rewards disabled initially, and then set to approved. Some of these were synced, and then set to abandoned.
@nikooo777 commented on Wed Oct 03 2018
We want to start using the new daemon.
To do that, we need to figure out which API calls changed and fix them.
https://www.youtube.com/watch?v=qJ7YuscmPsk
lbry://why-i-m-quitting-online-coaching#559a41f0602288a8d130324c6541d26c2f689d3c
Maybe we are trying to download the video while it's still processing? Does the api tell us?
There are about 26 videos (one for a channel I was recently looking into), with this error: thumbnail error: error creating thumbnail: Failed retrieving thumbnail with status code: 404
Here are 2 samples: https://www.youtube.com/watch?v=JXxhafcHF4Y
https://www.youtube.com/watch?v=kKXqxsXv5sw
This will serve as the ytsync issue to implement the below and an epic to track the required types/sdk changes. When the sdk tickets are filed, I'll update it.
Setup a process to perform claim updates, preferably in batch mode
(lbryio/lbry-sdk#1821). If easy update mode is available (lbryio/lbry-sdk#1423), use it, if not, read from existing claim data. This also includes grabbing the sdblob for each claim in order to populate some of the new file related fields.
From the sdblob, grab the filename and file size to populate name/size in lbryio/types#9.
Grab video length from the YouTube api to populate length in lbryio/types#9
Grab published date from YouTube api to populate releaseTime in lbryio/types#13
Grab category/tags from YouTube api to populate in lbryio/types#15 or lbryio/types#16
Grab language from YouTube api and populate language on claim metadata
Research other relevant youtube API data that may require new types/sdk entries.
@nikooo777 commented on Wed Jul 04 2018
Some content publishers want to have their videos published for a price.
For this reason ytsync must fetch the following parameters from the API: fee_amount, fee_currency, and fee_address, and then supply them when calling the publish method.
This issue depends on the implementation on the API side first.
@tzarebczan commented on Tue Jul 31 2018
We recently synced some content without a price for a creator who was promised we'd sync at 1 LBC.
@nikooo777 commented on Thu Aug 02 2018
and I agree that's a problem. I will try to get this in my next sprint
@kauffj commented on Tue May 22 2018
This is not a good choice long-term. Ideally it would be spee.ch.
@nikooo777 commented on Tue Oct 02 2018
I'd like to have a discussion on this.
I disagree that we should use spee.ch for thumbnails, but I agree that we should use a better domain in place of berk.ninja
I can discuss this at the office with Jeremy, but my TL;DR would be: if we're not doing it in a completely decentralized way, then we should just stick to a centralized solution that we know works fine and will keep working fine.
Using spee.ch would only add a burden: for each single claim we'd need a second claim just for the thumbnail, polluting the blockchain with content that is really just metadata of another claim. On top of that, we'd depend on spee.ch being online and working forever (both the domain and the infrastructure), rather than just worrying about the domain and the data being on S3 (easily moved/replaced).
I don't see this solution scaling well, and it could become a huge problem in the future, so either we bundle the thumbnail with the claim (so that the thumbnail itself is part of the associated blobs) or we use the centralized solution.
edit: this ended up not being a tl;dr....
@lyoshenka commented on Wed Oct 03 2018
I agree with niko - putting this on spee.ch means we're making two claims for each upload. It also means if spee.ch goes down, all of the thumbnails will break. Are we committed to spee.ch being up with some SLA? I thought we still use it for testing things, running advanced SDK builds, etc.
Currently, the duplicate checker may modify claims incorrectly if the YouTuber were to edit/publish/abandon some claims locally. We should be able to send them their wallets and continue publishing from YT.
The bitrate on the 720p exports is quite low (~600 kbps), so we should consider moving to 1080p if it's better and the file size is not that much larger.
So that we can identify where creators come from and have the right people talk to them
We had discussed the possibility of ensuring that YouTube channel lbry URLs (at least for channels with X amount of subs) resolve at their vanity names. This recently came up as a source of confusion on Twitter: https://twitter.com/eevblog/status/1080609595289042945
This shouldn't be too hard to set up as part of the YT sync process.
A video may fail to sync for many different reasons and may not be retried until a new video is published or the creator complains to us. We should put all the "legit" failures into a queue and have at least one server reprocess them.
This would also let us confirm at a glance that a channel or video is in the queue.
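A sketch of how failures might be classified as retryable for that queue, using error strings seen elsewhere in this tracker (the actual list would be an assumption to refine in config):

```go
package main

import (
	"fmt"
	"strings"
)

// isRetryable decides whether a failure reason is a "legit" transient
// failure that belongs in the retry queue. The matched substrings are
// examples pulled from errors reported in this tracker.
func isRetryable(reason string) bool {
	transient := []string{
		"Connection timed out",
		"unexpected EOF",
		"Not enough funds",
		"no compatible format available",
	}
	for _, s := range transient {
		if strings.Contains(reason, s) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isRetryable("download error: unexpected EOF"))
	fmt.Println(isRetryable("video removed by uploader"))
}
```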
See https://github.com/lbryio/lbry-sdk/releases/tag/v0.44.0
breaking API change for all list commands, which now always paginate - for a simple upgrade in client code without implementing pagination logic, use page: 1, page_size: 1000 (or similar) and just extract the list from items
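Under that scheme, a client-side shim can request one large page and unwrap items. The envelope field names follow the release notes; the RPC transport itself is mocked here:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pagedResult is the paginated envelope all *_list commands return since
// v0.44.0 (field names per the release notes).
type pagedResult struct {
	Items      []json.RawMessage `json:"items"`
	Page       int               `json:"page"`
	TotalPages int               `json:"total_pages"`
}

// unwrapItems extracts the item list from a paginated response, for
// callers that request one big page (page: 1, page_size: 1000) instead of
// implementing real pagination.
func unwrapItems(raw []byte) ([]json.RawMessage, error) {
	var r pagedResult
	if err := json.Unmarshal(raw, &r); err != nil {
		return nil, err
	}
	return r.Items, nil
}

func main() {
	// Mocked response; real code would send {"page": 1, "page_size": 1000}.
	resp := []byte(`{"items":[{"claim_id":"abc"}],"page":1,"total_pages":1}`)
	items, err := unwrapItems(resp)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(items))
}
```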
@tzarebczan commented on Tue Sep 11 2018
I think we may sometimes run into a scenario where we sync a video before it is post-processed in 720P+ on YouTube so we end up syncing a lower quality version. We should check when the video was posted, and if it's very recent (not sure what the timing for post-processing looks like), delay download/publish until the next iteration.
This could be happening due to a large # of tags with a combination of a long description (I know we truncate)
https://lbryians.slack.com/archives/CACSTN9SL/p1564663088289500
```
the transaction was rejected by network rules.

16: bad-txns-claimscriptsize-toolarge
```
[raw hex of the rejected transaction, containing the oversized claim script, omitted]
@nikooo777 commented on Thu Aug 02 2018
A few dirty things were merged in master to not leave the PR hanging any longer. They should be addressed:
@alyssaoc commented on Thu Nov 01 2018
Issue moved to lbryio/ytsync #4 via ZenHub
This way we won't sync something we didn't intend to.
don't fail channels when we run out of quota; instead wait until 00:00 PT (when the quota resets) and retry
The SDK part of this is defined here: lbryio/lbry-sdk#2879
The changes in ytsync would be: call stream_create/stream_update with --file_path_high_def passing the 1080p file and --file_path_low_def passing the 480p file.
https://lbryians.slack.com/archives/CACSTN9SL/p1564664667305700
Should this say 0? From other things I read, it should have a value of 10.
Believe we should be able to sync this legally by attributing the correct copyright.
https://www.youtube.com/user/khanacademy/videos
https://www.khanacademy.org/about/tos
"7.5 Crediting Khan Academy. If You distribute, publicly perform or display, transmit, publish, or otherwise make available any Licensed Educational Content or any derivative works thereof, You must also provide the following notice prominently along with such Licensed Educational Content or derivative work thereof: “All Khan Academy content is available for free at www.khanacademy.org”.
We ran into an issue recently where a bunch of existing channels were not being synced because they were in failed status and also because of some server issues. We should be able to re-try sync in some of these scenarios and provide better errors/notifications when things stop processing server side.
Main error to address:
```
Initial wallet setup failed! Manual Intervention is required.: channel_claim_id: cannot be blank.
```
Today, a channel in "deleted on youtube" status cannot be transferred. There are a couple of people in this boat requesting their channels.