Comments (7)
@ikreymer What is the detailed nature of the problem we are dealing with here? I kind of assumed videos were working, as this was one of the first impressive things we had in the development of warc2zim.
from zimit.
I couldn't capture lesfondamentaux with the current zimit. It doesn't capture the video. I tried with the embed page as well. It seems that the JS code creates the <video /> tag, so maybe it's a --waitUntil issue? How did you create yours?
This is not yet fully finished, but should work on lesfondamentaux.
Try the latest version from this branch: https://github.com/openzim/zimit/tree/config-opts
The command line I used was:
docker run -d -v ${PWD}/output:/output --cap-add=SYS_ADMIN --cap-add=NET_ADMIN --shm-size=1gb openzim/zimit --workers 6 --url https://lesfondamentaux.reseau-canope.fr/accueil.html
You can add --limit 200 to try just a subset of pages.
(Edit: a few hundred is probably a good limit; 20 is too low, as that will only get the category pages.)
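For anyone who wants to switch between a trial crawl and a full one, here is a minimal sketch that only assembles and prints the command above, with --limit added when a page count is given. The function name build_zimit_cmd is hypothetical (it is not part of zimit), and every flag is copied from the command in this comment; nothing is executed, so treat it as a dry run rather than a supported wrapper.

```shell
#!/bin/sh
# Sketch only: build_zimit_cmd is a hypothetical helper, not part of zimit.
# It assembles the docker command quoted in this thread and prints it.
build_zimit_cmd() {
  url="$1"         # start URL for the crawl
  limit="${2:-}"   # optional page limit; empty means no --limit flag
  cmd="docker run -d -v ${PWD}/output:/output"
  cmd="$cmd --cap-add=SYS_ADMIN --cap-add=NET_ADMIN --shm-size=1gb"
  cmd="$cmd openzim/zimit --workers 6 --url $url"
  if [ -n "$limit" ]; then
    cmd="$cmd --limit $limit"
  fi
  printf '%s\n' "$cmd"
}

# Dry run: print the command for a trial crawl capped at 200 pages.
build_zimit_cmd "https://lesfondamentaux.reseau-canope.fr/accueil.html" 200
```

Calling the function without a second argument prints the same command without --limit, which corresponds to the full crawl.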
OK, FYI, it didn't work on the embed URL:
docker build . -t zimit && \
docker run -v $HOME/data/zimit/out:/output \
--cap-add=SYS_ADMIN --cap-add=NET_ADMIN --shm-size=1gb zimit \
--workers 6 --limit 200 --url https://lesfondamentaux.reseau-canope.fr/embed/limparfait-un-temps-regulier.html
Maybe there's something different about the embed. I'll check again on the next iteration.
FYI, I launched it on the whole fondamentaux site with the latest code and it worked fine.
Autoplaying of YouTube and regular videos (those that don't preload) will likely be needed for at least the following sites:
and possibly a few others that use video. The goal is to ensure that YouTube videos autoplay and that other videos are started manually as well.
@ikreymer What is the detailed nature of the problem we are dealing with here? I kind of assumed videos were working, as this was one of the first impressive things we had in the development of warc2zim.
Sorry, forgot to respond! The default behavior already captures videos that autoplay on their own with a <video> tag, such as fondamentaux.
This new work also captures videos that don't autoplay (by grabbing the URL directly) as well as videos with custom embeds, like YouTube.