Comments (5)
This was caused by the -a flag not being included in warc2zim, which means it only includes URLs from the domain of the URL. But I wonder if that should just be the default in warc2zim?
from zimit.
OK, then that's linked to the exposure of all warc2zim args in zimit (regardless of whether you want to set it as default there).
from zimit.
@ikreymer To me, warc2zim should scrape external resources by default. What we have here is a symptom of a weakness in warc2zim, IMO.
from zimit.
It's just a config option to filter, can be toggled the other way by default.
from zimit.
> It's just a config option to filter, can be toggled the other way by default.
Yes, and the default value is not the right one IMO, which is why we now have this ticket and discussion.
The default configuration should be the one we use most of the time.
from zimit.
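For readers following the thread, here is a minimal sketch of the kind of filter being discussed: keep only URLs on the main URL's domain unless an "include everything" switch is on. The function name `select_urls` and the `include_all` parameter are hypothetical stand-ins for the -a behaviour described above. This is not warc2zim's actual code, and the exact matching rule it uses may differ.

```python
from urllib.parse import urlsplit


def select_urls(urls, main_url, include_all=False):
    """Return the URLs that would be kept in the archive.

    include_all is a hypothetical stand-in for the -a behaviour discussed
    in the thread; it is an assumption, not warc2zim's real implementation.
    """
    if include_all:
        # No filtering: external resources are kept too.
        return list(urls)
    main_host = urlsplit(main_url).hostname
    # Keep only URLs whose host matches the main URL's host.
    return [u for u in urls if urlsplit(u).hostname == main_host]


captured = [
    "https://example.org/index.html",
    "https://example.org/style.css",
    "https://cdn.example.net/lib.js",
]

same_domain_only = select_urls(captured, "https://example.org/")
everything = select_urls(captured, "https://example.org/", include_all=True)
```

With the filter on, the script hosted on `cdn.example.net` is dropped, which is the sort of missing external resource this ticket is about. Changing the default value of `include_all` is the "toggle the other way" option debated above.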