Comments (2)
Unfortunately, there's no good fix for this other than adding a second layer of double quotes, e.g. --tag "'some;param'", because the params are first handled by bash, then passed to JS, then passed to warc2zim. The same problem affects other characters that are special to the shell, such as &.
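To make the double-quoting workaround concrete, here is a minimal sketch that uses Python's shlex module to approximate what each shell layer does to the argument. It is only an approximation of bash's behaviour, and the helper names (shell_pass, naive_rejoin, tokens_with_operators) are made up for illustration; they are not part of zimit.

```python
import shlex


def shell_pass(command_line: str) -> list[str]:
    """Approximate one round of shell word splitting and quote removal."""
    return shlex.split(command_line)


def naive_rejoin(argv: list[str]) -> str:
    """Rebuild a command string without re-quoting, as a careless wrapper might."""
    return " ".join(argv)


def tokens_with_operators(command_line: str) -> list[str]:
    """Tokenize while treating shell punctuation such as ; and & as separate tokens."""
    lexer = shlex.shlex(command_line, posix=True, punctuation_chars=True)
    return list(lexer)


# One layer of quotes: the first shell pass strips it.
single = shell_pass("--tag 'some;param'")

# Two layers of quotes: the inner single quotes survive the first pass.
double = shell_pass("--tag \"'some;param'\"")

# If the arguments are rejoined into a string and parsed again, the
# single-quoted form exposes a bare ';', while the double-quoted form does not.
exposed = tokens_with_operators(naive_rejoin(single))
protected = tokens_with_operators(naive_rejoin(double))

print(single, double, exposed, protected, sep="\n")
```

A wrapper that rejoins its arguments with shlex.quote, or passes them along as a list and never as a single string, would not need the extra layer of quotes.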
I can look into removing the bash script, but it currently does many useful things, such as changing permissions, creating dirs, and trapping signals, which would otherwise have to be reimplemented in JS.
Because crawling often involves lots of complex options, the general practice is to use config files instead of trying to send all parameters on the command line.
To simplify specifying parameters, my recommendation would be to support a YAML-based config file, which could be passed in as a volume or even be part of the output volume, e.g.
docker run -v /my/config.yaml:/config.yaml -it openzim/zimit zimit --url https://example.com/
The config file could then specify the warc2zim command-line options cleanly:
warc2zim_opts: "--tag=a;b --publisher Test Publisher"
as well as other options which may be common to different types of crawls.
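As a rough sketch of how such a config value could be consumed, the snippet below assumes the YAML file has already been loaded into a dict by some YAML library, and that the key is named warc2zim_opts as in the example above. The function name build_warc2zim_argv is hypothetical. Because the options are split in-process and would be handed over as a list, no shell ever sees the ; character.

```python
import shlex


def build_warc2zim_argv(config: dict) -> list[str]:
    """Turn the warc2zim_opts string from a loaded config into an argv list."""
    opts = config.get("warc2zim_opts", "")
    # shlex.split honours quotes inside the string but does not treat ';' specially.
    return ["warc2zim", *shlex.split(opts)]


# The value exactly as written in the example: the two-word publisher is split.
as_written = build_warc2zim_argv(
    {"warc2zim_opts": "--tag=a;b --publisher Test Publisher"}
)

# With inner quotes around the multi-word value, it stays a single argument.
with_quotes = build_warc2zim_argv(
    {"warc2zim_opts": "--tag=a;b --publisher 'Test Publisher'"}
)

print(as_written)
print(with_quotes)
```

Under these assumptions, multi-word values such as the publisher name would still need their own quotes inside the config string, or the config could use a list of options instead of a single string.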
from zimit.
The obvious solution is to replace that shell script with a Python one. It would be a lot clearer. You could even import warc2zim to check that an option is part of its params...
Or am I missing something?
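For illustration only, a Python entrypoint along these lines might look like the sketch below. It is not zimit's actual code: the function names are invented, only the --url flag comes from this thread, and the idea of forwarding every unrecognised argument to warc2zim is an assumption. It covers the chores mentioned above (creating dirs, changing permissions, trapping signals) and keeps arguments as a list so that characters like ; and & are never re-parsed by a shell.

```python
import argparse
import os
import signal


def split_args(argv: list[str]):
    """Separate the wrapper's own options from those to forward untouched."""
    parser = argparse.ArgumentParser(prog="zimit-wrapper")
    parser.add_argument("--url", required=True)
    # Anything argparse does not recognise is returned as-is in the second value.
    known, passthrough = parser.parse_known_args(argv)
    return known, passthrough


def prepare_output_dir(path: str, mode: int = 0o755) -> None:
    """Create the output directory and set its permissions."""
    os.makedirs(path, exist_ok=True)
    os.chmod(path, mode)


def install_signal_handlers(handler) -> None:
    """Route SIGINT and SIGTERM to a cleanup handler (call from the main thread)."""
    for sig in (signal.SIGINT, signal.SIGTERM):
        signal.signal(sig, handler)


def build_warc2zim_command(passthrough: list[str]) -> list[str]:
    """Build an argv list suitable for subprocess.run(..., shell=False)."""
    return ["warc2zim", *passthrough]
```

Validating the forwarded options against warc2zim's own parser, as suggested above, would depend on what warc2zim exposes for import, which is not shown here.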
Related Issues (20)
- `blockly_games_2023-01.zim` doesn't appear to work as intended HOT 1
- Several new ZIMs appearing in the zimit directory on download.kiwix.org are too small HOT 12
- Permanently move `developer.mozilla.org_en_all` to zimit directory HOT 4
- Permanently move `lowtechmagazine.com_en_all` to zimit directory HOT 3
- Mozilla developer zim file is broken HOT 7
- Alien content inside ZIM file HOT 5
- I meet some problem HOT 1
- Doesn't work on aarch64 HOT 2
- Best way to self hose YouZim.it equivalent? HOT 1
- What is the advanage HOT 3
- Error when building docker image HOT 8
- Support Linux/ARM64 architecture HOT 1
- Full scrape fails while limited one succeeds HOT 2
- External links should not be open in a new tab HOT 1
- Handling of external links seems incoherent HOT 1
- Non-clickable link HOT 1
- NDLA recipe failed HOT 1
- Frequent initial connect timeouts
- New zim should not be defaulted to English HOT 7
- Better auto-detection of multilanguage content HOT 1