
pirate / wikipedia-mirror


๐ŸŒ Guide and tools to run a full offline mirror of Wikipedia.org with three different approaches: Nginx caching proxy, Kiwix + ZIM dump, and MediaWiki/XOWA + XML dump

Home Page: https://docs.sweeting.me/s/self-host-a-wikipedia-mirror

License: MIT License

Languages: Shell 95.22%, PLpgSQL 4.78%

Topics: wikipedia, wikipedia-dump, wiki, mediawiki, xowa, nginx, docker, docker-compose, internet-archiving, archiving, datascience, mwdumper, openzim, zim, html, kiwix, kiwix-offline-wikipedia, wikipedia-mirror

wikipedia-mirror's Issues

Is there a way to keep the local dump up to date?

Thank you for the guide. It would be helpful to include a way to keep the setup up to date periodically. I wish there were a way to download just the 'diff' every month instead of the complete dump.

Few remarks about the README

The README.md is a good piece of explanatory work :)

Here are my few comments:

Can't load .env variables into .conf

This line generates the .conf from the template, but doesn't seem to load any of the sourced .env variables -- nginx -t reports them as invalid...

# Fill your config options into nginx.conf.template to create nginx.conf
envsubst \
    "$(printf '${%s} ' $(bash -c "compgen -A variable"))"\
    < "$CONFIG_DIR/nginx.conf.template" \
    > "$CONFIG_DIR/nginx.conf"
