MediaWiki to Markdown

Convert MediaWiki pages to GitHub Flavored Markdown (or other formats supported by Pandoc). The conversion takes an XML export from MediaWiki and converts each wiki page to an individual Markdown file. Directory structures are preserved. The generated output can also include frontmatter for GitHub Pages.

You may also be interested in forked versions of this codebase, available at:

  • https://github.com/philipashlock/mediawiki-to-markdown
  • https://github.com/outofcontrol/mediawiki-to-gfm

Requirements

  • Docker
  • PowerShell

Export MediaWiki Pages

Export all your pages as a single XML file by following these steps: http://en.wikipedia.org/wiki/Help:Export
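
Since each wiki page in the export becomes one output file, it can help to count the pages in the XML before converting. The following is a minimal sketch, not part of this project: `count_pages` is a hypothetical helper, and it assumes each `<page>` tag sits on its own line, as it does in standard MediaWiki exports.

```shell
# Hypothetical helper (not part of convert.php or convert.ps1).
# Prints the number of lines containing an opening <page> tag in an export file.
# Assumption: every <page> tag is on its own line.
count_pages() {
    grep -c '<page>' "$1"
}
```

For example, `count_pages wiki.xml` prints the number of pages, which should roughly match the number of files generated later.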

Run

The simplest way to run the conversion is with the convert.ps1 script.

.\convert.ps1 -convertFileArgFullPath C:\wiki.xml

Further granular run parameters

To use any other options, you will have to update the $dockerRunCmd variable in the convert.ps1 script. The available options are listed below.

--filename

The only required parameter is filename, the name of the XML file you exported from MediaWiki, e.g.:

php convert.php --filename=mediawiki.xml

--output

You can also use output to specify an output folder, since each wiki page in the XML file generates its own separate Markdown file.

php convert.php --filename=mediawiki.xml --output=export

--indexes

Set indexes to true if you want pages that share a name with a directory to be renamed to index.md and placed inside that directory.

php convert.php --filename=mediawiki.xml --output=export --indexes=true
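
The renaming rule described above can be sketched as follows. This is an illustration of the rule only, not the converter's actual code: `target_path` is a hypothetical function, and it assumes a page "has a directory" whenever another page title starts with its title followed by a slash.

```shell
# Illustrative only: map a page title to an output path the way the
# --indexes=true description suggests. Not taken from convert.php.
#   $1 = page title (e.g. "Guide" or "Guide/Install")
#   $2 = newline-separated list of all page titles
target_path() {
    title=$1
    has_children=no
    # The page shares its name with a directory when some other
    # title starts with "<title>/".
    while IFS= read -r other; do
        case $other in
            "$title"/*) has_children=yes ;;
        esac
    done <<EOF
$2
EOF
    if [ "$has_children" = yes ]; then
        printf '%s/index.md\n' "$title"
    else
        printf '%s.md\n' "$title"
    fi
}
```

Given the titles Guide, Guide/Install and FAQ, this sketch maps Guide to Guide/index.md and leaves FAQ as FAQ.md.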

--frontmatter

Use frontmatter to specify whether frontmatter is included. It is automatically set to true when the output format is markdown_github.

php convert.php --filename=mediawiki.xml --output=export --format=markdown_phpextra --frontmatter=true

--format

Use format to select a different output format. The default is markdown_github. See the list of supported Pandoc formats below.

php convert.php --filename=mediawiki.xml --output=export --format=markdown_phpextra

Supported Pandoc formats are:

  • asciidoc
  • beamer
  • context
  • docbook
  • docx
  • dokuwiki
  • dzslides
  • epub
  • epub3
  • fb2
  • haddock
  • html
  • html5
  • icml
  • json
  • latex
  • man
  • markdown
  • markdown_github
  • markdown_mmd
  • markdown_phpextra
  • markdown_strict
  • mediawiki
  • native
  • odt
  • opendocument
  • opml
  • org
  • plain
  • revealjs
  • rst
  • rtf
  • s5
  • slideous
  • slidy
  • texinfo
  • textile
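
A small guard can catch a mistyped format name before the converter runs. The sketch below is an assumption-laden convenience, not part of this project: `SUPPORTED_FORMATS` and `is_supported_format` are hypothetical names, and the list is simply copied from the formats above.

```shell
# Hypothetical guard (not part of convert.php): check a format name
# against the list of supported Pandoc formats given in this README.
SUPPORTED_FORMATS="asciidoc beamer context docbook docx dokuwiki dzslides epub epub3 fb2 haddock html html5 icml json latex man markdown markdown_github markdown_mmd markdown_phpextra markdown_strict mediawiki native odt opendocument opml org plain revealjs rst rtf s5 slideous slidy texinfo textile"

# Returns 0 if $1 is in the list, 1 otherwise.
is_supported_format() {
    case " $SUPPORTED_FORMATS " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use with the flags documented above:
#   fmt=markdown_phpextra
#   is_supported_format "$fmt" && \
#       php convert.php --filename=mediawiki.xml --output=export --format="$fmt"
```

The list is hard-coded here, so it would need updating by hand if the set of supported formats changes.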

Contributors

  • philipashlock
  • realrubberduckdev
  • victorklos
