Mediawiki to GitHub Flavoured Markdown

License: MIT

Mediawiki to GFM is a script that converts a set of MediaWiki pages to GitHub Flavoured Markdown (GFM). It was written out of the need to migrate a MediaWiki installation to a GitLab wiki. The code is based on MediaWiki to Markdown by Philip Ashlock, who graciously gave us permission to publish our version as a new project.

Major differences include the addition of PHPUnit tests, code split into classes, removal of deprecated code, a workaround for a bug in Pandoc, a fix for a common MediaWiki user error, and other small changes.

Requirements

Installation

git clone https://github.com/outofcontrol/mediawiki-to-gfm.git
cd mediawiki-to-gfm
composer update --no-dev

Run

Run the script on your exported MediaWiki XML file:

./convert.php --filename=/path/to/filename.xml --output=/path/to/converted/files 
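If you have several XML exports, the same command can be wrapped in a small loop. The following is a minimal sketch, not part of the project: the function name `convert_all`, the `CONVERT` variable, and the one-directory-per-export layout are assumptions made for illustration. It only uses the `--filename` and `--output` options shown above.

```shell
#!/bin/sh
# Sketch: convert several MediaWiki XML exports, one output directory per file.
# CONVERT and convert_all are hypothetical names, not part of mediawiki-to-gfm.
CONVERT="${CONVERT:-./convert.php}"

convert_all() {
    out_root="$1"   # parent directory for all converted output
    shift
    for xml in "$@"; do
        # Name the output directory after the export file, without ".xml".
        name=$(basename "$xml" .xml)
        # Stop at the first failed conversion.
        "$CONVERT" --filename="$xml" --output="$out_root/$name" || return 1
    done
}

# Usage: convert_all /path/to/converted/files first.xml second.xml
```

Keeping each export in its own output directory avoids pages from different exports overwriting each other.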

Options

./convert.php --filename=/path/to/filename.xml --output=/path/to/converted/files --format=gfm --addmeta --flatten --indexes

--filename : Location of the mediawiki exported XML file to convert 
             to GFM format (Required)
--output   : Location where you would like to save the converted files
             (Default: ./output)
--format   : Output format to convert to. GFM is suitable for GitLab
             and GitHub; see the pandoc documentation for more
             formats (Default: 'gfm')
--addmeta  : This flag will add a Permalink to each file (Default: false)
--flatten  : This flag will force all pages to be saved in a single level 
             directory. File names will be converted in the following way:
             Mediawiki_folder/My_File_Name -> Mediawiki_folder_My_File_Name
             and saved in a file called 'Mediawiki_folder_My_File_Name.md' 
             (Default: false)
--help     : This help message (almost)
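The naming rule described for --flatten can be illustrated with a short shell function. This is only a sketch of the documented mapping, assuming every "/" in the page path becomes "_" and ".md" is appended; it is not the script's actual implementation, and `flatten_name` is a hypothetical name.

```shell
#!/bin/sh
# Sketch of the --flatten naming rule (illustrative, not the script's code).
flatten_name() {
    # Replace every "/" in the page path with "_" and append ".md".
    printf '%s.md\n' "$(printf '%s' "$1" | tr '/' '_')"
}

flatten_name 'Mediawiki_folder/My_File_Name'
# prints: Mediawiki_folder_My_File_Name.md
```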

Export Mediawiki Files to XML

In order to convert from MediaWiki format to GFM for use in GitLab (or GitHub), you will first need to export all the pages you wish to convert from MediaWiki into an XML file. Here are a few simple steps to help you accomplish this quickly:

  1. MediaWiki -> Special Pages -> 'All Pages'
  2. With help from the filter tool at the top of 'All Pages', copy the page names to convert into a text file (one file name per line).
  3. MediaWiki -> Special Pages -> 'Export'
  4. Paste the list of pages into the Export field.
  5. Check: 'Include only the current revision, not the full history'
    Note: The convert script only processes the latest revision of each page, not its history.
  6. Uncheck: Include Templates
  7. Check: Save as file
  8. Click on the 'Export' button.
  9. An XML file will be saved locally.
  10. Use this convert.php script to convert the XML file into a set of GFM formatted pages.
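The export steps above can probably also be scripted. The sketch below posts a list of page titles to Special:Export with curl. Treat it as an assumption to verify against your own wiki: the URL in `WIKI_INDEX` is a placeholder, `export_pages` is a hypothetical name, and the form parameters (`pages`, `curonly`, `action=submit`) are taken from MediaWiki's documented Special:Export parameters rather than from this project. Leaving out the `templates` parameter corresponds to unchecking 'Include Templates'.

```shell
#!/bin/sh
# Sketch: export the current revision of the pages listed in a text file.
# WIKI_INDEX is a placeholder URL; point it at your wiki's index.php.
WIKI_INDEX="${WIKI_INDEX:-https://wiki.example.org/index.php}"

export_pages() {
    pages_file="$1"   # one page title per line (step 2 above)
    out_file="$2"     # where to save the exported XML
    # --data-urlencode "pages@FILE" reads FILE (newlines included),
    # URL-encodes it and sends it as the "pages" form field.
    curl --fail --silent --show-error \
        --data-urlencode "pages@${pages_file}" \
        --data "curonly=1" \
        --data "action=submit" \
        --output "$out_file" \
        "${WIKI_INDEX}?title=Special:Export"
}

# Usage: export_pages pages.txt export.xml
```

A wiki that requires login will likely need session cookies or other authentication, which this sketch does not handle.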

In theory you can convert to any of these formats… (not tested):
https://pandoc.org/MANUAL.html#description

Updates and improvements are welcome! Please only submit a PR if you have also written tests and tested your code! To run phpunit tests, update composer without the --no-dev parameter:

composer update

Thank you

@mloskot: Verify that this script does run in PHP 7.2 (#1)
@timwsuqld: First contribution!

Disclaimer

This script has not been tested on Windows.
