Comments (10)
See also github/pages-gem#69
from mediawiki_to_git_md.
See also jgm/pandoc#1982 - we or pandoc must also turn spaces into underscores in the relative URLs, and deal with special characters like percent signs with encoding, etc.
This also applies to the filename for each wiki page, where we probably need to follow the MediaWiki "canonical database form". Quoting http://www.mediawiki.org/wiki/Manual:ImportImages.php, which focuses on files (see our issue #1):
Note: The "canonical database form" ... is obtained from the file name by capitalizing the first letter, replacing all spaces with underscores, and then replacing multiple consecutive underscores with one underscore.
Quoting http://www.mediawiki.org/wiki/Manual:Title.php
In the article name spaces and underscores are treated as equivalent and each is converted to the other in the appropriate context (underscore in URL and database keys, spaces in plain text). "Extended" characters in the 0x80..0xFF range are allowed in all places, and are valid characters. They are encoded in URLs. Extended characters are not urlencoded when used as text or database keys. Other characters may be ASCII letters, digits, hyphen, comma, period, apostrophe, parentheses and colon. No other ASCII characters are allowed, and will be deleted if found (they will probably cause a browser to misinterpret the URL).
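The two quoted rules could be sketched roughly as below. This is only an illustration, not what mediawiki_to_git_md or pandoc actually does: the function names `canonical_page_name` and `page_url` are hypothetical, and the sketch assumes the wiki capitalises the first letter of titles and that URLs are encoded as UTF-8.

```python
import re
from urllib.parse import quote


def canonical_page_name(title):
    """Approximate the MediaWiki canonical database form of a title."""
    # Spaces and underscores are equivalent; use underscores for keys and URLs.
    name = title.strip().replace(" ", "_")
    # Replace multiple consecutive underscores with one underscore.
    name = re.sub(r"_+", "_", name)
    # Capitalise only the first letter, leaving the rest of the name as-is.
    return name[:1].upper() + name[1:]


def page_url(title):
    """Percent-encode the canonical name for use in a relative URL."""
    # Assumption: keep the punctuation the Title.php manual lists as allowed,
    # plus "/" for sub-pages; everything else (e.g. "%") gets encoded.
    return quote(canonical_page_name(title), safe="/:,.'()-_")
```

For example, `canonical_page_name("BioSQL on Windows")` gives `BioSQL_on_Windows`, and a percent sign in a title is encoded as `%25` by `page_url`.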
Having tested this now, GitHub/Jekyll serves pages with and without the `.html` extension UNLESS there is a folder present with the same base name (e.g. `BioSQL.html` works, but `BioSQL` does not - perhaps this is fixable?).
This means we can (probably?) continue to use the MediaWiki style links (without extension), and they will work when rendered via GitHub/Jekyll. However, the links won't work when viewing the markdown directly in GitHub, where the links would need a `.md` extension.
However, we still need to fix spaces vs underscores, capitalisation, and rare special characters in links.
With the biopython.org wiki we were lucky that only one URL was a problem - `wiki/BioSQL` - minor enough that we can probably just move the single child page `wiki/BioSQL/Windows` to `wiki/BioSQL_on_Windows` (or similar) and remove the problematic folder `wiki/BioSQL/`.
For www.open-bio.org there are currently ten folders like `wiki/BOSC/` and `wiki/BOSC_2014/` which make direct access to the parent page require adding `.html` to the URL. See OBF/OBF.github.io#1
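To find such clashes before publishing, something like the following rough sketch could be run over the converted repository. It assumes pages are stored as `.md` files next to their sub-page folders; the function name `find_page_folder_clashes` is made up for this example.

```python
from pathlib import Path
from typing import List


def find_page_folder_clashes(root: str) -> List[str]:
    """Return pages (without extension) that share a name with a folder."""
    clashes = []
    for page in sorted(Path(root).rglob("*.md")):
        base = page.with_suffix("")
        # e.g. wiki/BioSQL.md clashes with the folder wiki/BioSQL/
        if base.is_dir():
            clashes.append(base.relative_to(root).as_posix())
    return clashes
```

Each reported page would need either the `.html` suffix in links pointing at it, or its child pages moved out of the folder as described above.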
The pandoc issue was fixed; the MediaWiki reader will now convert spaces to underscores in wikilink URLs.
There are still issues (probably worth filing with pandoc), e.g. on the OBF main page,
`the scope and role of our [[membership]].` in `Main_Page.mediawiki`
Pandoc v1.15.2 made this `the scope and role of our [membership](membership "wikilink").` in `Main_Page.md`.
It should have been made into a MediaWiki-style page name with a leading capital, i.e. `the scope and role of our [membership](Membership "wikilink").` instead.
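Until pandoc handles this, one possible workaround is to post-process the generated markdown. The sketch below is only an assumption about how that might look: it relies on pandoc marking wiki links with the `"wikilink"` title as in the output above, the helper name `capitalise_wikilinks` is hypothetical, and link targets containing spaces or parentheses are not handled.

```python
import re

# Matches pandoc's markdown output for wiki links: [text](target "wikilink")
WIKILINK = re.compile(r'\[([^\]]*)\]\(([^)\s]+) "wikilink"\)')


def capitalise_wikilinks(markdown):
    """Upper-case the first letter of each wikilink target."""
    def fix(match):
        text, target = match.group(1), match.group(2)
        # Only the first character changes; the link text is left alone.
        return '[%s](%s%s "wikilink")' % (text, target[:1].upper(), target[1:])
    return WIKILINK.sub(fix, markdown)
```

Ordinary links without the `"wikilink"` title are left untouched.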
Another problem, again on the OBF main page,
`[[:File:OBF-Bylaws.pdf| Our bylaws]] lay out how the Board is elected, ...`
with pandoc v1.15.2 gives:
`[ Our bylaws](:File:OBF-Bylaws.pdf "wikilink") lay out how the Board is elected, ...`
Interestingly `:File:OBF-Bylaws.pdf` as the URL works in MediaWiki, redirecting to `File:OBF-Bylaws.pdf`, which seems more sensible.
I've edited the wiki to link directly to the file with:
`[[media:OBF-Bylaws.pdf|Our bylaws]] lay out how the Board is elected, ...`
I've created another issue in pandoc regarding the file links: jgm/pandoc#3052
Have you found any solution for fixing links to other .md files, adding the extension? I wonder if anybody else wrote a script to post-process it, parse the generated MD, check links, and try to apply fixes to links that don't work according to which way they're broken.
@ec1oud We've just used extension-less URLs in all the MediaWiki conversions thus far.