
Wiki engine notes

MediaWiki

List of wikis

Already exists.

Dump algorithm

Already exists.

Dump format

MediaWiki XML dump + extras.

DokuWiki

List of wikis

Dump algorithm

Sketch:

  • Check XML-RPC API availability.
  • Generate list of titles (XML-RPC)
      ◦ Use wiki.getAllPages or dokuwiki.getPagelist (is this restricted to a single namespace at a time?). (TODO: determine whether this is usable.)
  • Generate list of titles (do=index or ajax.php)
      ◦ Try to load /lib/exe/ajax.php?call=index (exists in newer versions) or do=index (present since the first release).
      ◦ Recursively, for each namespace, load the appropriate sub-index, adding each title to a list.
      ◦ Add root pages to the list.
  • Generate list of media/files/uploads
      ◦ Check for /lib/exe/mediamanager.php or do=media.
      ◦ If it exists, use it recursively in each namespace, collecting a list of all file names (or just downloading each), e.g. ?ns=ns1:ns2.
      ◦ Extract file details from /lib/exe/detail.php or do=media.
      ◦ Or use /lib/exe/ajax.php?call=medialist on each namespace?
      ◦ Or XML-RPC?
  • Export current page content (a code sketch follows the example content below)
      ◦ do=export_raw (not in the first release) or XML-RPC wiki.getPage; wiki.getPageInfo gives metadata about the current revision.
      ◦ Or do=edit and scrape the textarea content.
  • Export full history
      ◦ do=revisions or XML-RPC wiki.getPageVersions.
      ◦ For each revision, fetch ?rev=<rev_id>&do=edit or ?rev=<rev_id>&do=export_raw, or use XML-RPC wiki.getPageVersion; wiki.getPageInfoVersion gives revision metadata.
  • Get site version and metadata
      ◦ Note: in recent DokuWiki releases, it is not possible to get the version.
      ◦ Download do=check.
      ◦ Try to preview a page with the following content (the preview output lists the version and installed plugins):
====== ~~INFO:syntaxmodes~~ ======
~~INFO:syntaxmodes~~
====== ~~INFO:syntaxtypes~~ ======
~~INFO:syntaxtypes~~
====== ~~INFO:syntaxplugins~~ ======
~~INFO:syntaxplugins~~
====== ~~INFO:adminplugins~~ ======
~~INFO:adminplugins~~
====== ~~INFO:actionplugins~~ ======
~~INFO:actionplugins~~
====== ~~INFO:rendererplugins~~ ======
~~INFO:rendererplugins~~
====== ~~INFO:helperplugins~~ ======
~~INFO:helperplugins~~
====== ~~INFO:helpermethods~~ ======
~~INFO:helpermethods~~
====== ~~INFO:authplugins~~ ======
~~INFO:authplugins~~
====== ~~INFO:remoteplugins~~ ======
~~INFO:remoteplugins~~
====== ~~INFO:version~~ ======
~~INFO:version~~
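
The export steps above can be strung together into a short script. The following is a minimal sketch, assuming Python 3 with the requests library and a wiki whose XML-RPC interface at /lib/exe/xmlrpc.php is enabled; the base URL, output layout and lack of error handling are illustrative only, the getPageVersions field names are assumed from the WikiRPC interface, and a real dumper would still need the do=index / do=edit fallbacks described above for wikis with XML-RPC disabled.

import os
import xmlrpc.client

import requests

BASE = "http://wiki.example.org"   # hypothetical wiki root URL
OUT = "dokuwiki_dump"              # hypothetical output directory
os.makedirs(OUT, exist_ok=True)

# Titles via XML-RPC (wiki.getAllPages); this fails if the remote API is disabled.
rpc = xmlrpc.client.ServerProxy(BASE + "/lib/exe/xmlrpc.php")
pages = rpc.wiki.getAllPages()

for page in pages:
    # Newer releases return structs, older ones bare page ids.
    page_id = page["id"] if isinstance(page, dict) else page
    safe = page_id.replace(":", ".")

    # Current content via do=export_raw, falling back to XML-RPC wiki.getPage.
    resp = requests.get(BASE + "/doku.php", params={"id": page_id, "do": "export_raw"})
    text = resp.text if resp.ok else rpc.wiki.getPage(page_id)
    with open(os.path.join(OUT, safe + ".txt"), "w", encoding="utf-8") as f:
        f.write(text)

    # Full history via wiki.getPageVersions / wiki.getPageVersion, one file per revision.
    for rev in rpc.wiki.getPageVersions(page_id, 0):
        old = rpc.wiki.getPageVersion(page_id, rev["version"])
        path = os.path.join(OUT, "%s.%s.txt" % (safe, rev["version"]))
        with open(path, "w", encoding="utf-8") as f:
            f.write(old)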

Dump format

Compressed copy of the DokuWiki data directory. The cache, index, locks and tmp subdirectories are probably not needed.
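
A minimal packaging sketch, assuming Python 3 and the standard DokuWiki data layout (pages, meta, attic, media plus the cache/index/locks/tmp subdirectories named above); the paths are placeholders, not part of the original notes.

import tarfile

SKIP = {"cache", "index", "locks", "tmp"}   # subdirectories dropped from the dump

def pack_data_dir(data_dir, archive="dokuwiki-data.tar.gz"):
    def keep(member):
        # Member names look like data/<subdir>/...; drop the unneeded subtrees.
        parts = member.name.split("/")
        if len(parts) > 1 and parts[1] in SKIP:
            return None
        return member

    with tarfile.open(archive, "w:gz") as tar:
        tar.add(data_dir, arcname="data", filter=keep)
    return archive

# e.g. pack_data_dir("/var/www/dokuwiki/data")  # hypothetical path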

MoinMoin

...

UseModWiki, OddMuseWiki, etc.

List of wikis

Dump algorithm

  • Check if raw=1 is available.
  • Get list of pages
      ◦ Use action=index (add &raw=1 if available).
  • Download current version only (see the sketch after this list)
      ◦ For each page title, either get action=browse&id=FooBar&raw=1 (preferable) or action=edit&id=FooBar. If raw is not available, scrape the textarea content of the edit box.
      ◦ Loop over all page titles.
  • Get history of each page (note: UseModWiki history is not permanent!)
      ◦ Use action=history&id=FooBar.
      ◦ Parse the result.
      ◦ For each revision, download the raw content: if action=browse&id=Foo&revision=123&raw=1 is available, use that; otherwise use action=edit&id=Foo&revision=123.
  • Get images
      ◦ Go through each saved page text and search for image URLs, using the same regex that UseModWiki uses.
      ◦ Save each image.
  • Save site version/metadata
      ◦ Save action=version. In UseModWiki this is not very useful, but it is worth having for Oddmuse. Example: http://communitywiki.org/?action=version
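
A minimal sketch of the raw=1 path above, assuming Python 3 with the requests library; the script URL is a placeholder, action=index&raw=1 is assumed to return one title per line as described in the notes, and the textarea regex is only a crude fallback, not a real HTML parser.

import html
import re

import requests

SCRIPT = "http://wiki.example.org/cgi-bin/wiki.pl"   # hypothetical wiki script URL

def fetch(params):
    return requests.get(SCRIPT, params=params)

# Page list via action=index (&raw=1 assumed to yield one title per line).
titles = [t for t in fetch({"action": "index", "raw": "1"}).text.splitlines() if t]

pages = {}
for title in titles:
    # Prefer the raw view of the current revision...
    resp = fetch({"action": "browse", "id": title, "raw": "1"})
    if resp.ok and resp.text.strip():
        pages[title] = resp.text
    else:
        # ...otherwise scrape the edit box (assumes a single textarea on the page).
        editpage = fetch({"action": "edit", "id": title}).text
        m = re.search(r"<textarea[^>]*>(.*?)</textarea>", editpage, re.S | re.I)
        pages[title] = html.unescape(m.group(1)) if m else ""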

Dump format

UseModWiki's page database format is documented on the UseModWiki site: http://www.usemod.com/cgi-bin/wiki.pl?DataBase
