Dev/Parser/Embeddable

From XOWA: the free, open-source, offline wiki application

Overview

XOWA can be embedded in other apps as a standalone parser.

Features

The XOWA parser has a number of features:

  • Comprehensive: The parser handles virtually all aspects of MediaWiki wikitext, including:
    • Standard markup:
      • ''italic''
      • '''bold'''
      • [[internal links]]
      • [external links]
      • == section heading ==
      • preformatted text through a leading space:   preformatted text
      • lists with
        • * unordered list item
        • # ordered list item
        • ; term : definition
      • tables through {| |}, |- and |
    • Templates: {{some_template}} as well as {{{some_argument|some_default}}}
    • Parser functions: Over 100 functions including:
      • {{PAGENAME}}
      • {{#if}}, {{#ifeq}} and {{#switch}}
      • {{formatnum}}
      • {{#formatdate}} and {{#time}}
      • {{#expr}}
    • Extensions: Over 20 extensions including:
      • <gallery>
      • <imagemap>
      • <ref>
      • <poem>
      • <hiero>
      • <syntaxhighlight>
      • <math>
      • <dynamicpagelist>
      • <listing>
      • <score>
      • {{pagebanner}}
    • Scribunto: {{#invoke:module_name|lua_function|args}}
    • Wikibase: {{#property:qid}}
  • Fast: The parser can process 5+ million articles of English Wikipedia in 24 hours on a relatively high-end machine.
  • Multi-language: The parser can process wikitext in languages other than English. Support ranges from magic word translations (NOMPAGE) to numeric formats (1.123,56) to language variants (-{variant:term}-).
  • Well-tested: The parser has close to 1000 automated tests. In addition, it has been run on over 100 different wikis, including English Wikipedia, Wiktionary, Wikisource, Wikivoyage, Wikiquote, Wikibooks, Wikiversity, and Wikinews, as well as their non-English counterparts in German, French, Russian, Arabic, Chinese, etc.

Issues

The XOWA parser changes constantly because it must accommodate live changes to the MediaWiki parser. Moreover, the embeddable feature is a work in progress. The following is a list of known limitations:

  • Resources are not embedded: Many features require standalone data (language translations; lua code; hiero images). These are not embedded into the XOWA jar, but are distributed separately with the XOWA app (somewhere under the /xowa/bin/any/xowa hive).
  • Non-lightweight memory requirements: The XOWA parser was built with an eye towards performance. As such, it caches a good deal of data, which increases memory use. A typical XOWA parser will require between 1 MB and 2 MB of memory.

Example

The following example demonstrates usage.

package sample_namespace;
import gplx.xowa.addons.parsers.mediawikis.*;

public class Test_class {
    public static void main(String[] args) {
        // create a new manager instance with the root directory of your XOWA installation
        // note that a full XOWA installation is needed, because it needs to load some standalone files (EX: Scribunto .lua files)
        // also note that the directory must end in a "\" if Windows or a "/" if Linux / Mac OS X; EX: "/home/me/xowa/" not "/home/me/xowa"
        Xop_mediawiki_mgr mgr = new Xop_mediawiki_mgr("C:\\xowa\\");

        // create a new worker instance
        // note that workers are not thread-safe. however, you can have each thread handle one worker.
        // also note that each worker can only parse pages from one wiki.
        // if you are parsing pages from two different wikis then you'll need two different workers
        Xop_mediawiki_wkr wkr = mgr.Make("en.wikipedia.org", new Xop_mediawiki_loader__custom());

        // parse some wikitext.
        // the below will print out "<p><i>My page</i>\n</p>"
        System.out.println(wkr.Parse("My_page", "''{{PAGENAME}}''"));

        // templates will be retrieved by the custom loader
        // the below will print out "<p>wikitext retrieved from your database for Template:Convert\n</p>"
        System.out.println(wkr.Parse("My_page", "{{Convert}}"));
    }
}
class Xop_mediawiki_loader__custom implements Xop_mediawiki_loader {
    // load page text by page title
    public String LoadWikitext(String page) {
        return "wikitext retrieved from your database for " + page;
    }
}
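
The example above leaves two pieces to the host application: a root directory string that ends in a path separator, and a loader that returns wikitext for a page title. The following is a minimal sketch of both, not part of XOWA itself. EmbedSketch, MapLoader and ensureTrailingSeparator are hypothetical names. The nested Xop_mediawiki_loader interface is a stand-in that mirrors the single method shown above so that the sketch compiles without the XOWA jar; remove it and import the real interface when embedding. Two assumptions are made: that titles reach the loader in the "Template:Convert" form, as the example's output suggests, and that an empty string is an acceptable result for a page that is not found.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

public class EmbedSketch {
    // Stand-in for gplx.xowa.addons.parsers.mediawikis.Xop_mediawiki_loader.
    // It exists only so this sketch compiles on its own; delete it when using the XOWA jar.
    interface Xop_mediawiki_loader {
        String LoadWikitext(String page);
    }

    // Hypothetical loader that serves wikitext from an in-memory map keyed by full page title.
    static class MapLoader implements Xop_mediawiki_loader {
        private final Map<String, String> pages = new HashMap<>();

        MapLoader put(String title, String wikitext) {
            pages.put(title, wikitext);
            return this;
        }

        @Override
        public String LoadWikitext(String page) {
            // Assumption: returning "" for an unknown page is acceptable to the caller.
            return pages.getOrDefault(page, "");
        }
    }

    // Hypothetical helper: append a separator unless the path already ends in "/" or "\".
    static String ensureTrailingSeparator(String dir, char sep) {
        if (dir.endsWith("/") || dir.endsWith("\\")) return dir;
        return dir + sep;
    }

    public static void main(String[] args) {
        // Explicit separators make the result independent of the current platform.
        System.out.println(ensureTrailingSeparator("/home/me/xowa", '/'));
        System.out.println(ensureTrailingSeparator("C:\\xowa\\", '\\'));
        // File.separatorChar supplies the separator of the platform the code runs on.
        System.out.println(ensureTrailingSeparator("xowa", File.separatorChar));

        MapLoader loader = new MapLoader().put("Template:Convert", "{{{1}}} m");
        System.out.println(loader.LoadWikitext("Template:Convert"));
    }
}
```

With the real jar on the classpath, the normalized directory would be passed to the Xop_mediawiki_mgr constructor and the loader to mgr.Make, in place of the hard-coded path and Xop_mediawiki_loader__custom in the example above.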
