Dev/Command-line/Wikidata

From XOWA: the free, open-source, offline wiki application

XOWA can import Wikidata from the command line.

Import using the XML dump

XOWA can build Wikidata from the XML dump at dumps.wikimedia.org/wikidatawiki/. This import works the same way as an import of any other wiki.

The script for the XML import follows.

  // build wikidata database; this only needs to be run once for each wikidata update
  add     ('www.wikidata.org' , 'util.cleanup')          {delete_all = 'y';}
  add     ('www.wikidata.org' , 'util.download')         {dump_type = 'pages-articles';}
  add     ('www.wikidata.org' , 'util.download')         {dump_type = 'categorylinks';}
  add     ('www.wikidata.org' , 'util.download')         {dump_type = 'page_props';}
  add     ('www.wikidata.org' , 'util.download')         {dump_type = 'image';}
  add     ('www.wikidata.org' , 'text.init');
  add     ('www.wikidata.org' , 'text.page');
  add     ('www.wikidata.org' , 'text.cat.core');
  add     ('www.wikidata.org' , 'text.cat.link');
  add     ('www.wikidata.org' , 'text.cat.hidden');
  add     ('www.wikidata.org' , 'text.term');
  add     ('www.wikidata.org' , 'text.css');
  add     ('www.wikidata.org' , 'util.cleanup')          {delete_tmp = 'y'; delete_by_match('*.xml|*.sql|*.bz2|*.gz');}
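
The final cleanup step takes a pipe-separated list of wildcard patterns. As a rough illustration of which files such a list would select, here is a minimal Python sketch. It assumes each pattern is a simple glob and that a file is selected when it matches any one of them; this is an assumption about the pattern semantics, not XOWA's actual code. The function name and the file names are hypothetical.

```python
from fnmatch import fnmatchcase


def matches_any(file_name: str, pattern_list: str) -> bool:
    """Return True if file_name matches any of the pipe-separated glob patterns."""
    return any(fnmatchcase(file_name, pattern) for pattern in pattern_list.split("|"))


# Pattern list copied from the XML import script.
XML_PATTERNS = "*.xml|*.sql|*.bz2|*.gz"

# Hypothetical file names, for illustration only.
files = [
    "wikidatawiki-latest-pages-articles.xml.bz2",
    "wikidatawiki-latest-categorylinks.sql.gz",
    "wikidatawiki-latest-pages-articles.xml",
    "core.xowa",
]

# Keep only the files that at least one pattern selects.
to_delete = [name for name in files if matches_any(name, XML_PATTERNS)]
print(to_delete)
```

Under these assumptions, the three dump files are selected and the database file is left alone.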

Import using the JSON dump

As of v2.6.3, XOWA also provides basic support for building wikidata from the JSON dump. This support was added for the following reasons:

Despite these reasons, there are limitations to the JSON dump.

  • Non-JSON pages not available: The JSON dump doesn't provide other pages, such as the Main Page or MediaWiki pages. Only pages in the main and property namespaces are available. This is by design. See: https://lists.wikimedia.org/pipermail/wikidata/2015-June/006441.html
  • Page metadata not available: Certain page properties, such as page_id and last_modified, are not available. XOWA provides substitutes for these values, but they will not match the Wikimedia version.
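
To make the two limitations concrete, the following Python sketch reads a tiny stand-in for the JSON dump and fills in a substitute page id. It assumes the dump is a JSON array with one entity per line, that ids starting with "P" belong to the property namespace and all others to the main namespace, and that a sequential counter is an acceptable substitute for page_id. The function names and the substitution scheme are hypothetical and do not describe how XOWA generates its substitutes.

```python
import json
from typing import Dict, Iterable, Iterator


def iter_entities(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield one entity per line, skipping the enclosing '[' and ']' lines."""
    for line in lines:
        text = line.strip().rstrip(",")  # entity lines end with a trailing comma
        if text in ("", "[", "]"):
            continue
        yield json.loads(text)


def namespace_of(entity_id: str) -> str:
    """Assumed mapping: 'P' ids are properties, everything else is main."""
    return "property" if entity_id.startswith("P") else "main"


def with_substitute_page_id(entities: Iterable[Dict]) -> Iterator[Dict]:
    """Attach a sequential page_id where the dump does not supply one."""
    for seq, entity in enumerate(entities, start=1):
        entity = dict(entity)  # copy so the input is not modified
        entity.setdefault("page_id", seq)
        yield entity


# Tiny hand-written sample in the assumed layout, not real dump content.
sample = [
    "[",
    '{"id": "Q1", "type": "item"},',
    '{"id": "P31", "type": "property"}',
    "]",
]

pages = list(with_substitute_page_id(iter_entities(sample)))
for page in pages:
    print(page["id"], namespace_of(page["id"]), page["page_id"])
```

Because the substitute ids are generated locally, they would not be expected to agree with the page ids on the Wikimedia servers.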

The script for the JSON import follows.

  add     ('www.wikidata.org' , 'util.cleanup')          {delete_all = 'y';}
  // TODO: add     ('www.wikidata.org' , 'util.download')      {dump_type = 'wikidata-json';}
  add     ('www.wikidata.org' , 'wbase.json_dump');
  add     ('www.wikidata.org' , 'text.term');
  add     ('www.wikidata.org' , 'text.css');
  add     ('www.wikidata.org' , 'util.cleanup')          {delete_tmp = 'y'; delete_by_match('*.xml|*.sql|*.bz2|*.gz|*.json');}
