App/Xtn/Page sync

From XOWA: the free, open-source, offline wiki application

XOWA can synchronize the latest version of a page from the online wiki to the offline wiki.



Options

The options page is at Options/Page_sync.

Background

In general, offline dumps are generated on a semi-frequent basis.

Sometimes, though, you may want to update just one page without:

  • Waiting for the dump to occur
  • Downloading and importing a whole new wiki

The Page sync feature allows you to update selected pages.

Issues

The Page sync feature is still a work in progress. The following are known issues:

Math, Helper Buttons (enlarge / more info), Musical scores, and other images don't work

Wikipedia stores some images in a separate location. XOWA still needs code to detect this location, copy the images offline, and display them correctly. This should be done in the next few releases. In the meantime, it is strongly recommended that you do not use Automatic sync for all pages. In particular, math pages will lose all equations after a sync. For example, https://en.wikipedia.org/wiki/Pythagorean_theorem

  • v3.9.4.1 now syncs Math images

No rollback option

XOWA stores only the latest version of a page. Previous versions of the page will not be available after synchronization. A rollback option will be added in a future version.

If you synchronize and want to roll back, you will need to do it manually for all updates. For example, here is a scenario for English Wikipedia:

  • Backup en.wikipedia.org-core.xowa.
  • Backup en.wikipedia.org-html.user.xowa and en.wikipedia.org-sync.xowa if they exist.
  • Synchronize https://en.wikipedia.org/wiki/Pythagorean_theorem
  • Realize that the synchronization is bad and start the rollback.
  • Exit XOWA
  • Restore en.wikipedia.org-core.xowa
  • If backups exist, restore en.wikipedia.org-html.user.xowa and en.wikipedia.org-sync.xowa
  • If backups don't exist, delete en.wikipedia.org-html.user.xowa and en.wikipedia.org-sync.xowa
  • Run XOWA and go to https://en.wikipedia.org/wiki/Pythagorean_theorem
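
The backup and restore steps in this scenario could be scripted. The following is a minimal sketch, not part of XOWA: the file names come from the scenario above, while the `wiki_dir` and `backup_dir` locations and the function names `backup` and `rollback` are assumptions made for illustration.

```python
import shutil
from pathlib import Path

# File names are taken from the scenario above.
CORE = "en.wikipedia.org-core.xowa"
OPTIONAL = ["en.wikipedia.org-html.user.xowa", "en.wikipedia.org-sync.xowa"]


def backup(wiki_dir: Path, backup_dir: Path) -> None:
    """Copy the core database, plus the optional databases if they exist."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(wiki_dir / CORE, backup_dir / CORE)
    for name in OPTIONAL:
        if (wiki_dir / name).exists():
            shutil.copy2(wiki_dir / name, backup_dir / name)


def rollback(wiki_dir: Path, backup_dir: Path) -> None:
    """Restore the backup. Run this only after exiting XOWA."""
    shutil.copy2(backup_dir / CORE, wiki_dir / CORE)
    for name in OPTIONAL:
        if (backup_dir / name).exists():
            # A backup exists, so restore it.
            shutil.copy2(backup_dir / name, wiki_dir / name)
        else:
            # No backup means the file did not exist before the sync.
            (wiki_dir / name).unlink(missing_ok=True)
```

Call `backup(...)` before synchronizing and `rollback(...)` after exiting XOWA, passing the folder that actually holds the wiki's `.xowa` files on your machine.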

No synchronization for new pages

A page must exist in the offline wiki in order to be synchronized. New pages cannot be synchronized. Navigating to a new page will just result in a "Page not found" error.

Synchronization of new pages will also be added in a future version.

Other issues

Other issues may be present. Please check https://github.com/gnosygnu/xowa/issues/72 for updates. Once that issue is closed, the Page sync feature will no longer be marked "Work in progress" and should be fully operational.

Usage notes

Manual sync

To use manual sync, do the following:

  • Enable "manual sync" in the options page
  • Click the "Sync" link in the left-hand sidebar.

Note the following details:

  • online mode required: You must have "Web access enabled" in Options/Security
  • exclusions: Page sync does not work for the following pages:
    • home wiki pages: For example, this page. The home wiki is updated offline with every release. The latest version can also be viewed online at http://xowa.org
    • Special pages: Nearly all special pages have dynamic content and cannot be "mirrored" offline
    • Wikia / non-Wikimedia pages: Wikia wikis and non-Wikimedia wikis cannot be synchronized. This may be an option for a future release, but there are currently no plans.

Auto sync

To use automatic sync, do the following:

  • Enable "auto sync" in the options page
  • Visit the Main_Page for a wiki. The page will automatically sync

Note the following details:

  • default page is Main_Page: Due to the issues above, it is recommended that auto sync be enabled only for the Main Page. Other pages can be added under custom scope.
  • default interval is 1440 minutes (24 hours): By default, XOWA will only synchronize a page if the last synchronization check is at least 24 hours old. This interval can also be adjusted.

Technical details

Manual sync

This is an overview of what occurs when the Sync link is clicked:

  • XOWA calls the Wikipedia API to get the HTML version of the page. For example:
    https://en.wikipedia.org/w/api.php?action=parse&format=json&redirects=1&page=Wikipedia:Main%20Page
    
  • XOWA parses the HTML and...
    • Removes the Edit links. (These aren't implemented in XOWA. They could be, but I personally find them distracting and not applicable offline.)
    • Identifies images to download.
  • XOWA saves the HTML to "en.wikipedia.org-html.user.xowa"
  • XOWA updates the core database (en.wikipedia.org-core.xowa) to point to this HTML
  • XOWA then downloads the images separately
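
The first steps above can be sketched in a few lines. This is an illustration only and not XOWA's actual code: the function names are hypothetical, and `extract_html` assumes the API's default JSON layout, in which the page HTML sits under `parse`, then `text`, then `*`. No network request is made here; the sketch only builds the URL and processes a response that has already been fetched.

```python
import json
from html.parser import HTMLParser
from urllib.parse import quote, urlencode


def build_parse_url(domain: str, page: str) -> str:
    """Build an action=parse URL like the example above."""
    query = urlencode(
        {"action": "parse", "format": "json", "redirects": 1, "page": page},
        safe=":",         # keep the namespace colon readable
        quote_via=quote,  # encode spaces as %20 rather than +
    )
    return f"https://{domain}/w/api.php?{query}"


def extract_html(response_text: str) -> str:
    """Pull the page HTML out of the JSON response (assumed default layout)."""
    return json.loads(response_text)["parse"]["text"]["*"]


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_image_urls(html: str) -> list:
    """Return the image URLs referenced by the HTML, in document order."""
    collector = ImageCollector()
    collector.feed(html)
    return collector.sources
```

For example, `build_parse_url("en.wikipedia.org", "Wikipedia:Main Page")` reproduces the URL shown in the first step.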

Automatic sync

Automatic sync uses the same process as Manual Sync. The main process is as follows:

  • XOWA opens a page
  • XOWA checks if auto sync is enabled
  • If auto-sync is enabled, then it checks the sync time in "en.wikipedia.org-sync.xowa"
  • If the sync time doesn't exist, or is older than the specified interval (24 hours by default), then it kicks off the manual sync
  • XOWA updates the sync time for the page
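
The interval check in this process can be sketched as follows. The function name `needs_sync` is hypothetical, and treating a sync time that is exactly one interval old as due is an assumption based on the "at least 24 hours old" wording in the usage notes.

```python
from datetime import datetime, timedelta
from typing import Optional

DEFAULT_INTERVAL_MINUTES = 1440  # 24 hours, the default stated above


def needs_sync(
    last_sync: Optional[datetime],
    now: datetime,
    interval_minutes: int = DEFAULT_INTERVAL_MINUTES,
) -> bool:
    """Return True if the page has never been synced or its sync time is stale."""
    if last_sync is None:
        # No recorded sync time: the page has never been synchronized.
        return True
    # Assumption: a sync time exactly one interval old counts as due.
    return now - last_sync >= timedelta(minutes=interval_minutes)
```

A caller would run the manual sync whenever `needs_sync(...)` returns True and then record `now` as the new sync time for the page.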
