Blog/2016-06

From XOWA: the free, open-source, offline wiki application

Release: v3.6.4.2 (2016-06-27 12:40 Mon)

(PC) Fix Download Central failing on last step of image import

This is a patch to fix an issue with Download Central. Download Central would fail when trying to import images (it failed on the very last command). Restarting the app and resuming would fix it.

Note that this bug only occurred on PC and not Android.

I also included a few minor HTML changes for some related screens.

Release: v3.6.4.1 (2016-06-25 20:20 Sat)

All apps are a major release due to fixes for Download Central and HTML databases. In addition, both PC and Android now have redlinks.

(Wiki) 2016-06 English Wiktionary, Wikisource, Wikivoyage, Wikiquote, Wikibooks, Wikiversity, Wikinews and Species Wiki

All English wikis are now available for Android as well as desktop (HTML dumps). To download them, go to Download Central.

(All) Fixes / changes for Download Central including multiple-tasks bug

Last week had a patch to handle Download Central not working after resume.

This release fixes another bug wherein tasks were only able to run one at a time. Now multiple tasks can be added to the queue. I tested this week with the English wikis and loading all the tasks.

In addition, this week has a few other notable items:

  • Download Central for HTTP Server: This uses a new long-polling mechanism to capture progress updates
  • Download Central CSS for Mac OS X: Safari needed a few CSS tweaks.
  • Download Central Info page: This page offers more data on each task, including download links (for use in your download manager)
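
As a rough illustration of the long-polling idea mentioned above (not XOWA's actual code, which is written in Java), the sketch below lets a request handler block until a task reports newer progress or a timeout expires. The names `ProgressChannel`, `publish`, and `wait_for_update` are hypothetical.

```python
import threading


class ProgressChannel:
    """Hypothetical holder for one task's progress, polled by request handlers."""

    def __init__(self):
        self._cond = threading.Condition()
        self._version = 0   # incremented on every progress update
        self._percent = 0

    def publish(self, percent):
        # Called by the download task whenever its progress changes.
        with self._cond:
            self._percent = percent
            self._version += 1
            self._cond.notify_all()

    def wait_for_update(self, last_seen_version, timeout=30.0):
        # Called by the request handler: block until there is something newer
        # than what the client has already seen, or until the timeout expires.
        with self._cond:
            self._cond.wait_for(
                lambda: self._version > last_seen_version, timeout
            )
            return self._version, self._percent
```

The client sends the last version it saw and immediately re-issues the request after each response, so updates arrive promptly without constant polling.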

(All) HTML Databases: Better support for Redlinks, Timeline and Gallery

Last week had a few patches to handle HTML databases not working for old archive.org wikis.

This week has a few new features:

  • Redlinks: Missing links to pages now show up as red links. Note that this doesn't work on HTTP Server yet.
  • Timeline: Timeline now works again for Android. Also, it has been added to HTML databases.
  • Gallery: Gallery now has better support on the PC. Images used to line up vertically, one underneath another. Now they line up horizontally in rows.
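
To make the redlink idea concrete, here is a minimal sketch, assuming links are rendered as `<a href="/wiki/Title">` and that the set of existing page titles is available in memory. Both are assumptions for illustration, and `mark_redlinks` is a hypothetical helper, not an XOWA function. It uses the `new` CSS class that MediaWiki applies to links to missing pages.

```python
import re

# Assumed link format for this sketch: <a href="/wiki/Title">
LINK_RE = re.compile(r'<a href="/wiki/([^"]+)">')


def mark_redlinks(html, existing_titles):
    """Add class="new" to links whose target title is not in existing_titles."""

    def repl(match):
        title = match.group(1)
        if title in existing_titles:
            return match.group(0)  # page exists: leave the link untouched
        return '<a href="/wiki/%s" class="new">' % title

    return LINK_RE.sub(repl, html)
```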

(PC) Minor parser fixes including tags with apostrophe and missing transcluded pages

There are a handful of fixes here:

  • Flag icons sometimes not showing: This had to do with some unintuitive usage of a Module called Flagg. It relied on a failed template call returning its entire text. XOWA was only returning the title.
  • Some missing pages in Wikisource due to tag and apostrophe: The {{tag}} function was not being parsed correctly if it had apostrophes.
  • Incorrect redlinks for known pages in Wiktionary: This was a similar issue: missing pages should come back as just the title. XOWA was prepending "Template:".

Unfortunately, parser bugs are still part of the HTML dumps in the 2016-06 versions. They will be fixed in 2016-07 and beyond.

Next release: v3.7.1

  • Table of contents: HTML databases on the PC will get a TOC just like the Wikitext version
  • Gallery: Gallery still needs a few changes, particularly to handle packed galleries, as well as other custom-sized galleries.
  • Redlinks for HTTP Server: I'm going to try adding the new long-polling facility to the HTTP Server for redlinks
  • Wikis: I'm redoing the German wikis and adding them to Download Central.


Release: v3.6.3.4 (2016-06-21 15:00 Wed)

(Server) Fix blank pages for wikis downloaded from archive.org (fixed for GUI in 3.6.3.3, but not for HTTP Server)

thombles reported that v3.6.3.3 still didn't fix the blank pages. See: https://github.com/gnosygnu/xowa/issues/60

I discovered that my fix in v3.6.3.3 only affected the GUI. The HTTP server still used different code (I was in the middle of refactoring).

I centralized the code for v3.6.3.4 and tested again with both GUI and HTTP Server. Hopefully this is the final patch release for the v3.6.3 series.

Thanks again to thombles for reporting the issues.

Release: v3.6.3.3 (2016-06-21 13:00 Tue)

Sorry all. Another patch release to fix a few major bugs.

Thanks to William and thombles for reporting the issues.

(PC) Fix blank pages for wikis downloaded from archive.org

v3.6.3.1 introduced automatic support for wikis downloaded through Download Central (HTML dumps). However, it didn't work with any wiki that was downloaded from archive.org (wikitext dumps, but with HTML metadata).

v3.6.3.3 fixes this issue. It now uses the wikitext dumps if the HTML dumps are not there. Just update your app, and you will see actual pages, not blank ones.

Also, note that you can disable the automatic HTML read by going to Options/HTML_databases.
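
The fallback described above amounts to a small decision rule. The sketch below is only an illustration under assumed names (`choose_page_source` and its parameters are hypothetical, and what the app does when no usable dump exists is not stated here, so the sketch simply returns `None`).

```python
def choose_page_source(has_html_dump, has_wikitext_dump, html_read_enabled=True):
    """Return which dump to read a page from: "html", "wikitext", or None."""
    if html_read_enabled and has_html_dump:
        return "html"       # preferred: load pre-generated HTML directly
    if has_wikitext_dump:
        return "wikitext"   # fallback: generate the page from wikitext
    return None             # nothing usable in this sketch
```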

(PC) Fix Download Central icons for Mac OS X

Download Central looked pretty bad on Mac OS X. This was due to issues with the CSS flex directive and Safari browsers.

I put in a fix now to make it more presentable though there are still a few more blemishes to remove.

(PC) Log Download Central progress to troubleshoot failed completions

I had a report of Download Central failing for Simple Wikipedia. I was able to reproduce it partially on my Windows 32-bit machine, but it has since worked fine 4 times. Note that Download Central has also worked fine multiple times on Windows 64-bit machines and Linux 32-bit machines as well as Android OS.

I added some logging statements to troubleshoot the issue.

Release: v3.6.3.2 (2016-06-20 13:00 Mon)

(PC & Android) Fix Download Central not resuming after app is restarted

There was a bug in last night's release. Download Central would not resume after restarting the app.

This is fixed in this release. If you're downloading, please pause the download, exit the app, install the update, and resume again.

Release: v3.6.3.1 (2016-06-19 19:30 Sun)

The PC app is a major release. It has a new Download page, can now read Android HTML databases by default, and includes many minor fixes.

The Android app is a major release. It also has a new Download page, as well as some minor changes.

The Server app is a major release. It can read Android HTML databases by default, and has a fix for the broken Search.

(PC & Android) New Download Central to download HTML dumps and images

Download Central is the major feature of the release. This is an in-app downloader that can download wikis for both Android and desktop. It can download images as well.

For v3.6.3, there are only two wikis: Simple Wikipedia and English Wikipedia. More wikis will be added every week throughout 2016. Next week, v3.6.4 should have other English wikis, like English Wiktionary and Wikisource.

Going forward, Download Central will be the primary vehicle to get XOWA wikis. It provides a simple way to import wikis. It eliminates any confusion about which files to download and where to copy them. Monthly updates for English Wikipedia will be published here as well.

To try it out, go to Download Central. For more info, see the Download Central help page.

(PC & Server) Android HTML dumps are now usable

This is the other major feature of the release. Previously, the app could only read wikitext dumps. Now, it can read the XOWA Android HTML dumps. This has a few key benefits:

  • One micro-SD card, many platforms: The same micro-SD card can be used to read XOWA on Windows, Linux, Mac OS X, Raspberry Pi or Android
  • Fast page loads: The app no longer needs to generate HTML from the wikitext. It can just load the HTML directly. This will be significantly faster. For example, a page like Earth might have taken 5 seconds to load. Now it loads in 1 second (or less).
  • Wikidata no longer required: The HTML dumps no longer need Wikidata, which took up an additional 32 GB of space. Previously, Wikidata (www.wikidata.org) needed to be imported, or else some data would be missing.

However, there are drawbacks as well:

  • Missing features: The HTML dumps currently do not support these features:
    • Table of Contents: Android generates Table of Contents in a different manner
    • Redlinks: This feature still needs to be implemented.
    • Links in other languages: Wikidata is still needed for this information
    • Categories (partially): Categories are dynamically rendered. The HTML dumps include the first 200 items in a category, but if there are more, they won't be available
    • Other omissions: Image Map and Gallery were found to be broken for 2016-06 English Wikipedia. They are fixed for future releases. However, there will probably be other items that will be discovered as well.
    Plans: Each of these features will be implemented over the course of 2016. The end-goal is to have the HTML dumps produce the same output as the wikitext dumps.
  • Larger size: The HTML dumps are larger than the wikitext (approximately 30 GB vs 20 GB).
    Plans: This will be whittled down over releases, though it's unlikely that the HTML dumps will ever be smaller than 25 GB.
  • Requires separate post-processing generation step: The wikitext dumps were automatically generated by downloading an XML dump. The HTML dumps require another post-processing step that is not simple to run (See: Dev/Command-line/Dumps)
    Plans: This may be simplified with a "Generate HTML dump" button in the future, but generating an HTML dump for English Wikipedia will probably be a resource-intensive task.

Finally, a few notes on wikitext dumps versus HTML dumps.

  • More work on HTML dumps in the future: Going forward, a lot of development work will go into the HTML dumps. This is necessary as the wikitext dumps are too slow for Android.
  • HTML dumps will be uploaded to archive.org: Previously, Wikitext dumps were uploaded to archive.org. Now it will be HTML dumps instead.
  • Wikitext dump support is not going away: Wikitext dumps will still be supported and used. Note that they are still a critical precursor to HTML dumps. (HTML dumps can't be produced without them).

(PC & Android & Server) Search now reduces importance of short pages

The new XOWA Search Engine uses PageRank to rate pages by importance. Although this works well for Wikipedia, it sometimes overrates pages which exist for encyclopedic book-keeping.

For example, a lot of Wikipedia pages have a small box called "Authority Control" at the bottom of the page. This box has links to other pages like https://en.wikipedia.org/wiki/Integrated_Authority_Control . If a million pages have this Integrated Authority Control link, then PageRank rates this page highly ("1 million pages link to it!"). However, the page itself is fairly short, and is not really one of the most important articles in Wikipedia (it would score higher than India, Insect, Italy, etc).

v3.6.3 tries to reduce the importance of such pages when they are "short". This heuristic was already present in previous versions of the search engine, but has been further tweaked.
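
One way to picture such a heuristic is below. This is only a sketch: the actual formula and threshold used by the search engine are not given here, so the `short_len` cutoff and the linear scaling are assumptions, and `adjusted_score` is a hypothetical name.

```python
def adjusted_score(link_score, page_len, short_len=1000):
    """Scale a link-based score down for pages shorter than short_len.

    The threshold and the linear scaling are illustrative assumptions.
    """
    if page_len >= short_len:
        return link_score
    return link_score * (page_len / short_len)
```

With these assumed numbers, a heavily linked but short page (score 100, length 500) ends up below a longer article with a lower raw score (score 60, length 5000).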

(Server) Fix for broken Search

This was kindly reported by thombles here: https://github.com/gnosygnu/xowa/issues/59 . The new Search Engine in 2016-03 broke the HTTP Server search. This was fixed in this release.

(PC) Minor fixes (Database, HTML Dump, Search, Special, Parser)

These can be described briefly as follows:

  • Database: Small wikis now generate a "core.xowa" file in addition to a "text.xowa" file. Previously, they only generated a "text.xowa" file.
  • HTML dump: Image Map and Gallery weren't working correctly. Note that these are still broken in the current 2016-06 dump, but will be fixed in the 2016-07 one.
  • Search: A few searches would be blank if the page also existed in a different namespace (for example, dokuwiki)
  • Special: Special page names can now be case-insensitive (Special:RANDOM) or use native-language terms (Spezial:Zufällige_Seite)
  • Parser: A handful of script errors around redirect links and country flags
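
The Special page change can be pictured as a case-insensitive alias lookup. The sketch below is an illustration only: the alias tables are hypothetical and hard-coded to the two examples above, whereas a real wiki would supply its own localized names.

```python
# Hypothetical alias tables, keyed by lowercased name.
NS_ALIASES = {"special": "Special", "spezial": "Special"}
PAGE_ALIASES = {"random": "Random", "zufällige_seite": "Random"}


def resolve_special(title):
    """Map a title such as "Spezial:Zufällige_Seite" to ("Special", "Random").

    Returns None if the namespace or page name is not a known alias.
    """
    ns, sep, page = title.partition(":")
    if not sep or ns.lower() not in NS_ALIASES:
        return None
    canonical_page = PAGE_ALIASES.get(page.lower())
    if canonical_page is None:
        return None
    return NS_ALIASES[ns.lower()], canonical_page
```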

(Android) Minor changes (more Special pages, redesigned UI for special pages, HTML Page Title, flat icon)

These can also be described briefly as follows:

  • More Special pages: The following two pages have been added:
    • Wiki Info: Shows information about the wiki, including size and location. Also allows deletion of the wiki. Available by doing: Menu -> XOWA -> Info icon
    • Log: Shows log information for troubleshooting. Available by doing: Menu -> Settings -> Logs
  • Redesigned UI for special pages: Special pages now use a common look and feel. This involves shared CSS, more icons, and a simple "help" panel. Also, they no longer show the footer (view page in browser, last modified, CC Content)
  • HTML Page Title: The Page Title used to be an Android TextView widget. Now it's part of the HTML document. This makes resizing the page much nicer (the widget used to jump around when the page was resized)
  • Flat icon: XOWA now has a flat icon

(Wikis) English Wikipedia (2016-06) and Simple Wikipedia (2016-06)

These are available through Download Central.

For users who want the 2016-06 English Wikipedia image update, see App/Import/Download Central.

Next release: v3.6.4

I'm focusing on HTML dump related issues for the next few weeks, particularly table of contents and redlinks.

For wikis, I'm going to generate the other English wikis for Download Central.

Release: Android Beta with wiki downloader (2016-06-12 23:00 Sun)

I released a beta version of the wiki downloader: See https://play.google.com/store/apps/details?id=org.xowa.beta and https://github.com/gnosygnu/xowa/releases . I'm planning on doing some minor tweaks this week, before uploading it to the main XOWA Android app next week. I'll release the desktop app as well.

In addition, I generated 2016-06 English Wikipedia, and will be uploading a new version to be used with the wiki downloader. Once I get the wiki downloader stabilized, I'll start updating the other wikis.

Release: NONE (2016-06-05 22:00 Sun)

There's no release again this week. I've been bogged down in Android SQLite optimizations for the wiki downloader. I'm going to release something next week as I'd like to get back to regular releases.

English Wikipedia is building now, so hopefully I'll have that ready for next week.
