Blog/2014

From XOWA: the free, open-source, offline wiki application

Release: NONE (2014-12-28 20:15 Sun)

I'm skipping a release again this week. I'm aiming to do a maintenance release next week for v2.1.1 but I don't have any major changes planned.

Release: NONE (2014-12-21 22:25 Sun)

I uploaded Polish and English Wikipedia images tonight. I'll update the image database pages later.

Release: NONE (2014-12-14 23:20 Sun)

My apologies, but I'm skipping the release for this week. I'll resume next week with an update for Polish and English Wikipedia images.

Release: v1.12.2 (2014-12-07 21:17 Sun)

v1.12.2 was another quick release. I generated images for the Arabic wikis.

v1.12.3 will continue the same trend. I'm going to do a refresh of the Polish wikis, but have nothing else planned.

Release: v1.12.1 (2014-11-30 22:25 Sun)

v1.12.1 was another simple release. I generated images for 2014-11-24 French Wikipedia and fixed a few parser bugs.

v1.12.2 will probably continue with short releases. I'm still working on a simple Android alpha for Simple Wikipedia.

Release: v1.11.4 (2014-11-23 22:07 Sun)

v1.11.4 was yet another short release. I generated images for 2014-11-06 English Wikipedia, but did little else.

v1.12.1 will continue the trend of short releases. I'll start working on some alpha HTML dumps for Simple Wikipedia / English Wikisource.

Release: v1.11.3 (2014-11-16 22:12 Sun)

v1.11.3 was another short release. I made a run through 2014-11-06 English Wikipedia, and it appears mostly okay, though there was one issue related to unclosed <imageMap> and Canadian pages. Aside from a few issues related to <indicators> and one Scribunto badges issue affecting frwiki, I did nothing else.

v1.11.4 will probably be another short release. I'll generate an update for English Wikipedia, but it's unlikely that there will be anything more.

Release: v1.11.2 (2014-11-09 21:13 Sun)

v1.11.2 was also a short release. I've not had much time this week, and this may continue for a while. I did manage to work on Schnark's tickets and to generate an update for German wiki. In addition, I went through the last of the 50,000+ wikis, including Latvian, Nynorsk and Cebuano. As of now, all wikis listed as having 50,000+ articles have XOWA packages built within the last 6 months. This includes all the sister wikis as well (wiktionary, wikisource, etc.).

v1.11.3 will still be a short release. I'm going to try to generate an update for English Wikipedia, but that may slip till next week. In addition, I'll work on a Latin update.

Release: v1.11.1 (2014-11-02 22:36 Sun)

v1.11.1 was an abbreviated release. I had some personal affairs that took up most of the week. I started working on the German Wiki image update, but didn't have the time to complete it. I added a few minor bug fixes, but that was it.

v1.11.2 should see a German Wiki image update. Unfortunately, personal matters may keep me from making progress this week as well. I'm hoping to get further in Android, but it looks like the timetable may slip.

Release: v1.10.4 (2014-10-26 21:33 Sun)

v1.10.4 had a number of fixes for Russian Wikipedia -- nearly all of which involved Scribunto and Wikidata. I continued on Android and have a fairly workable prototype. I've also started rewriting the Android database serialization to be more compressible. This is going to be a somewhat time-consuming task, but I'm hoping to be able to generate an entire HTML text dump of English Wikipedia in 25 GB (contrast that with the current Wikitext dump of 19 GB).

For v1.11.1, I'm going to keep it light. I'm planning on doing a German update and a new Latvian package. In the meantime, I'm going to continue working on database format / compression.

Android: Initial overview (2014-10-26 18:12 Sun)

I thought it would be worthwhile to give a general overview, so that some expectations can be set.

XOWA will initially support Android 2.x (Eclair) and up. It won't support Alpha (1.0), Beta (1.1), Cupcake (1.5) or Donut (1.6). The limitation occurs because XOWA will be using the ActionBar Sherlock GUI library and it only supports 2.x+. If XOWA needs to support a pre-2.x version, I think it might be possible in a future release, but it won't be part of the initial release.

I'm planning three phases of release.

  • 2014-12-01: Simple Wikipedia. This will include an XOWA prototype and a 2.2 GB database of text and images. I'm delaying this a month, because I want the database format to be reasonably final.
  • 2015-01-05: English Wikipedia. This will include an XOWA beta and a 25 GB text-only database.
  • 2015-02-03: Google Play store. This will be the final version of the XOWA beta with a "full" version of English Wikipedia that will fit within a 64 GB microSD card. In addition, I'll have a German Wikipedia available as well, with other versions to follow soon afterwards.

If you have an Android device, I'd really appreciate it if you do some beta testing for the first two phases. In particular, I'd like feedback on how XOWA works on your device in terms of the following:

  • Speed: Does XOWA retrieve pages fairly quickly? I have a Samsung Galaxy 2.2 phone, so I think I have the lower end of the spectrum covered, but I'd like to have feedback on as many different devices as possible.
  • Display: Does XOWA look fine at your screen resolution? How about portrait vs landscape?
  • Other issues: Android and its multiple frameworks can be challenging, so I'm sure there will be issues that will arise on certain devices.

Finally, XOWA will be using a new database html dump format. Android is too slow to compile wikitext, especially when you throw in Luaj and Jtidy (some pages would take 2+ minutes on my phone). The advantage of the new format is that it will be much faster (less than 2 seconds for any page). Unfortunately, the disadvantages are several:

  • New databases will have to be built. I'm working on a script file to compile the existing wikitext databases to html databases, but in the beginning, I'll probably be providing these html databases separately.
  • Wikitext won't be available. There will be an option to combine a wikitext database and an html database, but the resulting file will probably be 50 GB (without even the images).
  • I have to rewrite parts of XOWA to handle an "html-only" version. I've been doing this for the past few months, but will need to do more over the next two.

I'll set up a thread later in the XOWA forum to get more feedback.

Release: v1.10.3 (2014-10-19 22:31 Sun)

v1.10.3 was a straightforward release. Most of this had to do with 2014-10-03 English Wikipedia dump being relatively uneventful (unlike the last few monthly dumps). Vietnamese and Lithuanian were also straightforward, though a few pages in Vietnamese Wikisource forced a rewrite of the XOWA poem parser. In addition, due to the similarity of the lt and li language codes, I ended up doing Limburgish by mistake. However, it was small enough to upload anyway. In the meantime, I plugged away at the Android app, and have a bare-bones prototype for Simple Wikipedia.

For v1.10.4, I'm going to go out of order and do a Russian rebuild. Russian Wikipedia tends to use Scribunto a lot, and I'd just like another check that XOWA's Scribunto rendition is still working well. I'll add Estonian and maybe Latvian. Otherwise, I'll work on some database optimizations for the Android app.

Release: v1.10.2 (2014-10-12 22:21 Sun)

v1.10.2 includes some major changes to the javascript file layout. This should not be noticeable to most users, but for anyone interested in the details, you can review this ticket. Aside from that, I worked a bit on Android and have a primitive alpha for Simple Wikipedia.

For v1.10.3, I'm going to do an English image update. I'll continue on the Android app, and should be getting closer to a workable prototype. For wikis, I'll be working on Vietnamese (update) and Lithuanian (new).

Release: v1.10.1 (2014-10-05 20:54 Sun)

For v1.10.1, I spent a lot of time on the wikidata.js replacement. I decided to reproduce the current look and feel for the Wikidata site. I think the final results are pretty good.

For v1.10.2, I'm going to go back to Android. For wikis, I'll be working on Norwegian (update) and Galician (new).

Release: v1.9.5 (2014-09-28 22:30 Sun)

v1.9.5 was mostly involved with a small number of minor parser fixes. Otherwise, English Wikipedia looked good (I reviewed over 100 pages) while Waray-Waray was much smaller than I expected. In addition, I worked on wikidata.js, but decided to incorporate this functionality directly in XOWA's Java code. (For more information, see below). I didn't finish the wikidata.js replacement for this release, but it should be done for v1.10.1. Otherwise, I worked a little on some XOWA Android experimentation.

For v1.10.1, I'm going to finish up the wikidata.js replacement. In addition, I'll start working on an Android prototype for Simple Wikipedia. For wikis, I'm planning to work on Thai (update) and Slovenian (new).

Wikidata.js replacement (2014-09-28 21:30 Sun)

First, as background, the Wikidata.js file was developed by Schnark as a way to view Wikidata pages within XOWA. You can see a screenshot here. As with all of Schnark's contributions, it was a well-developed piece of code that has performed its task perfectly for over a year and a half.

Unfortunately, the latest changes to the Wikidata scheme don't work with Wikidata.js. The Wikibase team introduced a new JSON schema that is different than the old model. The structure is basically the same, but the json paths are different.

I started working on upgrading it, but decided against it for the following reasons:

  • I'd be reproducing the two-version logic (v1 / v2) that I wrote in XOWA within Wikidata.js. Although this isn't a lot of code, I didn't like the idea of writing the same logic again. I also fear a time when the WMF introduces a v3, and the same thing needs to be done again.
  • SWT sometimes crashes while running the script. I don't know what the underlying cause is, but it seems to be the rapid-fire exchange of Javascript callbacks to resolve Property names. Unfortunately, these crashes are hard to reproduce, but they'll generally occur after viewing 20 or so pages. I've tried to patch this in the past, but was not able to. I think this would require a change in both Wikidata.js and XOWA to batch lookups (and it may still not work).
  • Performance would be quicker if I handled it directly with Java, rather than make calls back and forth between Java and Javascript.
  • A Java version would allow me to make offline HTML dumps. It would also allow me to run tests across every Wikidata page.
  • Popups don't work with it. This is a minor item, but one that I thought worth mentioning.

So, after much internal debating, I decided to make the v2 version for Wikidata within XOWA's Java code. I'm still going to leave the v1 infrastructure in place. If anyone wants to upgrade it to v2, they can do so, though I'm hoping the new version should perform well enough.
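As a rough illustration of what "same structure, different json paths" means in practice, here is a minimal sketch of a label lookup that tolerates both layouts. It is not XOWA's actual parser: the class name, the already-parsed `Map` representation, and the key names (`label` for v1, `labels` plus a nested `value` for v2) are assumptions made for the example, and "Earth" is just sample data.

```java
import java.util.Collections;
import java.util.Map;

final class WikidataLabelSketch {
    // Returns the label for `lang`, or null if the entity has none.
    // Key names are illustrative assumptions, not a statement of the real schemas.
    static String label(Map<String, Object> entity, String lang) {
        // Assumed v2 layout: {"labels": {"en": {"value": "..."}}}
        Object v2 = entity.get("labels");
        if (v2 instanceof Map) {
            Object entry = ((Map<?, ?>) v2).get(lang);
            if (entry instanceof Map) {
                Object value = ((Map<?, ?>) entry).get("value");
                return value == null ? null : value.toString();
            }
            return null;
        }
        // Assumed v1 layout: {"label": {"en": "..."}}
        Object v1 = entity.get("label");
        if (v1 instanceof Map) {
            Object value = ((Map<?, ?>) v1).get(lang);
            return value == null ? null : value.toString();
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, Object> oldStyle = Collections.<String, Object>singletonMap(
            "label", Collections.singletonMap("en", "Earth"));
        Map<String, Object> newStyle = Collections.<String, Object>singletonMap(
            "labels", Collections.singletonMap("en", Collections.singletonMap("value", "Earth")));
        // Both calls resolve the same label even though the paths differ.
        System.out.println(label(oldStyle, "en"));
        System.out.println(label(newStyle, "en"));
    }
}
```

Keeping the version check in one place like this is the kind of logic that would otherwise have to be duplicated in Wikidata.js.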

Release: v1.9.4 (2014-09-21 22:01 Sun)

Most of my time for the v1.9.4 release was spent on wikidata. As mentioned earlier, the latest wikidata dump introduced a brand new schema. More interestingly, the dump had articles in both the old schema (Q933) and the new schema (Q2). I had to rewrite a good deal of XOWA's wikidata parser to handle two schemas, as well as implement the new schema. This took up most of my time for the v1.9.4 release, and I'm still not really done.

In addition, I took a peek at the 2014-09-03 enwiki dump, and corrected two widespread "Script error" issues. The rest of the dump seems fine, but I'll get a better sense when I generate the image update later this week.

For v1.9.5, I'm going to focus on finishing up wikidata issues. Primarily, the wikidata.js page doesn't work for the new schema (navigating to Q2 will show a lot of errors). I'm also going to try to add better qualifier / reference support. In addition, there will probably be a handful of minor English Wiki issues that will take some time. Aside from English Wiki, I'll try to do the Waray-Waray wikis, but that may end up spilling over to the following week.

Latest version of Wikidata is not compatible with XOWA v1.9.3 (2014-09-17 23:05 Wed)

It turns out that the 2014-09-12 Wikidata dump has a brand new schema and can't be imported into XOWA.

I'm working on a v2 parser now, and should have it ready for v1.9.4. In the meantime, please hold off on updating Wikidata.

Release: v1.9.3 (2014-09-14 22:04 Sun)

The v1.9.3 release has a handful of minor changes for the Simple wikis. I spent more time on Android development, and got an experimental dump working. However, it's still going to be a few weeks before I get any sort of alpha ready.

Also, it looks like archive.org is slow again. I uploaded Hungarian this morning, but Esperanto, Simple and Species will probably not be done until sometime tomorrow.

v1.9.4 will continue along with Android. For wikis, I'm going to do Portuguese (rebuild) and Slovak (new).

Release: v1.9.2 (2014-09-07 21:45 Sun)

The v1.9.2 release has several minor changes for English Wiktionary and Wikisource. Some were quite time-consuming, including one Luaj issue with .pairs(). I also decided to hold off on more HTML dump work, because I want to see how they perform in Android before I commit to the HTML dump format. Towards that end, I started working on the Android version this week, though so far most of the work is quite experimental.

v1.9.3 will involve more Android work along with HTML dumps. For wikis, I'm going to do Hungarian (rebuild) and Esperanto (new).

Release: v1.9.1 (2014-08-31 23:27 Sun)

The v1.9.1 release has some significant changes for the Chinese wikis, particularly Chinese Wikipedia and Wiktionary. Otherwise, most of the work went into more changes for the HTML dumps. At this point, I'm hoping that the main XOWA program is stable enough, and that any other major changes can wait until after an Android version.

v1.9.2 will still involve more HTML dump work. I'm also going to try to get a prototype going in Android, and make sure that the HTML dump approach works as planned. For wikis, I'm going to do English sister wikis (Wiktionary, Wikisource, etc.) and Croatian (new).

Release: v1.8.4 (2014-08-24 22:22 Sun)

The v1.8.4 release has a handful of changes for English Wikipedia. Some of these were quite interesting (particularly the railway maps and the automatic-new-line-if-needed), though fortunately none of these should be major. Otherwise, I spent more time on the HTML dumps. It's taking longer than I expected, but I should definitely have something ready for a v1.9.* release.

Also, it's worth noting that I did some cleanup of the XOWA pages at archive.org for enwiki, dewiki, frwiki and plwiki. Specifically, I did the following:

This may still change in the next few weeks, but I'm hoping the new url should be easier to remember (Xowa_(language_code)wiki_latest) and clearer to use. Also, a big thank you to Anselm who has been very helpful with feedback.

As for the next release, v1.9.1 is a monthly milestone, so I plan to keep changes to a minimum. I'll probably spend most of my time on HTML dumps and hope to have a workable prototype. For wikis, I'm going to do Chinese (rebuild) and Serbo-Croatian (new).

Release: v1.8.3 (2014-08-17 22:24 Sun)

The v1.8.3 release is mostly about minor changes for the French and Polish wikis. In addition, there is a handful of fixes for the 2014-08-11 English wiki (which looks good so far). The SWT upgrade got put on hold as I ran into an issue on a Linux 32-bit box, and decided it wasn't worth the risk to upgrade any of the other platforms (though it looks like a Linux-only issue: SWT 4.4 requires GTK 3). Aside from that, I spent most of the week on the HTML dumps.

For v1.8.4, I'm going to continue with the HTML dumps. I'm hoping to have a beta feature ready within the next few releases. Other than that, I'm going to release the images for 2014-08-11 English wiki and probably do Arabic.

Release: v1.8.2 (2014-08-10 22:02 Sun)

The v1.8.2 update is larger than usual. I had to regenerate the language files because MediaWiki v1.24 added "!" as a magic word and German Wikipedia dropped Template:!. Since this was a low-level change, it forced a rebuild of all the language files. I also took the opportunity to move the language files from /xowa/user/anonymous/lang/xowa/ to /xowa/bin/any/xowa/cfg/lang/core/. Likewise, I moved /xowa/user/anonymous/wiki/#cfg/ to /xowa/bin/any/xowa/cfg/wiki/core/.

In addition, I worked on a number of performance enhancements for pages with heavy Scribunto usage. The worst is https://en.wiktionary.org/wiki/water . On XOWA this used to take 1.5 GB of memory and 50 seconds. Now it takes 200 MB of memory and 35 seconds. This is still bloated, but keep in mind the official site takes about 20-25 seconds[1]. Also, this heavy Scribunto usage only affects a small number of pages (and mostly on en.wiktionary). I'll try to add some more incremental improvements over the next few releases, but ultimately this may have to be resolved by offline HTML dumps.

Finally, I'm trying to upload the German wikis now, but the upload speed is horrendous and at 33 GB, it'll take several days. I had the same problem last Sunday, though this time it's much worse. If this doesn't resolve by tomorrow, I'll contact archive.org for help, but in the meantime, German wikis will be late.

For v1.8.3, I'm going to upgrade to the latest version of SWT for Windows and Linux. I may wait on Mac OS X, as I don't have a system I can test (if there are any Mac OS X users who want to try a beta build, please contact me by email or by forum). I'm also going to continue on the offline HTML dumps. For wikis, I'd like to do rebuilds of French and Polish, but if these upload problems persist, I may settle for Serbo-Croatian / Croatian instead.


  1. ^ If you want to reproduce this, try the following: (Thanks to Anselm for pointing out the ambiguity)

GitHub Forks of Luaj and JTidy (2014-08-08 23:50 Fri)

I had to make some more changes to the Luaj project for v1.8.2. I decided that it was time to put the luaj_xowa project under some sort of DVCS, so I set up a GitHub repository here. I also set up one for jtidy_xowa.

I'm also going to stop uploading these source tarballs to the XOWA support area, since they're basically redundant. Hopefully this won't inconvenience anyone.

Release: v1.8.1 (2014-08-03 22:29 Sun)

v1.8.1 was a light release with some minor UI fixes. This included the new collapsible wikidata div, as well as some code to handle a bread crumb trail for leaf pages. Japanese and Bulgarian wikis were straightforward, though due to some archive.org issues, it looks like the uploads won't be ready until sometime tomorrow.

For v1.8.2, I'm going to continue working on the HTML dumps. It'll probably be a few weeks before I have anything to show, but when I'm done, it should be the basis for the Android app. For wikis, I'm planning to do a rebuild of German and generate new images for Danish.

Release: v1.7.4 (2014-07-27 00:32 Mon)

v1.7.4 was mostly about the ImageMap extension. It took quite a bit of time, but I think the end result was worthwhile -- particularly with Popups. You can check the urls listed in the Change Log. I also added some "popup-at-anchor" functionality, though this will be limited until XOWA starts generating full html pages. In addition, there were some more Scribunto changes -- this time with Wikibase. I added a few minor UI changes in order to make "startup-from-previous-session" the default. I also went through the Ukrainian and Greek wikis. Finally, I uploaded the "slimmer" English Wikipedia set. You can check it out at https://archive.org/details/Xowa_enwiki_latest.

On another note, I tabled XUL Runner until v1.8.*. I'm mostly stopped by this SWT issue. I can work around it, but it's not that clean, and I'd rather hope they can fix it.

For v1.8.1, I'm going to try to keep it simple. I'll work on some general performance enhancements and revise some of the documentation. For wikis, I've decided to continue on with the rest of the 50,000+ wikis. Bulgarian will be next, and Japanese will be paired with it.

Fixed: English Wikipedia images for 2014-07-07 (2014-07-22 21:02 Tue)

The English Wikipedia images are now available: https://archive.org/details/Xowa_enwiki_2014-07-07_images_update . For anyone curious, here are two items you can check:

  • en.w:Marshalsea: missing image: File:London1300-marking_the_spot_of_the_future_Marshalsea.png
  • en.w:Bosniaks: distorted image: File:Alen_Islamović.jpg

Technical difficulties: English Wikipedia images for 2014-07-07 (2014-07-21 22:54 Mon)

The English Wikipedia images are currently offline. I noticed a minor problem, and thought I could upload a fixed version. Unfortunately, I didn't expect archive.org to keep timing out on me. I'm going to continue trying tonight but it may be another day or so before the new version is up. Sorry about the delays.

Release: v1.7.3 (2014-07-20 22:17 Sun)

v1.7.3 had more follow-up Scribunto changes. These affected only a handful of pages (less than 50 out of 4.5 million). I think I'm done with the latest Scribunto changes, and hopefully the next changes won't be as traumatic. In addition there were two notable fixes: missing references due to lower-alpha, and excessive ampersands on some math pages. I added a few UI options, including the set-startup-page option. Hebrew wasn't as difficult as I expected, though it still involved a few changes.

The English Wikipedia files were more challenging, as the recent Scribunto changes increased the build time by a half day. However, I did manage to come up with a script to reduce the image database size from 91 GB to 68 GB. I'll be uploading the new set throughout the coming week. The url will probably be https://archive.org/details/Xowa_enwiki_latest but you should check https://archive.org/search.php?query=xowa .

For those users who have been keeping up to date with English Wikipedia, you can do any of the following:

  • If space isn't an issue, just download the latest set: https://archive.org/details/Xowa_enwiki_2014-07-07_images_update
  • If space is an issue:
    • If bandwidth is not an issue, redownload the new set and delete the old one. This will probably be easier than the next option
    • If bandwidth is an issue, run an XOWA build command on your existing set. I'll include instructions in the Xowa_enwiki_latest url. Note that this is the same script I will have used to generate the new set. The scripts will be < 1 MB. However, they will take 6 or 7 hours to run and require the use of the command-line.

For v1.7.4, I'm going to try to release a fully functional ImageMap extension. I'm hoping this will be useful now that App/Xtn/Html/Popups is in place. I'm also going to try to do XULRunner 24, though if I run into issues, I may defer this till a v1.8 build. For wikis, I'm going to skip over Japanese and do Ukrainian as the latter uses Scribunto heavily. I'll probably do Greek as well.

Release: v1.7.2 (2014-07-13 22:23 Sun)

v1.7.2 was quite a release. I downloaded the latest enwiki, and found a whole host of issues due to recent changes in Scribunto. As a result, there were a lot of changes to XOWA's Scribunto for this release. Hopefully I've caught all the major issues. In addition, I started working on XULRunner 24, but was diverted to some memory performance issues. Also worth mentioning is an important item for certain non-English wikis: case-insensitive UTF-8 lookups. The latter was especially time-consuming for a language like Turkish, which has asymmetric casings: İ lowercases to i but i uppercases to I. Fortunately, in comparison to the above, Swedish and Malay were easy to do.
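The Turkish asymmetry is easy to see with Java's per-character case mappings. The following is only a minimal sketch, not XOWA's lookup code: the class and method names are made up for the example, and it uses the locale-independent `Character.toLowerCase` / `Character.toUpperCase` mappings rather than Turkish-locale rules.

```java
final class TurkishCasingDemo {
    static final int DOTTED_CAPITAL_I = 0x0130; // İ (LATIN CAPITAL LETTER I WITH DOT ABOVE)

    // Lowercase a code point, then uppercase the result.
    // For İ this does not return to İ: it goes İ -> i -> I.
    static int lowerThenUpper(int codePoint) {
        return Character.toUpperCase(Character.toLowerCase(codePoint));
    }

    // Hypothetical case-insensitive key: lowercase every code point.
    static String foldLower(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            sb.appendCodePoint(Character.toLowerCase(cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    // Same idea, but uppercasing instead.
    static String foldUpper(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            sb.appendCodePoint(Character.toUpperCase(cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(lowerThenUpper(DOTTED_CAPITAL_I) == DOTTED_CAPITAL_I); // round trip is lost
        // Lowercase folding merges "İstanbul" and "istanbul"; uppercase folding keeps them apart.
        System.out.println(foldLower("\u0130stanbul").equals(foldLower("istanbul")));
        System.out.println(foldUpper("\u0130stanbul").equals(foldUpper("istanbul")));
    }
}
```

The practical consequence is that the direction chosen for case folding changes which titles are treated as equal, so a case-insensitive lookup has to pick one direction and apply it consistently to both the stored keys and the search term.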

For v1.7.3, I still need to do some additional cleanup in the wake of all the Scribunto changes from last week. In addition, I'm going to do another images update, and hopefully I'll be able to do a smaller English Wikipedia images dump size. For wikis, I'm going to try to do Hebrew, but that may spill over to the following week (I'm always nervous with RTL wikis).

Cleaning up the download area (2014-07-07 21:29 Thu)

Another administrative note. I decided to clean up the download area last night: https://sourceforge.net/projects/xowa/files/ . There were a lot of old releases dating back to 0.3.0, with a few gaps in between. I've wanted to do it for quite a while, but I never like deleting files.

For those who are curious, I don't retain full historical sets of XOWA for two reasons:

  • XOWA's release set is quite large. Currently XOWA weighs in at 345 MB. As I'm using SourceForge's site for free, I feel it would be in bad faith to use more than necessary.
  • XOWA evolves fairly rapidly, and the latest version is generally the best. There's really no reason to use an old version, especially as there may be issues which may have been resolved in the current one.

For now, I'm planning to use the following approach for release files:

  • There will be a full set of files (app + upgrade + source) for the last major / stable version. This is generally the .1 version, as in 1.7.1.
    • This will be for users who just want the latest major version, which should be the most stable.
  • There will be a month's worth of (upgrade + source) set of files. These sets vary in size, but usually are no more than 50 MB.
    • This will be for the rare situation wherein a user wants to roll back to an earlier version. There's really no reason to do so, as if there's a regression bug, I will fix it immediately. However, it's a simple safety net to provide.

For anyone who wants to review older versions of the source code, I've been posting on Wikimedia's Gerrit since 2013-08: https://gerrit.wikimedia.org/r/#/q/project:xowa+branch:master,n,z . I've recently started dual-posting to GitHub as well: https://github.com/gnosygnu/xowa

Finally, if anyone wants a full historical release set (for example, 1.1.1), feel free to drop me an email. I'll post a copy temporarily to my Google drive account. Note that all hashes for the release sets are in https://sourceforge.net/projects/xowa/files/support/ so you can cross-check the copy if you don't trust me. :)
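For anyone who wants to do that cross-check from Java rather than with a command-line tool, here is a minimal sketch using the standard `java.security.MessageDigest` class. The hash algorithm is passed in as a parameter because this post doesn't say which one the support area uses; the class name and command-line arguments are made up for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class ReleaseHashCheck {
    // Wrap the checked exception so any algorithm name the JVM supports can be passed in.
    static MessageDigest newDigest(String algorithm) {
        try {
            return MessageDigest.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("unsupported algorithm: " + algorithm, e);
        }
    }

    // Render digest bytes as lowercase hex.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) sb.append(String.format("%02x", b));
        return sb.toString();
    }

    // Hash a small in-memory buffer.
    static String hexDigest(byte[] data, String algorithm) {
        return toHex(newDigest(algorithm).digest(data));
    }

    // Hash a file in 8 KB chunks so large release archives don't need to fit in memory.
    static String hexDigest(Path file, String algorithm) throws IOException {
        MessageDigest md = newDigest(algorithm);
        byte[] buf = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buf)) != -1) md.update(buf, 0, n);
        }
        return toHex(md.digest());
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: ReleaseHashCheck <file> <algorithm> <published-hex>");
            return;
        }
        String actual = hexDigest(Paths.get(args[0]), args[1]);
        System.out.println(actual.equalsIgnoreCase(args[2].trim()) ? "OK" : "MISMATCH: " + actual);
    }
}
```

Run it with the downloaded file, the algorithm name (for example `MD5` or `SHA-1`, whichever matches the published list), and the published hash.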

Release: v1.7.1 (2014-07-06 21:43 Sun)

v1.7.1 was more interesting than I expected. The Russian wikis had the expected Scribunto issue, but Serbian had a whole host of issues related to variants. I added a good deal of basic support, though I'm sure there are still some items missing. If any Serbian users can report any actual errors in articles, I'd greatly appreciate it.

I also added the last of the Popup changes in preparation for it becoming the default. The most notable changes are performance enhancements for English Wikipedia and a Popup History.

For v1.7.2, I'm planning to make some more non-English wiki related changes, thanks to some of the Russian wiki changes from last week (particularly, case-insensitive lookups for UTF-8 characters). I'm going to look at upgrading XOWA from XUL Runner 10 to 24, now that SWT 4.4 is officially released. I'm also going to try again to look at English Wikipedia images dump size. There'll probably be some GUI changes as well. For wikis, I'm going to do Malay and Swedish.

A change to the url format for archive.org links (2014-07-03 23:34 Thu)

A note on a minor bookkeeping change. Starting on 2014-06-22 and with the Latin wikis, I've removed the date based component from the url. So, with the example of the Latin wikis, the previous format was Xowa_labundle_2014-06-23_including_wikipedia, whereas the current format is Xowa_labundle_latest_including_wikipedia.

I made this change to reduce XOWA's disk-based footprint. In particular, I only want to keep one current version of a given wiki (latest), not multiple versions (2014-02-02, 2014-06-23, 2014-10-01, etc.). The multiple copies in the second approach just felt like they needlessly consumed disk space, even considering archive.org's mission to provide snapshots of the web. When the next update for Latin comes around, I'll just delete the files at Xowa_labundle_latest_including_wikipedia and upload a new version.

This new url format should make it easier for users to remember. Also, it takes care of another administrative task related to publishing the urls.

Hopefully I won't run into any issues with this approach. If I do, I'll just go back to the date-based urls.
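To make the two naming schemes concrete, here is a small sketch that builds both forms of the name. It only generalizes from the single Latin example above, so the assumption that other bundles follow the same `Xowa_<lang>bundle_..._including_wikipedia` pattern is mine, as are the class and method names. The `https://archive.org/details/` prefix is the one used for the links elsewhere in this blog.

```java
final class ArchiveItemName {
    // Previous, date-based format, e.g. Xowa_labundle_2014-06-23_including_wikipedia
    static String dated(String lang, String dumpDate) {
        return "Xowa_" + lang + "bundle_" + dumpDate + "_including_wikipedia";
    }

    // Current format: the date component is replaced by the fixed word "latest".
    static String latest(String lang) {
        return "Xowa_" + lang + "bundle_latest_including_wikipedia";
    }

    // Full archive.org details url for an item name.
    static String detailsUrl(String itemName) {
        return "https://archive.org/details/" + itemName;
    }

    public static void main(String[] args) {
        System.out.println(dated("la", "2014-06-23"));
        System.out.println(latest("la"));
        System.out.println(detailsUrl(latest("la")));
    }
}
```

The point of the second form is that the name no longer changes between updates, so a link published once stays valid.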

Release: v1.6.5 (2014-06-29 22:04 Sun)

v1.6.5 was mostly minor parser fixes for English Wikipedia. I also included a few minor changes to the Popups, thanks to feedback from Schnark.

v1.7.1 will be a stable release, so I'm going to try to keep changes to a minimum. I'm planning to work on some performance enhancements to popups, in preparation for making it the default in v1.7.1. I will also go back to reducing the size of the English Wikipedia images dump. For wikis, I'm going to try Serbian and maybe Russian. If Russian is too difficult (they tend to be the most experimental with Scribunto), then I'll switch them with Swedish instead.

Release: v1.6.4 (2014-06-22 21:42 Sun)

Due to a technical complication, I had to redo the English Wiki scan mid-week. As a result, I didn't get a chance to look at reducing the size of the dump. While waiting, I pretty much ended up spending all my time on the new Popups feature. For an initial version, I think it turned out fairly well. I'm hoping others will find it useful.

v1.6.5 will involve some work on minor parser issues. The latest English Wiki has a few issues that I didn't get a chance to review. I'll also look at some tweaks / performance enhancements to popups, and try to include some other UI changes. For wikis, I'm going to resume with Spanish and Romanian.

Release: v1.6.3 (2014-06-15 21:53 Sun)

v1.6.3 was mostly about infrastructure changes. I changed tidy to parse only page content, while cleaning up most of the HTML <head> generation. Both these items took longer than expected. I did manage to add options to handle auto-collapse / auto-expand, and included a few javascript refinements thanks to Schnark.

v1.6.4 will shift focus a bit. The enwiki dump finally started, and I'm going to try to change the update process to delete unused files. Enwiki is approximately 86.4 GB and has a turnover of 1 - 2 GB per month. If the unused images are deleted, I'm hoping to get back into the 70s, if not lower. I'm also going to explore an automated update option for these image databases. As most of my attention will be on enwiki this week, I'm going to skip generating wikis for other languages. Finally, I'm going to experiment with Hovercards in XOWA. If I'm lucky, I may be able to get something in for a v1.6.* release. Otherwise, it'll have to wait for XOWA HTML dumps.

Release: v1.6.2 (2014-06-08 21:56 Sun)

With v1.6.2, I included a significantly different JTidy. I ported as many of the changes from tidy over to JTidy as I could -- mostly in the various Parse classes. I still have more changes planned for JTidy, but hopefully the bulk of the compliance work has been done. I was able to do a full parse of all the English Wikipedia pages, and the new JTidy jar helped me discover a few minor issues in the Wikipedia namespace. I also added a bunch of minor UI changes, including new privacy options and several Tab actions (Close Others, Undo Close Tab, Select by #).

For v1.6.3, I'm planning to work on more JTidy related changes. Now that JTidy is fully integrated into XOWA, I need to remove some early XOWA tidy-like logic that prevents JTidy from fixing malformed HTML. I'll also add a few other minor UI changes as well, with some focus on Javascript changes (particularly auto-collapse all sections). For wikis, I'm planning on Indonesian (new) and Italian (update).

About the blog (2014-06-06 23:56 Sat)

I thought it might be useful to have a blog that covers various parts of XOWA's development. Its main objectives would be the following:

  • Give an idea of what's in progress for the next release
  • Highlight some of the more interesting issues from a recent or upcoming release.
  • Cover anything else related to XOWA

This is a new thing, so I'll be experimenting with style / format / content for a while. I'm hoping to put up at least an entry or two per week.

If you want to discuss a blog entry, feel free to start a post at https://sourceforge.net/p/xowa/discussion/general/
