Common wikisource

As a follow up to my ramblings about Multilingual Wikisource: I have heard some people ask why all Wikisources are not Multilingual Wikisource, like Commons. (I have even heard “Why isn’t Wikisource part of Commons?”)

The latter is easily answered. Aside from the fact that Wikisource needs specific technology to function, it has a different scope and mission to Commons, which would clash if both were part of the same project.

There are many reasons for the former. I think the original reason had something to do with right-to-left text, which has been solved by now. Others still stand, however.

Disambiguation would be a nightmare, for example. The Bible is complicated enough in English on just one project. Multiple editions in each of hundreds of languages would be ridiculous. This could be solved with, say, namespaces but there are a finite number of namespaces in the MediaWiki software. Besides, the difference between a namespace and a language subdomain is negligible from a technological point of view. The same goes for disambiguation for that matter. A language subdomain is just a bigger version of the concept.

On a different tangent, while Commons is technically multilingual—and a lot of work has gone into supporting that—it is still predominantly English. Community communication is overwhelmingly done in English, English is the default for categories and templates, and so forth. Some grasp of English is often necessary to function on Commons. Language subdomains allow the monolingual (and the multilingual but not anglophone) Wikimedians to take part too, which is more important in curating a library than a media depository.

Obviously, now that we actually have language subdomains, we also have the problems of different cultures and communities on the different projects. Italian doesn’t allow translation, German doesn’t allow non-scans, French doesn’t allow annotation; while some languages, like English and Spanish, are pretty promiscuous in their content. There are likely to be many more, seemingly trivial, quirks that are at odds across different projects. If anyone ever did attempt unification, these communities would clash and conflict all over the place, probably ending in either mutually assured destruction or a very small surviving user base.

You may as well ask why Wikipedia bothers with language subdomains when it could just be Multilingual Wikipedia, like Commons.

Multilingual ramblings

“Old Wikisource” (oldwikisource:) is the incubator of the Wikisources. Languages that do not yet have enough works in their library are all held here, from Akkadian to Zulu, before later potentially budding off into their own projects. They are not part of the actual Incubator because Wikisource relies on specific technology that is not installed there (and probably would need to be heavily adapted to fit it).

One problem this creates is that “oldwikisource” is not a recognised ISO 639 language code. Interwiki links do not work. Wikidata will have a hard time indexing it. No one really knows it’s there.

Fortunately, the International Organization for Standardization predicted situations such as this and included a few extra codes in their set. One of these is “mul” for multiple languages, for situations where databases need to categorise things by language but where some of those things have many. This could mean, for example, mul.wikisource, or even mul.wikipedia, mul.wikibooks, etc (although those are just possibilities, not suggestions).

In other words, exactly what Wikimedia requires for Old Wikisource. Mul could be used for interwiki links from other Wikisources, bringing some attention and potential traffic to an otherwise excluded and ostracised project. Mul could be used on Wikidata to collect and connect pages. Mul is already used in some parts of Wikisource to refer to the not-sub-domain.

It also helps that Old Wikisource, while accurate as the original project, is not as easily explained to Wikimedians on our sister projects as is “English Wikisource”, “French Wikisource” or, as it happens, “Multilingual Wikisource”.

So, for preference, I would see Old Wikisource become Multilingual Wikisource. I think it would make lots of things easier, while making the project more visible, more functional, and slightly more obvious to outsiders. It must be said that I am not a regular on Old Wikisource and those that are may not agree.

Fully enabling ISO 639 in Wikimedia would technically affect user language options too. A user could conceivably select “Multiple” as their preferred language, regardless of where they were in Wikimedia. In practice, this would probably just default to English, so I don’t think it would be a big problem.

More serious would be the amount of trouble this would be to implement. Just creating an alias for Old Wikisource would be easiest, so the code could be used as described without really changing much.

In my view, moving the project entirely is still better: most existing pages would go to mul.wikisource.org and just a portal would remain at wikisource.org (in line with its sister projects like Wikipedia). If changes are going to be made, we might as well go all the way rather than patch the system with aliases. That’s a lot of work for relatively little gain though, and I don’t know how keen the current Old Wikisourcers would be on this option (nor the technical people who would have to do all the heavy lifting).

I haven’t actually made any proposal based on this (some related bug reports have been open for years, however). I’m still not sure what would be best nor what the wider community would prefer and I’m just thinking, or typing, out loud. This is just a blog after all.

As it stands, though, my opinion is that Multilingual Wikisource would probably work better than Old Wikisource.