Wikisource and e-books


Wikimedia in general can now produce e-books in EPUB format on demand.  However, Wikisource was actually there first and is ahead of the pack in this area.

Wikimedia projects, such as Wikisource’s sister project Wikipedia, use the “book tool” to collect pages into books.  These can be printed and bound as print books via PediaPress, as well as produced in electronic formats.  Initially PDF was the main format and recently EPUB has been added.  A problem from the point of view of Wikisource is that this tool does not take into account its specific qualities; it was built for Wikipedia and ignores the other projects.  For example, it adds a notice in the back matter of its output that claims a Creative Commons licence.  This is entirely accurate for many projects but amounts to copyfraud when applied to Wikisource’s public domain works.

Anyway, there is an alternative.  France is the technological home of Wikisource.  Hebrew was the first language-specific Wikisource and English is currently the largest, but the technology on which Wikisource runs always seems to emanate from French Wikisource.  In this case, the tool WS Export was originally developed by French wikisourcers for French Wikisource and works for all language domains.  It supported EPUB before the book tool and looks likely to support Mobipocket first too.  More importantly, the tool and its output work better with Wikisource and attend to Wikisource’s quirks.

In November 2012, 3,700 EPUB works were produced by this tool.  Not surprisingly, French Wikisource produced the most (1,176), followed by Italian Wikisource (1,049) and English Wikisource (674).  Other EPUBs ranged from Breton (br) to Farsi (fa) to Venetian (vec).




I didn’t notice it when I was proofreading the page.

I didn’t notice it when I was transcluding the page.

I did notice it somewhere beneath Belgravia when I was re-reading it on my Bebook One.

Typos or, more accurately, “scannos” (uncorrected OCR errors) are a constant problem.  At least for me.  Despite all the measures to prevent them, I still find some later, on my third re-read of material I proofread in the first place.

In many ways, proofreading is never really complete.  There is always the chance that you missed something, no matter how many times you read through it.

Help wanted


Help pages, that is.  Wikisource is a little short of documentation and some of the pages it does have need to be updated.  Often the information is around, usually buried in the archives of community discussion or somewhere equally arcane.  This is no use, of course, to novice wikisourcers or casual passers-by, who do not possess the “secret knowledge” necessary to answer their questions.  I know that even some experienced Wikimedians have problems grasping the general processes of Wikisource.

Lack of helpful help pages is not a problem unique to Wikisource but as our standard workflow is a little more complicated than most of our sister projects we should try to make things as easy as possible for anyone looking for some guidance.

There is a push this month to redevelop the Help: namespace and attempt to get all the necessary help pages up and running.  In my view, they don’t even need to be complete (although that would be ideal).  Even basic information can point a user in the right direction, giving them something to start with and maybe some terminology and keywords for searches or further questions.  At its most extreme, help page stubs with little or no content allow us to track known gaps in our help.   This is a step up from a complete lack of documentation combined with an equal lack of knowledge about that gap.

Anyone with experience of Wikisource can lend a hand by writing (from a paragraph to a whole page), updating or reviewing help pages.  Novices and those who do not wish to write help pages themselves can help by pointing out areas that need more explanation and documentation.  We can only provide help if we know what help is needed.

Pulps, letters and science fiction fans


In the process of my ongoing work to put Weird Tales and other pulps on Wikisource, I have found letters pages one of the more awkward things to transcribe. One of my recent tweaks is adding author pages for every published letter writer.

In the past I have found published authors and notable people among these epistoleans, many of whom I did not know prior to this. Some were found by idly googling their names; some were listed on the Internet Speculative Fiction Database (ISFDB); some only turned up when I wikilinked their name and it wasn’t red.

In any case, they are all technically published authors and Wikisource has no notability restrictions. Besides which, I’m not able to pick out just the “important” ones.

Therefore, author pages for all of them.

On the downside: A lot of these authors would be treated as trivial and certainly wouldn’t make it onto Wikipedia. Fortunately, as mentioned, Wikisource’s criterion is generally publication rather than notability. It is also going to be difficult, if not impossible, to get much metadata beyond anything noted in the letter.

On the upside: There is a certain democracy to everyone getting an author page for writing a letter to a pulp magazine in the 1930s. This also serves to create a record of fans and readers of these magazines, with at least a little metadata, not to mention a historic record of people who may not otherwise have one. More practically, it enables tracking of people with multiple published letters, especially if over different magazines.



I was googling not so long ago and found there was a lack of coverage of Wikisource in the blogosphere.

Some time later the obvious solution occurred to me. So here is a brand new blog devoted to Wikisource. Primarily the English Wikisource but other flavours may be covered.

If you don’t know what Wikisource is, it is one of Wikipedia’s little sisters: a digital library of public domain and freely licensed works that anyone can help to build, mostly by transcribing scanned copies of paper-and-ink books to create electronic versions available online.

I have some ideas for future posts and I’m sure (or at least, hope) I’ll come up with more over time. As it stands, I intend to highlight the best parts of Wikisource, try to promote the project, try to make some issues better known and generally cover as many facets of the project as I can. I may also branch out into occasional sister and cousin projects that affect Wikisource.

I’ll start posting soon.
