The gender gap and Wikisource

As mentioned before, last month was Female Author Month on Wikisource. Combined with recent events, such as the increased coverage of misogynistic trolling in British media, I’ve been curious about the infamous Wikimedia gender gap. However, my main interest is Wikisource and, despite article after article, all coverage is relentlessly focussed on Wikipedia.

One of the few resources that does cover Wikisource is Spanish Wikipedian emijrp’s gender-gap-related “edits by project family” tool. This allows us to see the edits of declared-male and declared-female editors on each Wikimedia project. The following was the graph as at 6th September 2013:

Wikimedia gender gap chart as at 2013-09-06

Wikisource is the purple line.

I have been checking the graphs occasionally for a while, so I actually have a copy of the same from roughly a year ago (I wasn’t intending to keep constant records, it was just something I found interesting, so I don’t have anything precisely one year old; if that is possible via a different tool then I don’t know about it). This graph was the situation as at 1st June 2012:

Wikimedia gender gap chart as at 2012-06-01

It turns out Wikisource does really well. In the modern graph, Wikisource is almost always ahead of its sisters and the older graph shows it still being mostly ahead (or at least near the top) of the pack. Assuming these two periods are representative, we might actually be getting better: from an average of about 20/80 to 30/70. (Credit should go to Wikiquote as well for being the only project to achieve a female majority, and to do so in both graphs.)

(Small caveat: This data is based on the declared gender of each editor, set individually in their own preferences. It is possible there are more women editing but they haven’t declared their sex; or vice versa, men may even be under-represented here. If so, this could be recursive, as a reaction against perceptions of the gender gap or experience of being online; gender-anonymity may be a welcome break from the harassment.)

I tried checking Wikimedia’s gender gap mailing list for more but there wasn’t much about Wikisource, although there is some acknowledgement that it is better at equality than the other projects. A lot of the recent posts actually seemed to be men trying to explain sexism to women.

However, while good by Wikimedia standards, Wikisource’s edits still aren’t a 50/50 split. All things being equal, there should be parity between male and female editors. As there is no parity, it would follow that all things are not equal. Presumably the reasons are similar to those often cited for Wikipedia.

Nevertheless, Wikisource is apparently doing something better than its sisters.

It can’t be the interface, which is one of the reasons suggested for the gender gap in the past. Wikisource has the same interface as the others and a work-flow process that causes even experienced Wikipedians to run gibbering in terror. Of course, it could be that the semi-structured proofreading work-flow is more compatible with the general mindset of female users, but the gibbering in terror seems gender-equal from my purely anecdotal, not-even-slightly-scientific perspective.

It isn’t necessarily the lack of a strong social element (another potential reason). No Wikimedia project has this, so it’s hard to judge the effect.

Lack of free time would also be somewhat neutral between all projects. Proofreading a page is a simple micro-contribution but other projects have similar, and even easier, tasks. Adding a listing on Wikivoyage is probably the simplest, followed by adding an entry on Wikiquote (in my opinion at least).

It could be the environment. Wikisource is a much friendlier place than other projects. At least, that’s my opinion and part of the reason I ended up calling it my principal wiki. This could be due to the size of the project (about 300 active editors) but it might also be its nature. Research apparently shows that women are put off by argumentative and confrontational environments. Of all the sister projects, Wikipedia is an *especially* argumentative and confrontational environment, with virtual knife-fights over edits and gruelling wars of attrition to become the alpha-editor of a particular article. Wikisource does not offer quite as much fuel for confrontation. The words have already been written and cannot be changed. Individual expression comes in the form of choosing the material to transcribe, and it’s hard to even argue against that because the ultimate aspiration of the project is the transcription of all human art, literature and knowledge. This is not to say there are never any arguments or confrontation but they are rarer.

It might also be the nature of the project. Yet another reason suggested for the gender gap on Wikipedia is a lack of self-confidence among women, possibly as a result of socialisation. This may seem an odd tangent but I’ve heard it said that women do gardening while men do landscaping. It’s a joke about gendered language and thought: she nurtures plants and helps them grow; he asserts himself upon nature and bends the plants to his will. In this sense, Wikipedia is very much about asserting knowledge. Wikisource is more about nurturing, or curating, knowledge. The text already exists and Wikisource makes it better, allowing it to be easily read, communicated to and re-used by many more people. Some of the other projects with a high proportion of female editors, such as Wikiquote, are similar in nature (i.e. identifying a quotation and adding it to the list requires a little more assertion, but not much more).

Some combination of the above may be in effect. For example, increased complexity off-setting the friendlier environment. On the other hand, I may have missed something important.

Despite writing all of this, I admit that none of these thoughts really help Wikisource or its sister projects very much. The visual editor will eventually be deployed on Wikisource and even smaller micro-contributions (nano-contributions?) are planned for the future, so that covers some of the parts that may not even be part of the problem. We might be able to make more of the environment. I’m not sure if any other project could easily apply it, as it’s fundamental to the nature of the project itself.

Nevertheless, it was interesting to look at this.

Why Wikisource?


One place to start when talking about Wikisource is, “Why bother?” There are many other digital libraries, from Project Gutenberg to the Internet Archive. What separates Wikisource from them?

In fact, this was an early response to the proposal of a Wikisource-like project back in 2001. Larry Sanger was one of the first to comment, saying:

The hard question, I guess, is why we are reinventing the wheel, when Project Gutenberg already exists? I mean, what really is the need for having this project?

This was closely followed by none other than Jimmy Wales himself, who said:

Like Larry, I’m interested that we think it over to see what we can add to Project Gutenberg. It seems unlikely that primary sources should in general be editable by anyone.

So what does separate Wikisource from similar projects? What are Wikisource’s unique selling points?

Wikisource has many things in common with other libraries but many unique qualities as well. A quick list of unique selling points, as I see them at least, would be accessible scans, crowdsourced proofreading and potential for added value. Gutenberg has proofreading but its sources are hidden. The Internet Archive has scans but only error-ridden computer-transcribed text. Other digital libraries fall into one or the other of these camps. Wikisource, however, combines the reliability of the scans with human-made transcriptions.

Together, these mean that anyone can contribute to proofreading, regardless of personal resources and access to texts. Once proofread, anything can be checked and corrected. If, for example, you doubt a spelling, a scan of the original page is just a click away, where it can be confirmed or corrected.

Added value, such as wikilinking certain terms or embedding spoken-word versions, just adds more to this already pretty solid foundation.