The gender gap and Wikisource

As mentioned before, last month was Female Author Month on Wikisource. Combined with recent events, such as the increased coverage of misogynistic trolling in British media, I’ve been curious about the infamous Wikimedia gender gap. However, my main interest is Wikisource and, despite article after article, all coverage is relentlessly focussed on Wikipedia.

One of the few resources that does cover Wikisource is Spanish Wikipedian emijrp’s gender-gap-related “edits by project family” tool. This allows us to see the edits of declared-male and declared-female editors on each Wikimedia project. The following was the graph as at 6th September 2013:

Wikimedia gender gap chart as at 2013-09-06

Wikisource is the purple line.

I have been checking the graphs occasionally for a while, so I actually have a copy of the same from roughly a year ago (I wasn’t intending to keep constant records, it was just something I found interesting, so I don’t have anything precisely one year old; if that is possible via a different tool then I don’t know about it). This graph was the situation as at 1st June 2012:

Wikimedia gender gap chart as at 2012-06-01

It turns out Wikisource does really well. In the modern graph, Wikisource is almost always ahead of its sisters and the older graph shows it still being mostly ahead (or at least near the top) of the pack. Assuming these two periods are representative, we might actually be getting better: from an average of about 20/80 to 30/70. (Credit should go to Wikiquote as well for being the only project to achieve a female majority, and to do so in both graphs.)

(Small caveat: This data is based on the declared gender of each editor, set individually in their own preferences. It is possible there are more women editing but they haven’t declared their sex; or vice versa, men may even be under-represented here. If so, this could be recursive, as a reaction against perceptions of the gender gap or experience of being online; gender-anonymity may be a welcome break from the harassment.)

I tried checking Wikimedia’s gender gap mailing list for more but there wasn’t much about Wikisource, although there is some acknowledgement that it is better at equality than the other projects. A lot of the recent posts actually seemed to be men trying to explain sexism to women.

However, while good by Wikimedia standards, Wikisource’s edits still aren’t a 50/50 split. All things being equal, there should be parity between male and female editors. As there is no parity, it would follow that all things are not equal. Presumably the reasons are similar to those often cited for Wikipedia.

Nevertheless, Wikisource is apparently doing something better than its sisters.

It can’t be the interface, which is one of the reasons suggested for the gender gap in the past. Wikisource has the same interface as the others and a work-flow process that causes even experienced Wikipedians to run gibbering in terror. Of course, it could be that the semi-structured proofreading work-flow is more compliant with the general mindset of female users, but the gibbering in terror seems gender-equal from my purely anecdotal, not-even-slightly-scientific perspective.

It isn’t necessarily the lack of a strong social element (another potential reason). No Wikimedia project has this, so it’s hard to judge the effect.

Lack of free time would also be somewhat neutral between all projects. Proofreading a page is a simple micro-contribution but other projects have similar, and even easier, tasks. Adding a listing on Wikivoyage is probably the simplest, followed by adding an entry on Wikiquote (in my opinion at least).

It could be the environment. Wikisource is a much friendlier place than other projects. At least, that’s my opinion and part of the reason I ended up calling it my principal wiki. This could be due to the size of the project (about 300 active editors) but it might also be its nature. Research apparently shows that women are put off by argumentative and confrontational environments. Of all the sister projects, Wikipedia is an *especially* argumentative and confrontational environment, with virtual knife-fights over edits and gruelling wars of attrition to become the alpha-editor of a particular article. Wikisource does not offer quite as much fuel for confrontation. The words have already been written and cannot be changed. Individual expression comes in the form of choosing the material to transcribe, and it’s hard to even argue against that because the ultimate aspiration of the project is the transcription of all human art, literature and knowledge. This is not to say there are never any arguments or confrontation but they are rarer.

It might also be the nature of the project. Yet another reason suggested for the gender gap on Wikipedia is a lack of self-confidence among women, possibly as a result of socialisation. This may seem an odd tangent but I’ve heard it said that women do gardening while men do landscaping. It’s a joke about gendered language and thought: she nurtures plants and helps them grow; he asserts himself upon nature and bends the plants to his will. In this sense, Wikipedia is very much about asserting knowledge. Wikisource is more about nurturing, or curating, knowledge. The text already exists and Wikisource makes it better, allowing it to be easily read by, communicated to and re-used by many more people. Some of the other projects with a high proportion of female editors, such as Wikiquote, are similar in nature (i.e. identifying a quotation and adding it to the list requires a little more assertion, but not much more).

Some combination of the above may be in effect. For example, the increased complexity of the work-flow may be off-setting the friendlier environment. On the other hand, I may have missed something important.

Despite writing all of this, I admit that none of these thoughts really help Wikisource or its sister projects very much. The visual editor will eventually be deployed on Wikisource and even smaller micro-contributions (nano-contributions?) are planned for the future, so that covers some of the parts that may not even be part of the problem. We might be able to make more of the environment. I’m not sure if any other project could easily apply it, as it’s fundamental to the nature of the project itself.

Nevertheless, it was interesting to look at this.

Author demographics

August 2013 was Female Author Month on Wikisource, with two works by women transcribed from scratch via the community Proofread of the Month and a third work partly validated.[1] This is a result of a request for more works by female authors made on Scriptorium.

However, we don’t actually know if we have a significant dearth of female-authored works. We don’t have any demographic information about our authors beyond era, nationality (usually) and religion (sometimes).

Wikidata may help with that, whenever it is rolled out to the Wikisources. Amending each and every author page on English Wikisource would be hard work at the moment because the process would have to be mostly manual. However, with Wikidata, we wouldn’t even need a bot. The author header template (and maybe a Lua module) could just read the Wikidata “sex” property (P21) and apply a hidden tracking category.
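To make the idea a little more concrete, here is a rough sketch of the logic such a template or module might follow. It is written in Python rather than Lua, purely for illustration, and it assumes the author's Wikidata entity has already been fetched as JSON in the usual "claims" layout. The category names and the function name are hypothetical; nothing like them has been agreed on Wikisource.

```python
from typing import List

# Wikidata items commonly used as values of the "sex" property (P21).
# The category names are hypothetical; Wikisource has not defined them.
CATEGORY_BY_SEX = {
    "Q6581072": "Category:Female authors",
    "Q6581097": "Category:Male authors",
}
OTHER = "Category:Authors with another P21 value"          # hypothetical
UNKNOWN = "Category:Authors with no P21 value on Wikidata"  # hypothetical


def tracking_categories(entity: dict) -> List[str]:
    """Return hidden tracking categories for one author entity.

    `entity` is assumed to be a Wikidata entity in JSON form, with
    statements stored under entity["claims"]["P21"].
    """
    categories = []
    for statement in entity.get("claims", {}).get("P21", []):
        snak = statement.get("mainsnak", {})
        # Skip "unknown value" and "no value" statements.
        if snak.get("snaktype") != "value":
            continue
        item_id = snak.get("datavalue", {}).get("value", {}).get("id")
        category = CATEGORY_BY_SEX.get(item_id, OTHER)
        if category not in categories:
            categories.append(category)
    # An author with no usable statement is tracked too, so gaps show up.
    return categories or [UNKNOWN]
```

The same loop would work for any other property by swapping the property ID and the lookup table, and authors with no statement at all still land in a category of their own so that missing data is visible rather than silently ignored.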

This could be extended to other metadata. We could have tracking categories for the entire QUILTBAG[2] range with the addition of the “sexual orientation” property (P91) and whatever is used to cover transsexuality. Ethnicity might be possible with the “ethnic group” property (P172). There may be even more demographics worth tracking too, and these could be easily added over time.

This might bring to mind the recent controversy over Wikipedia moving female authors into a “female author” category ghetto, while leaving male authors categorised as just authors. However, Wikisource wouldn’t be discriminating as this approach would be fully automated and applied equally to all authors in the Authorspace. Hidden categories would avoid labelling authors too much; not to mention avoiding redundant information many readers could deduce from the name and/or portrait.

Then we would actually know where we stand.


  1. These were: Marriage as a Trade by Cicely Hamilton; Diaries of Court Ladies of Old Japan edited by Annie Shepley Omori; and Pride and Prejudice by Jane Austen.
  2. QUILTBAG: Queer Intersexual Lesbian Transsexual Bisexual Asexual Gay
    EDIT: Actually, I got this wrong.  The acronym stands for Queer/Questioning Undecided Intersex Lesbian Trans(-gender/-exual) Bisexual Asexual Gay.  See Wiktionary for a full definition and history.  Personally I would have merged the first two into Quantumsexual, but that’s just me.