Read, Hot & Digitized: Visualizing Wikipedia’s Gender Gap

In this new series, librarians from UTL's Arts, Humanities and Global Studies Engagement Team briefly present, explore and critique existing examples of digital scholarship. Our hope is that these monthly reviews will inspire critical reflection on, and future creative contributions to, the growing fields of digital scholarship.

Wikipedia is a website that many of us use every day – yes, even us librarians! Wikipedia was founded on utopian ideals, with a democratic approach to content creation and a commitment to always-free, open knowledge. It therefore seems like the ideal platform for addressing the structural inequalities in our information systems that reflect and reinforce racism, misogyny, homophobia, and transphobia, and combinations thereof.

However, Wikipedia has a long-standing problem of gender imbalance, both in article content and in editor demographics. Only 18% of content across Wikimedia platforms is about women. The gaps in content covering non-binary and transgender individuals are even starker: fewer than 1% of editors identify as trans, and fewer than 1% of biographies cover trans or nonbinary individuals. When gender is combined with other factors, such as race, nationality, or ethnicity, the numbers get even lower. This gender inequity has long been documented in the scholarly literature through editor surveys and analysis of article content (Hill and Shaw, 2013; Graells-Garrido, Lalmas, and Menczer, 2015; Bear and Collier, 2016; Wagner, Graells-Garrido, Garcia, and Menczer, 2016; Ford and Wajcman, 2017). The Humaniki tool was developed to visualize these inequalities in nearly real time.

Humaniki was created in 2020 by merging two earlier data visualization projects. Data scientist Maximilian Klein created the Wikidata Human Gender Indicators project in 2016. The French project Denelezh was created by Envel Le Hir for Wikimedia France in 2017. Both projects used the Wikidata API, and they merged because of their significant overlap and shared mission. Klein recently received a grant from the Wikimedia Foundation to continue this work. Humaniki is built using Python, and its backend code is available on GitHub.

Humaniki offers many ways to explore this data. One of the most interesting is to look at the numbers by language. Wikipedia isn’t just available in English, and Humaniki offers users the chance to look at gender representation for biographies in 529 languages! Another interesting data point is Year of Birth: the trends in the Humaniki data suggest the gender gap closes slightly for biographies about younger people. For example, 23% of biographies on people born in 1963 are about women. For biographies on people born in 1983, however, 29% are about women.

Humaniki also provides numbers of biographies on people who identify as “other genders” (people whose gender identity is not cisgender). For each metric, you can review the “Other Genders Breakdown,” which lists out all the gender identities (trans women, trans men, nonbinary, genderfluid, two-spirit, etc.) included in that particular data point. The “Other Genders” metric is important because the numbers are so stark. Looking back to our examples from 1963 and 1983, only 16 biographies in the 1963 dataset and 31 from 1983 are about people who don’t identify as cisgender – that’s out of more than 50,000 biographies! This highlights the great need to create and expand articles on people who identify outside of the traditional gender binary.
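To put those counts in perspective, here is a minimal sketch of the arithmetic in Python. It uses only the figures quoted above and assumes that each birth-year dataset holds at least 50,000 biographies (the post says only "more than 50,000"), so each result is an upper bound rather than an exact share. The function and variable names are ours for illustration and are not part of Humaniki.

```python
def share_percent(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


# Assumption: 50,000 is a lower bound on the size of each birth-year
# dataset, so dividing by it gives an upper bound on the true share.
TOTAL_LOWER_BOUND = 50_000

# Counts of "other genders" biographies quoted in the post.
other_gender_counts = {1963: 16, 1983: 31}

upper_bounds = {
    year: share_percent(count, TOTAL_LOWER_BOUND)
    for year, count in other_gender_counts.items()
}

for year, pct in upper_bounds.items():
    print(f"Born {year}: at most {pct:.3f}% of biographies")
```

Under that assumption, both shares come out well below one tenth of one percent.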

Humaniki is a useful tool for building awareness of the Wikipedia gender gap, and there are many ways to act upon this knowledge and get involved. The UT Libraries sponsors multiple Wikipedia edit-a-thons focused on improving articles about women and LGBTQ+ people. Every March, we host Queering the Record, a homegrown edit-a-thon to improve queer and trans representation, and we participate in the international campaign Art + Feminism, which focuses on gender, feminism, and the arts. Additionally, we’ve hosted one-off edit-a-thons covering Latinx and Mexican women, Indigenous languages, and women and LGBTQ+ people in STEM fields. Keep an eye on the UT Libraries events page to learn about future edit-a-thons!

Appreciating Ada Lovelace

Ada Lovelace was a pioneering computer scientist and mathematician of the 19th century. Since 2009, on the second Tuesday in October, people around the country and the globe have gathered to celebrate Ada Lovelace Day by commemorating her life and raising the profile of women and LGBTQ+ people in STEM fields. To honor her legacy, a group of librarians at UT planned and facilitated a daylong Wikipedia Edit-a-thon on October 8, 2019.

Beginning in earnest in mid-August, four librarians (Gina Bastone, Roxanne Bogucka, Lydia Fletcher, and I) sat together at a table in the Physics, Math and Astronomy Library to brainstorm ideas and organize what would turn out to be an amazing experience and a very meaningful event. The event drew more than 45 participants from across campus, who learned about the Wikipedia editing process and got their inaugural edits under their belts.

Organizing a successful Edit-a-thon requires considerable planning, forethought, and purpose. Our initial goals were to improve the visibility of women in STEM fields, to teach first-time editors the quirks of Wikipedia editing, and to democratize the process of editing Wikipedia, whose contributors are largely cis white men. Creating an accessible, drop-in event where folks could learn something, grab some food, and edit in between classes was also a priority. Asking participants to start the research process, identify useful Wikipedia-friendly sources, and create content, all while being oriented to the editing process, would have been a tall order. Reflecting on our cumulative past experience, we agreed that structuring the event to be largely self-guided was the best approach. Recognizing that the average participant might spend about an hour at the Edit-a-thon between classes, we identified pages that required editing and organized sources ahead of time, focusing specifically on local women in STEM. We reached out to campus groups such as Women in Physics, the Gender & Sexuality Center, and CNS-Q, who enthusiastically helped by spreading the word and providing extra sustenance on the day of the Edit-a-thon.

One of the event organizers guides a participant through the structure of a Wikipedia article.

We organized the day through a system of Google Drive links and physical sticky notes, which ensured that only one person would be editing a given article at any one time while still allowing more than one person to contribute to each article over the course of the day. Each sticky note carried the name of a single scientist. A participant would grab a note off the board, hold on to it while editing that article, and then return it to the board if the entry still needed further edits. The Google Drive folder contained supporting material for our selected topics in addition to a wealth of curated training documents. Many of these training documents were reused, and they can be reused again in the future. These tools allowed us to plan and coordinate the event without setting aside a fixed time for a formal demonstration.

Three of the event organizers standing in front of the whiteboard used to organize topics.

The Edit-a-thon was wildly successful and drew participation from many first-time editors in the College of Natural Sciences. While the turnout was better than we had expected, the true success was in the feedback. All of the respondents to our survey agreed that they had learned about editing Wikipedia and the construction of articles at the event, and 87% said that they plan to continue editing into the future. The goals of the planning group had been met and exceeded, encouraging us to run further events teaching the ins and outs of contributing to Wikipedia.