Read, Hot & Digitized: Visualizing Wikipedia’s Gender Gap

Read, hot & digitized: Librarians and the digital scholarship they love. In this new series, librarians from UTL’s Arts, Humanities and Global Studies Engagement Team briefly present, explore and critique existing examples of digital scholarship. Our hope is that these monthly reviews will inspire critical reflection on and future creative contributions to the growing fields of digital scholarship.

Wikipedia is a website that many of us use every day – yes, even us librarians! Wikipedia was founded on utopian ideals: a democratic approach to content creation and always-free, open knowledge. It therefore seems like the ideal platform to address structural inequalities in our information systems that reflect and reinforce racism, misogyny, homophobia, transphobia, and combinations thereof.

However, Wikipedia has a long-standing problem of gender imbalance in both article content and editor demographics. Only 18% of content across Wikimedia platforms is about women. The gaps in content covering non-binary and transgender individuals are even starker: less than 1% of editors identify as trans, and less than 1% of biographies cover trans or nonbinary individuals. When gender is combined with other factors, such as race, nationality, or ethnicity, the numbers get even lower. This gender inequity has long been covered in the scholarly literature via editor surveys and analysis of article content (Hill and Shaw, 2013; Graells-Garrido, Lalmas, and Menczer, 2015; Bear and Collier, 2016; Wagner, Graells-Garrido, Garcia, and Menczer, 2016; Ford and Wajcman, 2017). The Humaniki tool was developed to visualize these inequalities in nearly real time.

Humaniki was created in 2020 by merging two previous data visualization projects. Data scientist Maximilian Klein created the Wikidata Human Gender Indicators project in 2016. The French project Denelezh was created by Envel Le Hir for Wikimedia France in 2017. Both projects used the Wikidata API, and they merged because of their significant overlap and shared mission. Klein recently received a grant from the Wikimedia Foundation to continue this work. Humaniki is built using Python, and its backend code is available on GitHub.

Humaniki has many ways to explore this data. One of the most interesting is to look at the numbers based on language. Wikipedia isn’t just available in English, and Humaniki offers users the chance to look at gender representation for biographies in 529 languages! Another interesting data point is Year of Birth, and the trends in the Humaniki data suggest the gender gap closes slightly for biographies about younger people. For example, 23% of biographies on people born in 1963 are about women. For biographies on people born in 1983, however, 29% are about women. 

Humaniki also provides numbers of biographies on people who identify as “other genders” (people whose gender identity is not cisgender). For each metric, you can review the “Other Genders Breakdown,” which lists out all the gender identities (trans women, trans men, nonbinary, genderfluid, two-spirit, etc.) included in that particular data point. The “Other Genders” metric is important because the numbers are so stark. Looking back to our examples from 1963 and 1983, only 16 biographies in the 1963 dataset and 31 from 1983 are about people who don’t identify as cisgender – that’s out of more than 50,000 biographies! This highlights the great need to create and expand articles on people who identify outside of the traditional gender binary.
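To put the figures quoted above in proportion, here is a minimal Python sketch of the arithmetic. It uses only the numbers from this post and does not query Humaniki. The helper name `share_percent` is made up for illustration, and "more than 50,000" is treated as a floor for the pool of biographies, so the resulting share is an upper bound rather than an exact value.

```python
def share_percent(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * count / total


# Figures quoted in this post (not fetched from Humaniki).
other_1963 = 16        # "other genders" biographies, born 1963
other_1983 = 31        # "other genders" biographies, born 1983
pool_floor = 50_000    # assumption: "more than 50,000" used as a lower bound

# Dividing by the floor gives an upper bound on the true share.
combined = other_1963 + other_1983
upper_bound = share_percent(combined, pool_floor)

# Share of biographies about women, by year of birth.
women_1963 = 23.0
women_1983 = 29.0
point_change = women_1983 - women_1963
relative_change = 100 * point_change / women_1963

print(f"Other genders: at most {upper_bound:.3f}% of biographies")
print(f"Women: +{point_change:.0f} percentage points "
      f"({relative_change:.1f}% relative increase)")
```

Even on this generous reading, biographies of people outside the gender binary make up less than a tenth of one percent of the pool, while the share of biographies about women rises by six percentage points between the two birth years.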

Humaniki is a useful tool for building awareness of the Wikipedia gender gap, and there are many ways to act upon this knowledge and get involved. The UT Libraries sponsors multiple Wikipedia edit-a-thons focused on improving articles about women and LGBTQ+ people. Every March, we host Queering the Record, a homegrown edit-a-thon to improve queer and trans representation, and we participate in the international campaign Art + Feminism, which focuses on gender, feminism, and the arts. Additionally, we’ve hosted one-off edit-a-thons covering Latinx and Mexican women, Indigenous languages, and women and LGBTQ+ people in STEM fields. Keep an eye on the UT Libraries events page to learn about future edit-a-thons!

Scholarship and Popular Press on the Wikipedia Gender Gap

Bear, Julia B., and Benjamin Collier. “Where are the women in Wikipedia? Understanding the different psychological experiences of men and women in Wikipedia.” Sex Roles 74, no. 5-6 (2016): 254-265. 

Filipacchi, Amanda. “Wikipedia’s Sexism Toward Female Novelists.” The New York Times, April 24, 2013. 

Ford, Heather, and Judy Wajcman. “‘Anyone can edit’, not everyone does: Wikipedia’s infrastructure and the gender gap.” Social Studies of Science 47, no. 4 (2017): 511-527.

Gordon, Maggie. “Wikipedia Editing Marathons Add Women’s Voices to Online Resource.” Houston Chronicle, November 9, 2017. https://www.houstonchronicle.com/life/article/Adding-women-s-voices-to-Wikipedia-12344424.php

Graells-Garrido, Eduardo, Mounia Lalmas, and Filippo Menczer. “First women, second sex: Gender bias in Wikipedia.” In Proceedings of the 26th ACM Conference on Hypertext & Social Media, pp. 165-174. 2015.

Hill, Benjamin Mako, and Aaron Shaw. “The Wikipedia Gender Gap Revisited: Characterizing Survey Response Bias with Propensity Score Estimation.” PLoS ONE 8, no. 6 (2013): e65782.

Paling, Emma. “The Sexism of Wikipedia.” The Atlantic, October 21, 2015. https://www.theatlantic.com/technology/archive/2015/10/how-wikipedia-is-hostile-to-women/411619/

Stephenson-Goodknight, Rosie. “Viewpoint: How I Tackle Wiki Gender Gap One Article at a Time.” BBC News, December 7, 2016. https://www.bbc.com/news/world-38238312

“The Nobel Prize Winning Scientist Who Wasn’t Famous Enough for Wikipedia.” The Irish Times, October 3, 2018. https://www.irishtimes.com/life-and-style/people/the-nobel-prize-winning-scientist-who-wasn-t-famous-enough-for-wikipedia-1.3650212

Wagner, Claudia, Eduardo Graells-Garrido, David Garcia, and Filippo Menczer. “Women through the glass ceiling: gender asymmetries in Wikipedia.” EPJ Data Science 5 (2016): 1-24.
