Scholars Lab Newsletter – March 2024

Digital Humanities Workshop

 Introduction to Recogito

When: 3/8/24, 12:00 pm – 1:00 pm

Where: Zoom

Presenters: Miriam Santana and Willem Borkgren

Recogito is an open-source semantic annotation tool that allows you to tag key terms and reveal the relationships among names, places, and events across multiple documents. Attendees will learn how to create an account, upload documents, and start working on tags and annotations. They will also learn the deeper capabilities of Recogito, such as mapping relationships, working collaboratively on a corpus of documents, and exporting data for use in other DH tools.

Zoom Registration

Introduction to Optical Character Recognition (OCR)

When: 3/22/24, 12:00 pm – 1:15 pm

Where: Hybrid – Zoom and Scholars Lab Data Lab, Perry-Castañeda Library

Presenters: Dale J. Correa, Mercedes Morris, & Natalya Stanke

This workshop introduces the basics of optical character recognition (OCR), which allows for full-text searching and other types of text manipulation of a digitized document. Attendees will learn how to use Google Docs to create a basic machine-readable text from an image file and be introduced to Tesseract for OCR through exercises in Google Colab.

This workshop is open to researchers interested in OCR for any language. It is strongly recommended that attendees:

1) prepare a digitized, highly legible sample image file for trying out the tools

2) have a Google account to do the exercises fully and save their work.
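
For attendees curious about what a Tesseract run looks like outside of a notebook, here is a minimal sketch in Python. It assumes the Tesseract command-line program is installed, and it only builds (and optionally runs) the basic command: input image, output base name, and the "-l" language option. The file names are hypothetical placeholders, and this is not the workshop's own exercise code.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Return the argument list for a basic Tesseract CLI run.

    Tesseract takes the input image, an output base name (it appends
    ".txt" itself), and "-l" to choose the language model.
    """
    return ["tesseract", image_path, output_base, "-l", lang]


def run_ocr(image_path, output_base, lang="eng"):
    """Run Tesseract if it is installed; return the text file path or None."""
    if shutil.which("tesseract") is None:
        return None  # Tesseract is not on PATH, so nothing is run
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    return output_base + ".txt"


# "sample_page.png" and "sample_page" are hypothetical file names.
command = build_tesseract_command("sample_page.png", "sample_page", lang="ara")
print(" ".join(command))
```

The language code ("eng", "ara", and so on) must match a language model installed alongside Tesseract, which is why a highly legible sample image in your language of interest is the best starting point.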

Register for Zoom or PCL Scholars Lab Data Lab


Open Education Week Virtual Panel

When: 3/8/24, 1:00 pm – 2:00 pm

Where: Zoom

UT Austin’s OER Working Group invites you to celebrate Open Education Week (March 4-8) by joining our student/faculty panel for a virtual discussion on applying open education practices in your teaching. Panelists will discuss their experiences finding, adopting, and even creating open educational resources (OER) and other no-cost course materials.

In addition to this faculty perspective, our panel will also include a student voice. Our student panelist is currently collaborating on an original OER project, bringing valuable and unique insight into how open pedagogy can transform student learning experiences.

Zoom Registration


Digital Scholarship in Practice

When: 3/8/24, 1:30 pm – 2:30 pm

Where: Scholars Lab Data Lab, Perry-Castañeda Library

Want to bring digital humanities into the classroom but don’t know where to start? This introductory workshop will provide advice and practical ideas for incorporating digital humanities methodologies at all levels of teaching, from syllabus design to assignments and classroom activities. Learn about platforms, strategies, and resources to fit your classroom, your teaching style, and your comfort level with technology. While the advice given will apply to a wide variety of classrooms, the workshop will highlight resources specific to Japanese and East Asian Studies.

Scholars Lab Newsletter – February 2024

Digital Humanities Workshop Series

Digitization, Digital Projects, and Copyright Issues

When: Feb. 2, 2024, 12 pm – 1 pm 

Where: Perry-Castañeda Library Scholars Lab Project Room 6 (2.218)

Join us in-person for a discussion about some of the common copyright issues that pop up when digitizing materials or creating digital projects. We’ll have some scenarios to talk through as a group, but feel free to also bring your questions and we’ll try to discuss some of those scenarios as well.

In-Person Registration

Interactive Writing in Twine

When: Feb. 9, 2024, 12 pm – 1 pm

Where: Zoom 

Twine is an open-source application used to write interactive narratives ranging from fictional adventures to practical decision trees. This workshop will introduce the basics of Twine story creation: creating your first passage of text, linking passages, incorporating HTML and variables, and publishing a Twine project. The session will include a variety of example Twines of different complexity and purpose, and by the end, participants will have their own skeleton decision tree that they can expand into a larger text.

Zoom Registration

Getting Started with Scalar

When: Feb. 23, 2024, 12 pm – 1 pm

Where: Zoom 

Scalar is a free, open-source publishing platform designed for long-form, born-digital, and media-rich digital scholarship. This workshop will give an overview of Scalar and discuss what differentiates it from other content management systems, before demonstrating how to build your Scalar site.

Zoom Registration


Data & Donuts Workshop Series

 Research Data Management Best Practices

When: Feb 16, 2024, 12 pm – 1:15 pm

Where: Perry-Castañeda Library Scholars Lab Data Lab (2.202) and Zoom

This workshop will go over helpful strategies and techniques for effective research data management in all stages of the research lifecycle, from the drafting of comprehensive data management plans to successful publication of research data. Join this session to learn how to overcome data management challenges and stay in compliance with research data management regulations.

Zoom Registration


The Institute for Historical Studies in the Department of History Workshop

“Mapping Trauma: A Workshop on Space and Memory”

When: Feb 19, 2024, 12 pm – 1:30 pm

Where: Perry-Castañeda Library Scholars Lab Data Lab (2.202) and Zoom

Anne Kelly Knowles has been a leading figure in the Digital and Spatial Humanities, particularly in the methodologies of Historical GIS, for more than twenty years. She has written or edited five books, including Placing History: How Maps, Spatial Data, and GIS Are Changing Historical Scholarship (2008); Mastering Iron: The Struggle to Modernize an American Industry, 1800-1868 (2013); and Geographies of the Holocaust (2014). Anne’s pioneering work with historical GIS has been recognized by many fellowships and awards, including the American Ingenuity Award for Historical Scholarship (Smithsonian magazine, 2012), a Guggenheim Fellowship (2015), and three successive Digital Humanities Advancement grants from the National Endowment for the Humanities (2016-2022). She is a founding member of the Holocaust Geographies Collaborative, an international group of historians and geographers who explore the spatial aspects of the Holocaust through digital scholarship. She is currently developing a public website to share data on over 2,200 Holocaust camps and ghettos and nearly 1,000 survivor testimonies to enable students and scholars to map the historical geographies of named and unnamed Holocaust places.

Levi Westerveld is a geographer and award-winning cartographer with broad experience in spatial data gathering, analysis and visualization. He has 8 years of work experience in GIS and mapping for environmental modeling, impact assessments, community engagement and communication. Levi has international project management experience overseeing multidisciplinary teams with delivery in the Arctic and Pacific, and thematic knowledge in land and marine environmental issues, including climate change, waste and biodiversity. He is the lead editor of the forthcoming Arctic Permafrost Atlas. He is currently employed as a senior engineer in the section for digitalization and innovation at the Norwegian Coastal Authority.

For In-person Registration email: cmeador@austin.utexas.edu

Zoom Registration


Digital Scholarship in Practice

Computational Approaches in the Study of History: The Case of People’s Daily

When: Feb 21, 2024, 12 pm to 1 pm 

Where: Perry-Castañeda Library Learning Lab 3

In this talk, we will explore what computational approaches and methods may look like in historical studies. Alongside the potential advantages, the talk will also discuss the limitations and pitfalls of computational historical analysis. We will focus on a case study of the People’s Daily 人民日报, a prominent national newspaper of the PRC, to demonstrate the outcomes and limitations of applying computational methods in historical research.

Scholars Lab Newsletter – November 2023

Digital Humanities Workshop Series

Getting Started with Omeka

When: Nov. 3, 2023, 12 pm – 1 pm 

Where: Zoom

Omeka is a free, open-source platform for creating digital archives, exhibitions, and more. This workshop will give an overview of the various versions of Omeka and their different uses, before covering how to set up a basic Omeka site.

Zoom Registration

Additional Information


Libraries Workshop

Patent Basics

When: Nov. 7, 2023, 11 am – 12 pm

Where: Zoom

A virtual workshop on patents aimed at a beginner audience. We will define patents as a type of intellectual property, describe the different ways in which patents can be useful to researchers, and show how to find patent documents on freely available websites such as Google Patents.

Zoom Registration

Additional Information

Author Profiles & Citation Metrics: An Introduction for Scholars

When: Nov. 8, 2023, 1 pm to 2 pm

Where: Zoom

Taking advantage of profile services and understanding publishing metrics can help you increase the discovery of your work and track its impact. This workshop will introduce you to ORCID and Google Scholar profile systems and give you some tips for making the most of these types of services. We will also highlight several widely used citation metrics (Impact Factors, h-indices, SJR indicators) and help to demystify what they mean and how to find them.
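
To make one of these metrics concrete, here is a minimal sketch of how an h-index is commonly computed: the largest number h such that h of an author's papers have at least h citations each. The citation counts are invented for illustration, and profile services may differ in which publications and citations they count.

```python
def h_index(citation_counts):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, citations in enumerate(ranked, start=1):
        if citations >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Invented citation counts for five papers by a hypothetical author.
example_counts = [10, 8, 5, 4, 3]
print(h_index(example_counts))  # prints 4
```

Four of the five papers have at least four citations each, but there are not five papers with at least five, so the h-index in this example is 4.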

Zoom Registration

Additional Information


The Theory & Practice of Digitization: A Community Symposium

When: Nov. 9, 2023, 4:45 pm – 7 pm

Where: The Scholars Lab, Perry-Castañeda Library

Join us in the Scholars Lab for a symposium on digitization. What gets digitized and how it gets digitized are decisions that affect everyone, but most of all, marginalized communities that have historically been at a disadvantage in participating in scholarship and in the building of library collections. Come and listen to lightning talks from cohort members trained in OCR and digitization, followed by a keynote address by Dr. Raha Rafii.

 Additional Information


UT GIS Day

GIS Day 2023 Celebration

When: Nov. 15, 2023, 12:30 pm to 5 pm

Where: Scholars Lab, Perry-Castañeda Library and Zoom

Join the UT Austin community in celebrating GIS Day 2023 on Wednesday, November 15th! GIS Day is an internationally acknowledged annual event held each November on the Wednesday of Geography Awareness Week. It is a day dedicated to appreciating, discussing, and learning about GIS (geographic information system) technology and all that it enables.

Through our UT GIS Day events this year, we hope to raise the profile of the innovative GIS work being carried out by the UT campus community and specifically highlight open geospatial research, since 2023 has been designated a Year of Open Science by the White House Office of Science and Technology Policy (OSTP).

In Person & Zoom Registration

Additional Information


Sign up to receive monthly newsletter updates.

If you have any questions please feel free to email scholarslab@austin.utexas.edu

Scholars Lab Newsletter – October 2023

Digital Humanities Workshop Series

Introduction to StoryMaps

When: Friday, October 13, 12-1 pm

Where:  Zoom

StoryMaps is a digital tool that enables you to craft a narrative using maps, images, videos, and text. This workshop session will provide an introductory overview of creating a digital exhibit with StoryMaps. Participants will learn to weave together data points, images, videos, and text to form engaging stories.

Zoom Registration


Data & Donuts

Customer Reviews Data

When: Friday, October 20, 12-1:15 pm

Where: Zoom

How much is a star really worth? This session will examine customer review data including how to use reviews effectively, how to spot fake reviews, and what consumers, companies and academic researchers do with customer review data.

Zoom Registration

Open Source Geographic Information Systems (GIS) 

When: Friday, October 27, 11 am – 12 pm

Where: Zoom and Perry-Castañeda Library (PCL), Scholars Lab, Data Lab

This workshop will provide an explanation of key geospatial terms and concepts and an introduction to open source geographic information system (GIS) software for visualizing, analyzing, storing, processing, and managing geospatial data. By the end of this session you should have the core knowledge required to start working effectively with geospatial datasets using open source tools.

In-person & Zoom Registration

More Information


OA Week 2023

Support for Open Access Publishing at UT

When: Tuesday, October 24, 12 – 1 pm

Where: Zoom

In this session we’ll talk about Libraries’ support for open access (OA) publishing, including support that eliminates article processing charges (APCs) for UT authors. We’ll discuss the main types of OA publishing business models (including OA book publishing), and how the Libraries is strategically investing in these options. Finally, we’ll show participants how they can share their work regardless of the publication model. This free session is open to anyone, but will be of most interest to faculty, students, and staff who publish scholarly content. Registration is required. 

Zoom Registration

Los del Valle Oral Histories Available at Libraries’ Collections Portal

The Benson Latin American Collection at The University of Texas at Austin has made a significant oral history archive featuring voices of the Rio Grande Valley of South Texas and Northern Mexico available online through the Libraries’ Collections Portal.

University of Texas Rio Grande Valley history professor Manuel F. Medrano launched the Los del Valle Oral History Project in 1993 with the goal of collecting and preserving historical memories in the Rio Grande Valley, a region that has been historically underrepresented in archival and published research. Many of the original interviews were broadcast in edited form on local public access television. The collection of nearly 300 videos was transferred to the Benson Latin American Collection in 2015.

Raw footage of an interview with Dr. Américo Paredes, 1995. Dr. Paredes discusses how his parents came to Brownsville, his advice for writers, and the publication of his dissertation “With His Pistol in His Hand.”

“By making the Los del Valle Oral History Project fully available online, the Benson highlights the immense intellectual and cultural contributions of the people of the lower Rio Grande Valley to the state of Texas,” says John Morán González, J. Frank Dobie Regents Professor of American and English Literature and former director of the university’s Center for Mexican American Studies. “Scholars, students, and the general public now have access to key figures and ideas that will surely enrich our understanding of this unique borderlands region.”

Los del Valle (Spanish for “those of the Valley”) is a term used to describe Mexican Americans who live in rural South Texas, especially those in Hidalgo, Starr and Cameron Counties. These predominantly Mexican American communities, some of which predate the modern border between Mexico and the United States, represent a vibrant culture along this historically fluid border. Interviewees come from both sides of the modern border, and include writers Rolando Hinojosa-Smith, Carmen Tafolla and Oscar Cásares; scholar and folklorist Américo Paredes; educator Juliet Garcia; artist Carmen Lomas Garza; and accordionist Narciso Martínez. Other subjects include shrimp boat workers, Charro Days participants, World War II veterans and filmmaker Gregory Nava. These interviews cover a wide range of topics, from the early days of settlement in the region to the Chicano Movement and beyond.

An interview with Carmen Lomas Garza, a Chicana artist born in Kingsville, Texas, who talks about her art career. Lomas Garza talks about racial discrimination toward Mexican American families, and shares the influence and involvement of the Chicano movement in her life.

“Professor Manuel Medrano and his team have gifted us with an important resource that helps us understand the history of the Rio Grande Valley. By doing so, it places the RGV in the context of Texas and, more broadly, the U.S.,” says Maggie Rivas-Rodriguez, director of the Voces Oral History Center and the Center for Mexican American Studies.

“Oral history is key in documenting the perspective of the Latino community—too few Latinos/as will leave diaries, letters, and other records to a publicly accessible archive,” says Rivas-Rodriguez. “But even in the case of people like Américo Paredes, who did in fact leave his papers at the Benson, oral history provides context that would otherwise be unattainable.”

Interviews with Members of the 124th Cavalry Regiment at the 30th Annual Reunion. Interviews with members of the 124th Cavalry Regiment and their wives about their background, their memories of World War II, and what the reunion means to them.

Learn more about the specific holdings in the Los del Valle Oral History Project at Texas Archival Resources Online, or browse the online collection in the Libraries’ Collections Portal.

Los del Valle Oral History Project Archive was digitized with funds from the Latin American Materials Project (LAMP), Center for Research Libraries.

Read, Hot and Digitized: Indian Princely States Online Legal History Archive

Read, hot & digitized: Librarians and the digital scholarship they love — In this series, librarians from the UT Libraries Arts, Humanities and Global Studies Engagement Team briefly present, explore and critique existing examples of digital scholarship. Our hope is that these monthly reviews will inspire critical reflection of, and future creative contributions to, the growing fields of digital scholarship.


As a librarian, I can’t help but love a good bibliography. 

The first professional book I purchased after getting my first bibliographer job was Maureen Patterson’s South Asian Civilizations: A Bibliographic Synthesis. Over the course of many years, Patterson, the former Bibliographer of the South Asia Collection at the University of Chicago, enlisted the help of a small army of graduate students and library staff to identify and succinctly document citations of scholarly books and articles organized in the ways that academics think. Arranged by broad chronological and thematic categories, Patterson’s Bibliography was a life-saver for me while in graduate school. Whenever I ventured into unknown territory as a grad student, the Bibliography was the perfect launching pad, giving me recommendations to begin learning. Since then, as a librarian often called upon to help people in areas less familiar to me, I’ve turned to Patterson’s Bibliography over and over to learn, explore, and discover. My personal copy, now tattered and torn but always with lots of post-it notes and flags pointing me to particular areas, reveals just how helpful this work has been to me.

Author’s personal copy of South Asian Civilizations

And yet, as a print source, published only once in 1981, it is dated. Not just in terms of content (the way we think about South Asia has certainly changed since 1981!) but also in terms of its static functionality. Bibliographies are essentially curated lists of citations, that is, of metadata (“data about other data”). The intersection of online metadata and citations, namely in and through citation managers such as EndNote, ProCite, RefWorks, and Zotero, is fertile digital humanities ground wherein we can learn about new subject areas.

For example, I recently learned of a new bibliography for the study of legal history, the Indian Princely States Online Legal History Archive, or IPSOLHA.  IPSOLHA takes up the challenge of complex histories from the colonial period when there were “hundreds of semi-sovereign, semi-autonomous states across the South Asian subcontinent. Varying in size and authority, these states (sometimes referred to as native, feudatory, or zamindari states) were incubators for innovative legal, administrative, and political ideas and offered a unique counterbalance to the hegemony of British rule. Yet despite their unique history, studying these states is complicated by the scattered nature of their archival remains.” IPSOLHA’s intervention is to use the tools of the digital humanities “to build a database and collection of references to facilitate historical study of these states, with a special focus on their legal and administrative history.” 

Example of entries re: Princely States from Patterson’s Bibliography

Main collection of IPSOLHA, with options for sorting, display and visualization

Like Patterson’s Bibliography, IPSOLHA is built upon student labor to investigate and document publications; but unlike Patterson, IPSOLHA has used the dynamic citation manager tool, Zotero, to gather relevant references from both online and analog resources which are then uploaded into a database.  The database sorts and presents the references in static thematic categories, but also in ways that can be determined by the researcher, including by type, language, location and more.  At the time of this writing, IPSOLHA is primarily a discovery tool (like Patterson), but in time, the hope is that the discovery will lead to digitization projects and more online full-text access for researchers.
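
The idea of researcher-determined sorting can be illustrated with a small sketch. The Python below groups a handful of citation records by whichever field the researcher picks. The records and field names ("type", "language", "location") are invented placeholders, not IPSOLHA's actual data or schema, and this is not how the project's database is implemented.

```python
from collections import defaultdict

# Invented sample records; the field names are hypothetical.
references = [
    {"title": "Sample gazetteer", "type": "gazetteer",
     "language": "English", "location": "Example State A"},
    {"title": "Sample law report", "type": "report",
     "language": "Urdu", "location": "Example State B"},
    {"title": "Sample administrative report", "type": "report",
     "language": "English", "location": "Example State A"},
]


def group_by(records, field):
    """Group record titles under each value of the chosen metadata field."""
    groups = defaultdict(list)
    for record in records:
        groups[record.get(field, "unknown")].append(record["title"])
    return dict(groups)


# The same records, regrouped by two different fields.
print(group_by(references, "type"))
print(group_by(references, "language"))
```

A print bibliography fixes one arrangement at publication time; once citations live as structured metadata, the same list can be rearranged on demand.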

Display from IPSOLHA of Gazetteers

IPSOLHA is a fabulous place both for beginning researchers to get started and for more advanced scholars of princely India to find hitherto unknown source materials. I encourage all to dive in and explore the possibilities.

Learn more about:

Read, Hot and Digitized: Nuṣūṣ — A Corpus of Neglected Texts

While digital, machine-readable texts in Arabic are growing in their availability, certain genres of writing and scholarship in Arabic have become more readily accessible than others. Among those more obscure disciplines are Sufism, theology (Muslim and Christian), and philosophy. These tend to be theoretically complex, and even dogmatically challenging, disciplines that are not as well represented in North American Islamic Studies programs as literature or Qur’anic studies. The Nuṣūṣ corpus––a project led by Antonio Musto––seeks to fill in some of the desiderata by putting more texts from these essential disciplines up on the Internet for researchers to use.

A project that began with an almost exclusive focus on Sufism, Nuṣūṣ has expanded to include works from a variety of complex disciplines of Arabic-language scholarship produced by Muslims and Christians. The corpus currently contains 61 machine-readable texts, with plans to add more and to make the text files available for download. Differing from other, larger corpora of Islamicate[1] disciplines, Nuṣūṣ provides the bibliographic information for the modern editions from which these digitized texts are derived. This is not only a responsible move, but a useful one for researchers: modern editions of historic texts can differ greatly; comparing modern editors’ approaches to the text and their choices that affect meaning and understanding is therefore a rich area of exploration in Arabic-language digital humanities. It is hoped that, where possible, Nuṣūṣ will start to add multiple editions of historic texts in order to facilitate this comparative work.

Nuṣūṣ’s “Browse Corpus” page.

Nuṣūṣ’s aspirations lie in providing researchers with an adequate corpus from which to do computational text analysis. To that end, the team has created several different ways for researchers to access and engage with the texts. The “Browse Corpus” feature gives researchers an accurate sense of which specific items are included. If one is looking for a particular author or text, this would be the list to consult. This is also where crucial metadata (information about the item) is located, such as the origin of the digital images (Nuṣūṣ’s own OCR process or the OpenITI project repository), the internal corpus text ID, the date of the historic text’s alleged composition, the discipline, the genre of writing, the title, and the author. Author names link to biographies from the Encyclopaedia of Islam, and titles link to the WorldCat record for the modern edition used in the digitization of the text.

Performing a search for the exact term “عقل” in the Nuṣūṣ corpus.

Furthermore, the Nuṣūṣ team has provided a cross-corpus search tool. Researchers can build a search using the provided fields and Boolean operators (AND, OR), and can specify whether they are searching for an exact term. It is also possible to confine the search to specific titles, authors, or genres. This arrangement encourages researchers to pursue projects that might compare across a scholar’s oeuvre, across a genre of writing (Muslim theology, philosophy, Sufism, or Christian theology), or across a single text. Researchers could use this tool to construct searches across known networks of scholars, as well. As the corpus expands, the ability to conduct searches and collect the resulting data will become increasingly effective and useful.
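
As a rough illustration of how such a search behaves, here is a minimal Python sketch with AND/OR operators, an exact-term option, and a genre filter. It is only a toy model: the three "texts" are short invented phrases, the identifiers are hypothetical, and the Nuṣūṣ tool's actual implementation is not documented here.

```python
# Invented miniature corpus; ids, genres, and phrases are placeholders.
corpus = [
    {"id": "text-1", "genre": "philosophy", "text": "العقل والنفس"},
    {"id": "text-2", "genre": "Sufism", "text": "عقل وقلب"},
    {"id": "text-3", "genre": "Sufism", "text": "قلب ونفس"},
]


def term_in_text(text, term, exact):
    """Exact mode matches whole whitespace-separated tokens only."""
    if exact:
        return term in text.split()
    return term in text


def search(corpus, terms, operator="AND", exact=False, genre=None):
    """Return ids of texts matching the terms, optionally within one genre."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    combine = all if operator == "AND" else any
    hits = []
    for doc in corpus:
        if genre is not None and doc["genre"] != genre:
            continue  # the search is confined to the requested genre
        if combine(term_in_text(doc["text"], t, exact) for t in terms):
            hits.append(doc["id"])
    return hits


print(search(corpus, ["عقل"]))              # substring match
print(search(corpus, ["عقل"], exact=True))  # whole-token match only
```

The difference between the two calls shows why the exact-term option matters for Arabic: a loose match on “عقل” also finds the form with the definite article attached (“العقل”), while an exact match does not.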

Readers interested in text and corpus analysis should consult the UT Libraries’ Digital Humanities Tools and Resources guide for more information on methods to apply to corpora like Nuṣūṣ. For recommendations of other corpora that might be useful for your research, consult the Data Set list on the Text Analysis guide. Lastly, as the Nuṣūṣ corpus partners with and derives from the OpenITI repository, it is worth consulting the OpenITI repository documentation at the KITAB project. Happy corpus hunting!

Dale J. Correa, PhD, MS/LIS is Middle Eastern Studies Librarian and History Coordinator for the UT Libraries.


[1] The term Islamicate was coined by Marshall G.S. Hodgson in volume 1 of his The Venture of Islam (p. 57).

Read, Hot, and Digitized: The Freedmen’s Bureau Search Portal

Government documents can offer crucial insight into the histories of a nation, but traditional access can require skill with microfilm readers, resources to travel to an archive and an astute understanding of how to use an index. As cultural heritage institutions take on more digitization projects, researchers have benefited from remote access to digital collections complemented by user-friendly browse and search features. This past November, the National Museum of African American History and Culture (NMAAHC) gifted scholars and genealogists alike with the Freedmen’s Bureau Search Portal, a valuable new platform to discover 1.7 million pages of digitized records from the Bureau of Refugees, Freedmen, and Abandoned Lands.

Researchers can download a PDF of records that includes the transcribed text side by side with the scanned record image.

Created in 1865, the Bureau of Refugees, Freedmen, and Abandoned Lands, more commonly known as the Freedmen’s Bureau, aspired to help Southerners, including 4 million formerly enslaved people, transition to a new society after the Civil War. Congress charged the Bureau with providing social support like medical care, rations and educational opportunities, and the Bureau also tried to help poor individuals deal with seized lands and find employment. Congress abolished the short-lived Bureau in 1872, and its positive impact on formerly enslaved people is still debated. However, the utility of these records for genealogical and scholarly purposes is certain, as they offer valuable insight into the Reconstruction period, including government policies and interactions between freedmen, white southerners and government officials.

Previously, portions of these records were available online for browsing, but they were not always searchable or in one place. The NMAAHC interface allows users to filter records by collection, record type, location and date. In addition to a keyword search, these features help users discover materials like ledgers of employment, marriage records and reports describing criminal and civil disputes. Thanks to efforts to index names and locations, users can also search the names of enslaved people and former owners, which is of particular use to genealogists and individuals researching family histories.

The indexing was the first step to the collection portal’s debut on the Smithsonian-developed digital asset management system, “Enterprise Digital Asset Network (EDAN)”. This system connects multiple Smithsonian digital collections and allows users to access metadata using the institution’s own API. The user-friendly search interface is built using the open source search platform, Apache Solr, which UT Libraries also uses for our own Collections portal.
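
For readers unfamiliar with Solr, a keyword search combined with filters is typically sent to a Solr "select" endpoint as a "q" parameter plus one "fq" (filter query) parameter per filter. The Python sketch below builds such a URL. The base URL, core name, and field names ("record_type", "state") are hypothetical, and it does not reflect the actual schema or endpoints of the NMAAHC portal or EDAN.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_solr_query(base_url, keyword, filters=None, rows=10):
    """Build a Solr select URL: keyword in q, each filter as its own fq."""
    params = [("q", keyword), ("rows", str(rows)), ("wt", "json")]
    for field, value in (filters or {}).items():
        # Quote the value so multi-word filter values are treated as a phrase.
        params.append(("fq", f'{field}:"{value}"'))
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical local Solr core and field names, for illustration only.
url = build_solr_query(
    "http://localhost:8983/solr/records",
    "marriage",
    filters={"record_type": "ledger", "state": "Texas"},
    rows=5,
)
parsed = parse_qs(urlparse(url).query)
print(url)
print(parsed["fq"])
```

Keeping filters in "fq" rather than folding them into "q" mirrors how faceted interfaces usually work: the keyword determines relevance ranking, while each filter simply narrows the result set.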

Screenshot showing the search results page for record locations indexed from Texas. Users can quickly review the transcribed text from the results page without having to scroll through the scans.

What makes the NMAAHC’s search portal especially notable is its support from a crowdsourcing transcription project, a collaborative endeavor from the NMAAHC and Smithsonian Transcription Center. This is the largest crowdsourcing project the Smithsonian has ever undertaken and so far, 400,000 pages have been transcribed by volunteers. The records’ cursive script makes it challenging to automatically transcribe using OCR, and the project will greatly benefit from transcription efforts. These efforts are invaluable as the letters and reports that provide more details beyond statistical ledgers are more often than not untranscribed.

Screenshot showing percentage completion of Freedmen’s Bureau transcription projects from the Smithsonian Transcription Center.

For now, users can still search the indexed data for names, places and dates, and additional information provided by volunteers in their transcription efforts like subjects and keywords. The records themselves and the transcription project will provide scholars a glimpse into life during the Reconstruction period and allow genealogy researchers to make meaningful connections with ancestors and family histories.

Explore more in these UT Libraries resources:

New UT Libraries Database! African American Heritage

  • Digital resource exclusively devoted to African American family history research, containing primary sources such as census records, vital records, freedman and slave records, church records, legal records, and more.

Crouch, Barry A. The Freedmen’s Bureau and Black Texans. University of Texas Press, 1999.

Farmer-Kaiser, Mary. Freedwomen and the Freedmen’s Bureau: Race, Gender, and Public Policy in the Age of Emancipation. Fordham University Press, 2010.

Mears, Michelle M. And Grace Will Lead Me Home: African American Freedmen Communities of Austin, Texas, 1865-1928. Texas Tech University Press, 2009.

Read, Hot & Digitized: Art and Revolution

Read, hot & digitized: Librarians and the digital scholarship they love — In this new series, librarians from the UT Libraries Arts, Humanities and Global Studies Engagement Team briefly present, explore and critique existing examples of digital scholarship. Our hope is that these monthly reviews will inspire critical reflection of, and future creative contributions to, the growing fields of digital scholarship.

Working at the Nettie Lee Benson Latin American Collection since I began a career in librarianship, I have been fortunate to witness and sometimes participate in various facets of what goes into making the Benson the premier Latin American collection in the world. The collection has many incomparable features, and depending on their interests, researchers will come to know the Benson in different ways. For instance, there are those who know the Benson because we hold the papers of Gloria Anzaldúa and Alicia Gaspar de Alba, two groundbreaking Chicana writers. Others will know it because of the Archive of Indigenous Languages of Latin America (AILLA), the digital archive that is a gateway to linguistic preservation and revitalization. Still others will know it because of our wonderful circulating collection, which includes journals, new publications, canonical works, children’s literature, etc. At the Benson we always say that if it exists and is tied to Latin American or US Latinx subject matter, we try to collect it.

One unsurprising aspect of the Benson is our dedication to documenting human rights initiatives. This happens across all of the ways that we do collecting, but I’m thinking specifically about the work that my colleague Theresa Polk and the Latin American Digital Initiatives team do on a daily basis, particularly working with post-custodial partners throughout Latin America to document local, often grassroots struggles.

I couldn’t help but think of her work when I saw a noteworthy digital collection from the University of New Mexico’s esteemed Center for Southwest Research. The collection, “Asamblea de Artistas Revolucionarios de Oaxaca Pictorial Collection,” is described as a “collection of prints, posters, and mural stencils…created by a collective of young Mexican artists that formed during the state of Oaxaca’s 2006 teachers strike.” The strike lasted seven months and turned violent after police opened fire on non-violent protestors representing the teachers’ union. Eventually, various groups forced the police out of the city and set up an anarchist community for several months while unsuccessfully calling for the resignation of then-Oaxacan governor Ulises Ruiz Ortiz. The 127 artworks in this collection reflect this period through themes that include “land rights, political prisoners, government corruption, political violence, police brutality, violence against women, art exhibitions and the nationalization of agriculture and oil.”

The artwork has been digitized and made available on the site using high-resolution scans. One of the strengths of the collection is that users can see a thumbnail and a brief, but useful description of the document, as shown below.

Then, users can click on each individual item for a larger image with richer metadata. Indeed, another strength of the collection is its metadata. While only in English, it contextualizes the image for a deeper understanding.

Another feature of the digital collection is that UNM’s Center for Southwest Research has worked with the Asamblea de Artistas Revolucionarios de Oaxaca (ASARO) to archive their blogs and other born-digital materials using Archive-It. Having access to these blogs in a shared digital space enhances the collection because it preserves ASARO’s voices on the struggle, using their words and their language. Like the metadata, this creates fuller meaning for researchers while fostering a relationship between ASARO and UNM.

This collection is useful to researchers and classes who are interested in understanding politics and local movements in twenty-first century Mexico. As with the Benson’s Latin American Digital Initiatives, its themes are varied, making it a useful tool for classes doing interdisciplinary work, and particularly for scholars who are more visually inclined. In any case, it is a welcome contribution to the study of human rights in Latin America, and a wonderful reminder of the work that libraries do in documenting and preserving historical moments.

Would you like to know more about the teachers’ strike? Check out the following resources we hold at UT Libraries.

La batalla por Oaxaca (2007)

“Women in the Oaxaca Teachers’ Strike and Citizens’ Uprising” (2007)

“‘Our Culture’s Not for Sale!’: Music and the Asamblea Popular de los Pueblos de Oaxaca in Mexico” (2021)

Introducing Rozha: A Tool to Simplify Multilingual Natural Language Processing

In my role as European Studies Liaison, one of my priorities is to assist people in their digital humanities work. In that work, I have found a glaring gap in tools that support multilingual and non-English materials, particularly those that focus on natural language processing (NLP). Much of the work that has been done using NLP has been focused on an Anglocentric model, using English texts in conjunction with tools and computer models that are primarily designed to work with the English language. I wanted to make it easier for people to begin engaging with non-English materials within the context of their NLP and digital humanities work, so I created Rozha.

Rozha, a Python package designed to simplify multilingual natural language processing (NLP) processes and pipelines, was recently released on GitHub and PyPI under the GNU General Public License, allowing users to use and contribute to the tool with minimal limitations. The package includes functions to perform a wide variety of NLP processes using over 70 languages, from stopword removal to sentiment analysis and many more, in addition to visualizations of the analyzed texts. It also allows users to choose from NLTK, spaCy, and Stanza for many of the processes it can perform, allowing for easy comparison of the output from each library. Examples of the code being used can be seen here.

While the project first grew out of the needs of researchers and graduate students working at UT-Austin who were interested in exploring NLP and the digital humanities using non-English languages but who did not have very much prior coding experience, its code also aims to streamline NLP work for those with more technical knowledge by simplifying and shortening the amount of code they need to write to accomplish tasks. Output from the package’s functions can be integrated into more complex and nuanced workflows, allowing users to use the tool to perform standard tasks like word tokenization and then use the response for their other work.
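To make the idea of chaining standard tasks concrete, here is a minimal sketch of tokenization, stopword removal, and a simple downstream frequency count. It uses only the Python standard library and does not use Rozha's API; the function names (tokenize, remove_stopwords, top_terms) and the tiny stopword lists are assumptions made up for illustration, and a real workflow would rely on the far larger lists shipped with libraries such as NLTK, spaCy, or Stanza.

```python
import re
from collections import Counter

# Tiny, hand-picked stopword sets for illustration only (an assumption,
# not the lists used by Rozha or by the libraries it wraps).
STOPWORDS = {
    "en": {"the", "and", "of", "a", "in", "to"},
    "es": {"el", "la", "los", "las", "de", "y", "en", "que"},
    "de": {"der", "die", "das", "und", "in", "zu"},
}


def tokenize(text):
    """Lowercase the text and split it into Unicode-aware word tokens."""
    return re.findall(r"\w+", text.lower())


def remove_stopwords(tokens, lang):
    """Drop tokens that appear in the stopword set for the given language."""
    stop = STOPWORDS[lang]
    return [token for token in tokens if token not in stop]


def top_terms(text, lang, n=3):
    """Downstream step: count the most frequent non-stopword tokens."""
    tokens = remove_stopwords(tokenize(text), lang)
    return Counter(tokens).most_common(n)


if __name__ == "__main__":
    sample = "La huelga de los maestros y la huelga en Oaxaca"
    print(top_terms(sample, "es"))
```

The point of the sketch is the shape of the workflow: each step returns a plain Python list or list of pairs, so its output can be passed directly into whatever analysis comes next.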

The package is written in Python for a variety of reasons. Python has a wide base of users, which makes the package easy to share and helps ensure that it will be used widely. It also makes it more likely that people will contribute to the project, building upon its existing code. Fostering contributions for multilingual digital humanities and NLP can help broaden the community of scholars, coders and researchers working with these multilingual materials while also improving the package. Python is also very commonly used for NLP applications, and the packages integrated into Rozha all have robust communities of their own. This allows users to connect with those communities as well, and to explore these technologies on their own for applications beyond what this package provides.

The Rozha package ultimately aims to make multilingual digital humanities and natural language processing more accessible and to simplify the work of those already working in the field, and perhaps to open up new avenues to explore for newcomers and established NLP practitioners. My hope is that this tool will help encourage diversity in the NLP landscape, and that people who may have felt it too daunting to work with materials in non-English languages may now feel more comfortable through the ease of working with this package. Beyond that, I hope the package will serve as a conduit for additional contributions and collaboration, and that the code will ultimately help strengthen the field and community of practitioners working with non-English materials.