Unlocking the Colonial Archive: Grant Will Bring Access to a Trove of Documents

Game-changing innovations that use artificial intelligence (AI) tools will improve access to Indigenous and Spanish colonial archives. “Unlocking the Colonial Archive: Harnessing Artificial Intelligence for Indigenous and Spanish American Historical Collections” is a collaborative project led by LLILAS Benson Latin American Studies and Collections at The University of Texas at Austin, the Digital Humanities Hub at Lancaster University, and Liverpool John Moores University. The project will transform “unreadable” digitized Indigenous and Spanish colonial archives into data that will be accessible to a broad spectrum of researchers and the public.

The project will be funded by a $150,000 collaborative grant from the National Endowment for the Humanities (NEH) as well as €250,000 (approx. US$304,000) from the UK’s Arts and Humanities Research Council (AHRC) through the joint New Directions for Digital Scholarship in Cultural Institutions program. Kelly McDonough, associate professor in the Department of Spanish and Portuguese, and Albert A. Palacios, digital scholarship coordinator at LLILAS Benson, will manage the project at UT Austin.

The Benson Latin American Collection at The University of Texas at Austin possesses one of the world’s foremost collections of colonial documents in Spanish and Indigenous languages of Latin America. Yet even when digitized, such documents are often neither searchable nor readable because of calligraphy, orthography, and the written language of the document itself. In tackling this problem, the collaborators propose to employ and develop interdisciplinary data science methods with three goals in mind: to expedite the transcription of documents using cutting-edge Handwritten Text Recognition technology; to automate the identification and linking of information through standardized vocabulary ontologies using Linked Open Data and Natural Language Processing techniques; and to facilitate the automated search and analysis of pictorial elements through Image Processing approaches.

The research will be based on three digital collections under the aegis of LLILAS Benson and one from the National Archive of Mexico. The LLILAS Benson collections are digitized Benson Collection colonial holdings, including the Relaciones Geográficas, 16th-century written and pictorial documents describing the geography and peoples of New Spain; the Royal Archive of Cholula at the Archivo Judicial del Estado de Puebla (Mexico), which was digitized through a Mellon-funded post-custodial grant; and the Primeros Libros de las Américas, a digitized collection of books published in the Americas before 1601.

McDonough and Palacios say that the project will further colonial Latin American studies not only at UT, but beyond, significantly facilitating the discoverability and interpretation of these materials. “While the work will begin with collections at the Benson and its Latin American partners, the technology developed will be accessible to libraries and archives worldwide, who can use it to automatically transcribe their digitized manuscripts,” Palacios said. In addition, “through the public workshops that are part of this project, we will train humanists on new innovative approaches that leverage the potential of machine learning to facilitate research,” McDonough added.

The geographical diversity among the project’s leadership and collaborators reinforces its global reach. The PIs are McDonough and Palacios of UT Austin, Patricia Murrieta-Flores of Lancaster University (UK), and Javier Pereda Campillo of Liverpool John Moores University (UK). Other collaborators hail from Germany, Mexico, Poland, Portugal, Spain, and Switzerland. Among the numerous participants from Mexico is Lidia García Gómez, history professor at the Benemérita Universidad Autónoma de Puebla, who was involved with the digitization of the Royal Archive of Cholula.


For more information: Susanna Sharpe, Communications Coordinator, LLILAS Benson, The University of Texas at Austin