Tag Archives: Digital Scholarship

Transforming Text: A Year of the Scan Tech Studio

The Scan Tech Studio (STS), located in the new PCL Scholars Lab, is a self-service facility designed to empower scholars and researchers in digitization, image processing, and text analysis projects. Equipped with advanced scanning equipment and software, the STS allows the UT community to independently digitize materials, apply optical character recognition (OCR) and handwritten text recognition (HTR), and engage in digital text analysis. From helping patrons scan historical documents to applying machine-readable techniques to modern texts, the STS has had an exciting first year guiding users in elevating their research.

The team behind this effort is the Scan Tech Studio Working Group, composed of seven librarians and digitization experts dedicated to helping scholars maximize the studio’s resources. We’re also grateful for the support of UT Libraries IT and the Scholars Lab Graduate Research Assistants, who keep everything running smoothly behind the scenes. The working group develops workshops, creates research guides, and promotes the use of digital scholarship tools related to OCR, HTR, and text analysis. Additionally, we offer guidance on copyright considerations and assist users in navigating the complexities of text recognition and analysis. Over the past year, the STS Working Group has been instrumental in fostering a dynamic learning environment within the Scholars Lab and building campus-wide connections to unlock the studio’s potential.

The working group has been dedicated to developing services that meet the evolving needs of the campus community. So far, our primary focus has been providing consultations and instruction related to digitization, OCR/HTR, and text analysis. With the diverse expertise of our team, we’ve been able to offer tailored, one-on-one consultations and small group sessions that help users think through the various stages of their digital projects, from planning to execution. Scheduling time with STS experts is simple through our user-friendly request form, ensuring patrons have easy access to specialized support.

Overall, we received 18 reservation requests, each from a user who consulted with an STS Working Group member, needed the space for digitization, and/or used our digital tools to OCR their materials. Many of these requests came from graduate students, specifically from the Department of History and the School of Information.

In addition to consultations, we’ve developed instructional tools such as a comprehensive research guide on research data management and the use of the studio’s equipment and software. The STS has also become a valuable teaching space, regularly hosting classes that integrate the studio’s technology into their curriculum, allowing students hands-on experience with advanced digitization tools and methods.

Reflecting on the past year, the STS has hosted several workshops inside and outside the studio to showcase its tools and demonstrate the possibilities to the campus community. For example, STS team members led workshops at this past summer’s Digital Scholarship Pedagogy Institute, focusing on digitization, OCR, and text analysis. Additionally, we contributed to the Digital Humanities Workshop Series, providing training in these specialized areas. 

It’s also worth noting that the working group dedicates time to internal development by hosting workshops for ourselves, allowing us to learn from one another and build up our collective skillset. As the saying goes, the best way to learn is to teach—and we’ve embraced this approach to better serve our users!

Because the Scan Tech Studio is a new service, we wanted to partner with existing programs and reach out to various centers. We invited centers around campus, such as JapanLab and the Center for Middle Eastern Studies, and gave them an overview of our services. This gave us great insight into the needs around campus regarding digitization and OCR.

Additionally, we provided training in specialized OCR tools such as ABBYY FineReader, a paid program from ABBYY that is available exclusively at the STS. It works exceptionally well for accurately OCRing text and for training. We had about 36 uses in just our first year in the space.

As we continue to see the success of our space, we are planning to expand our services and tools. We aim to create additional resources covering various OCR tools and processes. We also plan to continue to collaborate with the Digital Humanities Workshop series to present different OCR and text analysis tools. Additionally, we intend to develop workshops tailored to researchers, including pre-research and post-research workshops. These workshops will help researchers understand what they need to do when conducting their research to ensure a successful OCR experience and facilitate the beginning of text analysis upon their return. We look forward to seeing how the groundwork we laid during the first year will impact our service in the upcoming year.

As you can see, we have a lot of promising plans to build off the Scan Tech Studio’s successful first year. We look forward to continuing to grow the space as a new hub for digitization and text analysis on campus. Scan you feel the excitement? 

Illuminating Explorations: Music and Childhood Culture

Hannah Neuhauser, 2025 PhD in Musicology, Butler School of Music


“Illuminating Explorations” – This series of digital exhibits is designed to promote and celebrate UT Libraries collections in small-scale form. The exhibits will highlight unique materials to elevate awareness of a broad range of content. “Illuminating Explorations” will be created and released over time, with the intent of encouraging use of featured and related items, both digital and analog, in support of new inquiries, discoveries, enjoyment and further exploration.

Music is a portal and can unlock a door to a fantastical sonic landscape, brimming with mystic, melodic magic. We turn a page and open ourselves to discovering an entirely different realm, full of magic and mystery. Dangers may lurk around each corner, giants may want to gobble us up for lunch, and at times, the path may be so utterly twisted that we almost lose ourselves. Suddenly, the darkness becomes light, and in silence, we find ourselves back in the safety of our childhood bedrooms. The lion’s roar – a radiator. The pitter pattering of tiny Wild Things – the rain outside. Yet within a small, singular space, we traveled to another world and returned on the other side changed. In Throw the Book Away (2013), Amie A. Doughty remarks that regardless of how much a child reads, it is the experience of self-reliance and youthful agency that will ensure a protagonist’s survival in an unknown labyrinth. They must hear the warnings, read the signs, and act on their own.

Cover image of the score book Songs and a Sea Interlude by Oliver Knussen and Maurice Sendak, from the opera Where the Wild Things Are.

This exhibit recreates these aural portals. However, instead of reading a book, we invite you to immerse yourself in the experience of children’s music. The Music and Childhood Culture Spotlight Exhibit seeks to inform scholars of the rich history of children’s music by highlighting hidden gems from the UT Austin library collections. Did you know A.A. Milne commissioned his own songbook for Winnie the Pooh in 1929? Or that Carole King wrote a children’s television special called Really Rosie in 1975? It was a huge hit and we have the score, which you can check out to sing to your younger friends! Selections also range from audio recordings like Danny Kaye’s narration of Tubby the Tuba (1947) to Oliver Knussen’s operatic score of Where the Wild Things Are (1982) and a wealth of interdisciplinary scholarship from Mozart’s influence on childhood labor (Mueller) to the rise of Young People’s Records (Bonner). 

Cover image of the score book Really Rosie by Carole King

Music is a psychological tool for studying emotional regulation: “without rules or limitations, it is pure assimilation,” and media can stimulate fantasy, letting children pretend “as if” they are something else (Gotz et al., 2005, p. 13). Numerous scholars discuss the sentimentality and destruction of child development due to media dependency, but children will always form their own ideas about media to understand, transgress, rebel, and connect with their surroundings (Parry, 2013). Here, in this exhibit, we seek to highlight the positive attributes of musical media that allow children (and our inner child) to enact their own creative cultures through their imaginations and identify the “traces” of media that we value.

Cover image of the book Mozart and the Mediation of Childhood by Adeline Mueller

I hope you enjoy these discoveries as much as I did. 


Works Cited 

  • Doughty, Amie A. Throw the Book Away: Reading versus Experience in Children’s Fantasy. Jefferson, North Carolina: McFarland & Company, Inc., Publishers, 2013. 
  • Gotz, Maya, Dafna Lemish, Hyesung Moon, and Amy Aidman. Media and the Make-Believe Worlds of Children. Routledge, 2005. 
  • Parry, Becky. Children, Film, and Literacy. London: Palgrave Macmillan, 2013.

Data Analysis of Library Data

Anusha Ravi, a Scholars Lab Graduate Research Assistant (GRA), is entering her second year in the School of Information Science, specializing in Data Science and Analytics. During the last academic year, she undertook a Digital Scholarship Project as part of her GRA position. Collaborating with the Collection Development team, she cleaned, analyzed, and visualized data they had collected over the past few years.

I am a passionate data analyst with a keen interest in leveraging data to drive meaningful insights and decisions. My recent work as a Scholars Lab Graduate Research Assistant (GRA) has given me a valuable opportunity to apply my skills in a real-world setting, addressing practical challenges and contributing to the enhancement of the informational resources available at the library. My journey in data science is driven by a curiosity to explore data intricacies and a commitment to using technology for the greater good. As part of my responsibilities, I complete a digital scholarship project.

The data points in this graph have been anonymized to safeguard confidentiality.

As my digital scholarship project, I worked with the Collection Development team on improving the process of handling suggested purchase requests. These requests are crucial because analyzing them helps the team understand and enhance the breadth and depth of the collections available in the library. My role involved exploring historical data to identify gaps and understand its structure thoroughly for future enhancements. Collaborating with the Collection Development team, my stakeholders, I made sure their needs were clearly understood and actionable. This collaborative approach not only enriched my perspective but also aligned our efforts with the library’s strategic goals.

The data points in this graph have been anonymized to safeguard confidentiality.

Using Python, I cleaned and anonymized the data. Fixing missing values and ensuring data confidentiality was challenging, and learning to automate these processes was a significant achievement. Python’s versatility and powerful libraries were instrumental in this endeavor. I had to code with future use cases in mind, which proved insightful and improved my skills. Looking ahead, I aim to deepen my expertise in Python to automate more complex data workflows and improve efficiency further.
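As a rough illustration of the kind of cleaning and anonymization described above, here is a minimal sketch using only Python’s standard library. It is not the project’s actual code: the column names (`requester_email`, `title`, `department`), the salt value, the placeholder strings, and the sample rows are all hypothetical.

```python
import csv
import hashlib
import io

# Hypothetical sample of suggested-purchase requests; the real data is confidential.
RAW_CSV = """requester_email,title,department
ana@example.edu,Digital Paleography,History
,Text Mining Basics,
ana@example.edu,Maps of Memory,History
"""

# Assumption: a private salt that is kept out of any shared code or repository.
SALT = "replace-with-a-secret-value"


def anonymize(value: str) -> str:
    """Replace an identifier with a short, stable pseudonym via salted SHA-256."""
    digest = hashlib.sha256((SALT + value.strip().lower()).encode("utf-8")).hexdigest()
    return "user_" + digest[:10]


def clean_rows(csv_text: str) -> list[dict[str, str]]:
    """Fill missing values with explicit placeholders and pseudonymize requesters."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        email = (row.get("requester_email") or "").strip()
        cleaned.append({
            "requester_id": anonymize(email) if email else "unknown",
            "title": (row.get("title") or "").strip() or "untitled",
            "department": (row.get("department") or "").strip() or "unspecified",
        })
    return cleaned


rows = clean_rows(RAW_CSV)
for row in rows:
    print(row)
```

Because the same input always maps to the same pseudonym, repeat requesters can still be counted without exposing who they are. Salted hashing is pseudonymization rather than full anonymization, so the salt itself has to stay private.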

For data visualization, I turned to Tableau, known for its user-friendly interface and powerful visualization capabilities. Creating simple, interactive charts made it easier to communicate complex data insights to non-technical stakeholders. This was confirmed when I presented the dashboard to the Collection Development team, who praised it as simple but effective. Additionally, based on their feedback, I plan to create documentation on using Tableau so the team can navigate it easily in the future.

The Scholars Lab provided invaluable support, offering resources and expert advice that enhanced my analysis. Presenting my findings at a poster session was a highlight, showcasing the success and the practical recommendations for better data organization and future collection improvements. This project taught me the importance of stakeholder collaboration, secure data practices, and the continuous quest for automation and efficiency in data processes. 

Scholars Lab Newsletter – March 2024

Digital Humanities Workshop

 Introduction to Recogito

When: 3/8/24, 12:00 pm – 1:00 pm

Where: Zoom

Presenters: Miriam Santana and Willem Borkgren

Recogito is an open-source semantic annotation tool that allows you to tag key terms and reveal the relationships between key names, places, and events across multiple documents. Attendees will learn how to create an account, upload documents, and start working on tags and annotations. They will also learn the deeper capabilities of Recogito, such as mapping relationships, working collaboratively on a corpus of documents, and exporting data for use in other DH tools.

Zoom Registration

Introduction to Optical Character Recognition (OCR)

When: 3/22/24, 12:00 pm – 1:15 pm

Where: Hybrid – Zoom and Scholars Lab Data Lab, Perry-Castañeda Library

Presenters: Dale J. Correa, Mercedes Morris, & Natalya Stanke

This workshop introduces the basics of optical character recognition (OCR), which allows for full-text searching and other types of text manipulation of a digitized document. Attendees will learn how to use Google Docs to create a basic machine-readable text from an image file and be introduced to Tesseract for OCR through exercises in Google Colab.

This workshop is open to researchers interested in OCR for any language. It is strongly recommended that attendees:

1) prepare a digitized, highly legible sample image file for trying out the tools

2) have a Google account to do the exercises fully and save their work.
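For readers curious what a Tesseract exercise can look like, the sketch below is one possibility, not the workshop’s actual notebook. It assumes the third-party `pytesseract` and `Pillow` packages and the Tesseract engine are installed (for example in a Google Colab session). The file name `sample_page.png` and the helper names `ocr_image` and `tidy_ocr_text` are hypothetical.

```python
import re


def ocr_image(path: str, lang: str = "eng") -> str:
    """Run Tesseract on one image file and return the recognized text."""
    # Imported here so the cleanup helper below works without these packages.
    from PIL import Image
    import pytesseract

    with Image.open(path) as image:
        return pytesseract.image_to_string(image, lang=lang)


def tidy_ocr_text(text: str) -> str:
    """Apply light cleanup to raw OCR output."""
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)  # rejoin words hyphenated across lines
    text = re.sub(r"[ \t]+", " ", text)           # collapse runs of spaces and tabs
    text = re.sub(r"\n{3,}", "\n\n", text)        # keep at most one blank line
    return text.strip()


# The cleanup step can be tried on any string, with no OCR engine installed.
print(tidy_ocr_text("digiti-\nzation   of  texts"))  # prints: digitization of texts

# With Tesseract available, a scan could be processed like this (hypothetical file):
# print(tidy_ocr_text(ocr_image("sample_page.png", lang="eng")))
```

The `lang` argument selects the Tesseract language data to use, which matters for non-English material. Note that rejoining hyphenated line breaks will also merge genuinely hyphenated compounds that happen to split at a line end, so the result is worth spot-checking.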

Register for Zoom or PCL Scholars Lab Data Lab


Open Education Week Virtual Panel

When: 3/8/24, 1:00 pm – 2:00 pm

Where: Zoom

UT Austin’s OER Working Group invites you to celebrate Open Education Week (March 4-8) by joining our virtual faculty/student panel discussion on applying open education practices in your teaching. Our panelists will discuss their experiences finding, adopting, and even creating open educational resources (OER) and other no-cost course materials.

In addition to this faculty perspective, our panel will also include a student voice. Our student panelist is currently collaborating on an original OER project, bringing valuable and unique insight into how open pedagogy can transform student learning experiences.

Zoom Registration


Digital Scholarship in Practice

When: 3/8/24, 1:30 pm – 2:30 pm

Where: Scholars Lab Data Lab, Perry-Castañeda Library

Want to get started with Digital Humanities in the classroom, but you don’t know where to start? This introductory workshop will provide advice and practical ideas to incorporate digital humanities methodologies at all levels of teaching — from syllabus design to assignments and classroom activities. Learn about platforms, strategies, and resources to fit your classroom, your teaching style, and your comfort level with technology. While the advice given will apply to a wide variety of classrooms, the workshop will highlight resources specific to Japanese and East Asian Studies.

Scholars Lab Newsletter – February 2024

Digital Humanities Workshop Series

Digitization, Digital Projects, and Copyright Issues

When: Feb. 2, 2024, 12 pm – 1 pm 

Where: Perry-Castañeda Library Scholars Lab Project Room 6 (2.218)

Join us in-person for a discussion about some of the common copyright issues that pop up when digitizing materials or creating digital projects. We’ll have some scenarios to talk through as a group, but feel free to also bring your questions and we’ll try to discuss some of those scenarios as well.

In-Person Registration

Interactive Writing in Twine

When: Feb. 9, 2024, 12 pm – 1 pm

Where: Zoom 

Twine is an open-source application used to write interactive narratives ranging from fictional adventures to practical decision trees. This workshop will introduce the basics of Twine story creation: creating your first passage of text, linking passages, incorporating HTML and variables, and publishing a Twine project. The session will include a variety of example Twines of different complexity and purpose, and by the end, participants will have their own skeleton decision tree that they can expand into a larger text.

Zoom Registration

Getting Started with Scalar

When: Feb. 23, 2024, 12 pm – 1 pm

Where: Zoom 

Scalar is a free, open-source publishing platform designed for long-form, born-digital, and media-rich digital scholarship. This workshop will give an overview of Scalar and discuss what differentiates it from other content management systems, before demonstrating how to build your Scalar site.

Zoom Registration


Data & Donuts Workshop Series

 Research Data Management Best Practices

When: Feb 16, 2024, 12 pm – 1:15 pm

Where: Perry-Castañeda Library Scholars Lab Data Lab (2.202) and Zoom

This workshop will go over helpful strategies and techniques for effective research data management in all stages of the research lifecycle, from the drafting of comprehensive data management plans to successful publication of research data. Join this session to learn how to overcome data management challenges and stay in compliance with research data management regulations.

Zoom Registration


The Institute for Historical Studies in the Department of History Workshop

“Mapping Trauma: A Workshop on Space and Memory”

When:  Feb 19, 2024, 12 pm – 1:30 pm 

Where:  Perry-Castañeda Library Scholars Lab Data Lab (2.202) and Zoom 

Anne Kelly Knowles has been a leading figure in the Digital and Spatial Humanities, particularly in the methodologies of Historical GIS, for more than twenty years. She has written or edited five books, including Placing History: How Maps, Spatial Data, and GIS Are Changing Historical Scholarship (2008); Mastering Iron: The Struggle to Modernize an American Industry, 1800-1868 (2013); and Geographies of the Holocaust (2014). Anne’s pioneering work with historical GIS has been recognized by many fellowships and awards, including the American Ingenuity Award for Historical Scholarship (Smithsonian magazine, 2012), a Guggenheim Fellowship (2015), and three successive Digital Humanities Advancement grants from the National Endowment for the Humanities (2016-2022). She is a founding member of the Holocaust Geographies Collaborative, an international group of historians and geographers who explore the spatial aspects of the Holocaust through digital scholarship. She is currently developing a public website to share data on over 2,200 Holocaust camps and ghettos and nearly 1,000 survivor testimonies to enable students and scholars to map the historical geographies of named and unnamed Holocaust places.

Levi Westerveld is a geographer and award-winning cartographer with broad experience in spatial data gathering, analysis and visualization. He has 8 years of work experience in GIS and mapping for environmental modeling, impact assessments, community engagement and communication. Levi has international project management experience overseeing multidisciplinary teams with delivery in the Arctic and Pacific, and thematic knowledge in land and marine environmental issues, including climate change, waste and biodiversity. He is the lead editor of the forthcoming Arctic Permafrost Atlas. He is currently employed as senior engineer in the section for digitalization and innovation at the Norwegian Coastal Authority.

For In-person Registration email: cmeador@austin.utexas.edu

Zoom Registration


Digital Scholarship in Practice

Computational Approaches in the Study of History: The Case of People’s Daily

When: Feb 21, 2024, 12 pm to 1 pm 

Where: Perry-Castañeda Library Learning Lab 3

In this talk, we will explore what computational approaches and methods may look like in historical studies. Alongside the potential advantages, the talk will also discuss the limitations and pitfalls of computational historical analysis. We will focus on a case study of the People’s Daily 人民日报, a prominent national newspaper of the PRC, to demonstrate the outcomes and limitations of applying computational methods in historical research.

Scholars Lab Newsletter – November 2023

Digital Humanities Workshop Series

Getting Started with Omeka

When: Nov. 3, 2023, 12 pm – 1 pm 

Where: Zoom

Omeka is a free, open-source platform for creating digital archives, exhibitions, and more. This workshop will give an overview of the various versions of Omeka and their different uses, before covering how to set up a basic Omeka site.

Zoom Registration

Additional Information


Libraries Workshop

Patent Basics

When: Nov. 7, 2023, 11 am – 12 pm

Where: Zoom

A virtual workshop on patents aimed at a beginner audience. We will define patents as a type of intellectual property, describe the different ways in which patents can be useful to researchers, and show how to find patent documents on freely available websites such as Google Patents.

Zoom Registration

Additional Information

Author Profiles & Citation Metrics: An Introduction for Scholars

When: Nov. 8, 2023, 1 pm to 2 pm

Where: Zoom

Taking advantage of profile services and understanding publishing metrics can help you increase the discovery of your work and track its impact. This workshop will introduce you to ORCID and Google Scholar profile systems and give you some tips for making the most of these types of services. We will also highlight several widely used citation metrics (Impact Factors, h-indices, SJR indicators) and help to demystify what they mean and how to find them.
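To make one of these metrics concrete: an author’s h-index is the largest number h such that h of their publications have each been cited at least h times. A minimal sketch in Python follows; the citation counts are invented for illustration.

```python
def h_index(citations: list[int]) -> int:
    """Return the largest h such that h papers have at least h citations each."""
    h = 0
    # Rank papers from most to least cited; the h-index is the last rank
    # at which the citation count still meets or exceeds the rank.
    for rank, count in enumerate(sorted(citations, reverse=True), start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Invented citation counts for five papers.
print(h_index([10, 8, 5, 4, 3]))  # prints 4
```

Here four papers have at least four citations each, but there are not five papers with at least five, so the h-index is 4.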

Zoom Registration

Additional Information


The Theory & Practice of Digitization: A Community Symposium

When: Nov. 9, 2023, 4:45 pm – 7 pm

Where: The Scholars Lab, Perry-Castañeda Library

Join us in the Scholars Lab for a symposium on digitization. What gets digitized and how it gets digitized are decisions that affect everyone, but most of all, marginalized communities that have been historically disadvantaged from participation in scholarship and the building of library collections. Come and listen to lightning talks from cohort members trained in OCR and digitization, followed by a keynote address by Dr. Raha Rafii.

 Additional Information


UT GIS Day

GIS Day 2023 Celebration

When: Nov. 15, 2023, 12:30 pm to 5 pm

Where: Scholars Lab, Perry-Castañeda Library and Zoom

Join the UT Austin community in celebrating GIS Day 2023 on Wednesday, November 15th! GIS Day is an internationally acknowledged annual event held each November on the Wednesday of Geography Awareness week. It is a day dedicated to appreciating, discussing, and learning about GIS (geographic information system) technology and all that it enables.

Through our UT GIS Day events this year we hope to raise the profile of the innovative GIS work being carried out by the UT campus community and specifically highlight open geospatial research since 2023 has been designated a Year of Open Science by the White House Office of Science and Technology Policy (OSTP).

In Person & Zoom Registration

Additional Information


Sign up to receive monthly newsletter updates.

If you have any questions, please feel free to email scholarslab@austin.utexas.edu

Scholars Lab Newsletter – October 2023

Digital Humanities Workshop Series

Introduction to StoryMaps

When: Friday, October 13, 12-1 pm

Where:  Zoom

StoryMaps is a digital tool that enables you to craft a narrative using maps, images, videos, and text. This workshop session will provide an introductory overview of creating a digital exhibit with StoryMaps. Participants will learn to weave together data points, images, videos, and text to form engaging stories.

Zoom Registration


Data & Donuts

Customer Reviews Data

When: Friday, October 20, 12-1:15 pm

Where: Zoom

How much is a star really worth? This session will examine customer review data including how to use reviews effectively, how to spot fake reviews, and what consumers, companies and academic researchers do with customer review data.

Zoom Registration

Open Source Geographic Information Systems (GIS) 

When: Friday, October 27, 11 am – 12 pm

Where: Zoom and Perry-Castañeda Library (PCL), Scholars Lab, Data Lab

This workshop will provide an explanation of key geospatial terms and concepts and an introduction to open source geographic information system (GIS) software for visualizing, analyzing, storing, processing, and managing geospatial data. By the end of this session you should have the core knowledge required to start working effectively with geospatial datasets using open source tools.

In-person & Zoom Registration

More Information


OA Week 2023

Support for Open Access Publishing at UT

When: Tuesday, October 24, 12 – 1 pm

Where: Zoom

In this session we’ll talk about Libraries’ support for open access (OA) publishing, including support that eliminates article processing charges (APCs) for UT authors. We’ll discuss the main types of OA publishing business models (including OA book publishing), and how the Libraries is strategically investing in these options. Finally, we’ll show participants how they can share their work regardless of the publication model. This free session is open to anyone, but will be of most interest to faculty, students, and staff who publish scholarly content. Registration is required. 

Zoom Registration

Los del Valle Oral Histories Available at Libraries’ Collections Portal

The Benson Latin American Collection at The University of Texas at Austin has made a significant oral history archive featuring voices of the Rio Grande Valley of South Texas and Northern Mexico available online through the Libraries’ Collections Portal.

University of Texas Rio Grande Valley history professor Manuel F. Medrano launched the Los del Valle Oral History Project in 1993 with the goal of collecting and preserving historical memories in the Rio Grande Valley, a region that has been historically underrepresented in archival and published research. Many of the original interviews were broadcast in edited form on local public access television. The collection of nearly 300 videos was transferred to the Benson Latin American Collection in 2015.

Raw footage of an interview with Dr. Américo Paredes, 1995. Dr. Paredes discusses how his parents came to Brownsville, his advice for writers, and the publication of his dissertation, “With His Pistol in His Hand.”

“By making the Los del Valle Oral History Project fully available online, the Benson highlights the immense intellectual and cultural contributions of the people of the lower Rio Grande Valley to the state of Texas,” says John Morán González, J. Frank Dobie Regents Professor of American and English Literature and former director of the university’s Center for Mexican American Studies. “Scholars, students, and the general public now have access to key figures and ideas that will surely enrich our understanding of this unique borderlands region.”

Los del Valle (Spanish for “those of the Valley”) is a term used to describe Mexican Americans who live in rural South Texas, especially those in Hidalgo, Starr and Cameron Counties. These predominantly Mexican American communities, some of which predate the modern border between Mexico and the United States, represent a vibrant culture along this historically fluid border. Interviewees come from both sides of the modern border, and include writers Rolando Hinojosa-Smith, Carmen Tafolla and Oscar Cásares; scholar and folklorist Américo Paredes; educator Juliet Garcia; artist Carmen Lomas Garza; and accordionist Narciso Martínez. Other subjects include shrimp boat workers, Charro Days participants, World War II veterans and filmmaker Gregory Nava. These interviews cover a wide range of topics, from the early days of settlement in the region to the Chicano Movement and beyond.

An interview with Carmen Lomas Garza, a Chicana artist born in Kingsville, Texas, who talks about her art career. Lomas Garza talks about racial discrimination toward Mexican American families, and shares the influence and involvement of the Chicano movement in her life.

“Professor Manuel Medrano and his team have gifted us with an important resource that helps us understand the history of the Rio Grande Valley. By doing so, it places the RGV in the context of Texas and, more broadly, the U.S.,” says Maggie Rivas-Rodriguez, director of the Voces Oral History Center and the Center for Mexican American Studies.

“Oral history is key in documenting the perspective of the Latino community—too few Latinos/as will leave diaries, letters, and other records to a publicly accessible archive,” says Rivas-Rodriguez. “But even in the case of people like Américo Paredes, who did in fact leave his papers at the Benson, oral history provides context that would otherwise be unattainable.”

Interviews with Members of the 124th Cavalry Regiment at the 30th Annual Reunion. Interviews with members of the 124th Cavalry Regiment and their wives about their background, their memories of World War II, and what the reunion means to them.

Learn more about the specific holdings in the Los del Valle Oral History Project at Texas Archival Resources Online, or browse the online collection in the Libraries’ Collections Portal.

Los del Valle Oral History Project Archive was digitized with funds from the Latin American Materials Project (LAMP), Center for Research Libraries.

Read, Hot and Digitized: Indian Princely States Online Legal History Archive

Read, hot & digitized: Librarians and the digital scholarship they love — In this series, librarians from the UT Libraries Arts, Humanities and Global Studies Engagement Team briefly present, explore and critique existing examples of digital scholarship. Our hope is that these monthly reviews will inspire critical reflection of, and future creative contributions to, the growing fields of digital scholarship.


As a librarian, I can’t help but love a good bibliography. 

The first professional book I purchased after getting my first bibliographer job was Maureen Patterson’s South Asian Civilizations: A Bibliographic Synthesis. Over the course of many years, Patterson, the former Bibliographer of the South Asia Collection at the University of Chicago, enlisted the help of a small army of graduate students and library staff to identify and succinctly document citations of scholarly books and articles organized in the ways that academics think. Arranged by broad chronological and thematic categories, Patterson’s Bibliography was a life-saver for me while in graduate school. Whenever I ventured into unknown territory as a grad student, the Bibliography was the perfect launching pad, giving me recommendations to begin learning. Since then, as a librarian often called upon to help people in areas less familiar to me, I’ve turned to Patterson’s Bibliography over and over to learn, explore, and discover. My personal copy, now tattered and torn but always with lots of post-it notes and flags pointing me to particular areas, reveals just how helpful this work has been to me.

Author’s personal copy of South Asian Civilizations

And yet, as a print source, published only once in 1981, it is dated. Not just in terms of content (the way we think about South Asia has certainly changed since 1981!) but also in terms of its static functionality. Bibliographies are essentially curated lists of citations, that is, of metadata (“data about other data”). The intersection of online metadata and citations, namely in and through citation managers such as EndNote, ProCite, RefWorks, and Zotero, is fertile digital humanities ground wherein we can learn about new subject areas.

For example, I recently learned of a new bibliography for the study of legal history, the Indian Princely States Online Legal History Archive, or IPSOLHA.  IPSOLHA takes up the challenge of complex histories from the colonial period when there were “hundreds of semi-sovereign, semi-autonomous states across the South Asian subcontinent. Varying in size and authority, these states (sometimes referred to as native, feudatory, or zamindari states) were incubators for innovative legal, administrative, and political ideas and offered a unique counterbalance to the hegemony of British rule. Yet despite their unique history, studying these states is complicated by the scattered nature of their archival remains.” IPSOLHA’s intervention is to use the tools of the digital humanities “to build a database and collection of references to facilitate historical study of these states, with a special focus on their legal and administrative history.” 

Example of entries re: Princely States from Patterson’s Bibliography

Main collection of IPSOLHA, with options for sorting, display and visualization

Like Patterson’s Bibliography, IPSOLHA relies on student labor to investigate and document publications. Unlike Patterson, however, IPSOLHA uses the dynamic citation manager Zotero to gather relevant references from both online and analog resources, which are then uploaded into a database. The database sorts and presents the references in static thematic categories, but also in ways determined by the researcher, including by type, language, location, and more. At the time of this writing, IPSOLHA is primarily a discovery tool (like Patterson), but the hope is that, in time, discovery will lead to digitization projects and more online full-text access for researchers.

Display from IPSOLHA of Gazetteers
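
For readers curious about what researcher-determined sorting can look like in practice, here is a minimal Python sketch. It is not IPSOLHA's actual code or schema: the field names (type, language, location) simply mirror the facets mentioned above, and the three records are invented placeholders.

```python
from collections import defaultdict

# Hypothetical reference records. Field names and values are placeholders
# for illustration only, not IPSOLHA's real schema or data.
references = [
    {"title": "Example Gazetteer A", "type": "gazetteer",
     "language": "English", "location": "State X"},
    {"title": "Example Law Report B", "type": "law report",
     "language": "Urdu", "location": "State Y"},
    {"title": "Example Gazetteer C", "type": "gazetteer",
     "language": "Hindi", "location": "State X"},
]


def group_by(records, field):
    """Group record titles under each distinct value of the chosen field."""
    groups = defaultdict(list)
    for record in records:
        # Records missing the field are collected under "unknown".
        groups[record.get(field, "unknown")].append(record["title"])
    return dict(groups)


# The same list of citations, regrouped on demand by different facets.
by_type = group_by(references, "type")
by_location = group_by(references, "location")
```

The point of the sketch is that once citations are stored as structured metadata, the same list can be regrouped by any field the researcher chooses, which a printed bibliography with one fixed arrangement cannot do.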

IPSOLHA is a fabulous place both for beginning researchers to get started and for more advanced scholars of princely India to find hitherto unknown source materials. I encourage all to dive in and explore the possibilities.


Read, Hot and Digitized: Nuṣūṣ — A Corpus of Neglected Texts



While digital, machine-readable texts in Arabic are increasingly available, certain genres of writing and scholarship in Arabic have become more readily accessible than others. Among the more obscure disciplines are Sufism, theology (Muslim and Christian), and philosophy. These tend to be theoretically complex, and even dogmatically challenging, disciplines that are not as well represented in North American Islamic Studies programs as literature or Qur’anic studies. The Nuṣūṣ corpus, a project led by Antonio Musto, seeks to address some of these desiderata by putting more texts from these essential disciplines on the Internet for researchers to use.

A project that began with an almost exclusive focus on Sufism, Nuṣūṣ has expanded to include works from a variety of complex disciplines of Arabic-language scholarship produced by Muslims and Christians. The corpus currently contains 61 machine-readable texts, with plans to add more and to make the text files available for download. Unlike other, larger corpora of Islamicate[1] disciplines, Nuṣūṣ provides the bibliographic information for the modern editions from which these digitized texts are derived. This is not only a responsible move, but a useful one for researchers: modern editions of historic texts can differ greatly, so comparing modern editors’ approaches to the text, and the choices they make that affect meaning and understanding, is a rich area of exploration in Arabic-language digital humanities. It is hoped that, where possible, Nuṣūṣ will start to add multiple editions of historic texts in order to facilitate this comparative work.

Image of a table of Arabic-language works held in the Nusus corpus.
Nusus’s “Browse Corpus” page.

Nuṣūṣ’s aspirations lie in providing researchers with an adequate corpus from which to do computational text analysis. To that end, the team has created several different ways for researchers to access and engage with the texts. The “Browse Corpus” feature gives researchers an accurate sense of which specific items are included. If one is looking for a particular author or text, this would be the list to consult. This is also where crucial metadata (information about the item) is located, such as the origin of the digital images (Nuṣūṣ’s own OCR process or the OpenITI project repository), the internal corpus text ID, the date of the historic text’s alleged composition, the discipline, the genre of writing, the title, and the author. Author names link to biographies from the Encyclopaedia of Islam, and titles link to the WorldCat record for the modern edition used in the digitization of the text.

Image of a search for an exact term in the Nusus corpus.
Performing a search for the exact term “عقل” in the Nuṣūṣ corpus.

Furthermore, the Nuṣūṣ team has provided a cross-corpus search tool. Researchers can build a search using the provided fields and Boolean operators (AND, OR), and can specify whether they are searching for an exact term. It is also possible to confine the search to specific titles, authors, or genres. This arrangement encourages researchers to pursue projects that might compare across a scholar’s oeuvre, across a genre of writing (Muslim theology, philosophy, Sufism, or Christian theology), or across a single text. Researchers could use this tool to construct searches across known networks of scholars, as well. As the corpus expands, the ability to conduct searches and collect the resulting data will become increasingly effective and useful.
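
To make the logic of such a search concrete, here is a minimal Python sketch of a Boolean query with an exact-term option and a genre filter. It is an illustration under stated assumptions, not the Nuṣūṣ search tool itself: the function names, the record fields, and the three tiny sample texts are all hypothetical, and "exact" is approximated here as matching a whole whitespace-separated word.

```python
# Placeholder corpus: IDs, genres, and texts are invented for illustration
# and do not come from the Nusus corpus.
corpus = [
    {"id": "T1", "genre": "philosophy", "text": "العقل والنفس"},
    {"id": "T2", "genre": "sufism", "text": "عقل وقلب"},
    {"id": "T3", "genre": "theology", "text": "نفس وروح"},
]


def matches(text, term, exact=False):
    """Check one term: whole-word match if exact, otherwise substring match."""
    if exact:
        return term in text.split()
    return term in text


def search(records, terms, operator="AND", exact=False, genre=None):
    """Return IDs of records that satisfy the terms under AND or OR."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    combine = all if operator == "AND" else any
    hits = []
    for record in records:
        # Optionally confine the search to a single genre.
        if genre is not None and record["genre"] != genre:
            continue
        if combine(matches(record["text"], term, exact) for term in terms):
            hits.append(record["id"])
    return hits


loose = search(corpus, ["عقل"])                      # substring match
strict = search(corpus, ["عقل"], exact=True)         # whole-word match only
both = search(corpus, ["عقل", "نفس"], operator="AND")
either = search(corpus, ["عقل", "نفس"], operator="OR")
```

In this toy corpus the substring search also finds the word with the definite article attached, while the exact search does not, which is the kind of difference a researcher would want to keep in mind when choosing between the two modes.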

Readers interested in text and corpora analysis should consult the UT Libraries’ Digital Humanities Tools and Resources guide for more information on methods to apply to corpora like Nuṣūṣ. For recommendations of other corpora that might be useful for your research, consult the Data Set list on the Text Analysis guide. Lastly, as the Nuṣūṣ corpus partners with and derives from the OpenITI repository, it is worth considering the OpenITI repository documentation at the KITAB project. Happy corpus hunting!

Dale J. Correa, PhD, MS/LIS is Middle Eastern Studies Librarian and History Coordinator for the UT Libraries.


[1] The term Islamicate was coined by Marshall G.S. Hodgson in volume 1 of his The Venture of Islam (p. 57).