Librarian Lens: Requesting Information via the Freedom of Information Act (FOIA)

Meryl Brodsky, Information & Communication Librarian, and Juliana Kasper. Juliana Kasper received a Master's in Information Studies from UT's iSchool in 2024, where she conducted interdisciplinary research on autonomous vehicles using PIA requests. She now works in records and information management as a Records Analyst for the Texas state government.


The Freedom of Information Act (FOIA) provides the public the right to request access to records from any federal agency. Federal agencies are required to disclose information requested under the FOIA unless it falls under one of nine exemptions which protect interests such as personal privacy, national security, and law enforcement.

The FOIA requires agencies to post some categories of information online, including frequently requested records. FOIA is part of informing the public about government activities. If you think there is agency information you need, the first step is to see whether it is already available online. It may be in a FOIA Library or a FOIA Reading Room posted on an agency website. As an example, here's a link to the Electronic Reading Room from the National Archives: https://www.archives.gov/foia/electronic-reading-room.

You will need to know which agency produces this information. Here is a list of federal agencies: https://foia.wiki/wiki/Agencies_Landing_Page. Each federal agency is a separate entity. There is no central office. For example, you must determine if you need information from the Department of Education or whether what you’re really looking for is local information about schools. You can request information from federal, state and local government agencies.

Once you’ve identified which agency has the information you are looking for, you must send them your FOIA request in the manner in which they specify. It could be a letter or an email or an online form. You need to make your request very specific. Structure your request so that whoever responds can easily find the information and get the information to you. In addition, you must specify the format in which you wish to receive your information. https://www.foia.gov/how-to.html

If you need to make multiple similar requests, it is valuable to build out a template to batch them. You can plug in words/terms based on the specific request. For example: “I am requesting [INSERT RECORD TYPE i.e: emails, meeting minutes] the [DEPARTMENT, AGENCY, OFFICE, ETC.] has collected on [INSERT SPECIFIC, SEARCHABLE TERM] related to [INSERT TOPIC OF INTEREST]. I aim to obtain records documenting [INSERT MORE CONTEXT/RELEVANT DETAILS ON THE SEARCHABLE TERM] related to [TOPIC] from [Month Date Year to Month Date Year].”
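For those comfortable with a bit of scripting, the sketch below shows one way such a template could be filled in for a whole batch of requests at once. It is purely illustrative: the placeholder names and sample values are hypothetical, not part of any official FOIA form.

```python
# Illustrative sketch only: filling a FOIA-style request template for a
# batch of requests. Placeholder names and sample values are hypothetical.
REQUEST_TEMPLATE = (
    "I am requesting {record_type} the {office} has collected on "
    "{search_term} related to {topic}. I aim to obtain records documenting "
    "{details} related to {topic} from {start_date} to {end_date}."
)

requests_to_send = [
    {
        "record_type": "emails and meeting minutes",
        "office": "Department of Transportation",
        "search_term": "autonomous vehicle incident reports",
        "topic": "driverless vehicle testing",
        "details": "reported incidents and follow-up actions",
        "start_date": "January 1, 2022",
        "end_date": "December 31, 2023",
    },
    # ...add one dictionary per additional request...
]

for fields in requests_to_send:
    print(REQUEST_TEMPLATE.format(**fields))
    print("-" * 60)
```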

Agencies generally process requests in the order in which they are received. However, the information you seek may not be available immediately, depending on its complexity. Complex requests may be large or they may require searching for records from different locations or from different time periods. FOIA requests can take longer than a semester, so if you are interested in this information for a class project, you may need to start early, or use data that’s already available.

Sometimes there are fees for photocopying or other services. For example, the CIA posts its fee schedule right on its FOIA page, though its information is generally free for academic pursuits.

You may even need to follow up to see where your request is in the process. Most of the agencies list people to call. Do not hesitate to get in touch with them. Getting access to public information is your right, even if you are not a citizen. However, it may take some effort. You might receive the response of “no responsive information” to a request. Sometimes the agency doesn’t keep the information you requested, especially if the information comes from a public/private partnership.

The Population Research Center on UT's campus has restricted information on population health and well-being; reproductive, maternal, and infant health; family demography and human development; and education and institutions. This restricted information is generally private, but you may be able to access it for research purposes. If your research falls into this realm, you may apply to use their data, but you should first contact an administrator to see if your proposal is feasible. Using this type of data requires a duty of care to protect study participants, even if the data is unrestricted public information. The data may be used if it is required for a research project and the researcher keeps the data secure. https://liberalarts.utexas.edu/prc/

Texas Public Information Act (TPIA)

The Texas Public Information Act (TPIA) allows citizens to access government records held by public agencies. Information obtainable under the TPIA already circulates within government but may not be readily available to the public.

While FOIA requests are open to the general public, TPIA requests are only available to citizens. The TPIA can often release information faster, since the documentation has already circulated. Under the TPIA, governmental bodies are required to respond to PIA requests in any form: you could even send one written on a napkin and the agency would have to respond.

Under the TPIA, if it takes the agency longer than 5 hours to gather the responsive information, it can charge you a fee: https://www.texasattorneygeneral.gov/sites/default/files/files/divisions/open-government/conference/12-3CostBasics.pdf

Also, do not be surprised if the legal department asks questions about the nature of your research. 

For more information go to the Texas Public Information Act.

City of Austin

Austin makes municipal information available. For example, you can request information on reported incidents involving driverless vehicles. Go here to learn more. The FAQ is also helpful.

The City of Austin also maintains an Open Data Portal. You can find out graduation rates of local schools here: https://catalog.data.gov/dataset/city-of-austin-schools-with-data

Last, for more detailed information, the UT Libraries has a new research guide on this topic here: https://guides.lib.utexas.edu/FOIA

Israeli Literary Magazine Digitization Project Complete

https://repositories.lib.utexas.edu/handle/2152/29709

Iton 77 is one of Israel's prominent magazines of literature, poetry, and culture.[1] The UT Libraries has cooperated with the publishers of Iton 77 since 2013 and recently finished the digitization of 391 issues, bringing almost the whole run online.[2] Additional issues will be digitized or added as digitally-born files in the near future. This is the most complete digital archive of Iton 77 currently in existence. As a searchable, full-text archive that is openly accessible to the public worldwide with no restrictions, it promises to be a valuable resource for scholars as well as for the general public.

Established by the late poet and editor Yaakov Besser in 1977, the magazine is now celebrating 48 years of commitment to literary work. Many Israeli poets and authors published their first texts in Iton 77, and it remains a desired platform for emerging and experienced writers alike. Published works include poems, short stories, book reviews, literary criticism and research, opinion editorials, essays, and works in translation. Wide representation is given to Israeli writers who write in languages other than Hebrew, such as Arabic, Russian, and Yiddish. As a pluralistic platform, Iton 77 is open to alternative narratives and opinions, acknowledging the importance of historical contexts while discussing the complexities and difficulties of Israeli existence. Current editors are Yaakov Besser's son, Michael Besser, and 'Amit Yisre'eli-Gil'ad.

Upon the acquisition of some print back issues of the magazine in 2013, UT Libraries and the Iton 77 publishing house discussed future online visibility for the publication and the possibility of hosting the digital issues on the UT Libraries digital repository, now known as Texas ScholarWorks or TSW. Like many other digital repositories, TSW was established to provide open, online access to the products of the University's research and scholarship, and to preserve these works for future generations. In addition, TSW is also used as a platform for digital content that is not necessarily created on campus but is rather a product of cooperation with off-campus content owners, such as the Iton 77 publishing house.

Screenshot of a Texas ScholarWorks repository page from The University of Texas at Austin, displaying metadata for "Iton 77, issue 001." The page includes a thumbnail image of the magazine cover, access to full-text PDF files, publication date (1977-02), authors (Besser, Micha; Gilad-Yisre’eli, Amit), and publisher (Iton 77). It also lists the department (UT Libraries), keywords and LCSH subject headings related to Israeli literature and periodicals, and links to the item's URI and DOI.

TSW provides stable and long-term access to submitted works, as well as associated descriptive and administrative metadata, by employing a strategy that combines secure backup, storage media refreshment, and file format migration. Helpfully, all works submitted to TSW are assigned persistent URLs: permanent web addresses that will not change over time.

All scanned issues of Iton 77 have been OCR-ed for full-text searchability and can be downloaded either as text or PDF files. Currently issues are sortable by date and title, with sorting by author and subject in the works. With the permission of UTL and the Iton 77 publishing house, most of the content is mirrored and indexed on the Ohio State University Modern Hebrew Literature Lexicon.

The total number of downloads of all issues to date is 241,947. Issues are viewed and downloaded from every corner of the globe. Not surprisingly, most of the users are from Israel, with the United States and Germany in second and third place. Other Hebrew readers connect from many other countries, including Egypt, Japan, Togo, and Syria.

The most popular issue since going online in TSW, with 6,194 downloads to date, is the double issue from January 1987, called the 'decade issue.' It celebrated some of the most prominent Israeli authors, poets, and essayists of that time, such as Yitsḥaḳ Aṿerbukh Orpaz, Aharon Meged, Erez Biton, A.B. Yehoshua, Dalia Rabikovitch, Anton Shamas, Shimon Balas, and many others.

We are excited about this partnership to bring Iton 77 to a global audience in this stable open access format and encourage all to browse and use it! 

Iton 77 double issue 84-85 (January 1987). https://hdl.handle.net/2152/75368

[1] “Iton” is the transliterated form of the Hebrew word for “newspaper” (עתון).

[2] This count includes 67 double-issues. Three issues (200; 293; 341-342) were published as printed books and are not included in the project.

Back to Egypt via Türkiye

In December 2024, after classes came to a close, I took a brief trip to Istanbul, Türkiye, with the hope of acquiring pivotal Arabic-language journals that had been published in Egypt. I’ve written for the TexLibris blog before on the importance of looking for essential research materials in unexpected places, such as Arabic in Türkiye. This trip was yet another example.

So, what were these texts that I traveled across the Atlantic in order to secure for the UT Libraries researcher community? One of them is مجلس النواب مجموعة المضابط (Majlis al-Nuwwab: Majmu'at al-Madabit/Meeting Minutes of the House of Representatives). I acquired 18 volumes of this title, representing the record of the discussions and decisions taken by that house of the Egyptian Parliament in the late 1920s and 1930s. This title had been on my radar ever since I acquired مجلس الشيوخ مجموعة المضابط (Majlis al-Shuyukh: Majmu'at al-Madabit/Meeting Minutes of the House of Lords) a few years ago. That title consists of the records of the House of Lords of the Egyptian Parliament from the 1930s to the 1950s. My goal was to complement the House of Lords collection with the House of Representatives' records from nearly the same time period so that UT Libraries is able to offer researchers a comprehensive record of Egyptian Parliamentary activity from the early 20th century. These types of government records may seem fairly mundane, but they are, in fact, remarkably difficult to locate outside of official copies kept at the Egyptian National Archives. In North America, UT Austin is one of four holding institutions for Majlis al-Shuyukh: Majmu'at al-Madabit, and one of five holding institutions for Majlis al-Nuwwab: Majmu'at al-Madabit. I am eager to see the scholarship that arises from the presence of these crucial and rare titles at the UT Libraries, and I encourage scholars from other research institutions to consider visiting UT Libraries to consult these materials.

The second title that I acquired is الموسوعة الجنائية (al-Mawsu’ah al-Jina’iyyah/Encyclopedia on Criminal Law) by legal scholar Jindi Abd al-Malik Bayk. This work, published in the 1930s, is an encyclopedia of Egyptian criminal law structures and standards. It chronicles the historical development of criminal law, doctrinal formation, and the rules that came to be adopted in modern Egyptian criminal law. This title also includes the substantive case law that underpins some of the key assumptions and orientations for criminal procedure and criminality in Egypt.

The third title, الدنيا المصورة (al-Dunya al-Musawwarah/The Illustrated World), was published between 1929 and 1932. It was a weekly journal from the famous Dar al-Hilal publishing house, responsible for numerous impactful intellectual and popular periodicals in early 20th century Egypt. Edited by Emil and Shukri Zaydan, al-Dunya al-Musawwarah was renowned for its caricatures and the artists behind them, as well as for its plethora of photographs. It also featured influential articles by foundational litterateurs and political commentators, such as Fikri Abaza (فكري أباظة). UT Austin is now one of only a handful of North American institutions with any holdings of this important title. al-Dunya al-Musawwarah complements our existing collection of early 20th century Arabic periodicals that I have been building since joining UT Austin 10 years ago. Other notable titles include البلاغ الأسبوعي (al-Balagh al-Usbu'i/The Weekly Calling), الهلال (al-Hilal/The Crescent), المصور (al-Musawwar/The Illustrated), and الكواكب (al-Kawakib/The Planets).

As I continue my work to maintain our existing collections and expand upon them, it is my hope that complementary titles such as these (titles that work together and extend the knowledge already present in the UT Libraries' collections) will make crucial connections for UT Austin researchers and beyond. I invite anyone interested in learning more about these materials and/or our Middle Eastern Studies collections to reach out for a consultation.

Libraries Hosts Literary Salon in Houston with Hamilton Winner Hillis

In a celebration of literature, biodiversity, and Texas’ natural beauty, the Libraries hosted a literary salon in Houston on Monday, February 24, featuring acclaimed author and UT Austin professor David M. Hillis. The event, generously hosted by Tom and Reggie Nichols—former Libraries Advisory Council members and proud UT alumni—highlighted UT Libraries’ role in supporting critical research and advancing fundraising initiatives.

L-R: Tom and Reggie Nichols, Lorraine Haricombe, Claire Burrows.

The evening centered around Hillis’ latest book, Armadillos to Ziziphus: A Naturalist in the Texas Hill Country, a deeply personal and scientifically rich exploration of the Hill Country’s diverse landscapes. Guests received copies of the book and were treated to a special reading of the chapter The Last Wild River, in which Hillis wove together the history of the Lower Pecos River with his own experiences.

Armadillos to Ziziphus was named grand prize winner at the 2024 Hamilton Book Awards.

Vice Provost Lorraine Haricombe welcomed attendees and invited them to browse a curated selection of materials from the Life Science Library, showcasing works on Texas’ biodiversity and environmental history.

Hillis, who serves as director of the Biodiversity Center at UT Austin’s College of Natural Sciences, is renowned for his contributions to evolutionary biology. A MacArthur Fellow and member of the U.S. National Academy of Sciences, he has discovered numerous species, including Austin’s iconic Barton Springs Salamander. His book reflects his lifelong passion for conservation, encapsulated in his belief:

“The more we understand and experience nature, the more of it we will appreciate, and the more we will seek to protect it for future generations to enjoy.”

The evening reinforced the Libraries’ commitment to fostering intellectual engagement while celebrating the invaluable research and scholarship at The University of Texas at Austin.

Read, Hot and Digitized: The Mixtape Museum

Read, hot & digitized: Librarians and the digital scholarship they love — In this series, librarians from UTL’s Arts, Humanities and Global Studies Engagement Team briefly present, explore and critique existing examples of digital scholarship.  Our hope is that these monthly reviews will inspire critical reflection of and future creative contributions to the growing fields of digital scholarship.


Before we had Spotify playlists, we had the mixtape. Scrawled handwritten track lists; the precise practice of hitting PLAY and REC at the same time; the dread of ejecting the cassette from the player to find the tape had been pulled into a tangled mess and using a pencil to carefully respool it.

A definition of the word mixtape. Noun: Traditionally recorded on to a compact cassette, a mixtape is a compilation of songs from various sources arranged in specific order.
A screenshot from the About section of The Mixtape Museum.

The Mixtape Museum (MXM) is a digital archive project and educational initiative committed to the collection, preservation, and celebration of mixtape history. The project seeks to both further mixtape scholarship and foster public dialogue, raising awareness of the artistry and far-reaching impact of mixtapes as a cultural form.

An image of a handwritten track list on a mixtape
Stephen J. Tyson Sr. Collection.

The Mixtape Memory Collection is the heart of the MXM, bringing together interviews, anecdotes, photographs, and reflections from a mix of contributors. There are tributes to mixtape pioneers, reminiscences about childhood introductions to Hip Hop, and laments for tapes not saved. I appreciated this brief recollection of a “mixtape correspondence” as it underscores the richness of the form as a mode of communication.

While mixtapes are the anchor point, these memories are also about people and places, relationships and phases, marking connections between a cultural era and the personal eras of our lives. The Collection reveals how music indexes experiences and moments in time, and also attests to the way particular objects can become imbued with layers of meaning and cultural significance. Even in the digital space of the MXM, I am struck by the affective resonance of the physical cassettes themselves, each containing a story that stretches beyond the tape wound inside it.

An image of eight cassettes with handwritten labels
DJ Red Alert, Ismael Telly Collection.

In addition to the Memory Collection, the MXM includes a News section of related articles and public events, and a Mixtape Scholarship Library featuring key texts in the field. Appropriately, there is also a Listen section, which takes visitors to the Mixtape Museum Soundcloud page, where today’s creators might upload their tracks instead of passing out their tapes.

Aligned in a sense with the ethos of the format it highlights, the MXM operates from a simple WordPress site—a platform with a relatively low barrier of entry for producing digital content. The project was founded by scholar, arts administrator, and community archivist Regan Sommer McCoy, who serves as Chief Curator, supported by a group of advisors and institutional collaborators.

As I browse the collection my own mixtape memories surface—a tape gifted to me by a former best friend that I played on repeat during my freshman year of high school; my painstaking efforts to create the perfect mix to let a crush know the way I really felt about him. Does the MXM spark a mixtape memory for you? The project welcomes submissions to the archival collection and invites a variety of formats. Contributors have the option to make memories public or keep them password protected, respecting the boundaries of each offering.


Want to learn more about mixtape culture and history? Several of the titles featured in the MXM are available from the UT Libraries:

Auerbach, Evan, and Daniel Isenberg. Do Remember! : The Golden Era of NYC Hip-Hop Mixtapes / Evan Auerbach, Daniel Isenberg. New York, NY: Rizzoli International Publications, Inc., 2023. 

Burns, Jehnie I. Mixtape Nostalgia : Culture, Memory, and Representation / Jehnie I. Burns. Lanham: Lexington Books, 2021.

Moore, Thurston. Mix Tape : The Art of Cassette Culture / Edited by Thurston Moore. 1st ed. New York, NY: Universe Pub., 2004.

Taylor, Zack, Georg Petzold, and LLC Seagull and Birch. Cassette : A Documentary Mixtape / a Film by Zack Taylor ; Directed, Produced, and Filmed by Zack Taylor ; Edited, Produced, and Additional Camera by Georg Petzold ; Seagull and Birch, LLC. El Segund

Walker, Lance Scott. DJ Screw : A Life in Slow Revolution. Austin: University of Texas Press, 2022. 

Scholars Lab Series Launches with Talk on Data Storytelling and Visualization

The Scholars Lab Speaker Series at the University of Texas Libraries welcomed renowned data visualization expert Dr. Alberto Cairo on February 10 for a thought-provoking discussion on the intersection of visualization, art, and insight. Cairo, the Knight Chair in Visual Journalism at the University of Miami, engaged an audience of students, faculty, and researchers in a conversation about how data storytelling can enhance both comprehension and communication.

During his talk, “Visualization: An Art of Insight,” Cairo explored the aesthetic and analytical dimensions of data visualization, emphasizing that effective visuals go beyond aesthetics to provide clarity, context, and meaning. He shared examples from journalism, scientific research, and public policy to illustrate how well-crafted visual representations can inform, persuade, and even challenge assumptions.

Cairo discussed how data visualization is not merely about following set rules but rather a reasoning process that carefully considers content, audience, and purpose. He emphasized that designing a visualization is an intentional, iterative dialogue, requiring deliberate choices about how information is encoded—using length, area, angle, or color—to effectively represent data.

Cairo illustrated how data visualization can reveal trends, exceptions, and broad patterns, and he underscored the challenge of balancing showing versus explaining data, using the cone of uncertainty in hurricane forecasts as an example of how misinterpretation can arise without adequate context and annotation.

Finally, he urged designers to move beyond software defaults, encouraging thoughtful refinement in visual composition, from color choices to line breaks, to create clear and effective graphics. His insights reinforced the idea that data visualization is both an analytical tool and a form of storytelling, shaping how people understand and engage with information.

The event, held in Perry-Castañeda Library’s Scholars Lab, was the first installment of the new Scholars Lab Speaker Series, launched to highlight emerging trends in digital scholarship. Audience members had the opportunity to engage with Cairo in a Q&A session that followed the talk.

The event also marked the commencement of International Love Data Week 2025, celebrating the significance of data in modern research and decision-making. For more information on upcoming events and resources, visit the UT Libraries’ official website.

Latino USA Radio Program Episodes Published

The digitized episodes have been made available online by the Benson Latin American Collection


More than 160 digitized episodes of Latino USA, the newsmagazine of Latino news and culture founded at UT in 1993, have been published by the Benson Latin American Collection. Published records include metadata and transcriptions for the episodes, which are available to the public on the open-access University of Texas Libraries Collections Portal. The publication and transcription of the episodes were made possible by a grant from the Latin Americanist Research Resources Project (LARRP).

The selected episodes, which total 168, span the years 1997–2000. They are part of a larger archival collection held by the Benson—Latino USA Records, which documents the history of the radio program from early planning stages in the late 1980s through the program’s first seventeen years (1993–2010).

A newspaper page with the title On Campus features a large black-and-white photograph of people in a radio station. In the foreground, four people are in the control room—Christina Cuevas speaks on the phone, Frank Contreras holds a reel-to-reel tape, María Martin smiles and holds papers in front of a microphone on a boom stand, and Dolores García has headphones on and is smiling and looking at something. Behind the soundproof glass, a room with other people can be seen. On the wall are the words Latino USA. The people are smiling and looking into the room that is being photographed.
OnCampus feature on Latino USA’s 200th program. Latino USA Records, Benson Latin American Collection.

The newly published episodes consist of over 80 hours of material covering Latin American and Latina/o topics, including interviews with figures such as labor activist Dolores Huerta, singer Little Joe Hernandez, San Francisco mayor Willie Brown, and writers Claribel Alegria, Américo Paredes, and Sandra Cisneros. Prior to their digitization by UT Libraries, these episodes had existed only in a legacy digital audio tape format known as DAT, which made them inaccessible to the public.

The published episodes are accompanied by complete transcriptions, funded with a grant from the Latin Americanist Research Resources Project (LARRP). The transcriptions meet the accessibility requirements of the digital collections platform, expanding access for people who are hearing impaired and for those who read English more easily than they understand it by listening.

Transcriptions can also provide expanded searching and digital scholarship opportunities for researchers and support additional use of these recordings in instructional settings. The transcriptions were provided by UT Austin’s Captioning and Transcription Services Team.

A stack of six audio tapes in clear plastic cases sits atop dozens more such tapes. Each one is labeled "LATINO USA" along with a number and other information.
Latino USA DAT audio tapes at the Benson Latin American Collection

The Latino USA Records at the Benson include nearly 900 program episodes that aired between 1993 and 2010, in addition to correspondence, photographs, ephemera, and other records documenting the program’s history. The Benson and University of Texas Libraries have digitized and transcribed additional episodes that they hope to publish in the future. Archival footage from the Benson was included in various episodes during the program’s 30th anniversary year in 2023, including a special episode dedicated to the anniversary and an episode that focused on the Benson. Latino USA’s special episode dedicated to the memory of the program’s founder, María Martin, also included archival footage and documents from the Benson.

Black-and-white close-up photo of journalist María Martin, who has dark hair, large hoop earrings, a beaded necklace, and a dark striped shirt on. She smiles broadly as she speaks into a large, metallic radio microphone that is suspended in front of her.
The late María Martin at the Latino USA studio. Latino USA Records, Benson Latin American Collection.

Over 30 Years of History

Launched on May 5, 1993, Latino USA is an award-winning weekly English-language radio journal created to fill a Latina/o-themed void in nationally distributed radio. It was initially produced by the Center for Mexican American Studies (CMAS) in collaboration with KUT at the University of Texas at Austin. Radio veterans María Emilia Martin and Maria Hinojosa joined the staff in the roles of producer and host, respectively, while CMAS director Gilberto Cardenas acted as the program’s first executive producer (Martin and Hinojosa would both eventually serve in this role). Latino USA moved to Futuro Media Group in 2010.

The program was established at a time when the U.S. Latina/o population was one-third of what it is today. As Maria Hinojosa notes, the show traces the history of this immense growth, as well as that population’s participation in all aspects of politics, culture, and society.

A black-and-white photo of journalist Maria Hinojosa. She has long, dark hair and is wearing a white top with a collar. She smiles fully. One hoop earring is visible on the left.
Latino USA co-founder and journalist Maria Hinojosa, undated photo. Latino USA Records, Benson Latin American Collection.

Among the staff members who worked on this project are two graduate research assistants (GRAs), Fernanda Agüero, a graduate student at the School of Information (iSchool), and Rosa de Jong, a dual-master's student at LLILAS and the iSchool.

As the LLILAS Benson Digital Initiatives GRA, Agüero worked on the project during fall 2024, which gave her the opportunity to listen to a large majority of the Benson's now-digitized collection.

“The Latino USA collection provides a distinct opportunity to observe the key events and cultural developments that defined Latin American identity through the turn of the 20th century,” Agüero said. “It covers significant moments such as the Elián González case, the Clinton-Gore campaign, and a large focus on the arts, including my favorite episode, which featured a compilation of Latin American female ballad artists. This collection serves as a historical record, allowing listeners to situate themselves within the specific timeframes in which these episodes were produced, offering insight into the political and cultural climate of the period.” 

In this black-and-white photo, a row of five people of diverse ages sits at a rectangular table, each with a microphone. At the far end, journalist María Martin looks at the others, leaning her head on her chin. In the center, media scholar Federico Subervi looks down, smiling. A young woman in the foreground is speaking into her microphone.
Undated photo, Latino USA Records, Benson Latin American Collection

De Jong, a Special Collections Graduate Research Assistant, singled out her highlights in the newly transcribed episodes.

“I especially loved the episodes focused on Tejano and Chicano traditions and cultural workers. One that stands out is titled Tejano Literary Traditions, which features interviews with literary icons Sandra Cisneros and Américo Paredes. In the episode, the authors talk about how their experiences growing up and living in the U.S.–Mexico borderlands shaped their work.” The newly transcribed programs also focused on Puerto Rico, says de Jong. “I was also impressed by the depth and scope of the reporting on Puerto Rico. Covering topics such as the Independence Movement, Puerto Rican political prisoners, and the 1999 Vieques Island protests, Latino USA episodes provide varied and rich accounts of the complex and evolving socioeconomic, political, and cultural contexts both on the Island and within the diaspora. Two episodes that highlight this reporting are Latino USA Program 275, Week #39-98 and Latino USA Program 348, Week #51-99.”

Navigating the Data Landscape: An Open Source Workflow

Recent years have witnessed explosive growth in the volume of research publications (Hanson et al., 2024). To uphold the basic tenets of scholarship, stakeholders such as funders and publishers are increasingly introducing policies to promote research best practices. For example, the 2022 Nelson Memo directed federal agencies that dispense at least $100m in research funding to revise policies around making the outputs of federally funded research available. Concurrent with the evolution of these policies, research institutions are innovating and developing the necessary infrastructure to support researchers, of which libraries are an essential component.

These stakeholders and various subgroups within them have a range of interests in tracking the publishing of research outputs. In order to make data-driven decisions around what services we provide in the libraries and how we provide them, we need data about our research community. There is a long history of tracking publication of articles and books, and the infrastructure for doing so is relatively well-developed (e.g., Web of Science, Scopus, Google Scholar). In this regard, we are well-positioned to continue monitoring these outputs in line with the new stipulations for immediate public access in the Nelson Memo. However, the Nelson Memo also stipulated that the research data supporting publications need to be shared publicly. Compared to open access publishing, open sharing of data is less developed culturally and structurally, which makes it all the more important to develop a workflow to begin to gather data on this front.

Predictably, the infrastructure for tracking the sharing of data is not nearly as well-developed as that for articles or books. While some of this is likely due to the relative lack of emphasis on data publishing, there are a variety of reasons why tracking data isn't quite as easy for motivated parties. Journals, despite wide-ranging aesthetic and stylistic conventions, have relatively uniform metadata standards. In large part, this is because of the homogeneity of their products across disciplines, which are primarily peer-reviewed research articles that are typeset into PDFs. This allows proprietary solutions like Web of Science and Scopus to harvest vast amounts of metadata (through CrossRef) and to make it available in a readily usable format with relatively little work required to format, clean, or transform. In contrast, research data are published in a wide variety of formats, ranging from loosely structured text-based documents like letters or transcripts to objects with complex or structured formatting like geospatial data and genomic data. As a result, there can be significant differences between platforms that host and publish research data, ranging from general to discipline-specific metadata and file support, level of detail in author information, use of persistent identifiers like DOIs, and curation and quality assurance measures (or lack thereof).

Comparison of annual volume of dataset publications. ‘All’ refers to the volume across all discovered repositories and is compared to our institutional repository, the Texas Data Repository, and two common generalists, Dryad and Zenodo.

While a few proprietary solutions are beginning to emerge that purport to be able to track institutional research data outputs (e.g., Web of Science), these products have notable shortcomings, including significant cost, difficulty assessing thoroughness of retrieval, and limited number of retrievals. In order to create a more sustainable and transparent solution, the Research Data Services team has developed a Python-based workflow that uses a number of publicly accessible APIs for data repositories and DOI registries. The code for running this workflow has been publicly shared through the UT Libraries GitHub at https://github.com/utlibraries/research-data-discovery so that others can also utilize this open approach to gathering information about research data outputs from user-defined institutions; the code will continue to be maintained and expanded to improve coverage and accuracy. To date, the workflow has identified more than 3,000 dataset publications by UT Austin researchers across nearly 70 different platforms, ranging from generalist repositories that accept any form of data like Dryad, figshare, and Zenodo to highly specialized repositories like the Digital Rocks Portal (for visualizing porous microstructures), DesignSafe (for natural hazards), and PhysioNet (for physiological signal data).
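As a rough illustration of the kind of request such a workflow can make, the sketch below queries the public DataCite REST API for DOIs registered as datasets whose creator affiliation mentions a given institution. It is a minimal sketch based on our reading of DataCite's public API (the query field names and pagination handling are assumptions), not the code published in the UT Libraries GitHub repository linked above.

```python
# Minimal sketch (not the published workflow): query the DataCite REST API
# for DOIs registered as datasets whose creator affiliation matches a string.
# Field names and query syntax are assumptions about DataCite's public API.
import requests

DATACITE_API = "https://api.datacite.org/dois"

def find_institution_datasets(affiliation, page_size=100, max_pages=5):
    """Return basic metadata for dataset DOIs matching an affiliation query."""
    results = []
    url = DATACITE_API
    params = {
        "query": f'creators.affiliation.name:"{affiliation}"',
        "resource-type-id": "dataset",
        "page[size]": page_size,
        "page[cursor]": 1,  # enables cursor-based pagination
    }
    for _ in range(max_pages):
        resp = requests.get(url, params=params, timeout=30)
        resp.raise_for_status()
        payload = resp.json()
        for record in payload.get("data", []):
            attrs = record.get("attributes", {})
            results.append({
                "doi": attrs.get("doi"),
                "title": (attrs.get("titles") or [{}])[0].get("title"),
                "publisher": attrs.get("publisher"),
                "year": attrs.get("publicationYear"),
            })
        next_url = payload.get("links", {}).get("next")
        if not next_url:
            break
        url, params = next_url, None  # the 'next' link already encodes the query
    return results

if __name__ == "__main__":
    hits = find_institution_datasets("University of Texas at Austin")
    print(f"Retrieved {len(hits)} dataset records")
```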

Horizontal bar chart comparing the total number of UT-Austin-affiliated datasets published in different repositories. Only repositories with at least 30 datasets are individually listed; the remainder are grouped into an 'Other' category. The Texas Data Repository has the most discovered datasets (nearly 1,250), followed by Dryad, Zenodo, Harvard Dataverse, the aggregated 'other', ICPSR, figshare, DesignSafe, Mendeley Data, the Digital Rocks Portal, and EMSL. No repository other than the Texas Data Repository has more than 400 datasets.
Comparison of total number of dataset publications between repositories. Only repositories with more than 30 UT-affiliated publications are depicted individually; all others are grouped into ‘Other.’

This work is still very much in progress. Perhaps equally important to the data that we were able to obtain are the data we suspect exist, but were unable to retrieve via our workflow (e.g., we didn’t retrieve any UT-affiliated datasets from the Qualitative Data Repository, even though we are an institutional member), as well as the variation in metadata schemas, cross-walks, and quality, which can help to inform our strategies around providing guidance on the importance of high-quality metadata. For example, this process relies on proper affiliation metadata being recorded and cross-walked to DataCite. Some repositories simply don’t record or cross-walk any affiliation metadata, making it essentially impossible to identify which, if any, of their deposits are UT-affiliated. Others record the affiliation in a field that isn’t the actual affiliation field (e.g., in the same field as the author name); some even recorded the affiliation as an author. All of this is on top of the complexity introduced by the multiple ways in which researchers record their university affiliation (UT Austin, University of Texas at Austin, the University of Texas at Austin, etc.)
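A deliberately simple, hypothetical normalization step illustrates how those permutations might be collapsed into a single institutional label; the matching logic in the actual workflow may differ.

```python
# Hypothetical helper (not the published workflow's matching logic): decide
# whether an affiliation string is some variant of "UT Austin."
import re

def is_ut_austin(affiliation: str) -> bool:
    if not affiliation:
        return False
    text = affiliation.lower()
    # Abbreviated forms: "UT Austin", "UT-Austin", "UT  Austin", etc.
    if re.search(r"\but[\s-]*austin\b", text):
        return True
    # Spelled-out forms: "University of Texas at Austin",
    # "The University of Texas, Austin", and similar.
    return "university of texas" in text and "austin" in text

variants = [
    "University of Texas at Austin",
    "The University of Texas at Austin",
    "University of Texas, Austin",
    "UT Austin",
    "University of Texas at Arlington",  # should not match
]
for v in variants:
    print(f"{v!r:45} -> {is_ut_austin(v)}")
```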

Horizontal bar chart comparing the frequency of different name permutations of UT Austin that were entered in UT Austin datasets. A total of eight different permutations were detected, ranging from 'University of Texas at Austin' to 'UT Austin.' The most common is to use 'at Austin' rather than some form of punctuation like a comma or hyphen instead of 'at.'
Comparison of the frequency of different permutations of ‘UT Austin’ that were entered as affiliation metadata in discovered datasets.

We also have to account for variation in the granularity of objects, particularly those that receive a PID. For example, in our Texas Data Repository (TDR), which is built on Dataverse software, both a dataset and each of its constituent files receive a unique DOI; each file is also recorded as a 'dataset' because the metadata schema used by the DOI minter, DataCite, doesn't currently support a 'file' resource type. We thus have to account for raw output that would initially inflate the number of datasets in TDR by at least two orders of magnitude. The inverse of this is Zenodo, which assigns a parent DOI that always resolves to the most recent version, with each version of an object getting its own DOI (so all Zenodo deposits have at least two DOIs, even if they are never updated).
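To give a sense of how that accounting might look, the hypothetical pass below drops records that declare themselves part of, or a version of, another DOI, which is how file-level Dataverse DOIs and versioned Zenodo DOIs often appear in DataCite metadata. The relation types used here are assumptions; the published workflow may handle these cases differently.

```python
# Hypothetical de-duplication pass (an assumption, not the published workflow):
# keep only "parent" records by dropping any record that points at another DOI
# through an IsPartOf or IsVersionOf relation, as file-level Dataverse DOIs
# and versioned Zenodo DOIs commonly do.
CHILD_RELATIONS = {"IsPartOf", "IsVersionOf"}

def keep_parent_records(records):
    """records: iterable of DataCite 'attributes' dicts."""
    kept = []
    for attrs in records:
        related = attrs.get("relatedIdentifiers") or []
        is_child = any(
            rel.get("relationType") in CHILD_RELATIONS
            and rel.get("relatedIdentifierType") == "DOI"
            for rel in related
        )
        if not is_child:
            kept.append(attrs)
    return kept
```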

The custom open source solution that we have developed using Python, one of the most widely used programming languages (per GitHub), offers the flexibility to overcome the challenges posed by differences between data repositories and variations in the metadata provided by researchers. Our approach also avoids the shortcomings of proprietary solutions, as it offers transparency so that users can understand exactly how dataset information is retrieved, and it is available at no cost to anyone who might want to use it. In many ways, this workflow embodies the best practices that we encourage researchers to adopt: open, freely available, transparent processes. It also allows others (at UT or beyond) to adopt our workflow and, if necessary, to adapt it for their own purposes.

Spanish Paleography + Digital Humanities Institute Focuses Research on Colonial Texts

Scholars and graduate students from institutions across the country gathered at the Benson Latin American Collection for the Spanish Paleography + Digital Humanities Institute. The immersive three-day program provided intensive training in reading and transcribing Spanish manuscripts from the 16th to 18th centuries while introducing participants to digital humanities tools that enhance historical research.

Funded by LLILAS's Title VI Program grant from the U.S. Department of Education and the Excellence Fund for Technology and Development in Latin America, the institute sought to equip researchers with specialized skills to navigate colonial texts, visualize historical data, and foster a collaborative academic community. The event was spearheaded by LLILAS Benson Digital Scholarship Coordinator Albert A. Palacios and brought together a cohort of graduate students and faculty members specializing in history, literature, linguistics, and related disciplines.

The institute focused on three key objectives: providing paleography training, introducing participants to digital humanities tools, and fostering a collaborative research network. Participants engaged in hands-on workshops to develop their ability to accurately read and transcribe colonial manuscripts. They also received instruction on open-source technologies for text extraction, geospatial analysis, and network visualization. The program fostered a community of scholars who will continue sharing insights and resources beyond the institute.

Participants had the opportunity to work with historical materials, including royal documents, inquisition records, religious texts, and economic transactions. Case studies were examined through paleography working groups, where scholars collaboratively deciphered difficult handwriting styles and abbreviations.

To apply their newly acquired digital humanities skills, each participant developed a pilot research project using Spanish colonial manuscripts. These projects utilized handwritten text recognition (HTR) technology, geographical text analysis, and data visualization tools to enhance historical inquiry. The final day of the institute featured a lightning round of presentations, allowing scholars to showcase their preliminary findings and discuss future applications.

This year’s participants hailed from universities across the U.S., including the University of Chicago, the University of North Texas, Columbia University, the University of Texas at El Paso, the University of California-Santa Barbara, Purdue University, City College of New York, West Liberty University, Oklahoma State University, and the University of California-Merced. The interdisciplinary nature of the group enriched discussions, providing diverse perspectives on archival research and manuscript interpretation.

A highlight of the institute was the introduction and use of the handwritten text recognition (HTR) model that the LLILAS Benson Digital Scholarship Office trained on 17th- and 18th-century Spanish handwriting preserved at the Benson and recently launched. This innovation is expected to significantly accelerate the study of colonial-era documents and democratize access to these historical resources.

Additionally, the program provided a comprehensive list of recommended paleography resources, including books, digital collections, and online tools to support continued scholarship in Spanish manuscript studies.

Palacios is leading an online Spanish version of the institute for participants worldwide this spring and fall. He will be leading another onsite institute June 4-6, 2025. The demand for the LLILAS Benson Spanish Paleography + Digital Humanities Institute in the Colonial Latin Americanist field underscores the growing interest in merging traditional archival research with computational methodologies. By equipping scholars with both paleographic expertise and digital tools, the institute is paving the way for innovative research on the Spanish Empire and its historical records.

Transforming Text: A Year of the Scan Tech Studio

The Scan Tech Studio (STS), located in the new PCL Scholars Lab, is a self-service facility designed to empower scholars and researchers in digitization, image processing, and text analysis projects. Equipped with advanced scanning equipment and software, the STS allows the UT community to independently digitize materials, apply optical character recognition (OCR) and handwritten text recognition (HTR), and engage in digital text analysis. From helping patrons scan historical documents to applying machine-readable techniques to modern texts, the STS has had an exciting first year guiding users in elevating their research.

The team behind this effort is the Scan Tech Studio Working Group, composed of seven librarians and digitization experts dedicated to helping scholars maximize the studio’s resources. We’re also grateful for the support of UT Libraries IT and the Scholars Lab Graduate Research Assistants, who keep everything running smoothly behind the scenes. The working group develops workshops, creates research guides, and promotes the use of digital scholarship tools related to OCR, HTR, and text analysis. Additionally, we offer guidance on copyright considerations and assist users in navigating the complexities of text recognition and analysis. Over the past year, the STS Working Group has been instrumental in fostering a dynamic learning environment within the Scholars Lab and building campus-wide connections to unlock the studio’s potential.

The working group has been dedicated to developing services that meet the evolving needs of the campus community. So far, our primary focus has been providing consultations and instruction related to digitization, OCR/HTR, and text analysis. With the diverse expertise of our team, we’ve been able to offer tailored, one-on-one consultations and small group sessions that help users think through the various stages of their digital projects, from planning to execution. Scheduling time with STS experts is simple through our user-friendly request form, ensuring patrons have easy access to specialized support.

Overall, we received 18 reservation requests from users who had a consultation with one of the STS Working Group members, needed the space for digitization, and/or used our digital tool to OCR their materials. Many of these requests came from graduate students, specifically from the Department of History and the School of Information.

In addition to consultations, we’ve developed instructional tools such as a comprehensive research guide on research data management and the use of the studio’s equipment and software. The STS has also become a valuable teaching space, regularly hosting classes that integrate the studio’s technology into their curriculum, allowing students hands-on experience with advanced digitization tools and methods.

Reflecting on the past year, the STS has hosted several workshops inside and outside the studio to showcase its tools and demonstrate the possibilities to the campus community. For example, STS team members led workshops at this past summer’s Digital Scholarship Pedagogy Institute, focusing on digitization, OCR, and text analysis. Additionally, we contributed to the Digital Humanities Workshop Series, providing training in these specialized areas. 

It’s also worth noting that the working group dedicates time to internal development by hosting workshops for ourselves, allowing us to learn from one another and build up our collective skillset. As the saying goes, the best way to learn is to teach—and we’ve embraced this approach to better serve our users!

Because the Scan Tech Studio is a new service, we wanted to partner with existing programs and reach out to various centers. We contacted different centers around campus, such as JapanLab and the Center for Middle Eastern Studies, and provided them with an overview of our services. This gave us great insight into the needs around campus regarding digitization and OCR.

Additionally, we provided training in using specialized OCR tools such as ABBYY FineReader, a paid program that is exclusively available at the STS. It works exceptionally well for accurately OCRing text and training. We had about 36 uses in just our first year in the space.

As we continue to see the success of our space, we are planning to expand our services and tools. We aim to create additional resources covering various OCR tools and processes. We also plan to continue to collaborate with the Digital Humanities Workshop series to present different OCR and text analysis tools. Additionally, we intend to develop workshops tailored to researchers, including pre-research and post-research workshops. These workshops will help researchers understand what they need to do when conducting their research to ensure a successful OCR experience and facilitate the beginning of text analysis upon their return. We look forward to seeing how the groundwork we laid during the first year will impact our service in the upcoming year.

As you can see, we have a lot of promising plans to build off the Scan Tech Studio’s successful first year. We look forward to continuing to grow the space as a new hub for digitization and text analysis on campus. Scan you feel the excitement? 

UT Libraries