Tag Archives: AILLA

Digital Preservation and the Archive of the Indigenous Languages of Latin America

Vea abajo para versión en español / Veja em baixo para versão em português

In honor of World Digital Preservation Day, members of the University of Texas Libraries’ Digital Preservation team have written a series of blog posts to highlight preservation activities at UT Austin, and to explain why the stakes are so high in our ever-changing digital and technological landscape. This post is part three in a series of five. Read part one and part two.

By SUSAN SMYTHE KUNG, PhD, Manager (@SusanKung), and RYAN SULLIVANT, PhD, Language Data Curator (@floatingtone), Archive of the Indigenous Languages of Latin America (@AILLA_archive)

At AILLA, we are developing guidelines to help language researchers and activists organize their collections of recordings and annotations of Indigenous, and often endangered, languages and ingest them into digital repositories, so that these valuable digital resources can be preserved for the future. One focus of these guidelines is the importance of using open and sustainable file formats to increase the likelihood that digital files can be opened and read in the future. To help explain these ideas, we produced a short animated video that is available under a Creative Commons license on YouTube at https://youtu.be/2JCpg6ICr8M.

Screenshot from AILLA. 2018. Sustainable File Types, https://youtu.be/2JCpg6ICr8M, CC-By license.

Many digital documents are produced using proprietary software, and future users will need to have the same, or similar, software to open the files or read their contents. While documents in proprietary formats can be put into a digital repository so their bitstreams (all the ones and zeroes) are preserved well into the future, the exact copy of the file a user downloads years from now may be impossible to use if the proprietary software it was made with is no longer available. Documents preserved in these non-open and non-sustainable formats then end up like cuneiform tablets: objects whose marks and features have survived a long passage through time but can only be read by a small number of people after considerable effort and study.
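The difference between preserving a bitstream and preserving a usable document can be sketched in a few lines of Python. The example below is only an illustration, not a description of AILLA's actual workflow: the function names `sha256_of` and `is_exact_copy` are hypothetical, and SHA-256 is assumed here as the checksum. It checks whether a downloaded file is a bit-for-bit copy of the original, which is the only thing bitstream preservation can promise.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file's bitstream."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so that large audio or video files fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_exact_copy(original: Path, downloaded: Path) -> bool:
    """True when both files contain exactly the same ones and zeroes."""
    return sha256_of(original) == sha256_of(downloaded)
```

A matching checksum confirms only that the ones and zeroes survived intact. It says nothing about whether any software will still exist to open the file, which is why the choice of format matters as much as the storage.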

A group of Cañari leaders leaving a meeting in which they discussed the formation of cooperatives to buy land. Cooperativa de San Rafael, man reading: José Zhinin, secretary, law, Antonio Guamán Zhinin, president. Man in the door, José María Pichisaca. Front left, Paolo Guamán. Photo right, in blue, Francisco Quishpilema; in red, Manuel Guamán. Ecuador, 1968. https://ailla.utexas.org/islandora/object/ailla:259974 Photo © Preston Wilson.

Choosing sustainable open formats helps ensure that materials are not just preserved but are accessible and usable into the future, since open-source applications can be more easily built to read files stored in non-proprietary formats.

Archivo de las Lenguas Indígenas de Latinoamérica

Traducido por Jennifer Isasi

@AILLA_archive

En AILLA (por sus siglas en inglés), estamos desarrollando pautas para lingüistas y activistas con la intención de facilitar la organización e ingesta de sus colecciones de materiales de documentación de idiomas en repositorios digitales para que estos valiosos recursos digitales puedan conservarse para el futuro. Una de las áreas que resaltamos en estas guías es la importancia de utilizar formatos de archivo abiertos y sostenibles para aumentar la probabilidad de que estos archivos digitales puedan ser abiertos y leídos en el futuro. Para explicar estas ideas hemos producido un video animado corto que está disponible con licencia de Creative Commons en YouTube: https://youtu.be/2JCpg6ICr8M.

Captura de video de AILLA. 2018. Tipos de archivo, https://youtu.be/SuAUGDzKTol, licencia CC-By.

Muchos documentos digitales se producen con software propietario y se necesita el mismo software (o un software parecido) para abrirlos o leer su contenido. Es cierto que se puede meter documentos en formatos propietarios en un repositorio digital y sus bitstreams (todos los unos y ceros) serán preservados hasta el futuro, pero cuando el usuario del futuro lo descarga, no existe garantía de que aquella copia fiel sea accesible porque es posible que el software necesario ya no exista. Los documentos así preservados en formatos no abiertos y no sostenibles entonces terminan como tabletas escritas en cuneiforme cuyas marcas y figuras han sobrevivido tras el tiempo pero solo son legibles por un pequeño conjunto de personas muy especializadas.

Niels Fock con dos hombres cañari en Tacu Pitina, Ecuador, 1974. https://ailla.utexas.org/islandora/object/ailla:259355 Foto © Eva Krener

Escoger formatos sostenibles y abiertos ayuda a asegurar que los materiales no solo permanezcan sino que estén accesibles y útiles en el futuro ya que será más fácil crear una aplicación de fuente abierta para leer archivos almacenados en formatos no propietarios.

Arquivo dos Idiomas Indígenas da América Latina

Traduzido por Tereza Braga

@AILLA_archive

Na AILLA, estamos desenvolvendo diretrizes para pesquisadores linguísticos e ativistas com o objetivo de possibilitar a organização e inserção de suas coleções de gravações e observações em idiomas indígenas (muitos em perigo de extinção) em repositórios digitais para que esses valiosos recursos possam ser preservados para o futuro. Uma das áreas de enfoque para essas diretrizes é a importância de utilizar formatos de arquivo abertos e sustentáveis para aumentar a probabilidade de que esses arquivos digitais possam ser abertos e lidos no futuro. Para ajudar a explicar essas ideias, produzimos um vídeo curto com técnica de animação, que está disponibilizado sob licença da Creative Commons no YouTube, em https://youtu.be/2JCpg6ICr8M.

Captura de tela de AILLA. 2018. Organizing for Personal vs Archival Workflows, https://youtu.be/iZVACb_ShiM

Muitos documentos digitais são produzidos utilizando software proprietário. Assim sendo, o usuário do futuro terá que ter o mesmo software ou similar para poder abrir os arquivos ou ler seus conteúdos. É viável armazenar documentos criados em formatos proprietários em repositório digital, para que seus bitstreams (todos os uns e todos os zeros) sejam preservados por muitos e muitos anos; por outro lado, é também possível que a cópia exata do arquivo baixado pelo usuário daqui a muitos anos seja impossível de utilizar, se o software proprietário que o criou não estiver mais disponível. Documentos preservados nesses formatos não-abertos e não-sustentáveis podem acabar como as tábuas de escrita cuneiforme: objetos cujas marcações e funcionalidades sobreviveram a uma longa passagem pelo tempo mas só podem ser lidos por um número pequeno de pessoas após considerável esforço e estudo.

Transcrições de histórias tzeltal na Coleção Terrence Kaufman. https://ailla.utexas.org/islandora/object/ailla:257561 Foto © Gabriela Pérez Báez

A seleção de formatos abertos e sustentáveis ajuda a garantir que certos materiais sejam não só preservados mas também acessíveis e utilizáveis no futuro, considerando que é mais fácil construir aplicações de código-fonte aberto capazes de ler arquivos armazenados em formatos não-proprietários.

A Fruitful Trip to Europe Kicks Off Work on Indigenous Languages Grant

Featured photo: Howard Reid’s collection of research materials from his ethnographic field work with the Hup in Brazil; photo: S. Kung

Susan Kung, manager of the Archive of the Indigenous Languages of Latin America (AILLA), kicked off work on the new National Endowment for the Humanities grant, Archiving Significant Collections of Endangered Languages: Two Multilingual Regions of Northwestern South America (PD-260978-18, Co-PIs Patience Epps and Susan Kung) with a seven-week trip to the UK and France to acquire and begin the work of digitizing three of the eight collections included in the grant.

Susan Kung scans slides from the collection of Elsa Gomez-Imbert; Linguistics Resource Room, SOAS; photo by Bernard Howard

Kung’s work in the UK relied heavily on collaboration with the Endangered Languages Archive (ELAR) at the School of Oriental and African Studies (SOAS), University of London. ELAR, like AILLA, is a digital repository that specializes in providing online access to, and long-term preservation of, multimedia materials in and about endangered indigenous languages. Kung’s trip started in London with a series of meetings at SOAS, where she helped to provide training to researchers in language documentation, archiving, and preservation methodologies, and helped ELAR’s staff plan for its imminent data migration.

Open reel tape machine, Linguistics Resource Center, SOAS; photo: S. Kung

From there, Kung headed to Cajarc in the southwest of France to work with Dr. Elsa Gomez-Imbert, a retired researcher from the French National Research Center who conducted linguistic fieldwork in the Colombian Vaupés from 1973 to 2010 on several different languages of the region, including Tatuyo, Barasana, Karapana, Eduria, Bará, and Makuna, all of which are members of the Eastern Tukanoan language family.

Susan Kung & Elsa Gomez-Imbert in Cajarc, France; photo: S. Kung

Kung and Gomez-Imbert spent four days compiling metadata and creating an inventory of Gomez-Imbert’s audio tapes and slides, all of which Kung then transported to London for digitization at SOAS.

Cajarc, France; photo: S. Kung

Back in London, Kung spent a day doing similar work with Dr. Howard Reid, an anthropologist, documentary filmmaker for the BBC, and chair of the Royal Anthropological Institute’s Film Committee, who lived with the hunter-gatherer Hup people in the Amazon basin in 1974–76.

Susan Kung and Howard Reid in London

Howard Reid’s collection of research materials from his ethnographic field work with the Hup in Brazil; photo: S. Kung

Kung finished up the acquisition part of her trip with four days of inventory and metadata work with Dr. Stephen Hugh-Jones, Emeritus Research Associate at the Cambridge University Department of Social Anthropology, at his office in King’s College, Cambridge. Hugh-Jones and his wife, Christine Hugh-Jones, lived with the Barasana people in the Colombian Vaupés in 1968–1971 and again in 1978–1979, along with their two young children on the second occasion. Over the course of 50 years, Hugh-Jones has worked with the Barasana, as well as the Bará, Eduria, Makuna, and Tatuyo peoples in the Colombian Amazon. His research has included ritual, symbolism and mythology, shamanism, kinship, architecture, barter and gift exchange, food and drugs, and ethno-education.

Stephen Hugh-Jones and Susan Kung, courtyard of King’s College, Cambridge; photo: S. Kung

The Hugh-Jones collection consists of born-digital and analog (cassette and open reel) audio recordings, 45 field notebooks, manuscript transcriptions of recordings, photographs and negatives, and an unprecedented accumulation of indigenous artworks. Kung, along with Bernard Howard, the sound technician for the SOAS Linguistics Department, spent three weeks digitizing these collections at SOAS, where Howard concentrated on digitizing the 137 audio tapes (cassettes and open reels) and Kung focused on scanning slides and paper documents.

Bernard Howard, sound technician, SOAS, working with cassette tapes from the collection of Elsa Gomez-Imbert

By the time Kung returned to Austin in mid-October, she and Howard had finished digitizing two of the three collections (those of Elsa Gomez-Imbert and Howard Reid), and Kung had finished digitizing the indigenous art compiled by the Hugh-Joneses.

50 years’ worth of ethnographic research in a wooden cart (Hugh-Jones collection), courtyard of King’s College, Cambridge

Before returning home, Kung returned Reid’s and Gomez-Imbert’s collections to them, and shipped the remainder of the Hugh-Jones collection to AILLA, where it will be digitized during this academic year and then returned to the Hugh-Joneses. Once all the digital files from all three collections have been curated in collaboration with Gomez-Imbert, Reid, and Hugh-Jones, they will be ingested into AILLA and made available for public viewing.

NEH Grant Will Fund Transcription of Indigenous Language Collection

BY J. RYAN SULLIVANT

The Archive of the Indigenous Languages of Latin America (AILLA) has received a pilot grant from the Humanities Collections and Reference Resources program of the National Endowment for the Humanities. This grant will improve access to some of the archive’s thousands of audio recordings in indigenous languages by supporting pilot efforts to crowdsource the creation of digital texts for manuscript transcriptions and translations that accompany recordings already in AILLA’s collections. Specifically, the grant will support the transcription of materials in the Mixtec languages of Mexico that are included in the MesoAmerican Languages Collection of Kathryn Josserand. These materials include a very broad survey of the grammar and vocabulary of the Mixtec languages spoken in over 100 towns and villages of southern Mexico.

Transcription of Tehuelche, from the AILLA archive of Jorge Suárez

Digital transcriptions will improve users’ access to these materials and will also facilitate their reuse for humanistic and especially linguistic research on the dialectology of the Mixtec languages, which, decades after these materials were collected, is still not completely understood. They will also contribute to research on the prehistory of the Mixtec-speaking people, who today number almost a half-million in Mexico. One component of the project will be the development of educational modules that will use the transcription task to teach lessons on linguistic transcription, language description, and historical linguistics. This pilot project will also allow AILLA to develop transcription workflows that can be applied to other significant collections of handwritten documents in the archive’s holdings.

Pilot project will improve access to a collection of Mixtec audio recordings.

The project’s principal investigator is Professor Virginia Garrard, director of LLILAS Benson Latin American Studies and Collections. The project manager is Ryan Sullivant, AILLA language data curator.

Survey in Chalcatongo Mixtec (with Spanish above), from the AILLA collection of J. Kathryn Josserand

The National Endowment for the Humanities, created in 1965 as an independent federal agency, supports research and learning in history, literature, philosophy, and other areas of the humanities by funding selected, peer-reviewed proposals from around the nation. Additional information about the National Endowment for the Humanities and its grant programs is available at www.neh.gov.

For more information on the AILLA transcription project, contact Ryan Sullivant.

AILLA Awarded Grant from the National Endowment for the Humanities

The National Endowment for the Humanities (NEH) has awarded a Documenting Endangered Languages Preservation Grant of $227,365 to Patience Epps and Susan Smythe Kung of the Archive of the Indigenous Languages of Latin America (AILLA) for support of their upcoming project entitled “Archiving Significant Collections of Endangered Languages: Two Multilingual Regions of Northwestern South America.”

The AILLA grant is one of 199 grants, totaling $18.6 million, announced by the NEH on April 9, 2018.

This is a three-year project that will gather together, curate, and digitize a set of eight significant collections of South American indigenous languages, the results of decades of research by senior scholars. The collections will be archived at AILLA, a digital repository dedicated to the long-term preservation of multimedia in indigenous languages. These materials constitute an important resource for further linguistic, ethnographic, and ethnomusicological research, and are of high value to community members and scholars. They include six legacy collections from the Upper Rio Negro region of the northwest Amazon (Brazil and Colombia), and two collections focused on Ecuadorian Kichwa, most notably the Cañar variety.

Women spinning wool, Juncal, Cañar, Ecuador; photo: Niels Fock/Eva Krener, 1973

All of the languages concerned are endangered or vulnerable to varying degrees, and the collections are heavily focused on threatened forms of discourse, such as ritual speech and song. Of the Upper Rio Negro set, the collections of Elsa Gomez-Imbert, Stephen Hugh-Jones, and Arthur P. Sorensen, Jr., include the East Tukanoan languages Bará, Barasana, Eduria, Karapana, Tatuyo, Makuna, and Tukano. The collections of Howard Reid and Renato Athias are focused on Hup, while Reid’s collection also contains a few materials from two languages of the wider region, Nukak and Hotï (yau, isolate). Robin Wright’s collection involves Baniwa. Of the Ecuadorian Kichwa set, Judy Blankenship’s and Allison Adrian’s collections are both focused on Cañar Highland Kichwa, while Adrian’s also includes some material from Loja Highland Kichwa (qvj, Quechua).

The two regions targeted by these collections are highly significant for our understanding of language contact and diversity in indigenous South America. The multilingual Upper Rio Negro region, famous for the linguistic exogamy practiced by some of its peoples, has much to tell us about language contact and maintenance, while Ecuadorian Kichwa varieties can shed light on the dynamics of pre-Columbian language shift. These collections will be made accessible in AILLA in standard formats, and will provide a foundation for further study of these fascinating regions and multilingual dynamics.

