Tag Archives: copyright

Read, Hot, and Digitized: Adventures in Data-Sitting

Read, hot & digitized: Librarians and the digital scholarship they love. In this series, librarians from the UT Libraries Arts, Humanities and Global Studies Engagement Team briefly present, explore and critique existing examples of digital scholarship. Our hope is that these monthly reviews will inspire critical reflection on, and future creative contributions to, the growing fields of digital scholarship.

It will come as no surprise that I, the English Literature Librarian, was a nerdy little bookworm as a child. I actively participated in the Book It! reading program, a literacy initiative sponsored by Pizza Hut. The premise of Book It! was simple: After completing five books and getting the sign-off from my teacher, I would “earn” a coupon for a personal pan pizza. When I was in 5th grade, I read enough Baby-Sitters Club (BSC) books in a single week to earn three pizzas. I felt a tinge of guilt because I had skipped early chapters in each book where the text was reused, word-for-word, from previous books in the series. It was always Chapter 2!

Every devoted Baby-Sitters Club fan knows the text was reused to introduce the characters and the premise of the series. There were over 200 books published in the span of 13 years – of course some of it would be repetitive! But let’s take it a step further. What if we could quantitatively demonstrate the reuse of Chapter 2 text, while also comparing stylistic and narrative changes across multiple ghostwriters and cultural trends? And how would you do this kind of analysis of 200+ novels, spin-offs, and graphic novel adaptations? Well, a feminist collective of scholars called the Data-Sitters Club (DSC) is attempting to do just that.
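For the curious, here is a minimal sketch of one way text reuse between two chapters could be measured. It is not the Data-Sitters Club's own method: it assumes a simple approach that compares overlapping five-word sequences (Jaccard overlap), and the function names and sample sentences are invented for illustration rather than taken from the books.

```python
import re


def word_ngrams(text, n=5):
    """Return the set of lowercase word n-grams found in text."""
    words = re.findall(r"[a-z']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def reuse_score(text_a, text_b, n=5):
    """Jaccard overlap of word n-grams: 0.0 means nothing shared, 1.0 means identical sets."""
    a, b = word_ngrams(text_a, n), word_ngrams(text_b, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


# Invented placeholder passages (hypothetical, NOT quotations from the series).
chapter_a = ("The club meets three times a week in the bedroom of the "
             "vice president who has her own phone line")
chapter_b = ("The club meets three times a week in the bedroom of the "
             "vice president and we answer calls from parents")
chapter_c = ("That summer we drove to the coast and spent every afternoon "
             "looking for shells on the beach")

print("a vs b:", round(reuse_score(chapter_a, chapter_b), 2))
print("a vs c:", round(reuse_score(chapter_a, chapter_c), 2))
```

A score near 1.0 would suggest near-verbatim reuse, while a score near 0.0 would suggest the passages share little wording. Real projects typically need more care with tokenization, OCR errors, and passage alignment than this sketch provides.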

Cover art for the Data-Sitters Club, by artist Claire Chenette

The Data-Sitters Club describe their project as “a fun way to learn about computational text analysis for digital humanities”. They created a corpus of Ann M. Martin’s influential young adult series and have analyzed it using a variety of DH methods and tools (Python, R, TEI, Voyant, just to name a few). The Baby-Sitters Club has had a long pop culture shelf-life for Gen X and Millennial readers, with the recent Netflix reboot (which was sadly canceled after two seasons) and the podcasts Stuck in Stonybrook and the Baby-Sitters Club Club. According to the publisher Scholastic, the series has been in print since 1986 and has sold more than 190 million copies. Given the series’ immense popularity and continued pop culture influence, the books are a gold mine for researchers interested in gender, race, class, and sexuality, but, like much of girl culture, the books haven’t been the subject of serious research.

So the Data-Sitters Club saw opportunity for new research, while also making DH more accessible, especially to women and other marginalized groups often sidelined in DH projects. The DSC does this through a series of 16 blog posts on their GitHub site, written to mimic the narrative style of the book series, including titles that riff off the originals. Each blog post covers a use case for the BSC corpus and features a different tool, coding language or technique. Two of my favorites are DSC #2: Katia and the Phantom Corpus and DSC #5: The DSC and the Impossible TEI Quandaries. (A running joke throughout the blog is that later posts refer the reader back to “Chapter 2” to explain the corpus and how it was created, an intentional reference to the Chapter 2 in the original series that reused text to explain the series’ premise.)

Cover art for DSC #2: Katia and the Phantom Corpus, which parodies an original Baby-Sitters Club book cover that I’m pretty sure I read in 3rd or 4th grade. Image courtesy of the Data-Sitters Club

One thing you won’t find on the DSC GitHub site is the corpus itself. The team scanned print books to create a legal corpus, but as of right now, it’s not available publicly online. The DSC has used the project as an advocacy tool to promote the loosening of ebook copyright restrictions to build literary corpora for private research. In partnership with the non-profit Authors Alliance, they wrote to the Librarian of Congress asking for exemptions to the Digital Millennium Copyright Act of 1998 to access the full BSC corpus. Of all the DSC blog posts, I found DSC #7: The DSC and the Mean Copyright Law to be the most fascinating – and frustrating.

I would recommend the Data-Sitters Club blog to any emerging DH scholar or librarian looking to try a new tool or method. Much of the content is highly technical, but the fun, approachable tone of each blog post makes the content accessible. I hope they are able to get legal access to the full ebook corpus so we can see more research on the Baby-Sitters Club books and better understand their cultural impact on a generation of women and girls.

You can find print copies of the original Baby-Sitters Club series in the PCL Youth Collection, and I highly recommend the recent essay collection We Are the Baby-Sitters Club: Essays and Artwork from Grown-up Readers, available at the PCL.  

Building Relations, Connecting UT Libraries to the Coast and Back

Jessica Trelogan discusses data management.

Knowledge, relationship, awareness, perception, assessment, responsiveness, realization, recognition, insight, creativity, vision, and GRASP! Bingo, a seminar!

After a year’s planning and one conversation between a marine science librarian and a faculty member, a grand opportunity came to fruition for the Marine Science Library to connect the Marine Science Institute and its regional partners with UT Libraries. On August 19, we hosted a 2-hour seminar on Scholarly Publishing & Data Management at the institute in Port Aransas. Yes, “that place on the beach!” By inviting expert librarians from UT Libraries, a diverse audience received an informative session on topics relevant to researchers, librarians and students.

Colleen Lyon covers copyright and the basics of scholarly communications.

Colleen Lyon, Scholarly Communications Librarian at UT Libraries, covered the basics of copyright, transfer agreements associated with copyright, open access publishing and how to legally share research on online tools like ResearchGate and Academia.edu.

Jessica Trelogan, Data Management Coordinator at UT Libraries, shared her expertise on basic data management planning and principles. Requirements from funding agencies, publishers, and institutions continue to create pressures on researchers who are already stretched for time and funds. Jessica discussed the process of creating and writing a Data Management Plan (DMP), how to make data more discoverable, accessible and reusable, and provided useful resources.

The event was held in the large seminar room located in the Estuarine Research Center building, creating a comfortable and relaxed atmosphere, with views of the dunes and Gulf of Mexico. The small group of participants included faculty, staff and students from the Marine Science Institute and librarians from Texas A&M University-Corpus Christi. Throughout the seminar, thought-provoking questions led to some great discussions and our presenters handled them with ease.

After the session, attendees had an opportunity to chat, while enjoying a delicious lunch provided by the Mustang Island Food Company of Port Aransas.

The Marine Science Library continues to find creative ways to provide opportunities for learning and research. The seminar was a great success!

Jessica Trelogan, Liz De Hart and Colleen Lyon.

Ruminations on Copyright Reform

Image courtesy Horia Varlan's Flickr photostream under a Creative Commons license.

Thoughts by the University of Texas Libraries Scholarly Communications Advisor and resident copyright expert Georgia Harper on Pam Samuelson’s article, “Reforming Copyright is Possible,” published in the July 9 edition of The Chronicle of Higher Education.

Pam Samuelson is a visionary copyright scholar, winner of a MacArthur Grant, and an optimist. She believes that despite the dim prospects for badly needed comprehensive copyright reform, we can take small steps to make big improvements, both within and outside the legislative process. Several of her proposals for libraries’ independent action exhort us to rely more confidently on fair use, engage in concerted efforts to search for owners of out-of-commerce works so that people may more freely use works whose owners cannot be found, and work together to bring our out-of-copyright works to digital life. For example, she applauds the efforts to create a Digital Public Library that would provide public access to public domain works. She is right. All of these ideas are good ones that deserve our attention and our action.

Her suggestions about how modest legislative efforts could improve the picture for public access to libraries’ holdings are more difficult to embrace.

On Litigating Fair Use

From Duke University Libraries:

When the Association of Research Libraries wrote a letter to the CCC expressing disappointment over the decision to help underwrite the lawsuit, CCC’s reply emphasized that no damages were being sought and maintained that their participation had the simple goal of “clarifying” fair use. This strikes me as disingenuous. There are more efficient ways to clarify fair use than litigation, and the CCC has a definite financial interest in the case even absent any request for damages. CCC’s aim here is not to clarify fair use but to narrow it dramatically, to their direct and immediate profit.

The argument developed here by Kevin Smith places the Copyright Clearance Center (CCC) in a harsh light – subvening copyright violation litigation in order to further restrict access options to intellectual property, thus securing its own role in the publishing community while attempting to prop up that foundering industry a little longer. As Paul Courant observes elsewhere (and thanks to Paul for the pointer to this article), it forces a reluctant higher education community to seek alternatives to its own and commercial presses – an outcome potentially fatal to the industry.