This month I delivered Digital Humanities Second Library Lab, a hands-on showcase of digital library collections and tools created to support innovative research using computational methods. This three-hour session followed on from a previous event I ran in March and concluded a short run of events that form part of DH@Manchester.
The aim of the workshop was to inspire researchers at all levels to gain practical experience with tools and techniques so that they could go on to develop individual research projects with these or similar collections. Participants did not need any technical experience to join in, other than basic office and web browsing skills. The workshop plan and instructions are available online.
What projects and collections did we look at?
The three activities focused on image searching, analysing text and analysing colour. We looked at projects including the following.
- Broadside Ballads Online from the Bodleian Libraries (University of Oxford), a digital collection of English printed ballad-sheets from between the 16th and 20th centuries that includes a feature to search for an image within an image. The collection includes digitised ballad-sheets from The University of Manchester Library’s Special Collections following work by visiting researcher Dr Giles Bergel with the John Rylands Research Institute.
- JSTOR Text Analyzer from JSTOR Labs, a beta tool that identifies the topics of any document you give it and recommends articles and chapters from JSTOR on the same topics.
- Robots Reading Vogue from Yale University Library’s Digital Humanities Lab, a collection of tools to interrogate the text within the entire U.S. Vogue Archive (ProQuest) and its front covers, such as a topic modeller, N-gram viewer and various colour analysis methods.
While developing this workshop, I created a project of my own to visualise the average colour used in the front covers of all full-colour issues of Illustrated London News (Gale Cengage). Just a few short Python scripts were required to extract this information from the collection and display it in an interactive web page. This allowed us to look for trends in particular hues, such as the more common use of reds in December issues.
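To give a flavour of what such a script might involve, here is a minimal sketch of the averaging step. It is not the code used for the project: the function names and the tiny sample "cover" are hypothetical, and it assumes the pixels of each cover have already been read into a list of 8-bit RGB tuples (for example with an imaging library, which is not shown here). It takes a simple arithmetic mean of the red, green and blue channels, which is only one of several ways to define an "average colour".

```python
import colorsys


def average_colour(pixels):
    """Return the mean (R, G, B) of an iterable of 8-bit RGB tuples."""
    total_r = total_g = total_b = count = 0
    for r, g, b in pixels:
        total_r += r
        total_g += g
        total_b += b
        count += 1
    if count == 0:
        raise ValueError("no pixels supplied")
    return (round(total_r / count), round(total_g / count), round(total_b / count))


def hue_degrees(rgb):
    """Return the hue of an 8-bit RGB colour in degrees (0 to 360)."""
    r, g, b = (channel / 255 for channel in rgb)
    hue, _saturation, _value = colorsys.rgb_to_hsv(r, g, b)
    return hue * 360


def to_hex(rgb):
    """Format an 8-bit RGB colour as a CSS-style hex string for a web page."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Hypothetical four-pixel "cover": three reddish pixels and one pale one.
cover = [(200, 30, 40)] * 3 + [(240, 230, 240)]
avg = average_colour(cover)
print(avg, to_hex(avg), hue_degrees(avg))
```

Running something like this once per issue would give one colour per cover, which could then be plotted by date. The hue value is what makes it possible to ask questions such as whether reds cluster around a particular month.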
What did we learn?
After each activity we discussed some of the issues raised. (Incidentally, I captured key points on a Smart Kapp digital flipchart or smart whiteboard, continuing the “Digital First” principles that Library colleagues are adopting.)
- Image analysis and computer vision have many potential applications with library collections, such as identifying where printed or handwritten text occurs in an image, facial recognition, and detecting patterns or differences between editions or issues within a series.
- For image analysis systems to work best, the image sets and algorithms will need to be carefully curated and trained. This is a time-consuming process.
- The text analyser worked quite well but, as with the image search, was not perfect. It is important to find out precisely what “goes wrong” and why.
- Other applications for the text analysis tool include checking your grant application for gaps in the topics you think should be covered, checking the development of your thesis, or, for lecturers, checking students’ use of references in submitted papers.
- Being able to visualise an entire collection in one display (and then dive into the content) can give one an idea of what is there before selecting which physical item to go to the trouble of visiting and retrieving. Whitelaw (2015) suggests that such “generous interfaces” can open up the reader to a broader, less prescriptive view into a collection than the traditional web search.
- It would be even more useful to be able to compare different collections or publications against each other. This can be difficult when multiple licence holders or publishers are involved, with different technical or legal restrictions to address.
- Programming or other technical skills would need to be learned in order to develop or apply many tools. Alternatively, technical specialists would need to work in partnership with researchers, perhaps utilising the University’s Research IT service or the Library’s Digital Technologies & Services division.
Summary
Digital or computational tools and techniques are increasingly being applied to arts, humanities and social science methods. Many of the collections at The University of Manchester Library have potential for stimulating interdisciplinary research. Such Digital Scholarship projects would often require a greater level of technical knowledge or skill than many research groups might currently possess, so further training or provision for technical support might be necessary.
References
Whitelaw, M. (2015) ‘Generous Interfaces for Digital Cultural Collections’, Digital Humanities Quarterly, 9(1) [Online]. Available at: http://www.digitalhumanities.org/dhq/vol/9/1/000205/000205.html (Accessed: 25 May 2017).