Monthly Archives: July 2023

AHRC iDAH Virtual Summer School in digitisation and analysis of textual sources 

Course 1: Text extraction from printed sources (11 & 12 September) 

Course 2: Text extraction from handwritten sources (13 & 14 September) 

Course 3: Digital ‘Distant reading’ with Voyant Tools and AntConc (11 & 12 September) 

Course 4: Digital ‘Close reading’ with NVivo (13 & 14 September) 

Optional: Bring Your Own Data Surgeries (15 September, a.m. & p.m.) 

These four short online training courses aim to provide Arts & Humanities researchers with foundational knowledge and skills in automated text extraction and analysis of textual sources, using free or widely available commercial applications. Participants will be able to apply digital visualization and analysis techniques to source materials, including those which only exist in physical form, such as books, manuscripts, and archives.

Delivered by the University of Exeter as part of the AHRC’s iDAH Digital Skills Training Network, these courses are intended for academic professionals, researchers in Higher Education Institutions and Research Organizations, and librarians conducting research related to Arts and Humanities topics. They are foundation level and do not require programming skills or prior experience. Each course consists of two full-day sessions (plus an optional half day). They will be conducted fully online, with a maximum of 10 participants on a first-come, first-served basis. To apply, please fill in the application form.

Course 1: Text extraction from printed sources 

This course will instruct participants in best practices for converting physical objects containing printed text, such as books, manuscripts and archives, into digital formats by using scanners, DSLR cameras and mobile phones. It will then focus on the use of OCR tools and techniques, such as Google Docs OCR, Adobe Acrobat Pro DC and ABBYY FineReader, to convert images of text into searchable and processable digital text. 

Course 2: Text extraction from handwritten sources 

This course will instruct participants in best practices for converting physical objects containing handwritten text, such as correspondence, manuscripts and archives, into digital formats by using scanners, DSLR cameras and mobile phones. It will then focus on the use of the online Transkribus HTR platform to convert images of text into searchable and processable digital text. 

Course 3: Digital ‘Distant reading’ with Voyant Tools and AntConc 

This course will cover a variety of methods for ‘distant reading’ (high-level analysis of texts and corpora), using freely available software. Voyant Tools will be used for rapid visualization and exploration of word patterns within a text. AntConc allows users to perform a variety of corpus linguistics analysis techniques.
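For readers curious about what sits underneath this kind of analysis, here is a minimal sketch in Python of two basic ‘distant reading’ operations: counting word frequencies and listing a keyword in context. It is not how Voyant Tools or AntConc are implemented, and the course itself requires no programming. The function names and the sample sentence are made up purely for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text, top_n=5):
    """Return the top_n most frequent words with their counts."""
    return Counter(tokenize(text)).most_common(top_n)


def concordance(text, keyword, width=2):
    """Return keyword-in-context lines, with `width` words either side of each hit."""
    tokens = tokenize(text)
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


# Invented sample text, used only to demonstrate the functions.
sample = "The bee is busy. The busy bee makes wax, and the wax keeps the bee busy."

print(word_frequencies(sample, top_n=3))
for line in concordance(sample, "wax"):
    print(line)
```

Dedicated tools add much more on top of this (stop-word lists, visualizations, collocation statistics), but the frequency list and the keyword-in-context view are the basic building blocks.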

Course 4: Digital ‘Close reading’ with NVivo 

This course introduces participants to NVivo, a software package widely available through university licenses and used for the close analysis of primary sources. Participants will learn how to transcribe and annotate texts and perform analysis and visualizations based on methods such as topic modelling, thematic clustering and sentiment analysis. 

Optional: Bring Your Own Data Surgeries 

An optional Friday morning or afternoon session will provide participants with some additional support when applying these methods to their own materials and sources.

For additional information or any further inquiries, please feel free to email us at digitalhumanities@exeter.ac.uk and we will respond promptly.

Digital Humanities Interns 2022/23 part 4


Each year we ask our interns to write a blog post at the end of their time working with us, looking back on their experience in the DH Lab. Here is the fourth of this year’s posts, from Heide:

Hello, I’m Heide, now a third-year English undergraduate (and set to graduate in July 2023). I have been lucky enough to have another year working as an intern at the Digital Humanities Lab and have worked on more exciting projects since my last blog post.

This year I continued to digitize letters and manuscripts from the Culver House collection using the A0 Copystand in Lab 1, exploring the “Busy Bee” manuscripts and the historical life of Culver House with our Lab Technician, Bronte Lyster. Other notable 2D photography projects included historical maps of Exeter from the University’s Special Collections and playbills from the Exeter Theatre Royal.  

One project in Lab 2 involved using the archival book cradle to digitise a text titled L’Historia Ecclesiastica, owned by Queen Elizabeth’s head executioner. The text features marginalia written by the man himself! Digitisation allowed a previously inaccessible text to be preserved and made available to researchers with specific page requests.

This year brought the production of my first flawless photogrammetry model. Although the antler bone knife from the archaeology department’s teaching packs was not as cute as my gray crochet bunny from my first year, it certainly made a better photogrammetry model subject. I have been working alongside our other interns to produce photogrammetry models of the University’s archaeology teaching packs. These models would allow more accessible teaching, remote learning and easier access to the artifacts for analysis outside of the classroom. This project follows on from the work I mentioned in my previous blog post – the Reflectance Transformation Imaging (RTI) models that I helped produce of archaeological arrowheads. We have produced both RTI and photogrammetry models of the teaching materials.

My colleague, Julia Hopkin, sparked my interest in spinning by helping me to print my first 3D printed Turkish drop spindle on our Ultimaker printer at the start of my year. After the initiation of our new Formlabs resin printer I am now the proud owner of a resin printed Turkish drop spindle as well. The new resin printer has also allowed us to create beeswax replicas of original beeswax fragments. The process involves creating a photogrammetry model of the original fragment, printing the model on the resin printer, creating a cast of the print using silicone, then pouring and setting molten beeswax in the moulds to create a replica. These replicas are incredible, as the original wax artifacts are extremely fragile and prone to decay, so are not on display at all. The replica-making session was also quite fun to do and displays the full circle of digital humanities, from the physical to the digital and back again.

This year I also worked alongside a team on the start of the Connecting Late Antiquities project, which seeks to digitise all volumes of the Prosopography of the Later Roman Empire (or PLRE for short). For most of my contribution I worked on troubleshooting and machine learning in ABBYY FineReader, software that allows for some machine learning and training for Optical Character Recognition (OCR). OCR converts photographed text into editable text, which can then be used for XML markup and eventual online publication. The PLRE volumes will also be updated with new knowledge that has developed in the field. For more information on Connecting Late Antiquities please visit the link provided below.
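To give a rough idea of what the step from OCR output to XML markup can look like, here is a minimal sketch in Python. It is not the project's actual workflow or schema. The element names (volume, entry, name, text), the function name and the placeholder text are all assumptions made up for illustration, and it assumes each entry in the OCR output is separated by a blank line with the heading on its first line.

```python
import xml.etree.ElementTree as ET


def entries_to_xml(ocr_text):
    """Wrap each blank-line-separated block of OCR text in a hypothetical <entry> element."""
    root = ET.Element("volume")
    blocks = [block.strip() for block in ocr_text.split("\n\n") if block.strip()]
    for number, block in enumerate(blocks, start=1):
        lines = block.splitlines()
        entry = ET.SubElement(root, "entry", n=str(number))
        # Assumption: the first line of each block is the entry heading.
        ET.SubElement(entry, "name").text = lines[0].strip()
        # Remaining lines are joined back into one running paragraph.
        ET.SubElement(entry, "text").text = " ".join(line.strip() for line in lines[1:])
    return root


# Placeholder text standing in for OCR output; not real PLRE content.
ocr_output = "PERSON A\nfirst line of entry\nsecond line\n\nPERSON B\nanother entry"

root = entries_to_xml(ocr_output)
print(ET.tostring(root, encoding="unicode"))
```

In practice the OCR output needs careful correction before any markup is applied, which is where most of the troubleshooting time goes.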

Overall I have thoroughly enjoyed my two years as a Digital Humanities intern and have found it to be an invaluable experience. The skills I have learned in both digital archival technology and customer service have helped shape my career path and university experience. I enthusiastically recommend the intern position to any Exeter University student, both for the skills it teaches and for the enjoyable workplace experience.

For anyone who would like to read my previous blog post please find a link to it here: https://digitalhumanities.exeter.ac.uk/2022/07/digital-humanities-intern-heide/  or visit digitalhumanities.exeter.ac.uk  

To find out more about Connecting Late Antiquities visit https://news-archive.exeter.ac.uk/homepage/title_953010_en.html#:~:text=Connecting%20Late%20Antiquities%20will%20begin,on%20the%20Cambridge%20Core%20platform.