Reading Moravian Lives: Overcoming Challenges in Transcribing and Digitizing Archival Memoirs
Start Date
30-10-2016 8:30 AM
End Date
30-10-2016 9:30 AM
Description
The Moravian Lives project aims to digitize, transcribe, and publish for analysis more than 60,000 manuscript and print memoirs written by members of the Moravian Church between 1750 and 2012. These memoirs are housed in archives throughout the world, making it difficult for scholars to engage with them as an entire corpus. Furthermore, over 90% of the 18th-century memoirs are in manuscript form. As project collaborators establish the foundations of a massive digital archive that houses facsimiles of the memoirs, we wrestle with how best to publish them in machine-readable format: existing optical character recognition (OCR) software does not reliably handle 18th-century German script, and the volume of pages to be transcribed strains traditional transcription capabilities. Research teams at Bucknell and the University of Gothenburg in Sweden are collaborating to develop a suite of tools that will support large-scale, controlled crowdsourcing of transcription and the export of text and data sets to meet a wide range of research needs of scholars in fields ranging from autobiography to theology, religious history, social history, historical and computational linguistics, and gender studies. In this paper, members of the Bucknell team, led by Katie Faull, will discuss the challenges we face as we establish best practices for developing an interactive platform for editing and accessing this critically significant collection.
Type
Presentation
Session
#s4a: Collaborating, Publishing, and Community Participation, moderator Kathleen McQuiston
Language
eng
Location
Elaine Langone Center, Walls Lounge