Title

Reading Moravian Lives: Overcoming Challenges in Transcribing and Digitizing Archival Memoirs

Item Type

Presentation

Location

Elaine Langone Center, Walls Lounge

Session

#s4a: Collaborating, Publishing, and Community Participation, moderator Kathleen McQuiston

Start Date

30-10-2016 8:30 AM

End Date

30-10-2016 9:30 AM

Description

The Moravian Lives project aims to digitize, transcribe, and publish for analysis more than 60,000 manuscript and print memoirs written by members of the Moravian Church between 1750 and 2012. These memoirs are housed in archives throughout the world, making it difficult for scholars to engage with them as an entire corpus. Furthermore, over 90% of the 18th-century memoirs are in manuscript form. As project collaborators establish the foundations of a massive digital archive that houses facsimiles of the memoirs, we wrestle with how best to publish the memoirs in machine-readable format. Existing optical character recognition (OCR) software does not reliably handle 18th-century German script, and the volume of pages to be transcribed strains traditional transcription capabilities. Research teams at Bucknell and the University of Gothenburg in Sweden are collaborating to develop a suite of tools that will support large-scale, controlled crowdsourcing of transcription, along with the export of text and data sets, to serve a wide range of research needs for scholars in fields ranging from autobiography to theology, religious history, social history, historical and computational linguistics, and gender studies. In this paper, members of the Bucknell team, led by Katie Faull, will discuss the challenges we face as we establish best practices for developing an interactive platform for editing and accessing this critically significant collection.

Language

eng
