Start Date
12-10-2019 9:00 AM
End Date
12-10-2019 10:30 AM
Description
Interdisciplinary collaboration among two faculty members in the humanities and computer science, a research librarian, and an undergraduate student has led to remarkable results in an ongoing international DH research project centered on 18th-century manuscripts. The corpus stems from a vast collection of archival materials held by the Moravian Church in the UK, Germany, and the US. The number of pages to be transcribed, differences in handwriting styles, paper quality, and original language pose enormous obstacles to human transcription. This presentation will review the hypothesis, process, and findings of a summer research project that builds upon the Transkribus (Transkribus.eu) platform and seeks to refine the process for creating Handwritten Text Recognition (HTR) models to further improve accuracy. An undergraduate student working with a faculty member in computer science developed a deep learning model to help overcome challenges of accuracy in computer transcription.
Keywords
machine learning, paleography, manuscript studies, Transkribus, collaboration
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Type
Presentation
Session
#s1a
Language
eng
Location
Center room
Included in
Collaborating on Machine Reading: Training Algorithms to Read Complex Collections