Start Date

12-10-2019 9:00 AM

End Date

12-10-2019 10:30 AM

Description

Interdisciplinary collaboration between two faculty members in the humanities and computer science, a research librarian, and an undergraduate student has led to remarkable results in an ongoing international DH research project centered on 18th-century manuscripts. The corpus stems from a vast collection of archival materials held by the Moravian Church in the UK, Germany, and the US. The number of pages to be transcribed and the variation in handwriting styles, paper quality, and original language pose enormous obstacles to human transcription. This presentation will review the hypothesis, process, and findings of a summer research project that builds upon the Transkribus (Transkribus.eu) platform and seeks to refine the process for creating Handwritten Text Recognition (HTR) models to further improve accuracy. An undergraduate student working with a faculty member in computer science developed a deep learning model to help overcome accuracy challenges in computer transcription.

Keywords

machine learning, paleography, manuscript studies, Transkribus, collaboration

Creative Commons License

Creative Commons Attribution-Noncommercial 4.0 License
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 4.0 License.

Type

Presentation

Session

#s1a

Language

eng

Location

Center room

Oct 12th, 9:00 AM Oct 12th, 10:30 AM

Collaborating on Machine Reading: Training Algorithms to Read Complex Collections
