Start Date

October 6, 2018, 10:45 AM

End Date

October 6, 2018, 12:15 PM

Description

Access to research materials is an issue that cuts across disciplines and affects most researchers as they gather information. For a digital scholar in need of a textual corpus, however, these challenges may be particularly acute. Those studying mid-to-late 20th-century works may find themselves in uncertain territory with regard to copyright and licensing. Those studying historically marginalized populations may have trouble finding a pre-compiled corpus, or finding texts at all. Researchers at smaller institutions or in underfunded departments may find that existing datasets are not available to them due to cost, or that they run into copyright and licensing barriers when attempting to compile a large corpus of texts. Even an existing or easily harvested corpus may present structural challenges for our tools. How do we diversify and democratize digital scholarship while also navigating the difficulties of equitable access to information?

Keywords

text analysis, large text corpora, access

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Type

Presentation

Session

#s2b, moderator Diane Jakacki

Language

eng

Location

Elaine Langone Center, Center Room

Oct 6th, 10:45 AM Oct 6th, 12:15 PM

A Critical Look at the Digital Scholarship Corpus: How Access Influences the Questions We (Can) Ask
