The AAPCAppE is a corpus of one million words of Appalachian speech, which will be available to the public in 2017.
Though often socially stigmatized, Appalachian English is historically central to the development of American English from its British origins, and this project will provide a resource unprecedented in scope and in public accessibility for cultural, historical, and linguistic research on the English of Appalachia.
The AAPCAppE is based on existing oral history projects housed at institutions around the Appalachian region; some of these have been vetted, transcribed, and organized by M. Montgomery (see AAPCAppE Interviews link above). The goals of this project included (a) digitizing the recordings, (b) time-aligning the digitized sound files with the transcripts, and (c) annotating the transcripts with detailed grammatical information, also known as “part-of-speech tagging” and “parsing.”
Digitizing the recordings has preserved this valuable cultural resource for future generations, and time-aligning the digitized recordings with the transcripts allows researchers to rapidly find desired parts of the speech signal by searching the transcribed text. The grammatical annotation allows in-depth analyses of particular constructions that are specific to Appalachian English or typical of vernacular American English more generally, as well as comparisons of Appalachian English with other vernacular Englishes and with earlier stages of the language.
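To illustrate how time alignment lets a text search lead back to the audio, here is a minimal sketch in Python. It is not the project's actual tooling or file format: the `Segment` record, the `find_segments` helper, and the sample utterances are all hypothetical, invented only to show the idea that each stretch of transcript carries start and end times.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One time-aligned stretch of transcript (hypothetical layout)."""
    start: float  # seconds from the beginning of the recording
    end: float    # seconds from the beginning of the recording
    text: str     # transcribed words for this stretch


def find_segments(segments, query):
    """Return the segments whose text contains the query, ignoring case."""
    needle = query.lower()
    return [seg for seg in segments if needle in seg.text.lower()]


# Invented sample data for illustration only; not taken from the corpus.
segments = [
    Segment(0.0, 3.2, "We lived up on the mountain then."),
    Segment(3.2, 7.9, "My daddy was a-working in the mines."),
    Segment(7.9, 11.4, "He was a-going down there every morning."),
]

hits = find_segments(segments, "a-working")
for seg in hits:
    # The time span tells a researcher where to listen in the sound file.
    print(f"{seg.start:.1f}-{seg.end:.1f}s: {seg.text}")
```

A search for a word or construction returns the matching time spans, so a researcher could jump directly to those points in the recording rather than listening through the whole interview.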
Because the corpus is large, publicly available, and searchable online with standard, freely accessible, user-friendly computational tools, it will foster replicability, thereby contributing to increased empirical rigor in linguistic research. These same properties will also make it possible to use the corpus as a teaching tool at the primary and secondary education levels, as well as at college and graduate levels. On a more general level, this corpus will deepen our understanding of America’s linguistic heritage and promote a scientifically informed appreciation of regional language and culture.