Weblog Alex Reuneker

Annotation reliability as a preliminary for corpus research

— Posted in Taal by

On Friday, 17 February 2023, I gave a talk in the Sociolinguistics Series at Leiden University Centre for Linguistics (LUCL), entitled "Annotation reliability as a preliminary for corpus research". Thanks to Marina Terkourafi, Janet Connor, and Arie Elsenaar for organizing this series!

I presented an experiment on the reliability of annotating conditionals in corpus data. I got some great questions and suggestions for further research. Much appreciated! Below you can find the abstract.

Annotation reliability as a preliminary for corpus research

In corpus research, language data are frequently annotated by analysts, but measures of reliability are rarely reported. When annotations concern interpretative features such as implicatures, this lack of reported reliability poses problems for subsequent steps in the analysis. In this talk, three connected issues are discussed in light of an experiment on the classification of coherence relations in conditionals. First, different classifications produce incompatible results when applied to language data. Second, discourse studies observe a discrepancy between theory and data, i.e., existing classifications are “too detached” from actual discourse. Third, while language users construct various cognitive relations between clauses, they do so without relying on overt linguistic features, which poses problems for composing annotation schemes. Based on the results of the experiment, I discuss the implications for corpus research on implicatures.

See this LUCL page for the other talks in this series.