Aysu Nur Yaman, Exploring Attribution in Turkish Discourse: An Annotation-Based Analysis

M.S. Candidate: Aysu Nur Yaman
Program: Cognitive Science

Abstract: Attribution involves recognizing and crediting sources, a process integral to both written and spoken discourse. This study extends existing frameworks, particularly the Penn Discourse TreeBank (PDTB), which elucidates how sources and statements are attributed in English, to Turkish texts using the Turkish Discourse Bank version 1.2 (TDB 1.2). The aim is to understand the mechanisms of attribution in Turkish and reduce dependency on manual annotation for text analysis. Employing insights from the literature, including lexical control and eventuality specific to Turkish, a tailored annotation scheme was developed. Data annotation achieved strong inter-annotator agreement with Cohen’s kappa coefficients: 0.83 for Arg1, 0.80 for Arg2, and 0.77 for Entire Discourse Relation (Entire Drel), indicating near-perfect to substantial agreement. Analysis of the annotated data revealed that the Other (Ot) category dominated with 294 instances, followed by Arg1 (234 instances) and Arg2 (217 instances). The majority of verbs were communicative such as dedi (‘said’) 130 times, dedim (‘I said’) 62 times, söyledi (‘told’) 49 times), with communicative verbs comprising 75.3% of occurrences in relevant categories. Novels and news emerged as rich domains for attribution study, with 1365 and 559 relations respectively. This analysis enriches the TDB and sets a foundation for future automated text analysis.