Publication

Linguistics parameters for zero anaphora resolution

dc.contributor.advisor: Baptista, Jorge
dc.contributor.advisor: Evans, Richard J.
dc.contributor.author: Pereira, Simone Cristina
dc.date.accessioned: 2012-10-30T17:41:25Z
dc.date.available: 2012-10-30T17:41:25Z
dc.date.issued: 2010
dc.description: Master's dissertation, Natural Language Processing and Human Language Technology, Univ. do Algarve, 2009 [por]
dc.description.abstract: This dissertation describes and proposes a set of linguistically motivated rules for zero anaphora resolution in the context of a natural language processing chain developed for Portuguese. Some languages, like Portuguese, allow noun phrase (NP) deletion (or zeroing) in several syntactic contexts in order to avoid the redundancy that would result from repeating previously mentioned words. The co-reference relation between the zeroed element and its antecedent (or previous mention) in the discourse is here called zero anaphora (Mitkov, 2002). In Computational Linguistics, zero anaphora resolution may be viewed as a subtask of anaphora resolution, and it plays an essential role in various Natural Language Processing applications such as information extraction, automatic abstracting, dialog systems, machine translation and question answering. The main goal of this dissertation is to describe the grammatical rules that impose subject NP deletion and referential constraints in Brazilian Portuguese, in order to allow correct identification of the antecedent of the deleted subject NP. Some of these rules were then formalized in the Xerox Incremental Parser, or XIP (Ait-Mokhtar et al., 2002: 121-144), in order to constitute a module of the Portuguese grammar (Mamede et al., 2010) developed at the Spoken Language Laboratory (L2F). Using this rule-based approach, we expected to improve the performance of the Portuguese grammar, namely by producing better dependency structures with (reconstructed) zeroed NPs for the syntactic-semantic interface. Because of the complexity of the task, the scope of this dissertation had to be limited to: (a) subject NP deletion; (b) within sentence boundaries; and (c) with an explicit antecedent. In addition, (d) rules were formalized based solely on the results of the shallow parser (or chunks), that is, with minimal syntactic (and no semantic) knowledge.
A corpus of different text genres was manually annotated for zero anaphors and other zero-shaped, usually indefinite, subjects. The rule-based approach is evaluated, and the results are presented and discussed. [por]
dc.identifier.other: 81'36 PER*Lin
dc.identifier.uri: http://hdl.handle.net/10400.1/1787
dc.language.iso: eng [por]
dc.peerreviewed: yes [por]
dc.subject: Anaphora resolution [por]
dc.subject: Zero anaphora [por]
dc.subject: Linguistically motivated rule-based approach [por]
dc.subject: Brazilian Portuguese [por]
dc.title: Linguistics parameters for zero anaphora resolution [por]
dc.type: master thesis
dspace.entity.type: Publication
rcaap.rights: openAccess [por]
rcaap.type: masterThesis [por]
thesis.degree.grantor: Universidade do Algarve. Faculdade de Ciências Humanas e Sociais [por]
thesis.degree.level: Master [por]
thesis.degree.name: Master's in Natural Language Processing and Human Language Technology [por]

Files

Original bundle
Name: Dissertation_2010_05_26_revisada.pdf
Size: 409.02 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission