Authors
Advisor(s)
Abstract(s)
The analysis of the co-occurrence patterns between words allows for a better understanding of the use (and meaning) of words and its most straightforward applications are lexicography and linguist description in general. Some tools already produce co-occurrence information about words taken from Portuguese corpora, but few can use lemmata or syntactic dependency information. Syntax Deep Explorer is a new tool that uses several association measures to quantify several co-occurrence types, defined on the syntactic dependencies (e.g. subject, complement, modifier) between a target word lemma and its co-locates. The resulting co-occurrence statistics is represented in lex-grams, that is, a synopsis of the syntactically-based co-occurrence patterns of a word distribution within a given corpus. These lex-grams are obtained from a large-sized Portuguese corpus processed by STRING [19] and are presented in a user-friendly way through a graphical interface. The Syntax Deep Explorer will allow the development of finer lexical resources and the improvement of STRING processing in general, as well as providing public access to co-occurrence information derived from parsed corpora.
Description
Keywords
Natural Language Processing (NLP) Co-occurrence Collocation Association measures Graphic interface Lex-gram Portuguese
Citation
Publisher
SPRINGER INT PUBLISHING AG