Browsing by Author "Mamede, Nuno"
- Apoio ao planeamento de viagens em transportes públicos (Support for travel planning on public transport)
  Publication. Correia, Marisol B.; Mamede, Nuno
  This work presents a computer system that builds travel plans for people using public transport and selects the best of those plans according to criteria specified by the user, such as journey duration, ticket price, and the quality of the transport.
- Assisting European Portuguese teaching: linguistic features extraction and automatic readability classifier
  Publication. Curto, Pedro; Mamede, Nuno; Baptista, Jorge
  This paper describes two automatic systems: a linguistic features extractor and a text readability classifier for European Portuguese texts. The main goal is to assist the selection of adequate reading materials to support Portuguese teaching, especially as a second language. To extract features from texts, the system uses several Natural Language Processing (NLP) tools. Currently, 52 features are extracted: parts-of-speech (POS), syllables, words, chunks and phrases, averages and frequencies, among others. A classifier was created using these features and a corpus previously annotated with readability levels, adopting the official five-level language classification standard for Portuguese as a Second Language. In a five-level (A1 to C1) scenario, the best-performing learning algorithm (LogitBoost) achieved an accuracy of 75.11% with a root mean square error (RMSE) of 0.269. In a three-level (A, B and C) scenario, the best-performing learning algorithm (C4.5 grafted) achieved 81.44% accuracy, with an RMSE of 0.346.
- Automated anonymization of text documents
  Publication. Dias, Francisco; Mamede, Nuno; Baptista, Jorge
  Sharing data in the form of text is important for a wide range of activities, but it also raises privacy concerns when the data could be sensitive. Automated text anonymization is a solution for removing all sensitive information from documents. However, this is a challenging task due to the unstructured form of textual data and the ambiguity of natural language. In this work, we present our implementation of an automated anonymization system, built with a modular structure, for documents written in Portuguese. Four different methods of anonymization are evaluated and compared. Two methods replace the sensitive information with artificial labels: suppression and tagging. The other two methods replace the information with textual expressions: random substitution and generalization. Evaluation showed that the tagging and generalization methods make an anonymized text easier to read while preventing some of the semantic drift caused by the removal of the original information.
- Syntax Deep Explorer
  Publication. Correia, José; Baptista, Jorge; Mamede, Nuno
  The analysis of co-occurrence patterns between words allows for a better understanding of the use (and meaning) of words, and its most straightforward applications are lexicography and linguistic description in general. Some tools already produce co-occurrence information about words taken from Portuguese corpora, but few can use lemmata or syntactic dependency information. Syntax Deep Explorer is a new tool that uses several association measures to quantify several co-occurrence types, defined on the syntactic dependencies (e.g. subject, complement, modifier) between a target word lemma and its co-locates. The resulting co-occurrence statistics are represented in lex-grams, that is, a synopsis of the syntactically-based co-occurrence patterns of a word's distribution within a given corpus. These lex-grams are obtained from a large Portuguese corpus processed by STRING [19] and are presented in a user-friendly way through a graphical interface. Syntax Deep Explorer will allow the development of finer lexical resources and the improvement of STRING processing in general, as well as providing public access to co-occurrence information derived from parsed corpora.
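
The Syntax Deep Explorer abstract mentions association measures computed over dependency-typed co-occurrences but does not name them. As a rough illustration only, the sketch below assumes pointwise mutual information (PMI) as the measure and a hypothetical input of `(dependency, target_lemma, collocate_lemma)` triples; the function name and data layout are not taken from the tool itself.

```python
import math
from collections import Counter


def pmi_by_dependency(triples):
    """Score each (dependency, target lemma, collocate lemma) triple with PMI.

    Probabilities are estimated separately within each dependency type,
    so a 'subject' pair is only compared against other 'subject' pairs.
    This is an illustrative assumption, not the tool's documented method.
    """
    pair_counts = Counter(triples)
    dep_totals = Counter()
    target_counts = Counter()
    collocate_counts = Counter()

    # Accumulate marginal counts per dependency type.
    for (dep, target, collocate), n in pair_counts.items():
        dep_totals[dep] += n
        target_counts[(dep, target)] += n
        collocate_counts[(dep, collocate)] += n

    scores = {}
    for (dep, target, collocate), n in pair_counts.items():
        total = dep_totals[dep]
        p_pair = n / total
        p_target = target_counts[(dep, target)] / total
        p_collocate = collocate_counts[(dep, collocate)] / total
        # PMI = log2( P(target, collocate) / (P(target) * P(collocate)) )
        scores[(dep, target, collocate)] = math.log2(
            p_pair / (p_target * p_collocate)
        )
    return scores


# Toy data: hypothetical parsed triples, not drawn from any real corpus.
toy_triples = [
    ("subject", "ladrar", "cão"),
    ("subject", "ladrar", "cão"),
    ("subject", "miar", "gato"),
    ("subject", "correr", "cão"),
]
toy_scores = pmi_by_dependency(toy_triples)
```

In this toy data, the pair that occurs only with each other ("miar", "gato") scores higher than a pair whose collocate also appears with other verbs, which is the behaviour one would expect from an association measure of this kind.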