Browsing by Author "Marco, Marcos Eduardo Zampieri de"
- A supervised machine learning method for word sense disambiguation of Portuguese nouns
  Publication. Marco, Marcos Eduardo Zampieri de; Orasan, Constantin; Baptista, Jorge
  Word Sense Disambiguation (WSD) is vital in many Natural Language Processing (NLP) applications. This work explores supervised machine learning techniques for the disambiguation of Portuguese nouns. Its primary motivation was the idea of integrating WSD into an Information Retrieval (IR) engine to show how WSD may improve document retrieval from the World Wide Web. After a brief overview of the most relevant applications of WSD, the main approaches and state-of-the-art techniques available for the task are presented. To compare different WSD algorithms and techniques, a selection of ambiguous words was taken from a Portuguese academic vocabulary, and a catalogue of word senses was established for each of them. A training corpus of real occurrences of each word in context was collected, providing manually annotated contextual data for each sense of the ambiguous word. The corpus was processed and features were extracted using Python and the Natural Language Toolkit (NLTK) to feed into machine learning algorithms. Results are evaluated and discussed.
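
The pipeline outlined in the abstract (annotated occurrences in context, feature extraction, supervised classification) can be illustrated with a minimal sketch. It is not the implementation used in the work: the target word "banco", the toy sentences, the sense labels, the window size, and every function and class name are hypothetical, and a hand-written Naive Bayes classifier stands in for whichever algorithms were actually compared. The sketch uses only the Python standard library; its feature dictionaries follow the (featureset, label) shape that NLTK classifiers accept for training.

```python
import math
from collections import Counter, defaultdict


def context_features(tokens, target_index, window=3):
    """Bag-of-words features from the tokens surrounding the ambiguous word."""
    lo = max(0, target_index - window)
    hi = min(len(tokens), target_index + window + 1)
    return {
        f"contains({tokens[i].lower()})": True
        for i in range(lo, hi)
        if i != target_index
    }


class NaiveBayesWSD:
    """Tiny multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def train(self, labeled_featuresets):
        self.sense_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocabulary = set()
        for features, sense in labeled_featuresets:
            self.sense_counts[sense] += 1
            for name in features:
                self.feature_counts[sense][name] += 1
                self.vocabulary.add(name)
        return self

    def classify(self, features):
        total = sum(self.sense_counts.values())
        best_sense, best_score = None, float("-inf")
        for sense, count in self.sense_counts.items():
            score = math.log(count / total)  # log prior of the sense
            denom = sum(self.feature_counts[sense].values()) + len(self.vocabulary)
            for name in features:
                if name in self.vocabulary:  # features never seen in training are ignored
                    score += math.log((self.feature_counts[sense][name] + 1) / denom)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense


# Invented toy corpus: each sentence is hand-labelled with a hypothetical sense tag.
TRAINING = [
    ("depositei dinheiro no banco ontem", "financial_institution"),
    ("o banco aprovou o empréstimo", "financial_institution"),
    ("sentei no banco da praça", "seat"),
    ("o banco de madeira quebrou", "seat"),
]


def featurize(sentence, target="banco"):
    """Split on whitespace and extract features around the first occurrence of target."""
    tokens = sentence.split()
    return context_features(tokens, tokens.index(target))


model = NaiveBayesWSD().train([(featurize(s), sense) for s, sense in TRAINING])


def disambiguate(sentence, target="banco"):
    """Return the predicted sense label for target in the given sentence."""
    return model.classify(featurize(sentence, target))


if __name__ == "__main__":
    print(disambiguate("pedi um empréstimo no banco"))
    print(disambiguate("o banco de madeira da praça"))
```

A real experiment would replace the whitespace split with proper tokenization, add richer features such as collocations or part-of-speech tags, and evaluate on held-out annotated data rather than on hand-picked sentences.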