Estimating lexical availability of European Portuguese proverbs

Abstract(s)

This paper relates data on lexical availability to data on the textual frequency of proverbs in European Portuguese. Each data source is expected to provide a different perspective on the use of proverbs in the language. This should allow an empirically well-motivated selection of proverbs for the development of NLP resources, specifically for applications for learning Portuguese as a Foreign Language and for the diagnosis and therapy of speech impairments and disabilities. A large database (over 114,000 proverbs and their variants) was independently classified by two annotators according to intuitively estimated lexical availability. Next, a random, stratified sample was selected, and its lexical availability was confirmed with an online survey. Frequency data were gathered from two web browsers and a large, publicly available corpus of journalistic texts. Results from the survey, the web, and the corpus by and large confirm the initial intuitive classification, and a core of commonly used proverbs was defined.
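
The stratified sampling step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function name `stratified_sample`, the class labels ("high", "medium", "low"), the sample size per stratum, and the toy proverb identifiers are all assumptions made for illustration only.

```python
import random
from collections import defaultdict


def stratified_sample(items, per_stratum, seed=0):
    """Draw up to `per_stratum` items at random from each stratum.

    `items` is a list of (proverb, availability_class) pairs. The class
    labels are hypothetical stand-ins for the annotators' categories.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable

    # Group proverbs by their (assumed) availability class.
    strata = defaultdict(list)
    for proverb, label in items:
        strata[label].append(proverb)

    # Sample without replacement inside each class.
    sample = {}
    for label, proverbs in sorted(strata.items()):
        k = min(per_stratum, len(proverbs))
        sample[label] = rng.sample(proverbs, k)
    return sample


# Toy data: 12 placeholder proverbs, 4 in each hypothetical class.
toy = [
    (f"proverb_{i}", label)
    for i, label in enumerate(["high", "medium", "low"] * 4)
]
result = stratified_sample(toy, per_stratum=2)
```

Sampling within each class, rather than from the database as a whole, keeps rarely assigned classes represented in the survey material even when one class dominates the full list.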

Keywords

European Portuguese; Proverbs; Frequency in corpus; Lexical availability

Publisher

Springer Publishing Company
