Estimating lexical availability of European Portuguese proverbs

Reis, Sónia; Baptista, Jorge

http://hdl.handle.net/10400.1/10206

Use this identifier to cite this record.

Abstract

This paper relates data on the lexical availability of proverbs in European Portuguese to data on their textual frequency. Each data source should provide a different perspective on the use of proverbs in the language. This should allow an empirically well-motivated selection of proverbs for the development of NLP resources, specifically for applications for learning Portuguese as a Foreign Language and for the diagnosis and therapy of speech impairments and disabilities. A large database (over 114,000 proverbs and their variants) was independently classified by two annotators according to intuitively estimated lexical availability. Next, a random, stratified sample was selected, and its lexical availability was confirmed with an online survey. Frequency data was gathered from two web browsers and from a large, publicly available corpus of journalistic texts. Results from the survey, the web and the corpus by and large confirm the initial intuitive classification, and a core of commonly used proverbs was defined.
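The abstract mentions drawing a random, stratified sample from the annotated database. As a rough illustration only, the Python sketch below shows one common way to draw such a sample: group items by class, then sample the same number at random from each group. The function name `stratified_sample`, the class labels ("high", "medium", "low"), the sample size, and the placeholder items are all assumptions made for this example; the record does not describe the authors' actual procedure, class scheme, or sample size.

```python
import random
from collections import defaultdict


def stratified_sample(items, per_stratum, seed=0):
    """Draw up to `per_stratum` items at random from each class.

    `items` is an iterable of (proverb, availability_class) pairs.
    Hypothetical helper: not the procedure used in the paper.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible

    # Group proverbs by their (assumed) availability class.
    strata = defaultdict(list)
    for proverb, label in items:
        strata[label].append(proverb)

    # Sample without replacement inside each class.
    sample = {}
    for label in sorted(strata):
        pool = strata[label]
        k = min(per_stratum, len(pool))  # small classes are taken whole
        sample[label] = rng.sample(pool, k)
    return sample


# Placeholder data: invented identifiers, not real database entries.
toy_items = [
    (f"proverb_{i:03d}", ("high", "medium", "low")[i % 3])
    for i in range(30)
]
toy_sample = stratified_sample(toy_items, per_stratum=4, seed=1)
```

Sampling an equal number per class is only one possible design; a proportional allocation, where each class contributes according to its share of the database, would be equally compatible with the wording of the abstract.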

Keywords

European Portuguese; Proverbs; Frequency in corpus; Lexical availability

Publisher

Springer Publishing Company

DOI

10.1007/978-3-319-69805-2_17

Collections

FCH3-Books (or parts thereof, with or without peer review)
