Publication
Authorship attribution in Portuguese using character N-grams
dc.contributor.author | Markov, Ilia | |
dc.contributor.author | Baptista, Jorge | |
dc.contributor.author | Pichardo-Lagunas, Obdulia | |
dc.date.accessioned | 2018-12-07T14:58:22Z | |
dc.date.available | 2018-12-07T14:58:22Z | |
dc.date.issued | 2017 | |
dc.description.abstract | For the Authorship Attribution (AA) task, character n-grams are considered among the best predictive features. For English, it has also been shown that some types of character n-grams perform better than others. This paper tackles the AA task in Portuguese by examining the performance of different types of character n-grams and various combinations of them. The paper also experiments with different feature representations and machine-learning algorithms. Moreover, the paper demonstrates that the performance of the character n-gram approach can be improved by fine-tuning the feature set and by appropriately selecting the length and type of character n-grams. This relatively simple and language-independent approach to the AA task outperforms both a bag-of-words baseline and other approaches evaluated on the same corpus. | |
dc.description.sponsorship | Mexican Government (Conacyt) [240844, 20161958]; Mexican Government (SIP-IPN) [20171813, 20171344, 20172008]; Mexican Government (SNI); Mexican Government (COFAA-IPN) | |
dc.identifier.doi | 10.12700/APH.14.3.2017.3.4 | |
dc.identifier.issn | 1785-8860 | |
dc.identifier.uri | http://hdl.handle.net/10400.1/11987 | |
dc.language.iso | eng | |
dc.peerreviewed | yes | |
dc.publisher | Budapest University of Technology and Economics | |
dc.subject | Language | |
dc.subject | Authorship attribution | |
dc.subject | Character n-grams | |
dc.subject | Portuguese | |
dc.subject | Stylometry | |
dc.subject | Computational linguistics | |
dc.subject | Machine learning | |
dc.title | Authorship attribution in Portuguese using character N-grams | |
dc.type | journal article | |
dspace.entity.type | Publication | |
oaire.awardURI | info:eu-repo/grantAgreement/FCT/5876/UID%2FCEC%2F50021%2F2013/PT | |
oaire.citation.endPage | 78 | |
oaire.citation.issue | 3 | |
oaire.citation.startPage | 59 | |
oaire.citation.title | Acta Polytechnica Hungarica | |
oaire.citation.volume | 14 | |
oaire.fundingStream | 5876 | |
person.familyName | Baptista | |
person.givenName | Jorge | |
person.identifier.ciencia-id | 7010-5366-22C5 | |
person.identifier.orcid | 0000-0003-4603-4364 | |
person.identifier.rid | H-7699-2013 | |
person.identifier.scopus-author-id | 14035269500 | |
project.funder.identifier | http://doi.org/10.13039/501100001871 | |
project.funder.name | Fundação para a Ciência e a Tecnologia | |
rcaap.rights | openAccess | |
rcaap.type | article | |
relation.isAuthorOfPublication | e817fa28-a005-40e2-9ba4-03fdaedd7df3 | |
relation.isAuthorOfPublication.latestForDiscovery | e817fa28-a005-40e2-9ba4-03fdaedd7df3 | |
relation.isProjectOfPublication | 4b33c456-e2db-4613-a2ef-db1484b29ab7 | |
relation.isProjectOfPublication.latestForDiscovery | 4b33c456-e2db-4613-a2ef-db1484b29ab7 |