Publication

Authorship attribution in Portuguese using character N-grams

dc.contributor.author: Markov, Ilia
dc.contributor.author: Baptista, Jorge
dc.contributor.author: Pichardo-Lagunas, Obdulia
dc.date.accessioned: 2018-12-07T14:58:22Z
dc.date.available: 2018-12-07T14:58:22Z
dc.date.issued: 2017
dc.description.abstract: For the Authorship Attribution (AA) task, character n-grams are considered among the best predictive features. In the English language, it has also been shown that some types of character n-grams perform better than others. This paper tackles the AA task in Portuguese by examining the performance of different types of character n-grams, and various combinations of them. The paper also experiments with different feature representations and machine-learning algorithms. Moreover, the paper demonstrates that the performance of the character n-gram approach can be improved by fine-tuning the feature set and by appropriately selecting the length and type of character n-grams. This relatively simple and language-independent approach to the AA task outperforms both a bag-of-words baseline and other approaches, using the same corpus.
dc.description.sponsorship: Mexican Government (Conacyt) [240844, 20161958]; Mexican Government (SIP-IPN) [20171813, 20171344, 20172008]; Mexican Government (SNI); Mexican Government (COFAA-IPN)
dc.identifier.doi: 10.12700/APH.14.3.2017.3.4
dc.identifier.issn: 1785-8860
dc.identifier.uri: http://hdl.handle.net/10400.1/11987
dc.language.iso: eng
dc.peerreviewed: yes
dc.publisher: Budapest University of Technology and Economics
dc.subject: Language
dc.subject: Authorship attribution
dc.subject: Character n-grams
dc.subject: Portuguese
dc.subject: Stylometry
dc.subject: Computational linguistics
dc.subject: Machine learning
dc.title: Authorship attribution in Portuguese using character N-grams
dc.type: journal article
dspace.entity.type: Publication
oaire.awardURI: info:eu-repo/grantAgreement/FCT/5876/UID%2FCEC%2F50021%2F2013/PT
oaire.citation.endPage: 78
oaire.citation.issue: 3
oaire.citation.startPage: 59
oaire.citation.title: Acta Polytechnica Hungarica
oaire.citation.volume: 14
oaire.fundingStream: 5876
person.familyName: Baptista
person.givenName: Jorge
person.identifier.ciencia-id: 7010-5366-22C5
person.identifier.orcid: 0000-0003-4603-4364
person.identifier.rid: H-7699-2013
person.identifier.scopus-author-id: 14035269500
project.funder.identifier: http://doi.org/10.13039/501100001871
project.funder.name: Fundação para a Ciência e a Tecnologia
rcaap.rights: openAccess
rcaap.type: article
relation.isAuthorOfPublication: e817fa28-a005-40e2-9ba4-03fdaedd7df3
relation.isAuthorOfPublication.latestForDiscovery: e817fa28-a005-40e2-9ba4-03fdaedd7df3
relation.isProjectOfPublication: 4b33c456-e2db-4613-a2ef-db1484b29ab7
relation.isProjectOfPublication.latestForDiscovery: 4b33c456-e2db-4613-a2ef-db1484b29ab7
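
The abstract describes using character n-grams as features for authorship attribution. As a rough illustration of that general idea only, the sketch below counts overlapping character n-grams and assigns an unknown text to the candidate whose n-gram profile is closest by cosine similarity. This is not the paper's method: the record does not specify the n-gram types, feature representations, or classifiers used, so the function names (`char_ngrams`, `cosine`, `attribute`), the default `n = 3`, and the nearest-profile rule are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt
from typing import Mapping


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams of length n in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    # Empty profiles (text shorter than n) have no direction; treat as dissimilar.
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def attribute(unknown: str, known: Mapping[str, str], n: int = 3) -> str:
    """Return the candidate author whose n-gram profile is most similar.

    `known` maps a (hypothetical) author label to a sample of that author's text.
    """
    target = char_ngrams(unknown, n)
    return max(known, key=lambda author: cosine(target, char_ngrams(known[author], n)))
```

In practice one would likely replace the nearest-profile rule with a trained classifier and tune the n-gram length, which is the kind of selection the abstract says affects performance.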

Files

Original bundle
Name: 11987.pdf
Size: 163.08 KB
Format: Adobe Portable Document Format