Publication

Leveraging NLP and machine learning for English (L1) writing assessment in developmental education

datacite.subject.sdg: 04: Quality Education
datacite.subject.sdg: 10: Reduced Inequalities
datacite.subject.sdg: 09: Industry, Innovation and Infrastructure
dc.contributor.author: Da Corte, Miguel
dc.contributor.author: Baptista, Jorge
dc.date.accessioned: 2026-05-14T11:30:12Z
dc.date.available: 2026-05-14T11:30:12Z
dc.date.issued: 2024
dc.description.abstract: This study investigates using machine learning and linguistic features to predict placements in Developmental Education (DevEd) courses based on English (L1) writing proficiency. Placement in these courses is often performed using systems like ACCUPLACER, which automatically assesses and scores standardized writing assignments in entrance exams. Literature on ACCUPLACER’s assessment methods and the features accounted for in the scoring process is scarce. To identify the linguistic features important for placement decisions, 100 essays were randomly selected and analyzed from a pool of essays written by 290 native speakers. A total of 457 linguistic attributes were extracted using COH-METRIX (106), the Common Text Analysis Platform (CTAP) (330), plus 21 DevEd-specific features produced by the manual annotation of the corpus. Using the ORANGE Text Mining toolkit, several supervised machine-learning (ML) experiments with two classification scenarios (full and split sample essays) were conducted to determine the best linguistic features and the best-performing ML algorithm. Results revealed that Naive Bayes, with a selection of the 30 highest-ranking features (21 CTAP, 7 COH-METRIX, 2 DevEd-specific) based on the Information Gain scoring method, achieved a classification accuracy (CA) of 77.3%, improving to 81.8% with 60 features. This approach surpassed the baseline accuracy of 72.7% for the full essay scenario, demonstrating enhanced placement accuracy and providing new insights into students’ linguistic skills in DevEd. (eng)
dc.identifier.doi: 10.5220/0012740500003693
dc.identifier.uri: http://hdl.handle.net/10400.1/28962
dc.language.iso: eng
dc.peerreviewed: yes
dc.publisher: SCITEPRESS - Science and Technology Publications
dc.relation: Instituto de Engenharia de Sistemas e Computadores, Investigação e Desenvolvimento em Lisboa
dc.relation.ispartof: Proceedings of the 16th International Conference on Computer Supported Education
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject: Developmental education (DevEd)
dc.subject: Automatic writing assessment systems
dc.subject: Natural language processing (NLP)
dc.subject: Machine-learning models
dc.title: Leveraging NLP and machine learning for English (L1) writing assessment in developmental education (eng)
dc.type: conference object
dspace.entity.type: Publication
oaire.awardNumber: UIDB/50021/2020
oaire.awardTitle: Instituto de Engenharia de Sistemas e Computadores, Investigação e Desenvolvimento em Lisboa
oaire.awardURI: info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UIDB%2F50021%2F2020/PT
oaire.citation.endPage: 140
oaire.citation.startPage: 128
oaire.citation.title: Proceedings of the 16th International Conference on Computer Supported Education
oaire.citation.volume: 2
oaire.fundingStream: 6817 - DCRRNI ID
oaire.version: http://purl.org/coar/version/c_970fb48d4fbd8a85
person.familyName: Da Corte
person.familyName: Baptista
person.givenName: Miguel
person.givenName: Jorge
person.identifier.ciencia-id: 7010-5366-22C5
person.identifier.orcid: 0000-0001-8782-8377
person.identifier.orcid: 0000-0003-4603-4364
person.identifier.rid: H-7699-2013
person.identifier.scopus-author-id: 14035269500
project.funder.identifier: http://doi.org/10.13039/501100001871
project.funder.name: Fundação para a Ciência e a Tecnologia
relation.isAuthorOfPublication: 4a524eae-b359-47fa-8978-028ac5ffb57e
relation.isAuthorOfPublication: e817fa28-a005-40e2-9ba4-03fdaedd7df3
relation.isAuthorOfPublication.latestForDiscovery: 4a524eae-b359-47fa-8978-028ac5ffb57e
relation.isProjectOfPublication: 0b14d63a-8f78-4e31-8a86-b72e1f07871f
relation.isProjectOfPublication.latestForDiscovery: 0b14d63a-8f78-4e31-8a86-b72e1f07871f
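
Illustrative note on the abstract: the feature-selection step it describes ranks linguistic features by Information Gain and keeps the highest-scoring ones. The sketch below shows that idea in plain Python on made-up toy data. It is not the authors' ORANGE workflow: the feature names, the class labels, and the assumption that features are already discretized into categories are all hypothetical, chosen only to make the ranking easy to follow.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(values, labels):
    """Entropy reduction obtained by splitting `labels` on a discrete feature."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    total = len(labels)
    # Weighted average entropy of the label subsets induced by the feature.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def top_k_features(features, labels, k):
    """Return the names of the k features with the highest information gain."""
    scored = {name: information_gain(vals, labels) for name, vals in features.items()}
    return sorted(scored, key=scored.get, reverse=True)[:k]


# Hypothetical toy data: four essays, two placement levels (assumed labels).
labels = ["level_1", "level_1", "level_2", "level_2"]
features = {
    "perfect": ["low", "low", "high", "high"],  # separates the classes exactly
    "partial": ["low", "low", "low", "high"],   # separates them only partly
    "noise": ["low", "high", "low", "high"],    # carries no class information
}

ranking = top_k_features(features, labels, k=2)
print(ranking)
```

Real Coh-Metrix and CTAP measures are continuous, so a toolkit would normally discretize them before scoring; that step is left out here. The selected feature subset would then be passed to a classifier such as Naive Bayes.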

Files

Main
Name: 127405 (1).pdf
Size: 337.44 KB
Format: Adobe Portable Document Format

License
Name: license.txt
Size: 3.46 KB
Format: Item-specific license agreed upon to submission