From words to visuals: a transformer-based multi-modal framework for emotion-driven tourism analytics

Calderón-Fajardo, Víctor; Rodríguez-Rodríguez, Ignacio; Puig-Cabrera, Miguel

Publication

2025-07-22 Scientific article

datacite.subject.sdg	09:Industry, Innovation and Infrastructure
datacite.subject.sdg	08:Decent Work and Economic Growth
datacite.subject.sdg	11:Sustainable Cities and Communities
dc.contributor.author	Calderón-Fajardo, Víctor
dc.contributor.author	Rodríguez-Rodríguez, Ignacio
dc.contributor.author	Puig-Cabrera, Miguel
dc.date.accessioned	2026-03-09T09:57:03Z
dc.date.available	2026-03-09T09:57:03Z
dc.date.issued	2025-07-22
dc.description.abstract	Traditional tourism analytics have primarily relied on isolated sentiment analysis and image processing techniques, often failing to capture the subtle interaction between textual expressions and visual aesthetics inherent in tourist experiences. This study addresses these limitations by proposing a novel multi-modal framework that transforms textual reviews into AI-generated images using standardized prompts, thereby converting affective signals into explicit visual features. Leveraging state-of-the-art models, such as Distilled Bidirectional Encoder Representations from Transformers (DistilBERT) for fine-grained emotion recognition and Contrastive Language–Image Pre-training (CLIP) for semantic extraction of visual attributes, our approach maps complex sentiments onto interpretable visual characteristics, integrating explainable features to uncover the underlying structure in tourist perceptions. This approach enhances classification performance and provides a transparent mechanism for understanding how distinct emotional states correspond to specific visual cues. Experimental evaluations on a dataset encompassing four diverse tourist destinations (Berlin, Dublin, Cairo, and Málaga) demonstrate high classification accuracy and robust correlations between text-derived emotions and image-based features, with performance close to that of more powerful embedding methods. Significant correlations were observed between emotions and visual features, for example between brightness and contentment and between entropy and shame, indicating that our method efficiently captures the affective resonance between visual and textual modalities. Our findings underscore the transformative potential of converting textual sentiment into visual representations to facilitate more accurate, interpretable, and actionable analytics in the tourism sector. This framework suggests promising avenues for dynamic destination characterization, informed marketing strategies, and enhanced urban planning initiatives, laying the foundation for future advancements in multimodal tourism analytics.	eng
dc.description.sponsorship	RYC2023-045296-I; MICIU/AEI/10.13039/501100011033
dc.identifier.doi	10.1007/s40558-025-00334-2
dc.identifier.eissn	1943-4294
dc.identifier.issn	1098-3058
dc.identifier.uri	http://hdl.handle.net/10400.1/28375
dc.language.iso	eng
dc.peerreviewed	yes
dc.publisher	Springer
dc.relation	Research Centre for Tourism, Sustainability and Well-being
dc.relation.ispartof	Information Technology & Tourism
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/
dc.subject	Multimodal tourism analytics
dc.subject	Transformer models
dc.subject	Text-to-Image generation
dc.subject	Affective sentiment analysis
dc.subject	Explainable AI
dc.subject	Destination classification
dc.title	From words to visuals: a transformer-based multi-modal framework for emotion-driven tourism analytics	eng
dc.type	journal article
dspace.entity.type	Publication
oaire.awardNumber	UIDB/04020/2020
oaire.awardTitle	Research Centre for Tourism, Sustainability and Well-being
oaire.awardURI	info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UIDB%2F04020%2F2020/PT
oaire.citation.issue	4
oaire.citation.title	Information Technology and Tourism
oaire.citation.volume	27
oaire.fundingStream	6817 - DCRRNI ID
oaire.version	http://purl.org/coar/version/c_970fb48d4fbd8a85
person.familyName	Puig-Cabrera
person.givenName	Miguel
person.identifier.ciencia-id	4816-E98C-E353
person.identifier.orcid	0000-0003-4524-9830
project.funder.identifier	http://doi.org/10.13039/501100001871
project.funder.name	Fundação para a Ciência e a Tecnologia
relation.isAuthorOfPublication	e926f262-ecb5-44df-9de7-454755dac26e
relation.isAuthorOfPublication.latestForDiscovery	e926f262-ecb5-44df-9de7-454755dac26e
relation.isProjectOfPublication	fa579efb-63c0-486e-b05d-859542b73647
relation.isProjectOfPublication.latestForDiscovery	fa579efb-63c0-486e-b05d-859542b73647
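
The abstract reports correlations between text-derived emotions and image features such as brightness and entropy. The sketch below illustrates one plausible reading of those two features and of the correlation step. It is not the authors' implementation: the feature definitions (mean grayscale intensity, Shannon entropy of the intensity histogram), the function names, the toy images, and the contentment scores are all assumptions made for illustration.

```python
import math
from collections import Counter

def brightness(pixels):
    """Mean grayscale intensity scaled to [0, 1] (assumed definition)."""
    flat = [p for row in pixels for p in row]
    return sum(flat) / (255 * len(flat))

def shannon_entropy(pixels):
    """Shannon entropy, in bits, of the grayscale histogram (assumed definition)."""
    flat = [p for row in pixels for p in row]
    n = len(flat)
    return -sum((c / n) * math.log2(c / n) for c in Counter(flat).values())

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)

# Hypothetical 2x2 grayscale "images" (values 0-255), one per review.
images = [
    [[10, 20], [30, 40]],
    [[0, 255], [255, 0]],
    [[200, 220], [240, 250]],
]
# Hypothetical contentment scores, standing in for an emotion classifier's output.
contentment = [0.1, 0.5, 0.9]

brightness_values = [brightness(img) for img in images]
r = pearson(brightness_values, contentment)
print(f"brightness vs. contentment: r = {r:.3f}")
```

In a real pipeline the pixel grids would come from the generated images and the emotion scores from a text classifier; the same `pearson` call could then be repeated for `shannon_entropy` and any other emotion of interest.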

Files

Main

Showing 1 - 1 of 1

Name: s40558-025-00334-2.pdf
Size: 3.72 MB
Format: Adobe Portable Document Format

License

Showing 1 - 1 of 1

Name: license.txt
Size: 3.46 KB
Format: Item-specific license agreed upon to submission

Collections

CNT2-Artigos (articles in indexed journals or proceedings)