From cues to engagement: a comprehensive survey and holistic architecture for computer vision-based audience analysis in live events

Lemos, Marco; Cardoso, Pedro; Rodrigues, Joao

Publication

2026-01-08 Scientific article

datacite.subject.sdg	09:Industry, Innovation and Infrastructure
datacite.subject.sdg	04:Quality Education
datacite.subject.sdg	12:Responsible Consumption and Production
dc.contributor.author	Lemos, Marco
dc.contributor.author	Cardoso, Pedro
dc.contributor.author	Rodrigues, Joao
dc.date.accessioned	2026-02-27T14:06:51Z
dc.date.available	2026-02-27T14:06:51Z
dc.date.issued	2026-01-08
dc.description.abstract	The accurate measurement of audience engagement in real-world live events remains a significant challenge, with the majority of existing research confined to controlled environments like classrooms. This paper presents a comprehensive survey of Computer Vision AI-driven methods for real-time audience engagement monitoring and proposes a novel, holistic architecture to address this gap, with this architecture being the main contribution of the paper. The paper identifies and defines five core constructs essential for a robust analysis: Attention, Emotion and Sentiment, Body Language, Scene Dynamics, and Behaviours. Through a selective review of state-of-the-art techniques for each construct, the necessity of a multimodal approach that surpasses the limitations of isolated indicators is highlighted. The work synthesises a fragmented field into a unified taxonomy and introduces a modular architecture that integrates these constructs with practical, business-oriented metrics such as Commitment, Conversion, and Retention. Finally, by integrating cognitive, affective, and behavioural signals, this work provides a roadmap for developing operational systems that can transform live event experience and management through data-driven, real-time analytics.	eng
dc.description.sponsorship	ALGARVE-FEDER-01180500
dc.identifier.doi	10.3390/mti10010008
dc.identifier.issn	2414-4088
dc.identifier.uri	http://hdl.handle.net/10400.1/28287
dc.language.iso	eng
dc.peerreviewed	yes
dc.publisher	MDPI
dc.relation.ispartof	Multimodal Technologies and Interaction
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/
dc.subject	Affective computing
dc.subject	Crowd engagement
dc.subject	HCI
dc.subject	Real-time engagement
dc.subject	Real-time analytics
dc.subject	Computer vision
dc.subject	Emotion recognition
dc.subject	Crowd behaviour
dc.subject	Event monitoring
dc.title	From cues to engagement: a comprehensive survey and holistic architecture for computer vision-based audience analysis in live events	eng
dc.type	journal article
dspace.entity.type	Publication
oaire.citation.issue	1
oaire.citation.title	Multimodal Technologies and Interaction
oaire.citation.volume	10
oaire.version	http://purl.org/coar/version/c_970fb48d4fbd8a85
person.familyName	Lemos
person.familyName	Cardoso
person.familyName	Rodrigues
person.givenName	Marco
person.givenName	Pedro
person.givenName	Joao
person.identifier.ciencia-id	5F10-1C37-FE45
person.identifier.ciencia-id	8A19-98F7-9914
person.identifier.orcid	0009-0004-8727-4254
person.identifier.orcid	0000-0003-4803-7964
person.identifier.orcid	0000-0002-3562-6025
person.identifier.rid	G-6405-2013
person.identifier.scopus-author-id	35602693500
person.identifier.scopus-author-id	55807461600
relation.isAuthorOfPublication	4a87a5d5-44c9-41a2-a7df-a65da18c5533
relation.isAuthorOfPublication	62bebc54-51ee-4e35-bcf5-6dd69efd09e0
relation.isAuthorOfPublication	683ba85b-459c-4789-a4ff-a4e2a904b295
relation.isAuthorOfPublication.latestForDiscovery	4a87a5d5-44c9-41a2-a7df-a65da18c5533