Size | Format |
---|---|
6.11 MB | Adobe PDF |
Abstract(s)
Sentiment analysis is an effective method for determining public opinion. Social media posts have been the subject of much research, mainly because these platforms have an enormous and diverse user base that regularly shares opinions on nearly any subject. However, in posts composed of a text-image pair, the written description may or may not convey the same sentiment as the image. The present study uses machine learning models for the automatic sentiment evaluation of pairs of text and image(s). The sentiments derived from the image and the text are evaluated independently and merged (or not) to form the overall sentiment, returning the sentiment of the post and the discrepancy between the sentiments represented by the text-image pair. The image sentiment classification is divided into 4 categories: “indoor” (IND), “man-made outdoors” (OMM), “non-man-made outdoors” (ONMM), and “indoor/outdoor with persons in the background” (IOwPB). The results are then ensembled into an image sentiment classification model (ISC), which is compared with a holistic image sentiment classifier (HISC); the ISC achieves better results than the HISC. On the Flickr sub-dataset, image sentiment classification achieved an accuracy of 68.50% for IND, 83.20% for OMM, 84.50% for ONMM, 84.80% for IOwPB, and 76.45% for ISC, compared with 65.97% for HISC. For text sentiment classification, on a sub-dataset of B-T4SA, an accuracy of 92.10% was achieved. Finally, the text-image combination, on a private dataset, achieved an accuracy of 78.84%.
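The abstract states that text and image sentiments are evaluated independently and then combined into a post-level sentiment plus a discrepancy value, but it does not say how. The Python sketch below illustrates one possible late-fusion scheme under stated assumptions: each classifier is assumed to output a polarity score in [-1, 1], the scores are combined by a weighted average, and the discrepancy is their normalized absolute difference. The names `PostSentiment`, `to_label`, `fuse`, `text_weight`, and `neutral_band`, and the three-label scheme, are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class PostSentiment:
    label: str          # "negative", "neutral", or "positive" (assumed label set)
    score: float        # fused polarity in [-1, 1]
    discrepancy: float  # 0 = text and image agree, 1 = opposite extremes


def to_label(score: float, neutral_band: float = 0.2) -> str:
    """Map a polarity score to a label; the neutral band is an assumed threshold."""
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"


def fuse(text_score: float, image_score: float, text_weight: float = 0.5) -> PostSentiment:
    """Late fusion of two independently computed polarity scores in [-1, 1]."""
    for s in (text_score, image_score):
        if not -1.0 <= s <= 1.0:
            raise ValueError("scores must lie in [-1, 1]")
    # Weighted average of the two modalities (assumed fusion rule).
    fused = text_weight * text_score + (1.0 - text_weight) * image_score
    # Half the absolute difference, so the value lies in [0, 1].
    discrepancy = abs(text_score - image_score) / 2.0
    return PostSentiment(to_label(fused), fused, discrepancy)
```

With this rule, a cheerful caption on a gloomy image (for example `fuse(0.9, -0.7)`) yields a fused score near zero and a high discrepancy, which is the kind of text-image mismatch the abstract describes.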
Keywords
Sentiment analysis; Affective computing; Human-centered artificial intelligence