Cross-modal context-gated convolution for multi-modal sentiment analysis

Authors
Publication date 06-2021
Journal Pattern Recognition Letters
Volume 146
Pages (from-to) 252-259
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
When inferring sentiments, relying on verbal clues alone is problematic because of their ambiguity; adding the related vocal and visual contexts as complements to the verbal clues can help. To infer sentiments from multi-modal temporal sequences, we need to identify both the sentiment-related clues and their cross-modal interactions. However, the sentiment-related behaviors of different modalities may not occur at the same time, and these behaviors and their interactions are sparse in time, making it hard to infer the correct sentiment. Moreover, unaligned sequences from different sensors have varying sampling rates, which amplifies the misalignment and sparsity mentioned above. While most previous multi-modal sentiment analysis work focuses only on word-aligned sequences, we propose cross-modal context-gated convolution for unaligned sequences. Cross-modal context-gated convolution captures local cross-modal interactions, handling the misalignment while reducing the effect of unrelated information. It introduces the concept of a cross-modal context gate, which allows it to capture useful cross-modal interactions more effectively, and it opens up more possibilities for layer design in multi-modal sequential modeling. Experiments on multi-modal sentiment analysis datasets under both word-aligned and unaligned conditions show the validity of our approach.
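The gating idea from the abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's actual layer: a 1D convolution over one modality's sequence is modulated elementwise by a sigmoid gate computed from the other modality's features at the same time step, so unrelated cross-modal information is suppressed locally. All shapes, names (`cross_modal_context_gated_conv`, `w_conv`, `w_gate`), and the choice of a single linear gate projection are assumptions made for the sketch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cross_modal_context_gated_conv(x, ctx, w_conv, w_gate):
    """Illustrative sketch: convolve modality x, gate by context from modality ctx.

    x      : (T, d) target-modality sequence (e.g. verbal features)
    ctx    : (T, d) other-modality sequence (e.g. vocal or visual features)
    w_conv : (k, d, d) convolution kernel over a local window of size k
    w_gate : (d, d) linear projection producing the cross-modal context gate
    """
    T, d = x.shape
    k = w_conv.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))          # same-length ("same") padding
    out = np.zeros((T, d))
    for t in range(T):
        window = xp[t:t + k]                       # local receptive field in x
        conv = np.einsum('kd,kde->e', window, w_conv)
        gate = sigmoid(ctx[t] @ w_gate)            # context gate from the other modality
        out[t] = conv * gate                       # suppress unrelated interactions
    return out
```

Because the gate at step t depends only on the other modality's local context, the two streams do not need to be word-aligned for the gating to apply, which is the property the abstract emphasizes for unaligned sequences.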
Document type Article
Note With supplementary raw research data.
Language English
DOI https://doi.org/10.1016/j.patrec.2021.03.025