Why Audio-Visual?

꼼댕이 · September 21, 2023

Affective Computing


Audio–Visual Fusion for Emotion Recognition in the Valence–Arousal Space Using Joint Cross-Attention


Although human emotions can be expressed through various modalities, vocal and facial modalities are the predominant contact-free channels, and they carry complementary information [7]. Audio-visual (A-V) fusion has also been widely explored for various applications, including identity verification [8], event localization [9], and action recognition [10]. Efficiently leveraging the complementary nature of A-V relationships captured in videos can play a crucial role in improving the performance of multimodal systems over unimodal systems.
