Why Audio-Visual?

꼼댕이 · September 21, 2023

Affective Computing

Audio–Visual Fusion for Emotion Recognition in the Valence–Arousal Space Using Joint Cross-Attention

Although human emotions can be expressed through various modalities, vocal and facial modalities are the predominant contact-free channels, which carry complementary information [7]. Audio-visual (A-V) fusion has also been widely explored for various applications, including identity verification [8], event localization [9], and action recognition [10]. Efficiently leveraging the complementary nature of A-V relationships captured in videos can play a crucial role in improving the performance of multimodal systems over unimodal systems.
