Abstract: In video-based emotion recognition, effective multi-modal fusion techniques are essential to leverage the complementary relationship between audio and visual modalities. Recent ...
SoundTube and Ford AV deliver intelligible, discreet audio for the Library of Congress Treasures Gallery using Dante-enabled ...
Abstract: Audio–visual event localization (AVEL) aims to recognize events in videos by associating audio–visual information. However, events involved in existing AVEL tasks are usually coarse-grained ...