Abstract: Audio–visual event localization (AVEL) aims to recognize events in videos by associating audio–visual information. However, events involved in existing AVEL tasks are usually coarse-grained ...
Abstract: Current audio-visual speech separation methods exploit the correlation between a speaker's audio and visual information to help separate the target speaker's speech. However, these methods ...