ZHU

Updated 62 days ago
  • ID: 46290304/19
Korkeakoulunkatu 7, FI-33720, Tampere, Finland
In this paper, we introduce a two-stage visual sound source separation architecture, called the Appearance and Motion network (AMnet), in which the two stages specialise in appearance and motion cues, respectively. We propose an Audio-Motion Embedding (AME) framework to learn the motions of sounds in a self-supervised manner. Furthermore, we design a new Audio-Motion Transformer (AMT) module to facilitate the fusion of audio and motion cues.
Primary location: Tampere, Finland
Interest Score
3
HIT Score
0.43
Domain
ly-zhu.github.io

Actual
ly-zhu.github.io

IP
185.199.108.153, 185.199.109.153, 185.199.110.153, 185.199.111.153

Status
OK

Category
Company