T-FOLEY

A controllable waveform-domain diffusion model for temporal-event-guided foley sound synthesis (ICASSP 2024)

T-FOLEY is a waveform-domain diffusion model that synthesizes foley sound effects conditioned on temporal event sequences. Unlike spectrogram-based approaches, it generates audio samples directly and gives fine-grained control over when sound events occur in the generated audio.
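This page does not say how a temporal event sequence is encoded. As a rough illustration only, the sketch below assumes that event timing can be summarized as a frame-level energy (RMS) envelope of a reference signal: frames with high energy mark where an event occurs. The function name `frame_rms` and the `frame_size` and `hop_size` parameters are hypothetical and are not taken from the T-FOLEY code.

```python
import math


def frame_rms(samples, frame_size, hop_size):
    """Return one RMS value per frame of `samples` (a list of floats).

    Hypothetical helper for illustration; not part of the T-FOLEY code.
    """
    if frame_size <= 0 or hop_size <= 0:
        raise ValueError("frame_size and hop_size must be positive")
    envelope = []
    last_start = max(len(samples) - frame_size, 0)
    for start in range(0, last_start + 1, hop_size):
        frame = samples[start:start + frame_size]
        if not frame:
            break  # empty input: nothing to measure
        mean_square = sum(x * x for x in frame) / len(frame)
        envelope.append(math.sqrt(mean_square))
    return envelope


# Toy signal: silence, a short burst, then silence again.
signal = [0.0] * 8 + [1.0, -1.0] * 4 + [0.0] * 8
envelope = frame_rms(signal, frame_size=4, hop_size=4)
print(envelope)
```

In this toy case the envelope is nonzero only for the frames that cover the burst, which is the kind of timing information a temporal-event condition would need to carry.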

The model was presented at IEEE ICASSP 2024 (Chung et al., 2024).

Links: Code

References

Yoonjin Chung, Junwon Lee, and Juhan Nam. T-FOLEY: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024.