Sketch2Sound: AI Turns Hums into Studio-Quality Sound Effects

VIVE POST-WAVE Team • Jan. 6, 2025

2-minute read

In the past, creating Foley sound effects for movies meant heading into a studio and using a bunch of unexpected props to mimic footsteps, ambient sounds, and more. Even though we now have convenient digital sound effect libraries, finding a clip with exactly the right rhythm and expression can still be quite challenging.

Recently, Adobe Research and Northwestern University teamed up to publish a paper on arXiv titled "Sketch2Sound: Controllable Audio Generation via Time-Varying Signals and Sonic Imitations", introducing an AI sound generation tool of the same name. With just a few hums into a microphone, or a bold imitation of screeching tires, Sketch2Sound can transform your amateurish sounds into realistic, professional-sounding audio effects.

Got any ideas on how to use this? (Source: Sketch2Sound)

As shown in the demo video, you don't need to be a mimicry expert or have amazing vocal skills. Just use volume changes and pitch variations to express the sound you want, add a simple text prompt like "car racing" or "forest ambience," and your amateur humming turns into a professional-sounding effect. As the name suggests, Sketch2Sound is like sketching for sound: you draw the rough outline with your voice, and the AI "transforms" it into the finished effect.

This isn't just for movie sound effects; Sketch2Sound can be used in ads, games, audiobooks, and even online video creation, offering a new dimension to AI audio applications.

Sketch2Sound: Adding a "Sound Control" Layer to "Text-to-Audio" Models

Simply put, Sketch2Sound is a text-to-audio model built on a latent diffusion transformer (DiT), enhanced with three time-varying sound control signals: loudness, spectral centroid (brightness), and pitch probabilities. This means that even if your vocal skills are basic, the AI can pick up the expression in your sound (how its loudness, brightness, and pitch change over time) and, guided by your text prompt, render that expression in a completely different timbre, like a car engine.
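
To make "control signals" a bit more concrete, here is a minimal sketch of how two of them, loudness and spectral centroid, could be measured frame by frame from a recording. This is not the paper's extraction pipeline: the function names, frame sizes, and the plain RMS-in-decibels loudness are all assumptions made for illustration, the input is a synthetic two-tone "hum" rather than a real voice, and pitch probabilities are left out because they need a dedicated pitch tracker.

```python
import math

def frames(signal, frame_len, hop):
    """Split a list of samples into overlapping frames (the ragged tail is dropped)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def loudness_db(frame, eps=1e-10):
    """RMS level of one frame in decibels (a crude stand-in for perceived loudness)."""
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    return 20.0 * math.log10(rms + eps)

def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of one frame in Hz (its "brightness")."""
    n = len(frame)
    # A Hann window reduces spectral leakage from cutting the signal into frames.
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(frame)]
    weighted_sum = 0.0
    magnitude_sum = 0.0
    # Naive DFT over the non-negative frequency bins (fine for short frames).
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        magnitude = math.hypot(re, im)
        weighted_sum += (k * sample_rate / n) * magnitude
        magnitude_sum += magnitude
    return weighted_sum / magnitude_sum if magnitude_sum > 0 else 0.0

def control_curves(signal, sample_rate, frame_len=256, hop=128):
    """Return (loudness curve in dB, brightness curve in Hz), one value per frame."""
    chunks = frames(signal, frame_len, hop)
    return ([loudness_db(c) for c in chunks],
            [spectral_centroid(c, sample_rate) for c in chunks])

SAMPLE_RATE = 8000  # assumed sample rate for the synthetic example

def tone(freq, amp, num_samples):
    """Generate a sine tone as a plain list of samples."""
    return [amp * math.sin(2 * math.pi * freq * i / SAMPLE_RATE)
            for i in range(num_samples)]

# A synthetic "hum" that starts quiet and dull, then becomes loud and bright.
hum = tone(200, 0.1, 800) + tone(2000, 0.8, 800)
loudness, brightness = control_curves(hum, SAMPLE_RATE)

for db, hz in zip(loudness, brightness):
    print(f"{db:7.1f} dB  {hz:7.1f} Hz")
```

The printed curves rise in both columns as the hum gets louder and brighter. Curves like these describe how a sound moves over time without saying what the sound is, which is why they can steer a generated car engine as easily as a generated bird call.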

The demo video also shows an electric guitar being used as the input instead of a voice. This illustrates the division of labor: the control signals capture the expression of the performance, while the text-to-audio model is responsible for rendering it in whatever timbre the prompt describes.

Are you excited about Sketch2Sound? It might not be a groundbreaking innovation, but it certainly offers a more intuitive and flexible way to create sound effects. It significantly lowers the barrier for people without mixing or sound design expertise, who can now produce professional-sounding audio with just a bit of humming. It's quite fascinating! Who knows, maybe Adobe will integrate Sketch2Sound into Premiere in the future?