In the past, creating Foley sound effects for movies meant heading into a studio and using a bunch of unexpected props to mimic footsteps, ambient sounds, and more. Even though we now have convenient digital sound effect libraries, finding a clip with exactly the right rhythm and expressive shape can still be quite challenging.
Recently, Adobe Research and Northwestern University teamed up to publish a paper on arXiv titled "Sketch2Sound: Controllable Audio Generation via Time-Varying Signals and Sonic Imitations". It introduces a cool AI sound generation tool called "Sketch2Sound". With just a few hums into a microphone, or by boldly imitating the sound of screeching tires, Sketch2Sound can transform your amateurish sounds into realistic, professional-sounding audio effects.
Got any ideas on how to use this? (Source: Sketch2Sound)
As shown in the demo video, you don't need to be a mimicry expert or have amazing vocal skills. Just use volume changes and pitch variations to express the sound you want, add a simple text prompt like "car racing" or "forest ambience," and your amateur humming turns into a professional sound effect. As the name suggests, Sketch2Sound is like sketching for sound effects: you draw the rough outline with your voice, and the AI fills in the detail.
This isn't just for movie sound effects; Sketch2Sound can be used in ads, games, audiobooks, and even online video creation, offering a new dimension to AI audio applications.
Simply put, Sketch2Sound is a text-to-audio latent diffusion transformer (DiT) enhanced with three main time-varying control signals: loudness, spectral centroid (brightness), and pitch probabilities. This means that even if your vocal skills are basic, the AI can follow how your sound changes over time (its loudness, brightness, and pitch) and, guided by your text prompt, render that same contour in a completely different timbre, like a car engine sound.
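To make two of those control signals more concrete, here is a rough sketch of how frame-by-frame loudness and spectral centroid could be measured from a recording using plain NumPy. This is only an illustration and not the paper's actual feature extraction: the function names, frame size, hop size, and the synthetic "hum" are all assumptions made for the example, and pitch is left out because it usually needs a dedicated pitch tracker.

```python
import numpy as np

def frame_signal(x, frame_len=1024, hop=256):
    """Slice a mono signal into overlapping frames (one frame per row)."""
    if len(x) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def loudness_db(frames, eps=1e-10):
    """Per-frame RMS level in decibels (a simple stand-in for loudness)."""
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    return 20.0 * np.log10(rms + eps)

def spectral_centroid_hz(frames, sr, eps=1e-10):
    """Per-frame magnitude-weighted mean frequency (perceived 'brightness')."""
    window = np.hanning(frames.shape[1])
    mag = np.abs(np.fft.rfft(frames * window, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sr)
    return (mag @ freqs) / (mag.sum(axis=1) + eps)

# Hypothetical input: one second of a 220 Hz "hum" that fades in.
sr = 16000
t = np.arange(sr) / sr
hum = np.linspace(0.0, 1.0, sr) * np.sin(2 * np.pi * 220.0 * t)

frames = frame_signal(hum)
loud = loudness_db(frames)               # rises as the hum fades in
cent = spectral_centroid_hz(frames, sr)  # stays near 220 Hz for a pure tone
```

Curves like `loud` and `cent` are the kind of "sketch" the article describes: they capture how the sound moves over time without saying anything about what it should sound like, which is left to the text prompt.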
The demo video also shows how an electric guitar can be used to "transform sound," illustrating the division of labor: the control signals capture the expression of the input sound, while the text-to-audio model is responsible for rendering it in various timbres.
Are you excited about Sketch2Sound? It might not be a groundbreaking innovation, but it certainly offers a more intuitive and flexible way to create. It significantly lowers the barrier for those without mixing and sound design expertise to produce professional audio with just a bit of humming. It's quite fascinating! Who knows, maybe Adobe will integrate Sketch2Sound into Premiere in the future?