Adobe has expanded its Firefly generative AI platform with a new “Generate Sound Effects” tool, which entered beta on July 17. The feature, available in the Firefly web app, allows creators to produce custom audio from text prompts.
In an approach that sets it apart from rivals, the tool also lets users record their own voice, making sounds like “whoosh” or “clip-clop,” to guide the AI in generating effects with specific timing and intensity. The launch is part of Adobe’s broader strategy to build a complete, commercially safe creative toolkit.
The “Generate Sound Effects” tool intensifies the race to dominate AI audio creation, pitting Adobe against rivals such as Meta, ElevenLabs, Stability AI, and NVIDIA.
From ‘Whoosh’ to Soundscape: A New Way to Generate Audio
Announced on July 17, 2025, the new Generate Sound Effects tool marks a significant step toward more intuitive content creation. Instead of relying solely on text, creators can now provide vocal cues to shape the final audio output. This audio-led prompting was first teased in Adobe’s Project Super Sonic experiment.
The system analyzes the cadence and rhythm of the user’s recording to place sound effects precisely where they belong in a video timeline. The workflow aims to bridge the gap between a creator’s intent and the AI’s interpretation, a common friction point in generative tools.
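Adobe has not published the internals of this analysis, but the underlying technique is well established in audio processing. Below is a minimal sketch, assuming a Python environment with the open-source librosa library and a hypothetical recording named vocal_mockup.wav, of how onset timings and rough intensity levels could be extracted from a vocal mock-up. It illustrates the general concept, not Adobe’s actual implementation.

```python
# Conceptual sketch only -- not Adobe's implementation.
# Extract timing and intensity cues from a vocal mock-up so that
# generated sound effects could be aligned to the user's performance.
import librosa

# Load the user's recorded vocalization, e.g. "whoosh... whoosh..."
# ("vocal_mockup.wav" is a hypothetical example file)
y, sr = librosa.load("vocal_mockup.wav")

# Detect onsets: the frames where each vocalized sound begins
onset_frames = librosa.onset.onset_detect(y=y, sr=sr)
onset_times = librosa.frames_to_time(onset_frames, sr=sr)

# Use RMS energy at each onset as a rough proxy for intensity,
# i.e. how strongly the matching effect should be rendered
rms = librosa.feature.rms(y=y)[0]
intensities = [float(rms[min(f, len(rms) - 1)]) for f in onset_frames]

for t, level in zip(onset_times, intensities):
    print(f"place effect at {t:.2f}s, relative intensity {level:.3f}")
```

A pipeline built along these lines would pass the timestamps and levels to the generative model as conditioning, so each synthesized effect lands where, and as hard as, the creator vocalized it.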
Beyond Sound: Firefly’s Expanded Video and Partner Toolkit
This audio generator is just one piece of a larger update to Firefly’s video capabilities. Adobe also introduced “Composition Reference,” a feature that allows users to upload a reference video to mirror its composition in a new AI-generated clip. This gives creators more control over shot framing and consistency.
The update also includes “Style Presets” for applying visual styles like claymation or anime with a single click, and “Keyframe Cropping” to streamline editing workflows. In a nod to the growing demand for scalable content, Adobe also launched “Text to Avatar (beta),” which turns scripts into videos led by a digital presenter.
Further expanding its ecosystem, Adobe is integrating third-party AI models from partners like Runway, Google, Pika, and Luma AI directly into Firefly. Adobe’s Generative AI lead, Alexandru Costin, suggested that “similar controls and presets may be available to use with third-party AI models in the future,” signaling that Firefly could become a central hub for various generative technologies.
Navigating the Crowded and Contentious AI Audio Market
Adobe’s entry into AI sound generation places it in a fiercely competitive field. ElevenLabs launched its own sound effects tool back in June 2024, emphasizing its use of ethically sourced data through a partnership with Shutterstock.
Meanwhile, Stability AI and Arm released an open-source, on-device model in May 2025, focusing on royalty-free audio to avoid copyright disputes. Meta, for its part, launched AudioCraft in 2023, an open-source generative AI platform that creates original music and audio from text prompts.
The industry remains cautious, however. NVIDIA unveiled its advanced Fugatto model in November 2024 but has withheld its public release over ethical concerns. Bryan Catanzaro, a VP at NVIDIA, told Reuters at the time that “any generative technology always carries some risks, because people might use that to generate things that we would prefer they don’t.” That cautious stance reflects the legal battles other AI firms have faced over copyright infringement.
By building its models on commercially safe datasets and integrating multiple tools into a single platform, Adobe is positioning Firefly as a reliable and comprehensive solution for creative professionals.