Stability AI releases a sound generator
Stability AI, the startup known for its AI-powered art generator Stable Diffusion, has released an open AI model for generating sounds and songs. The company says the model, named Stable Audio Open, was trained exclusively on royalty-free recordings.
The generative model takes a text description, such as "Rock beat played in a treated studio, session drumming on an acoustic kit," and produces a recording up to 47 seconds long. Stability AI says the model was trained on approximately 486,000 samples from the free music libraries Freesound and the Free Music Archive.
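As a rough illustration of the text-in, audio-out flow described above, the Python sketch below models a generation request. The `GenerationRequest` class, the `build_request` helper, and the `MAX_SECONDS` constant are hypothetical names invented for this example and are not part of any Stability AI library. Only the 47-second cap and the sample prompt come from the description above.

```python
from dataclasses import dataclass

# Upper bound on clip length reported for Stable Audio Open.
MAX_SECONDS = 47.0


@dataclass(frozen=True)
class GenerationRequest:
    """Hypothetical container for a text-to-audio request (illustration only)."""
    prompt: str
    seconds: float


def build_request(prompt: str, seconds: float = MAX_SECONDS) -> GenerationRequest:
    """Validate the prompt and clamp the requested duration to the 47-second cap."""
    cleaned = prompt.strip()
    if not cleaned:
        raise ValueError("prompt must not be empty")
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return GenerationRequest(prompt=cleaned, seconds=min(seconds, MAX_SECONDS))


request = build_request(
    "Rock beat played in a treated studio, session drumming on an acoustic kit",
    seconds=60,
)
print(request.seconds)  # 47.0, because the 60-second request is clamped
```

The sketch stops at request validation. Actually producing audio would require loading the released model, which is outside the scope of this example.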
Customizable and Versatile
Stability AI says Stable Audio Open can generate drum beats, instrument riffs, ambient sounds, and production elements for videos, films, and TV shows. The model can also be used to modify existing songs or to apply the style of one song, such as smooth jazz, to another.
The startup points to one advantage of the open release in particular: users can fine-tune the model on their own audio data. A drummer, for instance, could fine-tune it on samples of their own recordings to generate new beats in their own style.
Limitations and Recommendations
Stable Audio Open does have limitations. The model is not optimized for generating full songs, melodies, or vocals, and Stability AI directs users who want those capabilities to its premium Stable Audio service. The open model is also not intended for commercial use, which its terms of service restrict.
Stability AI acknowledges that Stable Audio Open may not perform equally well across musical styles and cultures, or with descriptions written in languages other than English. The company attributes these biases to the composition of the training data.
Addressing Concerns and Moving Forward
Stability AI has faced challenges in its business operations as well as industry controversies over copyright and fair use in training AI models. The release of Stable Audio Open appears to be an effort to shift the narrative around the company while showcasing its work on generative AI.
As the landscape of music generation evolves, discussions around copyright protection and responsible AI usage continue to gain prominence. Recent developments, such as Sony Music's warnings against unauthorized content use for training AI models and legislative actions aimed at regulating AI technologies in music, underscore the importance of ethical considerations in the industry.