Meta Releases AI Models That Generate Both Text and Images
Meta has released five new artificial intelligence (AI) research models, including models that can generate both text and images and models that can detect AI-generated speech within larger audio snippets. All five were publicly released by Meta’s Fundamental AI Research (FAIR) team.
Chameleon: Mixed-Modal Models
One of the new models, Chameleon, is a family of mixed-modal models that can understand and generate both images and text. These models can take input combining text and images and output any combination of the two. Meta suggested this capability could be used to generate captions for images or to combine text prompts and images into a new scene.
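To illustrate, the sketch below shows what an interleaved image-and-text prompt to Chameleon can look like. It assumes the Hugging Face transformers integration of the model; the class names and the facebook/chameleon-7b checkpoint come from that integration rather than from Meta’s announcement, and may not match Meta’s research release exactly.

```python
# Minimal sketch of interleaved image+text prompting with Chameleon,
# assuming the Hugging Face `transformers` integration of the model.
import torch
from PIL import Image
from transformers import ChameleonProcessor, ChameleonForConditionalGeneration

processor = ChameleonProcessor.from_pretrained("facebook/chameleon-7b")
model = ChameleonForConditionalGeneration.from_pretrained(
    "facebook/chameleon-7b", torch_dtype=torch.bfloat16, device_map="auto"
)

# An interleaved prompt: the <image> token marks where the image is inserted.
image = Image.open("scene.jpg")
prompt = "What is happening in this picture?<image>"

inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```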
Pretrained Models for Code Completion
Also released were pretrained models for code completion. These models were trained using Meta’s new multitoken prediction approach, where large language models (LLMs) predict multiple future words at once instead of one word at a time.
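To make the idea concrete, here is a minimal, simplified sketch of multi-token prediction in PyTorch: several independent output heads share one transformer trunk, and head k is trained to predict the token k+1 steps ahead of each position. This is an illustrative reconstruction of the technique, not Meta’s released code; all names here are hypothetical.

```python
# Simplified sketch of a multi-token prediction head in PyTorch.
# Illustrative only: the released models follow this idea (n output heads
# over a shared trunk), but the actual architecture and training code differ.
import torch
import torch.nn as nn

class MultiTokenHead(nn.Module):
    def __init__(self, hidden_size: int, vocab_size: int, n_future: int = 4):
        super().__init__()
        # One independent linear head per future position.
        self.heads = nn.ModuleList(
            [nn.Linear(hidden_size, vocab_size) for _ in range(n_future)]
        )

    def forward(self, trunk_states: torch.Tensor) -> torch.Tensor:
        # trunk_states: (batch, seq, hidden) from the shared transformer trunk.
        # Returns logits of shape (n_future, batch, seq, vocab): head k
        # predicts the token k+1 steps ahead of each position.
        return torch.stack([head(trunk_states) for head in self.heads])

def multi_token_loss(logits: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
    # Average the cross-entropy losses of all heads; head k is trained
    # against the token sequence shifted k+1 positions into the future.
    n_future = logits.shape[0]
    loss = 0.0
    for k in range(n_future):
        shifted = tokens[:, k + 1 :]                # targets k+1 steps ahead
        pred = logits[k][:, : shifted.shape[1]]     # align sequence lengths
        loss = loss + nn.functional.cross_entropy(
            pred.reshape(-1, pred.shape[-1]), shifted.reshape(-1)
        )
    return loss / n_future
```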
JASCO: AI Music Generation
The third new model, JASCO, offers more control over AI music generation by accepting a variety of conditioning inputs, including chords or beats. The model combines symbolic inputs (such as chord progressions) and audio within a single text-to-music generation system.
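A hypothetical usage sketch is shown below, modeled on the MusicGen-style API in Meta’s AudioCraft library. The JASCO class, the checkpoint name, and the chord-conditioning argument are all assumptions for illustration and may not match the actual release.

```python
# Hypothetical usage sketch for JASCO, modeled on the MusicGen-style API
# in Meta's AudioCraft library. The JASCO class, checkpoint name, and the
# `chords` keyword are assumptions and may differ from the actual release.
from audiocraft.models import JASCO              # assumed entry point
from audiocraft.data.audio import audio_write

model = JASCO.get_pretrained("facebook/jasco-chords-drums-400M")  # assumed id
model.set_generation_params(duration=10)          # seconds of audio to generate

# Symbolic conditioning: (chord, start_time_in_seconds) pairs -- assumed format.
chords = [("C", 0.0), ("Am", 2.5), ("F", 5.0), ("G", 7.5)]

wav = model.generate(
    descriptions=["80s synth-pop with a driving beat"],
    chords=chords,                                # assumed keyword
)
audio_write("jasco_sample", wav[0].cpu(), model.sample_rate)
```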
AudioSeal: AI-Generated Speech Detection
AudioSeal features an audio watermarking technique that enables the localized detection of AI-generated speech. It can pinpoint AI-generated segments within a larger audio snippet and detect AI-generated speech much faster than previous methods.
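AudioSeal is distributed as a standalone Python package. The sketch below, based on the project’s README, embeds a watermark into a clip and then scores the clip for it; the checkpoint names follow the project’s model cards, and the exact API may vary between versions.

```python
# Sketch of watermarking and detection with the `audioseal` package
# (pip install audioseal). Checkpoint names follow the project's model
# cards; the exact API may differ between versions.
import torch
from audioseal import AudioSeal

sample_rate = 16000
wav = torch.randn(1, 1, sample_rate * 5)  # stand-in for 5 s of real speech

# Embed an imperceptible watermark into the audio.
generator = AudioSeal.load_generator("audioseal_wm_16bits")
watermark = generator.get_watermark(wav, sample_rate)
watermarked = wav + watermark

# Score the clip for the watermark; internally the detector produces
# per-frame probabilities, which is what enables localized detection.
detector = AudioSeal.load_detector("audioseal_detector_16bits")
result, message = detector.detect_watermark(watermarked, sample_rate)
print(f"probability the clip is watermarked: {result:.3f}")
```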
Geographical and Cultural Diversity Model
The fifth new AI research model aims to increase geographical and cultural diversity in text-to-image generation systems. Meta has released geographic disparities evaluation code and annotations to improve evaluations of text-to-image models.
In its April earnings report, Meta stated that capital expenditures on AI and its metaverse-development division, Reality Labs, will range between $35 billion and $40 billion by the end of 2024, exceeding its initial forecast by $5 billion.
During the company’s quarterly earnings call, Meta CEO Mark Zuckerberg mentioned the development of various AI services, including an AI assistant, augmented reality apps, and business AIs.