Meta's New AI Models: Generating Text, Images, and Music

Published on June 19, 2024

Meta Releases AI Models That Generate Both Text and Images

Meta has released five new artificial intelligence (AI) research models, including models that can generate both text and images and models that can detect AI-generated speech within larger audio snippets. The models were publicly released by Meta's Fundamental AI Research (FAIR) team.

Chameleon: Mixed-Modal Models

One of the new models, Chameleon, is a family of mixed-modal models that can understand and generate both images and text. These models can take input that includes both text and images and output a combination of text and images. Meta suggested that this capability could be used to generate captions for images or to use both text prompts and images to create a new scene.


Pretrained Models for Code Completion

Also released were pretrained models for code completion. These models were trained using Meta's new multi-token prediction approach, in which large language models (LLMs) predict several future words at once instead of one word at a time.
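The efficiency argument behind multi-token prediction can be sketched with a toy decoder. This is purely illustrative: the toy vocabulary, function names, and the choice of three prediction heads are assumptions for the example, not Meta's implementation.

```python
# Toy "model": maps a context word to its k most likely next words.
# A real multi-token LLM would produce these from k prediction heads.
TOY_MODEL = {
    "the": ["cat", "sat", "down"],
    "cat": ["sat", "on", "the"],
}

def single_token_steps(context, n_words):
    """Classic decoding: one model call per generated word."""
    out, calls, word = [], 0, context
    for _ in range(n_words):
        word = TOY_MODEL.get(word, ["<eos>"])[0]
        out.append(word)
        calls += 1
    return out, calls

def multi_token_steps(context, n_words, k=3):
    """Multi-token decoding: each model call proposes k future words."""
    out, calls, word = [], 0, context
    while len(out) < n_words:
        proposals = TOY_MODEL.get(word, ["<eos>"] * k)[:k]
        out.extend(proposals)
        word = proposals[-1]
        calls += 1
    return out[:n_words], calls
```

Generating three words from the context "the" takes three calls with single-token decoding but only one call with the multi-token sketch, which is the intuition behind the speed-up.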

JASCO: AI Music Generation

The third new model, JASCO, offers more control over AI music generation by accepting a variety of conditioning inputs, such as chords or beats. The model incorporates both symbolic and audio-based conditions in a text-to-music generation system.


AudioSeal: AI-Generated Speech Detection

AudioSeal features an audio watermarking technique that enables the localized detection of AI-generated speech. It can pinpoint AI-generated segments within a larger audio snippet and detect AI-generated speech much faster than previous methods.
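What "localized" detection means can be sketched as follows. This is a hypothetical illustration of the idea, not AudioSeal's API: assuming a detector has already assigned each audio frame a watermark score, flagging AI-generated segments amounts to finding contiguous runs of frames above a threshold, rather than giving one verdict for the whole clip.

```python
def flag_segments(frame_scores, threshold=0.5):
    """Return (start, end) frame-index pairs where the per-frame
    watermark score exceeds the threshold (end is exclusive)."""
    segments, start = [], None
    for i, score in enumerate(frame_scores):
        if score > threshold and start is None:
            start = i                      # a flagged run begins
        elif score <= threshold and start is not None:
            segments.append((start, i))    # the run just ended
            start = None
    if start is not None:                  # run extends to the end
        segments.append((start, len(frame_scores)))
    return segments
```

For example, scores of `[0.1, 0.9, 0.8, 0.2, 0.7]` yield two flagged segments, frames 1-3 and frame 4, pinpointing where in the clip the suspected AI-generated speech lies.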

Geographical and Cultural Diversity Model

The fifth new AI research model aims to increase geographical and cultural diversity in text-to-image generation systems. Meta has released geographic disparities evaluation code and annotations to improve evaluations of text-to-image models.


In an April earnings report, Meta stated that capital expenditures on AI and its metaverse-development division, Reality Labs, would range between $35 billion and $40 billion by the end of 2024, exceeding the initial forecast by $5 billion.

During the company's quarterly earnings call, Meta CEO Mark Zuckerberg mentioned the development of various AI services, including an AI assistant, augmented reality apps, and business AIs.

For more information, you can check the source.