Unveiling the Mystery: OpenAI's Q*/Strawberry AI Model

Published On Sat Aug 10 2024

Welcome to the AI News Briefs Bulletin Board, a timely channel bringing you the latest industry insights and perspectives on AI, including deep learning, large language models, generative AI, and transformers. I am working tirelessly to dig up the most timely and curious tidbits underlying the day’s most popular technologies. This field is advancing rapidly, and I want to give you a regular resource that keeps you informed and up to date. News bites are added continuously in reverse date order (most recent on top), so check back often to see what’s happening in our rapidly accelerating industry. Click HERE to check out previous “AI News Briefs” round-ups.

August 9, 2024

“Project Strawberry” – A new, unidentified AI model has appeared in the LMSYS Chatbot Arena (an open-source platform where AI labs often test upcoming releases), igniting rumors that it could be OpenAI’s highly anticipated Q* breakthrough or its evolution, codenamed “Strawberry.” OpenAI previously tested GPT-4o in the arena under the alias “gpt2-chatbot” two weeks before releasing it to the public, which put the arena on high alert for new models. Testers of the current mystery entrant, “anonymous-chatbot,” report that it shows more advanced reasoning than GPT-4o and any other frontier model. Adding fuel to the speculation, Sam Altman tweeted a picture of strawberries on X – “Strawberry” being the reported codename of OpenAI’s secret model.

As competitors like Anthropic and Meta start to catch up to GPT-4o, the Internet has been eagerly awaiting OpenAI’s next move. If this mystery model is indeed Q*/Strawberry, then we could be on the cusp of another seismic shift in AI capabilities.

August 9, 2024

OpenAI co-founder John Schulman is leaving the company – to join rival AI firm Anthropic, where he will focus on AI alignment and hands-on technical work. This marks the latest senior departure from OpenAI, following a series of high-profile exits and a lawsuit initiated by Tesla CEO Elon Musk against the organization.


August 9, 2024

Introducing Qwen2-Math – Alibaba just released Qwen2-Math, a specialized AI model series that outperforms GPT-4 in mathematical problem-solving capabilities.

August 9, 2024

Hugging Face acquired XetHub – Hugging Face serves and stores a lot of data, most of it in Git LFS. XetHub has built its own powerful alternative for scaling Git repositories.

August 9, 2024

A Language Model with Quick Pre-Training – The “1.5-Pints” Language Model presents a new approach to compute-efficient pre-training. By curating a high-quality dataset of 57 billion tokens, this model surpasses Apple’s OpenELM and Microsoft’s Phi in instruction-following tasks, as measured by MT-Bench.

August 8, 2024

AMD is becoming an AI chip company, just like NVIDIA – AMD’s Q2 2024 earnings show a significant shift toward data center products, with nearly half of its sales now in this area, thanks primarily to the Instinct MI300 AI chip. The company has committed to annual new AI chip releases, competing with Nvidia’s offerings, despite supply constraints projected to last until 2025. Although Nvidia remains ahead in the data center market, AMD has seen growth in its CPU and GPU segments, including its Ryzen processors and Radeon 6000 GPUs.

August 8, 2024

New Research Paper: “Scaling LLM Test-Time Compute Optimally can be more Effective than Scaling Model Parameters” – There is strong pressure to use compute at inference time to improve model performance. This paper showcases several methods for doing so and discusses the trade-offs between them. In general, it points toward a broader trend of squeezing more performance out of smaller models.
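To make the idea concrete, here is a minimal sketch (mine, not the paper’s) of one common test-time-compute strategy, best-of-N sampling against a verifier; `generate` and `score` are hypothetical stand-ins for a sampling call and a reward/verifier model:

```python
def best_of_n(prompt: str, generate, score, n: int = 16) -> str:
    """Spend extra inference compute: draw n candidate answers and keep
    the one the verifier scores highest, instead of trusting one sample."""
    candidates = [generate(prompt) for _ in range(n)]  # n independent samples
    return max(candidates, key=lambda ans: score(prompt, ans))  # verifier picks
```

More samples, verifier-guided search, and iterative revision are all versions of the same knob the paper studies: paying at inference time for quality you would otherwise buy with more parameters.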

August 8, 2024

Infinite Bookshelf: Generate entire new books in seconds using Groq and Llama3 – Imagine you want to learn about the technology behind LLMs. You instantly get a 100-page book with chapters, content, and structure. What if you find the language too technical? You can change the prompt and the book – all 100 pages – adapts to your needs … That’s the power of an Infinite Bookshelf. This app, created by Benjamin Kleiger, AI Applications Engineer Intern at Groq, writes full books using Llama 3 (8B and 70B), powered by Groq. Check out the GitHub repo HERE.
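For a flavor of the outline-then-expand pattern such an app relies on, here is a minimal hypothetical sketch (not the repo’s actual code), assuming the `groq` Python SDK and the `llama3-70b-8192` model id:

```python
from groq import Groq

client = Groq()  # assumes GROQ_API_KEY is set in the environment

def ask(prompt: str) -> str:
    # One chat completion against Llama 3 70B served on Groq hardware.
    return client.chat.completions.create(
        model="llama3-70b-8192",  # assumed model id; the app also uses 8B
        messages=[{"role": "user", "content": prompt}],
    ).choices[0].message.content

# Pass 1: draft the structure. Pass 2: expand each chapter from it.
outline = ask("Outline a 10-chapter book on the technology behind LLMs.")
chapters = [ask(f"Using this outline, write chapter {i + 1} in full:\n{outline}")
            for i in range(10)]
print("\n\n".join(chapters))
```

Because Groq’s inference is so fast, looping over every chapter like this still finishes in seconds, which is the whole trick behind books on demand.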

August 8, 2024

It’s practically impossible to run a big AI company ethically – It was supposed to be different with OpenAI, the maker of ChatGPT. In fact, all of Anthropic’s founders once worked at OpenAI but quit in part because of differences over safety culture there, and moved to spin up their own company that would build AI more responsibly. Very noble – that is, until new AI regulations threatened to trim their “profit and prestige.” After all, even Facebook believed at one time that it was on the side of the force, the dark side be damned.

August 7, 2024

Direct Preference Optimization (DPO) from Scratch – The “Direct Preference Optimization (DPO) for LLM Alignment” lecture by Sebastian Raschka provides a comprehensive walkthrough of aligning LLMs with user preferences using Direct Preference Optimization. This approach simplifies alignment compared to reinforcement learning from human feedback (RLHF) by directly optimizing model outputs to match user expectations. The lecture includes a complete notebook that guides you through the implementation step by step.
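At its core, DPO replaces RLHF’s reward model and PPO loop with a single classification-style loss over preference pairs. A minimal PyTorch sketch of the standard DPO loss (my condensation, not Raschka’s notebook verbatim):

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Each argument is a tensor of summed log-probabilities that a model
    (trainable policy or frozen reference) assigns to the chosen or
    rejected response in each preference pair."""
    # Implicit rewards: how much more the policy favors each response
    # than the frozen reference model does.
    chosen_margin = policy_chosen_logps - ref_chosen_logps
    rejected_margin = policy_rejected_logps - ref_rejected_logps
    # Push the chosen response's margin above the rejected one's;
    # beta controls how sharply deviations from the reference are scored.
    return -F.logsigmoid(beta * (chosen_margin - rejected_margin)).mean()
```

No separate reward model and no sampling loop during training: just gradient descent on logged preference data, which is what makes DPO so much simpler than RLHF.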

An under-the-radar model, InternLM 2.5 20B, outperforms Gemma 2 27B and rivals Llama 3.1 70B in benchmarks.


August 7, 2024

OpenAI has launched JSON Structured Outputs and an updated model with 50% cheaper inputs and 100% schema reliability – OpenAI released the new GPT-4o-2024-08-06 model, which supports Structured Outputs. The model achieves perfect scores in JSON schema adherence, ensuring reliable and consistent output formats, and it arrives with significant cost reductions and improved performance. It also quadruples the maximum output length to 16,384 tokens, up from 4,096.

Structured Outputs ensure that model outputs strictly follow developer-supplied JSON Schemas. The feature guarantees schema-conformant output via constrained sampling: each JSON Schema is converted into a context-free grammar that restricts which tokens the model may emit at every step. The result is 100% reliability in JSON schema adherence, which reduces the need for developers to manually validate and retry outputs.

Developers enforce strict schema adherence using the “strict”: true option in function definitions or the “type”: “json_schema” option for response formats. This feature is valuable for tasks requiring precise data formats, like dynamically generating user interfaces and extracting structured data.
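For example, a request using the response-format flavor might look like the following minimal sketch (the `calendar_event` schema is made up for illustration), assuming the OpenAI Python SDK:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[{"role": "user",
               "content": "Alice and Bob meet for lunch next Friday at noon."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "calendar_event",  # hypothetical schema for illustration
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "title": {"type": "string"},
                    "attendees": {"type": "array",
                                  "items": {"type": "string"}},
                    "date": {"type": "string"},
                },
                "required": ["title", "attendees", "date"],
                "additionalProperties": False,  # required in strict mode
            },
        },
    },
)
print(response.choices[0].message.content)  # JSON matching the schema
```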


Structured Outputs are available in two forms: function calling and response formats. Strict function calling is supported on models from GPT-3.5-turbo-0613 and GPT-4-0613 onward, allowing strict schema enforcement within tool definitions, while the response-format option works with the latest GPT-4o models. OpenAI’s Python and Node SDKs now have native support for Structured Outputs, simplifying implementation.
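The function-calling flavor looks much the same: set “strict”: true inside the tool definition, as in this sketch with a hypothetical get_weather tool:

```python
# Tool definition with strict schema enforcement; pass as `tools=tools`
# to client.chat.completions.create(...) alongside your messages.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for illustration
        "strict": True,         # opt in to exact schema adherence
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
            "additionalProperties": False,
        },
    },
}]
# Any tool-call arguments the model emits will now match this schema exactly.
```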

Structured Outputs maintain consistent output formats, improving the reliability and performance of AI applications such as dynamically generated user interfaces and structured data extraction. Note that the response-format flavor requires GPT-4o-2024-08-06 or later GPT-4o models; the feature is not available across all OpenAI models.

August 6, 2024

Groq secured $640M Series D round – BlackRock, the world’s largest asset manager, has led a $640M Series D round for Groq, a startup specializing in AI inference chips, valuing the company at $2.8 billion. The significant investment defies the prevailing “AI doomer” narrative and the current market meltdown, which saw key rival NVIDIA lose major market value. It also underscores BlackRock’s confidence in the long-term potential of AI hardware and Groq’s unique tech approach, even as some investors express skepticism about the industry’s ROI potential.

Groq’s Language Processing Units (LPUs) are designed to optimize AI inference, the process of deploying trained AI models for real-world applications. Its focus on efficient, high-throughput inference, particularly for LLMs, addresses a critical bottleneck in the AI ecosystem, especially with NVIDIA’s latest Blackwell GPU chips now being delayed by three months.

![Image](https://cloudfront-us-east-2.images.arcpublishing.com/reuters/Q2V6GTQXBVNPVHWAYZV4OJBQB4.jpg)

Until now, GroqCloud’s limited, hard-to-scale capacity has made heavy use prohibitively expensive, so access has gone mainly to Fortune 500 enterprise clients running their own models. A hobbyist free tier was also available, but its tiny rate limits effectively prevented production use.

With this funding, however, Groq plans to expand its cloud-based inference service so that developers can easily build and deploy AI applications on popular open LLMs like Meta’s shiny new Llama 3.1. That unlocks a “scale as you need” service of the kind providers like Octo offer, as well as a BYOM (“bring your own model”) option that lets users fine-tune leading off-the-shelf open models like Llama 3.1 and serve them at Groq’s blazingly fast speeds.

BlackRock’s $300M investment in Groq can be seen as a strategic bet on the future of AI infrastructure. While some investors focus on the lack of immediate revenue streams, BlackRock appears to recognize the long-term value of companies like Groq that are building foundational tech for the next generation of AI applications. The investment signals BlackRock’s belief that the AI industry is still in its early stages and that significant growth and revenue opportunities lie ahead.