10 Exciting Features of the Qwen2.5 72B Model

Published On Tue Sep 24 2024

OpenRouter

Qwen2.5 72B is part of Qwen2.5, the latest series of Qwen large language models. The series improves on its predecessor, Qwen2, in several ways:

Enhanced knowledge in coding and mathematics, thanks to specialized expert models
Improved capabilities in instruction following, generating long texts, understanding structured data, and generating structured outputs
Greater resilience to diverse system prompts, which benefits role-play implementation and condition-setting for chatbots
Long-context support for up to 128K tokens, with generation of up to 8K tokens
Multilingual support for more than 29 languages
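
The context and generation limits in the list above interact: a long prompt can leave less room for output than the 8K generation cap. The following is a minimal sketch of that arithmetic, not an official API. It assumes "128K" and "8K" mean 131,072 and 8,192 tokens and that generated tokens count against the context window; the function name `generation_budget` is hypothetical, and the actual limits depend on the provider serving the model.

```python
# Assumed limits: "128K" and "8K" read as powers of two.
# Check the provider's documentation for the values it actually enforces.
CONTEXT_WINDOW = 128 * 1024  # 131072 tokens
MAX_GENERATION = 8 * 1024    # 8192 tokens


def generation_budget(prompt_tokens: int, requested: int = MAX_GENERATION) -> int:
    """Return how many tokens can be generated after a prompt of the given size.

    Hypothetical helper: the result is capped by the request, by the
    generation limit, and by the room left in the context window.
    """
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    if remaining <= 0:
        return 0  # the prompt already fills the context window
    return min(requested, MAX_GENERATION, remaining)
```

Under these assumptions, a 1,000-token prompt leaves the full 8,192-token generation cap available, while a 130,000-token prompt leaves room for only 1,072 output tokens.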

Usage of this model is governed by the Tongyi Qianwen LICENSE AGREEMENT.

Google's Gemini 1.5 Pro enters public preview on Vertex AI ...

Gemini 1.5 Pro (0827) is an experimental version of the Gemini 1.5 Pro model, subject to Google's Gemini Terms of Use.

DeepSeek-V2.5 is an upgraded version that combines DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct, integrating general and coding abilities in one model. The model has 236B total parameters, of which 21B are activated for each token. Compared with DeepSeek 67B, the DeepSeek-V2 architecture it builds on offers stronger performance, lower training cost, a smaller KV cache, and higher generation throughput.
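
To put the parameter counts in perspective, the share of the model that is active per token can be worked out directly from the two figures quoted above (236B total, 21B activated). The snippet below is only illustrative arithmetic; the helper name `active_fraction` is hypothetical.

```python
TOTAL_PARAMS_B = 236   # total parameters, in billions (from the description above)
ACTIVE_PARAMS_B = 21   # parameters activated per token, in billions


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the fraction of parameters used for each token."""
    if total_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_b / total_b


share = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"{share:.1%} of parameters are active per token")  # prints "8.9% ..."
```

Roughly one parameter in eleven takes part in computing each token.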