This week in tech: 13.01.2025 - United States of Banan
Hugging Face introduces Smolagents
Hugging Face has introduced Smolagents, a minimalist open-source framework designed to streamline the development of AI agents. The core feature of Smolagents is the CodeAgent, which lets the model write its actions directly as Python code instead of using traditional tool-calling methods. Hugging Face reports that this cuts the number of required steps by about 30%.
Minimal code seems to be the name of the game with Smolagents. They boast the ability to develop an agent in just three lines of Python code, making it highly accessible for developers of all skill levels.
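To make the "code instead of tool calls" idea concrete, here is a minimal sketch in plain Python. It does not use the Smolagents API; the tools `get_price` and `multiply`, the `"$prev"` placeholder, and both runner functions are hypothetical and exist only to show why a single code action can replace several one-call-at-a-time steps.

```python
import json

# Hypothetical tools, for illustration only (not part of Smolagents).
def get_price(item: str) -> float:
    prices = {"apple": 0.5, "banana": 0.25}
    return prices[item]

def multiply(a: float, b: float) -> float:
    return a * b

TOOLS = {"get_price": get_price, "multiply": multiply}

def run_json_calls(calls: list[str]) -> tuple[float, int]:
    """JSON-style tool calling: every step is exactly one tool call.

    The made-up "$prev" argument stands for the previous step's result.
    """
    prev = None
    for raw in calls:
        call = json.loads(raw)
        args = [prev if arg == "$prev" else arg for arg in call["args"]]
        prev = TOOLS[call["tool"]](*args)
    return prev, len(calls)

def run_code_action(source: str) -> tuple[float, int]:
    """Code-style action: one snippet may chain several tool calls.

    exec() with emptied builtins is NOT a safe sandbox; it is used here
    only to keep the sketch short.
    """
    namespace = {"__builtins__": {}, **TOOLS}
    exec(source, namespace)
    return namespace["result"], 1

# The same task (price of a dozen bananas) expressed both ways.
json_calls = [
    '{"tool": "get_price", "args": ["banana"]}',
    '{"tool": "multiply", "args": ["$prev", 12]}',
]
code_action = "result = multiply(get_price('banana'), 12)"

print(run_json_calls(json_calls))    # (result, number of steps)
print(run_code_action(code_action))  # same result in a single step
```

Both runners compute the same value, but the code action gets there in one step because the intermediate result never has to travel back through the model. A real agent framework would also need proper sandboxing for generated code.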
For more information, you can check out the official blog announcement here.
NVIDIA's Project DIGITS
At CES 2025, NVIDIA CEO Jensen Huang unveiled Project DIGITS, a $3,000 personal AI supercomputer powered by the GB10 Grace Blackwell Superchip. The machine can run models with up to 200 billion parameters locally, letting average consumers operate ChatGPT-scale models from their desks on a standard power outlet.
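A rough back-of-the-envelope check shows why the 200-billion-parameter figure depends heavily on numeric precision. The sketch below counts memory for the model weights only, ignoring activations and caches. The 128 GB memory budget is an assumption taken from NVIDIA's announcement and is not stated in the text above; `weights_gb` is an illustrative helper.

```python
def weights_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

PARAMS = 200e9   # 200 billion parameters, as quoted above
MEMORY_GB = 128  # assumed unified memory of the device (not in the text above)

for bits in (16, 8, 4):
    size = weights_gb(PARAMS, bits)
    fits = "fits" if size <= MEMORY_GB else "does not fit"
    print(f"{bits:>2}-bit weights: {size:.0f} GB, {fits} in {MEMORY_GB} GB")
```

Under these assumptions, only the 4-bit case stays within the memory budget, which suggests the headline number refers to heavily quantized models.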
For further details, you can read the full story here.
Russia-China AI Partnership
Russian President Vladimir Putin has directed government departments and Sberbank to form AI partnerships with China. The move could create a parallel AI ecosystem that rivals Western dominance in the field.
To read the full story, click here.
Microsoft's AI Investments
Microsoft is investing $80 billion in new data centers by the end of fiscal 2025 to support AI workloads and strengthen its position in the AI space. More than half of that investment will go to the United States, which the company frames as a sign of its confidence in the American economy and its commitment to creating jobs and stimulating growth domestically.
To learn more about Microsoft's AI move, check out the article here.
European Commission's Data Protection Controversy
The European Commission has faced criticism for violating its own data protection rules. The EU General Court ordered the Commission to compensate a German national over a data privacy breach related to the use of Facebook login functionality in its conference registration system.
For further details, you can read the full story here.
ByteDance's LatentSync
ByteDance has introduced LatentSync, an end-to-end lip sync system that eliminates intermediate motion representations to preserve subtle facial expressions and solve pixel space limitations common in two-stage approaches. This system has shown exceptional performance across various metrics.
For more information, you can access the paper here and the repository here.
NVIDIA's Cosmos Family of Models
NVIDIA has launched the Cosmos family of world foundation models, offering three variants: Nano, Super, and Ultra. These models support multiple generation modes and are trained on a vast dataset to enable physics-aware video simulations.
To explore more about the Cosmos family of models, visit the project page here.
MetaMorph Multimodal Model
MetaMorph is a new multimodal model that demonstrates how large language models can understand and generate visual content through Visual-Predictive Instruction Tuning (VPiT). The model has shown strong generation capabilities with a modest number of samples, emphasizing the importance of visual understanding training.
For in-depth information, you can refer to the paper here.
OpenAI's o1 Model Reproduction
A recent paper showcases the reproduction of OpenAI's o1 model from scratch, highlighting advancements in reverse engineering in the field of AI.
To delve deeper into this topic, you can access the paper here.