huggingface/open-r1: Fully open reproduction of DeepSeek-R1
A fully open reproduction of DeepSeek-R1. This repo is a work in progress, let's build it together!
The goal of this repo is to build the missing pieces of the R1 pipeline so that everyone can reproduce it and build on top of it. The project is simple by design and mostly consists of:

Caution: Libraries rely on CUDA 12.4. If you see errors related to segmentation faults, double-check the CUDA version your system is running with `nvcc --version`. To run the code in this project, first create a Python virtual environment, e.g. using uv. To install uv, follow the UV Installation Guide.
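The setup above can be sketched as shell commands. The environment name and Python version here are illustrative choices, not prescribed by the repo:

```shell
# Confirm which CUDA toolkit the system provides (should report release 12.4)
nvcc --version

# Create and activate a virtual environment with uv
# ("openr1" and the Python version are arbitrary examples)
uv venv openr1 --python 3.11
source openr1/bin/activate
```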
Tip: For Hugging Face cluster users, add `export UV_LINK_MODE=copy` to your `.bashrc` to suppress cache warnings from uv.
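One way to apply this tip is to append the export line to your shell profile:

```shell
# Persist the uv link-mode setting so cache-link warnings are suppressed
# in every new shell session
echo 'export UV_LINK_MODE=copy' >> ~/.bashrc
```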
Note: As a shortcut, run make install to set up the development libraries (spelled out below). Afterwards, if everything is set up correctly, you can try out the Open-R1 models.
Tip: If you scale up/down the number of GPUs, we recommend also scaling up the per-device batch size or number of gradient accumulation steps to keep the global batch size constant.
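To illustrate the tip above: the global batch size is the per-device batch size times the number of GPUs times the gradient accumulation steps, so a change in one factor can be offset in another. The numbers below are hypothetical examples, not the repo's defaults:

```shell
# Global batch = per-device batch * num GPUs * gradient accumulation steps
# 8 GPUs, per-device batch 16, accumulation 4:
echo $((16 * 8 * 4))   # prints 512

# Halving the GPUs to 4 while doubling accumulation to 8
# keeps the global batch size unchanged:
echo $((16 * 4 * 8))   # prints 512
```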
We provide support for filtering datasets by generating completions and computing pass rates on verifiable tasks; see this README.
🚨 WARNING 🚨: Most base models like meta-llama/Llama-3.2-1B do not have a chat template, so we set ChatML as the default during training. However, for Qwen base models like Qwen/Qwen2.5-1.5B, a chat template is pre-defined in the tokenizer, so the EOS token must be set accordingly.