Meta researchers distill System 2 thinking into LLMs, improving ...
Large language models (LLMs) excel at answering simple questions but struggle with complex tasks that require reasoning and planning. To address this limitation, researchers at Meta FAIR have introduced a technique called "System 2 distillation." The method improves LLM performance on reasoning tasks without requiring the model to generate intermediate steps when it answers.
Understanding System 1 and System 2 Thinking
In cognitive science, System 1 and System 2 represent two modes of thinking. System 1 is fast, intuitive, and automatic, used for quick judgments and pattern recognition. System 2, by contrast, is slow, deliberate, and analytical, employed for complex problem-solving.
LLMs are typically associated with System 1 thinking: they generate text rapidly but do not reason deliberately by default. Recent research has explored ways to prompt LLMs to mimic System 2 thinking by having them generate intermediate steps before providing a final answer.
System 2 Distillation Technique
System 2 distillation teaches an LLM to handle complex tasks without producing intermediate steps at inference time. The technique uses the model's own System 2 reasoning to generate answers and then distills that knowledge into its System 1 generation, which makes inference more efficient.
The researchers prompt the LLM to solve a problem using System 2 techniques, verify the responses for correctness, and discard the intermediate steps, retaining only the final answers. They then fine-tune the model on pairs of the original question and the verified final answer, so that it learns to skip the reasoning steps and provide the solution directly.
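The data-building step described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the researchers' implementation: `system2_generate` is a hypothetical callable standing in for any System 2 prompting method, the "Answer:" marker is an assumed output format, and majority agreement across samples is used here as one possible stand-in for the verification step.

```python
from collections import Counter
from typing import Callable, Iterable


def extract_final_answer(response: str) -> str:
    """Drop the intermediate steps, keeping only the text after the last
    'Answer:' marker (an assumed output format)."""
    marker = "Answer:"
    idx = response.rfind(marker)
    if idx == -1:
        return response.strip()
    return response[idx + len(marker):].strip()


def build_distillation_set(
    questions: Iterable[str],
    system2_generate: Callable[[str], str],  # hypothetical System 2 prompting call
    num_samples: int = 8,
    min_agreement: float = 0.75,
) -> list[dict[str, str]]:
    """Collect (question, final answer) pairs for fine-tuning."""
    dataset = []
    for question in questions:
        # Sample several System 2 responses and strip their reasoning.
        answers = [
            extract_final_answer(system2_generate(question))
            for _ in range(num_samples)
        ]
        top_answer, votes = Counter(answers).most_common(1)[0]
        # Assumed verification: keep the example only if the samples
        # mostly agree on one final answer.
        if votes / num_samples >= min_agreement:
            dataset.append({"prompt": question, "completion": top_answer})
    return dataset
```

The resulting prompt and completion pairs contain no reasoning traces, so a model fine-tuned on them is trained to emit the final answer directly. The sample count and agreement threshold are illustrative values, not figures from the research.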
Evaluation and Results
The researchers evaluated System 2 distillation on various reasoning tasks using different System 2 prompting techniques. The results indicate that distillation can significantly improve LLM performance on complex tasks, often matching or exceeding the accuracy of the original System 2 methods while generating responses faster and at lower computational cost.
While System 2 distillation shows promise, the researchers acknowledge that certain tasks may still require deliberate reasoning and may not be suitable for distillation into System 1 generation. Future research will examine how well the technique works on smaller models and how it affects broader LLM performance.
Overall, System 2 distillation is a promising optimization tool for LLM pipelines. Much as humans handle familiar tasks automatically, it lets a model answer distilled tasks directly and reserve deliberate reasoning for the tasks that still need it.