Claude 3.7 lets users control AI’s cognitive effort with extended thinking mode

· 2 min read

Anthropic has introduced a new feature in its Claude 3.7 Sonnet AI model called "Extended Thinking Mode," which allows users to control how much cognitive effort the model applies to complex tasks. This capability lets developers set a "thinking budget," controlling how much time and computational resources Claude allocates to problem-solving. Unlike switching to an entirely different model, this feature enhances the same model's ability to perform deeper reasoning. The addition of a visible thought process further supports transparency, enabling users to observe Claude's reasoning in raw form. However, this visibility also raises concerns about safety, faithfulness, and potential misuse.
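As a concrete illustration, the thinking budget is exposed as a parameter on the Anthropic Messages API. The sketch below assembles such a request; the `thinking` parameter shape matches Anthropic's published API, but the exact model ID and token numbers are illustrative.

```python
# Sketch: configuring Claude 3.7 Sonnet's "thinking budget" via the
# Anthropic Messages API. Model ID and token counts are illustrative.

def build_request(prompt: str, budget_tokens: int = 16_000) -> dict:
    """Assemble a Messages API payload with extended thinking enabled."""
    # max_tokens must exceed the thinking budget, since the budget is
    # drawn from the same overall output allowance.
    return {
        "model": "claude-3-7-sonnet-20250219",
        "max_tokens": budget_tokens + 4_000,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_request("Prove that sqrt(2) is irrational.")
# With the official SDK this payload would be sent as:
#   client = anthropic.Anthropic()
#   response = client.messages.create(**payload)
```

Raising `budget_tokens` gives the model more room to reason before answering, at the cost of latency and tokens billed.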

[Image: Claude's extended thinking UI]

The visible thought process is designed to improve trust and alignment by allowing users to verify answers and detect inconsistencies. For researchers, it offers insights into the model's reasoning, which often mimics human-like problem-solving approaches. However, challenges include the potential for incorrect or incomplete intermediate thoughts, the risk of malicious actors exploiting visible reasoning for jailbreaks, and uncertainty about whether the displayed thought process fully represents the model's internal decision-making.

Claude 3.7 Sonnet also introduces "action scaling," enabling it to perform iterative tasks like virtual computer use. This capability allows it to issue mouse clicks and keyboard presses, achieving better outcomes in complex scenarios such as solving open-ended tasks or playing games like Pokémon Red. The model demonstrated significant progress compared to its predecessors by successfully completing advanced in-game objectives.
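The iterative pattern behind action scaling can be sketched as an observe-decide-act loop: the model repeatedly emits low-level actions (mouse clicks, key presses) until the task is complete. The action format and `model_step` callback below are hypothetical stand-ins, not Anthropic's actual computer-use tool schema.

```python
# Sketch of an "action scaling" loop for virtual computer use. The model
# iteratively emits actions; each is executed against the environment and
# appended to the history the model sees on its next step.

def run_agent(model_step, execute, max_steps: int = 50):
    """Iterate observe -> decide -> act until the model signals completion."""
    history = []
    for _ in range(max_steps):
        action = model_step(history)   # e.g. {"type": "click", "x": 10, "y": 20}
        if action["type"] == "done":
            break
        execute(action)                # send the click/keypress to the VM
        history.append(action)
    return history

# Toy driver standing in for the model: press "a" twice, then stop.
script = iter([{"type": "key", "key": "a"},
               {"type": "key", "key": "a"},
               {"type": "done"}])
log = run_agent(lambda history: next(script), lambda action: None)
```

The `max_steps` cap bounds how long the loop can run, which is the practical knob for scaling up or limiting the model's action budget.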

In terms of computational scaling, Claude 3.7 employs both serial and experimental parallel test-time compute methods. Serial scaling improves accuracy by allowing sequential reasoning steps, while parallel methods involve sampling multiple independent thought processes and selecting the best outcome. Though parallel scaling is not yet publicly available, it shows promise for future iterations.
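The parallel method described above amounts to best-of-N sampling: draw several independent reasoning attempts and keep the highest-scoring one. The sampler and scoring function in this sketch are placeholders; Anthropic has not published its exact selection mechanism.

```python
# Sketch of parallel test-time compute: sample N independent thought
# processes and select the best outcome according to a scoring function.
import random

def best_of_n(sample, score, n: int = 8, seed: int = 0):
    """Draw n independent candidates and return the highest-scoring one."""
    rng = random.Random(seed)
    candidates = [sample(rng) for _ in range(n)]  # independent attempts
    return max(candidates, key=score)             # pick the best

# Toy example: each "attempt" is a random number; the score prefers
# larger values, so the result is the maximum of the n draws.
best = best_of_n(sample=lambda rng: rng.random(), score=lambda x: x)
```

Serial scaling, by contrast, spends the extra compute on one longer chain of reasoning rather than on many independent ones.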

Safety remains a priority for Anthropic. The model adheres to ASL-2 safety standards but incorporates enhanced measures such as encryption of potentially harmful thought processes and defenses against prompt injection attacks during computer use. These safeguards aim to mitigate risks while maintaining functionality.

Claude 3.7 Sonnet is now available to Pro, Team, and Enterprise users in Claude, as well as through the Anthropic API. This release marks a step forward in AI transparency and capability while acknowledging the complexities of balancing innovation with safety and ethical considerations.
