---
type: "Evidence Item"
title: "Introducing Claude Opus 4.6"
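description: "We’re upgrading our smartest model. The new Claude Opus 4.6 improves on its predecessor’s coding skills. It plans more carefully, sustains agentic tasks for longer, can operate more reliably in larger codebases, and has better code review and debugging skills to catch its own mistakes."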
resource: "https://www.anthropic.com/news/claude-opus-4-6"
tags: ["appendix-iii", "benchmark", "anthropic"]
timestamp: "2026-02-05"
category: "benchmark"
publisher: "Anthropic"
cope_score: 96
confidence: 0.88
---

# Introducing Claude Opus 4.6

# Claim

We’re upgrading our smartest model. The new Claude Opus 4.6 improves on its predecessor’s coding skills. It plans more carefully, sustains agentic tasks for longer, can operate more reliably in larger codebases, and has better code review and debugging skills to catch its own mistakes. And, in a first for our Opus-class models, Opus 4.6 features a 1M token context window.

# Relevance

Appendix III, section one: model and benchmark capability evidence

# Oracle Verdict

Anthropic is describing a frontier or production capability that pushes directly on the thesis. The important signal is not the marketing language; it is the widening set of tasks now being routed through model-driven execution rather than ordinary software or headcount.

# Metadata

* Publisher: Anthropic
* Category: benchmark
* Sector: Software engineering
* Capability: Frontier model release and benchmark movement
* Cope score: 96
* Confidence: 0.88

# Related Concepts

* [Live evidence index](index.md)
* [Thesis](../thesis.md)

# Citations

[1] [Introducing Claude Opus 4.6](https://www.anthropic.com/news/claude-opus-4-6)