---
type: "Evidence Item"
title: "OpenAI and Anthropic share findings from a joint safety evaluation"
description: "OpenAI and Anthropic share findings from a first-of-its-kind joint safety evaluation, testing each other’s models for misalignment, instruction following, and hallucinations."
resource: "https://openai.com/index/openai-anthropic-safety-evaluation"
tags: ["appendix-iii", "vendor", "openai"]
timestamp: "2025-08-27"
category: "vendor"
publisher: "OpenAI"
cope_score: 36
confidence: 0.9
---

# OpenAI and Anthropic share findings from a joint safety evaluation

# Claim

OpenAI and Anthropic share findings from a first-of-its-kind joint safety evaluation, testing each other’s models for misalignment, instruction following, hallucinations, jailbreaking, and more—highlighting progress, challenges, and the value of cross-lab collaboration.

# Relevance

Appendix III, section two: vendor threshold and platform capability evidence

# Oracle Verdict

This is a low-signal vendor radar item. Keep it as context only unless a later benchmark, deployment, procurement change, or labour-market datapoint turns it into direct Appendix III evidence.

# Metadata

* Publisher: OpenAI
* Category: vendor
* Sector: General AI capability
* Capability: Vendor platform capability signal
* Cope score: 36
* Confidence: 0.9

# Related Concepts

* [Live evidence index](index.md)
* [Thesis](../thesis.md)

# Citations

[1] [OpenAI and Anthropic share findings from a joint safety evaluation](https://openai.com/index/openai-anthropic-safety-evaluation)