OpenAI GPT-5.6 — Sol / Terra / Luna (June 26). Flagship Sol plus a balanced Terra and low-cost Luna; positioned for coding, scientific reasoning, long-horizon planning and agentic workflows. Notably released to only ~20 organizations after OpenAI shared plans with the US government, with general release "in coming weeks"; Sol slated to run on Cerebras at up to ~750 tok/s in July. Reported / vendor (OpenAI; TechCrunch).
Anthropic Claude Sonnet 5 (June 30). Most agentic Sonnet to date (plans, tool/browser/terminal use, autonomous runs); reported at/near Opus 4.8 quality at lower cost. Third-party benchmark reads: SWE-bench Pro 63.2% (Opus 4.8 69.2%, Sonnet 4.6 58.1%), OSWorld-Verified 81.2%, Humanity's Last Exam 57.4% with tools (≈Opus 4.8's 57.9%). Intro pricing $2/$10 per M tokens through Aug 31, then $3/$15. Anthropic also reports lower deception/sycophancy/jailbreak-susceptibility vs Sonnet 4.6. Reported / vendor (Anthropic).
Meituan LongCat-2.0 (June 30). 1.6T-parameter MoE, 1M context, MIT license (weights "coming soon" at time of announcement). Vendor-reported, unreplicated scores: SWE-bench Pro 59.5 (edging GPT-5.5's 58.6), Terminal-Bench 2.1 70.8, SWE-bench Multilingual 77.3. First Chinese frontier model pre-trained end-to-end on domestic ASIC superpods (>50k chips); had been quietly topping OpenRouter as stealth model "Owl Alpha." Reported (VentureBeat).
US lifts export controls on Anthropic Fable 5 & Mythos 5 (approved June 30, effective July 1). Ends a ~19-day freeze that began June 12 after Amazon researchers demonstrated a jailbreak eliciting vulnerability-exploit code from Fable 5. Fable 5 resumes globally across Claude Platform/Claude.ai/Code/Cowork (cloud marketplaces "as fast as possible"); Mythos 5 restored to approved US organizations. Reported (CNBC; Al Jazeera). Directly updates the June 13 suspension logged last cycle.
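To make the Claude Sonnet 5 pricing step concrete, the sketch below compares the cost of one workload at the intro rate ($2/$10 per M tokens, through Aug 31) and the standard rate ($3/$15). It assumes the two figures are input/output prices, which the item does not state explicitly; the `Pricing` class, the `cost_usd` helper, and the sample token counts are hypothetical and for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    # Assumption: the "$X/$Y per M tokens" figures are input/output prices in USD.
    input_per_m: float
    output_per_m: float


INTRO = Pricing(input_per_m=2.0, output_per_m=10.0)     # through Aug 31
STANDARD = Pricing(input_per_m=3.0, output_per_m=15.0)  # afterwards


def cost_usd(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload at the given per-million-token prices."""
    return (
        (input_tokens / 1_000_000) * pricing.input_per_m
        + (output_tokens / 1_000_000) * pricing.output_per_m
    )


if __name__ == "__main__":
    # Hypothetical workload: 5M input tokens, 1M output tokens.
    intro = cost_usd(INTRO, 5_000_000, 1_000_000)
    standard = cost_usd(STANDARD, 5_000_000, 1_000_000)
    print(f"intro: ${intro:.2f}, standard: ${standard:.2f}")
```

Because both listed prices rise by the same factor (2 to 3, 10 to 15), any input/output mix costs 1.5x as much after the intro window under this reading.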
No standout new peer-reviewed result in-window; ongoing threads (Anthropic interpretability-in-deployment, OpenAI "internals-based lie detector," Fellows Program July cohort) are continuations, not new deltas. The GPT-5.6 government-preview and the Fable 5 jailbreak-then-restore episode are the period's most concrete safety-governance signals.
| Org | What they do | Stage / license | This period's update | Tier |
|---|---|---|---|---|
| OpenAI | Frontier models | Closed, limited preview | GPT-5.6 Sol/Terra/Luna launched (June 26) to ~20 orgs under US-gov preview; Cerebras serving in July | Reported |
| Anthropic | Frontier models | Commercial API | Claude Sonnet 5 (June 30); Fable 5/Mythos 5 export controls lifted, global rollout July 1 | Reported |
| Meituan | Open-weights frontier LLM | MIT (weights pending) | LongCat-2.0 (1.6T MoE) announced June 30; pre-trained end-to-end on domestic Chinese ASICs | Reported |
| Thinking Machines Lab | Agentic-AI infra/models | Series B (reported) | Reported ~$2B raise at ~$10B valuation (date not firmly in-window — verify) | Reported |
No materially new in-window items on: novel training/architecture or scaling-law results; inference/quantization/serving methods; new evaluation methodology or contamination findings (LongCat scores are vendor-reported, unreplicated); agent frameworks beyond the model releases themselves; major new compute/datacenter deals; EU AI Act (Digital Omnibus timeline changes were logged last cycle).