Anthropic takes the lead on agentic AI
Claude Opus 4.8 shipped May 28, six weeks after 4.7. It's the only model to complete every case on the Super-Agent benchmark end to end, and it beats GPT-5.5 at cost parity. More importantly, Anthropic paired it with agent infrastructure (scheduled deployments, environment variables, dynamic workflows). It's a system, not an isolated model.
OpenAI plays hide-and-seek on GPT-5.6
As of this writing, GPT-5.6 hasn't been officially announced. No model card, no API doc, no blog post. But prediction markets (Polymarket) give an 85% probability of a release before the end of June. Classic OpenAI strategy: let the hype build, then ship something impressive. It remains to be seen whether the jump is as big as with previous versions.
Gemini 3.5 Pro: Google catches up (again)
Announced at I/O 2026 on May 19 with a "next month" release window, Gemini 3.5 Pro is expected on Vertex AI between June 1 and June 30. Google stays strong on multimodal (vision, audio, video) and native Workspace integration. But on pure reasoning, it remains a notch behind Claude and GPT.
What really distinguishes 2026 models
What separates the best from the mediocre in 2026 is no longer raw "quality" (they have all become excellent). It comes down to three things: (1) sustained attention time, meaning the ability to stay on a complex task for hours; (2) tool-use mastery (API calls, MCP servers); (3) cost per end-to-end task, not per token.
On these three criteria, Anthropic leads. OpenAI is catching up on cost (GPT-5-mini variants). Google stays competitive on multimodal. But none crushes the others; the market stays open.
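To make the third criterion concrete, here is a minimal sketch of how cost per end-to-end task can differ from cost per token. Everything in it is an assumption for illustration: the `TaskRun` type, the `cost_per_completed_task` helper, the prices, token counts, and success rates are hypothetical and not measurements of any real model. The idea is simply to divide total spend, failed attempts included, by the number of tasks that actually finished.

```python
from dataclasses import dataclass


@dataclass
class TaskRun:
    """One attempt at an end-to-end task (hypothetical record shape)."""
    input_tokens: int
    output_tokens: int
    succeeded: bool


def cost_per_completed_task(runs, usd_per_m_input, usd_per_m_output):
    """Total spend across all attempts divided by the number of successes.

    Prices are in USD per million tokens. Failed attempts still cost money,
    so they count in the numerator but not in the denominator.
    """
    total_usd = sum(
        r.input_tokens * usd_per_m_input + r.output_tokens * usd_per_m_output
        for r in runs
    ) / 1_000_000
    completed = sum(1 for r in runs if r.succeeded)
    if completed == 0:
        return float("inf")  # nothing finished: the task is effectively unaffordable
    return total_usd / completed


# Illustrative numbers only: ten attempts each, identical token usage.
cheap_runs = [TaskRun(200_000, 50_000, succeeded=(i < 2)) for i in range(10)]
pricey_runs = [TaskRun(200_000, 50_000, succeeded=(i < 9)) for i in range(10)]

cheap_cost = cost_per_completed_task(cheap_runs, usd_per_m_input=1.0, usd_per_m_output=4.0)
pricey_cost = cost_per_completed_task(pricey_runs, usd_per_m_input=3.0, usd_per_m_output=12.0)

print(f"cheap per-token model:  ${cheap_cost:.2f} per completed task")
print(f"pricey per-token model: ${pricey_cost:.2f} per completed task")
```

With these made-up inputs, the model that is three times more expensive per token ends up cheaper per completed task, because it finishes far more of its attempts. A real comparison would also need to account for retries, tool-call fees, and human review time.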
How to choose for your product
For the apps we build, the choice depends on the use case: Claude for long agentic workflows and code, GPT for consumer conversational chatbots, Gemini for multimodal apps (video, audio, vision). Our team uses all three depending on context, with no dogma.
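In practice, the rule of thumb above can live in a small routing table. The sketch below is hypothetical: the `UseCase` enum, the `MODEL_FAMILY_BY_USE_CASE` mapping, and the family labels are names invented for illustration, not identifiers from any vendor SDK, and a real router would map to concrete model versions.

```python
from enum import Enum


class UseCase(Enum):
    """Product use cases named in this article (hypothetical enum)."""
    AGENTIC_WORKFLOW = "agentic_workflow"
    CODE = "code"
    CONSUMER_CHAT = "consumer_chat"
    MULTIMODAL = "multimodal"


# Family labels are placeholders, not real API model identifiers.
MODEL_FAMILY_BY_USE_CASE = {
    UseCase.AGENTIC_WORKFLOW: "claude",
    UseCase.CODE: "claude",
    UseCase.CONSUMER_CHAT: "gpt",
    UseCase.MULTIMODAL: "gemini",
}


def pick_model_family(use_case: UseCase) -> str:
    """Return the model family this article suggests for a use case."""
    return MODEL_FAMILY_BY_USE_CASE[use_case]


print(pick_model_family(UseCase.MULTIMODAL))
```

Keeping the mapping in one place makes it cheap to revisit the choice each time a new model ships.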