GPT-5.5 is OpenAI’s bid to win the assistant war on your phone

GPT-5.5 shows OpenAI wants the phone assistant market, not just benchmarks, with ChatGPT and Codex pushing agentic work into daily mobile use this year.

By Daniel Reid · 15 May 2026 · Updated 22 Jun 2026

GPT-5.5 is the model OpenAI built to win the assistant war, not just the benchmark league. TechCrunch confirmed on 23 April 2026 that it shipped to Plus, Pro, Business and Enterprise users in ChatGPT and Codex the same day. OpenAI calls GPT-5.5 its “smartest and most intuitive to use model yet” and frames it as the next step toward a “super app” that does your work for you. Strip away the marketing and the thesis is simpler and sharper: this is the first OpenAI release engineered around the model doing things rather than answering things, and that is exactly the capability your phone’s assistant has been missing.

Key facts

GPT-5.5 launched 23 April 2026, scoring a state-of-the-art 82.7% on Terminal-Bench 2.0 (up from GPT-5.4’s 75.1%) and 78.7% on OSWorld-Verified computer use (GPT-5.4: 75.0%).
API pricing is £4 (about $5) per million input tokens and £24 (about $30) per million output tokens, double GPT-5.4’s £2 (about $2.50) and £12 (about $15). The context window is 1M tokens, and GPT-5.5 Pro costs £24 (about $30) per million input tokens and £140 (about $180) per million output tokens.
GPT-5.5 Pro scored 33.2% on the GeneBench scientific benchmark; the standard model hit 84.9% on GDPval knowledge work and 58.6% on SWE-Bench Pro.
GPT-5.5 Instant became the new default free-tier ChatGPT model on 5 May 2026, putting the agentic engine into every phone running the app.

GPT-5.5 is a computer-use model wearing a chatbot’s clothes

The headline number for GPT-5.5 is not a trivia score. It is 78.7% on OSWorld-Verified, the benchmark that measures whether a model can actually operate a real computer environment — open apps, click menus, fill forms, recover when something goes wrong. GPT-5.4 managed 75.0%; the jump to 78.7% sounds modest until you realise these last percentage points are where autonomous task completion either works or quietly fails halfway through. Pair that with 82.7% on Terminal-Bench 2.0, a command-line benchmark demanding planning and tool coordination, and the shape of the release becomes obvious. OpenAI did not optimise GPT-5.5 to be a better talker. It optimised it to be a better doer.
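One way to see why a few percentage points matter is to look at failures rather than successes. The sketch below uses only the four scores quoted above; the function names are illustrative, and it assumes a benchmark score can be read as a plain task success rate.

```python
# Benchmark scores quoted in the article, in percent.
SCORES = {
    "OSWorld-Verified": {"GPT-5.4": 75.0, "GPT-5.5": 78.7},
    "Terminal-Bench 2.0": {"GPT-5.4": 75.1, "GPT-5.5": 82.7},
}


def failure_reduction(old_score: float, new_score: float) -> float:
    """Relative drop in failure rate, as a percentage of the old failure rate."""
    old_fail = 100.0 - old_score
    new_fail = 100.0 - new_score
    return (old_fail - new_fail) / old_fail * 100.0


def summarise(scores=SCORES) -> dict:
    """Map each benchmark to its relative failure reduction, rounded to 0.1."""
    return {
        bench: round(failure_reduction(s["GPT-5.4"], s["GPT-5.5"]), 1)
        for bench, s in scores.items()
    }


if __name__ == "__main__":
    for bench, pct in summarise().items():
        print(f"{bench}: failure rate down {pct}% relative to GPT-5.4")
```

On these figures the OSWorld-Verified failure rate falls from 25.0% to 21.3%, a relative drop of about 14.8%, and the Terminal-Bench 2.0 failure rate falls by about 30.5%.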

The 1M-token context window is the unsung mobile feature

GPT-5.5 carries a one-million-token context window in the API, with a 128,000-token maximum output. On a desktop that means feeding a whole codebase or a quarter’s worth of documents in one shot. On a phone it means something subtler and arguably more important: an assistant that can hold the full thread of your day — your messages, your calendar, the document you were editing, the three tabs you left open — without losing the plot halfway through a task. The reason today’s mobile assistants feel forgetful is not personality. It is context budget. GPT-5.5 quietly removes that ceiling.
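To make the context-budget point concrete, here is a rough sketch of the arithmetic. The two limits are the API figures quoted above; everything else is an assumption for illustration: the token counts for a day's material are invented, the smaller 128,000-token window is a hypothetical comparison point, and the sketch assumes the reply is reserved out of the same window.

```python
CONTEXT_WINDOW = 1_000_000  # tokens, API figure quoted in the article
MAX_OUTPUT = 128_000        # tokens, maximum output quoted in the article


def context_budget(item_tokens, window=CONTEXT_WINDOW, reserve_output=MAX_OUTPUT):
    """Return (fits, tokens_left) after reserving room for the reply.

    Assumes the reserved output counts against the same window.
    """
    used = sum(item_tokens.values())
    remaining = window - reserve_output - used
    return remaining >= 0, remaining


# Hypothetical token counts for one day's material (not measured values).
day = {
    "messages": 40_000,
    "calendar": 5_000,
    "document": 60_000,
    "open_tabs": 90_000,
}

if __name__ == "__main__":
    print(context_budget(day))
    # Same material against a hypothetical 128,000-token window
    # with 8,000 tokens reserved for the reply.
    print(context_budget(day, window=128_000, reserve_output=8_000))
```

With these made-up numbers the day uses 195,000 tokens, leaving 677,000 spare in the 1M window, while the hypothetical smaller window comes up 75,000 tokens short.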

OpenAI also claims GPT-5.5 matches GPT-5.4’s per-token latency in real-world serving while operating at a much higher level of intelligence. That is the line that should worry competitors most. Historically every capability jump cost speed, and on mobile latency is the whole experience — an assistant that thinks for eight seconds is an assistant you stop using. Holding latency flat while pushing computer-use accuracy up is precisely the trade-off that makes an on-phone agent tolerable rather than a novelty you disable after a week.

Where GPT-5.5 still loses, and why that is fine

Let us be honest about the gaps, because OpenAI will not be. On SWE-Bench Pro, GPT-5.5’s 58.6% trails Anthropic’s Claude Opus 4.7 at 64.3%, and Tom’s Guide reported the model lost in all seven of its head-to-head test categories against the same Claude release. Critics also flagged a persistent tendency to hallucinate confidently rather than admit a knowledge gap — the oldest and most dangerous failure mode for anything you let act autonomously on your behalf. There was even a documented quirk where the model kept inserting goblins and gremlins into outputs until OpenAI filtered it out, which is funny until you imagine it inside an agent booking your travel.

But raw coding supremacy is not the metric that decides the mobile assistant war: reliable, low-latency action across everyday apps is. On the computer-use and knowledge-work benchmarks that actually map to “do my admin for me”, GPT-5.5 leads. Anthropic is winning the developer’s terminal; OpenAI is aiming at the other billion people who never open one. Both can be true. The doubled API price, £4 (about $5) input and £24 (about $30) output per million tokens against GPT-5.4’s £2 (about $2.50) and £12 (about $15), is the real cost question for businesses. It is the same calculus we covered when Claude for Small Business launched with QuickBooks and PayPal inside: capability is now cheap to access and expensive to run at scale.
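A quick way to see what "expensive to run at scale" means is to price a workload at each rate. The sketch below uses the dollar figures quoted in this article, which avoids the rounding in the pound conversions. The workload itself (10,000 tasks a month at 20,000 input and 2,000 output tokens each) is a hypothetical example, not a measured one.

```python
# USD per million tokens as (input, output), as quoted in the article.
PRICES = {
    "GPT-5.4": (2.50, 15.00),
    "GPT-5.5": (5.00, 30.00),
    "GPT-5.5 Pro": (30.00, 180.00),
}


def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a given number of input and output tokens."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def monthly_cost(model: str, tasks: int, in_per_task: int, out_per_task: int) -> float:
    """Cost in USD for a month of identical tasks."""
    return job_cost(model, tasks * in_per_task, tasks * out_per_task)


if __name__ == "__main__":
    # Hypothetical workload: 10,000 tasks, 20,000 input + 2,000 output tokens each.
    for model in PRICES:
        print(f"{model}: ${monthly_cost(model, 10_000, 20_000, 2_000):,.2f} per month")
```

For that assumed workload the bill is $800 a month on GPT-5.4, $1,600 on GPT-5.5 and $9,600 on GPT-5.5 Pro, so the doubling carries straight through to the invoice.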

The verdict: GPT-5.5 is the first credible agent your phone will run

GPT-5.5 is not the model that beats every rival on every chart, and anyone telling you it is has not read the SWE-Bench Pro line. It is something more strategically dangerous: the first frontier model deliberately tuned for agentic computer use, shipped free into the most-installed AI app on the planet, with a context window big enough to hold a working day and a latency profile that does not punish you for the privilege. For coders chasing the absolute top score, Claude Opus 4.7 still edges it. For the mobile assistant war — the fight over who runs the agent in your pocket — GPT-5.5 just moved OpenAI into the lead, and Apple and Google are now responding to OpenAI’s roadmap rather than setting their own.

Watch the ChatGPT app, not the benchmark tables, over the next two quarters. If OpenAI ships native computer-use actions to the mobile app on top of GPT-5.5 — and the bet looks even sharper once you read how OpenAI fast-tracked its own AI phone to 2027 — the question stops being “which assistant answers best” and becomes “which assistant actually finishes the job”, and right now OpenAI has the only honest claim to the second.
