If you refreshed your ChatGPT app this week, you’ll have seen yet another change: new models, organized into a tighter, more streamlined setup. Gone is the long list of experimental models (o3, GPT‑4.5, etc.). Instead, we have three models to work with:
GPT‑5 (Flagship model)
GPT‑5 Thinking (Get more thorough answers)
GPT‑5 Pro (Research‑grade intelligence).
These models are still new. Most of us have only had access to them for 24 hours. But in that short time, I have compiled a few observations:
A cleaner interface and model routing
Unified “auto‑switching” system: GPT‑5 hides most model choices from users and automatically routes simple questions to lightweight models and harder queries to models that “think” longer. Paid users can still choose between GPT‑5 and GPT‑5 Thinking, while the most expensive “Pro” tier gets GPT‑5 Thinking Pro.
Simple UI changes: The model picker is gone, the interface looks more polished and “smooth,” and you can change accent colours. These are more aesthetic than functional, but I always like me a smoother, cleaner, streamlined design.
Small improvement in capabilities and performance
Competent but not a dramatic leap: Developer Simon Willison, who tested the model early, calls GPT‑5 his “new favorite model.” He notes it “rarely screws up and generally feels competent” but reminds us that it is still an LLM, not a radical breakthrough.
Reasoning modes: GPT‑5 includes regular, mini, and nano models with four reasoning levels (minimal, low, medium, high), and the system uses a router to decide how much computation to spend on a question. The Pro tier (GPT‑5 Thinking Pro) uses extra compute to deliver “research‑grade” answers.
Reduced hallucinations and sycophancy: OpenAI’s system card claims significant reductions in hallucinations and “sycophancy” (agreeing with user biases). It introduces a new “safe‑completions” mechanism that moderates answers rather than refusing them outright. I’m glad to see progress on this front, though it’s been a while since I’ve personally experienced hallucinations on the platform. That could be a function of my prompting methods. I could use less sycophancy though, so I’m very pleased to see this improvement.
Context and memory: GPT‑5 offers up to a 272k‑token input limit with separate mini and nano variants. That means this model can draw on conversation history, but some early testers are still reporting incorrect assumptions in GPT-5’s “memory”. Here too, I tend to avoid this problem by being disciplined about my threads and conversations. Don’t rely on one massive thread and you’ll be alright.
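To make that 272k‑token figure concrete, here is a minimal sketch of how you might sanity‑check whether a prompt fits the input limit before sending it. The four‑characters‑per‑token ratio is a common rule of thumb for English text, not an exact count, and the helper names (`estimate_tokens`, `fits_context`) are my own illustration, not part of any OpenAI tooling.

```python
# Rough sketch: does a prompt fit GPT-5's reported 272k-token input limit?
# The 4-chars-per-token ratio is a rule of thumb, not a real tokenizer;
# accurate counts need a tokenizer library.

GPT5_INPUT_LIMIT = 272_000
CHARS_PER_TOKEN = 4  # rough average for English prose

def estimate_tokens(text: str) -> int:
    """Crude token estimate from character count."""
    return max(1, len(text) // CHARS_PER_TOKEN)

def fits_context(prompt: str, reserved_for_output: int = 8_000) -> bool:
    """True if the estimated prompt size leaves headroom under the limit."""
    return estimate_tokens(prompt) <= GPT5_INPUT_LIMIT - reserved_for_output

print(fits_context("Summarize this memo."))  # a short prompt easily fits
print(fits_context("x" * 2_000_000))         # ~500k estimated tokens: too big
```

Even with a generous limit like this, the discipline of keeping threads focused (rather than one massive conversation) does more for answer quality than raw context size.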
Less robotic tone: Claude has long been better at writing in a natural, human tone; ChatGPT, by contrast, has always suffered from sounding robotic. Well, I’m only 24 hours into testing it, but it is much improved on this front. I’ve got no quantifiable proof of this, just my early qualitative observations. It’s encouraging nonetheless.
Feature additions and integrations
Voice and multimodal improvements: GPT‑5’s “advanced voice mode” improves instruction following and adapts responses based on context. Voice mode is now available to free users with limits, while paid plans get near‑unlimited use. I don’t use this feature nearly enough. I have friends and colleagues who love using it when driving, to think through a problem with GPT, much as they might talk through a problem with a friend on the phone.
Enhanced coding features: OpenAI demonstrated new “vibe coding” capabilities that let users generate interactive web apps via natural‑language descriptions. This aligns with a broader industry shift toward coding assistance as a primary commercial use for LLMs. And it certainly makes it easier for non-coders to produce prototypes to share with experts.
Google services integration: Pro users will soon gain Gmail and Google Calendar integration, letting ChatGPT automatically reference emails and schedules when planning tasks. I imagine a big efficiency gain on scheduling challenges in the near future.
Smaller models and cost controls: Mini and nano versions automatically activate after usage limits to keep costs down. Pricing is aggressively competitive: GPT‑5 charges $1.25 per million input tokens and $10 per million output tokens, significantly undercutting competitors like Claude Opus and Gemini Pro. OpenAI’s low pricing may spark a price war, which, in turn, should make wider adoption in our industry easier.
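To see what those per‑million‑token rates mean in practice, here is a back‑of‑envelope sketch using the quoted GPT‑5 prices. The request sizes (5k tokens in, 1k out) are hypothetical, chosen only to represent a typical drafting task.

```python
# Back-of-envelope API cost at the quoted GPT-5 rates:
# $1.25 per million input tokens, $10 per million output tokens.

def request_cost(input_tokens: int, output_tokens: int,
                 in_rate: float, out_rate: float) -> float:
    """Cost in dollars; rates are per million tokens."""
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# Hypothetical drafting task: 5k tokens in, 1k tokens out.
# 5,000 / 1M * $1.25 = $0.00625; 1,000 / 1M * $10 = $0.01; total $0.01625.
per_request = request_cost(5_000, 1_000, in_rate=1.25, out_rate=10.0)
print(f"Per request: ${per_request:.4f}")
print(f"Per 1,000 requests: ${per_request * 1_000:.2f}")
```

At under two cents per substantial request, the cost of experimenting is effectively negligible, which is exactly what drives the adoption pressure described above.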
Overall takeaways
Manage expectations: GPT‑5 is a solid incremental improvement, but it’s not a revolution. Yes, it excels at coding and structured reasoning but it doesn’t magically produce expert‑level insights in humanities or policy analysis. Use it as a thought partner, not to produce your finished product.
Use appropriate modes: For simple tasks, the default GPT‑5 is faster and adequate. When deeper analysis is needed (e.g., preparing regulatory submissions), “GPT‑5 Thinking” or “Pro” will deliver more comprehensive answers, provided you’re willing to pay for a Pro plan.
Consider the broader AI arms race: GPT‑5’s release highlights intense competition and rapid iteration across AI labs. Public‑affairs strategies should factor in that capabilities will continue to evolve quickly, and that policy debates over safety, misinformation, and labor impacts will intensify.