While o1 was a major technological advance, GPT-5 is, above all, a refined product. During a press briefing, Sam Altman compared GPT-5 to Apple’s Retina display, and the analogy is apt, though perhaps not in the way he intended. Much like an unusually crisp screen, GPT-5 will offer a more pleasant and seamless user experience. That is not nothing, but it falls far short of the transformative future that Altman has spent much of the past year hyping. In the briefing, Altman called GPT-5 “an important step on the path to AGI,” or artificial general intelligence, and maybe he is right, but if so, it is a very small step.
Take the demo of the model’s capabilities that OpenAI showed MIT Technology Review before its release. Yann Dubois, a post-training lead at OpenAI, asked GPT-5 to design a web application that would help his partner learn French and communicate more easily with his family. The model did a commendable job of following his instructions and produced an appealing, user-friendly app. But when I gave GPT-4o an almost identical prompt, it produced an app with the same functionality. The only difference was that it was not as aesthetically pleasing.
Some of the other user-experience improvements are more substantial. Having the model, rather than the user, decide whether to apply reasoning to each query removes a major pain point, especially for users who do not follow LLM developments closely.
And, according to Altman, GPT-5 reasons much faster than the o-series models. The fact that OpenAI is releasing it to nonpaying users suggests that it is also less expensive for the company to run. That is a big deal: running powerful models cheaply and quickly is a hard problem, and solving it is key to reducing AI’s environmental impact.
OpenAI has also taken steps to reduce hallucinations, which have been a persistent headache. OpenAI’s evaluations suggest that GPT-5 models are much less likely to make false claims than their predecessors, o3 and GPT-4o. If that progress holds up under scrutiny, it could help pave the way for more reliable and trustworthy agents. “Hallucination can cause real safety and security issues,” says Dawn Song, a professor of computer science at UC Berkeley. For example, an agent that hallucinates software packages could download malicious code onto a user’s device.
GPT-5 has achieved state-of-the-art results on several benchmarks, including a test of agentic capabilities and the coding evaluations SWE-Bench and Aider Polyglot. But according to Clémentine Fourrier, an AI researcher at the company Hugging Face, those evaluations are nearing saturation, which means that existing models have already achieved close to the maximum possible performance.
“It’s basically like looking at the performance of a high schooler on middle-grade problems,” she says. “If the high schooler fails, it tells you something, but if it succeeds, it doesn’t tell you much.” Fourrier said she would be impressed if the system scored 80% or 85% on SWE-Bench, but it managed only 74.9%.
Ultimately, OpenAI’s headline message is that GPT-5 feels better to use. “The vibes of this model are really good, and I think people are really going to feel it, especially average people who haven’t been spending their time thinking about models,” said Nick Turley.
Vibes alone, however, will not bring about the automated future that Altman has promised. Reasoning felt like a major step on the path to AGI. We are still waiting for the next one.