OpenAI debuts “o1” AI models, promising PhD-level reasoning in science and math
In a nutshell: OpenAI has unveiled a new series of AI language models called “o1,” engineered to improve reasoning, particularly on complex problems in science, coding, and mathematics. The company considers the advance significant enough that it has reset the model version counter to 1, starting anew after GPT-4o, and has notably moved away from the GPT branding.
The first model in the “o1” series, named “o1-preview,” is available in both ChatGPT and OpenAI’s API. Although it is a preview release, the company says regular updates and improvements are planned.
The “o1” models have been trained to spend more time working through a problem before providing an answer. This approach allows the models to try different strategies, identify their own errors, and tackle complex tasks in a more systematic, human-like manner.
The results shared by OpenAI suggest a significant advancement with the new “o1” models. According to the company, these models perform at a level comparable to PhD students on challenging benchmarks in fields such as physics, chemistry, and biology.
For example, OpenAI says the new model reached 83 percent accuracy on a qualifying exam for the International Mathematics Olympiad, a notable improvement over the 13 percent accuracy of GPT-4o.
Of course, AI benchmarks can sometimes be unreliable, so the true performance of the “o1” models will become clearer as more users test them in various scenarios.
Additionally, the new models seem to resolve some long-standing questions, such as the number of R’s in “strawberry,” finally putting the memes to rest. OpenAI also showcased a demo where the model successfully generated Python code for an arcade game, highlighting its advanced capabilities.
OpenAI was previously reported to be working on a project codenamed “Strawberry” to develop models capable of tackling complex reasoning tasks. Given that the “o1” series seems to be the result of the Strawberry project, it’s amusing to think that the project’s name might have been inspired by the “strawberry” test.
In addition to enhancing reasoning capabilities, OpenAI also focused on strengthening defenses against “jailbreaking,” a technique used to bypass safety mechanisms. According to the company, the “o1-preview” scored 84 out of 100 in one of its most challenging jailbreaking tests, compared to only 22 for GPT-4o.
To make these models more accessible, especially for developers, OpenAI is also releasing a lighter “o1-mini” version designed for coding tasks.
Access to both “o1-mini” and “o1-preview” is now rolling out to paid ChatGPT Plus and Team plans. While the advanced reasoning capabilities are currently opt-in with weekly usage limits, OpenAI is working to expand capacity and enable automatic model selection based on the complexity of the prompt.