OpenAI has unveiled a new version of ChatGPT, based on a new AI technology called OpenAI o1. What are the new features and how is the new version better for scientists?
ChatGPT: New AI technology OpenAI o1
On September 12, OpenAI unveiled a new version of the ChatGPT virtual assistant based on OpenAI's new o1 AI technology. Since the launch of the first version of ChatGPT in 2022, the chatbot has been steadily improving and gaining new capabilities. In this article, we will look at the latest updates and see how the new version differs from, and improves on, its predecessors.
Demo and features of the new ChatGPT
OpenAI said a chatbot based on the OpenAI o1 technology can "reason", working through problems in math, coding, and science.
During a demonstration for The New York Times, the chatbot answered puzzles and chemistry questions at a PhD level and diagnosed an illness based on a detailed report of a patient's symptoms and medical history.
The company also noted that the new technology could help physicists generate complex mathematical formulas and assist health researchers in their experiments.
Experts have trained these models to spend more time analyzing a problem before providing an answer, mimicking the human approach. Through this learning process, the models refine their reasoning, try out different strategies, and learn to recognize their own mistakes.
The need to improve artificial intelligence
ChatGPT learned by analyzing large amounts of text from various sources on the internet, including Wikipedia articles, books, and chat forums. By detecting patterns in these texts, it gained the ability to generate new text on its own. However, because misinformation is widespread on the internet, the model can reproduce those inaccuracies and sometimes even invent new ones.
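To make "analyzing patterns in text" concrete, here is a deliberately tiny sketch of pattern-based generation: a toy bigram model that counts which word follows which and samples new text from those counts. This illustrates only the general idea; ChatGPT itself uses large neural networks, not word-count tables, and the corpus below is a made-up placeholder.

```python
import random
from collections import defaultdict

def train_bigram_model(text):
    """Count, for each word, which words follow it and how often."""
    counts = defaultdict(lambda: defaultdict(int))
    words = text.split()
    for current_word, next_word in zip(words, words[1:]):
        counts[current_word][next_word] += 1
    return counts

def generate(model, start_word, length=10):
    """Sample a continuation word by word, weighted by observed counts."""
    word, output = start_word, [start_word]
    for _ in range(length):
        followers = model.get(word)
        if not followers:
            break
        next_words, weights = zip(*followers.items())
        word = random.choices(next_words, weights=weights)[0]
        output.append(word)
    return " ".join(output)

corpus = "the model reads text and the model finds patterns in the text it reads"
print(generate(train_bigram_model(corpus), "the"))
```

The same mechanism that lets such a model produce fluent-looking text also lets it reproduce whatever errors its training data contained, which is the weakness described above.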
The developers built the new OpenAI system using a method called reinforcement learning, which allows the system to learn through repeated trial and error, a process that can take weeks or months. For example, when solving math problems, the system learns which methods lead to the right answer and which don't. After solving many such problems, it begins to notice patterns, though that does not mean its thinking is human-like. OpenAI's technical team emphasizes that the system can still make mistakes and isn't perfect, but users can expect it to work harder and be more likely to reach the right answer.
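As a rough illustration of learning through trial and error (OpenAI has not published o1's training details, so every name and number below is a made-up placeholder), here is a minimal epsilon-greedy sketch: the agent tries different solution strategies, receives a reward of 1 when a strategy produces a correct answer, and gradually shifts toward the strategies that work.

```python
import random

# Made-up strategies with success probabilities unknown to the learner.
SUCCESS_RATE = {"guess": 0.05, "work_backwards": 0.4, "decompose": 0.7}

def attempt(strategy):
    """One trial: reward 1 if the strategy yields a correct answer, else 0."""
    return 1 if random.random() < SUCCESS_RATE[strategy] else 0

def train(episodes=5000, epsilon=0.1, lr=0.1):
    """Epsilon-greedy trial and error over solution strategies."""
    value = {s: 0.0 for s in SUCCESS_RATE}  # estimated worth of each strategy
    for _ in range(episodes):
        if random.random() < epsilon:       # occasionally explore at random
            strategy = random.choice(list(value))
        else:                               # otherwise exploit the best so far
            strategy = max(value, key=value.get)
        reward = attempt(strategy)
        value[strategy] += lr * (reward - value[strategy])  # nudge the estimate
    return value

print(train())  # "decompose" should end up with the highest estimated value
```

The estimates converge toward each strategy's true success rate, which is the sense in which the system "learns which methods lead to the right answer and which don't."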
Testing the new OpenAI o1 technology
OpenAI said the new technology performed better than previous technologies on some standardized tests.
In tests, the new model demonstrates PhD-level performance on difficult questions in physics, chemistry, and biology, and it also shows excellent results in mathematics and programming. On the 2024 AIME exam, the GPT-4o model solved only 12% (1.8/15) of problems on average. In contrast, the o1 model solved 74% (11.1/15) with a single attempt per problem, 83% (12.5/15) with consensus across 64 sampled attempts, and 93% (13.9/15) when re-ranking 1,000 sampled attempts with a learned scoring function. A score of 13.9 places the model among the top 500 students nationally and above the cutoff for the USA Mathematical Olympiad.
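The "consensus across 64 attempts" figure refers to majority voting: sample many independent solutions and submit the most common final answer (re-ranking with a learned scoring function works similarly, but picks the answer a trained verifier rates highest). Here is a minimal sketch of the voting step, where solve_once is a hypothetical stand-in for one model sample:

```python
import random
from collections import Counter

def solve_once(problem):
    """Hypothetical stand-in for one sampled model solution's final answer."""
    return random.choice([42, 42, 42, 17, 99])  # right answer sampled most often

def consensus_answer(problem, attempts=64):
    """Majority vote: sample many solutions, submit the most common answer."""
    answers = [solve_once(problem) for _ in range(attempts)]
    return Counter(answers).most_common(1)[0][0]

print(consensus_answer("AIME problem 7"))  # almost always prints 42
```

Even when a single sample is right only slightly more often than it is wrong, aggregating many samples pushes the final accuracy much higher, which is why the 64-attempt and 1,000-attempt figures climb well above the single-attempt score.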
A model initialized from o1 and fine-tuned for programming scored 213 points at the 2024 International Olympiad in Informatics (IOI), ranking in the 49th percentile. It competed under real-world conditions: 10 hours to solve 6 algorithmic problems, with 50 submissions allowed per problem. Submissions were selected based on their performance on public and model-generated tests. Had submissions been chosen at random, the average score would have been only 156 points, so the selection strategy was worth nearly 60 points in a highly competitive setting. With the submission limit relaxed, the model achieved 362.14 points, exceeding the gold medal threshold. For comparison, on the Codeforces platform the GPT-4o model achieved an Elo rating of 808, placing in the 11th percentile of human competitors, while the fine-tuned model reached an Elo rating of 1807, better than 93% of competitors.
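The selection step described above can be sketched as follows. The actual IOI strategy has not been published in detail, so everything below is an illustrative placeholder: candidate programs are scored on public and self-generated tests, and only the top 50 are submitted.

```python
# Toy stand-ins: a "candidate program" is a Python function, and a test is
# an (input, expected_output) pair. All names and tests here are illustrative.

def score_candidate(program, tests):
    """Fraction of tests the candidate passes (public + generated combined)."""
    passed = sum(1 for x, expected in tests if program(x) == expected)
    return passed / len(tests)

def select_submissions(candidates, public_tests, generated_tests, limit=50):
    """Rank candidates by test performance and keep at most `limit` of them."""
    tests = public_tests + generated_tests
    return sorted(candidates, key=lambda p: score_candidate(p, tests),
                  reverse=True)[:limit]

# Example: pick the best "solutions" to the problem "double the input".
candidates = [lambda x: x * 2, lambda x: x + 2, lambda x: x * x]
public_tests = [(1, 2), (3, 6)]
generated_tests = [(10, 20), (0, 0)]
best = select_submissions(candidates, public_tests, generated_tests, limit=2)
print([p(5) for p in best])  # the correct doubling solution ranks first -> 10
```

Ranking submissions by test performance rather than picking them at random is what accounts for the roughly 60-point gap the paragraph describes.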