Machine Heart Report
Editor: Zenan, Yang Wen
GPT-5 has yet to appear, and netizens have started posting memes mocking the wait.
In fact, rumors about GPT-5 have been swirling nonstop in recent days.
First, netizens spotted traces of GPT-5-Auto and GPT-5-Reasoning models in the macOS ChatGPT app.
Then, other netizens revealed that Microsoft Copilot and Cursor have already quietly begun testing GPT-5.
On August 1st, The Information even wrote a long article titled "Inside OpenAI's Rocky Path to GPT-5," revealing more insider information about GPT-5.
Here is the condensed version:
- GPT-5 will bring real improvements over its predecessor, but nothing like the performance leaps between earlier GPT versions.
- OpenAI ran into a series of technical problems this year that threatened the progress of o3 and other models.
- Meta poached OpenAI researchers and executives, dealing the company a blow; complaints about the resulting team changes surfaced on the internal Slack.
Next, let's take a closer look at the gossip.
GPT-5 has improvements, but the performance leap is not as significant as before
Last December, OpenAI demonstrated the results of test-time scaling, which became a key breakthrough of the post-pre-training era for large models. The tests showed that when an AI model is given more time and compute to work on a task, its performance keeps improving. The approach had already proven effective in practice with OpenAI's o1 and DeepSeek-R1. It seemed that many ChatGPT users were about to be amazed by the new AI's capabilities.
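The "more compute, better answers" effect can be illustrated with self-consistency sampling, one well-known form of test-time scaling: draw several candidate answers and take a majority vote. The simulation below is a toy sketch, not a real LLM; the stub model and its 40% per-sample accuracy are invented for illustration.

```python
import random
from collections import Counter

def sample_answer(correct: str, accuracy: float, rng: random.Random) -> str:
    """Stub 'model': returns the correct answer with probability `accuracy`,
    otherwise one of several distinct wrong answers."""
    if rng.random() < accuracy:
        return correct
    return rng.choice(["wrong_a", "wrong_b", "wrong_c"])

def majority_vote(correct: str, n_samples: int, accuracy: float, seed: int) -> str:
    """Self-consistency: sample several answers, return the most common one."""
    rng = random.Random(seed)
    votes = Counter(sample_answer(correct, accuracy, rng) for _ in range(n_samples))
    return votes.most_common(1)[0][0]

def success_rate(n_samples: int, trials: int = 500, accuracy: float = 0.4) -> float:
    """Fraction of trials in which majority voting recovers the right answer."""
    hits = sum(majority_vote("42", n_samples, accuracy, seed=t) == "42"
               for t in range(trials))
    return hits / trials
```

Even though each individual sample is right only 40% of the time, voting over many samples recovers the correct answer far more often, which is the basic intuition behind spending more inference-time compute.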
However, the excitement did not last long.
According to two people involved in the development, when OpenAI researchers turned this new AI into o3, a chat-based version that could respond to instructions from ChatGPT users, the performance improvements seen on earlier benchmarks largely disappeared.
This is just one example of the many technical challenges OpenAI has faced this year. Mounting difficulties are slowing the pace of AI development and may even affect ChatGPT, the company's hit AI application.
Reportedly, OpenAI researchers have already found ways to improve the upcoming GPT-5.
According to sources, including engineers inside OpenAI, the upcoming flagship model has significantly improved capabilities in programming, mathematics, and other areas.
One source said the new model is better at adding features when writing application code, making the results more usable and more polished visually. He said GPT-5 also outperforms its predecessors at driving AI agents through complex tasks with minimal human supervision. For example, it can follow a complex set of rules determining when a customer-service agent should issue a refund; earlier models had to be tested against several tricky customer scenarios (edge cases) before they could handle such refunds reliably.
Another insider said these improvements are nothing like the performance leaps of earlier GPT models, such as the jump from GPT-3 in 2020 to GPT-4 in 2023. The slowdown in performance gains over the past 12 months suggests OpenAI may struggle to stay ahead of its biggest competitors, at least in raw AI capability.
OpenAI's models already generate significant commercial value through ChatGPT and other applications, so even incremental improvements would increase customer demand. They could also give investors the confidence to fund OpenAI's plan to spend $4.5 billion on GPUs for developing and running its products over the next three and a half years.
Improving automated coding capabilities becomes OpenAI's top priority
Recent developments also explain why OpenAI executives have told some investors in recent weeks that they believe the company can reach "GPT-8." That claim echoes CEO Sam Altman's public remarks that, with its existing technical know-how, OpenAI expects to build AI that rivals the abilities of the smartest humans, i.e., AGI.
Although there is still a long way to go to achieve AGI, the upcoming GPT-5 model may have other attractions besides better coding and reasoning.
According to a Microsoft employee briefed on the matter (Microsoft holds exclusive rights to use OpenAI's intellectual property), some Microsoft leaders have told staff that test results show GPT-5 can generate higher-quality code and other text-based answers without consuming more compute.
Part of the reason, the person said, is that GPT-5 is better than previous models at judging which tasks need relatively more compute and which need less.
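The article does not say how GPT-5 decides which tasks deserve more compute, so as a purely illustrative sketch, here is a toy router that assigns a compute tier from surface features of the prompt. The marker words and thresholds are invented, not OpenAI's actual criteria; a real system would use a learned classifier.

```python
def route_effort(prompt: str) -> str:
    """Toy compute router: map a prompt to a tier ("low"/"medium"/"high").

    The keyword and length heuristics below are invented purely for
    illustration, not OpenAI's actual routing logic.
    """
    text = prompt.lower()
    hard_markers = ("prove", "debug", "step by step", "optimize", "refactor")
    if any(marker in text for marker in hard_markers) or len(text.split()) > 60:
        return "high"    # spend more inference-time compute on hard requests
    if text.endswith("?") and len(text.split()) <= 8:
        return "low"     # short factual question: answer cheaply
    return "medium"      # default tier for everything else
```

With these invented heuristics, `route_effort("What is 2+2?")` lands in the cheap tier, while a prompt containing "debug" is routed to the expensive one.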
After competitors such as Anthropic took the lead in building and selling coding-focused models to software developers and coding assistants like Cursor, OpenAI's internal assessments show that improving AI's ability to perform coding tasks automatically has become the company's top priority.
OpenAI employees believe that automated coding is not only crucial for the company's business, but also for automating the work of AI researchers themselves.
Pressure from organizational restructuring
OpenAI's progress has not been smooth, as its researchers and management have faced new pressures this year.
First, the delicate relationship with Microsoft.
Microsoft is OpenAI's largest outside shareholder, and under the contract between the two companies it has the right to use some of OpenAI's technologies through 2030. Even so, some senior OpenAI researchers object to handing their innovations and inventions over to Microsoft.
In financial terms, Microsoft and OpenAI have a very close partnership, but there are disputes over the specific terms of the cooperation, with both sides demanding concessions from each other.
OpenAI hopes to restructure its for-profit arm to prepare for a future public listing. Although some details remain unsettled, the two sides have reached preliminary agreement on key points, such as Microsoft potentially taking roughly 33% equity in the restructured OpenAI.
Second, Meta's relentless poaching.
Recently, Meta spent heavily to recruit more than ten researchers from OpenAI, some of whom had contributed to the company's recent technical improvements.
This talent loss and subsequent personnel adjustments have put pressure on OpenAI's management.
Last week, OpenAI research vice president Jerry Tworek complained to his manager Mark Chen about the team changes in the company's internal Slack, in a message many colleagues could see. Tworek said he needed a week off to reassess the situation, though he ultimately did not take the leave.
The setback of the Orion model
Although OpenAI has made some progress in the business, there are still concerns within the company about whether it can continue to improve AI and maintain its leading position, especially facing well-funded competitors like Google, xAI, and Anthropic.
In the second half of 2024, OpenAI developed a model called Orion, originally slated for release as GPT-5 and expected to outperform the then-current GPT-4o. Orion fell short of the expected gains, however, so OpenAI released it as GPT-4.5, and the model made little impact.
One reason for Orion's failure lay in its pre-training phase, the first step of model development, in which the model processes vast amounts of data to learn the connections between concepts. OpenAI faced a shortage of high-quality data, and it also found that optimizations which worked well when the model was small stopped working once the model was scaled up.
o3's strength comes from more NVIDIA chips
Additionally, OpenAI's researchers face other problems.
Last year, OpenAI developed reasoning models, which perform better when given more compute to work out their answers. These models grew out of a late-2023 breakthrough known as Q*, which stunned the company's researchers because it could solve math problems it had never seen before. By 2024, reasoning models appeared to have helped the company get past the slowdown in pre-training gains.
Last autumn, OpenAI released its first major reasoning model as o1. The release gave OpenAI renewed influence in the AI field and laid the groundwork for AI agents built on reasoning models.
According to people involved in the development, OpenAI built the next reasoning model, o3, before the end of 2024; like o1, it is based on GPT-4o. Although o3 and o1 share the same lineage, o3's parent model (also called the teacher model) made significant progress over o1's in understanding various scientific fields and other domains.
One reason for the progress is that OpenAI decided to train o3's parent model on more NVIDIA-powered servers, which in effect gave the model more processing power to grasp complex concepts.
Another is that researchers gave the o3 parent model the ability to search the web and retrieve information from code repositories, which also helped its performance surpass that of o1's parent model.
As of two months ago, the models in development were not yet worthy of the GPT-5 name
OpenAI publicly shared selected test results highlighting the model's strengths, which made headlines worldwide and sparked wild hype on social media. Reality, however, soon set in.
People involved in the development said that when OpenAI converted the o3 parent model into a version of ChatGPT that people could query (known as a student model), the gains shrank significantly, and the result performed no better than o1. The same problem occurred, they said, when OpenAI built the version of the model for its commercial API.
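The teacher-to-student conversion described here resembles knowledge distillation, in which a student model is trained to match the teacher's output distribution, and some of the teacher's capability can be lost along the way. The sketch below is a minimal illustration on an invented three-class problem: the student's logits are fitted by gradient descent on the cross-entropy to the teacher's probabilities.

```python
import math

def softmax(logits: list[float]) -> list[float]:
    """Convert logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill(teacher_probs: list[float], steps: int = 200, lr: float = 0.5) -> list[float]:
    """Fit student logits to a teacher distribution by gradient descent.

    The gradient of cross-entropy(teacher, student) with respect to the
    student's logits is simply (student_probs - teacher_probs).
    """
    logits = [0.0] * len(teacher_probs)   # student starts out uniform
    for _ in range(steps):
        probs = softmax(logits)
        logits = [z - lr * (p - t)
                  for z, p, t in zip(logits, probs, teacher_probs)]
    return softmax(logits)
```

After a couple hundred steps the student's distribution closely matches a teacher of [0.7, 0.2, 0.1]. In real distillation the student is a full network trained over many examples, and the mismatch that remains is one intuition for the quality drop the article describes.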
An informed source said one reason is the model's unique way of representing concepts, which may differ from how humans communicate. Creating a chat-based version, he said, effectively degrades the original model, because it forces the model to express itself in human language rather than its own internal representation.
The gibberish that sometimes appears in the visible reasoning traces of reasoning models in ChatGPT reflects this communication gap.
According to another informed source, the company has not invested much effort in training the model to communicate better with humans.
Despite these regressions, the o3 reasoning model OpenAI released this year has still helped scientists in fields such as nuclear fusion and pathogen detection generate new hypotheses and run experiments.
Still, bringing reasoning models into ChatGPT did not go as smoothly as OpenAI's executives and researchers had hoped. Altman told employees that the o-series models seemed to confuse ChatGPT customers, so the company reverted to the GPT naming convention.
According to a person involved in GPT-5's development, as of June the models OpenAI was building did not yet seem good enough, because of technical issues, to carry the GPT-5 label.
The technology behind GPT-5, and the final trump card
Nevertheless, OpenAI still has a card to play: according to an informed source, it has been developing a "universal verifier," which automates the step in reinforcement learning of checking that the model produces high-quality answers. In essence, one large language model checks and grades another model's answers, drawing on various research sources.
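As a rough illustration of the verifier idea, the sketch below grades candidate answers against trusted reference snippets and keeps the highest-scoring one. OpenAI's verifier reportedly uses another LLM as the grader; the lexical Jaccard overlap here is only an invented stand-in for that scoring step.

```python
def verifier_score(answer: str, references: list[str]) -> float:
    """Score an answer by its best Jaccard word overlap with any reference.

    Invented stand-in for a real verifier, which would itself be an LLM
    grading the answer against research sources.
    """
    answer_tokens = set(answer.lower().split())
    if not answer_tokens:
        return 0.0
    best = 0.0
    for ref in references:
        ref_tokens = set(ref.lower().split())
        overlap = len(answer_tokens & ref_tokens) / len(answer_tokens | ref_tokens)
        best = max(best, overlap)
    return best

def pick_best(candidates: list[str], references: list[str]) -> str:
    """RL-style selection: keep the candidate the verifier rates highest,
    which would then be reinforced during training."""
    return max(candidates, key=lambda c: verifier_score(c, references))
```

In a reinforcement-learning loop, the verifier's score would serve as the reward signal, so the model is steered toward answers the grader rates highly.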
In early summer, after an unannounced model achieved gold-medal-level results on the IMO, senior researcher Alexander Wei posted on X that the reinforcement learning method he had been using was "general," meaning it could also verify the quality of answers in more subjective categories.
These advances appear to be helping OpenAI develop GPT-5, which has shown progress both in readily verifiable areas like software programming and in more subjective ones like creative writing.
Other companies, including xAI and Google, also bet heavily on reinforcement learning as a promising way to improve AI models. Tworek, who heads OpenAI's reinforcement learning effort, recently said publicly that he considers the reinforcement learning system behind OpenAI's models a genuine component of artificial general intelligence (AGI).
Anticipation for GPT-5 is high. Last week, Sam Altman promoted GPT-5's capabilities on a podcast with comedian Theo Von, describing how the model easily answered questions he himself couldn't. "GPT-5 is smarter than us in almost every aspect," Altman said.
It is precisely because of the promising prospects that OpenAI has made significant progress in its latest round of financing.
New round of financing, venture capital vying to buy in
According to a report in The New York Times this Friday, OpenAI has just raised $8.3 billion, bringing its valuation to $300 billion. The deal is part of OpenAI's broader plan to raise $40 billion this year.
The report says the round was oversubscribed and closed several months ahead of schedule. OpenAI initially raised $2.5 billion from venture capital firms in March, announcing at the time a $40 billion round led by SoftBank. It had originally planned to raise a further $7.5 billion by the end of the year, but with investors eager to buy into its cap table amid strong growth, OpenAI ultimately secured the funding at a lower cost.
ChatGPT now has more than 700 million weekly active users, driving OpenAI's annual revenue to nearly $13 billion, with $20 billion expected by year's end. In addition, the U.S. government's "AI Action Plan" and the ongoing negotiations with Microsoft may help the giant startup hit its annual financial targets.
The round was led by Dragoneer Investment Group, which invested $2.8 billion. Many new investors joined, including private equity giants Blackstone and TPG and mutual-fund manager T. Rowe Price; other participants included Altimeter Capital, Andreessen Horowitz, Coatue Management, D1 Capital Partners, Fidelity Management, Founders Fund, Sequoia Capital, Tiger Global, and Thrive Capital.
Some of OpenAI's early investors are reportedly frustrated by the smaller allocations they received in this round.
Reference content:
https://www.theinformation.com/articles/inside-openais-rocky-path-gpt-5
https://www.nytimes.com/2025/08/01/business/dealbook/openai-ai-mega-funding-deal.html
Original: https://www.toutiao.com/article/7534551604717126179/