OpenAI Launches First Model With Reasoning Abilities

OpenAI CEO Sam Altman described o1 as the company's most capable and aligned model yet, but admitted that “o1 is still flawed, still limited, and it still seems more impressive on first use than it does after you spend more time with it.”

TMTPOST -- In a groundbreaking move, OpenAI has unveiled its latest AI model, 'o1,' which promises to redefine the landscape of artificial intelligence with its advanced reasoning capabilities.

Two distinct versions were released: o1-preview and o1-mini. The former is designed for high-level reasoning tasks in mathematics, programming, and scientific inquiries, boasting performance close to that of PhD-level experts. The latter is a more compact model optimized for code generation.

The o1 model is the much-anticipated project previously code-named 'Strawberry.' Some industry insiders suggest that 'o1' stands for 'Orion.'

OpenAI has emphasized that this new model represents a fresh start in AI's ability to handle complex reasoning tasks, meriting a new naming convention distinct from the 'GPT-4' series. It also marks a new starting point for the AI era: the arrival of large models capable of general complex reasoning.

Despite its advanced capabilities, the current chat experience with o1 remains basic. Unlike its predecessor GPT-4o, o1 does not offer functions such as browsing the web or handling file analysis tasks. Although it has image analysis capabilities, this feature is temporarily disabled pending further testing. There are also message limits: o1-preview is capped at 30 messages per week, while o1-mini allows 50.

Starting Friday, both versions are available to ChatGPT Plus/Team users and via API channels, with enterprise and educational users gaining priority access next week.


The training behind o1 is fundamentally different from its predecessors, said OpenAI’s research lead, Jerry Tworek. He said o1 “has been trained using a completely new optimization algorithm and a new training dataset specifically tailored for it.”

OpenAI taught previous GPT models to imitate patterns from their training data. With o1, it trained the model to solve problems on its own by applying reinforcement learning, a technique that teaches the system through rewards and penalties. The model then uses a “chain of thought” to process queries, working through problems step by step much as humans do.
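The idea described above can be sketched in a few lines of Python. Everything here is an illustrative toy, not OpenAI's actual training code: a stand-in "model" proposes a step-by-step solution, and a binary reward scores whether the final answer is correct.

```python
import random

def propose_chain_of_thought(a, b, rng):
    """Sample a (possibly wrong) step-by-step solution for a + b.

    Stand-in for a model that writes out intermediate reasoning;
    it occasionally makes an arithmetic slip on purpose.
    """
    step1 = f"We need to add {a} and {b}."
    answer = a + b if rng.random() > 0.3 else a + b + 1
    step2 = f"{a} + {b} = {answer}."
    return [step1, step2], answer

def reward(predicted, target):
    """Binary reward signal: +1 for a correct final answer, -1 otherwise."""
    return 1 if predicted == target else -1

# Score 100 sampled reasoning traces; in real RL training, this reward
# would be used to update the model so correct traces become more likely.
rng = random.Random(0)
total = 0
for _ in range(100):
    steps, pred = propose_chain_of_thought(2, 3, rng)
    total += reward(pred, 5)
print(total)  # net reward over the 100 sampled traces
```

The key contrast with imitation learning is that nothing here copies a reference solution: the model only sees whether its own answer earned a reward.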

OpenAI's new training methodology has led to a model that, according to the company, is more accurate. "We've noticed this model hallucinates less," says Tworek. However, the issue hasn’t been fully resolved. "We can’t claim to have eliminated hallucinations."

What distinguishes this new model from GPT-4o is its enhanced ability to solve complex problems, particularly in coding and math, while also providing explanations for its reasoning, OpenAI explains.

“The model is definitely better at solving the AP math test than I am, and I was a math minor in college,” says Bob McGrew, OpenAI’s chief research officer. OpenAI tested o1 on a qualifying exam for the International Mathematics Olympiad, where it solved 83% of the problems, compared to GPT-4o’s 13%.

In Codeforces programming contests, the model ranked in the 89th percentile of participants. OpenAI also claims the next update will perform similarly to PhD students on challenging benchmark tasks in physics, chemistry, and biology.

Despite these advancements, o1 lags behind GPT-4o in certain areas, such as factual knowledge about the world. It also lacks web-browsing capabilities and the ability to process files and images. Still, OpenAI views o1 as representing a new class of AI capabilities, naming it to symbolize "resetting the counter back to 1."

It is clear that while the new OpenAI o1 model does not yet possess a fully comprehensive problem-solving ability, its significantly improved reasoning makes it far more useful in specialized fields like science, programming, and mathematics. It also raises both the floor and the ceiling of AI-agent technology, substantially enhancing capabilities in scientific research and production. Its significance for the consumer sector, however, remains relatively limited.

Jim Fan, the Chief Scientist of Nvidia, noted that the new o1 model requires more computational power and data, and it can generate a data flywheel effect—correct answers and their thought processes can become valuable training data. This, in turn, continuously improves the reasoning core, much like how AlphaGo’s value network improved as more refined data was generated through MCTS (Monte Carlo Tree Search).
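The flywheel Jim Fan describes can be sketched as a filtering loop: reasoning traces whose final answers pass a ground-truth check are kept and fed back in as training data. The trace format and verifier below are illustrative assumptions.

```python
def solve_with_trace(problem):
    """Stand-in for the model: returns (reasoning_trace, answer)."""
    a, b = problem
    return ([f"add {a} and {b}"], a + b)

def verify(problem, answer):
    """Ground-truth check; in practice this could be a unit test,
    a proof checker, or a known answer key."""
    a, b = problem
    return answer == a + b

problems = [(1, 2), (3, 4), (10, 5)]
training_data = []
for p in problems:
    trace, ans = solve_with_trace(p)
    if verify(p, ans):  # keep only verified-correct reasoning traces
        training_data.append((p, trace, ans))

print(len(training_data))  # → 3
```

Each pass through this loop yields higher-quality traces to train on, which in turn produces a better solver for the next pass, analogous to how AlphaGo's value network improved on data refined by MCTS.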

OpenAI's o1 series significantly enhances reasoning capabilities and introduces a new scaling paradigm: unlocking test-time compute through reinforcement learning, according to Tianfeng Securities.
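One generic way to spend more compute at test time is to sample several independent attempts and take a majority vote over the final answers, a self-consistency-style scheme. This is an illustration of the scaling idea, not OpenAI's actual method; the noisy solver below is a hypothetical stand-in.

```python
import random
from collections import Counter

def noisy_solver(rng):
    """Stand-in model: returns the right answer (42) most of the time,
    otherwise a uniformly random wrong guess."""
    return 42 if rng.random() < 0.7 else rng.randint(0, 100)

def answer_with_votes(n_samples, seed=0):
    """More samples = more test-time compute; majority vote aggregates them."""
    rng = random.Random(seed)
    votes = Counter(noisy_solver(rng) for _ in range(n_samples))
    return votes.most_common(1)[0][0]

print(answer_with_votes(1))   # a single sample may be wrong
print(answer_with_votes(25))  # with more samples, the vote converges on 42
```

The point of the paradigm is that accuracy scales with inference-time compute, not only with model size or training data.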

However, the model has its critics. Some users have noted delays in response times due to the multi-step processing involved in generating answers. Others have pointed out that while o1 excels in certain benchmarks, it does not yet surpass GPT-4o in all metrics. OpenAI's product manager, Joanne Jang, has cautioned against unrealistic expectations, emphasizing that o1 is a significant step forward but not a miracle solution.

The AI community remains divided over the terminology used to describe o1's capabilities. Terms like 'reasoning' and 'thinking' have sparked debate, with some experts arguing that these anthropomorphic descriptions can be misleading. Nonetheless, the o1 model's ability to perform tasks that require planning and multi-step problem-solving marks a notable advancement in AI technology.

Founded in 2015, OpenAI has been at the forefront of the tech industry's rapid shift towards AI. Its chatbot product, ChatGPT, first launched in 2022, sparked a global investment frenzy in AI.

OpenAI is in discussions to raise funds at a valuation of $150 billion, Bloomberg reported. The company is aiming to secure approximately $6.5 billion from investors including Apple, Nvidia and Microsoft, and is also exploring $5 billion in debt financing from banks.

OpenAI's CFO Sarah Friar recently mentioned in an internal memo that the upcoming round of financing will support the company's needs for increased computational capacity and other operational expenses. She emphasized that the company's goal is to allow employees to sell a portion of their shares in a buyback offer later this year.

(Sources: CNN, TechCrunch, The Verge.)
