Exclusive: AI Unicorn MiniMax to Launch First End-to-End Real-Time Voice Conversation API Product in November

By 2026, the market size of conversational AI is expected to reach 10.8 billion yuan.

(Image source: Photo by TMTPost App editor Lin Zhijia)

TMTPOST -- AI large model unicorn MiniMax will release a real-time API service in November, comparable to GPT-4o, which OpenAI released in May. The service is intended to strengthen end-to-end real-time multimodal processing and to deliver lower-latency, more natural, and more immersive real-time voice conversations for scenarios such as enterprise collaboration, social networking, live streaming, and gaming.

This is MiniMax's first end-to-end real-time voice conversation product. Insiders told TMTPost App that the company is refining the product internally and hopes its performance will compete directly with OpenAI's GPT-4o when it launches in November.

GPT-4o, launched by OpenAI, is available for free and can reason across audio, vision, and text in real time. It can respond to audio input in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in conversation. For API usage, compared with GPT-4 Turbo, released last November, GPT-4o is half the price (50% cheaper) and twice as fast.

OpenAI CEO Sam Altman revealed in a tweet that the new GPT-4o is the best model OpenAI has ever created. It is intelligent, fast, natively multimodal, and available to all ChatGPT users, whether on the free version or the paid GPT-4 version.

In October this year, Agora, a real-time voice technology company, appeared as a voice API partner in the public beta of OpenAI's Realtime API. MiniMax also saw an opportunity and began collaborating with Agora. Zhao Bin, the founder and CEO of Agora, said at RTE 2024, the 10th Real-Time Internet Conference, that Agora and MiniMax are refining China's first Realtime API. Products built on this API can hold easy, smooth real-time voice conversations with people.

In addition to MiniMax, other Chinese companies such as iFlytek, Zhipu AI, and SenseTime are also developing generative AI dialogue products, all of which are comparable in performance to GPT-4o. OpenAI has also recently opened up the GPT-4o dialogue feature in ChatGPT.

According to statistics from iResearch, the conversational AI market was worth 4.5 billion yuan in 2021 and drove a broader related market of 12.6 billion yuan. By 2026, the conversational AI market is expected to reach 10.8 billion yuan and to drive a related market of more than 38.5 billion yuan, with a five-year compound annual growth rate (CAGR) of 32.5%.
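
For readers who want to relate the endpoint figures to a growth rate, CAGR is the constant annual rate that carries a starting value to an ending value over a given number of years. The following is a minimal Python sketch, assuming a five-year span from 2021 to 2026 and using only the yuan figures quoted above; the function name `cagr` and the variable names are illustrative. The article does not say which series or base years underlie the quoted 32.5% figure, so the results here need not match it.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate as a fraction (0.10 means 10% a year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the article, in billions of yuan.
# Assumption: both series run from 2021 to 2026, i.e. five years.
core_market = cagr(4.5, 10.8, 5)      # conversational AI market itself
driven_market = cagr(12.6, 38.5, 5)   # broader market it drives (38.5 is a lower bound)

print(f"Core market CAGR:   {core_market:.1%}")
print(f"Driven market CAGR: {driven_market:.1%}")
```

Because the 2026 driven-market figure is given as "more than 38.5 billion yuan", the second result should be read as a lower bound under these assumptions.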

(Author|Lin Zhijia, Editor|Hu Runfeng)