
Baichuan Intelligence Claims Its Product Surpasses Meta’s Llama

It is expected that Baichuan Intelligence will release a trillion-parameter model in the fourth quarter of this year and launch a "super application" around the first quarter of next year.

Wang Xiaochuan, founder of Sogou and founder and CEO of Baichuan Intelligence (source: interviewee)

Since the launch of the Baichuan large model at the end of August, Wang Xiaochuan's team has accelerated the development of new products.

TMTPost App has learned that Baichuan Intelligence, an artificial intelligence (AI) large model company, on Wednesday released in Beijing its two latest open-source large models, Baichuan2-7B and Baichuan2-13B, with 7 billion and 13 billion parameters respectively. Both models show comprehensive improvements in humanities and STEM tasks, support dozens of languages including Chinese and English, and can be applied in academic research, the internet, finance, and other fields.

Compared with the first generation, Baichuan2 improves mathematical ability by 49%, coding ability by 46%, safety by 37%, logical reasoning by 25%, and semantic understanding by 15%, all best-in-class among open-source models.

Wang Xiaochuan, the founder and CEO of Baichuan Intelligence, said that the 7-billion-parameter Baichuan2-7B outperforms Meta's 13-billion-parameter open-source model Llama 2-13B on mainstream Chinese and English tasks. With Baichuan2's open-source release in China, he argued, the era of relying on Llama 2 as the shared open-source base model has passed.

"Now we can have a more friendly and powerful open-source model than Llama2, which can help us support the development of China's entire large model ecosystem. In addition to open-source models, we may have a new breakthrough in closed-source models next time, hoping to contribute to China's social and economic development in the field of large models." Wang Xiaochuan said.

Zhang Bo, a professor in the Department of Computer Science at Tsinghua University and an academician of the Chinese Academy of Sciences, noted that although China has released many large models, with parameter counts ranging from several billion to several hundred billion, and many corresponding enterprises, these models are mostly applied in industry and comparatively rarely in academic research, while the hallucination problem of large models remains particularly serious. Applying Baichuan's open-source large models to academic research is therefore especially important and urgent, as it helps deepen the explanation and understanding of large model technology.

"We must delve into and clarify these (interpretable, illusion) issues in order to better develop China's large-scale products," said Zhang Bo.

Baichuan Intelligence was founded on April 10 of this year by Wang Xiaochuan, the founder of Sogou, and Ru Liyun, Sogou's former COO. Its aim is to create a Chinese counterpart to OpenAI: to build the best large model foundation in China and apply it in fields such as education and healthcare. To date, Baichuan Intelligence has announced $50 million in first-round financing.

In the 149 days since its founding, Baichuan Intelligence has released a large model every 28 days on average: the open-source Baichuan-7B and Baichuan-13B, with 7 billion and 13 billion parameters respectively, and the closed-source general-purpose Baichuan-53B with 53 billion parameters, announced in August of this year, whose writing and text-creation capabilities rank at a relatively high level in the industry.

Wang Xiaochuan previously told TMTPost App that, among open-source large models, Baichuan's can already serve as replacements in Chinese-language scenarios and have surpassed closed-source GPT models in certain applications. He emphasized that 80% of scenarios may use open-source models in the future. Baichuan Intelligence has now completed a parallel "open-source + closed-source" model layout and aims to build the best GPT-benchmarked model in China.

As of now, total downloads of Baichuan's open-source large models across open-source communities have exceeded 5 million: downloads on Hugging Face reached one million in the first week and 3.37 million over the past month, and on GitHub the Baichuan series is the fastest-growing Chinese large model by stars.
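For readers who want to try the open-source models, a minimal sketch of loading the Baichuan2-7B chat variant via Hugging Face transformers is shown below. The repository ID baichuan-inc/Baichuan2-7B-Chat and the chat() helper reflect the Hugging Face release rather than details from this article, and are assumptions here:

```python
# Minimal sketch: loading Baichuan2-7B-Chat from Hugging Face.
# The repo id and chat() helper are assumptions based on the public release;
# Baichuan2 ships custom modeling code, so trust_remote_code=True is required.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "baichuan-inc/Baichuan2-7B-Chat"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick fp16/bf16 automatically on GPU
    device_map="auto",    # place weights across available devices
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "用一句话介绍大语言模型。"}]
print(model.chat(tokenizer, messages))  # chat() comes from the repo's remote code
```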

On the enterprise side, more than 200 companies have so far applied for Baichuan's open-source commercial licenses and have put the Baichuan models into real production scenarios. These companies span the internet, software and information technology, finance, law, education, manufacturing, and enterprise services; customers include Alibaba Cloud, Tencent, Volcano Engine, JD Technology, SF Technology, Inspur, Agricultural Bank of China, and NIO.

On August 31 of this year, the first batch of more than 10 large model products, including Baichuan Intelligence's "Baichuan Large Model", completed regulatory filing, making them the first AI large model products in China approved to offer ChatGPT-like services to the public.

In this release, Baichuan Intelligence announced the latest open-source large model series, Baichuan2, which improves significantly in both humanities and STEM tasks. It was trained on a massive corpus of 2.6 trillion tokens of large-scale, comprehensive, high-quality data; the data pipeline scores text at the chapter, paragraph, and sentence levels and supports fine-grained sampling, and the training process is efficient, stable, and predictable. On safety, the models are value-aligned through multi-stage, multi-objective reinforcement learning. The Baichuan2 series also offers greater transparency and openness by disclosing intermediate training checkpoints from 300 billion to 2.6 trillion tokens, which will contribute to large model research.
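The article does not detail Baichuan's fine-grained sampling, but the general idea of scoring text at the paragraph level and drawing training data in proportion to those scores can be illustrated with a short, purely hypothetical sketch (the documents and scores below are invented):

```python
import random

# Purely illustrative: quality-weighted sampling over scored paragraphs.
# Scores and texts are invented; Baichuan's actual pipeline is not public here.
scored_paragraphs = [
    {"text": "A well-edited encyclopedia paragraph.", "quality": 0.95},
    {"text": "Boilerplate navigation text from a web page.", "quality": 0.10},
    {"text": "A readable forum answer with minor typos.", "quality": 0.60},
]

weights = [p["quality"] for p in scored_paragraphs]
batch = random.choices(scored_paragraphs, weights=weights, k=2)  # with replacement
for p in batch:
    print(p["quality"], p["text"])
```

Weighting draws by a quality score biases the training mix toward cleaner text without discarding lower-quality data outright, which is one common motivation for fine-grained scoring.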

In addition, Wang Xiaochuan announced that the China Computer Federation (CCF) and Baichuan Intelligence have jointly established a large model research fund to promote research on large model techniques across different stages and dimensions and to support work in the medical and open-world-agent fields. Baichuan Intelligence will also partner with Amazon Web Services on an AI hackathon supporting large model research in medical health and gaming entertainment, with a top prize of over 200,000 yuan.

In terms of partners, Baichuan Intelligence has worked with Alibaba Cloud, Qualcomm, Inspur Digital Power, Hanbo Semiconductor, Volcano Engine, Cambricon Technologies, Huawei, and other companies to deploy the Baichuan large model.

Wang Xiaochuan previously revealed to TMTPost App that Baichuan Intelligence will release a trillion-parameter model in the fourth quarter of this year, and is expected to launch a "super application" around the first quarter of next year.

(This article was first published on TMTPost App. Reporting by Lin Zhijia.)
