Baidu Says Ernie 3.5 Outdid ChatGPT and GPT-4 in Key Metrics

By Ibukun Ogundare

The results of the AGIEval and C-Eval tests show that Ernie 3.5 scored higher than other large models, including ChatGPT, and surpassed GPT-4.

Competition in the AI market is heating up as China’s Baidu said its AI model Ernie 3.5 beats OpenAI’s popular ChatGPT and GPT-4 on key tests. The Chinese internet company unveiled Ernie Bot at an event in March. At the time of the announcement, CEO Robin Li said that the new product was imperfect and would continue to improve as people used it and gave feedback. Within an hour of revealing Ernie Bot, Baidu said that about 30,000 corporate clients had joined the waitlist to access the chatbot.

Baidu has been publicly testing Ernie Bot since unveiling it in March. The chatbot is built on the Chinese search company’s foundation AI model, Ernie, which is trained on extensive data. By comparison, ChatGPT, which Baidu said Ernie 3.5 outperformed, is based on OpenAI’s GPT-3.5 model. Baidu added that its AI model also beats OpenAI’s latest and more advanced model, GPT-4, and noted that Ernie 3.5 performed better than OpenAI’s products in Chinese language tests.

Baidu Claims Ernie 3.5 Is Better than ChatGPT in Multiple Key Areas

The Chinese company made the claim while citing a report by China Science Daily. According to the report, a “few-shot evaluation” shows that Ernie 3.5 outperformed ChatGPT on multiple test sets. The three evaluation benchmarks are AGIEval, C-Eval, and MMLU. Microsoft Research developed the AGIEval benchmark to examine a model’s performance on “human-oriented” standardized tests. It focuses on 20 official, public, and distinct qualifying exams, such as the SAT in the US and college entrance examinations in China. Others include bar exams, the American GMAT, the GRE, and so on. In addition, the University of California, Berkeley, Columbia University, the University of Illinois at Urbana-Champaign, and the University of Chicago jointly released MMLU. This large-scale multi-task language understanding test measures a model’s interdisciplinary professional ability in English. It covers different educational areas such as the social sciences, the humanities, science, technology, engineering and mathematics (STEM), and more.

Furthermore, C-Eval is a Chinese-language evaluation for foundation models, containing 13,948 multiple-choice questions covering 52 subjects. The benchmark was created and released jointly by Tsinghua University, the University of Edinburgh, and Shanghai Jiao Tong University.
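To make the idea of a few-shot, multiple-choice evaluation concrete, here is a minimal sketch in Python. It assumes a simple prompt layout (a few solved questions followed by one unsolved question) and plain accuracy as the score. The names `Item`, `build_few_shot_prompt`, and `accuracy` are hypothetical, and the report does not say what prompt format or scoring script was actually used.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """One multiple-choice question (hypothetical structure, for illustration only)."""
    question: str
    choices: dict   # maps a label such as "A" to the option text
    answer: str     # gold label, e.g. "B"


def format_item(item, with_answer):
    """Render a question, its options, and optionally the gold answer."""
    lines = [item.question]
    lines += [f"{label}. {text}" for label, text in item.choices.items()]
    lines.append(f"Answer: {item.answer}" if with_answer else "Answer:")
    return "\n".join(lines)


def build_few_shot_prompt(examples, target):
    """Put the solved examples first, then the target question with a blank answer."""
    blocks = [format_item(ex, with_answer=True) for ex in examples]
    blocks.append(format_item(target, with_answer=False))
    return "\n\n".join(blocks)


def accuracy(predictions, items):
    """Percentage of questions where the predicted label matches the gold label."""
    correct = sum(pred == item.answer for pred, item in zip(predictions, items))
    return 100.0 * correct / len(items)
```

In a setup like this, the model sees the prompt, its reply is reduced to a single option label, and the benchmark score is the accuracy over all questions.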

The results of the AGIEval and C-Eval tests show that Ernie 3.5 achieved higher scores than other large models, including ChatGPT, and surpassed GPT-4. On AGIEval, Ernie 3.5 took first place with 64.37 points, ahead of GPT-4’s 56.96 points and ChatGPT’s 40.27 points. On the Chinese C-Eval evaluation, Ernie 3.5 again scored the highest at 71.93 points, while GPT-4 got 68.57 points and ChatGPT 51.70 points. In addition, Baidu mentioned more results showing that Ernie 3.5 has “outstanding Chinese ability” and outperformed ChatGPT and GPT-4.
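For readers who want the gaps rather than the raw scores, the short Python sketch below computes Ernie 3.5’s lead over each OpenAI model from the figures reported above. It assumes the first set of scores refers to AGIEval, which is how the passage reads; the `scores` table and the `margins` helper are illustrative names, not part of any published tooling.

```python
# Scores as reported in the article (points). Assigning the first set to
# AGIEval is an assumption based on the order in which they are presented.
scores = {
    "AGIEval": {"Ernie 3.5": 64.37, "GPT-4": 56.96, "ChatGPT": 40.27},
    "C-Eval": {"Ernie 3.5": 71.93, "GPT-4": 68.57, "ChatGPT": 51.70},
}


def margins(benchmark_scores, model="Ernie 3.5"):
    """Return how many points `model` leads each other model by."""
    base = benchmark_scores[model]
    return {
        name: round(base - score, 2)
        for name, score in benchmark_scores.items()
        if name != model
    }


for benchmark, table in scores.items():
    print(benchmark, margins(table))
```

On these numbers, the reported lead over GPT-4 is 7.41 points on AGIEval and 3.36 points on C-Eval, while the lead over ChatGPT exceeds 20 points on both.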
