Baidu Says Ernie 3.5 Outdid ChatGPT and GPT-4 in Key Metrics

Jun 27, 2023 at 11:40 am UTC · 3 min read

Results from the AGIEval and C-Eval tests show that Ernie 3.5 scored higher than other large models, including ChatGPT and GPT-4.

Competition in the AI market is heating up as China’s Baidu said its AI model Ernie 3.5 beats OpenAI’s popular ChatGPT and GPT-4 on key tests. The Chinese internet company unveiled Ernie Bot at an event in March. At the time of the announcement, CEO Robin Li said that the new product was imperfect and would continue to improve as people used it and gave feedback. Within an hour of revealing Ernie Bot, Baidu stated that about 30,000 corporate clients had joined the waitlist to access the chatbot.

Baidu has been publicly testing Ernie Bot since it was unveiled in March. The chatbot is built on Ernie, the Chinese search company’s foundational AI model, which is trained on extensive data. ChatGPT, which Baidu said Ernie 3.5 outperformed, is based on OpenAI’s GPT-3.5 model. Baidu added that its model also beats GPT-4, OpenAI’s latest and more advanced model, and noted that Ernie 3.5 performed better than OpenAI’s products in Chinese-language tests.

Baidu Claims Ernie 3.5 Is Better than ChatGPT in Multiple Key Areas

The Chinese company made the claim while citing a report by China Science Daily. According to the report, a “few-shot evaluation” shows that Ernie 3.5 outperformed ChatGPT on multiple test sets. The three evaluation benchmarks are AGIEval, C-Eval, and MMLU. Microsoft Research developed the AGIEval benchmark to measure a model’s performance on “human-oriented” standardized tests. It focuses on 20 official, public, and distinct qualifying exams, such as the SAT in the US and college entrance examinations in China. Others include bar exams, the GMAT, the GRE, and so on. In addition, the University of California, Berkeley, Columbia University, the University of Illinois at Urbana-Champaign, and the University of Chicago jointly released MMLU. This large-scale multi-task language understanding test measures a model’s interdisciplinary professional ability in English. It covers different academic areas such as the social sciences, the humanities, science, technology, engineering and mathematics (STEM), and more.

Furthermore, C-Eval is a Chinese-language evaluation for foundation models containing 13,948 multiple-choice questions covering 52 subjects. The benchmark was created and released jointly by Tsinghua University, the University of Edinburgh, and Shanghai Jiao Tong University.

The results of the AGIEval and C-Eval tests show that Ernie 3.5 achieved higher scores than other large models, including ChatGPT and GPT-4. On AGIEval, Ernie 3.5 took first place with 64.37 points, ahead of GPT-4’s 56.96 points and ChatGPT’s 40.27 points. On the Chinese-language C-Eval evaluation, Ernie 3.5 again scored the highest at 71.93 points, while GPT-4 got 68.57 points and ChatGPT 51.70 points. In addition, Baidu mentioned more results that it said showed Ernie 3.5 has “outstanding Chinese ability” and outperformed ChatGPT and GPT-4.
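For readers who want the gaps rather than the raw figures, here is a minimal Python sketch that tabulates the scores reported above and computes Ernie 3.5's lead over each OpenAI model. The figures are the ones quoted in this article; attributing the first set of scores to AGIEval is a reading of the paragraph above, and the names `reported_scores` and `margins_over` are purely illustrative.

```python
# Scores as quoted in the article (in points). Assigning the first set
# to AGIEval is an assumption based on the order of the paragraph.
reported_scores = {
    "AGIEval": {"Ernie 3.5": 64.37, "GPT-4": 56.96, "ChatGPT": 40.27},
    "C-Eval": {"Ernie 3.5": 71.93, "GPT-4": 68.57, "ChatGPT": 51.70},
}


def margins_over(scores, reference="Ernie 3.5"):
    """Return the point lead of `reference` over every other model."""
    ref_score = scores[reference]
    return {
        model: round(ref_score - score, 2)
        for model, score in scores.items()
        if model != reference
    }


for benchmark, scores in reported_scores.items():
    print(benchmark, margins_over(scores))
```

On these figures, the reported lead over GPT-4 is far narrower than the lead over ChatGPT on both benchmarks, and narrowest on C-Eval.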
