March 3, 2025

With companies such as OpenAI, Meta, xAI and DeepSeek all rapidly scaling their own AI chatbots, the spotlight is falling not only on who has the most ‘conversational’ or ‘powerful’ AI chatbot, but also on which one gives the most truthful answers when asked a question.

Last week, OpenAI released GPT-4.5, an upgraded version of the AI model that powers ChatGPT, which the company described as its “largest and best model for chat yet.”

In a blog post, OpenAI said that GPT-4.5 is better at picking up on and responding to subtle cues from users’ written prompts and is primed for tasks such as chatting, writing and coding.

GPT-4.5 is reportedly less prone to so-called “hallucinations” – a phenomenon where AI models begin to fabricate false responses to user prompts.

The company claimed the upgraded model has a hallucination rate of 37.1%, down from 61.8% for the previous version.
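To put those two figures in perspective, the short sketch below works out the size of the drop both in percentage points and relative to the older rate. It uses only the two rates quoted by OpenAI; the function name `hallucination_drop` is a hypothetical helper introduced here for illustration, and the calculation says nothing about how the rates themselves were measured.

```python
def hallucination_drop(old_rate: float, new_rate: float) -> tuple[float, float]:
    """Return (absolute drop in percentage points, relative drop in percent)."""
    absolute = old_rate - new_rate          # difference between the two rates
    relative = absolute / old_rate * 100    # drop as a share of the old rate
    return absolute, relative


# Rates as quoted in the article: 61.8% before, 37.1% for GPT-4.5.
absolute, relative = hallucination_drop(old_rate=61.8, new_rate=37.1)
print(f"{absolute:.1f} percentage points lower, a relative drop of about {relative:.0f}%")
```

On these numbers, the rate falls by 24.7 percentage points, or roughly 40% relative to the earlier figure, which still leaves more than a third of the tested responses containing a fabrication.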

OpenAI relied on a process called post-training, during which it incorporated human feedback to improve responses, reported Bloomberg.

The company also developed new ways to train the model by using data it derived from the information used to train its current GPT-4.0 model, said Mia Glaese, a vice president of research at the company.

“Early testing shows that interacting with GPT-4.5 feels more natural,” OpenAI said in a blog post.

“Its broader knowledge base, improved ability to follow user intent, and greater ‘EQ’ make it useful for tasks like improving writing, programming, and solving practical problems.”

While fewer hallucinations are definitely a step in the right direction for AI chatbots, OpenAI’s chief executive Sam Altman has tempered overall expectations for GPT-4.5.

“It is a giant, expensive model,” said Altman. “A heads up: this isn’t a reasoning model and won’t crush benchmarks.”

Altman’s caveat comes a few weeks after Elon Musk’s artificial intelligence startup xAI debuted its updated Grok-3 chatbot model in February. Musk had earlier made an unsolicited offer to acquire OpenAI’s nonprofit arm for $97.4 billion, which was turned down.

The new Grok-3 has “more than 10 times” the computing power of its predecessor, claimed Musk.

The startup said that across math, science and coding benchmarks, Grok-3 beat OpenAI’s GPT-4o, Google Gemini, DeepSeek’s V3 model and Anthropic’s Claude.

xAI claims Grok 3 beats GPT-4o on benchmarks including AIME, which uses a sample of math questions, and GPQA, which uses PhD-level physics, biology, and chemistry problems.

Two models in the Grok 3 family, Grok 3 Reasoning and Grok 3 mini Reasoning, are reportedly even able to “think through” problems, similar to “reasoning” models like OpenAI’s o3-mini and Chinese AI company DeepSeek’s R1.

OpenAI’s latest model, GPT-4.5, is initially being offered as a “research preview” to a limited group of software developers and ChatGPT Pro subscribers. Once it has been tweaked based on their feedback, it will be rolled out to a broader audience.