Artificial intelligence (AI) is nothing without data. Hence, the underlying assumption has been that whoever wins the AI race will hold a pool of training data that is not just vast but better than everyone else’s.
Take OpenAI’s GPT-4 and Meta’s Llama 3.1 as examples; they reportedly run on 1.7 trillion and 405 billion parameters, respectively, and have gained popularity due to their gigantic training datasets.
Newly released large language models (LLMs) routinely shatter the parameter records of their predecessors, a trend that has become a central yardstick for measuring progress in LLM development on Hugging Face.
But is bigger always better? Diffbot Technologies, a California-based startup known for its knowledge graph technology, does not think so.
Key Takeaways
- Diffbot’s AI model challenges larger LLMs by emphasizing factual accuracy over sheer parameter size.
- It uses a proprietary knowledge graph with real-time updates instead of relying on pre-trained data.
- Diffbot’s graph retrieval-augmented generation (GraphRAG) allows dynamic knowledge retrieval, reducing reliance on static datasets.
- Benchmark tests suggest Diffbot outperforms leading LLMs like GPT-4 and Gemini in real-time factual accuracy.
- Experts believe Diffbot’s hybrid approach offers a step toward solving AI hallucinations but acknowledge that no model is entirely immune.
Diffbot’s Different Look at AI Models
On January 9, Diffbot launched its first open-source LLM and claims that, despite having nowhere near the parameter counts of GPT-4 or Llama 3.1, it beats them on factual accuracy.
Major LLMs like GPT-4, Gemini Ultra, and Llama 3 are built by tapping into colossal datasets, often running into trillions of tokens. These AI models are fed a diverse diet of web content, books, articles, and code, which allows them to pick up complex language patterns.
However, unlike the existing popular LLMs, Diffbot said its AI model uses its proprietary Knowledge Graph, which houses over 10 billion entities and a staggering trillion structured facts gathered from across the web.
According to the startup, its AI model is built on a fine-tuned version of Meta’s Llama 3.3 and brings in a novel approach called graph retrieval-augmented generation (GraphRAG).
This method lets the model query a knowledge database that updates dynamically during inference, instead of relying entirely on pre-trained data. Rather than reciting facts frozen into its weights at training time, Diffbot’s system grounds each response in an up-to-date knowledge graph.
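To make the pattern concrete, here is a minimal GraphRAG sketch in Python. All of the names (Fact, query_knowledge_graph, the llm callable) are hypothetical illustrations, not Diffbot’s actual API; the point is that retrieval happens at inference time and the prompt is assembled from source-linked facts.

```python
from dataclasses import dataclass

@dataclass
class Fact:
    subject: str
    predicate: str
    obj: str
    source_url: str  # provenance is kept so answers can cite their sources

def query_knowledge_graph(question: str) -> list[Fact]:
    """Stand-in for a live graph lookup. A real system would issue a
    structured query (entity + relation) against the graph store."""
    return [
        Fact("English Channel", "isA", "body of water",
             "https://example.com/english-channel"),
    ]

def answer(question: str, llm) -> str:
    # 1. Retrieve facts at inference time -- not from the model's weights.
    facts = query_knowledge_graph(question)
    context = "\n".join(
        f"{f.subject} {f.predicate} {f.obj} ({f.source_url})" for f in facts
    )
    # 2. Ask the reasoning layer to answer strictly from the retrieved facts.
    prompt = (
        "Answer using ONLY the facts below and cite their sources.\n"
        f"Facts:\n{context}\n\nQuestion: {question}"
    )
    return llm(prompt)
```

Because the graph, not the model, is the source of truth here, refreshing the graph refreshes the answers without retraining anything.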
89,886 developers are building their own Perplexity on-prem with Diffbot LLM — https://t.co/wVsp0iZGvt
— Diffbot (@diffbot) January 30, 2025
Our Quick Test of Diffbot’s AI Model on Slippery Questions
To get a clear picture of how Diffbot’s AI model handles scenarios prone to hallucinations, we tested the publicly available demo on Diffy.chat and compared its responses to those from ChatGPT’s free tier and Gemini 1.5 Flash.
We used prompts already known to set hallucination traps for AI chatbots; below is how each model handled them.
First, we asked the chatbots:
What is the world record for crossing the English Channel on foot?
Diffbot AI managed to avoid the hallucination trap, giving a correct, if somewhat wordy, answer.
When we prompted the Gemini 1.5 Flash version with the same question, it fell for the hallucination trap and even supported its answer with an image.
The same prompt was fed to the ChatGPT free plan, which produced a more comprehensive and accurate response.
In addition to the prompt above, we set three other hallucination traps, using prompts such as “Who was the sole survivor of the Titanic?” and “Write a description of a landscape in four-word sentences.”
While Diffy.chat and ChatGPT sailed through them with mostly accurate responses, Gemini 1.5 Flash botched them. On a second attempt, however, Gemini provided a more accurate response.
This small experiment is hardly a sophisticated attempt to trick AI into hallucinating, but it shows, at a basic level, how Diffbot’s model, lightweight compared to other leading LLMs, might perform when the going gets tough.
Can Diffbot’s AI Model Solve the AI Hallucination Puzzle?
Recent research suggests that leading LLMs still struggle with factual accuracy.
Benchmarks like C-Eval and AGIEval show that while top models achieve over 80% accuracy in basic knowledge tasks, their performance drops to 50-60% in professional-level reasoning.
Similarly, multi-dimensional assessments through platforms like OpenCompass further demonstrate strong capabilities in language understanding and knowledge retrieval (over 80% accuracy), though accuracy falls below 65% in tasks requiring advanced reasoning or specialized expertise.
According to Diffbot, benchmark tests show that its model hits 81% accuracy on FreshQA, a Google-designed benchmark for real-time factual knowledge, outpacing ChatGPT and Gemini. It also clocked in at 70.36% on MMLU-Pro, a harder version of a standard academic knowledge test.
These benchmark scores suggest Diffbot is making real headway on factual accuracy, one of AI’s toughest challenges to date.
Rogers Jeffrey Leo John, co-founder and CTO of DataChat, a no-code, generative AI platform for analytics, said that Diffbot’s dynamic approach addresses the problem of static training data in LLMs.
Leo John told Techopedia:
“Diffbot acts like an LLM that can search and synthesize information from trusted sources, such as libraries or Wikipedia, in real time. Unlike GPT or Gemini, which rely on massive parameters and external search engines, Diffbot uses fewer parameters while leveraging a vetted knowledge engine to deliver efficient, high-quality answers.”
In a chat with Techopedia, Dev Nag, founder and CEO of QueryPal, said that Diffbot’s hybrid approach represents a shift in how LLMs are designed and used. He said:
“By separating the knowledge layer, powered by their trillion-fact Knowledge Graph, from the reasoning layer in their fine-tuned Llama model, Diffbot can keep its knowledge up to date without the need for costly and time-consuming retraining.”
Nag emphasized that this GraphRAG design allows LLMs to focus on querying external sources rather than memorizing facts.
He acknowledged that this doesn’t entirely eliminate hallucinations, since models can still misinterpret or incorrectly combine retrieved facts, but argued that it provides a critical advantage.
“The generation process is grounded in verifiable data points, each linked to source nodes in the graph, often with URLs, making the outputs more transparent and auditable for users,” Nag added.
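As a toy illustration of the separation Nag describes (hypothetical classes, not Diffbot’s code), the sketch below keeps the knowledge layer in a mutable store that can be updated at any time, while the reasoning layer only formats answers from whatever the store currently holds, attaching the source URL for auditability.

```python
class KnowledgeGraph:
    """Knowledge layer: a mutable fact store, updated without retraining."""

    def __init__(self):
        self._facts: dict[str, tuple[str, str]] = {}  # key -> (fact, source_url)

    def upsert(self, key: str, fact: str, url: str) -> None:
        self._facts[key] = (fact, url)

    def lookup(self, key: str) -> tuple[str, str] | None:
        return self._facts.get(key)

def grounded_answer(kg: KnowledgeGraph, key: str) -> str:
    """Reasoning layer: answers only from the graph, citing the source node."""
    hit = kg.lookup(key)
    if hit is None:
        return "I don't know."  # refuse rather than guess
    fact, url = hit
    return f"{fact} (source: {url})"

kg = KnowledgeGraph()
kg.upsert("ceo_of_acme", "Jane Doe is CEO of Acme.", "https://example.com/acme")
print(grounded_answer(kg, "ceo_of_acme"))   # reflects today's graph
kg.upsert("ceo_of_acme", "John Roe is CEO of Acme.", "https://example.com/acme2")
print(grounded_answer(kg, "ceo_of_acme"))   # updated answer, zero retraining
```

The second call returns the new fact immediately, which is the practical payoff Nag points to: knowledge stays current without touching the model’s weights.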
The Bottom Line
Hallucination remains one of the banes of AI models despite efforts to solve it. Diffbot’s method of pairing a fine-tuned version of Meta’s Llama 3.3 with real-time querying of its Knowledge Graph could bring the field a step closer to a solution.
While Dom Couldwell, Head of Field Engineering EMEA at DataStax, agrees that Diffbot’s model will help deliver more accurate responses, he maintains that the best way to tackle hallucinations in LLMs is for organizations to ground models in their own data, in the context of their application.
Couldwell told Techopedia via email:
“Using your own data as part of your system provides the AI with more relevant data to pull from and create that relevant response. The process is called Retrieval Augmented Generation, or RAG. If you want to reduce hallucinations and improve your relevancy, spend time on the context that your application works in and what data it uses.”
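A bare-bones sketch of Couldwell’s advice might look like the following. The keyword-overlap retriever and the example documents are stand-ins for illustration; a production RAG system would use embedding-based vector search over an organization’s real documents.

```python
def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank internal documents by naive term overlap with the query
    (a toy retriever standing in for real vector search)."""
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: -len(q & set(d.lower().split())))
    return scored[:k]

def rag_prompt(query: str, docs: list[str]) -> str:
    """Build a prompt that confines the model to your own data."""
    context = "\n---\n".join(retrieve(query, docs))
    return (
        "Using only the context from our internal docs below, answer.\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

company_docs = [
    "Refunds are processed within 14 days of a return request.",
    "Support hours are 9am-5pm CET, Monday through Friday.",
]
print(rag_prompt("How long do refunds take?", company_docs))
```

The model then answers from context that is specific to the application, which is exactly the relevance gain Couldwell describes.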
FAQs
How does Diffbot’s AI model differ from traditional LLMs?
Instead of relying solely on facts memorized during pre-training, Diffbot pairs a fine-tuned version of Meta’s Llama 3.3 with its proprietary Knowledge Graph and retrieves up-to-date facts at inference time.
What is graph retrieval-augmented generation (GraphRAG)?
GraphRAG lets a model query a dynamically updated knowledge graph during inference and ground its answers in the retrieved, source-linked facts rather than in static training data.
How does Diffbot’s AI model perform against ChatGPT and Gemini?
Diffbot reports 81% accuracy on FreshQA, Google’s benchmark for real-time factual knowledge, outpacing ChatGPT and Gemini, along with 70.36% on MMLU-Pro. In our informal tests, it also avoided common hallucination traps.
Can Diffbot’s AI model eliminate hallucinations in AI?
No. Grounding generation in verifiable, source-linked data reduces hallucinations and makes outputs auditable, but experts note that models can still misinterpret or incorrectly combine retrieved facts.
References
- Diffbot Launches World’s Most Factually Grounded Language Model: New Benchmark in AI-Powered Knowledge Retrieval (Diffbot)
- Diffy Chat (Diffy)
- Evaluating Large Language Models on Spatial Tasks: A Multi-Task Benchmarking Study (arXiv)
- freshllms/freshqa: Data and Code for FreshLLMs (https://arxiv.org/abs/2310.03214) (GitHub)
- MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark (arXiv)