How to Avoid AI Hallucinations
Scaling with processes is increasingly about how you use AI
One Billion Users
I went to a Christmas Fair with a friend who said he’d started using ChatGPT. He recognises he is late to the party, but he was bowled over by its ability to boost his work researching and writing about property. He is not that late, though: OpenAI has announced the goal of quadrupling its users to over one billion in 2025.
ChatGPT is not the most popular AI tool: Meta’s AI assistant, built into its existing apps, has almost 500 million users. While we cannot compare a standalone large language model directly with software that has AI built in, this does suggest that the most common use case for AI may be augmenting SaaS rather than recreating workflows with Large Language Models (LLMs). The latter often requires advice and assistance, so in the meantime, what can we do to get better results from personal use of ChatGPT?
Natural Language Generation
LLMs predict the next word based on the words that came before it. Their output depends on the volume and nature of the data used to train the model. OpenAI’s models, such as those behind ChatGPT, tend to perform best because they are trained on the most data.
We do not know in detail how LLMs work, which means research into improving their results is published all the time. Some assume that natural language generation favours fluent writers, but prompting is its own form of communication, one that takes into account how LLMs work. Users improve with practice, but practice does not turn them into great novelists.
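As a toy illustration (the phrases and probabilities below are invented, not taken from any real model), next-word prediction boils down to choosing the most likely continuation of the words seen so far:

```python
# Toy sketch: a 'model' that maps a context of previous words to the
# probability of each possible next word. Real LLMs learn these
# probabilities from billions of examples rather than a hand-written table.
next_word_probs = {
    ("the", "property", "market"): {"is": 0.41, "has": 0.22, "crashed": 0.03},
    ("interest", "rates"): {"rose": 0.35, "fell": 0.28, "are": 0.18},
}

def predict_next(context):
    """Return the most probable next word for a known context, else None."""
    candidates = next_word_probs.get(tuple(context), {})
    return max(candidates, key=candidates.get) if candidates else None

print(predict_next(["the", "property", "market"]))  # -> "is"
```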
LLMs have been compared to an intelligent but naïve recruit. Training and instruction determine their progress as much as innate ability. Corporations will look increasingly to AI agents in place of new hires, and as the CEO of NVIDIA said on the Bg2 podcast:
"I'm hoping that Nvidia someday will be a 50,000 employee company with a 100 million, you know, AI assistants, in every single group," – Jensen Huang.
AI agents do not replace humans but allow us unprecedented scale. The ability to interact with them will therefore be an important skill in the near future.
Adding Context and Data
On occasion my wife talks to me about a topic we discussed the day before. If I am engrossed in my phone, I don’t grasp her meaning, and if I didn’t pay attention yesterday, I am lost. In the first case I lack context; in the second, data.
LLMs perform zero- and few-shot learning, which means they can handle concepts they have not explicitly seen in their training data, given no examples or only a few. Just as I try to piece together the previous conversation from what is being said, the model infers an answer by recognising similar situations. We’ve been married for 23 years and are good at filling in the gaps, while models gain a similar understanding from vast amounts of training data.
You can improve your results with LLMs by providing examples of the output you want, such as previous reports you want copied. By adding data you go from zero- to few-shot learning. You can also provide context, for instance by including limiting conditions in a query.
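A minimal sketch of what that looks like, assuming the OpenAI Python client, is below. The model name, the example report and the instructions are illustrative placeholders; the point is that the system message supplies context and the example report supplies data.

```python
# A minimal few-shot prompting sketch using the OpenAI Python client.
# Model name and report text are placeholders, not recommendations.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

example_report = (
    "Region: North West\n"
    "Occupancy: 94%, up two points on last quarter\n"
    "Key risk: two leases expire in Q3"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model; use whichever you have access to
    messages=[
        # Context: the role and the limiting conditions for the answer.
        {"role": "system", "content": (
            "You write one-page property summaries for investors. "
            "Use UK terminology and do not speculate beyond the data provided."
        )},
        # Data: an example of the output you want copied (few-shot learning).
        {"role": "user", "content": f"Here is a previous report in the format I want:\n{example_report}"},
        {"role": "user", "content": "Write a report in the same format for the South East figures below."},
    ],
)
print(response.choices[0].message.content)
```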
If you are not sure of an answer, challenge it. I asked Google’s Gemini for examples of adding context and it replied with an example of adding data. It corrected its response when challenged. The initial error may be caused by unclear input from users, or by the way that LLMs work.
Statistical Patterns and Mixing Up Words
Machine learning is pattern matching, which means models do not understand data and context the way humans do. Therefore, even with context, hallucinations still occur.
Pattern recognition can be powerful, as when machines spot irregularities in X-rays and diagnose disease before doctors do. But because it is based on the most likely outcome, it will on occasion be wrong.
There are techniques to overcome these inaccuracies. One is to ask the model to respond in steps, or to enter a chain of prompts that breaks a request down. To see this at work, prompt an LLM with:
2x = 36-9y and 6y = x+3. What are x and y?
If we break down reasoning requests in the same fashion as algebra puzzles, the model is more likely to produce a correct answer. Try adding “Let’s think step-by-step” to your queries.
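For reference, the step-by-step working that the prompt is nudging the model towards looks like this:

- From 6y = x + 3, we get x = 6y - 3.
- Substituting into 2x = 36 - 9y gives 2(6y - 3) = 36 - 9y, so 12y - 6 = 36 - 9y.
- Collecting terms gives 21y = 42, so y = 2 and x = 6(2) - 3 = 9.
- Check: 2(9) = 18 = 36 - 9(2), and 6(2) = 12 = 9 + 3.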
LLMs translate words into strings of numbers. When two words’ numbers are highly correlated, the words become interchangeable, such as car and automobile. Yet names such as Paul and Paula are also correlated and may be swapped for one another in error.
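A toy sketch of why this happens is below. The vectors are invented for illustration (real embeddings have hundreds or thousands of dimensions), but the principle holds: the model treats words whose numbers point in almost the same direction as near-equivalent.

```python
# Toy word vectors, invented for illustration. A cosine similarity close to 1
# means the model sees the two words as near-interchangeable.
import math

vectors = {
    "car":        [0.81, 0.10, 0.05],
    "automobile": [0.79, 0.12, 0.06],
    "Paul":       [0.10, 0.88, 0.30],
    "Paula":      [0.11, 0.86, 0.33],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(round(cosine(vectors["car"], vectors["automobile"]), 3))  # ~1.0: safe to swap
print(round(cosine(vectors["Paul"], vectors["Paula"]), 3))      # ~1.0: risky to swap
```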
When source material contains similar noun phrasing, the model can make mistakes. If data describes the population, area and age of both London and Paris, the model may confuse the two. The solution is fully formatted facts.
We are now moving beyond prompt engineering into Retrieval-Augmented Generation (RAG), in which a model is provided with additional information. That information can be rewritten to make it easier for models to understand: simplify sentences, make each one true on its own and separate potentially conflicting information. Fully formatted documents are a terrible read, because they repeat nouns and add detail about them in every sentence.
Converting RAG text into fully formatted form most likely requires outside assistance. For now, avoid using pronouns such as it, he and she in queries, and repeat the noun instead. LLMs interpret each sentence separately and need constant reminding of what they are working on.
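As a made-up illustration of the difference:

Before: “London is larger than Paris. It also has more residents, and its transport network carries them across a wider area.”

After: “London covers a larger area than Paris. London has more residents than Paris. London’s transport network covers a wider area than Paris’s transport network.”

The rewritten version is clumsy to read, but each sentence now stands on its own, so the model cannot attach “it” or “them” to the wrong city.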
The best way to apply all of this is to think through a problem and seek answers for each step along the way. Any guidance you provide about the expected output should improve the accuracy of each response.
Questions to Ask and Answer
- Have I asked one question for which I expect one answer?
- Have I included examples of the output I expect to receive?
- Have I explained my terms and defined any ambiguous words?
When you are ready, there are three ways I can help:
1. Resolving Team Conflicts: A free email course tackling an issue that no one ever teaches you as a manager. This is an excellent introduction to one of the foundational understandings of The Profit Through Process Planner.
2. The Management Mentality Map: 5 skills to prepare you for leadership and the techniques you need when marketing and selling to people.
3. The Profit Through Process Planner: My flagship course on how to design and invigorate a business that scales. I share 30 years of experience of researching, investing in and running companies, intermingled with the science and stories of business.