A few days ago, Google released its new Gemma model, DataGemma. While the world is experimenting with RAG (retrieval augmented generation) to reduce hallucinations and increase accuracy, Google decided to use RIG (retrieval interleaved generation), a technique that integrates LLMs with Data Commons, an open knowledge graph of public statistical data.
Explaining what exactly RIG is, Andrey Gavrilenko, head of data science at PandaDoc, noted that Google has released new models to reduce hallucination and improve accuracy. “It integrates both RIG and RAG, but adds a new layer: pulling data from trusted sources like Data Commons,” he added, suggesting that these models use the database to pull accurate information.
Typically, when a user asks a question, the AI model begins answering based on its existing knowledge.
However, with RIG, if the AI model encounters a need for more current or specific data, it pauses to search for this information from reliable external sources like databases or websites. The model then seamlessly incorporates this newly acquired data into its response, alternating between generating content and retrieving information as needed.
This approach enables the AI to provide more up-to-date and accurate answers, especially for topics involving rapidly changing information such as current events or recent statistics.
Unlike RAG, which performs retrieval once before generating the answer, RIG can adapt in real time while generating responses. This allows the model to refine its output iteratively as it retrieves new information.
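To make this interleaving concrete, here is a minimal Python sketch of the loop. The helper names (generate_until_query, lookup_statistic) are illustrative stand-ins, not DataGemma's actual interface; in the real model, structured Data Commons queries are emitted inline during decoding.

```python
# Illustrative sketch of a RIG loop; all names here are hypothetical,
# not DataGemma's real API.

def generate_until_query(model, prompt):
    """Stub: run the LLM until it finishes or emits a retrieval query.
    Returns (text_so_far, query); query is None when the answer is done."""
    return model(prompt)

def lookup_statistic(query):
    """Stub: resolve a query against a trusted source such as Data Commons."""
    facts = {"US unemployment rate, 2023": "3.6%"}
    return facts.get(query, "[no data found]")

def answer_with_rig(question, model, max_steps=5):
    """Alternate between generating text and retrieving facts."""
    response = ""
    for _ in range(max_steps):
        text, query = generate_until_query(model, question + "\n" + response)
        response += text
        if query is None:                    # model needs no more data
            break
        response += lookup_statistic(query)  # splice the value into the draft
    return response
```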
RAG vs RIG
RIG allows for multiple retrievals at various stages of response generation, addressing all aspects of a complex query. This dynamic approach enables the AI to recognise knowledge gaps as it generates a response and to fetch data from external sources multiple times during the process.
By continuously retrieving and integrating information, RIG reduces the likelihood of inaccuracies in the final output. This is particularly beneficial for queries requiring up-to-date or specialised information.
However, compared to RAG, RIG can be slower, as it may pause at multiple points to retrieve the most recent information. RAG, on the other hand, can achieve higher factual accuracy when the LLM cites numbers directly from the retrieved tables.
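By contrast, a RAG pipeline in the same hypothetical style performs all of its retrieval once, before a single generation pass:

```python
# Illustrative RAG sketch: one retrieval pass, then one generation pass.
# `retriever` and `model` are hypothetical callables, as above.

def answer_with_rag(question, retriever, model, k=3):
    docs = retriever(question, k)    # single up-front retrieval
    context = "\n".join(docs)
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return model(prompt)             # no pauses once generation begins
```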
Bensen Hsu, the founder of OpenRead, said that the key point of RIG is to reduce hallucinations, which occur especially in responses to queries involving numerical and statistical data. He pointed out that this is a significant issue as LLMs become more widely used and their outputs need to be accurate and trustworthy.
“The evaluation of the RIG approach shows that it can improve the factual accuracy of LLM outputs, with the 27B fine-tuned model performing better than the 7B fine-tuned model. The RAG approach also demonstrates promising results, with the long-context LLM making accurate statistical claims when provided with relevant data from Data Commons,” he added.
Importance of Data Commons
The DataGemma-RIG-27B-IT model, in particular, uses Data Commons to dynamically fact-check and validate statistical information during response generation. This real-time retrieval and integration of data helps ensure high accuracy in the model’s outputs.
One of the most important concepts when talking about RIG is Data Commons, which is described as “a publicly available knowledge graph containing over 240 billion rich data points across hundreds of thousands of statistical variables”. This vast repository of data serves as the foundation for grounding AI responses in factual information.
The data in Data Commons comes from reputable organisations such as the United Nations (UN), the World Health Organization (WHO), the Centers for Disease Control and Prevention (CDC), and the Census Bureau. This ensures that the information used is reliable and trustworthy.
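For a sense of what these lookups involve, the open Data Commons Python client can fetch statistical variables by place. Below is a minimal sketch; it assumes the get_stat_value helper from the datacommons package, and the place and variable IDs are just examples, so check the current Data Commons documentation before relying on it.

```python
# Querying Data Commons directly; requires `pip install datacommons`.
# Assumes the client's get_stat_value helper; the API surface may have
# changed, so verify against the current docs.
import datacommons as dc

# Latest population count for California (Data Commons ID geoId/06).
population = dc.get_stat_value("geoId/06", "Count_Person")
print(f"Population of California: {population}")
```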
Researchers tested how well RIG works using a set of 101 queries covering many topics and styles, such as asking about specific facts, comparing places, or requesting lists of information. They looked at four main things: how correct the facts were, how well the AI used the information retrieved from Data Commons, how accurately it answered questions, and how much of the Data Commons information it used.
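The paper's exact scoring is more involved, but the spirit of the factual-accuracy check can be sketched as comparing the numbers a model cites against trusted values from Data Commons; the tolerance below is an arbitrary illustrative choice, not the paper's threshold.

```python
# Toy factuality scorer: what fraction of model-cited values fall within
# a tolerance of the trusted value? Purely illustrative, not the paper's
# evaluation harness.

def factual_accuracy(claims, ground_truth, tolerance=0.05):
    """claims / ground_truth: {statistic_name: numeric_value}."""
    correct = sum(
        1 for name, cited in claims.items()
        if name in ground_truth
        and abs(cited - ground_truth[name]) <= tolerance * abs(ground_truth[name])
    )
    return correct / len(claims) if claims else 0.0

# Two of the three cited statistics match the trusted source here.
print(factual_accuracy(
    {"us_population": 331e6, "ca_population": 39e6, "unemployment_pct": 9.0},
    {"us_population": 331.9e6, "ca_population": 39.2e6, "unemployment_pct": 3.6},
))  # ~0.67
```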
The results showed that RIG made a big difference: the researchers found that the RIG approach improved factuality from 5-17% to about 58%.
This jump shows that RIG can make language models far more reliable at providing accurate information. If researchers can address RIG's slower response time, the technique could tackle critical problems we generally face, including hallucination and poor response quality.