The most significant drawback of large language models is their tendency to present vague, outdated, or outright false information. To keep answers current and relevant to your use case, generative models need to be pointed to the right knowledge sources.
This approach, called Retrieval-Augmented Generation (RAG), supplies generative models with your own knowledge sources. Retrieval components, including GoogleSearch, make this method easy to use.
What is the GoogleSearch component?
This component allows the flow to retrieve knowledge by searching Google for relevant content. It specifies the search query and dictates how the flow retrieves the information. It can be used concurrently with the other retrieval components to enhance the final answer.
Input Query
Specifies the query used to look up relevant information. It can be linked from another component, such as Chat Input, or entered manually.
Results Limit
This setting limits the number of links crawled for information. Google has extensive algorithms to rank results and ensure the top ones are the most relevant to the query. The top five results should be plenty for the flow to get relevant information and craft a meaningful response to most queries.
Language
If your website is in a specific language, a chatbot that returns results in English would do more harm than good. Niche and local topics may also give more relevant results when searched in the matching language.
Country
Setting the country is another safeguard for relevance. Imagine you have a financial advice chatbot, and a user asks about tax laws. Setting only the language could mean a person from the UK gets information about US tax laws, which causes confusion and frustration.
Location
This optional setting allows you to narrow down the results even more. Returning results relevant only to a specific city or region is beneficial for some use cases.
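To make the settings above more concrete, here is a minimal sketch of how they could translate into a search request. It is an illustration under stated assumptions, not a description of how the component works internally: it maps the settings onto parameters of Google's Custom Search JSON API (`q`, `num`, `lr`, `gl`), and the `SearchSettings` class and `build_url` helper are hypothetical names. Location is left out because this sketch has no matching parameter for it.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlencode

# Assumption: the component's internals are not documented here. This sketch
# targets Google's Custom Search JSON API purely as an illustration.
API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


@dataclass
class SearchSettings:
    """Hypothetical container mirroring the component's settings."""
    query: str                      # Input Query
    results_limit: int = 5          # Results Limit
    language: Optional[str] = None  # Language, as a two-letter code
    country: Optional[str] = None   # Country, as a two-letter code

    def to_params(self) -> dict:
        # The API accepts between 1 and 10 results per request.
        params = {"q": self.query, "num": max(1, min(self.results_limit, 10))}
        if self.language:
            params["lr"] = f"lang_{self.language.lower()}"  # restrict result language
        if self.country:
            params["gl"] = self.country.lower()             # country of the end user
        return params


def build_url(settings: SearchSettings, api_key: str, engine_id: str) -> str:
    """Build the request URL without sending it."""
    params = {"key": api_key, "cx": engine_id, **settings.to_params()}
    return f"{API_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    settings = SearchSettings("income tax rates", language="en", country="us")
    print(build_url(settings, api_key="YOUR_API_KEY", engine_id="YOUR_ENGINE_ID"))
```

Keeping the limit, language, and country in one place makes it easy to see which settings narrow the search and which are simply passed through.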
Query prefix
Query prefixes are specific words or symbols you can use to narrow down the type of results. For example, you might use the “daterange:” prefix to ensure the results are recent or the “filetype:” prefix if you want a specific file format. Google supports a range of these prefixes, allowing for great control over the returned results.
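As a rough illustration of what a query prefix does, the sketch below prepends operator and value pairs to the user's query before it is sent to the search engine. The `apply_prefixes` helper is a hypothetical name, and how the component combines the prefix with the query internally is an assumption; `filetype:` and `site:` are standard Google search operators.

```python
from typing import Mapping


def apply_prefixes(query: str, prefixes: Mapping[str, str]) -> str:
    """Prepend operator:value pairs to the user's query (hypothetical helper)."""
    parts = []
    for operator, value in prefixes.items():
        operator = operator.rstrip(":")  # accept "filetype" or "filetype:"
        # Quote values containing spaces so they stay attached to the operator.
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{operator}:{value}")
    parts.append(query.strip())
    return " ".join(parts)


if __name__ == "__main__":
    print(apply_prefixes("quarterly report", {"filetype": "pdf", "site": "example.com"}))
```

The prefixes go in front of the query unchanged, so the user's wording is preserved while the operators narrow the kind of results returned.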
How to connect the GoogleSearch component to your flow
The component contains just one input and one output handle:
- Input Query: The query can be any text output. Common use cases would be connecting Chat Input or a Generator.
- Output: The output of any retriever-type component is always a Document.
The Document output can only be read by the Document to Text and Widget-type components. These transform the raw search results into a presentable form ready for output.
Creating a Flow using GoogleSearch
Let’s create a simple chatbot utilizing Google Search as its main source of knowledge.
- As always, start with Chat Input.
- Connect the input to the GoogleSearch component, so the user's message becomes the search query.
- Transform the retriever output. The output is a Document; we want it to be plain text. Use the Document to Text component.
- You can optionally add Prompt and Chat History. We will do so, as it makes the output nicer and more conversational.
- Connect the Generator to add an LLM to the mix.
- You’re ready to output.
Here’s our resulting Flow:
Let’s ask the GoogleSearch bot what the best AI model is:
The bot returns a list of the best models, commenting on each and listing the sources. We purposefully limited the length of output to fit the chat window. However, the result may be much more elaborate than this.
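For readers who think in code, the flow built above can be approximated as a small pipeline: query in, search, convert results to text, build a prompt with chat history, and generate. This is only a sketch under stated assumptions. The `Document` shape, the function names, and the stand-in `fake_search` and `fake_generate` functions are all hypothetical; the real components handle these steps for you on the canvas.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Document:
    """Hypothetical shape of one retrieved search result."""
    title: str
    url: str
    content: str


def documents_to_text(documents: List[Document]) -> str:
    """Flatten retrieved documents into plain text, keeping the sources."""
    return "\n\n".join(
        f"{doc.title}\nSource: {doc.url}\n{doc.content}" for doc in documents
    )


def build_prompt(question: str, context: str, history: List[str]) -> str:
    """Combine chat history, retrieved context, and the question."""
    past = "\n".join(history) if history else "(no previous messages)"
    return (
        "Answer using only the context below and list the sources.\n\n"
        f"Chat history:\n{past}\n\nContext:\n{context}\n\nQuestion: {question}"
    )


def run_flow(
    question: str,
    search: Callable[[str], List[Document]],
    generate: Callable[[str], str],
    history: Optional[List[str]] = None,
) -> str:
    """Chat Input -> search -> text conversion -> prompt -> generator."""
    documents = search(question)
    context = documents_to_text(documents)
    prompt = build_prompt(question, context, history or [])
    return generate(prompt)


# Stand-ins so the sketch runs without network access or a model.
def fake_search(query: str) -> List[Document]:
    return [Document("Example page", "https://example.com", f"Notes about {query}.")]


def fake_generate(prompt: str) -> str:
    return f"(model reply based on a prompt of {len(prompt)} characters)"


if __name__ == "__main__":
    print(run_flow("What is the best AI model?", fake_search, fake_generate))
```

Passing the search and generation steps in as functions mirrors how components are swapped on the canvas: the pipeline stays the same while the retriever or model changes.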
Frequently Asked Questions
What is the GoogleSearch component?
This component allows you to retrieve knowledge by searching Google for relevant content. The component also allows for control over returned results.
How can I limit the returned results?
You can limit the number of search results the bot crawls. You can also make the query more specific by setting the language, country, and even location. Query prefixes are the most powerful limiter: they let you specify the age of results, the file type, and much more.
Why can’t I connect the GoogleSearch component to the output?
The component does not output the information in text form. The output of GoogleSearch is a Document, a more structured record that includes data unsuitable for direct output. You must first transform it to text form via the Document to Text component.
Can I connect both the Document Retriever and GoogleSearch? If so, which one is prioritized?
You can use both simultaneously to make the results more relevant. Each retriever leads to its own output. In this case, the priority is set by the order of outputs in the canvas, from top to bottom. This means that if Document Retriever is the first output from the top, it is prioritized over the other retrievers.
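The top-to-bottom rule can be pictured as a simple sort. The sketch below is a hypothetical model of that behavior, not the product's actual logic: `RetrieverOutput`, its `canvas_y` field, and `order_by_canvas_position` are illustrative names, and a smaller `canvas_y` is assumed to mean the output sits higher on the canvas.

```python
from typing import List, NamedTuple


class RetrieverOutput(NamedTuple):
    """Hypothetical record for one retriever's output on the canvas."""
    name: str
    canvas_y: int  # assumed vertical position; smaller means higher up
    text: str


def order_by_canvas_position(outputs: List[RetrieverOutput]) -> List[RetrieverOutput]:
    """Return outputs from top to bottom, so the topmost comes first."""
    return sorted(outputs, key=lambda output: output.canvas_y)


if __name__ == "__main__":
    outputs = [
        RetrieverOutput("GoogleSearch", canvas_y=320, text="Web results"),
        RetrieverOutput("Document Retriever", canvas_y=120, text="Internal docs"),
    ]
    for output in order_by_canvas_position(outputs):
        print(output.name)
```

In this model, moving the GoogleSearch output above the Document Retriever output on the canvas would reverse the priority.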