Document Retriever

Document Retriever links AI models to your chosen documents and URLs, enabling accurate, up-to-date, and relevant AI responses for your specific use case.

Component description

How the Document Retriever component works

The most significant drawback of large language models is their tendency to present vague, outdated, or outright false information. To keep answers up to date and relevant to your use case, generative models need to be pointed to the right knowledge sources.

This approach, called Retrieval-Augmented Generation (RAG), supplies generative models with your own knowledge sources. The retriever components, including the Document Retriever, let you use this method.

What is the Document Retriever component?

This component allows the chatbot to retrieve knowledge from your own sources, ensuring that the information is relevant, reliable, and up to date. The information comes directly from the sources you specified in Documents and Schedules. The role of this component is to control the retrieval.

FlowHunt's Knowledge Retriever

Input Query

Specifies the query used to look up relevant information. It can either be linked from another component or entered manually. In most cases, your input query will be the Chat Input.

Document Count

This setting limits the number of documents the flow retrieves, making sure the results remain relevant and don’t take too long to generate.

Document categories

This optional setting lets you limit the retrieval to one of the categories you’ve created in the Documents screen of Knowledge Sources.

Schedules

Lets you limit the retrieval to one of the Schedules you’ve specified in the Schedules screen of Knowledge Sources.

Threshold

The sources in your knowledge database will match the query to varying degrees. The AI ranks them by relevance on a scale from 0 to 1. This setting lets you control how closely a source must match the query to be included in the output.

The exact threshold depends on your use case, but generally, 0.7-0.8 is recommended for highly relevant answers from a reasonable number of sources.

Imagine you set the threshold to 0.6 and have the following articles:

  • Article A: 0.8
  • Article B: 0.65
  • Article C: 0.5
  • Article D: 0.9

Only the articles with a relevance score above 0.6 make it into the output, that is, A, B, and D.

  • A high threshold, such as 0.9, will return very relevant results that closely match the query, but it might struggle to find enough documents and miss some relevant ones.
  • A low threshold, for example, one below 0.5, will provide information from more documents, but it runs the risk of returning irrelevant information.
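The effect of the threshold can be sketched in a few lines of Python. This is only an illustration of the filtering idea, not FlowHunt's implementation: the function name `filter_by_threshold` is hypothetical, the scores are the ones from the example above, and the strict "greater than" comparison is an assumption based on the phrase "over 0.6".

```python
# Relevance scores from the example above (illustrative values, not real output).
scores = {"Article A": 0.8, "Article B": 0.65, "Article C": 0.5, "Article D": 0.9}


def filter_by_threshold(scores, threshold):
    """Return the names of documents scoring above the threshold, best match first.

    Assumption: the comparison is strict (score > threshold).
    """
    kept = [name for name, score in scores.items() if score > threshold]
    return sorted(kept, key=lambda name: scores[name], reverse=True)


# Try a medium, a high, and a low threshold to compare how many documents pass.
for threshold in (0.6, 0.85, 0.4):
    print(threshold, filter_by_threshold(scores, threshold))
```

With these scores, a threshold of 0.6 keeps D, A, and B, a threshold of 0.85 keeps only D, and a threshold of 0.4 lets all four articles through, including the weakly matching Article C.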

How to connect the Document Retriever component to your flow

The component contains just one input and one output handle:

  • Input Query: The query can be any text output. Common choices are the human Chat Input or a Generator.
  • Output: The output of any retriever-type component is always a Document.

The Document output contains structured data unsuitable for the final chat output. All components that take Documents as their input transform them into a user-friendly format. These are either Widget components or the Document to Text transformer.
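To make the need for a transformation step concrete, here is a minimal sketch of what turning structured documents into readable text could look like. The `Document` fields (`title`, `url`, `content`) and the function `documents_to_text` are assumptions made for illustration; the actual Document structure and the Document to Text transformer's output format are not described here.

```python
from dataclasses import dataclass


@dataclass
class Document:
    # Hypothetical fields; the real Document structure may differ.
    title: str
    url: str
    content: str


def documents_to_text(documents):
    """Join structured documents into one plain-text block, numbered by rank."""
    parts = []
    for index, doc in enumerate(documents, start=1):
        parts.append(f"[{index}] {doc.title} ({doc.url})\n{doc.content}")
    return "\n\n".join(parts)


docs = [
    Document("Pricing", "https://example.com/pricing", "Plans start with a free tier."),
    Document("FAQ", "https://example.com/faq", "Support is available by email."),
]
print(documents_to_text(docs))
```

The resulting string is something a generator or chat output can consume directly, which the raw structured objects are not.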

Why Use the Document Retriever?

  • Grounding AI Models: Enhance the factual accuracy and relevance of AI outputs by providing real, contextual information from your organization’s knowledge base.
  • Contextual Augmentation: Supply LLMs or chatbots with supporting documents or reference material for more informed responses.
  • Flexible Filtering: Search can be fine-tuned by category, schedule, URL, document structure, or metadata, ensuring you surface only the most relevant information.
  • Custom Output: Choose how much content to retrieve, how to split it, and which metadata to include, making it easy to adapt for downstream AI processes or UI needs.
  • Agent Integration: With tool descriptions and naming, the component can be referenced as a tool in agent-based architectures.

Example Use Cases

  • Retrieval-Augmented Generation (RAG): Provide LLMs with supporting documents to generate accurate, knowledge-backed responses.
  • Chatbots and Virtual Assistants: Quickly surface FAQs or policy documents in response to employee/customer questions.
  • Data Enrichment: Pull in product, author, or other metadata for further AI-driven analysis or workflow automation.

Example

Let’s try it now! Before building the flow, make sure you have created relevant Documents or Schedules. If no good source is present, the chatbot will apologize for being unable to answer.

Steps:

  1. Start with Chat Input.
  2. Add the Document Retriever and connect Chat Input as the Input Query.
  3. The output is a Document that needs to be transformed; for this example, we will use the Document to Text transformer.
  4. Next, connect an AI Generator.
  5. You’re ready to chat.

Example of how to use Document Retriever in FlowHunt

Now our Flow can search our sources based on a human query, transform the structured data into readable text, and pass it to AI to generate a user-friendly answer.
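The same chain of steps can be mimicked in plain Python to show how the pieces fit together. Everything here is a stand-in: `KNOWLEDGE`, `retrieve`, `document_to_text`, and `generate` are hypothetical names, the scoring is a toy word-overlap measure rather than the semantic ranking a real retriever would use, and the generator only assembles the prompt instead of calling a model.

```python
import re

# Hypothetical stand-in for Knowledge Sources (real sources are uploaded or crawled).
KNOWLEDGE = {
    "Pricing page": "The free plan includes starter credits and paid plans add more credits",
    "Blog post": "Our team attended a conference about search engines last spring",
}


def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query, document_count=3, threshold=0.2):
    """Toy retriever: score is the share of query words found in the document."""
    query_words = tokenize(query)
    scored = []
    for title, content in KNOWLEDGE.items():
        score = len(query_words & tokenize(content)) / max(len(query_words), 1)
        if score > threshold:
            scored.append((score, title, content))
    scored.sort(reverse=True)       # best match first
    return scored[:document_count]  # apply the document count limit


def document_to_text(documents):
    """Flatten the structured results into readable text."""
    return "\n".join(f"{title}: {content}" for _, title, content in documents)


def generate(query, context):
    """Placeholder for the AI Generator: only builds the prompt it would receive."""
    if not context:
        return "Sorry, I could not find this in the knowledge sources."
    return (
        "Answer the question using only this context.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


query = "What does the free plan include"
print(generate(query, document_to_text(retrieve(query))))
```

Only the pricing entry passes the threshold for this query, so the prompt contains just that source. A query with no matching source falls through to the apology message.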

Our Knowledge Sources contain a Schedule set to crawl FlowHunt’s pricing page for up-to-date information. Let’s ask the bot about it:

FlowHunt bot's answer about URLsLab's pricing

Examples of flow templates using the Document Retriever component

To help you get started quickly, we have prepared several example flow templates that demonstrate how to use the Document Retriever component effectively. These templates showcase different use cases and best practices, making it easier for you to understand and implement the component in your own projects.

  • AI Chatbot with FreshChat & Knowledge Base Support: Deploy a smart AI chatbot that integrates seamlessly with FreshChat. The chatbot answers user inquiries using your internal knowledge base and intelligently for...
  • AI Chatbot with LiveChat.com Integration: Deploy an AI-powered chatbot on your website that leverages your internal knowledge base to answer customer queries, and seamlessly forwards complex or unresolv...
  • AI Chatbot with Slack Human Escalation: Deploy a smart customer support chatbot for LiveAgent that automatically answers visitor questions, retrieves knowledge base documents, and escalates to a human...
  • AI Chatbot with Tawk Human Handoff: An AI-powered live chat support chatbot that answers customer questions using an internal knowledge base and intelligently hands off complex queries to human ag...
  • AI Customer Service Chatbot: An AI-powered customer service chatbot that uses your internal knowledge sources to provide instant, accurate, and helpful responses to customer queries. It lev...
  • AI Customer Service Chatbot with Human Handoff: An AI-powered customer service chatbot that automatically assists users, retrieves information from internal documents and the web, and seamlessly escalates to ...
  • AI Customer Support Agent With LiveAgent API Integration: This AI-powered workflow automates customer support by connecting user queries to company knowledge sources, external APIs (such as LiveAgent), and a language m...
  • AI Email Assistant for Gmail: Automate Gmail inbox management with an AI agent that reads incoming emails, leverages your knowledge base to craft professional replies, and can send, label, o...
  • AI HubSpot Lead Generation Chatbot: This AI-powered workflow automates lead qualification and contact management in HubSpot. The chatbot collects user information, researches company details, iden...
  • AI Lead Generation Chatbot with Email Notification: This AI-powered lead generation chatbot provides personalized customer support using your internal knowledge base, identifies potential leads in real-time, and ...
  • AI Support Chatbot with LiveAgent Integration: Automate your customer support with an AI chatbot that answers questions using your internal knowledge base and seamlessly connects users to a human agent via L...
  • AI-Powered Outlook Email Reply Automation: Automate professional email replies in Outlook using an AI agent that leverages organizational knowledge sources. Incoming emails are received, parsed, and answ...
  • ChatGPT Knowledge Base Assistant: AI chatbot assistant powered by OpenAI GPT-4o that automatically searches and leverages internal company documents to answer user questions. Delivers context-aw...
  • Convert Technical Documentation to SEO Article: Transform technical documentation from a URL into a compelling, SEO-optimized article for your website. This flow analyzes top-ranking competitor content, gener...
  • HUGO Markdown File Translator: This workflow streamlines the translation of HUGO markdown files into target languages while preserving file structure and formatting. Leveraging AI language mo...
  • LiveAgent AI Chatbot Support: Automate customer support in LiveAgent with an AI chatbot that answers questions using your internal knowledge base, retrieves relevant documents, and seamlessl...
  • Related Articles Paragraph Generator: Automatically generates a short, engaging paragraph for your website that includes links to the most relevant related articles. This AI-powered workflow analyze...
  • Semantic Knowledgebase Search: Easily search and retrieve information from private knowledgebase documents using semantic search powered by AI. The flow expands user queries, searches across ...
  • SEO Content Gap Analyzer: This AI-powered workflow analyzes the content structure of your web page, compares it with top-ranking competitor pages, and provides tailored recommendations o...
  • Shopify AI Customer Support Agent: A workflow for an AI-powered customer service agent that can answer queries about Shopify products, retrieve order status, and access information from internal ...
  • Smartsupp AI Chatbot with Human Handoff: This workflow creates an AI-powered chatbot integrated with Smartsupp, leveraging an internal knowledge base to answer customer support inquiries. If the chatbo...
  • Website & Video Conclusion Generator: Generate concise conclusions from websites, uploaded documents, or YouTube videos using AI. Perfect for quickly summarizing key takeaways and creating article e...

Frequently asked questions

What is the Document Retriever component?

This component allows the Flow to retrieve knowledge from your own sources, such as documents and URLs, ensuring the returned information is relevant, reliable, and up-to-date.

Why can’t I connect a Document Retriever to Chat Output?

Retriever components create structured data that is not suitable for output. It must first be transformed into text or a visual format before being sent to the Chat Output component.

Where does the Document Retriever get information from?

The component searches for the closest query match within the information from user-specified URLs, documents, and schedules.

How many documents does it return?

You can set a limit for the number of results returned, ensuring only the most relevant content is included in your flow.

Can I filter which documents are searched?

Yes, you can filter by document categories, schedules, or URLs, focusing the search on specific segments of your knowledge base.

Can I connect both the Document Retriever and GoogleSearch? If so, which one is prioritized?

You can use both simultaneously. Each retriever leads to its own output, with priority set by the order of outputs in the canvas. The first output from the top is prioritized.

Try FlowHunt's Document Retriever

Build smarter AI solutions by connecting your knowledge sources and ensuring your chatbot always delivers relevant, up-to-date answers.
