Building a Knowledge-Base Agent with RAG in Minutes
You have a 50-page support document, a product manual, and a FAQ that no one reads. Your support team answers the same questions every day — questions that are already answered in those documents, if anyone bothered to look.
RAG (Retrieval-Augmented Generation) solves this by giving your AI agent direct access to your documents. Instead of relying only on its general training data, the AI searches your uploaded files for relevant information and uses that to answer questions. The result is an agent that gives specific answers grounded in your actual documentation.
hiroi handles the entire RAG pipeline for you. Upload your files, and the system takes care of the rest.
What RAG Actually Is
RAG stands for Retrieval-Augmented Generation. Here is what each word means, without the jargon:
- Retrieval: When a user asks a question, the system searches your uploaded documents for the most relevant passages. This is not keyword search. It is semantic search, which matches on meaning and intent rather than on exact words.
- Augmented: The retrieved passages are added to the AI's context alongside the user's question. The AI now has specific, relevant information from your documents to work with.
- Generation: The AI generates a response using both its general knowledge and the specific information retrieved from your documents.
The key insight is that the AI is not memorizing your documents. It is searching them in real time, every time a question is asked. This means:
- Updated documents are reflected as soon as they are re-uploaded and processed
- The AI cites specific content rather than guessing
- Answers stay grounded in your actual materials
- Hallucination is reduced because the AI has real source material to draw on
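To make the three steps concrete, here is a minimal sketch of retrieval and augmentation. It is not hiroi's implementation: the `retrieve` and `augment` functions, the prompt wording, and the sample passages are all made up for illustration, and word overlap stands in for real semantic search. The generation step is left out because it depends on whichever language model receives the prompt.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, passages: list[str], top_k: int = 2) -> list[str]:
    """Stand-in for semantic search: rank passages by words shared with the question."""
    q = tokenize(question)
    ranked = sorted(passages, key=lambda p: len(q & tokenize(p)), reverse=True)
    return ranked[:top_k]


def augment(question: str, retrieved: list[str]) -> str:
    """Place the retrieved passages in the prompt next to the question."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(retrieved, start=1))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Made-up sample passages, not real policy text.
passages = [
    "Refunds for international orders are issued within 14 days.",
    "Our office is open Monday to Friday.",
    "The warranty covers manufacturing defects for one year.",
]
question = "How long do refunds for international orders take?"
prompt = augment(question, retrieve(question, passages, top_k=1))
print(prompt)

# Because the search runs at question time, an edited passage shows up
# in the very next prompt without any retraining.
passages[0] = "Refunds for international orders are issued within 30 days."
updated_prompt = augment(question, retrieve(question, passages, top_k=1))
print(updated_prompt)
```

The second prompt contains the edited passage, which is the practical meaning of searching instead of memorizing.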
How It Works in hiroi
Step 1: Upload Your Documents
Go to your agent's settings in the hiroi dashboard and upload files to the knowledge base. Supported formats:
- PDF — product manuals, whitepapers, reports
- DOCX — Word documents, policies, procedures
- TXT — plain text files, markdown content
There is no special formatting required. Upload the documents as they are.
Step 2: Automatic Processing
Once uploaded, hiroi handles the technical pipeline:
- Text extraction — content is pulled from the document regardless of format
- Chunking — the document is split into meaningful passages (not arbitrary character limits, but logical sections)
- Embedding — each chunk is converted into a vector representation using an embedding model
- Indexing — vectors are stored for fast semantic search
This process takes seconds for most documents. Large files (100+ pages) may take a minute or two.
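For readers curious about what chunking, embedding, and indexing look like in code, the sketch below walks through a toy version. Everything here is an assumption made for illustration: the `# ` heading marker, the function names, and the hashed bag-of-words "embedding" are stand-ins, not how hiroi processes files. A real pipeline would call a trained embedding model and store the vectors in a vector index.

```python
import math
import re
import zlib

DIM = 64  # toy vector size; real embedding models typically use hundreds of dimensions or more


def chunk_by_heading(text: str) -> list[str]:
    """Split at lines starting with '# ' so each chunk is one titled section."""
    chunks: list[str] = []
    current: list[str] = []
    for line in text.splitlines():
        if line.startswith("# ") and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]


def embed(text: str) -> list[float]:
    """Toy embedding: hash each word into a bucket, then L2-normalize."""
    vec = [0.0] * DIM
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        vec[zlib.crc32(word.encode()) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def build_index(text: str) -> list[tuple[str, list[float]]]:
    """Pair every chunk with its vector; a list stands in for a vector store."""
    return [(chunk, embed(chunk)) for chunk in chunk_by_heading(text)]


# Made-up sample document.
doc = """# Shipping Times
Standard shipping takes 3 to 5 business days.

# Return Policy for International Orders
International orders can be returned within 30 days."""

index = build_index(doc)
for chunk, vector in index:
    print(chunk.splitlines()[0], "->", len(vector), "dimensions")
```

Splitting on headings keeps each chunk about one topic, which is why the structure of the source document matters later at search time.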
Step 3: Ask Questions
That is it. Your agent now answers questions using your documents. No configuration, no prompt engineering, no API setup.
A visitor asks "What is your refund policy for international orders?" The system:
- Converts the question into a vector
- Searches your indexed documents for the most semantically similar passages
- Retrieves the top matches (typically 3-5 relevant chunks)
- Sends them to the AI along with the question
- Has the AI generate a response grounded in those specific passages
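The query-time steps above can be sketched as follows. This is an illustrative approximation with hypothetical names (`embed`, `cosine`, `top_k`) and made-up chunks. It uses word counts as vectors, so it only matches exact words, whereas a real embedding model would also match related phrasing.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy vector: word counts. A real system would use an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(question: str, chunks: list[str], k: int = 3) -> list[tuple[float, str]]:
    """Score every chunk against the question and keep the k best."""
    q = embed(question)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Made-up indexed chunks.
chunks = [
    "International orders can be refunded within 30 days of delivery.",
    "Domestic orders ship within 2 business days.",
    "Refund requests are processed through the account page.",
    "The warranty covers manufacturing defects.",
]
question = "What is your refund policy for international orders?"
results = top_k(question, chunks, k=2)
for score, chunk in results:
    print(f"{score:.3f}  {chunk}")
```

The top-ranked chunks are what would be sent to the AI together with the question. Note that the toy version treats "refund" and "refunded" as unrelated words, which is exactly the gap that semantic embeddings are meant to close.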
Tips for Structuring Documents
The quality of RAG answers depends heavily on how your source documents are structured. Here are practical tips from real usage:
Use Clear Headings
Documents with descriptive headings produce better retrieval results. The chunking algorithm uses headings to create logical passages, so "Section 4.2" is less useful than "Return Policy for International Orders."
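A small experiment shows why the heading text matters. The sketch below assumes each chunk carries its heading along with the body, and it scores chunks with a simple word-overlap measure in place of a real embedding model. The function names and sample text are hypothetical.

```python
import re


def words(text: str) -> set[str]:
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(question: str, chunk: str) -> float:
    """Fraction of the question's words that also appear in the chunk."""
    q = words(question)
    return len(q & words(chunk)) / len(q) if q else 0.0


# Same body text under two different headings (made-up example).
body = "Items may be sent back within 30 days of delivery."
vague_chunk = "Section 4.2\n" + body
descriptive_chunk = "Return Policy for International Orders\n" + body

question = "What is the return policy for international orders?"
vague_score = overlap_score(question, vague_chunk)
descriptive_score = overlap_score(question, descriptive_chunk)
print(vague_score)        # 0.0
print(descriptive_score)  # 0.625
```

A real semantic model would likely still connect "sent back" with "return", so the gap would be smaller than in this toy. The descriptive heading adds matching signal either way.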
Keep Related Information Together
If your shipping policy references your return policy, and they are in separate documents with no cross-references, the AI might miss the connection. Either consolidate related topics or include brief cross-references.
Be Specific, Not Vague
A document that says "Contact support for pricing details" gives the AI nothing to work with. A document that lists actual pricing tiers, with conditions and exceptions, gives the AI everything it needs to answer pricing questions accurately.
Update Regularly
RAG answers are only as current as your documents. If your pricing changes but the uploaded document still shows old prices, the AI will confidently give outdated information. Set a reminder to re-upload updated documents when policies change.
One Topic Per Section
Dense paragraphs covering multiple topics confuse the retrieval step. A chunk about both shipping times and return windows might rank well for a shipping question but include irrelevant return information in the context. Break content into focused sections.
What RAG Is Good At (And What It Is Not)
RAG excels at:
- Factual Q&A — "What are your business hours?" "What does the warranty cover?"
- Policy lookups — "Can I return a customized item?" "What ID do I need to verify my account?"
- Product specifications — "What is the battery life of Model X?" "Is it compatible with USB-C?"
- Process explanations — "How do I submit a claim?" "What are the steps to upgrade my plan?"
RAG is less suited for:
- Real-time data — stock prices, live inventory counts, order status (use API integrations instead)
- Subjective opinions — "Is this product worth it?" (the AI should not present marketing copy as objective advice)
- Complex reasoning across many documents — if answering the question requires synthesizing information from 10 different documents, results may be inconsistent
RAG Combined with Page Integration
hiroi's RAG and Page Integration work together. The AI can read both the live page content and your uploaded documents simultaneously.
A customer on a product page asks "How does the warranty work for this specific model?" The AI:
- Reads the product name and model number from the page via Page Integration
- Searches the warranty document via RAG for sections mentioning that model
- Combines both sources to give a specific, accurate answer
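As a rough sketch of how the two sources might be combined, the example below filters document chunks by the model number read from the page and then builds a single prompt. The `PageContext` shape, the function names, the substring filter, and the sample data are all assumptions for illustration. They do not describe hiroi's internals, where the lookup would go through semantic search.

```python
from dataclasses import dataclass


@dataclass
class PageContext:
    """Hypothetical shape for what Page Integration reads from the live page."""
    product_name: str
    model_number: str


def find_model_sections(chunks: list[str], model_number: str) -> list[str]:
    """Keep only the chunks that mention the model shown on the page."""
    needle = model_number.lower()
    return [c for c in chunks if needle in c.lower()]


def build_prompt(question: str, page: PageContext, sections: list[str]) -> str:
    """Merge page context and document sections into one prompt."""
    context = "\n".join(f"- {s}" for s in sections) or "- (no matching sections found)"
    return (
        f"The visitor is viewing: {page.product_name} (model {page.model_number}).\n"
        f"Relevant documentation:\n{context}\n\n"
        f"Question: {question}"
    )


# Made-up warranty chunks and page data.
warranty_chunks = [
    "Model X100: warranty covers the battery for 12 months.",
    "Model Z200: warranty covers the battery for 24 months.",
    "Warranty claims require proof of purchase.",
]
page = PageContext(product_name="Example Speaker", model_number="X100")
sections = find_model_sections(warranty_chunks, page.model_number)
prompt = build_prompt(
    "How does the warranty work for this specific model?", page, sections
)
print(prompt)
```

The page supplies the detail the visitor left out ("this specific model"), and the documents supply the facts about it.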
This combination — live page context plus document knowledge — is what makes the answers feel genuinely intelligent rather than templated.
Getting Started
- Go to your agent in the hiroi dashboard
- Navigate to the Knowledge Base section
- Upload your documents (PDF, DOCX, or TXT)
- Wait for processing to complete (usually under 30 seconds)
- Test by asking questions in the live preview
Start with your most frequently asked questions. Upload the documents that answer them, test a few queries, and verify the responses are accurate. Then expand to additional documentation as needed.
The goal is not to upload everything you have ever written. It is to give the AI the specific information it needs to answer the questions your visitors actually ask. Start focused, then expand based on real conversations.