Knowledge Engine
The Knowledge Engine enables Retrieval-Augmented Generation (RAG) for your AI applications. Upload documents, and the LLM automatically retrieves relevant context to answer questions grounded in your private data.
Feature Access Required: Knowledge Engine requires whitelisted access.
To request access, email sales@demeterics.com with:
- Subject: "Feature Access Request"
- Feature name: "Knowledge Engine"
Overview
The Knowledge Engine transforms your documents into a searchable knowledge base that LLMs can query. Instead of relying solely on training data, the LLM retrieves relevant passages from your documents to provide accurate, grounded responses.
Key Benefits:
- Grounded Responses: Answers based on your actual documents, not hallucinated content
- Private Data: Your documents stay in your Demeterics account, not sent to LLM training
- Automatic Indexing: Upload documents and they're immediately searchable
- Multi-Format Support: PDF, Word, Markdown, HTML, and plain text
Architecture
```
User Query → LLM (with Knowledge Tools)
        ↓
  search_knowledge (vector search)
  find_documents  (document discovery)
  get_summary     (topic overview)
  get_content     (full document)
        ↓
  Relevant context injected
        ↓
  LLM generates grounded response
```
The Knowledge Engine provides four tools that the LLM can call during conversations:
| Tool | Purpose |
|---|---|
| `search_knowledge` | Vector similarity search across all document chunks |
| `find_documents` | Discover relevant documents ranked by match frequency |
| `get_summary` | Retrieve the summary/abstract of a specific topic |
| `get_content` | Retrieve the full content of a specific document |
Getting Started
Step 1: Create a Knowledge Project
- Go to Knowledge → Projects in the dashboard
- Click Create Project
- Enter a name and description
- Click Create
Step 2: Upload Documents
- Select your project
- Click Upload Files
- Drag and drop or select files:
  - PDF documents
  - Word documents (.docx)
  - Markdown files (.md)
  - HTML files (.html)
  - Plain text (.txt)
- Files are automatically processed and indexed
Processing includes the following stages (see the sketch after this list):
- Text extraction
- Semantic chunking (500-1000 tokens per chunk)
- Embedding generation (OpenAI text-embedding-3-small)
- Summary generation for each topic
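The exact pipeline is internal to Demeterics, but the chunking stage can be pictured with this minimal sketch. The token estimate and paragraph-based splitting are illustrative assumptions, not the production implementation (which also respects semantic boundaries such as headings):

```python
# Illustrative sketch of chunking; Demeterics' actual pipeline is internal.

def approx_tokens(text: str) -> int:
    # Rough rule of thumb: ~4 characters per token.
    return len(text) // 4

def chunk_document(markdown: str, min_tokens: int = 500) -> list[str]:
    """Pack paragraphs into chunks of roughly 500-1000 tokens."""
    paragraphs = [p.strip() for p in markdown.split("\n\n") if p.strip()]
    chunks, current = [], []
    for para in paragraphs:
        current.append(para)
        # Close the chunk once it reaches the lower bound of the target range.
        if approx_tokens("\n\n".join(current)) >= min_tokens:
            chunks.append("\n\n".join(current))
            current = []
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Each resulting chunk is then embedded and stored for vector search.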
Step 3: Enable Knowledge in Your Application
Knowledge Engine integrates with:
- Unified Chat API via `demeterics_tools.knowledge`
- AI Chat Widget via Knowledge Project attachment
Project Organization
Documents in a Knowledge Project are organized by topics. Each topic contains:
| File | Purpose |
|---|---|
| `{topic}_summary.md` | AI-generated summary/abstract of the topic |
| `{topic}_detailed.md` | Full content with structure preserved |
| Chunk vectors | Semantic embeddings for similarity search |
Example structure:
```
my-project/
├── product-guide/
│   ├── product-guide_summary.md
│   ├── product-guide_detailed.md
│   └── [vector chunks]
├── pricing-policy/
│   ├── pricing-policy_summary.md
│   ├── pricing-policy_detailed.md
│   └── [vector chunks]
└── faq/
    ├── faq_summary.md
    ├── faq_detailed.md
    └── [vector chunks]
```
How the LLM Uses Knowledge Tools
When the LLM receives a user question, it decides which tools to call based on the query:
Vector Search (`search_knowledge`)
For specific questions, the LLM searches for relevant passages:

```
User: "What is the return policy for electronics?"
LLM calls: search_knowledge(query="return policy electronics")
Returns: Top 5 most relevant chunks with similarity scores
```
Document Discovery (`find_documents`)
For broad questions, the LLM first discovers which documents might be relevant:

```
User: "Tell me about your products"
LLM calls: find_documents(query="products")
Returns: List of topics ranked by relevance
  - product-guide (5 matches, avg score 0.89)
  - pricing-policy (3 matches, avg score 0.76)
```
Summary Retrieval (`get_summary`)
For a topic overview, the LLM retrieves the summary:

```
LLM calls: get_summary(topic="product-guide")
Returns: The product-guide_summary.md content
```
Full Content (`get_content`)
For detailed information, the LLM retrieves full documents:

```
LLM calls: get_content(topic="pricing-policy")
Returns: The pricing-policy_detailed.md content
(automatically split if over 30,000 tokens)
```
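Taken together, the decision logic above amounts to a small routing pattern. The sketch below is illustrative only; the stub functions mimic the four tools, and in practice the LLM itself decides which tool to call during the tool-calling loop:

```python
# Illustrative sketch only: the LLM normally makes these routing decisions
# itself. The stubs below stand in for the real Knowledge Engine tools.

def find_documents(query):      # would return topics ranked by match frequency
    return [{"topic": "product-guide", "matches": 5, "avg_score": 0.89}]

def get_summary(topic):         # would return the {topic}_summary.md content
    return f"[summary of {topic}]"

def search_knowledge(query):    # would return top-5 chunks with scores
    return [{"content": "[relevant chunk]", "similarity": 0.9}]

def answer(question: str) -> str:
    broad = len(question.split()) <= 4   # crude stand-in for the LLM's judgment
    if broad:
        topics = find_documents(question)
        context = get_summary(topics[0]["topic"])
    else:
        context = "\n".join(c["content"] for c in search_knowledge(question))
    return f"Answer to {question!r} grounded in: {context}"

print(answer("Tell me about products"))                       # broad → summary path
print(answer("What is the return policy for electronics?"))  # specific → search path
```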
Vector Search Details
The Knowledge Engine uses:
- Embedding Model: OpenAI `text-embedding-3-small` (1536 dimensions)
- Similarity Metric: Cosine similarity
- Chunk Size: 500-1000 tokens with semantic boundaries
- Storage: Google BigQuery with ML.VECTOR_SEARCH
Search flow (see the sketch after this list):
- Query embedded using same model
- Cosine similarity computed against all chunks
- Top-K results returned (default: 5)
- Results include: content, similarity score, topic, token count
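The similarity computation itself is straightforward. Here is a minimal sketch, assuming unit-free embeddings stored as NumPy arrays; the production search runs in BigQuery via ML.VECTOR_SEARCH, as noted above:

```python
import numpy as np

def top_k(query_vec: np.ndarray, chunk_vecs: np.ndarray, k: int = 5):
    """Return indices and cosine similarities of the k closest chunks."""
    # Normalize so the dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    c = chunk_vecs / np.linalg.norm(chunk_vecs, axis=1, keepdims=True)
    sims = c @ q                        # one similarity per chunk
    idx = np.argsort(sims)[::-1][:k]    # highest-scoring first
    return idx, sims[idx]

# Example with random stand-in embeddings; real vectors come from
# text-embedding-3-small and have 1536 dimensions.
rng = np.random.default_rng(0)
chunks = rng.normal(size=(1000, 1536))
query = rng.normal(size=1536)
indices, scores = top_k(query, chunks)
print(indices, scores)
```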
Best Practices
Document Preparation
- Use clear headings: Structure documents with H1, H2, H3 headers
- One topic per file: Don't mix unrelated content in a single document
- Include metadata: Add title, date, and version information (see the example after this list)
- Remove boilerplate: Strip headers/footers that repeat across pages
- Update regularly: Re-upload documents when content changes
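For instance, a source file prepared along these lines might open like this (contents purely illustrative):

```markdown
# Return Policy
<!-- Version 2.1 · Updated 2025-01-15 -->

## Electronics
Electronics may be returned within 30 days of purchase with proof of receipt.

## Apparel
Unworn apparel may be returned within 60 days.
```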
Project Organization
- Separate by domain: Create different projects for different use cases
- Use descriptive names: `customer-support-kb`, not `project-1`
- Version control: Keep source documents in version control
- Monitor quality: Review LLM responses and refine documents
Query Optimization
- System prompts: Instruct the LLM when to use knowledge tools
- Temperature: Use lower temperature (0.3-0.5) for factual retrieval
- Max iterations: Allow 2-3 tool calls for complex questions
- Fallback behavior: Define what to do when no relevant content found
Integration Options
Option 1: Unified Chat API (Recommended)
Use the `/chat` endpoint with `demeterics_tools.knowledge` enabled.
See: Unified Chat API with Knowledge
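The full request schema is documented on the linked page. As a rough sketch, a knowledge-enabled chat call might look like the following; the `/v1/chat` path, `model`, and `messages` fields are assumptions based on the search endpoint above and common chat-API conventions, and the payload shape for `demeterics_tools` is inferred from the dotted name used in this guide:

```python
import requests

# Hedged sketch: auth header, tool_choice, and demeterics_tools.knowledge come
# from this page; endpoint path and model/messages shape are assumptions.
resp = requests.post(
    "https://api.demeterics.com/v1/chat",
    headers={"Authorization": "Bearer dmt_your_api_key"},
    json={
        "model": "gpt-4o",            # hypothetical model name
        "temperature": 0.3,           # lower temperature for factual retrieval
        "messages": [
            {"role": "system",
             "content": "Use knowledge tools to answer questions about our products."},
            {"role": "user",
             "content": "What is the return policy for electronics?"},
        ],
        "demeterics_tools": {"knowledge": True},  # enable Knowledge Engine tools
        "tool_choice": "auto",
    },
    timeout=30,
)
print(resp.json())
```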
Option 2: AI Chat Widget
Attach a Knowledge Project to your AI Chat Widget.
See: AI Chat Widget Knowledge Integration
Option 3: Direct API Access
Query your knowledge base directly via the Knowledge API.
Search endpoint:
```
curl -X POST https://api.demeterics.com/v1/projects/{project_id}/agent/search \
  -H "Authorization: Bearer dmt_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"query": "return policy", "top_k": 5}'
```
Response:
```
{
  "results": [
    {
      "content": "Our return policy allows returns within 30 days...",
      "topic_slug": "policies",
      "similarity": 0.92,
      "tokens": 156
    }
  ],
  "query_tokens": 3,
  "search_time_ms": 45
}
```
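The same call from Python, using the endpoint and payload shown above (the project ID placeholder is yours to fill in):

```python
import requests

PROJECT_ID = "your_project_id"  # placeholder from your dashboard
resp = requests.post(
    f"https://api.demeterics.com/v1/projects/{PROJECT_ID}/agent/search",
    headers={"Authorization": "Bearer dmt_your_api_key"},
    json={"query": "return policy", "top_k": 5},
    timeout=30,
)
# Print each hit with its similarity score, topic, and a content preview.
for hit in resp.json()["results"]:
    print(f'{hit["similarity"]:.2f}  {hit["topic_slug"]}: {hit["content"][:60]}')
```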
Pricing
Knowledge Engine usage is billed via Demeterics credits:
| Component | Cost |
|---|---|
| Document indexing | $0.01 per 1,000 tokens |
| Embedding queries | $0.0001 per query |
| Storage | Included in base plan |
All LLM calls made during tool-calling loops are billed at standard rates for the model used.
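As a worked example: indexing 100 documents averaging 5,000 tokens each (500,000 tokens total) costs 500 × $0.01 = $5.00, and 10,000 subsequent search queries add 10,000 × $0.0001 = $1.00 in embedding charges.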
Limits
| Limit | Value |
|---|---|
| Projects per account | 10 |
| Documents per project | 500 |
| Max document size | 50 MB |
| Max tokens per document | 100,000 |
| Vector chunks per project | 50,000 |
Contact sales@demeterics.com for enterprise limits.
Troubleshooting
Documents not appearing in search?
- Check the project's indexing status in the dashboard
- Re-sync vectors: Project → Settings → Resync Vectors
- Verify the document was processed without errors
Low relevance scores?
- Use more descriptive queries
- Check document structure (headings, paragraphs)
- Consider splitting large documents into focused topics
LLM not calling tools?
- Add system prompt: "Use knowledge tools to answer questions about our products"
- Check that `tool_choice` is set to `"auto"`, not `"none"`
- Verify `demeterics_tools.knowledge` is enabled
Need help?
- Email: support@demeterics.com
- Dashboard chat widget
- API Reference
Related Documentation
- Unified Chat API – Using Knowledge with the Chat API
- AI Chat Widget – Widget integration with Knowledge
- API Reference – Full API documentation
- Getting Started – Quick start guide