Knowledge Engine

The Knowledge Engine enables RAG (Retrieval-Augmented Generation) for your AI applications. Upload documents, and the LLM automatically retrieves relevant context to answer questions with your private data.

Feature Access Required: Knowledge Engine requires whitelisted access.

To request access, email sales@demeterics.com with:

  • Subject: "Feature Access Request"
  • Feature name: "Knowledge Engine"

Overview

The Knowledge Engine transforms your documents into a searchable knowledge base that LLMs can query. Instead of relying solely on training data, the LLM retrieves relevant passages from your documents to provide accurate, grounded responses.

Key Benefits:

  • Grounded Responses: Answers based on your actual documents, not hallucinated content
  • Private Data: Your documents stay in your Demeterics account and are never used for LLM training
  • Automatic Indexing: Upload documents and they're immediately searchable
  • Multi-Format Support: PDF, Word, Markdown, HTML, and plain text

Architecture

User Query → LLM (with Knowledge Tools)
                    ↓
            search_knowledge (vector search)
            find_documents (document discovery)
            get_summary (topic overview)
            get_content (full document)
                    ↓
            Relevant context injected
                    ↓
            LLM generates grounded response

The Knowledge Engine provides four tools that the LLM can call during conversations:

Tool              Purpose
search_knowledge  Vector similarity search across all document chunks
find_documents    Discover relevant documents ranked by match frequency
get_summary       Retrieve the summary/abstract of a specific topic
get_content       Retrieve the full content of a specific document

Getting Started

Step 1: Create a Knowledge Project

  1. Go to Knowledge → Projects in the dashboard
  2. Click Create Project
  3. Enter a name and description
  4. Click Create

Step 2: Upload Documents

  1. Select your project
  2. Click Upload Files
  3. Drag and drop or select files:
    • PDF documents
    • Word documents (.docx)
    • Markdown files (.md)
    • HTML files (.html)
    • Plain text (.txt)
  4. Files are automatically processed and indexed

Processing includes:

  • Text extraction
  • Semantic chunking (500-1000 tokens per chunk)
  • Embedding generation (OpenAI text-embedding-3-small)
  • Summary generation for each topic
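
At this chunk size, a 10,000-token document yields roughly 10 to 20 chunks, each embedded and indexed individually.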

Step 3: Enable Knowledge in Your Application

Knowledge Engine integrates with:

  • Unified Chat API: the /chat endpoint with demeterics_tools.knowledge enabled
  • AI Chat Widget: attach a Knowledge Project to your widget
  • Direct API Access: query the knowledge base via the Knowledge API

See Integration Options below for details on each.

Project Organization

Documents in a Knowledge Project are organized by topics. Each topic contains:

File                 Purpose
{topic}_summary.md   AI-generated summary/abstract of the topic
{topic}_detailed.md  Full content with structure preserved
Chunk vectors        Semantic embeddings for similarity search

Example structure:

my-project/
├── product-guide/
│   ├── product-guide_summary.md
│   ├── product-guide_detailed.md
│   └── [vector chunks]
├── pricing-policy/
│   ├── pricing-policy_summary.md
│   ├── pricing-policy_detailed.md
│   └── [vector chunks]
└── faq/
    ├── faq_summary.md
    ├── faq_detailed.md
    └── [vector chunks]

How the LLM Uses Knowledge Tools

When the LLM receives a user question, it decides which tools to call based on the query:

Vector Search (search_knowledge)

For specific questions, the LLM searches for relevant passages:

User: "What is the return policy for electronics?"

LLM calls: search_knowledge(query="return policy electronics")

Returns: Top 5 most relevant chunks with similarity scores

Document Discovery (find_documents)

For broad questions, the LLM first discovers which documents might be relevant:

User: "Tell me about your products"

LLM calls: find_documents(query="products")

Returns: List of topics ranked by relevance
- product-guide (5 matches, avg score 0.89)
- pricing-policy (3 matches, avg score 0.76)

Summary Retrieval (get_summary)

For a topic overview, the LLM retrieves the summary:

LLM calls: get_summary(topic="product-guide")

Returns: The product-guide_summary.md content

Full Content (get_content)

For detailed information, the LLM retrieves full documents:

LLM calls: get_content(topic="pricing-policy")

Returns: The pricing-policy_detailed.md content
(automatically split if over 30,000 tokens)

Vector Search Details

The Knowledge Engine uses:

  • Embedding Model: OpenAI text-embedding-3-small (1536 dimensions)
  • Similarity Metric: Cosine similarity
  • Chunk Size: 500-1000 tokens with semantic boundaries
  • Storage: Google BigQuery with ML.VECTOR_SEARCH

Search flow:

  1. Query embedded using same model
  2. Cosine similarity computed against all chunks
  3. Top-K results returned (default: 5)
  4. Results include: content, similarity score, topic, token count
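
The same flow can be exercised directly against the search endpoint described under Integration Options, for example to raise top_k above the default of 5 (the query text is illustrative):

curl -X POST https://api.demeterics.com/v1/projects/{project_id}/agent/search \
  -H "Authorization: Bearer dmt_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"query": "warranty coverage", "top_k": 10}'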

Best Practices

Document Preparation

  1. Use clear headings: Structure documents with H1, H2, H3 headers
  2. One topic per file: Don't mix unrelated content in a single document
  3. Include metadata: Add title, date, and version information
  4. Remove boilerplate: Strip headers/footers that repeat across pages
  5. Update regularly: Re-upload documents when content changes
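
A short source document that follows these guidelines might look like this (the contents are purely illustrative):

# Return Policy

Title: Return Policy | Version: 2.1 | Updated: 2024-06-01

## Electronics

Electronics may be returned within 30 days of purchase with the original receipt.

## Apparel

Unworn apparel may be returned within 60 days with tags attached.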

Project Organization

  1. Separate by domain: Create different projects for different use cases
  2. Use descriptive names: customer-support-kb not project-1
  3. Version control: Keep source documents in version control
  4. Monitor quality: Review LLM responses and refine documents

Query Optimization

  1. System prompts: Instruct the LLM when to use knowledge tools
  2. Temperature: Use lower temperature (0.3-0.5) for factual retrieval
  3. Max iterations: Allow 2-3 tool calls for complex questions
  4. Fallback behavior: Define what to do when no relevant content is found
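
Putting these settings together, a hedged request-body fragment for the /chat endpoint (tool_choice and the system-prompt guidance are named on this page; the remaining field names follow common chat-completions conventions and are assumptions, and the iteration cap is omitted because this page does not document its field name):

{
  "temperature": 0.4,
  "tool_choice": "auto",
  "messages": [
    {
      "role": "system",
      "content": "Use knowledge tools to answer questions about our products. If no relevant content is found, say so instead of guessing."
    }
  ]
}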

Integration Options

Option 1: Unified Chat API

Use the /chat endpoint with demeterics_tools.knowledge enabled.

See: Unified Chat API with Knowledge
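
A minimal sketch of such a request. Only the demeterics_tools.knowledge flag is named on this page; the /v1/chat path, model name, and the exact shape of the demeterics_tools object are assumptions, so treat the linked guide as authoritative:

curl -X POST https://api.demeterics.com/v1/chat \
  -H "Authorization: Bearer dmt_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What is your return policy?"}],
    "demeterics_tools": {"knowledge": {"enabled": true, "project_id": "your_project_id"}}
  }'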

Option 2: AI Chat Widget

Attach a Knowledge Project to your AI Chat Widget.

See: AI Chat Widget Knowledge Integration

Option 3: Direct API Access

Query your knowledge base directly via the Knowledge API.

Search endpoint:

curl -X POST https://api.demeterics.com/v1/projects/{project_id}/agent/search \
  -H "Authorization: Bearer dmt_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"query": "return policy", "top_k": 5}'

Response:

{
  "results": [
    {
      "content": "Our return policy allows returns within 30 days...",
      "topic_slug": "policies",
      "similarity": 0.92,
      "tokens": 156
    }
  ],
  "query_tokens": 3,
  "search_time_ms": 45
}
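
In the response, similarity is the cosine similarity score described under Vector Search Details (closer to 1.0 means a stronger match), and tokens is the chunk's token count.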

Pricing

Knowledge Engine usage is billed via Demeterics credits:

Component          Cost
Document indexing  $0.01 per 1,000 tokens
Embedding queries  $0.0001 per query
Storage            Included in base plan
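
For example, indexing a single 50,000-token document costs $0.50 (50 × $0.01 per 1,000 tokens), while 1,000 search queries add $0.10 in embedding charges.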

All LLM calls made during tool-calling loops are billed at standard rates for the model used.


Limits

Limit                      Value
Projects per account       10
Documents per project      500
Max document size          50 MB
Max tokens per document    100,000
Vector chunks per project  50,000

Contact sales@demeterics.com for enterprise limits.


Troubleshooting

Documents not appearing in search?

  • Check the project's indexing status in the dashboard
  • Re-sync vectors: Project → Settings → Resync Vectors
  • Verify the document was processed without errors

Low relevance scores?

  • Use more descriptive queries
  • Check document structure (headings, paragraphs)
  • Consider splitting large documents into focused topics

LLM not calling tools?

  • Add system prompt: "Use knowledge tools to answer questions about our products"
  • Check tool_choice is set to "auto" not "none"
  • Verify demeterics_tools.knowledge is enabled

Need help?