Data Extraction API
Demeterics stores all your LLM interactions in BigQuery and provides APIs to extract and analyze this data programmatically. This guide covers how to export interaction data, query usage metrics, and integrate with your analytics pipeline.
Overview
Your interaction data is stored in BigQuery and can be extracted via:
- Export API (`POST /api/v1/exports`) - Bulk export to JSON, CSV, or Avro
- Stream API (`GET /api/v1/exports/{request_id}/stream`) - Stream large datasets
- Dashboard UI - Download exports directly from the web interface
All exports are scoped to your user account and respect data retention policies.
Authentication
All export endpoints require authentication via your Demeterics API key:
```
Authorization: Bearer dmt_your_api_key
```
Your API key must have the `export` scope enabled. To check or update scopes, visit Settings → API Keys.
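With Python's requests library, for example, you can attach the key to a session once and reuse it for every call in this guide. This is a minimal sketch; the `DMT_API_KEY` environment variable name is illustrative, not part of the API:

```python
import os
import requests

# Illustrative: read the key from an environment variable of your choosing
API_KEY = os.environ["DMT_API_KEY"]

session = requests.Session()
session.headers.update({"Authorization": f"Bearer {API_KEY}"})

# Every request made through this session now carries the Authorization header
resp = session.post("https://api.demeterics.com/api/v1/exports", json={"format": "csv"})
resp.raise_for_status()
```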
Export API
POST /api/v1/exports
Create a new data export job. The call returns immediately with a `request_id` that you pass to the stream endpoint to download the results.
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| `format` | string | No | Output format: `json`, `csv`, or `avro`. Default: `csv` |
| `start_date` | string | No | Start date filter (ISO 8601: `YYYY-MM-DD`) |
| `end_date` | string | No | End date filter (ISO 8601: `YYYY-MM-DD`) |
| `tables` | array | No | Tables to export: `interactions`, `eval_runs`, `eval_results`. Default: all |
| `use_gcs` | boolean | No | Export to GCS bucket instead of streaming. Default: `false` |
| `gcs_bucket` | string | No | Target GCS bucket (required if `use_gcs` is `true`) |
Example: Export last 30 days as JSON
```bash
curl -X POST https://api.demeterics.com/api/v1/exports \
  -H "Authorization: Bearer dmt_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "format": "json",
    "start_date": "2025-11-01",
    "end_date": "2025-11-30",
    "tables": ["interactions"]
  }'
```
Response
```json
{
  "status": "ok",
  "request_id": "550e8400-e29b-41d4-a716-446655440000",
  "row_count": 1542,
  "bytes_size": 2048576,
  "message": "Export ready for streaming"
}
```
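Before streaming, it is worth checking both the HTTP status and the `status` field, then keeping the `request_id`. This is a defensive sketch; the exact shape of error bodies is an assumption, not documented behavior:

```python
import requests

resp = requests.post(
    "https://api.demeterics.com/api/v1/exports",
    headers={"Authorization": "Bearer dmt_your_api_key"},
    json={"format": "json", "start_date": "2025-11-01", "end_date": "2025-11-30"},
)
resp.raise_for_status()  # fail fast on 4xx/5xx

export = resp.json()
if export.get("status") != "ok":  # defensive check; error body shape is an assumption
    raise RuntimeError(f"Export failed: {export}")

request_id = export["request_id"]
print(f"{export['row_count']} rows ready ({export['bytes_size']} bytes)")
```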
GET /api/v1/exports/{request_id}/stream
Stream the exported data. Use the `request_id` from the export response.
Example: Stream as CSV
```bash
curl -X GET "https://api.demeterics.com/api/v1/exports/550e8400-e29b-41d4-a716-446655440000/stream" \
  -H "Authorization: Bearer dmt_your_api_key" \
  -o interactions.csv
```
Query Parameters
| Parameter | Description |
|---|---|
| `format` | Override format: `json` or `csv` |
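For exports too large to hold in memory, you can stream the response to disk in chunks and use the `format` query parameter described above to override the output format. A sketch using Python's requests library; the chunk size and file name are arbitrary:

```python
import requests

API_KEY = "dmt_your_api_key"
request_id = "550e8400-e29b-41d4-a716-446655440000"  # from the export response

url = f"https://api.demeterics.com/api/v1/exports/{request_id}/stream"

# stream=True avoids loading the whole export into memory at once
with requests.get(
    url,
    headers={"Authorization": f"Bearer {API_KEY}"},
    params={"format": "json"},  # optional override of the export's original format
    stream=True,
) as resp:
    resp.raise_for_status()
    with open("interactions.json", "wb") as f:
        for chunk in resp.iter_content(chunk_size=64 * 1024):
            f.write(chunk)
```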
Interaction Data Schema
Exported interactions include the following fields:
| Field | Type | Description |
|---|---|---|
| `transaction_id` | string | Unique interaction identifier (ULID) |
| `request_id` | string | Client-provided request ID for idempotency |
| `session_id` | string | Session identifier for grouping conversations |
| `user_id` | int64 | Your Demeterics user ID |
| `model` | string | LLM model used (e.g., `llama-3.3-70b-versatile`) |
| `question` | string | Input prompt/question |
| `question_time` | timestamp | When the question was sent |
| `answer` | string | LLM response |
| `answer_time` | timestamp | When the answer was received |
| `latency_ms` | int64 | Response time in milliseconds |
| `prompt_tokens` | int64 | Input token count |
| `completion_tokens` | int64 | Output token count |
| `cached_tokens` | int64 | Cached token count (if applicable) |
| `total_tokens` | int64 | Total tokens used |
| `estimated_cost` | float64 | Estimated cost in USD |
| `status` | string | `success`, `error`, or `timeout` |
| `error_message` | string | Error details (if status is `error`) |
| `application` | string | Application name from the API key |
| `metadata` | json | Custom metadata attached to the interaction |
| `tags` | array | Tags for categorization |
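If you consume JSON exports from typed code, it can help to mirror this schema explicitly. The dataclass below is a sketch that follows the field names in the table above; treating the timestamp fields as ISO 8601 strings is an assumption about the JSON encoding:

```python
from dataclasses import dataclass, field
from typing import Any, Optional

@dataclass
class Interaction:
    """One exported interaction row; mirrors the schema table above."""
    transaction_id: str
    request_id: Optional[str]
    session_id: Optional[str]
    user_id: int
    model: str
    question: str
    question_time: str          # assumed ISO 8601 string in JSON exports
    answer: str
    answer_time: str
    latency_ms: int
    prompt_tokens: int
    completion_tokens: int
    cached_tokens: int
    total_tokens: int
    estimated_cost: float
    status: str                 # "success", "error", or "timeout"
    error_message: Optional[str] = None
    application: Optional[str] = None
    metadata: dict[str, Any] = field(default_factory=dict)
    tags: list[str] = field(default_factory=list)
```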
Export Examples
Python: Export and Analyze
```python
import requests
import pandas as pd
from io import StringIO

API_KEY = "dmt_your_api_key"
BASE_URL = "https://api.demeterics.com"

# Create export job
response = requests.post(
    f"{BASE_URL}/api/v1/exports",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "format": "csv",
        "start_date": "2025-11-01",
        "end_date": "2025-11-30",
        "tables": ["interactions"]
    }
)
export = response.json()
request_id = export["request_id"]

# Stream the data
stream_response = requests.get(
    f"{BASE_URL}/api/v1/exports/{request_id}/stream",
    headers={"Authorization": f"Bearer {API_KEY}"}
)

# Load into pandas
df = pd.read_csv(StringIO(stream_response.text))

# Analyze
print(f"Total interactions: {len(df)}")
print(f"Total cost: ${df['estimated_cost'].sum():.2f}")
print(f"Avg latency: {df['latency_ms'].mean():.0f}ms")
print("\nTop models:")
print(df['model'].value_counts().head())
```
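Continuing the snippet above, the same DataFrame can also break cost down by application and day, assuming the `application` and `question_time` columns from the schema are present in your export:

```python
# Optional follow-up to the snippet above: cost per application per day
df["question_time"] = pd.to_datetime(df["question_time"])
daily_cost = (
    df.groupby(["application", df["question_time"].dt.date])["estimated_cost"]
    .sum()
    .round(4)
)
print(daily_cost)
```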
Node.js: Stream to File
```javascript
const fs = require('fs');
const https = require('https');

const API_KEY = 'dmt_your_api_key';

// Create export
fetch('https://api.demeterics.com/api/v1/exports', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${API_KEY}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    format: 'json',
    start_date: '2025-11-01',
    end_date: '2025-11-30'
  })
})
  .then(res => res.json())
  .then(data => {
    // Stream to file
    const file = fs.createWriteStream('interactions.json');
    https.get(
      `https://api.demeterics.com/api/v1/exports/${data.request_id}/stream`,
      { headers: { 'Authorization': `Bearer ${API_KEY}` } },
      response => response.pipe(file)
    );
  });
```
Shell: Daily Export Script
```bash
#!/bin/bash
# daily_export.sh - Export yesterday's interactions

API_KEY="dmt_your_api_key"
YESTERDAY=$(date -d "yesterday" +%Y-%m-%d)
TODAY=$(date +%Y-%m-%d)
OUTPUT_FILE="interactions_${YESTERDAY}.csv"

# Create export
REQUEST_ID=$(curl -s -X POST https://api.demeterics.com/api/v1/exports \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"format\": \"csv\",
    \"start_date\": \"$YESTERDAY\",
    \"end_date\": \"$TODAY\",
    \"tables\": [\"interactions\"]
  }" | jq -r '.request_id')

# Download
curl -s "https://api.demeterics.com/api/v1/exports/$REQUEST_ID/stream" \
  -H "Authorization: Bearer $API_KEY" \
  -o "$OUTPUT_FILE"

echo "Exported to $OUTPUT_FILE"
```
GCS Export (Enterprise)
For large datasets, export directly to a Google Cloud Storage bucket:
```bash
curl -X POST https://api.demeterics.com/api/v1/exports \
  -H "Authorization: Bearer dmt_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "format": "avro",
    "use_gcs": true,
    "gcs_bucket": "gs://your-bucket/exports/",
    "start_date": "2025-01-01",
    "end_date": "2025-11-30"
  }'
```
Response
```json
{
  "status": "ok",
  "request_id": "550e8400-e29b-41d4-a716-446655440000",
  "url": "gs://your-bucket/exports/interactions_2025-11-30.avro",
  "expires_at": "2025-12-07T00:00:00Z",
  "row_count": 150000,
  "message": "Export complete"
}
```
Note: Contact support to enable GCS export for your account.
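Once the export lands in your bucket, you can pull the file down with the official google-cloud-storage client. A sketch that assumes your local credentials already have read access to the bucket; the bucket and object names come from the `url` field in the response above:

```python
from google.cloud import storage  # pip install google-cloud-storage

# Assumes application-default credentials with read access to the bucket
client = storage.Client()
bucket = client.bucket("your-bucket")
blob = bucket.blob("exports/interactions_2025-11-30.avro")
blob.download_to_filename("interactions_2025-11-30.avro")
print("Downloaded", blob.name)
```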
Rate Limits
| Endpoint | Limit |
|---|---|
| `POST /api/v1/exports` | 10 requests/minute |
| `GET /api/v1/exports/{id}/stream` | 100 requests/minute |
Export jobs are cached for 10 minutes. Repeated requests with the same parameters will return the cached result.
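If you drive exports from an automated pipeline, a simple backoff on HTTP 429 responses keeps you within these limits. A sketch; the retry count and sleep schedule are arbitrary choices, not API requirements:

```python
import time
import requests

def create_export(payload: dict, api_key: str, max_retries: int = 5) -> dict:
    """POST /api/v1/exports with simple exponential backoff on 429 responses."""
    url = "https://api.demeterics.com/api/v1/exports"
    headers = {"Authorization": f"Bearer {api_key}"}
    for attempt in range(max_retries):
        resp = requests.post(url, headers=headers, json=payload)
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp.json()
        time.sleep(2 ** attempt)  # 1s, 2s, 4s, ...
    raise RuntimeError("Still rate limited after retries")
```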
Best Practices
- Use date filters - Always specify `start_date` and `end_date` to limit data volume
- Export incrementally - Run daily/weekly exports instead of full history dumps
- Use CSV for analysis - Easier to work with in spreadsheets and pandas
- Use Avro for pipelines - More efficient for BigQuery, Spark, or data warehouses (see the loading sketch after this list)
- Store exports - Export jobs expire after 10 minutes; save the data locally
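As an example of the Avro recommendation, a GCS export can be loaded straight into your own BigQuery dataset with the official client. This is a sketch: the project, dataset, and table names are placeholders, and it assumes the GCS export option described above:

```python
from google.cloud import bigquery  # pip install google-cloud-bigquery

client = bigquery.Client()

# Placeholders: your own project, dataset, and table
table_id = "your-project.analytics.demeterics_interactions"

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.AVRO,
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
)

load_job = client.load_table_from_uri(
    "gs://your-bucket/exports/interactions_2025-11-30.avro",
    table_id,
    job_config=job_config,
)
load_job.result()  # wait for the load to finish
print(f"Loaded {client.get_table(table_id).num_rows} rows")
```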
Troubleshooting
401 Unauthorized
- Check that your API key is valid
- Ensure the key has the `export` scope enabled
403 Forbidden
- Your API key lacks the `export` scope
- Update key permissions in Settings → API Keys
404 Not Found
- The export request expired (10-minute TTL)
- Re-create the export job
500 Internal Server Error
- Date range may be too large
- Try a smaller date range or specific tables
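Because export jobs expire after 10 minutes, a common pattern is to re-create the job when the stream endpoint returns 404 and try once more. A minimal sketch of that recovery path; error handling is kept deliberately sparse:

```python
import requests

API_KEY = "dmt_your_api_key"
BASE_URL = "https://api.demeterics.com"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}
payload = {"format": "csv", "start_date": "2025-11-01", "end_date": "2025-11-30"}

def stream_export(request_id: str) -> requests.Response:
    return requests.get(f"{BASE_URL}/api/v1/exports/{request_id}/stream", headers=HEADERS)

export = requests.post(f"{BASE_URL}/api/v1/exports", headers=HEADERS, json=payload).json()
resp = stream_export(export["request_id"])
if resp.status_code == 404:
    # The 10-minute TTL elapsed: create a fresh job and stream again
    export = requests.post(f"{BASE_URL}/api/v1/exports", headers=HEADERS, json=payload).json()
    resp = stream_export(export["request_id"])
resp.raise_for_status()
```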