Elis AI — Technical Guide
Architecture Overview
Elis AI is a dynamic AI orchestration platform built on two core chain layers — router chains (selection) and agent chains (execution) — with distributed model inference, a self-improving feedback loop, and full trace observability.
┌────────────────────────────────────────────────────────┐
│ Frontend (Next.js 14) │
│ Chat UI ─ Automations ─ Editor ─ Blog ─ Analytics │
│ SSE Streaming │
└────────────────────┬───────────────────────────────────┘
│ HTTPS / JWT
┌────────────────────▼───────────────────────────────────┐
│ Backend (Flask + Gunicorn/Uvicorn) │
│ │
│ ┌─────────────┐ ┌──────────────┐ ┌───────────────┐ │
│ │ Router │ │ Agent │ │ Tool │ │
│ │ Chain │──│ Chain │──│ Executor │ │
│ │ (Selection) │ │ (Execution) │ │ (70+ tools) │ │
│ └──────┬──────┘ └──────┬───────┘ └───────┬───────┘ │
│ │ │ │ │
│ ┌──────▼──────────────▼───────────────────▼───────┐ │
│ │ Model Gateway / Miner Dispatch │ │
│ └───────────────────────┬───────────────────────────┘ │
│ │ │
│ ┌───────────┐ ┌───────▼──────┐ ┌─────────────────┐ │
│ │ PostgreSQL │ │ Redis │ │ Key Broker │ │
│ │ + pgvector │ │ (cache/Q) │ │ Service (KBS) │ │
│ └───────────┘ └──────────────┘ └─────────────────┘ │
└────────────────────────────────────────────────────────┘
│
┌────────────┼────────────┐
▼ ▼ ▼
┌─────────┐ ┌─────────┐ ┌─────────┐
│ OpenAI │ │ Ollama │ │ Self- │
│ Miners │ │ Miners │ │ Hosted │
└─────────┘ └─────────┘ └─────────┘
Tech Stack
| Layer | Technology | Version |
|---|---|---|
| Frontend | Next.js, React, Tailwind CSS | 14, 18, 3.x |
| Backend | Flask, Gunicorn, Uvicorn | Python 3.11+ |
| Database | PostgreSQL + pgvector | 16 |
| Cache/Queue | Redis | 7 |
| AI Models | OpenAI (GPT-5 family), Ollama (Qwen, Phi, Gemma) | — |
| Embeddings | pgvector (multi-dimension: 256–4096) | — |
| NLP | spaCy, Presidio (PII), Microsoft NER | — |
| SQL Agent | LangChain | — |
| Browser | Selenium, Playwright | — |
| Encryption | Key Broker Service (envelope encryption) | — |
| Email | Azure Communication Services | — |
| Container | Docker Compose (local), Azure Container Apps (prod) | — |
Core Systems
1. Router Chain (Selection Layer)
The router selects the optimal agent, model, and execution mode for each user prompt. No hardcoded routes exist — selection is entirely dynamic.
Decision pipeline:
User prompt
│
▼
Embed prompt (vector representation)
│
▼
Candidate retrieval (vector similarity search)
├── AgentRetriever → ranked agent candidates
├── TemplateRetriever → ranked template candidates
├── ToolBundleRetriever → ranked tool bundle candidates
└── ModelProfileRetriever → ranked model candidates
│
▼
Score & rank candidates
├── Similarity score (cosine distance)
├── Prior quality (EMA — exponential moving average)
└── Live feasibility (miner available? tools functional? data source linked?)
│
▼
Select: agent_id + model_id + miner_id + execution_mode
│
▼
Emit router trace (decision metadata for audit + learning)
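The score-and-rank step above can be sketched as follows. This is a minimal illustration, not the platform's actual scoring code: the `Candidate` fields, the linear blend, and the weights `w_sim` and `w_quality` are all assumptions chosen for the example. Feasibility is treated as a hard filter, so an unavailable candidate can never win on similarity alone.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    candidate_id: str
    similarity: float      # similarity to the prompt, higher is better
    prior_quality: float   # EMA of past quality scores in [0, 1]
    feasible: bool         # live check: miner up, tools working, data source linked


def score(c: Candidate, w_sim: float = 0.6, w_quality: float = 0.4) -> float:
    """Blend similarity with prior quality (weights are hypothetical)."""
    if not c.feasible:
        return 0.0
    return w_sim * c.similarity + w_quality * c.prior_quality


def select(candidates: list[Candidate]) -> Optional[Candidate]:
    """Return the best feasible candidate, or None if nothing is feasible."""
    feasible = [c for c in candidates if c.feasible]
    return max(feasible, key=score, default=None)
```

The same function could be applied separately to the agent, template, tool bundle, and model candidate lists before the final selection is assembled.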
Execution modes:
| Mode | Description | When Selected |
|---|---|---|
| direct_answer | Simple factual response, no tools | Low-complexity, factual queries |
| retrieval_augmented | Enhanced with KB/document retrieval | Queries matching uploaded documents |
| skill_chain | Multi-step: search → expertise → context → answer | Research-heavy queries |
| multi_step_workflow | Complex task decomposition with tool orchestration | Multi-part or ambiguous queries |
Non-negotiable rule: Router selection uses similarity, quality history, and feasibility. No prompt-specific routing or static model pinning.
2. Agent Chain (Execution Layer)
Agents are reusable blueprints that define how a class of queries should be handled.
Agent definition:
Agent
├── system_prompt — Domain-specific instructions
├── expert_domain — Category (finance, research, general, etc.)
├── tool_bundle — Curated set of tools this agent can use
├── execution_rules — Constraints (max tool loops, token budget)
├── quality_metrics — EMA scores from historical performance
└── examples — Sample prompts for similarity matching
Execution flow:
Router output (agent + model + mode)
│
▼
Build system prompt (agent instructions + context + data sources)
│
▼
Execute tool loop
├── Model generates tool calls (function calling)
├── Tools execute and return results
├── Results are injected into conversation context
├── Loop continues until agent produces final answer or budget exhausted
└── Complexity-scoped budgets control loop iterations
│
▼
Post-process response
├── Extract citations and source URLs
├── Build evidence blocks
├── Format tables, charts, images
└── Rehydrate redacted PII
│
▼
Stream response via SSE
│
▼
Emit trace packet (full execution audit)
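The tool loop in the execution flow above can be sketched as a bounded loop. This is an assumption-laden illustration: `model_step`, the step dictionary shape (`"final"` or `"tool"` plus `"input"`), and the `max_iterations` budget are hypothetical names, not the platform's real interfaces.

```python
from typing import Any, Callable


def run_tool_loop(
    model_step: Callable[[list[dict]], dict],
    tools: dict[str, Callable[[Any], Any]],
    messages: list[dict],
    max_iterations: int = 5,
) -> dict:
    """Call the model until it gives a final answer or the budget runs out.

    model_step(history) is assumed to return either {"final": text}
    or {"tool": name, "input": value}.
    """
    history = list(messages)  # copy so the caller's list is not mutated
    for iteration in range(1, max_iterations + 1):
        step = model_step(history)
        if "final" in step:
            return {"answer": step["final"], "iterations": iteration,
                    "budget_exhausted": False}
        # Execute the requested tool and inject its result into the context.
        output = tools[step["tool"]](step["input"])
        history.append({"role": "tool", "name": step["tool"], "content": output})
    return {"answer": None, "iterations": max_iterations, "budget_exhausted": True}
```

A complexity-scoped budget would then amount to choosing `max_iterations` per query, for example a small value for simple queries and a larger one for multi-step workflows.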
3. Feedback & Learning Loop
Every execution is traced, graded, and used to improve future routing.
Trace packet contents:
- Router decision metadata (candidates, scores, selection reason)
- Agent selection and execution mode
- Tool calls with inputs/outputs and timing
- Model I/O with token counts
- Total latency breakdown
- Error events (if any)
Grading pipeline:
Response
│
▼
ResponseEvaluator
├── Relevance (does it answer the question?)
├── Factual accuracy (validated via tools?)
├── Tool selection appropriateness
├── Clarity and completeness
└── Source quality
│
▼
Quality score (0.0 – 1.0)
│
▼
EMA update propagation
├── Agent profile → influences future agent selection
├── Model profile → influences future model selection
├── Template candidates → promote successful patterns
└── Tool bundle reliability → adjust tool availability
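The EMA update at the end of the grading pipeline can be written in a few lines. The sketch below uses the standard exponential moving average formula; the smoothing factor `alpha`, the profile key format, and the choice to seed a new profile with its first score are assumptions for illustration.

```python
def ema_update(previous: float, quality_score: float, alpha: float = 0.2) -> float:
    """Standard EMA: new = alpha * observation + (1 - alpha) * previous."""
    if not 0.0 <= quality_score <= 1.0:
        raise ValueError("quality_score must be in [0.0, 1.0]")
    return alpha * quality_score + (1.0 - alpha) * previous


def propagate(profiles: dict[str, float], keys: list[str],
              quality_score: float, alpha: float = 0.2) -> None:
    """Apply one graded run to every profile that took part in it."""
    for key in keys:
        # A profile with no history starts at the first observed score.
        previous = profiles.get(key, quality_score)
        profiles[key] = ema_update(previous, quality_score, alpha)
```

A larger `alpha` makes the profiles react faster to recent runs; a smaller one makes routing more stable but slower to adapt.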
4. Miner System (Distributed Inference)
Models run on distributed worker containers called "miners."
Registration flow:
1. The miner starts and registers over WebSocket using the enrollment secret.
2. The backend validates the secret and adds the miner to the pool.
3. Regular health heartbeats confirm availability.
4. The miner is classified into a tier: nano, small, frontier, or self-hosted.
Dispatch flow:
Agent needs model inference
│
▼
Model Gateway
├── Check model tier requirements
├── Filter available miners (healthy + matching model)
├── Score by reliability + latency
└── Select best candidate
│
▼
Dispatch to miner
│
├── Success → return response
└── Failure → reroute to next healthy miner (if capacity exists)
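The dispatch flow above amounts to filter, rank, then try candidates in order. The sketch below is hypothetical: the `Miner` fields, the ranking rule (reliability first, latency as tie-breaker), and the `call` function are assumptions, since the source does not specify how reliability and latency are combined.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Miner:
    miner_id: str
    models: set[str]
    healthy: bool
    reliability: float  # fraction of successful calls, higher is better
    latency_ms: float   # recent average latency, lower is better


def rank_miners(miners: list[Miner], model: str) -> list[Miner]:
    """Keep healthy miners serving the model, best candidate first."""
    eligible = [m for m in miners if m.healthy and model in m.models]
    return sorted(eligible, key=lambda m: (-m.reliability, m.latency_ms))


def dispatch(miners: list[Miner], model: str, prompt: str,
             call: Callable[[Miner, str], str]) -> tuple[str, str]:
    """Try miners in ranked order; reroute to the next one on failure."""
    for miner in rank_miners(miners, model):
        try:
            return miner.miner_id, call(miner, prompt)
        except Exception:
            continue  # failed miner: fall through to the next candidate
    raise RuntimeError("no healthy miner could serve the request")
```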
Model pair locking (consensus mode):
- For critical queries, two miners execute the same prompt.
- Responses are compared for consistency.
- If they agree → return result.
- If they disagree → flag for review or select higher-confidence response.
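A minimal sketch of the comparison step follows. It assumes each miner returns a dictionary with `text` and `confidence` keys and treats two responses as agreeing when their whitespace- and case-normalized text is identical. Both are simplifications for illustration; a real consistency check would likely compare meaning rather than exact text.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def consensus(resp_a: dict, resp_b: dict) -> dict:
    """Compare two miner responses of the form {"text": str, "confidence": float}."""
    if normalize(resp_a["text"]) == normalize(resp_b["text"]):
        return {"text": resp_a["text"], "agreed": True, "needs_review": False}
    # Disagreement: keep the higher-confidence response and flag it for review.
    best = resp_a if resp_a["confidence"] >= resp_b["confidence"] else resp_b
    return {"text": best["text"], "agreed": False, "needs_review": True}
```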
Tool Ecosystem (70+)
Web Research
| Tool | Purpose |
|---|---|
| web_search | General web search |
| web_search_discover | Explore related topics |
| web_content_reader | Extract main content from a URL |
| web_crawl | Multi-page site crawl |
| site_deep_dive | Scrape structured data |
| deep_search | Multi-step retrieval with refinement |
| trusted_web_search | Fact-checked search with validation |
| direct_url_lookup | Bypass search, fetch URL directly |
Browser Automation
| Tool | Purpose |
|---|---|
| browser_open_session | Start Selenium/Playwright session |
| browser_navigate | Go to URL |
| browser_type | Fill form fields |
| browser_click | Click elements |
| browser_read_dom | Extract page HTML |
| browser_screenshot | Visual capture |
| browser_network_traffic | Intercept network requests |
| browser_close_session | Clean up |
Data Collection
| Tool | Purpose |
|---|---|
| deep_collection | Systematic property/listing collection |
| save_collection_items | Store structured data |
| web_table_extractor | Parse HTML/CSV tables |
| extract_chart_data | OCR charts and graphs |
| merge_data_sources | Combine multiple data streams |
Database & SQL
| Tool | Purpose |
|---|---|
| SQL_agent_tool | Natural-language SQL query execution (LangChain) |
| sql_query | Direct query with result limiting |
Geographic & Spatial
| Tool | Purpose |
|---|---|
| geo_geocode | Address → lat/lon |
| geo_reverse | lat/lon → address |
| geo_normalize | Standardize location formats |
| geo_distance | Calculate distances |
| map_plot_points | Render map with data points |
| map_spatial_join | Spatial analysis |
Knowledge & Retrieval
| Tool | Purpose |
|---|---|
| conversation_retriever | Recall earlier conversation turns |
| document_query_retriever | Search uploaded documents |
| knowledge_base_service | Search ingested knowledge base |
| embedding_tool | Vector similarity search |
| evidence_retriever | Find supporting sources |
Content & Formatting
| Tool | Purpose |
|---|---|
| build_table_block | Generate markdown tables |
| build_chart_block | Chart data structures |
| generate_chart | Create visualizations |
| build_citation_block | Format references |
| html_to_markdown | Convert HTML content |
Validation & Trust
| Tool | Purpose |
|---|---|
| validate_claims | Fact-check assertions |
| contradiction_detector | Find inconsistencies |
| response_verifier | Cross-check results |
| link_safety_checker | Detect malicious URLs |
Specialized
| Tool | Purpose |
|---|---|
| rest_api_tool | Call custom REST APIs |
| decision_extractor | Parse decisions from text |
| constraint_extractor | Identify requirements |
| task_graph_builder | Decompose complex tasks |
Data Layer
PostgreSQL Schema (Key Tables)
agent_registry — Agent blueprints (system prompt, tools, domain)
templates — Agent templates with version history
model_registry — Model catalog + performance metrics
conversations — Multi-turn chat history
messages — Individual messages + trace JSON
widget_runs — Automation execution history
router_traces — Routing decisions for learning
documents — Uploaded documents metadata
companies — Organization records
company_members — Role assignments
data_sources — SQL/API/doc connections
Embedding Tables (pgvector)
| Table | Dimensions Supported |
|---|---|
| agents_embeddings | 256, 384, 768, 1024, 1536, 2048, 3072, 4096 |
| templates_embeddings | Same |
| tool_bundles_embeddings | Same |
| documents_embeddings | Same |
| company_docs_embeddings | Same |
Dimension-specific columns avoid the information loss that padding or truncating vectors to a single size would cause. The system selects the dimension based on the embedding model tier in use.
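One way to picture dimension-specific columns is a helper that maps a vector length to a column and builds a nearest-neighbour query. The column naming scheme (`embedding_<dim>`) is a hypothetical convention, not taken from the actual schema; `<=>` is pgvector's cosine distance operator, and `%s` marks driver-side parameters.

```python
SUPPORTED_DIMENSIONS = (256, 384, 768, 1024, 1536, 2048, 3072, 4096)
EMBEDDING_TABLES = (
    "agents_embeddings",
    "templates_embeddings",
    "tool_bundles_embeddings",
    "documents_embeddings",
    "company_docs_embeddings",
)


def embedding_column(dim: int) -> str:
    """Map a vector dimension to its column (naming is an assumption)."""
    if dim not in SUPPORTED_DIMENSIONS:
        raise ValueError(f"unsupported embedding dimension: {dim}")
    return f"embedding_{dim}"


def nearest_rows_sql(table: str, dim: int) -> str:
    """Build a parameterized cosine-distance query for one embedding table."""
    if table not in EMBEDDING_TABLES:
        raise ValueError(f"unknown embedding table: {table}")
    column = embedding_column(dim)
    # Table and column come from fixed allow-lists; the query vector and
    # the row limit are passed separately as query parameters.
    return f"SELECT * FROM {table} ORDER BY {column} <=> %s LIMIT %s"
```

Restricting table and column names to fixed allow-lists keeps the string formatting safe, since only the vector and the limit are user-dependent.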
Retrieval Pipeline
User query
│
▼
Embed query (match dimension to active model)
│
▼
Multi-stage search:
1. Agent similarity (AgentRetriever)
2. Template similarity (TemplateRetriever)
3. Tool bundle similarity (ToolBundleRetriever)
4. Model profile similarity (ModelProfileRetriever)
5. Document similarity (HybridRetrieval: semantic + keyword)
│
▼
Rank by relevance + feasibility
│
▼
Return candidates to router/agent
Redis Usage
- Session caching (JWT sessions, temporary state)
- Message buffer (SSE streaming backpressure)
- Async task queue (automation scheduling, training jobs)
Security Architecture
Authentication
- JWT-based authentication for all API endpoints.
- Email verification required (configurable).
- Optional two-factor authentication.
- Token refresh and expiry management.
PII Redaction (Presidio)
User input → PII Detector (emails, phones, SSN, CC, addresses)
│
├── Found → Replace with [REDACTED_TYPE] tokens
│ Store mapping in redaction context
│
▼
Send redacted text to model
│
▼
Receive model response → Rehydrate [REDACTED_TYPE] → Final output
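The redact-then-rehydrate round trip can be illustrated without Presidio. The sketch below swaps in two simple regular expressions as a stand-in for the real detector, so it is far less accurate than Presidio and covers only emails and US SSNs. The numbered token format (`[REDACTED_EMAIL_1]`) is an assumption that lets several values of the same type be restored unambiguously.

```python
import re

# Simplified stand-in patterns; a real deployment would use Presidio's analyzers.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str) -> tuple[str, dict[str, str]]:
    """Replace detected PII with tokens and return the token-to-value mapping."""
    mapping: dict[str, str] = {}
    counters: dict[str, int] = {}
    for pii_type, pattern in PATTERNS.items():
        def replace(match: re.Match, pii_type: str = pii_type) -> str:
            counters[pii_type] = counters.get(pii_type, 0) + 1
            token = f"[REDACTED_{pii_type}_{counters[pii_type]}]"
            mapping[token] = match.group(0)
            return token
        text = pattern.sub(replace, text)
    return text, mapping


def rehydrate(text: str, mapping: dict[str, str]) -> str:
    """Restore the original values in the model's response."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text
```

Only the redacted text is sent to the model; the mapping stays in the redaction context on the backend and is applied to the response before it is returned.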
Data Encryption
- At rest: Envelope encryption via Key Broker Service (KBS).
- In transit: HTTPS for all client-backend communication.
- Miner registration: Enrollment secret required.
- SQL credentials: Encrypted before storage, decrypted only at query time.
Audit Logging
- Every API call logs: user, company, action, timestamp.
- AI traces persist full execution details.
- Admin can inspect any conversation or run.
Deployment
Local Development
docker-compose up -d --build
Services started:
| Service | Port | Purpose |
|---|---|---|
| frontend | 3000 | Next.js UI |
| backend | 8000 | Flask API |
| postgres | 5432 | Primary database |
| redis | 6379 | Cache and queue |
| kbs | — | Key Broker Service |
| pgadmin | 5050 | Database admin UI |
| miner-openai-* | 8080+ | OpenAI model workers |
| miner-ollama-* | 8081+ | Ollama model workers |
Production (Azure Container Apps)
Docker build → Push to Azure Container Registry
│
▼
Azure Container Apps deploys:
├── Frontend container
├── Backend container
├── PostgreSQL (managed or container)
├── Redis (managed or container)
├── KBS container
└── Miner containers (scaled per demand)
Key environment variables:
| Variable | Purpose |
|---|---|
| DATABASE_URL | PostgreSQL connection string |
| REDIS_URL | Redis connection string |
| JWT_SECRET | Token signing key |
| DB_ENCRYPTION_KEY | Fernet key for envelope encryption |
| MINER_ENROLLMENT_SECRET | Miner registration auth |
| OPENAI_API_KEY | OpenAI model access |
| REDACT_ENABLED | Toggle PII redaction |
| AZURE_COMMUNICATION_SERVICES_CONNECTION_STRING | Email service |
| AUTH_REQUIRE_EMAIL_VERIFICATION | Email verification toggle |
Scaling
- Horizontal: Add miner containers for more inference capacity.
- Vertical: Increase container resources for backend or database.
- Queue-based: Redis absorbs burst traffic; workers process asynchronously.
- Health-based: Degraded miners are automatically removed from the pool.
API Overview
Core Endpoints
| Method | Path | Purpose |
|---|---|---|
| POST | /api/auth/register | Create account |
| POST | /api/auth/login | Authenticate |
| POST | /api/conversations | Create conversation |
| POST | /api/conversations/{id}/messages | Send message |
| GET | /api/conversations/{id}/messages | Get messages (SSE stream) |
| GET | /api/conversations/{id}/trace | Get execution trace |
| POST | /api/automations | Create automation widget |
| POST | /api/automations/{id}/run | Trigger manual run |
| GET | /api/automations/{id}/runs | Get run history |
| POST | /api/datasources | Add data source |
| GET | /api/agents | List available agents |
| GET | /api/models | List available models |
| GET | /api/analytics/runs | Query run analytics |
SSE Streaming Protocol
Responses stream via Server-Sent Events:
event: token
data: {"content": "The", "index": 0}
event: token
data: {"content": " answer", "index": 1}
event: source
data: {"url": "https://...", "title": "..."}
event: tool_call
data: {"tool": "web_search", "input": "...", "output": "..."}
event: done
data: {"trace_id": "abc-123", "tokens_used": 847}
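A client can consume this stream with a small parser. The sketch below assumes, as in the sample above, that every event carries exactly one `data:` line containing JSON; multi-line data fields, comments, and reconnection handling from the full SSE specification are deliberately left out. The function names are illustrative.

```python
import json
from typing import Iterable, Iterator, Optional


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[Optional[str], dict]]:
    """Yield (event_name, payload) pairs from an iterable of text lines."""
    event = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            yield event, json.loads(line[len("data:"):].strip())
            event = None  # an event name applies to one data line only


def collect_answer(lines: Iterable[str]) -> tuple[str, Optional[str]]:
    """Concatenate token events and return (answer_text, trace_id)."""
    parts, trace_id = [], None
    for event, data in parse_sse(lines):
        if event == "token":
            parts.append(data["content"])
        elif event == "done":
            trace_id = data["trace_id"]
    return "".join(parts), trace_id
```

The `trace_id` from the `done` event can then be used to fetch the execution trace for the same run.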
Debugging & Diagnostics
Trace Inspection
Every run produces a trace accessible via API or the admin UI:
{
"trace_id": "abc-123",
"router": {
"candidates": [...],
"selected_agent": "financial-analyst-v3",
"selected_model": "gpt-5",
"confidence": 0.92,
"execution_mode": "skill_chain"
},
"tool_calls": [
{"tool": "web_search", "input": "...", "output": "...", "latency_ms": 340},
{"tool": "build_table_block", "input": "...", "output": "...", "latency_ms": 12}
],
"model_io": {
"input_tokens": 2400,
"output_tokens": 847,
"model": "gpt-5",
"miner_id": "miner-openai-1"
},
"total_latency_ms": 3200,
"quality_score": 0.88
}
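When reading a trace like the one above, a quick latency breakdown is often the first thing to compute. The helper below is an illustrative sketch that assumes tool calls run sequentially and that their latencies are included in `total_latency_ms`; neither is confirmed by the trace format itself.

```python
def summarize_trace(trace: dict) -> dict:
    """Split total latency into tool time and everything else."""
    tool_calls = trace.get("tool_calls", [])
    tool_ms = sum(call["latency_ms"] for call in tool_calls)
    slowest = max(tool_calls, key=lambda call: call["latency_ms"], default=None)
    return {
        "tool_ms": tool_ms,
        # Remainder covers model inference, routing, and post-processing.
        "other_ms": trace["total_latency_ms"] - tool_ms,
        "slowest_tool": slowest["tool"] if slowest else None,
    }
```

For the sample trace this attributes 352 ms to tools and the remaining 2848 ms to everything else, which points at model inference rather than tooling as the dominant cost.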
Diagnostic Checklist
When diagnosing issues, validate the full path in order:
1. Auth — Is the user authenticated? Valid JWT?
2. Conversation — Does the conversation exist? Correct company scope?
3. Message — Was the message persisted? Is it in the correct format?
4. Router — Did the router select an agent? Check trace for candidates and scores.
5. Agent — Did the agent execute? Check tool calls and model I/O.
6. Trace — Is the trace complete? All fields populated?
7. Response — Is the response non-empty? Does it answer the question?
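Step 6 of the checklist can be partly automated. The sketch below checks a trace dictionary for the top-level fields shown in the sample trace; treating exactly these fields as required is an assumption, and an empty `tool_calls` list is accepted because a direct answer may legitimately use no tools.

```python
REQUIRED_TRACE_FIELDS = (
    "trace_id",
    "router",
    "tool_calls",
    "model_io",
    "total_latency_ms",
    "quality_score",
)


def missing_trace_fields(trace: dict) -> list[str]:
    """Return the required fields that are absent, None, or empty strings/dicts."""
    missing = []
    for field in REQUIRED_TRACE_FIELDS:
        value = trace.get(field)
        if value is None or value == "" or value == {}:
            missing.append(field)
    return missing
```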
Health Checks
- Backend: GET /api/health
- Miner: WebSocket heartbeat (periodic ping)
- Database: Connection pool monitoring
- Redis: PING command
Log Inspection
# Backend logs
docker logs backend --tail 200
# Miner logs
docker logs miner-openai-1 --tail 200
# Database logs
docker logs postgres --tail 200
Development Workflow
Adding a New Tool
- Define the tool function in the tool executor module.
- Register it in the tool registry with name, description, and parameters.
- Add to relevant tool bundles (or create a new bundle).
- Embed the tool bundle description for similarity matching.
- Test with a probe prompt that should trigger the tool.
- Verify the tool appears in trace output.
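The first two steps above (define the function, register it with name, description, and parameters) might look like the following. The registry structure, the `register_tool` decorator, and the `word_count` example tool are all hypothetical and only illustrate the pattern; the real tool executor module may be organized differently.

```python
from typing import Any, Callable

# Hypothetical in-memory registry: tool name -> metadata and handler.
TOOL_REGISTRY: dict[str, dict[str, Any]] = {}


def register_tool(name: str, description: str, parameters: dict[str, str]) -> Callable:
    """Decorator that records a tool's metadata alongside its handler."""
    def decorator(func: Callable) -> Callable:
        if name in TOOL_REGISTRY:
            raise ValueError(f"tool already registered: {name}")
        TOOL_REGISTRY[name] = {
            "description": description,
            "parameters": parameters,
            "handler": func,
        }
        return func
    return decorator


@register_tool(
    name="word_count",
    description="Count the words in a piece of text",
    parameters={"text": "string"},
)
def word_count(text: str) -> int:
    return len(text.split())


def execute_tool(name: str, **kwargs: Any) -> Any:
    """Look up a registered tool by name and run it."""
    return TOOL_REGISTRY[name]["handler"](**kwargs)
```

The description stored at registration time is also the natural text to embed when the tool's bundle is indexed for similarity matching.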
Adding a New Agent
- Define agent blueprint: system prompt, expert domain, tool bundle, examples.
- Insert into agent_registry table.
- Embed the agent description and examples.
- Router will automatically consider it for matching prompts.
- Monitor quality scores after deployment.
Testing
- Single-prompt probes: Test one question, inspect trace.
- 10-industry benchmark: Automated evaluation across verticals.
- Automation tests: Validate widget scheduling and report generation.
- E2E tests: Browser-based tests for full user flows.
Deployment Checklist
- Run tests locally.
- Build Docker images.
- Push to container registry.
- Deploy to Azure Container Apps.
- Verify health endpoints.
- Run a single-prompt probe to confirm end-to-end.
- Monitor dashboards for anomalies.