Deployment Notes
Updated March 25, 2026.
This document intentionally excludes live secrets. Store production credentials in Azure Container Apps secrets, GitLab CI variables, or a dedicated secret manager instead of committing them to source control.
Immediate Follow-Up
- Rotate any credentials that were previously written into this file.
- Confirm the rotated values are updated in every deployed environment before the next release.
Pending Migration — token_transactions (March 19, 2026)
The token_transactions table had a stale CHECK constraint (token_transactions_tx_type_check) that only allowed: free_daily, miner_earned, package_purchase, api_spend, chat_spend, admin_grant, refund. The code now also uses subscription_grant and stripe_topup, which were blocked by this constraint.
This is now handled by the auto-discovery migration system (see below). No manual SQL is required — just deploy the new backend image and restart.
Token ledger / chat settlement (March 25, 2026)
Chat billing settlement writes miner_earned rows that must satisfy the same token_transactions columns as chat_spend (including balance_after when that column exists). Migration 0014_token_transactions_ledger_columns ensures:
- balance_after and reference_id exist when an older database predates them.
- balance_after is backfilled as a running sum of amount per user_id (ordered by created_at, id) only where it was NULL, so existing rows are not overwritten.
- reference_id is filled from reference_key where reference_id is still null.
- NOT NULL is applied to balance_after only after every row has a value (avoids breaking partially migrated databases).
Data on Azure is preserved (no DELETE/TRUNCATE). Deploy the backend image and restart; migrations run automatically. If 0014 exits early because some rows still have balance_after IS NULL, fix data manually or re-run after cleanup — the app still works when balance_after is absent or nullable, but production schemas that require NOT NULL should complete the backfill.
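The backfill rule for balance_after can be illustrated in plain Python. This is a minimal sketch of the semantics only, not the code in 0014_token_transactions_ledger_columns; the function name and the dict-based row shape are assumptions made for illustration.

```python
from collections import defaultdict


def backfill_balance_after(rows):
    """Return copies of rows with balance_after filled only where it was None.

    Illustrative only: each row is a dict with user_id, created_at, id,
    amount, and balance_after keys (an assumed in-memory shape).
    """
    running = defaultdict(int)  # running sum of amount per user_id
    ordered = sorted(rows, key=lambda r: (r["user_id"], r["created_at"], r["id"]))
    result = []
    for row in ordered:
        running[row["user_id"]] += row["amount"]
        updated = dict(row)
        if updated["balance_after"] is None:  # never overwrite existing values
            updated["balance_after"] = running[row["user_id"]]
        result.append(updated)
    return result
```

Rows that already carry a balance keep it, which mirrors the "only where it was NULL" rule and keeps a repeated run harmless.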
Database Migrations
The project uses a lightweight auto-discovery migration runner in backend/migrations/. Migrations run automatically on backend startup via init_schema() and are tracked in the schema_migrations table so each one executes at most once.
How it works
- Place a numbered Python file in backend/migrations/ (e.g. 0002_add_foo_column.py).
- The file must expose a run(cur) function that receives an open psycopg2 cursor.
- On startup, the runner sorts all [0-9]*.py files, skips already-applied ones, and executes the rest in order.
- Applied migrations are recorded in schema_migrations(name, applied_at).
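The discovery steps above can be sketched as a pure function. This is a hedged illustration, not the runner in backend/migrations/: pending_migrations is a hypothetical name, and the sketch assumes that sorting is lexicographic and that the recorded migration name is the filename without its .py suffix.

```python
import fnmatch
from pathlib import PurePosixPath


def pending_migrations(filenames, applied):
    """Return migration names to run, in order, skipping applied ones."""
    # Same glob as in the description: files whose name starts with a digit.
    candidates = sorted(
        name for name in filenames if fnmatch.fnmatchcase(name, "[0-9]*.py")
    )
    names = [PurePosixPath(name).stem for name in candidates]
    return [name for name in names if name not in applied]
```

With lexicographic ordering, zero-padded prefixes (0001, 0014, 0036) are what keep the execution order stable as the list grows.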
Adding a new migration
# backend/migrations/0002_describe_the_change.py
"""Brief description of what this migration does."""
def run(cur) -> None:
    cur.execute("ALTER TABLE ... ")
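Because a migration may be re-run by hand after a partial failure, it helps to keep its SQL idempotent. The sketch below assumes a hypothetical table example_items and column note, neither of which appears in this project; only the run(cur) contract comes from the runner description. RecordingCursor is an illustrative stand-in for a psycopg2 cursor so the migration can be dry-run without a database.

```python
"""Add a nullable note column to example_items (hypothetical table)."""


def run(cur) -> None:
    # IF NOT EXISTS makes the statement safe to execute more than once.
    cur.execute("ALTER TABLE example_items ADD COLUMN IF NOT EXISTS note TEXT")


class RecordingCursor:
    """Minimal cursor stand-in that records SQL instead of executing it."""

    def __init__(self):
        self.statements = []

    def execute(self, sql, params=None):
        self.statements.append(sql)
```

Calling run(RecordingCursor()) and inspecting statements is a cheap way to review the SQL a migration would issue before it reaches Azure.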
Current migrations
| Name | Description |
|---|---|
| backfill_trace_json | (legacy) Convert TracePacket repr strings to JSON |
| 0001_fix_token_transactions_constraint | Drop stale tx_type CHECK, ensure reference_key + metadata_json columns |
| 0014_token_transactions_ledger_columns | Add/backfill balance_after + reference_id on token_transactions (chat settlement) |
| 0036_automation_chunks | Create automation_chunks pgvector table with HNSW index for widget report vector storage |
| 0037–0044 | Crawl/pagination policy, embedding dimensions, user ML system (see backend/migrations/) |
Note (WS2-WS5): No new migrations added. Tier gating, shared pool, and token tracking use existing tables with metadata-level changes only (metadata_json fields on automation_chunks, new tx_type values in token_transactions).
Azure deployment notes
- Migrations auto-apply on restart — push the new image and restart the container app.
- To verify: SELECT * FROM schema_migrations ORDER BY applied_at;
- If the backend cannot be restarted, run the migration SQL manually against the Azure PostgreSQL instance. The SQL for each migration is in its file under backend/migrations/.
- Template for env vars (no secrets committed): see config/deployment.env.template.
On-prem bundle publication
The downloadable on-prem bundle is now built and published separately from the backend image.
- Build the archive only: scripts/build_onprem_bundle.ps1
- Build, upload to Azure Blob Storage, and update backend env vars: scripts/publish_onprem_bundle.ps1
- deploy.ps1 -Target backend will publish the on-prem bundle automatically when ONPREM_BUNDLE_BLOB_CONNECTION_STRING is set in the shell and -SkipOnPremBundlePublish is not used.
Required env var for publishing:
- ONPREM_BUNDLE_BLOB_CONNECTION_STRING
Required env vars for hosted issuance of signed on-prem activation codes:
- LICENSE_PUBLIC_KEY_PEM
- LICENSE_SIGNING_PRIVATE_KEY_PEM
Optional env vars:
- ONPREM_BUNDLE_BLOB_CONTAINER
- ONPREM_BUNDLE_BLOB_NAME
- ONPREM_BUNDLE_VERSION
- ONPREM_BUNDLE_CHECKSUM_SHA256
- ONPREM_BUNDLE_PUBLISHED_AT
On-prem bootstrap env vars are separate from hosted Azure deployment and should not be set on the shared production apps:
- ONSITE_BOOTSTRAP_TOKEN
- ONSITE
- NEXT_PUBLIC_ONSITE
These are bundle/runtime settings for customer-managed onsite installs only. They are not required for deploy.ps1 Azure Container Apps deploys and should remain unset on the hosted backend/frontend unless you are intentionally deploying a dedicated onsite-style environment.
If LICENSE_PUBLIC_KEY_PEM and LICENSE_SIGNING_PRIVATE_KEY_PEM are missing on the hosted backend, on-prem checkout stays degraded by design:
- readiness exposes onprem_license_signing_not_configured
- hosted billing shows on-prem checkout as not ready
- activation payload issuance returns pending_signing_key until the signing keys are configured
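The degraded-by-design behaviour can be pictured as a small readiness check. This is a sketch under assumptions: onprem_signing_status and the shape of the returned dict are hypothetical, and the backend may represent readiness differently. Only the two env var names and the two status strings come from this document.

```python
import os


def onprem_signing_status(env=None):
    """Summarize on-prem signing readiness from environment variables."""
    env = os.environ if env is None else env
    required = ("LICENSE_PUBLIC_KEY_PEM", "LICENSE_SIGNING_PRIVATE_KEY_PEM")
    # Treat empty or whitespace-only values as missing (an assumption).
    missing = [name for name in required if not env.get(name, "").strip()]
    if missing:
        return {
            "ready": False,
            "readiness_flag": "onprem_license_signing_not_configured",
            "activation_status": "pending_signing_key",
            "missing": missing,
        }
    return {"ready": True, "readiness_flag": None, "activation_status": None, "missing": []}
```

Reporting which variable is missing, rather than a bare boolean, makes a half-configured key pair easier to spot during a deploy review.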
Production env parity check (April 7, 2026)
Validated via Azure CLI (az containerapp show).
- Resource group: amodelis-prod
- App: backend
- Revision: backend--0000171
- Image: [private container registry]/backend:v1.0.88
All concurrency-related env vars confirmed present:
| Variable | Value on Azure | Status |
|---|---|---|
| GUNICORN_WORKERS | 2 | ✅ Correct for 2 vCPU |
| GUNICORN_WORKER_CONNECTIONS | 2000 | ✅ |
| GUNICORN_TIMEOUT | 600 | ✅ |
| ORCH_THREAD_POOL_WORKERS | 8 | ✅ |
| ORCH_STREAM_RUN_TIMEOUT_SECONDS | 300 | ✅ |
| ORCH_STATUS_POLLING_TIMEOUT_GRACE_S | 120 | ✅ |
| ORCH_TOOL_LOOP_MAX_LATENCY_MS | 300000 | ✅ |
| ORCH_MAX_RESPONSE_TIME_SECONDS | 300 | ✅ |
| ORCH_MINER_PROXY_TIMEOUT_SECONDS | 240 | ✅ |
| ORCH_IGNORE_USER_PER_REQUEST_TOKEN_LIMIT | 1 | ✅ |
| MINER_HTTP_SYNC_WAIT_S | 90 | ✅ |
| MINER_QUEUE_SYNC_WAIT_S | 60 | ✅ |
| MINER_INTERNAL_MODE | 1 | ✅ |
| DB_POOL_MIN | 3 | ✅ |
| DB_POOL_MAX | 12 | ✅ Sized for Postgres B1ms |
| DAILY_FREE_TOKEN_AMOUNT | 500000 | ✅ |
| LOG_LEVEL | ERROR | ✅ |
| ORCH_STREAM_HARD_TIMEOUT_ENABLED | 1 | ✅ |
| BROKER_ENABLED | 1 | ✅ Broker (narrator) framework owns the user-facing stream and per-conversation task queue |
| BROKER_RUNNING_SLOT_CAP | 3 | ✅ Max concurrent tasks per conversation before broker prompts user to reorder/cancel |
Verify current env vars:
az containerapp show --name backend --resource-group amodelis-prod \
--query "properties.template.containers[0].env[].{name: name, value: value}" -o table
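The parity check can also be scripted by running the same az command with -o json instead of -o table and comparing the result with the expected values. The sketch below is illustrative: env_parity_diff and the EXPECTED subset are assumptions, and variables backed by a secretRef carry no plain value in the az output, so they would appear here as None.

```python
import json

# Subset of the table above, used only as sample expectations.
EXPECTED = {
    "GUNICORN_WORKERS": "2",
    "DB_POOL_MIN": "3",
    "DB_POOL_MAX": "12",
    "BROKER_ENABLED": "1",
}


def env_parity_diff(expected, az_env_json):
    """Compare expected values with a JSON list of {name, value} objects."""
    actual = {item["name"]: item.get("value") for item in json.loads(az_env_json)}
    diff = {}
    for name, want in expected.items():
        got = actual.get(name)
        if got != want:
            diff[name] = {"expected": want, "actual": got}
    return diff
```

An empty dict means every expected variable is present with the expected value; anything else lists the drift per variable.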
Automation Vector Store (April 7, 2026)
Widget report data is now chunked, embedded, and stored in a pgvector table (automation_chunks) for semantic recall in chat and cross-widget pipelines.
Activation
The feature is off by default. Set this env var on the backend to enable:
| Variable | Value | Purpose |
|---|---|---|
| AUTOMATION_VECTORSTORE_ENABLED | 1 | Enable ingestion of widget report data into automation_chunks and recall in chat/pipelines. Default 0 (disabled). |
What it does when enabled
- Ingestion: After every successful widget run, report findings, metrics, source evidence, and table rows are chunked (512 tokens, 64 overlap), embedded via the existing EmbeddingService, and bulk-inserted into automation_chunks.
- Chat recall: recall_artifacts_for_prompt() searches automation_chunks by cosine similarity (0.10 boost, below company docs' 0.15). Each recalled chunk gets a freshness tag ([FRESH], [RECENT], [AGING], [STALE]) based on collected_at.
- Cross-widget recall: Downstream widgets with upstream_widget_id or source_config.recall_from_widgets automatically receive upstream data in their prompt, gated by max_upstream_age_hours (default: stale_after_hours or 48h).
- Retention: delete_expired_widget_reports() cascade-deletes associated chunks. Per-widget, only the latest 2 runs' chunks are kept to prevent index bloat.
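The 512-token window with 64 tokens of overlap mentioned under Ingestion amounts to a sliding window that advances 448 tokens at a time. The sketch below illustrates that arithmetic only; chunk_tokens is a hypothetical helper, and it takes a pre-tokenized list because the tokenizer used by the real pipeline is not described here.

```python
def chunk_tokens(tokens, size=512, overlap=64):
    """Split a token list into windows of `size` sharing `overlap` tokens."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap must be in [0, size)")
    step = size - overlap  # 448 with the defaults
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reached the end of the input
    return chunks
```

For a 1,000-token report this yields three chunks starting at offsets 0, 448, and 896, with the final chunk shorter than the rest.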
Migration
Migration 0036_automation_chunks runs automatically on startup. It creates:
- automation_chunks table with embedding vector(1536), metadata columns (widget_id, run_id, report_id, company_id, chunk_type, source_urls, entity_tags, metric_tags, metric_value, collected_at)
- HNSW index on embedding for cosine distance search
- 7 supplementary indexes for filtering by company, widget, report, run, collected_at, and compound (company_id, chunk_type)
Verify post-deploy:
SELECT * FROM schema_migrations WHERE name = '0036_automation_chunks';
SELECT count(*) FROM automation_chunks; -- should be 0 until first widget runs complete
Resource impact
- Embedding cost: ~$0.04/day at embedding API rates for 500 daily widget runs × 20 chunks.
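- Storage: a vector(1536) embedding alone takes about 6 KB (4 bytes per dimension), so budget roughly 6-8 KB per chunk including content and metadata. 10K chunks ≈ 60-80 MB before index overhead, which is still small relative to current production Postgres storage.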
- Latency: Ingestion is async (runs after report creation, does not block the widget run response). Recall adds one pgvector query (~5-15ms with HNSW index) to the existing recall pipeline.
Widget configuration (optional, for cross-widget recall)
| Widget field | Type | Purpose |
|---|---|---|
| upstream_widget_id | string | Widget whose latest report data is injected into this widget's prompt |
| source_config.recall_from_widgets | string[] | Additional widget IDs to recall data from |
| source_config.max_upstream_age_hours | int | Max age of upstream chunks to use (default: stale_after_hours or 48) |
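The default chain for the upstream age limit (explicit value, then stale_after_hours, then 48) can be sketched as follows. This is an assumption-laden illustration: resolve_max_upstream_age_hours is a hypothetical name, the widget is modelled as a plain dict, and stale_after_hours is assumed to sit at the top level of the widget rather than inside source_config.

```python
DEFAULT_MAX_UPSTREAM_AGE_HOURS = 48


def resolve_max_upstream_age_hours(widget):
    """Pick the effective age limit for upstream chunks, in hours."""
    source_config = widget.get("source_config") or {}
    candidates = (
        source_config.get("max_upstream_age_hours"),  # explicit override
        widget.get("stale_after_hours"),              # widget-level fallback
    )
    for candidate in candidates:
        if candidate is not None:
            return int(candidate)
    return DEFAULT_MAX_UPSTREAM_AGE_HOURS
```

The sketch treats only a missing value as unset; whether the real code also treats 0 as unset is not stated in this document.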
Rollback
To disable without data loss, unset AUTOMATION_VECTORSTORE_ENABLED (or set to 0). Ingestion and recall are fully gated — existing chunks remain in the table but are not queried. To remove the table entirely:
DROP TABLE IF EXISTS automation_chunks;
DELETE FROM schema_migrations WHERE name = '0036_automation_chunks';
Tier Gating, Shared Data Pool & Token Tracking (April 9, 2026)
This release adds plan-based feature gating, cross-company data sharing, and token usage tracking. No new database migrations are required — all features use existing tables (automation_chunks, token_transactions, company_subscriptions) with metadata-level changes only.
What changed
| Area | Backend files | Frontend files |
|---|---|---|
| Tier gating | agents/plan_gate.py (new), billing.py, automations_api.py, agents/automations_service.py | components/automations/widget-editor.tsx, (app)/billing/page.tsx, lib/types.ts |
| Shared data pool | agents/shared_pool.py (new), agents/collection_store.py, agents/db.py, agents/adaptive_persistence.py | (none) |
| Token tracking | agents/orchestration.py, agents/embeddings.py, agents/automations_service.py, company_api.py, agents/db.py | (app)/companies/[id]/page.tsx, lib/company-api.ts |
| Content pages | (none) | app/pricing/page.tsx (new), app/tutorials/user/data-collection/page.tsx (new), app/page.tsx, app/docs/page.tsx |
| Real-time extraction | agents/collection_extract.py (new), agents/tools/tool_calling_service.py | (none) |
Tier gating details
Plan features are defined in billing.py under each plan's features dict and FREE_TIER_FEATURES. The plan_gate.py service resolves a company's active plan and returns feature limits used at five enforcement points:
- Widget creation limit: automations_api.py checks max_widgets
- Deep collection access: automations_service.py checks deep_collection
- Widget visibility: Public widgets require shared_pool_access
- Crawl page limit: Page budget per widget run gated by plan tier
- Billing page: Frontend shows plan-specific feature bullets
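As an illustration of the first enforcement point, a widget-creation check against max_widgets might look like the sketch below. It is not the code in plan_gate.py or automations_api.py: check_widget_creation is a hypothetical name, features is assumed to be a plain dict, and treating a missing limit as unlimited is an assumption.

```python
def check_widget_creation(features, current_widget_count):
    """Return (allowed, reason) for creating one more widget."""
    limit = features.get("max_widgets")
    if limit is None:
        # Assumption: a plan without max_widgets places no cap on widgets.
        return True, None
    if current_widget_count >= limit:
        return False, f"plan allows at most {limit} widgets"
    return True, None
```

Returning a reason alongside the boolean lets the API surface a plan-specific message instead of a generic rejection.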
No env vars required. Gating reads directly from company_subscriptions + billing.py plan definitions.
Shared data pool details
Widgets marked visibility=public with source_class=public_web contribute chunks to the shared pool. Other companies can recall these chunks during chat via shared_pool.py → search_shared_pool_chunks() in db.py. Chunks from the requester's own company are excluded. PII fields (source_urls, entity_tags) are redacted before delivery.
- Daily query limit: 50 per company (configurable in shared_pool.py)
- Contributor credits: shared_data_credit tx_type in token_transactions
- No new tables: queries the existing automation_chunks table filtering on metadata_json fields
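The two delivery rules for shared chunks (exclude the requester's own company, redact source_urls and entity_tags) can be sketched as a post-processing step. This is a hedged illustration, not search_shared_pool_chunks(): prepare_shared_chunks is a hypothetical name, chunks are modelled as dicts, and redaction is assumed to mean dropping the fields rather than masking them.

```python
REDACTED_FIELDS = ("source_urls", "entity_tags")


def prepare_shared_chunks(chunks, requester_company_id):
    """Filter out own-company chunks and strip redacted fields."""
    shared = []
    for chunk in chunks:
        if chunk.get("company_id") == requester_company_id:
            continue  # the requester never receives its own chunks back
        cleaned = {k: v for k, v in chunk.items() if k not in REDACTED_FIELDS}
        shared.append(cleaned)
    return shared
```

Building a new dict per chunk leaves the stored rows untouched, so redaction applies only to what leaves the pool.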
Token tracking details
Three new tx_type values in token_transactions:
- automation_spend — logged after each automation widget run
- embedding_spend — logged per embedding batch
- shared_data_credit — credited to data contributors
The 0001 migration already dropped the tx_type CHECK constraint, so these values require no schema changes.
New API endpoints (role-gated to dev+):
- GET /api/companies/<id>/usage?days=30 — aggregate token usage summary
- GET /api/companies/<id>/usage/members?days=30 — per-member breakdown
Verify post-deploy
-- Confirm tx_type CHECK is dropped (should return 0 rows)
SELECT conname FROM pg_constraint
WHERE conrelid = 'token_transactions'::regclass AND contype = 'c';
-- Confirm automation_chunks has metadata fields
SELECT DISTINCT metadata_json->>'source_class', metadata_json->>'widget_visibility'
FROM automation_chunks LIMIT 10;
Rollback
All WS2-WS5 features are code-only (no schema migrations). To roll back, deploy the previous backend + frontend image tags. No database changes to revert.
Production Endpoints
- Frontend: https://tryelisai.com (also https://www.tryelisai.com)
- Backend API: https://api.tryelisai.com
Legacy Azure URLs (redirect to custom domain):
- Backend: https://backend.calmocean-1927d5f8.eastus.azurecontainerapps.io
- Frontend: https://frontend.calmocean-1927d5f8.eastus.azurecontainerapps.io (301 → tryelisai.com)
Custom Domain
- Domain: tryelisai.com (registered via Azure App Service Domains)
- DNS Zone: Azure DNS in resource group amodelis-prod
- Container Apps Environment: amodelis-env
- Environment Static IP: 52.188.74.22
DNS Records
| Type | Name | Value |
|---|---|---|
| A | @ | 52.188.74.22 |
| CNAME | www | frontend.calmocean-1927d5f8.eastus.azurecontainerapps.io |
| CNAME | api | backend.calmocean-1927d5f8.eastus.azurecontainerapps.io |
| TXT | asuid | 7722DCC994F5A7F153C7DC7BB498B1F3F8A06136A614C6E571EF4DAE049EAAD2 |
| TXT | asuid.www | (same verification ID) |
| TXT | asuid.api | (same verification ID) |
Hostname Bindings
| Hostname | Container App | Certificate |
|---|---|---|
| tryelisai.com | frontend | Managed (auto-renewed) |
| www.tryelisai.com | frontend | Managed (auto-renewed) |
| api.tryelisai.com | backend | Managed (auto-renewed) |
The old Azure frontend URL is redirected to https://tryelisai.com via Next.js middleware (frontend/middleware.ts).
Required Runtime Secrets
Backend:
- DATABASE_URL
- REDIS_URL
- JWT_SECRET
- OPENAI_API_KEY
- ANTHROPIC_API_KEY: Claude models (sk-ant-…)
- GOOGLE_API_KEY: Gemini models
- GROQ_API_KEY: Groq-hosted Llama / Mixtral / DeepSeek (gsk_…)
- MISTRAL_API_KEY: Mistral cloud models
- CEREBRAS_API_KEY: Cerebras Wafer-Scale models (csk-…)
- STRIPE_SECRET_KEY
- STRIPE_PUBLISHABLE_KEY
- APP_BASE_URL
- FRONTEND_BASE_URL
- DAILY_FREE_TOKEN_AMOUNT (use 500000 for production; 1000 is a common typo and would show as ~1.0K in the UI; the backend now rejects sub-10K values unless ALLOW_LOW_DAILY_FREE_TOKENS=1)
- AUTH_REQUIRE_EMAIL_VERIFICATION
- AUTH_REQUIRE_EMAIL_2FA
- AZURE_COMMUNICATION_SERVICES_CONNECTION_STRING
- AZURE_COMMUNICATION_EMAIL_SENDER
- LICENSE_PUBLIC_KEY_PEM when issuing signed on-prem activation codes
- LICENSE_SIGNING_PRIVATE_KEY_PEM when issuing signed on-prem activation codes
Frontend build/runtime:
- INTERNAL_API_URL
- NEXT_PUBLIC_API_URL
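The guard on DAILY_FREE_TOKEN_AMOUNT described in the backend list (values below 10K are rejected unless ALLOW_LOW_DAILY_FREE_TOKENS=1) could be expressed as a startup check like the one below. This is a sketch only: validate_daily_free_tokens is a hypothetical name and the backend's actual error handling is not shown in this document.

```python
MIN_DAILY_FREE_TOKENS = 10_000


def validate_daily_free_tokens(env):
    """Return the configured amount, rejecting suspiciously low values."""
    amount = int(env["DAILY_FREE_TOKEN_AMOUNT"])
    allow_low = env.get("ALLOW_LOW_DAILY_FREE_TOKENS") == "1"
    if amount < MIN_DAILY_FREE_TOKENS and not allow_low:
        raise ValueError(
            f"DAILY_FREE_TOKEN_AMOUNT={amount} is below {MIN_DAILY_FREE_TOKENS}; "
            "set ALLOW_LOW_DAILY_FREE_TOKENS=1 if this is intentional"
        )
    return amount
```

Failing at startup turns the 1000-versus-500000 typo into a deploy error instead of a silent change in user quotas.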
Backend environment variables (miners, CORS, browser service)
Set these on the backend Container App (or in CI) when you use external miners, browser-based fetching, or custom domains:
| Variable | Purpose |
|---|---|
| CORS_ALLOWED_ORIGINS | Comma-separated browser origins allowed to call the API. Production: include https://tryelisai.com, https://www.tryelisai.com, and any staging origins. |
| MINER_INTERNAL_MODE | 1 / true = backend uses built-in virtual miners (OpenAI keys on the backend). 0 / false = dispatch jobs to registered external miners. Production with community miners: use 0. |
| MINER_INTERNAL_OWNER_USER_ID | Optional explicit user id to attribute internal-miner earnings when MINER_INTERNAL_MODE is on. |
| MINER_INTERNAL_OWNER_EMAIL | Optional email lookup for the same (if user id not set). |
| ROUTER_ALLOW_EXPLORATION_IN_PROD | Set to 1 to allow active router policies with exploration.enabled=true in production (APP_ENV=prod). |
| BROWSER_SERVICE_URL | Base URL of the headless browser microservice (e.g. http://browser:9020 in compose). Unset = browser fetch features disabled. |
| BROWSER_SERVICE_AUTH_TOKEN | Shared secret; backend sends Authorization: Bearer … to the browser service. Must match the token configured on the browser Container App. |
| BROWSER_SERVICE_POOL_SIZE | Concurrent sessions to pool (default 6 in docker-compose.yml). Must not exceed BROWSER_SESSION_MAX_COUNT on the browser service (default 12). |
| BROWSER_SESSION_MAX_COUNT | Server-side ceiling for open Chrome sessions on the browser container (default 12 in browser_service/.env). Keep above BROWSER_SERVICE_POOL_SIZE. |
| ORCH_WEB_FETCH_PARALLELISM | Concurrent browser fetch calls per orchestration run (default 6). Keep in sync with BROWSER_SERVICE_POOL_SIZE. |
| BROWSER_SERVICE_POOL_WAIT_SECONDS | Max wait for a free browser (default 5). |
| BROWSER_SERVICE_FETCH_WAIT_SECONDS | Post-navigation settle time for /fetch (default 0.4). |
| BROKER_ENABLED | 1 (default in deploy.ps1 and docker-compose.yml) routes every chat turn through the broker (narrator), which owns the user-facing stream, dispatches background Task records via task_registry, and emits broker_question chunks when the conversation queue cap is reached. Set 0 to fall back to the legacy run() pipeline path. |
| BROKER_RUNNING_SLOT_CAP | Max concurrent running tasks per conversation before the broker prompts the user with reorder/cancel choices. Default 3. |
| BROKER_TASK_BACKEND | Optional. memory forces the in-process TaskRegistry even when REDIS_URL is set (useful for isolated debugging). Leave unset in production so the Redis-backed registry is used and task state survives across Gunicorn workers. |
| BROKER_TASK_TTL_SECONDS | TTL for per-conversation task lists in Redis. Default 86400 (24 h). |
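The sizing relationships among the browser variables in the table (pool size no larger than the session ceiling, fetch parallelism kept in sync with pool size) lend themselves to a pre-deploy lint. The sketch below is illustrative: browser_pool_warnings is a hypothetical helper, and it assumes the backend and browser-service values have been gathered into one mapping even though BROWSER_SESSION_MAX_COUNT is configured on the browser container.

```python
def browser_pool_warnings(env):
    """Return human-readable warnings for inconsistent browser pool sizing."""
    # Defaults mirror the ones quoted in the table above.
    pool = int(env.get("BROWSER_SERVICE_POOL_SIZE", 6))
    ceiling = int(env.get("BROWSER_SESSION_MAX_COUNT", 12))
    parallelism = int(env.get("ORCH_WEB_FETCH_PARALLELISM", 6))
    warnings = []
    if pool > ceiling:
        warnings.append(
            f"BROWSER_SERVICE_POOL_SIZE ({pool}) exceeds BROWSER_SESSION_MAX_COUNT ({ceiling})"
        )
    if parallelism != pool:
        warnings.append(
            f"ORCH_WEB_FETCH_PARALLELISM ({parallelism}) differs from BROWSER_SERVICE_POOL_SIZE ({pool})"
        )
    return warnings
```

An empty list means the three values are mutually consistent under the documented defaults.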
Note on deploy automation: deploy.ps1 now sets BROKER_ENABLED and BROKER_RUNNING_SLOT_CAP on the backend Container App on every deploy. Override by exporting the env var in the deploy shell before running the script (e.g. $env:BROKER_ENABLED = "0").
Model provider API keys
deploy.ps1 also pushes the following provider keys to the backend Container App on every backend deploy, but only when they are present in the deploy shell (it never blanks an existing Azure value):
| Variable | Provider | Format |
|---|---|---|
| ANTHROPIC_API_KEY | Anthropic Claude | sk-ant-… |
| GOOGLE_API_KEY | Google Gemini | (Google API key) |
| GROQ_API_KEY | Groq (Llama / Mixtral / DeepSeek) | gsk_… |
| MISTRAL_API_KEY | Mistral cloud | (Mistral key) |
| CEREBRAS_API_KEY | Cerebras Wafer-Scale | csk-… |
These act as the platform BYOK fallback used by backend/agents/model_clients.py (ModelGateway) and backend/internal_miner.py (auto-registration of internal miners) when an organization has not configured its own keys via the BYOK panel.
To deploy with all keys, export them before running deploy.ps1:
$env:ANTHROPIC_API_KEY = "sk-ant-…"
$env:GOOGLE_API_KEY = "AIza…"
$env:GROQ_API_KEY = "gsk_…"
$env:MISTRAL_API_KEY = "…"
$env:CEREBRAS_API_KEY = "csk-…"
.\deploy.ps1 -Target backend
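The BYOK fallback described in this section (an organization's own key first, the platform key otherwise) can be sketched as a small resolver. This is not the logic in model_clients.py or internal_miner.py: resolve_provider_key, the org_keys mapping, and the returned source labels are all hypothetical.

```python
def resolve_provider_key(env_name, org_keys, platform_env):
    """Return (key, source) where source is 'org', 'platform', or 'missing'."""
    org_value = (org_keys or {}).get(env_name)
    if org_value:
        return org_value, "org"  # key configured by the organization (BYOK)
    platform_value = platform_env.get(env_name)
    if platform_value:
        return platform_value, "platform"  # platform-wide fallback key
    return None, "missing"
```

Returning the source alongside the key makes it possible to log which path was taken without logging the key itself.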
On-prem only variables
Do not configure these on the shared Azure production apps unless you are intentionally running an onsite deployment profile:
| Variable | Purpose |
|---|---|
| ONSITE | Enables onsite runtime posture and related guards/flows. Leave unset or 0 on hosted Azure. |
| NEXT_PUBLIC_ONSITE | Frontend onsite mode flag for downloadable bundle/runtime. Not used for hosted Azure frontend builds. |
| ONSITE_BOOTSTRAP_TOKEN | One-time first-admin bootstrap token for onsite bundles. Never required for hosted Azure deploys. |
The browser service container should set BROWSER_SERVICE_ALLOWED_INTERNAL_HOSTS if you navigate to private hostnames (defaults include frontend, localhost, host.docker.internal).
Registry / CI:
GITLAB_TOKEN
Azure Email Auth
The app now supports Azure Communication Services Email for both account verification and emailed second-factor codes.
Set these backend secrets to enable it:
- AZURE_COMMUNICATION_SERVICES_CONNECTION_STRING
- AZURE_COMMUNICATION_EMAIL_SENDER
- FRONTEND_BASE_URL
Set these backend flags to enforce it:
- AUTH_REQUIRE_EMAIL_VERIFICATION=1
- AUTH_REQUIRE_EMAIL_2FA=1
Behavior:
- Signup creates the user and sends a verification link to /verify-email.
- Login checks email verification first, then sends a one-time sign-in code by email before issuing the JWT.
- When the flags are unset, the previous email/password flow stays active.
Compliance Environment Variables (April 11, 2026)
The ComplianceMonitor service runs every 6 hours and evaluates runtime posture against HIPAA, ITAR, NERC CIP, FISMA, and PCI-DSS requirements. Alerts are surfaced on the system-admin compliance dashboard and per-org compliance tabs. To pass all compliance checks on Azure production, set the following env vars on the backend Container App.
Required for all deployments
| Variable | Value | Framework | Purpose |
|---|---|---|---|
| JWT_EXPIRY_HOURS | 24 | HIPAA | Session token lifetime must be ≤ 24 hours. Default is 72, which triggers a high session_timeout alert. |
| AUTH_REQUIRE_EMAIL_2FA | 1 | HIPAA | Enforce email-based MFA on login. Without this, a medium mfa alert fires. |
| AUTH_IDLE_TIMEOUT_MINUTES | 15 | HIPAA | Auto-logout after idle period. Must be > 0 and ≤ 15. Missing or > 15 triggers a medium idle_timeout alert. |
| DB_ENCRYPTION_KEY | (base64 Fernet key) | HIPAA | Encryption key for PHI fields at rest. Missing triggers a **critica
[...content truncated for whitepaper synthesis...]