learning_ai_common_plat/.env.example
saravanakumardb1 c63736459b feat(fleet): anti-flap hysteresis + autoscale Prometheus series & dashboard (ops #5)
Make the capacity autoscaling signal safe to act on automatically and observable
in Grafana.

Anti-flap hysteresis:
- New pure applyHysteresis: suppresses a direction reversal (scale_in after
  scale_out, or vice versa) within a cooldown window so a consumer cannot thrash
  capacity. A critical scale-out (queued work, zero usable capacity) always
  bypasses the cooldown. Cooldown anchor only advances on an emitted action, so a
  suppressed signal keeps counting down from the real last action.
- Process-wide per-product cooldown state (mirrors the reaper/breaker in-memory
  state) with a test seam; cooldown tunable via FLEET_AUTOSCALE_COOLDOWN_SEC
  (default 300 seconds).
- GET /fleet/autoscale[/all] now serve the debounced (stateful) recommendation.
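
The hysteresis rules above can be sketched as a pure function. This is a minimal illustration, not the code from this commit: only the name applyHysteresis and the 300-second default come from the message, while the types, the `hold` action, the `critical` flag, and the state shape are assumptions made for the example.

```typescript
// Hypothetical shapes: the commit names applyHysteresis but not its types.
type Action = "scale_out" | "scale_in" | "hold";

interface Recommendation {
  action: Action;
  critical: boolean; // assumed flag: queued work with zero usable capacity
}

interface CooldownState {
  lastAction: "scale_out" | "scale_in" | null; // last action actually emitted
  lastActionAtMs: number; // cooldown anchor
}

interface HysteresisResult {
  emitted: Recommendation;
  next: CooldownState;
}

function applyHysteresis(
  raw: Recommendation,
  state: CooldownState,
  nowMs: number,
  cooldownSec = 300,
): HysteresisResult {
  const action = raw.action;
  // A hold is not an action, so it never moves the anchor.
  if (action === "hold") return { emitted: raw, next: state };

  const withinCooldown =
    state.lastAction !== null && nowMs - state.lastActionAtMs < cooldownSec * 1000;
  const isReversal = state.lastAction !== null && action !== state.lastAction;
  const bypass = action === "scale_out" && raw.critical;

  if (isReversal && withinCooldown && !bypass) {
    // Suppressed: state is returned unchanged, so the window keeps counting
    // down from the real last action.
    return { emitted: { action: "hold", critical: false }, next: state };
  }
  return { emitted: raw, next: { lastAction: action, lastActionAtMs: nowMs } };
}

// Usage: a scale_in 60 s after a scale_out is suppressed; a critical
// scale_out inside the window is not.
const start: CooldownState = { lastAction: null, lastActionAtMs: 0 };
const first = applyHysteresis({ action: "scale_out", critical: false }, start, 0);
const second = applyHysteresis({ action: "scale_in", critical: false }, first.next, 60_000);
console.log(first.emitted.action, second.emitted.action); // scale_out hold
```

Returning the next state instead of mutating it keeps the function pure, which is what lets a read-only caller (such as a metrics scrape) skip the stateful path entirely.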

Observability:
- Prometheus exposition emits the RAW recommendation per product
  (fleet_autoscale_recommended_seats/delta/pressure + one-hot fleet_autoscale_action
  {action}). RAW (not stateful) so a scrape never mutates the cooldown anchors.
- Grafana "Fleet Overview" gains two panels: products recommending scale-out
  (stat) + recommended seat delta vs backlog (timeseries).
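
For illustration, the one-hot action series described above could be rendered in the Prometheus text exposition format roughly as follows. This is a sketch under assumptions: the `product` label name, the `hold` action, the expansion of "seats/delta/pressure" into `fleet_autoscale_delta` and `fleet_autoscale_pressure`, and the function and field names are all hypothetical; only `fleet_autoscale_recommended_seats`, `fleet_autoscale_action`, and the `action` label appear in the message.

```typescript
// Assumed action set; the commit only names scale_out and scale_in.
const ACTIONS = ["scale_out", "scale_in", "hold"] as const;
type Action = (typeof ACTIONS)[number];

// Hypothetical shape of one RAW (non-stateful) recommendation.
interface RawRecommendation {
  product: string;
  action: Action;
  recommendedSeats: number;
  delta: number;
  pressure: number;
}

// Escape backslash, double quote, and newline, as the text format requires
// for label values.
function escapeLabel(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

function renderAutoscaleSeries(recs: RawRecommendation[]): string {
  const lines: string[] = [];
  for (const r of recs) {
    const p = `product="${escapeLabel(r.product)}"`;
    lines.push(`fleet_autoscale_recommended_seats{${p}} ${r.recommendedSeats}`);
    lines.push(`fleet_autoscale_delta{${p}} ${r.delta}`);
    lines.push(`fleet_autoscale_pressure{${p}} ${r.pressure}`);
    // One-hot: exactly one action series per product carries the value 1.
    for (const a of ACTIONS) {
      lines.push(`fleet_autoscale_action{${p},action="${a}"} ${a === r.action ? 1 : 0}`);
    }
  }
  return lines.join("\n") + "\n";
}

console.log(
  renderAutoscaleSeries([
    { product: "lysnrai", action: "scale_out", recommendedSeats: 7, delta: 2, pressure: 0.9 },
  ]),
);
```

Because the renderer only reads its input, a scrape built this way cannot touch the cooldown anchors.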

Docs: FLEET_AUTOSCALE_COOLDOWN_SEC in .env.example.

Tests: +10 (hysteresis/stateful/cooldown + prom autoscale series); full suite 1856
green; lint + tsc clean. Verified live: a throwaway Prometheus scraped the running
service and the dashboard PromQL returned real scale-out/scale-in recommendations
across products.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
2026-06-01 23:02:08 -07:00


# ── Common Platform Environment Variables ──────────────────────
# Copy to .env and fill in real values.
# ── Azure Key Vault (optional — secrets fall back to env vars) ─
# Set this to resolve secrets from AKV instead of .env:
AZURE_KEYVAULT_URL=
# ── Cosmos DB (prototype defaults to local emulator) ───────────
# For the Docker prototype stack, leave these pointed at the local emulator.
# When you move to a managed environment later, replace them with real Azure values.
COSMOS_ENDPOINT=http://cosmos-emulator:8081
COSMOS_KEY=<cosmos-emulator-key>
COSMOS_DATABASE=lysnrai
# ── Auth (platform-service) ─────────────────────────
JWT_SECRET=change-me-prototype-jwt-secret
RATE_LIMIT_STORE_MODE=datastore
RATE_LIMIT_CONFIG_JSON=
API_KEY_RATE_LIMIT_CONFIG_JSON=
API_KEY_PRODUCT_RATE_LIMIT_CONFIG_JSON=
# ── Azure Blob Storage (platform-service) ─────────────────────
STORAGE_PROVIDER=azure
AZURE_BLOB_CONNECTION_STRING=DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;AccountKey=<azurite-default-key>;BlobEndpoint=http://azurite:10000/devstoreaccount1;
AZURE_BLOB_ACCOUNT_NAME=devstoreaccount1
AZURE_BLOB_ACCOUNT_KEY=<azurite-default-key>
AZURE_BLOB_PUBLIC_ENDPOINT=http://localhost:10000/devstoreaccount1
# ── Stripe (platform-service) ────────────────────────
STRIPE_SECRET_KEY=sk_test_...
STRIPE_WEBHOOK_SECRET=whsec_...
STRIPE_PRICE_PRO=price_...
STRIPE_PRICE_ENTERPRISE=price_...
# ── Email Delivery (platform-service) ─────────────────────────
# Use `smtp` for a self-hosted SMTP relay such as Mailpit, Postal, Mailcow, etc.
EMAIL_PROVIDER=smtp
EMAIL_FROM_ADDRESS=noreply@bytelyst.local
EMAIL_FROM_NAME=ByteLyst
SMTP_HOST=mailpit
SMTP_PORT=1025
SMTP_SECURE=false
SMTP_USER=
SMTP_PASSWORD=
# ── Notifications (Telegram, Slack) ───────────────────────────
TELEGRAM_BOT_TOKEN=
TELEGRAM_DEFAULT_CHAT_ID=
SLACK_WEBHOOK_URL=
SLACK_DEFAULT_CHANNEL=
# ── Event Bus ─────────────────────────────────────────────────
EVENT_BUS_BACKEND=file
EVENT_BUS_FILE=.data/platform-events.json
EVENT_BUS_POLL_MS=100
EVENT_BUS_LEASE_MS=30000
# ── Extraction Service (port 4005 + Python sidecar 4006) ─────
PYTHON_SIDECAR_URL=http://localhost:4006
DEFAULT_MODEL_ID=gemini-2.5-flash
GEMINI_API_KEY=your-gemini-api-key
EXTRACTION_QUEUE_BACKEND=file
EXTRACTION_QUEUE_FILE=.data/extraction-jobs.json
EXTRACTION_QUEUE_POLL_MS=100
EXTRACTION_QUEUE_LEASE_MS=30000
# ── Webhooks (optional — fire-and-forget callbacks) ──────────
WEBHOOK_INVITATION_REDEEMED_URL=
WEBHOOK_REFERRAL_STATUS_URL=
WEBHOOK_WAITLIST_JOINED_URL=
# ── Telemetry (platform-service) ──────────────────────────────
TELEMETRY_ENABLED=true
TELEMETRY_ALERT_WEBHOOK_URL=
TELEMETRY_GEO_API_URL=http://ip-api.com/json
TELEMETRY_EVENT_TTL_DAYS=90
# ── Field Encryption (@bytelyst/field-encrypt) ──────────────
# Key provider: 'akv' (production) | 'env' (dev/staging) | 'memory' (tests)
FIELD_ENCRYPT_KEY_PROVIDER=memory
# Hex-encoded 32-byte key — only for 'env' provider (like AUTH_TOTP_ENCRYPTION_KEY)
FIELD_ENCRYPT_KEY=
# Product-specific MEK name in AKV — only for 'akv' provider
FIELD_ENCRYPT_MEK_NAME=lysnr-mek
# ── Gitea NPM Registry (private @bytelyst packages) ─────────
# Token for authenticating with the Gitea npm registry.
# Generate at: http://<GITEA_NPM_HOST>:3300/user/settings/applications
GITEA_NPM_TOKEN=
GITEA_NPM_HOST=localhost
GITEA_NPM_OWNER=learning_ai_user
# ── Product Identity ──────────────────────────────────────────
DEFAULT_PRODUCT_ID=lysnrai
# ── Cowork Service (port 4009 — Fastify bridge to Rust runtime) ─
# cowork-service forwards auth, flags, audit, telemetry, and AI budgets to
# platform-service. The Anthropic key is only needed when running the Rust
# runtime locally via IPC; in the containerised dev stack it is optional.
ANTHROPIC_API_KEY=
RUST_RUNTIME_BIN=cowork-orchestrator
RUST_RUNTIME_TIMEOUT_MS=300000
OLLAMA_URL=http://localhost:11434/v1
OLLAMA_MODELS=
FEATURE_FLAGS_ENABLED=true
# ── Fleet ops/observability ───────────────────────────────────
# Bearer token Prometheus uses to scrape GET /api/fleet/metrics/prom. Must match
# the `credentials` in services/monitoring/prometheus/prometheus.yml. When unset,
# the endpoint requires an admin JWT instead (so it is never world-readable).
FLEET_METRICS_TOKEN=changeme-fleet-metrics-token
# Fleet feature flags (default OFF): cost/latency routing, per-engine breaker,
# per-product/-engine budget enforcement, and multi-tenant access enforcement.
FLEET_COST_ROUTING=
FLEET_ENGINE_BREAKER=
FLEET_BUDGETS=
FLEET_TENANT_ENFORCEMENT=
# Capacity autoscaling signal (§5) — tunes the advisory scale recommendation
# served at GET /api/fleet/autoscale[/all] (consumed by an external scaler).
# All optional; unset keys fall back to the in-code defaults shown below.
FLEET_AUTOSCALE_SCALE_OUT_PCT=85
FLEET_AUTOSCALE_SCALE_IN_PCT=20
FLEET_AUTOSCALE_MAX_STEP=5
FLEET_AUTOSCALE_MIN_SEATS=0
# Anti-flap cooldown (seconds): the /fleet/autoscale endpoints suppress a
# direction reversal (scale_in after scale_out, or vice versa) within this
# window so a consumer cannot thrash capacity. A critical scale-out (queued work
# with zero usable capacity) always bypasses the cooldown. Default 300.
FLEET_AUTOSCALE_COOLDOWN_SEC=300
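
A minimal sketch of how the autoscale keys above might be read with their documented defaults. The defaults (85, 20, 5, 0, 300) are the ones listed in this file; the function names, the config shape, and the choice to treat blank, negative, or non-numeric values as unset are assumptions for illustration, not the service's actual loader.

```typescript
// Hypothetical config shape for the FLEET_AUTOSCALE_* keys.
interface AutoscaleConfig {
  scaleOutPct: number;
  scaleInPct: number;
  maxStep: number;
  minSeats: number;
  cooldownSec: number;
}

// Defaults as documented in .env.example.
const DEFAULTS: AutoscaleConfig = {
  scaleOutPct: 85,
  scaleInPct: 20,
  maxStep: 5,
  minSeats: 0,
  cooldownSec: 300,
};

type Env = Record<string, string | undefined>;

// Assumption: blank, negative, or non-numeric values fall back to the default.
function readNumber(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const n = Number(raw);
  return Number.isFinite(n) && n >= 0 ? n : fallback;
}

function readAutoscaleConfig(env: Env): AutoscaleConfig {
  return {
    scaleOutPct: readNumber(env, "FLEET_AUTOSCALE_SCALE_OUT_PCT", DEFAULTS.scaleOutPct),
    scaleInPct: readNumber(env, "FLEET_AUTOSCALE_SCALE_IN_PCT", DEFAULTS.scaleInPct),
    maxStep: readNumber(env, "FLEET_AUTOSCALE_MAX_STEP", DEFAULTS.maxStep),
    minSeats: readNumber(env, "FLEET_AUTOSCALE_MIN_SEATS", DEFAULTS.minSeats),
    cooldownSec: readNumber(env, "FLEET_AUTOSCALE_COOLDOWN_SEC", DEFAULTS.cooldownSec),
  };
}

// Usage: only the cooldown is overridden; every other key keeps its default.
const cfg = readAutoscaleConfig({ FLEET_AUTOSCALE_COOLDOWN_SEC: "60" });
console.log(cfg.cooldownSec, cfg.scaleOutPct); // 60 85
```

Taking the environment as a parameter rather than reading a global keeps the loader easy to exercise in tests.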