learning_ai_common_plat/docs/devops/RAILWAY_DEPLOYMENT_RUNBOOK.md
Saravana Achu Mac 6d6ca217a5 chore(devops): improve railway deploy script, add env sync and deployment runbook
- Refactor railway-deploy.sh: add --sync-env, --dry-run, --detach flags and service selector
- Add railway-sync-env.sh for pre-deploy environment variable synchronization
- Add RAILWAY_DEPLOYMENT_RUNBOOK.md with step-by-step deployment guide

Co-Authored-By: Oz <oz-agent@warp.dev>
2026-03-05 20:03:59 -08:00

Railway Deployment Runbook (Common Platform)

Last updated: 2026-02-16

This runbook covers repeatable deploys for shared services in learning_ai_common_plat.

Scope

Deployable Railway services from this repo:

  1. platform-service
  2. extraction-service

These two backend services are currently the only Railway deploy targets in this repo. The monitoring stack (loki, grafana, gateway) runs as local Docker Compose infrastructure and is not deployed by the Railway scripts.

Deploy Scripts Inventory

| Script | Purpose | Services Affected |
| --- | --- | --- |
| scripts/railway-deploy.sh | Deploy code to Railway via railway up | platform-service, extraction-service, or both |
| scripts/railway-sync-env.sh | Sync selected vars from local .env into Railway service variables | platform-service, extraction-service, or both |
| scripts/docker-prep.sh | Build shared packages for local Docker builds | No Railway deploy; local prep only |

One-Time Setup

  1. Install Railway CLI:
npm i -g @railway/cli
  2. Login:
railway login
  3. Optional shell defaults:
export RAILWAY_PROJECT_ID="a6bc4ea7-e89c-42da-819a-8879fb022a0d"
export RAILWAY_ENVIRONMENT="production"
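
The setup steps above can be checked before a first deploy with a small preflight function. This is a minimal sketch, not part of the repo scripts: the function name railway_preflight and its output messages are hypothetical, and it only checks that the CLI is on PATH and that the two optional defaults are set. It does not check login state.

```shell
# Preflight sketch (hypothetical helper, not shipped in scripts/).
# Prints one line per problem and returns non-zero if anything is missing.
railway_preflight() {
  local problems=0

  # Step 1: the Railway CLI must be installed and on PATH.
  if ! command -v railway >/dev/null 2>&1; then
    echo "missing: railway CLI"
    problems=$((problems + 1))
  fi

  # Step 3: the optional shell defaults used later in this runbook.
  if [ -z "${RAILWAY_PROJECT_ID:-}" ]; then
    echo "unset: RAILWAY_PROJECT_ID"
    problems=$((problems + 1))
  fi
  if [ -z "${RAILWAY_ENVIRONMENT:-}" ]; then
    echo "unset: RAILWAY_ENVIRONMENT"
    problems=$((problems + 1))
  fi

  [ "$problems" -eq 0 ]
}
```

Calling railway_preflight in a fresh shell should list anything still missing; an empty output with exit status 0 means all three checks passed.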

First Deploy (or Env Changes)

  1. Update local .env at repo root.
  2. Sync env vars to Railway (setting variables does not trigger a deploy per variable):
scripts/railway-sync-env.sh all --env-file .env
  3. Deploy both services:
scripts/railway-deploy.sh all
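
The sync-then-deploy sequence above can be wrapped so that a failed sync never leads to a deploy. The following is a sketch under assumptions: first_deploy and the RUN variable are hypothetical names introduced here, and the function must be run from the repo root so the relative script paths resolve.

```shell
# Sketch: env sync followed by deploy, stopping at the first failure.
# RUN is a hypothetical hook: set RUN=echo to print the commands
# instead of executing them.
first_deploy() {
  local env_file="${1:-.env}"

  # Step 1 of the runbook: the env file must exist before syncing.
  if [ ! -f "$env_file" ]; then
    echo "env file not found: $env_file" >&2
    return 1
  fi

  # Steps 2 and 3: deploy only runs if the sync succeeded.
  ${RUN:-} scripts/railway-sync-env.sh all --env-file "$env_file" &&
    ${RUN:-} scripts/railway-deploy.sh all
}
```

For example, RUN=echo first_deploy .env prints the two commands without touching Railway, while first_deploy .env runs them for real.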

Repeat Deploy for New Code

If only code changed:

scripts/railway-deploy.sh all

Single-service deploys:

scripts/railway-deploy.sh platform
scripts/railway-deploy.sh extraction

If env vars changed too:

scripts/railway-sync-env.sh all --env-file .env
scripts/railway-deploy.sh all

Useful Flags

scripts/railway-deploy.sh

# Sync env before deploy, then deploy extraction
scripts/railway-deploy.sh extraction --sync-env --env-file .env

# Custom message + attach to logs
scripts/railway-deploy.sh platform --message "hotfix: auth token handling" --attach

# Dry run
scripts/railway-deploy.sh all --sync-env --dry-run

scripts/railway-sync-env.sh

# Sync only platform-service
scripts/railway-sync-env.sh platform --env-file .env

# Trigger deploys while setting vars (normally skipped)
scripts/railway-sync-env.sh extraction --trigger-deploys

# Dry run
scripts/railway-sync-env.sh all --dry-run
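
Both scripts take a service selector (platform, extraction, or all) as their first argument. How the scripts resolve it internally is not shown in this runbook; the sketch below is one plausible mapping from selector to Railway service names, with resolve_services as a hypothetical function name and exit code 2 as an arbitrary choice for bad input.

```shell
# Sketch: map a selector to the Railway service names it covers.
# Hypothetical helper; the real scripts may resolve selectors differently.
resolve_services() {
  case "${1:-}" in
    platform)   echo "platform-service" ;;
    extraction) echo "extraction-service" ;;
    all)        echo "platform-service extraction-service" ;;
    *)
      echo "unknown selector: ${1:-<empty>}" >&2
      return 2
      ;;
  esac
}
```

A caller could loop over the result, for instance: for svc in $(resolve_services all); do echo "$svc"; done.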

Verification

After deploy:

railway link --project "$RAILWAY_PROJECT_ID" --environment "$RAILWAY_ENVIRONMENT"
railway service status -a -e "$RAILWAY_ENVIRONMENT"

Service logs:

railway logs --service platform-service --environment "$RAILWAY_ENVIRONMENT" --deployment --lines 200
railway logs --service extraction-service --environment "$RAILWAY_ENVIRONMENT" --deployment --lines 200
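
To pull logs for both services in one pass, the two commands above can be looped. This is a sketch only: deployment_logs and the RUN hook are hypothetical names, and falling back to production when RAILWAY_ENVIRONMENT is unset is an assumption based on the default exported in One-Time Setup.

```shell
# Sketch: fetch deployment logs for both services with one call.
# RUN is a hypothetical hook: set RUN=echo to print the commands
# instead of executing them.
deployment_logs() {
  local lines="${1:-200}"
  local env="${RAILWAY_ENVIRONMENT:-production}"  # assumed default
  local svc

  for svc in platform-service extraction-service; do
    echo "== $svc =="
    ${RUN:-} railway logs --service "$svc" --environment "$env" --deployment --lines "$lines"
  done
}
```

Running deployment_logs 50 would show the last 50 deployment log lines for each service under its own header.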

Troubleshooting

  1. Railway CLI is not authenticated:
railway login
  2. Unable to access Railway service ...:
    • Confirm the project ID and environment.
    • Confirm the service names: platform-service, extraction-service.
  3. Missing required vars during sync:
    • Provide COSMOS_ENDPOINT, COSMOS_KEY, and JWT_SECRET in .env, or
    • Set AZURE_KEYVAULT_URL and ensure the runtime identity can fetch secrets.
  4. Deploy command succeeds but app is unhealthy:
    • Check deployment logs using railway logs --deployment.
    • Validate that env vars were synced for the affected service.
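
The "missing required vars" case can be caught locally before running the sync. The sketch below is an assumption about what such a check could look like, not the logic inside railway-sync-env.sh: missing_sync_vars is a hypothetical name, it only understands plain KEY=value lines (optionally prefixed with export), and it treats a non-empty AZURE_KEYVAULT_URL as satisfying the requirement.

```shell
# Sketch: report which required vars are absent or empty in an env file.
# Hypothetical helper; the real sync script may validate differently.
missing_sync_vars() {
  local env_file="${1:-.env}"
  local name missing=""

  if [ ! -f "$env_file" ]; then
    echo "env file not found: $env_file" >&2
    return 1
  fi

  # Key Vault path: secrets are expected to be fetched at runtime instead.
  if grep -Eq '^(export[[:space:]]+)?AZURE_KEYVAULT_URL=.+' "$env_file"; then
    return 0
  fi

  # Direct path: all three values must be present and non-empty.
  for name in COSMOS_ENDPOINT COSMOS_KEY JWT_SECRET; do
    if ! grep -Eq "^(export[[:space:]]+)?${name}=.+" "$env_file"; then
      missing="$missing $name"
    fi
  done

  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  return 0
}
```

Running missing_sync_vars .env before scripts/railway-sync-env.sh gives a quick local answer; it says nothing about whether the runtime identity can actually read the vault.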