bytelyst-devops-tools/dashboard/README.md
Hermes VM 824f31586a docs(dashboard): Phase 5 P1 — fix port/endpoint drift, dedupe deployment docs
Closes the Phase 5 P1 doc-drift checkbox and REVIEW_ACTIONS #5.

The 3000-vs-3049 confusion came from prose claims in three docs that
each picked a different "right" answer. The truth is: the web container
listens on :3000; docker-compose maps `127.0.0.1:3049:3000`; production
is fronted by Traefik on `https://devops.bytelyst.com`. Encoding that
explicitly so future readers don't have to dig through compose files:

  - DEPLOYMENT.md becomes canonical. Its content is now the (more
    accurate) old DEPLOYMENT_GUIDE.md merged with a "Ports — quick
    reference" table covering Local dev / Docker Compose / Production
    Traefik, plus a Local-development section for `pnpm dev`.
  - DEPLOYMENT_GUIDE.md → 5-line redirect stub pointing at
    DEPLOYMENT.md (kept for `deploy.sh` and any external links).
  - deploy.sh updated to point at DEPLOYMENT.md.
  - README.md "Web port: 3000" line rewritten to spell out container
    vs Compose-host vs dev-mode and link to the port table.
  - ENDPOINTS.md gets a top-of-file note: every `localhost:3000` URL
    in that file is the `pnpm dev` workflow; substitute `:3049` for
    the Dockerized stack.
  - REVIEW_ACTIONS.md #5 marked RESOLVED with the rationale.

No code, behavior, lint, or test changes.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
2026-05-30 07:03:05 +00:00

218 lines
7.7 KiB
Markdown

# ByteLyst DevOps Dashboard
Internal DevOps dashboard for deployment orchestration and service monitoring across ByteLyst products.
## Architecture
```
dashboard/
├── backend/ # Fastify 5 backend (port 4004)
│ └── src/
│ ├── lib/ # Config, auth, Cosmos
│ └── modules/ # Services, deployments, health
├── web/ # Next.js 16 frontend (port 3000)
│ └── src/
│ ├── app/ # Pages
│ └── lib/ # API client, auth
└── shared/
└── product.json # Product identity
```
## Features
- **Service Registry**: Manage all ByteLyst services (trading, notes, clock, etc.)
- **Deployment Orchestration**: Trigger deployments via existing bash scripts
- **Health Monitoring**: Real-time health checks for all services with caching
- **Deployment History**: Audit trail of all deployments with captured logs (JSON-polled by the web client; no SSE)
- **Cross-Navigation**: One-click link to Platform Admin dashboard
- **Hermes Mission Control**: Read-only mock dashboard for portfolio-wide execution, task ledger, product health, history, agents, and settings
- **Testing**: Vitest for backend, React Testing Library for frontend
- **Security**: Rate limiting, CORS, security headers, Zod validation
- **Auto-Refresh**: Automatic health status updates every 60 seconds
## Recent Improvements
### Testing Infrastructure
- Added Vitest for backend testing with test files for services and deployments
- Added React Testing Library for frontend with API client tests
- Test scripts: `pnpm test` (watch mode), `pnpm test:run` (CI mode)
### Health Monitoring
- Implemented actual HTTP health checks with 10-second timeout
- Added 30-second caching to avoid overwhelming services
- Added User-Agent header for health check requests
- Added admin endpoint to clear health cache (`DELETE /api/health/cache`)
- Health status determined by response time: >5s = degraded
### API Validation
- Added Zod schemas for all API routes (services, deployments, health)
- Proper error handling with BadRequestError from @bytelyst/errors
- Validated path parameters, query parameters, and request bodies
- Strict validation on update operations to prevent accidental field changes
### Deployment Logs
- Endpoint `GET /api/deployments/:id/logs` returns the full captured stdout/stderr + current status as a single JSON payload (admin only).
- The web client polls this endpoint while a deployment is `running`. There is intentionally no SSE/WebSocket stream — the previous attempt with `fastify-sse-v2` was incompatible with Fastify 5 and was removed. If a real-time stream is needed later, implement it explicitly via `reply.raw` and update this section in the same change.
### Security Enhancements
- Added rate limiting: 100 requests per minute per IP
- Improved CORS with allowed origins whitelist
- Added security headers: X-Content-Type-Options, X-Frame-Options, X-XSS-Protection, HSTS, Referrer-Policy
- OPTIONS preflight request handling
- Credentials support for authenticated requests
### Auto-Refresh
- Automatic health status refresh every 60 seconds
- Manual refresh button to clear cache and force health checks
- Visual feedback with spinning icon during refresh
- Last health check timestamp displayed on service cards
## Setup
### Prerequisites
- Node.js 22+
- pnpm 10.6.5
- Azure Cosmos DB credentials
- Platform Service URL
- Access to @bytelyst/* packages (via common-plat workspace or Gitea registry)
### Installation
The dashboard uses the .pnpmfile.cjs pattern for dynamic dependency resolution, supporting both local workspace and Gitea registry modes.
```bash
# For local development (uses workspace links to learning_ai_common_plat)
pnpm install:common-plat
# For production (uses Gitea registry at localhost:3300)
pnpm install:gitea
```
### Backend
```bash
cd backend
cp .env.example .env # Add your credentials
pnpm dev # Runs on port 4004
```
### Frontend
```bash
cd web
cp .env.local.example .env.local # Add your URLs
pnpm dev # Next dev server on http://localhost:3000 (no Docker)
```
### Running Both
```bash
# From dashboard root
pnpm dev
```
## Environment Variables
### Backend (.env)
```
PORT=4004
PLATFORM_SERVICE_URL=http://localhost:4003
COSMOS_ENDPOINT=https://your-cosmos.documents.azure.com:443/
COSMOS_KEY=your-cosmos-key
COSMOS_DATABASE=bytelyst-platform
JWT_SECRET=your-jwt-secret
```
### Frontend (.env.local)
```
NEXT_PUBLIC_DEVOPS_API_URL=http://localhost:4004
NEXT_PUBLIC_PLATFORM_URL=http://localhost:4003
```
Production deployments use `https://api.bytelyst.com/devops` for `NEXT_PUBLIC_DEVOPS_API_URL` and `https://api.bytelyst.com/platform/api` for `NEXT_PUBLIC_PLATFORM_URL`.
## Usage
1. **Seed Services**: Click "Seed Services" on the dashboard to register default services
2. **Deploy**: Click "Deploy" on any service card to trigger deployment
3. **Monitor**: View real-time health status and deployment history
4. **Platform Admin**: Click "Platform Admin" link to jump to the admin dashboard
5. **Hermes Mission Control**: Visit `/hermes` for the mock executive command center and the companion routes `/hermes/tasks`, `/hermes/tasks/[id]`, `/hermes/products`, `/hermes/history`, `/hermes/agents`, and `/hermes/settings`
## Integration with Platform Admin
- DevOps dashboard links to admin-web at `http://localhost:3001`
- Admin-web should have a reciprocal link back to DevOps dashboard
- Both use platform-service for authentication
## API Endpoints
### Services
- `GET /api/services` - List all services
- `GET /api/services/:id` - Get single service
- `POST /api/services` - Create service (admin only)
- `PUT /api/services/:id` - Update service (admin only)
- `DELETE /api/services/:id` - Delete service (admin only)
### Deployments
- `GET /api/deployments` - Recent deployments (with `?limit=` query param)
- `GET /api/deployments/service/:serviceId` - Deployments for specific service
- `GET /api/deployments/:id` - Single deployment
- `GET /api/deployments/:id/logs` - Get captured deployment logs as JSON (web client polls this; no SSE)
- `POST /api/deployments/trigger/:serviceId` - Trigger deployment (admin only)
### Health
- `GET /api/health` - Health of all services
- `GET /api/health/:serviceId` - Health of specific service
- `DELETE /api/health/cache` - Clear health cache (admin only)
### Seed
- `POST /api/seed` - Seed default services (admin only)
## Development
```bash
# Backend typecheck
cd backend && pnpm typecheck
# Frontend typecheck
cd web && pnpm typecheck
# Run tests (watch mode)
pnpm test
# Run tests (CI mode)
pnpm test:run
# Run both
pnpm --filter backend dev & pnpm --filter web dev
```
## Deployment
See [DEPLOYMENT.md](./DEPLOYMENT.md) for detailed deployment instructions.
Deploy as a ByteLyst product:
- Product ID: `devops-internal`
- Backend port: 4004 (host) / 4004 (container)
- Web port: 3000 (container) — exposed on host as **`localhost:3049`** under Docker Compose; dev mode (`pnpm dev`) listens directly on `localhost:3000`. See [`DEPLOYMENT.md`](./DEPLOYMENT.md) for the full port table.
- Use existing deployment scripts in parent directory
- Public API base: `https://api.bytelyst.com/devops`
## Production Features
The dashboard includes comprehensive production-ready features:
- **CI/CD Pipeline**: Gitea Actions with build, test, typecheck, lint, E2E tests
- **Security**: CSRF protection, rate limiting, CORS, security headers
- **Monitoring**: System metrics, Docker management, performance tracking
- **Operations**: Database migrations, backup/restore, audit logging
- **Accessibility**: ARIA labels, keyboard navigation, skip links
- **PWA**: Web app manifest, mobile-friendly
- **Documentation**: OpenAPI/Swagger at `/docs`
See [DEPLOYMENT.md](./DEPLOYMENT.md) for complete deployment guide.