# ByteLyst DevOps Dashboard

Internal DevOps dashboard for deployment orchestration and service monitoring across ByteLyst products.

## Architecture

```
dashboard/
├── backend/              # Fastify 5 backend (port 4004)
│   └── src/
│       ├── lib/          # Config, auth, Cosmos
│       └── modules/      # Services, deployments, health
├── web/                  # Next.js 16 frontend (port 3000)
│   └── src/
│       ├── app/          # Pages
│       └── lib/          # API client, auth
└── shared/
    └── product.json      # Product identity
```

## Features

- **Service Registry**: Manage all ByteLyst services (trading, notes, clock, etc.)
- **Deployment Orchestration**: Trigger deployments via existing bash scripts
- **Health Monitoring**: Real-time health checks for all services with caching
- **Deployment History**: Audit trail of all deployments with captured logs (JSON-polled by the web client; no SSE)
- **Cross-Navigation**: One-click link to Platform Admin dashboard
- **Hermes Mission Control**: Read-only mock dashboard for portfolio-wide execution, task ledger, product health, history, agents, and settings
- **Testing**: Vitest for backend, React Testing Library for frontend
- **Security**: Rate limiting, CORS, security headers, Zod validation
- **Auto-Refresh**: Automatic health status updates every 60 seconds

## Recent Improvements

### Testing Infrastructure
- Added Vitest for backend testing with test files for services and deployments
- Added React Testing Library for frontend with API client tests
- Test scripts: `pnpm test` (watch mode), `pnpm test:run` (CI mode)

### Health Monitoring
- Implemented actual HTTP health checks with a 10-second timeout
- Added 30-second caching to avoid overwhelming services
- Added a User-Agent header to health check requests
- Added an admin endpoint to clear the health cache (`DELETE /api/health/cache`)
- Health status is derived from response time: a response slower than 5 seconds is reported as degraded

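The thresholds above can be sketched as a small classifier plus a TTL cache. This is a minimal illustration, not the backend's actual code: the names `classify` and `HealthCache` are hypothetical, and treating a failed request as `unhealthy` is an assumption (the README only defines the degraded threshold).

```typescript
// Hypothetical names throughout; only the 5s and 30s values come from the README.
type HealthStatus = "healthy" | "degraded" | "unhealthy";

interface HealthResult {
  status: HealthStatus;
  responseTimeMs: number;
  checkedAt: number; // epoch milliseconds
}

const DEGRADED_AFTER_MS = 5_000; // responses slower than 5s count as degraded
const CACHE_TTL_MS = 30_000; // cached results are reused for 30 seconds

// Classify one probe. Assumption: a failed request maps to "unhealthy".
function classify(ok: boolean, responseTimeMs: number): HealthStatus {
  if (!ok) return "unhealthy";
  return responseTimeMs > DEGRADED_AFTER_MS ? "degraded" : "healthy";
}

// Tiny TTL cache keyed by service id; `now` is injectable so it can be tested.
class HealthCache {
  private entries = new Map<string, HealthResult>();

  get(serviceId: string, now: number = Date.now()): HealthResult | undefined {
    const hit = this.entries.get(serviceId);
    if (hit === undefined || now - hit.checkedAt >= CACHE_TTL_MS) return undefined;
    return hit;
  }

  set(serviceId: string, result: HealthResult): void {
    this.entries.set(serviceId, result);
  }

  // Mirrors the intent of `DELETE /api/health/cache`: drop every cached result.
  clear(): void {
    this.entries.clear();
  }
}
```

A caller would check `get()` first and only probe the service when it returns `undefined`, which keeps repeated dashboard refreshes from hitting each service more than once per 30 seconds.
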
### API Validation
- Added Zod schemas for all API routes (services, deployments, health)
- Proper error handling with `BadRequestError` from `@bytelyst/errors`
- Validated path parameters, query parameters, and request bodies
- Strict validation on update operations to prevent accidental field changes

### Deployment Logs
- Endpoint `GET /api/deployments/:id/logs` returns the full captured stdout/stderr plus the current status as a single JSON payload (admin only).
- The web client polls this endpoint while a deployment is `running`. There is intentionally no SSE/WebSocket stream: the previous attempt with `fastify-sse-v2` was incompatible with Fastify 5 and was removed. If a real-time stream is needed later, implement it explicitly via `reply.raw` and update this section in the same change.

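The polling behavior described above might look roughly like the following. It is a sketch under stated assumptions: the `{ status, logs }` response shape, the 2-second interval, the attempt cap, and the function names are all hypothetical, and the fetcher is injected so the loop does not depend on any particular HTTP client.

```typescript
// Assumed response shape; the README only says the payload carries the
// captured stdout/stderr and the current status.
interface DeploymentLogs {
  status: string; // "running" is the only value named in the README
  logs: string;
}

// Injected so the loop can wrap `GET /api/deployments/:id/logs` or a test double.
type FetchLogs = (deploymentId: string) => Promise<DeploymentLogs>;

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Poll until the deployment leaves the "running" state or attempts run out.
async function pollDeploymentLogs(
  fetchLogs: FetchLogs,
  deploymentId: string,
  onUpdate: (snapshot: DeploymentLogs) => void,
  intervalMs = 2_000, // hypothetical default
  maxAttempts = 300, // hypothetical default
): Promise<DeploymentLogs> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const snapshot = await fetchLogs(deploymentId);
    onUpdate(snapshot); // let the UI re-render with the latest captured output
    if (snapshot.status !== "running") return snapshot;
    if (attempt < maxAttempts) await sleep(intervalMs);
  }
  throw new Error(`Deployment ${deploymentId} still running after ${maxAttempts} polls`);
}
```

Because each response carries the full captured output, the client can simply replace what it displays on every poll instead of merging partial chunks.
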
### Security Enhancements
- Added rate limiting: 100 requests per minute per IP
- Improved CORS with allowed origins whitelist
- Added security headers: X-Content-Type-Options, X-Frame-Options, X-XSS-Protection, HSTS, Referrer-Policy
- OPTIONS preflight request handling
- Credentials support for authenticated requests

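To make the "100 requests per minute per IP" rule concrete, here is a plain fixed-window counter. It is illustrative only: the README does not say which library or algorithm enforces the limit, so `FixedWindowLimiter` and its behavior at window boundaries are assumptions rather than a description of the backend.

```typescript
// Illustrative fixed-window limiter; not the dashboard's actual implementation.
const LIMIT = 100; // requests allowed per window (from the README)
const WINDOW_MS = 60_000; // one minute

interface WindowState {
  windowStart: number; // epoch milliseconds when the current window opened
  count: number; // requests seen in the current window
}

class FixedWindowLimiter {
  private windows = new Map<string, WindowState>();

  // Returns true when the request is allowed, false when it should be rejected
  // (typically with HTTP 429). `now` is injectable for testing.
  allow(ip: string, now: number = Date.now()): boolean {
    const state = this.windows.get(ip);
    if (state === undefined || now - state.windowStart >= WINDOW_MS) {
      // First request from this IP, or the previous window has elapsed.
      this.windows.set(ip, { windowStart: now, count: 1 });
      return true;
    }
    if (state.count >= LIMIT) return false;
    state.count += 1;
    return true;
  }
}
```

A fixed window is the simplest variant to reason about; it can admit short bursts across a window boundary, which sliding-window or token-bucket schemes avoid.
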
### Auto-Refresh
- Automatic health status refresh every 60 seconds
- Manual refresh button to clear cache and force health checks
- Visual feedback with spinning icon during refresh
- Last health check timestamp displayed on service cards

## Setup

### Prerequisites

- Node.js 22+
- pnpm 10.6.5
- Azure Cosmos DB credentials
- Platform Service URL
- Access to @bytelyst/* packages (via common-plat workspace or Gitea registry)

### Installation

The dashboard uses the `.pnpmfile.cjs` pattern for dynamic dependency resolution, supporting both local workspace and Gitea registry modes.

```bash
# For local development (uses workspace links to learning_ai_common_plat)
pnpm install:common-plat

# For production (uses Gitea registry at localhost:3300)
pnpm install:gitea
```

### Backend

```bash
cd backend
cp .env.example .env  # Add your credentials
pnpm dev              # Runs on port 4004
```

### Frontend

```bash
cd web
cp .env.local.example .env.local  # Add your URLs
pnpm dev                          # Next dev server on http://localhost:3000 (no Docker)
```

### Running Both

```bash
# From dashboard root
pnpm dev
```

## Environment Variables

### Backend (.env)

```
PORT=4004
PLATFORM_SERVICE_URL=http://localhost:4003
COSMOS_ENDPOINT=https://your-cosmos.documents.azure.com:443/
COSMOS_KEY=your-cosmos-key
COSMOS_DATABASE=bytelyst-platform
JWT_SECRET=your-jwt-secret
```

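A fail-fast loader for these variables could look like the sketch below. The variable names come from the block above; everything else is an assumption made for illustration, including the `BackendConfig` shape, the helper names, and falling back to port 4004 when `PORT` is unset. The backend's real config module may differ.

```typescript
// Hypothetical config shape; field names are illustrative.
interface BackendConfig {
  port: number;
  platformServiceUrl: string;
  cosmosEndpoint: string;
  cosmosKey: string;
  cosmosDatabase: string;
  jwtSecret: string;
}

type Env = Record<string, string | undefined>;

// Read one variable and fail loudly if it is missing or blank.
function required(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Build the config from an env-like object (pass `process.env` at startup).
function loadBackendConfig(env: Env): BackendConfig {
  const rawPort = env["PORT"] ?? "4004"; // assumption: default to the documented port
  const port = Number(rawPort);
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    throw new Error(`Invalid PORT: ${rawPort}`);
  }
  return {
    port,
    platformServiceUrl: required(env, "PLATFORM_SERVICE_URL"),
    cosmosEndpoint: required(env, "COSMOS_ENDPOINT"),
    cosmosKey: required(env, "COSMOS_KEY"),
    cosmosDatabase: required(env, "COSMOS_DATABASE"),
    jwtSecret: required(env, "JWT_SECRET"),
  };
}
```

Validating once at startup means a missing secret surfaces as a clear error message rather than as a failed request later.
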
### Frontend (.env.local)

```
NEXT_PUBLIC_DEVOPS_API_URL=http://localhost:4004
NEXT_PUBLIC_PLATFORM_URL=http://localhost:4003
```

Production deployments use `https://api.bytelyst.com/devops` for `NEXT_PUBLIC_DEVOPS_API_URL` and `https://api.bytelyst.com/platform/api` for `NEXT_PUBLIC_PLATFORM_URL`.

## Usage

1. **Seed Services**: Click "Seed Services" on the dashboard to register default services
2. **Deploy**: Click "Deploy" on any service card to trigger deployment
3. **Monitor**: View real-time health status and deployment history
4. **Platform Admin**: Click "Platform Admin" link to jump to the admin dashboard
5. **Hermes Mission Control**: Visit `/hermes` for the mock executive command center and the companion routes `/hermes/tasks`, `/hermes/tasks/[id]`, `/hermes/products`, `/hermes/history`, `/hermes/agents`, and `/hermes/settings`

## Integration with Platform Admin

- DevOps dashboard links to admin-web at `http://localhost:3001`
- Admin-web should have a reciprocal link back to DevOps dashboard
- Both use platform-service for authentication

## API Endpoints

### Services
- `GET /api/services` - List all services
- `GET /api/services/:id` - Get single service
- `POST /api/services` - Create service (admin only)
- `PUT /api/services/:id` - Update service (admin only)
- `DELETE /api/services/:id` - Delete service (admin only)

### Deployments
- `GET /api/deployments` - Recent deployments (with `?limit=` query param)
- `GET /api/deployments/service/:serviceId` - Deployments for specific service
- `GET /api/deployments/:id` - Single deployment
- `GET /api/deployments/:id/logs` - Get captured deployment logs as JSON (web client polls this; no SSE)
- `POST /api/deployments/trigger/:serviceId` - Trigger deployment (admin only)

### Health
- `GET /api/health` - Health of all services
- `GET /api/health/:serviceId` - Health of specific service
- `DELETE /api/health/cache` - Clear health cache (admin only)

### Seed
- `POST /api/seed` - Seed default services (admin only)

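As a rough guide to calling the routes listed above, the helpers below build request URLs for a few of them. The paths are taken from the list; the helper names are hypothetical, and sending the token as a `Bearer` header is an assumption, since this README does not specify how the admin-only routes receive credentials.

```typescript
// Local backend address from the Setup section.
const API_BASE = "http://localhost:4004";

// GET /api/deployments, optionally with the documented `?limit=` parameter.
function deploymentsUrl(base: string, limit?: number): string {
  const query = limit === undefined ? "" : `?limit=${encodeURIComponent(String(limit))}`;
  return `${base}/api/deployments${query}`;
}

// GET /api/deployments/:id/logs
function deploymentLogsUrl(base: string, deploymentId: string): string {
  return `${base}/api/deployments/${encodeURIComponent(deploymentId)}/logs`;
}

// POST /api/deployments/trigger/:serviceId (admin only)
function triggerDeploymentUrl(base: string, serviceId: string): string {
  return `${base}/api/deployments/trigger/${encodeURIComponent(serviceId)}`;
}

// Assumption: the token travels as a bearer token in the Authorization header.
function authHeaders(token: string): Record<string, string> {
  return { Authorization: `Bearer ${token}` };
}
```

With these in place, a trigger call could be written as `fetch(triggerDeploymentUrl(API_BASE, id), { method: "POST", headers: authHeaders(token) })`, again assuming bearer authentication.
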
## Development

```bash
# Backend typecheck (subshell keeps the current directory unchanged)
(cd backend && pnpm typecheck)

# Frontend typecheck
(cd web && pnpm typecheck)

# Run tests (watch mode)
pnpm test

# Run tests (CI mode)
pnpm test:run

# Run backend and web dev servers together
pnpm --filter backend dev & pnpm --filter web dev
```

## Deployment

See [DEPLOYMENT.md](./DEPLOYMENT.md) for detailed deployment instructions.

Deploy as a ByteLyst product:
- Product ID: `devops-internal`
- Backend port: 4004 (host) / 4004 (container)
- Web port: 3000 (container), exposed on the host as **`localhost:3049`** under Docker Compose; dev mode (`pnpm dev`) listens directly on `localhost:3000`. See [`DEPLOYMENT.md`](./DEPLOYMENT.md) for the full port table.
- Use existing deployment scripts in parent directory
- Public API base: `https://api.bytelyst.com/devops`

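The web port rule above (3000 in dev mode, 3049 on the host under Docker Compose) can be captured in a small lookup, for example in a script that opens or smoke-tests the dashboard. The names `RunMode` and `webUrl` are hypothetical; only the port numbers come from this README.

```typescript
// How the web app is being run locally.
type RunMode = "dev" | "compose";

// Host-side port for each mode. The container itself always listens on 3000;
// Docker Compose publishes it on 3049.
const WEB_HOST_PORT: Record<RunMode, number> = {
  dev: 3000,
  compose: 3049,
};

// Backend port is the same on the host and in the container.
const BACKEND_PORT = 4004;

function webUrl(mode: RunMode): string {
  return `http://localhost:${WEB_HOST_PORT[mode]}`;
}

function backendUrl(): string {
  return `http://localhost:${BACKEND_PORT}`;
}
```
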
## Production Features

The dashboard includes the following production-oriented features:

- **CI/CD Pipeline**: Gitea Actions with build, test, typecheck, lint, E2E tests
- **Security**: CSRF protection, rate limiting, CORS, security headers
- **Monitoring**: System metrics, Docker management, performance tracking
- **Operations**: Database migrations, backup/restore, audit logging
- **Accessibility**: ARIA labels, keyboard navigation, skip links
- **PWA**: Web app manifest, mobile-friendly
- **Documentation**: OpenAPI/Swagger at `/docs`

See [DEPLOYMENT.md](./DEPLOYMENT.md) for the complete deployment guide.