bytelyst-devops-tools/dashboard/README.md

220 lines
7.2 KiB
Markdown

# ByteLyst DevOps Dashboard
Internal DevOps dashboard for deployment orchestration and service monitoring across ByteLyst products.
## Architecture
```
dashboard/
├── backend/ # Fastify 5 backend (port 4004)
│ └── src/
│ ├── lib/ # Config, auth, Cosmos
│ └── modules/ # Services, deployments, health
├── web/ # Next.js 16 frontend (port 3000)
│ └── src/
│ ├── app/ # Pages
│ └── lib/ # API client, auth
└── shared/
└── product.json # Product identity
```
## Features
- **Service Registry**: Manage all ByteLyst services (trading, notes, clock, etc.)
- **Deployment Orchestration**: Trigger deployments via existing bash scripts
- **Health Monitoring**: Real-time health checks for all services with caching
- **Deployment History**: Audit trail of all deployments with log streaming
- **Cross-Navigation**: One-click link to Platform Admin dashboard
- **Hermes Mission Control**: Read-only mock dashboard for portfolio-wide execution, task ledger, product health, history, agents, and settings
- **Testing**: Vitest for backend, React Testing Library for frontend
- **Security**: Rate limiting, CORS, security headers, Zod validation
- **Auto-Refresh**: Automatic health status updates every 60 seconds
## Recent Improvements
### Testing Infrastructure
- Added Vitest for backend testing with test files for services and deployments
- Added React Testing Library for frontend with API client tests
- Test scripts: `pnpm test` (watch mode), `pnpm test:run` (CI mode)
### Health Monitoring
- Implemented actual HTTP health checks with 10-second timeout
- Added 30-second caching to avoid overwhelming services
- Added User-Agent header for health check requests
- Added admin endpoint to clear health cache (`DELETE /api/health/cache`)
- Health status determined by response time: >5s = degraded
### API Validation
- Added Zod schemas for all API routes (services, deployments, health)
- Proper error handling with BadRequestError from @bytelyst/errors
- Validated path parameters, query parameters, and request bodies
- Strict validation on update operations to prevent accidental field changes
### Deployment Log Streaming
- Added SSE endpoint for real-time log streaming (`GET /api/deployments/:id/logs`)
- Frontend EventSource integration with cleanup function
- Automatic polling for running deployments (1-second interval)
- Proper connection cleanup on client disconnect
### Security Enhancements
- Added rate limiting: 100 requests per minute per IP
- Improved CORS with allowed origins whitelist
- Added security headers: X-Content-Type-Options, X-Frame-Options, X-XSS-Protection, HSTS, Referrer-Policy
- OPTIONS preflight request handling
- Credentials support for authenticated requests
### Auto-Refresh
- Automatic health status refresh every 60 seconds
- Manual refresh button to clear cache and force health checks
- Visual feedback with spinning icon during refresh
- Last health check timestamp displayed on service cards
## Setup
### Prerequisites
- Node.js 22+
- pnpm 10.6.5
- Azure Cosmos DB credentials
- Platform Service URL
- Access to @bytelyst/* packages (via common-plat workspace or Gitea registry)
### Installation
The dashboard uses the .pnpmfile.cjs pattern for dynamic dependency resolution, supporting both local workspace and Gitea registry modes.
```bash
# For local development (uses workspace links to learning_ai_common_plat)
pnpm install:common-plat
# For production (uses Gitea registry at localhost:3300)
pnpm install:gitea
```
### Backend
```bash
cd backend
cp .env.example .env # Add your credentials
pnpm dev # Runs on port 4004
```
### Frontend
```bash
cd web
cp .env.local.example .env.local # Add your URLs
pnpm dev # Runs on port 3000
```
### Running Both
```bash
# From dashboard root
pnpm dev
```
## Environment Variables
### Backend (.env)
```
PORT=4004
PLATFORM_SERVICE_URL=http://localhost:4003
COSMOS_ENDPOINT=https://your-cosmos.documents.azure.com:443/
COSMOS_KEY=your-cosmos-key
COSMOS_DATABASE=bytelyst-platform
JWT_SECRET=your-jwt-secret
```
### Frontend (.env.local)
```
NEXT_PUBLIC_DEVOPS_API_URL=http://localhost:4004
NEXT_PUBLIC_PLATFORM_URL=http://localhost:4003
```
Production deployments use `https://api.bytelyst.com/devops` for `NEXT_PUBLIC_DEVOPS_API_URL` and `https://api.bytelyst.com/platform/api` for `NEXT_PUBLIC_PLATFORM_URL`.
## Usage
1. **Seed Services**: Click "Seed Services" on the dashboard to register default services
2. **Deploy**: Click "Deploy" on any service card to trigger deployment
3. **Monitor**: View real-time health status and deployment history
4. **Platform Admin**: Click "Platform Admin" link to jump to the admin dashboard
5. **Hermes Mission Control**: Visit `/hermes` for the mock executive command center and the companion routes `/hermes/tasks`, `/hermes/tasks/[id]`, `/hermes/products`, `/hermes/history`, `/hermes/agents`, and `/hermes/settings`
## Integration with Platform Admin
- DevOps dashboard links to admin-web at `http://localhost:3001`
- Admin-web should have a reciprocal link back to DevOps dashboard
- Both use platform-service for authentication
## API Endpoints
### Services
- `GET /api/services` - List all services
- `GET /api/services/:id` - Get single service
- `POST /api/services` - Create service (admin only)
- `PUT /api/services/:id` - Update service (admin only)
- `DELETE /api/services/:id` - Delete service (admin only)
### Deployments
- `GET /api/deployments` - Recent deployments (with `?limit=` query param)
- `GET /api/deployments/service/:serviceId` - Deployments for specific service
- `GET /api/deployments/:id` - Single deployment
- `GET /api/deployments/:id/logs` - Stream deployment logs via SSE
- `POST /api/deployments/trigger/:serviceId` - Trigger deployment (admin only)
### Health
- `GET /api/health` - Health of all services
- `GET /api/health/:serviceId` - Health of specific service
- `DELETE /api/health/cache` - Clear health cache (admin only)
### Seed
- `POST /api/seed` - Seed default services (admin only)
## Development
```bash
# Backend typecheck
cd backend && pnpm typecheck
# Frontend typecheck
cd web && pnpm typecheck
# Run tests (watch mode)
pnpm test
# Run tests (CI mode)
pnpm test:run
# Run both
pnpm --filter backend dev & pnpm --filter web dev
```
## Deployment
See [DEPLOYMENT.md](./DEPLOYMENT.md) for detailed deployment instructions.
Deploy as a ByteLyst product:
- Product ID: `devops-internal`
- Backend port: 4004
- Web port: 3000
- Use existing deployment scripts in parent directory
- Public API base: `https://api.bytelyst.com/devops`
## Production Features
The dashboard includes comprehensive production-ready features:
- **CI/CD Pipeline**: Gitea Actions with build, test, typecheck, lint, E2E tests
- **Security**: CSRF protection, rate limiting, CORS, security headers
- **Monitoring**: System metrics, Docker management, performance tracking
- **Operations**: Database migrations, backup/restore, audit logging
- **Accessibility**: ARIA labels, keyboard navigation, skip links
- **PWA**: Web app manifest, mobile-friendly
- **Documentation**: OpenAPI/Swagger at `/docs`
See [DEPLOYMENT.md](./DEPLOYMENT.md) for complete deployment guide.