# Debug Service Skill

**Description**: Systematic methodology for diagnosing and fixing failing services or endpoints across the entire stack.

## When to Use

- Any service is returning errors or unexpected behavior
- Health checks are failing
- Tests are failing in CI or locally
- Users report broken functionality

## Prerequisites

- Access to service logs
- Terminal with curl installed
- Ability to run tests locally

## Steps

### 1. Identify the Failing Service

Quickly locate which service is affected:

```bash
# Service locations reference:
# Backend API (Python/FastAPI)   → backend/src/
# Billing Service (Fastify)      → ../learning_ai_common_plat/services/billing-service/src/
# Growth Service (Fastify)       → ../learning_ai_common_plat/services/growth-service/src/
# Platform Service (Fastify)     → ../learning_ai_common_plat/services/platform-service/src/
# Tracker Service (Fastify)      → ../learning_ai_common_plat/services/tracker-service/src/
# Admin Dashboard (Next.js)      → admin-dashboard-web/src/
# User Dashboard (Next.js)       → user-dashboard-web/src/
# Tracker Dashboard (Next.js)    → tracker-dashboard-web/src/
```

### 2. Check Health Status

```bash
# Check all services at once
curl -s http://localhost:8000/health && \
curl -s http://localhost:4003/health
```

### 3. Examine Logs

**For local development:**

```bash
# Check recent logs from all services
tail -n 50 .logs/backend.log .logs/platform-service.log 2>/dev/null | head -n 100
```

**For Docker:**

```bash
docker compose logs --tail=50
```

### 4. Reproduce the Issue

- **API errors**: Use curl with verbose output

  ```bash
  curl -v -X POST http://localhost:8000/api/endpoint -H "Content-Type: application/json" -d '{"key":"value"}'
  ```

- **UI errors**: Check browser console and network tab
- **Test failures**: Run specific tests with verbose output

  ```bash
  # Python
  python -m pytest tests/test_specific.py -v -x

  # TypeScript/Vitest
  pnpm test --reporter=verbose specific.test.ts
  ```

### 5. Fix Methodology

Follow this order to avoid common pitfalls:

1. **Read the test first** - Understand the expected behavior
2. **Read the source code** - Trace the execution path
3. **Fix the source, NOT the test** - Unless the test itself is wrong
4. **Add a regression test** - If none exists for this bug
5. **Run the full test suite** - Ensure no new issues were introduced

### 6. Verify the Fix

```bash
# Python tests
python -m pytest tests/ backend/tests/ -v --tb=short -x

# TypeScript services
cd ../learning_ai_common_plat && pnpm --filter "@lysnrai/<service>" test

# Dashboard builds
cd admin-dashboard-web && npm run build
```

### 7. Commit with Proper Format

```bash
git add .
git commit -m "fix(<scope>): <description>"
# Examples:
# fix(billing): handle null subscription in usage endpoint
# fix(platform): validate JWT token in auth middleware
# fix(admin): resolve dashboard loading state issue
git push
```

## Common Patterns

### Database Connection Issues

```bash
# Check Cosmos DB connectivity
curl -s http://localhost:4003/health | jq .

# Look for database errors in logs
grep -iE "database|cosmos|connection" .logs/*.log
```

### Authentication Issues

```bash
# Decode JWT to check contents
# (JWT segments are unpadded base64url, so base64 -d may reject some tokens)
echo "<token>" | cut -d. -f2 | base64 -d | jq .

# Check that the auth endpoint accepts the token
curl -s http://localhost:4003/api/auth/me -H "Authorization: Bearer <token>"
```

### Service Dependencies

```bash
# Check if dependent services are running
docker compose ps

# Verify service communication
curl -s http://localhost:4003/health | jq .  # consolidated platform service
```

## Notes

- **Always check logs first** - Most issues have clear error messages
- **Isolate the problem** - Don't change multiple things at once
- **Document the fix** - Add comments if the issue was non-obvious
- **Consider edge cases** - Think about what might cause this to fail again

## Related Skills

- [Local Development Setup](./local-development.md) - Start services locally
- [Production Readiness](./production-readiness.md) - Comprehensive validation
- [Test Strategies](./test-strategies.md) - Writing effective tests
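
For the health check in step 2, chaining `curl -s` calls with `&&` stops only on connection failures, because `curl` exits 0 on an HTTP 500 unless `-f` is given. The sketch below is one possible way to report each endpoint separately. The function name `check_health` and the 5-second timeout are illustrative assumptions, not part of this repository.

```bash
# Hypothetical helper: prints OK/FAIL for each URL passed as an argument.
# Returns non-zero if any endpoint is unreachable or answers with HTTP >= 400.
check_health() {
  local url failed=0
  for url in "$@"; do
    # -s: silent, -f: treat HTTP errors as failures, -o: discard the body
    if curl -sf -o /dev/null --max-time 5 "$url"; then
      echo "OK   $url"
    else
      echo "FAIL $url"
      failed=1
    fi
  done
  return "$failed"
}
```

It could be called with the two endpoints used above, for example `check_health http://localhost:8000/health http://localhost:4003/health`.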
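
The commit format from step 7 can be checked before pushing. This is a minimal sketch that assumes scopes are lowercase letters, digits, and hyphens, as in the examples (`billing`, `platform`, `admin`). The function name `is_valid_fix_message` is hypothetical, and the pattern may need loosening if the project allows other scope characters.

```bash
# Hypothetical helper: succeeds if the first line of the message
# looks like "fix(<scope>): <description>".
is_valid_fix_message() {
  printf '%s\n' "$1" | head -n 1 | grep -Eq '^fix\([a-z0-9-]+\): .+'
}
```

For example, `is_valid_fix_message "fix(billing): handle null subscription in usage endpoint" && echo ok` prints `ok`, while a message with an empty scope is rejected.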
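
The JWT one-liner under Authentication Issues pipes the payload straight into `base64 -d`. JWT segments use the URL-safe base64 alphabet without padding, so that can fail on some tokens. The sketch below converts the alphabet and restores the padding first. The name `decode_jwt_payload` is illustrative, and it assumes a `base64` that accepts `-d` (GNU coreutils and recent macOS).

```bash
# Hypothetical helper: prints the decoded payload (second segment) of a JWT.
# It does NOT verify the signature; use it for inspection only.
decode_jwt_payload() {
  local payload pad
  # Take the second dot-separated segment and map base64url to base64
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the '=' padding that JWTs strip
  pad=$(( (4 - ${#payload} % 4) % 4 ))
  while [ "$pad" -gt 0 ]; do
    payload="${payload}="
    pad=$((pad - 1))
  done
  printf '%s' "$payload" | base64 -d
}
```

The output can be piped to `jq .` as in the original command, for example `decode_jwt_payload "<token>" | jq .`.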