saravanakumardb1
|
a77b3ff931
|
refactor(extraction-service): provider-agnostic transcription — OpenAI + Azure Speech + Mock
- TranscriptionProvider interface with transcribe() + isConfigured()
- OpenAITranscriptionProvider: Whisper API (existing behavior)
- AzureTranscriptionProvider: Azure Speech REST API for short audio
- MockTranscriptionProvider: deterministic results for testing
- Factory: getSTT() singleton with env-driven auto-detection
- STT_PROVIDER=openai|azure|mock (explicit)
- Auto-detect: AZURE_SPEECH_KEY → azure, OPENAI_API_KEY → openai, else mock
- Config: add STT_PROVIDER, AZURE_SPEECH_KEY, AZURE_SPEECH_REGION env vars
- Route refactored: audio download (common) → provider.transcribe() (swappable)
- deriveFilename() extracted to types.ts (shared by route + providers)
- 35 transcription tests (was 12), 171 total passing
- Follows same pattern as @bytelyst/llm provider abstraction
|
2026-04-06 11:30:22 -07:00 |
|
saravanakumardb1
|
cc3fbf8187
|
feat(extraction-service): add /api/transcribe route — speech-to-text via OpenAI Whisper API
- POST /api/transcribe: download audio from URL, call Whisper API, return transcript
- Types: TranscribeRequestSchema (Zod), TranscribeResponse, SUPPORTED_AUDIO_TYPES
- Guards: 25MB size limit, 30s download timeout, 120s Whisper timeout, 429 rate limit
- Config: OPENAI_API_KEY, OPENAI_BASE_URL, WHISPER_MODEL env vars
- 12 new tests (schema validation + constants)
- Registered in server.ts alongside extract + task routes
|
2026-04-06 11:10:57 -07:00 |
|
saravanakumardb1
|
c292bb5cc1
|
feat(extraction): scaffold extraction-service + @bytelyst/extraction package
- extraction-service: Fastify scaffold (port 4005) with extract/tasks modules
- src/lib/: config, errors, cosmos, product-config, python-bridge
- src/modules/extract/: types (Zod schemas), routes (POST /extract, batch, models)
- src/modules/tasks/: types, repository (Cosmos CRUD), routes (CRUD endpoints)
- Python sidecar: FastAPI app, LangExtract wrapper, models, task registry
- @bytelyst/extraction package: types, client factory, index exports
- Both pnpm build pass clean
|
2026-02-14 13:31:40 -08:00 |
|