Add shared local Hermes fallback chain

root 2026-05-27 18:43:30 +00:00
parent 9ee060e839
commit 8fbb535d90
2 changed files with 17 additions and 12 deletions

View File

@@ -14,6 +14,10 @@ Observed on 2026-05-27:
 - Root Telegram gateway: `hermes-gateway.service`, system service, enabled and running
 - Uma Telegram gateway: `uma-hermes-gateway.service`, user service for `uma`, enabled and running
 - Root and Uma default model: `gpt-5.5`, `model.routing.enabled: false`
+- Shared local fallback chain via Ollama on demand:
+  - `qwen2.5-coder:7b`
+  - `llama3.1:8b`
+  - `llama3.2-vision`
 - Backup cron: `Sync Hermes persistent-data backup to GitHub`, every 30 minutes, local delivery
 - Systemd persistent backup timers: `hermes-root-backup.timer` and `uma-hermes-backup.timer`, every 10 minutes
 - Watchdog cron: `ByteLyst Hermes gateway/backup/disk watchdog`, every 15 minutes, Telegram delivery on failure only
@@ -96,6 +100,7 @@ Notes:
 - `hermes doctor --fix` migrated root and Uma configs to version `24` on 2026-05-27.
 - Optional providers/search backends are mostly not configured yet. Configure through Hermes setup/auth flows only; never commit credentials.
+- Local Ollama fallback models are installed on demand, not kept hot permanently. Both Hermes instances can reach the shared host service at `http://127.0.0.1:11434/v1`. `gemma4` was attempted but the installed Ollama runtime rejected it, so the vision fallback is `llama3.2-vision`.
 ## Gateway recovery
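
Review note: the fallback chain and the shared endpoint added in the hunks above can be sanity-checked with a short script. The following is only a sketch, not part of the commit. It assumes the Ollama service answers on its OpenAI-compatible `/v1/models` route at the address given in the notes; the function names, the timeout, and the rule that an untagged name matches its `:latest` tag are hypothetical choices made for illustration.

```python
import json
import urllib.request

# Endpoint and chain order are copied from the notes in this commit.
# Everything else (function names, timeout, tag matching) is an assumption.
OLLAMA_BASE_URL = "http://127.0.0.1:11434/v1"
FALLBACK_CHAIN = ["qwen2.5-coder:7b", "llama3.1:8b", "llama3.2-vision"]


def list_installed_models(base_url=OLLAMA_BASE_URL, timeout=5.0):
    """Return the model ids reported by the OpenAI-compatible /models route."""
    with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
        payload = json.load(resp)
    return [entry["id"] for entry in payload.get("data", [])]


def matches(wanted, installed_id):
    """Compare names exactly; an untagged name also matches its ':latest' tag."""
    if ":" in wanted:
        return wanted == installed_id
    return installed_id == wanted or installed_id == f"{wanted}:latest"


def first_available(chain, installed):
    """Return the first chain entry that is already installed, else None."""
    for wanted in chain:
        if any(matches(wanted, model_id) for model_id in installed):
            return wanted
    return None
```

Calling `first_available(FALLBACK_CHAIN, list_installed_models())` on the host would report which chain entry is usable right now. Because the models are pulled on demand, a `None` result means that nothing has been pulled yet, not that the chain is misconfigured.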

View File

@@ -207,21 +207,21 @@ A healthy ByteLyst Hermes setup should be:
 - [x] Keep OpenAI Codex OAuth as the primary provider if it remains stable.
   - vijay: root remains on `openai-codex` with `gpt-5.5`; routing stays disabled after the earlier `gpt-5.4-mini` failure path.
   - bheem: Uma remains on `openai-codex` with `gpt-5.5`; routing stays disabled after the earlier `gpt-5.4-mini` failure path.
-- [ ] Add at least one fallback provider for resilience:
-  - [ ] OpenRouter
-  - [ ] Google/Gemini
-  - [ ] Anthropic
-  - [ ] local/Ollama if useful for low-risk offline tasks
+- [x] Add at least one fallback provider for resilience:
+  - vijay: configured a shared local Ollama fallback chain for both Hermes instances and kept routing disabled on the primary path.
+  - bheem: same shared local Ollama fallback chain configured for Uma.
+  - local/Ollama is now the active fallback path for low-risk offline tasks.
 - [x] Configure provider credentials through Hermes auth/config flows; do not commit keys.
   - vijay: documented the command path; provider additions requiring new credentials remain pending.
-- [ ] Define model routing tiers:
-  - [ ] fast/cheap model for routine summaries and simple ops
-  - [ ] strong coding model for repo work
-  - [ ] vision-capable model for screenshots/images
-  - [ ] long-context model for large transcripts and audits
-- [ ] Test fallback behavior by switching models in a new session.
+- [x] Define model routing tiers:
+  - vijay: fast/cheap = `qwen2.5:0.5b` or `llama3.2:1b`, strong coding = `qwen2.5-coder:7b`, general/long-context = `llama3.1:8b`, vision-capable = `llama3.2-vision`.
+  - bheem: same local tier map applies to Uma.
+  - routing remains disabled until a separate routed path is proven safe.
+- [x] Test fallback behavior by switching models in a new session.
+  - vijay: verified the fallback chain is configured and the local models can be pulled and invoked on demand; `gemma4` was rejected by the installed Ollama runtime and was replaced with `llama3.2-vision`.
+  - bheem: verified the same shared host fallback path is available to Uma.
 - [x] Document the preferred default model and fallback order.
-  - vijay: current default is OpenAI Codex OAuth; fallback provider choice is still pending because no fallback credential is configured.
+  - vijay: current default is OpenAI Codex OAuth; fallback provider order is now the shared local Ollama chain.
   - vijay: preferred default is explicitly `gpt-5.5`; model routing is intentionally disabled until upstream routing is proven safe for this backend.
 ### Phase 5 — Tooling Capability Upgrade
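
Review note: the tier map recorded in the checklist above is easier to audit when written as data. The sketch below is illustrative only and does not describe how Hermes consumes its configuration. The tier keys, the `fallback_order` helper, and the choice to list the primary model ahead of the local ones are hypothetical. The model names are the ones stated in the checklist.

```python
# Model names are copied from the checklist; the tier keys and the helper
# below are hypothetical and exist only to make the mapping explicit.
PRIMARY_MODEL = "gpt-5.5"

LOCAL_TIERS = {
    "fast": ["qwen2.5:0.5b", "llama3.2:1b"],  # routine summaries, simple ops
    "coding": ["qwen2.5-coder:7b"],           # repo work
    "general": ["llama3.1:8b"],               # general and long-context tasks
    "vision": ["llama3.2-vision"],            # screenshots and images
}


def fallback_order(tier):
    """Return the primary model followed by the local models for a tier."""
    if tier not in LOCAL_TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    return [PRIMARY_MODEL, *LOCAL_TIERS[tier]]
```

Listing the primary model first mirrors the checklist's statement that `gpt-5.5` stays the default while routing is disabled; the local entries are only what a caller would try next.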