docs: document local LLM utility workflows
Some checks failed
pre-commit / pre-commit (push) Failing after 33s
Some checks failed
pre-commit / pre-commit (push) Failing after 33s
This commit is contained in:
parent
44fd6a462a
commit
0e1905aa33
156
docs/llm-utility-workflows.md
Normal file
156
docs/llm-utility-workflows.md
Normal file
@ -0,0 +1,156 @@
|
||||
# LLM Utility Workflows: FreeLLMAPI + MarkItDown
|
||||
|
||||
This VM has two private utilities for cheaper, safer, and more token-efficient AI workflows.
|
||||
|
||||
## FreeLLMAPI private fallback gateway
|
||||
|
||||
FreeLLMAPI is installed as a private OpenAI-compatible gateway for low-stakes fallback or optional use.
|
||||
|
||||
### Runtime
|
||||
|
||||
- App path: `/opt/freellmapi/app`
|
||||
- Persistent database: `/var/lib/freellmapi/data/freeapi.db`
|
||||
- Service: `freellmapi.service`
|
||||
- Base URL: `http://127.0.0.1:3001/v1`
|
||||
- Client env file: `/etc/freellmapi/client.env`
|
||||
- Status helper: `freellmapi-status`
|
||||
- Provider-key helper: `freellmapi-add-key`
|
||||
|
||||
The service is intentionally loopback-only. Do not expose it through Caddy or publish it on a Docker public interface; its dashboard/admin APIs are local-trust-only.
|
||||
|
||||
### Operations
|
||||
|
||||
```bash
|
||||
systemctl status freellmapi.service --no-pager
|
||||
freellmapi-status
|
||||
journalctl -u freellmapi.service -n 100 --no-pager
|
||||
systemctl restart freellmapi.service
|
||||
```
|
||||
|
||||
Expected bind check:
|
||||
|
||||
```text
|
||||
127.0.0.1:3001
|
||||
```
|
||||
|
||||
There should be no `0.0.0.0:3001` or `:::3001` listener.
|
||||
|
||||
### Add provider keys
|
||||
|
||||
FreeLLMAPI is installed and healthy, but it needs provider keys before chat fallback is usable. Add keys without echoing secrets:
|
||||
|
||||
```bash
|
||||
printf '%s' "$OPENROUTER_API_KEY" | freellmapi-add-key openrouter "main openrouter"
|
||||
printf '%s' "$GEMINI_API_KEY" | freellmapi-add-key google "main gemini"
|
||||
printf '%s' "$GROQ_API_KEY" | freellmapi-add-key groq "main groq"
|
||||
```
|
||||
|
||||
For Cloudflare Workers AI, the expected FreeLLMAPI key format is:
|
||||
|
||||
```text
|
||||
account_id:api_token
|
||||
```
|
||||
|
||||
Check key counts without revealing secrets:
|
||||
|
||||
```bash
|
||||
freellmapi-status
|
||||
```
|
||||
|
||||
### Use from OpenAI-compatible clients
|
||||
|
||||
Load local client config:
|
||||
|
||||
```bash
|
||||
set -a
|
||||
source /etc/freellmapi/client.env
|
||||
set +a
|
||||
```
|
||||
|
||||
Then configure clients with:
|
||||
|
||||
- Base URL: `$FREELLMAPI_BASE_URL`
|
||||
- API key: `$FREELLMAPI_API_KEY`
|
||||
- Model: `auto`
|
||||
|
||||
Use this only for non-sensitive, low-stakes, or fallback workloads. Prompts still go to third-party providers selected by the gateway.
|
||||
|
||||
### Safety boundaries
|
||||
|
||||
- Do not send secrets, credentials, customer data, or sensitive incident details.
|
||||
- Do not make this public.
|
||||
- Treat it as opportunistic capacity, not an SLA-backed production dependency.
|
||||
- Prefer primary paid/high-quality providers for serious Hermes coding/devops work.
|
||||
|
||||
## MarkItDown document-to-Markdown workflow
|
||||
|
||||
Microsoft MarkItDown is installed for token-efficient local document extraction before LLM analysis.
|
||||
|
||||
### Runtime
|
||||
|
||||
- Venv: `/opt/markitdown/venv`
|
||||
- Direct CLI: `bytelyst-markitdown`
|
||||
- Safe wrapper: `bytelyst-doc2md`
|
||||
- Status helper: `markitdown-status`
|
||||
- Hermes skill: `markitdown-document-workflow`
|
||||
|
||||
Installed support covers common local document formats: PDF, DOCX, PPTX, XLSX/XLS, HTML, CSV/JSON/XML/text-like inputs supported by MarkItDown.
|
||||
|
||||
### Standard conversion
|
||||
|
||||
```bash
|
||||
bytelyst-doc2md /path/to/document.pdf -o /tmp/document.md
|
||||
python3 -m json.tool /tmp/document.md.stats.json
|
||||
```
|
||||
|
||||
The stats file records source size, SHA256, output size, token estimate, truncation status, and conversion time.
|
||||
|
||||
For large files:
|
||||
|
||||
```bash
|
||||
bytelyst-doc2md /path/to/document.pdf -o /tmp/document.head.md --max-chars 50000
|
||||
```
|
||||
|
||||
### Security behavior
|
||||
|
||||
`bytelyst-doc2md` is the preferred wrapper because it:
|
||||
|
||||
- refuses URLs by default,
|
||||
- disables plugins by default,
|
||||
- strips long base64 data URIs by default,
|
||||
- records source SHA256 for local files,
|
||||
- writes token estimates before content is loaded into an LLM context.
|
||||
|
||||
Use `--allow-url` only for trusted URLs after considering SSRF/local-file exposure. Prefer downloading the file with a vetted tool first, then converting the local path.
|
||||
|
||||
### Verification
|
||||
|
||||
```bash
|
||||
markitdown-status
|
||||
```
|
||||
|
||||
Expected output includes:
|
||||
|
||||
```text
|
||||
bytelyst-doc2md: ok
|
||||
source_sha256_present: True
|
||||
markdown_heading_present: True
|
||||
url_refusal: ok
|
||||
```
|
||||
|
||||
## Hermes usage pattern
|
||||
|
||||
When a user sends a document or asks to analyze a Drive/PDF/Office file:
|
||||
|
||||
1. Load the `markitdown-document-workflow` skill.
|
||||
2. Download or locate the file locally.
|
||||
3. Convert with `bytelyst-doc2md`.
|
||||
4. Read the stats JSON first.
|
||||
5. Search/page the generated Markdown instead of loading the whole file.
|
||||
6. Fall back to OCR tooling only when MarkItDown output is incomplete or image-only.
|
||||
|
||||
When a user asks to save LLM/provider cost:
|
||||
|
||||
1. Use primary Hermes provider for sensitive or high-stakes work.
|
||||
2. Use FreeLLMAPI only for low-stakes fallback after provider keys have been added.
|
||||
3. Confirm `freellmapi-status` shows at least one provider key before routing work to it.
|
||||
@ -52,6 +52,7 @@ Current key files:
|
||||
- `docs/remove_user_interactive.md`
|
||||
- `docs/hermes-setup-upgrade-roadmap.md`
|
||||
- `docs/hermes-operations.md`
|
||||
- `docs/llm-utility-workflows.md`
|
||||
- `docs/vm-security-blind-spots-roadmap.md`
|
||||
- `docs/vm-exposure-inventory.md`
|
||||
|
||||
|
||||
Loading…
Reference in New Issue
Block a user