4.4 KiB
LLM Utility Workflows: FreeLLMAPI + MarkItDown
This VM has two private utilities for cheaper, safer, and more token-efficient AI workflows.
FreeLLMAPI private fallback gateway
FreeLLMAPI is installed as a private OpenAI-compatible gateway for low-stakes fallback or optional use.
Runtime
- App path:
/opt/freellmapi/app - Persistent database:
/var/lib/freellmapi/data/freeapi.db - Service:
freellmapi.service - Base URL:
http://127.0.0.1:3001/v1 - Client env file:
/etc/freellmapi/client.env - Status helper:
freellmapi-status - Provider-key helper:
freellmapi-add-key
The service is intentionally loopback-only. Do not expose it through Caddy or publish it on a Docker public interface; its dashboard/admin APIs are local-trust-only.
Operations
systemctl status freellmapi.service --no-pager
freellmapi-status
journalctl -u freellmapi.service -n 100 --no-pager
systemctl restart freellmapi.service
Expected bind check:
127.0.0.1:3001
There should be no 0.0.0.0:3001 or :::3001 listener.
Add provider keys
FreeLLMAPI is installed and healthy, but it needs provider keys before chat fallback is usable. Add keys without echoing secrets:
printf '%s' "$OPENROUTER_API_KEY" | freellmapi-add-key openrouter "main openrouter"
printf '%s' "$GEMINI_API_KEY" | freellmapi-add-key google "main gemini"
printf '%s' "$GROQ_API_KEY" | freellmapi-add-key groq "main groq"
For Cloudflare Workers AI, the expected FreeLLMAPI key format is:
account_id:api_token
Check key counts without revealing secrets:
freellmapi-status
Use from OpenAI-compatible clients
Load local client config:
set -a
source /etc/freellmapi/client.env
set +a
Then configure clients with:
- Base URL:
$FREELLMAPI_BASE_URL - API key:
$FREELLMAPI_API_KEY - Model:
auto
Use this only for non-sensitive, low-stakes, or fallback workloads. Prompts still go to third-party providers selected by the gateway.
Safety boundaries
- Do not send secrets, credentials, customer data, or sensitive incident details.
- Do not make this public.
- Treat it as opportunistic capacity, not an SLA-backed production dependency.
- Prefer primary paid/high-quality providers for serious Hermes coding/devops work.
MarkItDown document-to-Markdown workflow
Microsoft MarkItDown is installed for token-efficient local document extraction before LLM analysis.
Runtime
- Venv:
/opt/markitdown/venv - Direct CLI:
bytelyst-markitdown - Safe wrapper:
bytelyst-doc2md - Status helper:
markitdown-status - Hermes skill:
markitdown-document-workflow
Installed support covers common local document formats: PDF, DOCX, PPTX, XLSX/XLS, HTML, CSV/JSON/XML/text-like inputs supported by MarkItDown.
Standard conversion
bytelyst-doc2md /path/to/document.pdf -o /tmp/document.md
python3 -m json.tool /tmp/document.md.stats.json
The stats file records source size, SHA256, output size, token estimate, truncation status, and conversion time.
For large files:
bytelyst-doc2md /path/to/document.pdf -o /tmp/document.head.md --max-chars 50000
Security behavior
bytelyst-doc2md is the preferred wrapper because it:
- refuses URLs by default,
- disables plugins by default,
- strips long base64 data URIs by default,
- records source SHA256 for local files,
- writes token estimates before content is loaded into an LLM context.
Use --allow-url only for trusted URLs after considering SSRF/local-file exposure. Prefer downloading the file with a vetted tool first, then converting the local path.
Verification
markitdown-status
Expected output includes:
bytelyst-doc2md: ok
source_sha256_present: True
markdown_heading_present: True
url_refusal: ok
Hermes usage pattern
When a user sends a document or asks to analyze a Drive/PDF/Office file:
- Load the
markitdown-document-workflowskill. - Download or locate the file locally.
- Convert with
bytelyst-doc2md. - Read the stats JSON first.
- Search/page the generated Markdown instead of loading the whole file.
- Fall back to OCR tooling only when MarkItDown output is incomplete or image-only.
When a user asks to save LLM/provider cost:
- Use primary Hermes provider for sensitive or high-stakes work.
- Use FreeLLMAPI only for low-stakes fallback after provider keys have been added.
- Confirm
freellmapi-statusshows at least one provider key before routing work to it.