# LLM Utility Workflows: FreeLLMAPI + MarkItDown This VM has two private utilities for cheaper, safer, and more token-efficient AI workflows. ## FreeLLMAPI private fallback gateway FreeLLMAPI is installed as a private OpenAI-compatible gateway for low-stakes fallback or optional use. ### Runtime - App path: `/opt/freellmapi/app` - Persistent database: `/var/lib/freellmapi/data/freeapi.db` - Service: `freellmapi.service` - Base URL: `http://127.0.0.1:3001/v1` - Client env file: `/etc/freellmapi/client.env` - Status helper: `freellmapi-status` - Provider-key helper: `freellmapi-add-key` The service is intentionally loopback-only. Do not expose it through Caddy or publish it on a Docker public interface; its dashboard/admin APIs are local-trust-only. ### Operations ```bash systemctl status freellmapi.service --no-pager freellmapi-status journalctl -u freellmapi.service -n 100 --no-pager systemctl restart freellmapi.service ``` Expected bind check: ```text 127.0.0.1:3001 ``` There should be no `0.0.0.0:3001` or `:::3001` listener. ### Add provider keys FreeLLMAPI is installed and healthy, but it needs provider keys before chat fallback is usable. Add keys without echoing secrets: ```bash printf '%s' "$OPENROUTER_API_KEY" | freellmapi-add-key openrouter "main openrouter" printf '%s' "$GEMINI_API_KEY" | freellmapi-add-key google "main gemini" printf '%s' "$GROQ_API_KEY" | freellmapi-add-key groq "main groq" ``` For Cloudflare Workers AI, the expected FreeLLMAPI key format is: ```text account_id:api_token ``` Check key counts without revealing secrets: ```bash freellmapi-status ``` ### Use from OpenAI-compatible clients Load local client config: ```bash set -a source /etc/freellmapi/client.env set +a ``` Then configure clients with: - Base URL: `$FREELLMAPI_BASE_URL` - API key: `$FREELLMAPI_API_KEY` - Model: `auto` Use this only for non-sensitive, low-stakes, or fallback workloads. Prompts still go to third-party providers selected by the gateway. ### Safety boundaries - Do not send secrets, credentials, customer data, or sensitive incident details. - Do not make this public. - Treat it as opportunistic capacity, not an SLA-backed production dependency. - Prefer primary paid/high-quality providers for serious Hermes coding/devops work. ## MarkItDown document-to-Markdown workflow Microsoft MarkItDown is installed for token-efficient local document extraction before LLM analysis. ### Runtime - Venv: `/opt/markitdown/venv` - Direct CLI: `bytelyst-markitdown` - Safe wrapper: `bytelyst-doc2md` - Status helper: `markitdown-status` - Hermes skill: `markitdown-document-workflow` Installed support covers common local document formats: PDF, DOCX, PPTX, XLSX/XLS, HTML, CSV/JSON/XML/text-like inputs supported by MarkItDown. ### Standard conversion ```bash bytelyst-doc2md /path/to/document.pdf -o /tmp/document.md python3 -m json.tool /tmp/document.md.stats.json ``` The stats file records source size, SHA256, output size, token estimate, truncation status, and conversion time. For large files: ```bash bytelyst-doc2md /path/to/document.pdf -o /tmp/document.head.md --max-chars 50000 ``` ### Security behavior `bytelyst-doc2md` is the preferred wrapper because it: - refuses URLs by default, - disables plugins by default, - strips long base64 data URIs by default, - records source SHA256 for local files, - writes token estimates before content is loaded into an LLM context. Use `--allow-url` only for trusted URLs after considering SSRF/local-file exposure. Prefer downloading the file with a vetted tool first, then converting the local path. ### Verification ```bash markitdown-status ``` Expected output includes: ```text bytelyst-doc2md: ok source_sha256_present: True markdown_heading_present: True url_refusal: ok ``` ## Hermes usage pattern When a user sends a document or asks to analyze a Drive/PDF/Office file: 1. Load the `markitdown-document-workflow` skill. 2. Download or locate the file locally. 3. Convert with `bytelyst-doc2md`. 4. Read the stats JSON first. 5. Search/page the generated Markdown instead of loading the whole file. 6. Fall back to OCR tooling only when MarkItDown output is incomplete or image-only. When a user asks to save LLM/provider cost: 1. Use primary Hermes provider for sensitive or high-stakes work. 2. Use FreeLLMAPI only for low-stakes fallback after provider keys have been added. 3. Confirm `freellmapi-status` shows at least one provider key before routing work to it.