bytelyst-devops-tools/docs/llm-utility-workflows.md
Hermes VM 0e1905aa33
Some checks failed
pre-commit / pre-commit (push) Failing after 33s
docs: document local LLM utility workflows
2026-05-28 00:21:06 +00:00

4.4 KiB

LLM Utility Workflows: FreeLLMAPI + MarkItDown

This VM has two private utilities for cheaper, safer, and more token-efficient AI workflows.

FreeLLMAPI private fallback gateway

FreeLLMAPI is installed as a private OpenAI-compatible gateway for low-stakes fallback or optional use.

Runtime

  • App path: /opt/freellmapi/app
  • Persistent database: /var/lib/freellmapi/data/freeapi.db
  • Service: freellmapi.service
  • Base URL: http://127.0.0.1:3001/v1
  • Client env file: /etc/freellmapi/client.env
  • Status helper: freellmapi-status
  • Provider-key helper: freellmapi-add-key

The service is intentionally loopback-only. Do not expose it through Caddy or publish it on a Docker public interface; its dashboard/admin APIs are local-trust-only.

Operations

systemctl status freellmapi.service --no-pager
freellmapi-status
journalctl -u freellmapi.service -n 100 --no-pager
systemctl restart freellmapi.service

Expected bind check:

127.0.0.1:3001

There should be no 0.0.0.0:3001 or :::3001 listener.

Add provider keys

FreeLLMAPI is installed and healthy, but it needs provider keys before chat fallback is usable. Add keys without echoing secrets:

printf '%s' "$OPENROUTER_API_KEY" | freellmapi-add-key openrouter "main openrouter"
printf '%s' "$GEMINI_API_KEY" | freellmapi-add-key google "main gemini"
printf '%s' "$GROQ_API_KEY" | freellmapi-add-key groq "main groq"

For Cloudflare Workers AI, the expected FreeLLMAPI key format is:

account_id:api_token

Check key counts without revealing secrets:

freellmapi-status

Use from OpenAI-compatible clients

Load local client config:

set -a
source /etc/freellmapi/client.env
set +a

Then configure clients with:

  • Base URL: $FREELLMAPI_BASE_URL
  • API key: $FREELLMAPI_API_KEY
  • Model: auto

Use this only for non-sensitive, low-stakes, or fallback workloads. Prompts still go to third-party providers selected by the gateway.

Safety boundaries

  • Do not send secrets, credentials, customer data, or sensitive incident details.
  • Do not make this public.
  • Treat it as opportunistic capacity, not an SLA-backed production dependency.
  • Prefer primary paid/high-quality providers for serious Hermes coding/devops work.

MarkItDown document-to-Markdown workflow

Microsoft MarkItDown is installed for token-efficient local document extraction before LLM analysis.

Runtime

  • Venv: /opt/markitdown/venv
  • Direct CLI: bytelyst-markitdown
  • Safe wrapper: bytelyst-doc2md
  • Status helper: markitdown-status
  • Hermes skill: markitdown-document-workflow

Installed support covers common local document formats: PDF, DOCX, PPTX, XLSX/XLS, HTML, CSV/JSON/XML/text-like inputs supported by MarkItDown.

Standard conversion

bytelyst-doc2md /path/to/document.pdf -o /tmp/document.md
python3 -m json.tool /tmp/document.md.stats.json

The stats file records source size, SHA256, output size, token estimate, truncation status, and conversion time.

For large files:

bytelyst-doc2md /path/to/document.pdf -o /tmp/document.head.md --max-chars 50000

Security behavior

bytelyst-doc2md is the preferred wrapper because it:

  • refuses URLs by default,
  • disables plugins by default,
  • strips long base64 data URIs by default,
  • records source SHA256 for local files,
  • writes token estimates before content is loaded into an LLM context.

Use --allow-url only for trusted URLs after considering SSRF/local-file exposure. Prefer downloading the file with a vetted tool first, then converting the local path.

Verification

markitdown-status

Expected output includes:

bytelyst-doc2md: ok
source_sha256_present: True
markdown_heading_present: True
url_refusal: ok

Hermes usage pattern

When a user sends a document or asks to analyze a Drive/PDF/Office file:

  1. Load the markitdown-document-workflow skill.
  2. Download or locate the file locally.
  3. Convert with bytelyst-doc2md.
  4. Read the stats JSON first.
  5. Search/page the generated Markdown instead of loading the whole file.
  6. Fall back to OCR tooling only when MarkItDown output is incomplete or image-only.

When a user asks to save LLM/provider cost:

  1. Use primary Hermes provider for sensitive or high-stakes work.
  2. Use FreeLLMAPI only for low-stakes fallback after provider keys have been added.
  3. Confirm freellmapi-status shows at least one provider key before routing work to it.