LLMFit Skill (llmfit)¶
The llmfit SkillPack adds node-level model placement analysis to Sympozium.
It runs the llmfit project (github.com/AlexsJones/llmfit) inside a skill sidecar and lets agents answer questions like:
- "Which node is best for Qwen/Qwen2.5-Coder-14B-Instruct?"
- "Show top coding-model placements across all nodes"
What it installs¶
- SkillPack manifest: config/skills/llmfit.yaml
- Sidecar image: ghcr.io/sympozium-ai/sympozium/skill-llmfit:latest
- Sidecar build context: images/skill-llmfit/
  - Dockerfile
  - tool-executor.sh
  - llmfit-probe-json.sh
  - llmfit-cluster-fit.sh
Helm bundled copy:
- charts/sympozium/files/skills/llmfit.yaml
Runtime design¶
Host access (default for llmfit)¶
The built-in llmfit sidecar now enables explicit host access by default so node-level hardware probes can read host information directly:
- hostPID: true (pod-level)
- Sidecar runs as root (runAsRoot: true)
- Read-only host mounts:
  - /proc → /host/proc
  - /sys → /host/sys
  - /dev → /host/dev
  - /run/udev → /host/run/udev
The sidecar also exports helper environment variables:
- LLMFIT_HOST_PROC=/host/proc
- LLMFIT_HOST_SYS=/host/sys
- LLMFIT_HOST_DEV=/host/dev
- LLMFIT_HOST_UDEV=/host/run/udev
This is configured in the SkillPack (spec.sidecar.hostAccess) and is not globally enabled for other skills.
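As an illustration of how a script inside the sidecar could use these variables, here is a minimal sketch that reads total memory from the host's meminfo. The helper names `host_proc_path` and `host_mem_total_kb` are hypothetical and not part of llmfit; how llmfit itself consumes the variables is not shown here.

```shell
# Hypothetical helper: resolve a path under the host's /proc mount.
# Falls back to the container's own /proc when LLMFIT_HOST_PROC is unset.
host_proc_path() {
  printf '%s/%s\n' "${LLMFIT_HOST_PROC:-/proc}" "$1"
}

# Hypothetical helper: print total memory in kB as reported by meminfo.
host_mem_total_kb() {
  awk '/^MemTotal:/ { print $2 }' "$(host_proc_path meminfo)"
}
```

With the default mounts, `host_proc_path meminfo` resolves to /host/proc/meminfo, so the value reflects the node rather than the container.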
Binary source¶
The sidecar installs llmfit from GitHub releases (v0.5.8+), using architecture-aware assets:
- x86_64-unknown-linux-musl for amd64
- aarch64-unknown-linux-musl for arm64
This avoids a host-level Homebrew dependency and keeps installation deterministic in containers.
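The architecture mapping can be sketched as a small shell function. This is an illustrative sketch, not the sidecar's actual install script; the function name `llmfit_target` is hypothetical, and the release download URL is deliberately left out.

```shell
# Hypothetical helper: map an architecture name to the release target triple.
# Accepts both container-style names (amd64, arm64) and `uname -m` output.
llmfit_target() {
  case "$1" in
    amd64|x86_64)  echo "x86_64-unknown-linux-musl" ;;
    arm64|aarch64) echo "aarch64-unknown-linux-musl" ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}
```

For example, `llmfit_target "$(uname -m)"` selects the target for the machine running the build.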
Cluster workflow¶
The primary command is:
llmfit-cluster-fit.sh --model "Qwen/Qwen2.5-Coder-14B-Instruct" --use-case coding --min-fit good --limit 10
It:
1. Discovers nodes with kubectl get nodes
2. Spawns one short-lived probe pod per node (nodeName pinned)
3. Runs llmfit on each node (system + recommend --json)
4. Aggregates and ranks results in a single JSON payload
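To make step 2 concrete, the sketch below prints a probe Pod manifest pinned to a single node through spec.nodeName. It is an assumption-laden illustration, not the manifest that llmfit-cluster-fit.sh generates: the function name, the pod name prefix, and the container command are hypothetical, and a real probe would also need the host access settings described earlier.

```shell
# Hypothetical helper: print a minimal probe Pod manifest for one node.
# $1 = node name, $2 = image (defaults to the sidecar image named above).
# The pod name prefix and the container command are illustrative assumptions.
probe_pod_manifest() {
  node="$1"
  image="${2:-ghcr.io/sympozium-ai/sympozium/skill-llmfit:latest}"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: llmfit-probe-${node}
spec:
  nodeName: ${node}
  restartPolicy: Never
  containers:
    - name: probe
      image: ${image}
      command: ["llmfit-probe-json.sh"]
EOF
}
```

Setting nodeName bypasses the scheduler, so the pod lands on exactly the node being probed. A manifest produced this way could be piped to `kubectl apply -f -` once per node returned by `kubectl get nodes`.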
REST API compatibility¶
If node-local daemons already run (llmfit serve), agent workflows can query:
- /health
- /api/v1/system
- /api/v1/models/top
- /api/v1/models/{name}
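The endpoint paths above can be expanded into full URLs for a given daemon. The sketch below is illustrative only: the function name `llmfit_urls` is hypothetical, and the base URL (scheme, host, port) is an assumption that depends on how llmfit serve is exposed in your cluster. Model names containing "/" may need URL-encoding, which this sketch does not attempt.

```shell
# Hypothetical helper: print the llmfit REST URLs for one node-local daemon.
# $1 = base URL (scheme, host, and port are deployment-specific assumptions)
# $2 = optional model name for the per-model endpoint
llmfit_urls() {
  base="${1%/}"   # drop a single trailing slash, if any
  echo "$base/health"
  echo "$base/api/v1/system"
  echo "$base/api/v1/models/top"
  if [ -n "${2:-}" ]; then
    echo "$base/api/v1/models/$2"
  fi
}
```

Each printed URL can then be fetched with a tool such as `curl -fsS`, starting with /health to confirm the daemon is reachable.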
RBAC¶
The skill provisions minimal scoped permissions:
- Namespace: pods, pods/log (get/list/watch/create/delete) for probe lifecycle
- Cluster: nodes (get/list/watch) for node discovery
RBAC controls Kubernetes API access only. Host-level access is configured separately via spec.sidecar.hostAccess.
Usage examples¶
Preflight (recommended before queries):
Top models on default settings:
Top 5 candidate nodes for a coding model:
llmfit-cluster-fit.sh --model "Qwen2.5" --use-case coding --min-fit good --limit 10 | jq '.ranked_nodes[:5]'
Inspect full per-node evidence:
If ranked_nodes is empty at min-fit=good, retry with:
llmfit-cluster-fit.sh --model "*" --min-fit marginal --limit 10
llmfit-cluster-fit.sh --model "*" --min-fit too_tight --limit 10
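The retry sequence above can be wrapped in a small loop that relaxes the threshold step by step. This is a sketch under stated assumptions: the function name `fit_with_fallback` and the `LLMFIT_CMD` override are hypothetical, and the emptiness test is a crude string match on the JSON output (a jq check on `.ranked_nodes | length` would be more robust). Unlike the manual retries above, it keeps whatever --model value the caller passes.

```shell
# Hypothetical helper: try progressively looser --min-fit thresholds until
# a run reports at least one ranked node. All arguments are forwarded to the
# command named by LLMFIT_CMD (defaults to llmfit-cluster-fit.sh).
fit_with_fallback() {
  for fit in good marginal too_tight; do
    out=$("${LLMFIT_CMD:-llmfit-cluster-fit.sh}" "$@" --min-fit "$fit") || return 1
    # Crude emptiness check: strip whitespace, then look for "ranked_nodes":[]
    case "$(printf '%s' "$out" | tr -d '[:space:]')" in
      *'"ranked_nodes":[]'*) continue ;;
    esac
    printf '%s\n' "$out"
    return 0
  done
  echo "no nodes matched, even at min-fit=too_tight" >&2
  return 1
}
```

A call such as `fit_with_fallback --model "Qwen2.5" --use-case coding --limit 10` prints the first non-empty result and returns non-zero if every threshold comes back empty.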
If preflight fails with llmfit: not found, the sidecar image running in the cluster is stale or mismatched; the query itself is not the problem.
Persona integration¶
platform-team now enables llmfit for the sre-watchdog persona so SRE flows can recommend model placement in chat without manual setup.