Writing Integration Tests¶
Integration tests verify that Sympozium tools work end-to-end in a real cluster. Unlike unit tests, they exercise the full pipeline: controller → Job → agent-runner → LLM → tool invocation.
Prerequisites¶
- A Kind (or other) cluster with Sympozium installed (
make install && kubectl apply -k config/) - An OpenAI API key (or other supported provider)
kubectlconfigured to talk to the cluster
Running Existing Tests¶
# Run all integration tests
make test-integration
# Run API smoke regressions (fast, no LLM execution)
make integration-tests
# Optional capability validation (same target, token-gated checks)
# CLAUDE_TOKEN validates Anthropic provider wiring
# GITHUB_TOKEN validates github-gitops token endpoint + secret persistence
CLAUDE_TOKEN=... GITHUB_TOKEN=... make integration-tests
# Run with a specific model
TEST_MODEL=gpt-5.2 ./test/integration/test-write-file.sh
# Override timeout (seconds)
TEST_TIMEOUT=180 ./test/integration/test-write-file.sh
# Use a pre-existing secret instead of OPENAI_API_KEY env var
kubectl create secret generic inttest-openai-key --from-literal=OPENAI_API_KEY=sk-...
./test/integration/test-write-file.sh
How Integration Tests Work¶
Each test follows the same pattern:
- Create resources — a test
SympoziumInstanceandAgentRunwith a deterministic task - Wait for completion — poll
status.phaseuntilSucceededorFailed - Validate results — check pod logs,
status.result, or the pod filesystem - Clean up — delete all test resources
The LLM is the real LLM, not a mock. The task prompt is carefully worded to produce a deterministic, verifiable outcome (e.g., "write exactly this string to this file").
Writing a New Test¶
1. Create the script¶
Place your test in test/integration/:
2. Use the standard template¶
#!/usr/bin/env bash
set -euo pipefail
# --- Configuration ---
NAMESPACE="${TEST_NAMESPACE:-default}"
INSTANCE_NAME="inttest-my-tool"
RUN_NAME="inttest-my-tool-run"
SECRET_NAME="inttest-openai-key"
MODEL="${TEST_MODEL:-gpt-4o-mini}"
TIMEOUT="${TEST_TIMEOUT:-120}"
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
pass() { echo -e "${GREEN}✓ $*${NC}"; }
fail() { echo -e "${RED}✗ $*${NC}"; }
info() { echo -e "${YELLOW}● $*${NC}"; }
failures=0
cleanup() {
info "Cleaning up..."
kubectl delete agentrun "$RUN_NAME" -n "$NAMESPACE" --ignore-not-found >/dev/null 2>&1 || true
kubectl delete sympoziuminstance "$INSTANCE_NAME" -n "$NAMESPACE" --ignore-not-found >/dev/null 2>&1 || true
kubectl delete jobs -n "$NAMESPACE" -l "sympozium.ai/agentrun=$RUN_NAME" --ignore-not-found >/dev/null 2>&1 || true
kubectl delete pods -n "$NAMESPACE" -l "sympozium.ai/agentrun=$RUN_NAME" --ignore-not-found >/dev/null 2>&1 || true
}
# Ensure secret exists
if ! kubectl get secret "$SECRET_NAME" -n "$NAMESPACE" >/dev/null 2>&1; then
if [[ -z "${OPENAI_API_KEY:-}" ]]; then
fail "Set OPENAI_API_KEY or create secret '$SECRET_NAME'"
exit 1
fi
kubectl create secret generic "$SECRET_NAME" \
--from-literal=OPENAI_API_KEY="$OPENAI_API_KEY" -n "$NAMESPACE"
fi
cleanup 2>/dev/null || true
sleep 2
# --- Create SympoziumInstance ---
cat <<EOF | kubectl apply -f -
apiVersion: sympozium.ai/v1alpha1
kind: SympoziumInstance
metadata:
name: ${INSTANCE_NAME}
namespace: ${NAMESPACE}
spec:
agents:
default:
model: ${MODEL}
authRefs:
- secret: ${SECRET_NAME}
EOF
# --- Create AgentRun ---
cat <<EOF | kubectl apply -f -
apiVersion: sympozium.ai/v1alpha1
kind: AgentRun
metadata:
name: ${RUN_NAME}
namespace: ${NAMESPACE}
labels:
sympozium.ai/instance: ${INSTANCE_NAME}
spec:
instanceRef: ${INSTANCE_NAME}
agentId: default
sessionKey: "inttest-$(date +%s)"
task: |
YOUR DETERMINISTIC TASK PROMPT HERE.
model:
provider: openai
model: ${MODEL}
authSecretRef: ${SECRET_NAME}
timeout: "3m"
EOF
# --- Wait for completion ---
elapsed=0
phase=""
pod=""
while [[ $elapsed -lt $TIMEOUT ]]; do
phase=$(kubectl get agentrun "$RUN_NAME" -n "$NAMESPACE" \
-o jsonpath='{.status.phase}' 2>/dev/null || echo "")
# Capture pod name early, before cleanup removes it
if [[ -z "$pod" ]]; then
pod=$(kubectl get pods -n "$NAMESPACE" \
-l "sympozium.ai/agentrun=$RUN_NAME" \
-o jsonpath='{.items[0].metadata.name}' 2>/dev/null || echo "")
fi
[[ "$phase" == "Succeeded" || "$phase" == "Failed" ]] && break
sleep 5
elapsed=$((elapsed + 5))
done
[[ "$phase" == "Succeeded" ]] && pass "AgentRun succeeded" || { fail "AgentRun phase: $phase"; cleanup; exit 1; }
# --- Validate ---
result=$(kubectl get agentrun "$RUN_NAME" -n "$NAMESPACE" \
-o jsonpath='{.status.result}' 2>/dev/null || echo "")
# YOUR VALIDATION LOGIC HERE
# Example:
# if echo "$result" | grep -qi "expected_string"; then
# pass "Result contains expected output"
# else
# fail "Result missing expected output"
# failures=$((failures + 1))
# fi
# --- Cleanup & exit ---
cleanup
if [[ $failures -gt 0 ]]; then
fail "$failures check(s) failed"
exit 1
fi
pass "Test complete"
3. Add to the Makefile (optional)¶
If you want it to run as part of make test-integration, add your script to that
target in the Makefile (it currently lists scripts explicitly). Or add a specific target:
Writing Good Task Prompts¶
The task prompt is the most important part. Since you're asking a real LLM to do something and then checking the result, the prompt must be:
- Deterministic — ask for a specific, verifiable output (exact string, exact filename)
- Minimal — one tool call, one action — don't test multiple things at once
- Unambiguous — tell the model exactly what to do, not what to figure out
Good examples¶
Use the write_file tool to write the exact text "sympozium-integration-ok"
to /workspace/test-output.txt. Do not add any extra content.
Bad examples¶
This is non-deterministic — you can't verify the output. This tests the LLM's coding ability, not the tool.Validation Strategies¶
| What to check | How | Reliability |
|---|---|---|
status.phase |
kubectl get agentrun -o jsonpath='{.status.phase}' |
High — always works |
status.result |
Check for keywords in the LLM's response text | High — survives pod cleanup |
| Pod logs | kubectl logs <pod> -c agent |
Medium — pod may be gone |
| File on disk | kubectl exec <pod> -c agent -- cat /path |
Low — pod must still be running |
Tip: Capture the pod name during the polling loop (while the run is Running),
not after it completes. The Job/pod is cleaned up quickly after the agent finishes.
Environment Variables¶
| Variable | Default | Description |
|---|---|---|
TEST_MODEL |
gpt-4o-mini |
LLM model to use |
TEST_TIMEOUT |
120 |
Max seconds to wait for completion |
TEST_NAMESPACE |
default |
Kubernetes namespace |
OPENAI_API_KEY |
(none) | Used to create the secret if it doesn't exist |
Existing Tests¶
| Test | File | What it verifies |
|---|---|---|
| API smoke | test/integration/test-api-smoke.sh |
API-only CRUD/list/get coverage for PersonaPacks, ad-hoc Instances, Skills, Policies, and Schedules |
| API PersonaPack provider switch | test/integration/test-api-personapack-provider-switch.sh |
Verifies OpenAI→Anthropic PersonaPack updates propagate to stamped instances and subsequent AgentRuns (provider/auth/model/skills) |
| API PersonaPack + ad-hoc correctness | test/integration/test-api-personapack-adhoc-correctness.sh |
Verifies PersonaPack propagation (authRef/provider/model/skills), ad-hoc parity, and that disabling a PersonaPack removes stamped instances |
| API AgentRun container shape | test/integration/test-api-agentrun-container-shape.sh |
Validates container count/names for AgentRun pods from plain instances (agent+ipc-bridge) and skill-backed instances (agent+ipc-bridge+skill sidecar) |
| API PersonaPack provisioning | test/integration/test-api-personapack-provisioning.sh |
Enabling a PersonaPack via API stamps out PersonaPack-labeled Instances and Schedules |
| API schedule dispatch | test/integration/test-api-schedule-dispatch.sh |
Creating a schedule via API results in dispatched AgentRuns (status.totalRuns / lastRunName) |
| API observability | test/integration/test-api-observability.sh |
OTEL collector deployment health + /api/v1/observability/metrics correctness (collectorReachable, payload sanity) |
| API web-endpoint | test/integration/test-api-web-endpoint.sh |
Web-endpoint skill enable/disable via PATCH, status endpoint, parameter propagation, idempotency |
| API serving-mode | test/integration/test-api-serving-mode.sh |
Server-mode AgentRun creation (Deployment + Service), web-endpoint status with deploymentName/serviceName, container env validation |
| API capabilities (optional) | test/integration/test-api-capabilities.sh |
Token-gated checks for Anthropic provider wiring (CLAUDE_TOKEN) and GitHub token endpoint/secret persistence (GITHUB_TOKEN) |
| write_file | test/integration/test-write-file.sh |
Agent uses write_file tool to create a file with specific content |
| k8s-ops nodes | test/integration/test-k8s-ops-nodes.sh |
Agent uses execute_command with k8s-ops skill to run kubectl get nodes |
| telegram | test/integration/test-telegram-channel.sh |
Channel deployment pipeline + optional full E2E with real bot |
| slack | test/integration/test-slack-channel.sh |
Channel deployment pipeline + optional full E2E via Socket Mode |
| developer-team | test/integration/developer_team_test.sh |
End-to-end PersonaPack simulation — 7 developer agents collaborate on a real GitHub repo (issues, PRs, code review, QA). Requires GITHUB_TOKEN with repo scope |