Features
mini-a packs a comprehensive set of features into a minimalist framework. This page covers current capabilities across models, tool orchestration, delegation, security, and output/runtime options.
Multi-Model Support
mini-a works with 10+ LLM providers out of the box. Switch between providers by changing a single environment variable — no code changes required.
| Provider | Prefix | Example Model |
|---|---|---|
| OpenAI | openai: |
gpt-5.2, gpt-5-mini |
| Google Gemini | google: |
gemini-2.0-flash, gemini-1.5-pro |
| Anthropic Claude | anthropic: |
claude-sonnet-4-20250514 |
| Ollama (local) | ollama: |
llama3, mistral, codellama |
| AWS Bedrock | bedrock: |
anthropic.claude-v2 |
| GitHub Models | github: |
openai/gpt-5 |
| Deepseek | deepseek: |
deepseek-chat |
| Groq | groq: |
llama3-70b-8192 |
| Cerebras | cerebras: |
llama3.1-70b |
| Mistral | mistral: |
mistral-large-latest |
| OpenRouter | openrouter: |
meta-llama/llama-3-70b |
Switching is as simple as setting the environment variable:
export OAF_MODEL="(type: openai, model: gpt-5.2, key: '...')" # OpenAI
export OAF_MODEL="(type: gemini, model: gemini-2.0-flash, key: '...')" # Google
export OAF_MODEL="(type: ollama, model: 'llama3', url: 'http://localhost:11434')" # Local
Set credentials directly in OAF_MODEL/OAF_LC_MODEL using key: '...' so configuration stays in one place. Ollama runs locally and requires no key.
Dual-Model Cost Optimization
One of mini-a’s most powerful features is its dual-model architecture. You can assign a cheaper, faster model to handle simple tasks (routing, summarization, classification) while reserving a more capable model for complex reasoning.
export OAF_MODEL="(type: openai, model: gpt-5.2, key: '...')" # Main model — complex reasoning
export OAF_LC_MODEL="(type: openai, model: gpt-5-mini, key: '...')" # Light model — simple tasks
The framework automatically decides which model to use for each subtask, optimizing cost without sacrificing quality where it matters.
You can also add a dedicated validation model for deep-research scoring when you want execution and validation separated:
export OAF_VAL_MODEL="(type: openai, model: gpt-5-mini, key: '...')"
# Or override only this run
mini-a deepresearch=true modelval="(type: openai, model: gpt-5-mini, key: '...')" \
goal='Research the latest release notes and summarize breaking changes'
Recent updates added dynamic escalation controls and per-run cost tracking so you can tune and measure this behavior:
lccontextlimitescalates to the main model when low-cost context gets too large.deescalatecontrols how many successful steps are needed before returning to the low-cost model.getCostStats()returns a per-session usage breakdown for low-cost and main model tiers.OAF_MINI_A_NOJSONPROMPT/OAF_MINI_A_LCNOJSONPROMPTlet you force text-prompt mode per model tier (Gemini main models auto-enable when unset).
Estimated Savings
| Task Type | Model Used | Estimated Savings |
|---|---|---|
| Simple routing & classification | Light model (OAF_LC_MODEL) |
~70% cheaper |
| Summarization | Light model (OAF_LC_MODEL) |
~60% cheaper |
| Planning & step decomposition | Light model (OAF_LC_MODEL) |
~50% cheaper |
| Complex reasoning & analysis | Main model (OAF_MODEL) |
0% (full model needed) |
When both models are configured, mini-a can report separate token usage and cost estimates for each tier, so you can track exactly how much you are saving.
Automatic Performance Optimizations
mini-a includes several built-in optimizations that reduce token consumption and keep conversations within context limits without manual intervention.
- Conversation compaction — When the conversation grows too long, mini-a automatically compresses earlier turns while preserving essential context. Trigger manually with
/compact. - Context summarization — Long tool outputs and intermediate results are summarized to save tokens. Trigger manually with
/summarize. - Token usage optimization — The framework tracks token counts and adjusts behavior to stay within budget.
- Smart prompt caching — Repeated prompt patterns are cached where the provider supports it, reducing redundant API calls.
Key Parameters
| Parameter | Description | Default |
|---|---|---|
maxcontext |
Maximum context window size (tokens) | Model default |
maxtokens |
Maximum tokens per response | Model default |
| Auto-compact | Automatically compact when context exceeds threshold | Enabled |
# Example: constrain context and response size
mini-a maxcontext=32000 maxtokens=4096
MCP Integration
MCP (Model Context Protocol) is an open standard that defines how LLMs discover and invoke external tools. Instead of hard-coding tool integrations, mini-a uses MCP servers that expose capabilities through a uniform interface.
mini-a ships with 25+ built-in MCP servers covering common tasks such as file operations, web browsing, databases, Kubernetes, finance, office documents, OpenAF helpers, and more.
STDIO vs HTTP Mode
| Mode | How It Works | Best For |
|---|---|---|
| STDIO | Launches the MCP server as a local child process, communicating over stdin/stdout | Local tools, development, single-user setups |
| HTTP | Connects to a remote MCP server over HTTP/SSE | Shared servers, cloud deployments, team setups |
# STDIO mode (default for built-in servers)
mini-a usetools=true
# HTTP mode (connect to a remote MCP server)
mini-a usetools=true mcpserver="http://mcp.example.com:8080"
For a complete list of available MCP servers and their capabilities, see the MCP Catalog.
MCP Proxy and Programmatic Tool Calling
When many tools are active, mcpproxy=true can collapse them into a single proxy-dispatch tool to reduce prompt bloat.
mini-a goal="compare release dates across APIs" \
usetools=true mcpproxy=true \
mcp="[(cmd: 'ojob mcps/mcp-time.yaml'), (cmd: 'ojob mcps/mcp-fin.yaml')]" \
useutils=true
Large proxy calls can spill arguments/results to files with argumentsFile and resultToFile=true, and mcpproxytoon=true can serialize spilled object payloads in TOON format for easier scanning.
mini-a can optionally start a localhost bridge that lets generated scripts list/search/call MCP tools in loops and batches.
mini-a useshell=true usetools=true mcpprogcall=true \
mcp="[(cmd: 'ojob mcps/mcp-time.yaml'), (cmd: 'ojob mcps/mcp-web.yaml')]"
Useful controls include mcpprogcallport, mcpprogcallmaxbytes, mcpprogcallresultttl, mcpprogcalltools, and mcpprogcallbatchmax.
Flexible Tool System
mini-a provides three categories of tools that the agent can use to accomplish goals:
1. Shell Commands
When enabled, the agent can execute shell commands directly on the host system. Disabled by default for security.
mini-a useshell=true
2. Built-in Utilities
File operations, text search, directory listing, and other common utilities that do not require spawning a shell.
When enabled, Mini Utils provides init, filesystemQuery, filesystemModify, and markdownFiles.
mini-a useutils=true
Use utilsallow and utilsdeny to control which utils are exposed to the agent:
# Expose only specific utils
mini-a useutils=true utilsallow="filesystemQuery,markdownFiles"
# Hide specific utils (applied after utilsallow)
mini-a useutils=true utilsdeny="filesystemModify"
When running in console mode, useutils=true also includes the userInput tool, which enables interactive console prompts (ask, choose, struct) for gathering structured input from the user.
For docs-aware workflows, enable:
mini-a useutils=true mini-a-docs=true
3. MCP Tools
Extensible tools provided by MCP servers — both built-in and custom.
mini-a usetools=true
Tool Configuration Summary
| Parameter | What It Enables | Default |
|---|---|---|
useshell |
Shell command execution | false |
useutils |
Built-in file and search utilities | true |
mini-a-docs |
Auto-set docs root for Mini Utils markdownFiles when utilsroot is unset |
false |
usetools |
MCP tool servers | true |
shellmaxbytes |
Truncate oversized shell output to keep context stable | 8000 |
shellallowpipes |
Allow shell pipes/redirection/control operators | false |
All three can be combined. When the agent receives a goal, it selects the appropriate tool type based on the task.
Multiple Interfaces
mini-a can be used through four distinct interfaces, each suited to different workflows.
Console (Interactive REPL)
The default mode. An interactive terminal session with tab completion, command history, and real-time streaming.
mini-a
Web UI
A browser-based interface with session management, conversation history, and streaming output.
mini-a onport=8080
Library (JavaScript API)
Use mini-a programmatically from your own OpenAF scripts.
loadLib("mini-a.js");
var agent = new MiniA({
model: "(type: openai, model: gpt-5.2, key: '...')",
usetools: true
});
var result = agent.ask("List all running Docker containers");
print(result);
Worker API
Run mini-a as a remote agent that accepts goals via an API endpoint.
Dynamic worker registration is also available. Parents can open a registration port with workerreg=<port>, and worker instances can self-register with workerregurl=<url> and heartbeat via workerreginterval=<ms>.
Streaming, Debugging, and Safety
Recent releases added several runtime improvements that are useful in production setups:
planner_streamSSE events distinguish planning tokens from normal answer tokens whenusestream=true.showthinking=truesurfaces XML-tagged<thinking>...</thinking>blocks as thought logs for providers that emit them.debugfile=<path>writes debug output as NDJSON instead of flooding the console with raw blocks.debugvalchroutes validation-model debugging to its own channel whenllmcomplexity=true.maxpromptcharscaps inbound web prompt size before any model call is made.- Untrusted input blocks and prompt normalization reduce prompt-injection risk from goals, chat text, and attachments.
mini-a workermode=true onport=9090
Other agents or applications can then delegate tasks to this worker instance.
Planning & Multi-Agent Delegation
For complex goals, mini-a can plan a series of steps before executing them, and optionally delegate subtasks to child agents or remote workers.
Think of delegation as an orchestration layer: one parent agent manages task decomposition, parallel execution, and result aggregation across multiple workers.
Planning Styles
| Style | Description |
|---|---|
simple |
Flat sequential steps — generates and executes one step at a time (default) |
legacy |
Phase-based hierarchical planning — creates a structured multi-phase plan upfront |
# Enable planning (simple style, default)
mini-a useplanning=true
# Use legacy hierarchical planning
mini-a useplanning=true planstyle=legacy
Delegation
When delegation is enabled, the main agent can spawn child agents to handle independent subtasks in parallel.
# Enable delegation to child agents
mini-a usedelegation=true
Delegation also works with remote workers — the main agent can send subtasks to mini-a instances running in worker mode on other machines.
Worker registration is also supported: set workerreg to a port to have mini-a start a worker registration HTTP server, optionally protected by workerregtoken. Use workerevictionttl to evict stale worker entries, and delegationmaxdepth to cap recursive delegation chains.
Worker Skills (A2A)
Workers can advertise specific capabilities as A2A skills. The parent reads each worker’s /.well-known/agent.json AgentCard and uses declared skills to route subtasks intelligently.
# Start a shell-capable worker (auto-emits the "shell" A2A skill)
mini-a workermode=true onport=9091 shellworker=true
# Start a worker with custom skills
mini-a workermode=true onport=9092 \
workerskills="shell,time" \
workerspecialties="finance,data-analysis"
The delegate-subtask tool call can then request specific skills:
{ "goal": "run the test suite", "skills": ["shell"] }
When dynamic worker registration is active (workerreg=<port>), the parent rebuilds the delegate-subtask tool description every 30 seconds (or immediately on profile change) to list available workers and their skills — so the LLM always routes to the right worker without guessing.
Orchestration Pattern
Use this pattern when you want mini-a to coordinate multiple agents for larger workloads:
- Start one or more worker instances (
mini-a workermode=true onport=9090). - Run a parent agent with planning and delegation enabled.
- Set concurrency limits so execution stays predictable.
- Let the parent combine worker outputs into a single final result.
mini-a useplanning=true planstyle=legacy usedelegation=true \
workers='http://worker1:9090,http://worker2:9090' maxconcurrent=4 \
goal='Analyze this monorepo, group findings by domain, and produce one prioritized action plan'
Working Memory
For long-running or multi-step goals, mini-a can maintain a structured working memory that persists key findings across tool calls and even across sessions.
mini-a usememory=true goal="Deep code analysis of auth module"
Memory is organized into 8 typed sections:
| Section | What is stored |
|---|---|
facts |
Confirmed facts discovered during the run |
evidence |
Raw observations and tool outputs worth keeping |
decisions |
Choices made and their rationale |
risks |
Identified risks or blockers |
openQuestions |
Unresolved questions to follow up on |
hypotheses |
Unconfirmed theories to test |
artifacts |
Generated files, configs, or summaries |
summaries |
Compressed summaries of completed work |
Entries are appended automatically at significant agent events (tool calls, plan critiques, validation results, final answers). Near-duplicate entries are suppressed by an 85% word-overlap fingerprint (memorydedup).
Persistence
By default, memory is held in-process. Pass an OpenAF channel definition to persist it across runs:
# Persist to a local JSON file
mini-a usememory=true \
memorych="(name: my_mem, type: file, options: (file: '/tmp/mini-a-mem.json'))" \
goal="Iterative research on cloud costs"
Scope
Use memoryscope to control which stores the agent reads/writes:
# Keep memory isolated to this session
mini-a usememory=true memoryscope=session goal="One-shot task"
# Share memory globally across all sessions
mini-a usememory=true memoryscope=global \
memorych="(type: file, options: (file: '/tmp/mini-a-global.json'))"
Tuning
# Larger limits for a heavy analysis run
mini-a usememory=true memorymaxpersection=200 memorymaxentries=1000 \
goal="Analyze all TypeScript files"
See Configuration → Working Memory for the full parameter reference.
Adaptive Tool Routing
mini-a includes an optional rule-based routing layer that selects how each action is dispatched — direct local tool, MCP direct call, MCP proxy, shell execution, utility wrapper, or delegated subtask.
mini-a adaptiverouting=true goal="Analyze logs and report"
When adaptive routing is on, each tool action goes through a lightweight router that scores candidate routes using intent hints:
- read vs. write intent
- payload size
- latency sensitivity
- determinism preference
- risk level
- structured output preference
- historical route success/failure
The router returns a selected route, rationale, and fallback chain. Failures are retried down the fallback chain (duplicate routes are skipped to prevent thrashing). Route decisions are appended to debug/audit output as [ROUTE ...] records when debug=true.
Controlling which routes are used
# Prefer MCP direct calls, fall back to proxy
mini-a adaptiverouting=true \
routerorder="mcp_direct_call,mcp_proxy_path,utility_wrapper"
# Restrict to non-shell routes only
mini-a adaptiverouting=true routerdeny="shell_execution"
# Only allow shell and utility routes
mini-a adaptiverouting=true routerallow="shell_execution,utility_wrapper"
When adaptiverouting=false (default), mini-a preserves legacy tool dispatch behavior.
Agent Files
mini-a lets you package an entire agent configuration — model, capabilities, tools, rules, knowledge, and persona — into a single markdown file with YAML frontmatter. Reuse and share agent profiles across your team or project without repeating command-line flags.
mini-a agent=examples/changelog-gen.agent.md goal="generate changelog"
See the Agent Files page for the complete reference: frontmatter keys, tool entry types, mini-a: overrides, relative file paths, precedence rules, and a full annotated example.
Chatbot Mode
Not every use case needs tools. Chatbot mode turns mini-a into a pure conversational assistant — no shell access, no file operations, no MCP tools. Just the LLM.
mini-a chatbotmode=true
This is useful for:
- Q&A — Answer questions using the model’s training data
- Education — Explain concepts, tutor on topics
- Brainstorming — Generate ideas, explore possibilities
- Drafting — Write text, emails, documentation
All other features (streaming, conversation management, dual-model) still work in chatbot mode.
Custom Slash Commands, Skills, and Hooks
mini-a supports template-based custom slash commands, skill templates, and local console hooks.
Custom Slash Commands
Create markdown templates under ~/.openaf-mini-a/commands/ and invoke them with /<name> ...args....
~/.openaf-mini-a/commands/my-command.md
Use placeholders inside the template: ,, ,, ``, …
Placeholder reference (works for command and skill templates):
- `` -> raw argument string after the command name (trimmed)
- `` -> parsed arguments as a JSON array
- `` -> parsed argument count
,, … -> positional argument values (1-based)
Example template ~/.openaf-mini-a/commands/my-command.md:
Follow these instructions exactly.
Primary target:
All args (raw):
Parsed args:
Argument count:
Run:
mini-a ➤ /my-command repo-a --fast "include docs"
Rendered prompt:
Follow these instructions exactly.
Primary target: repo-a
All args (raw): repo-a --fast "include docs"
Parsed args: ["repo-a","--fast","include docs"]
Argument count: 3
Load additional command directories with:
mini-a extracommands=/path/to/team-commands,/path/to/project-commands
Skills
mini-a discovers skills from ~/.openaf-mini-a/skills/ in two formats:
- Folder skill:
~/.openaf-mini-a/skills/<name>/SKILL.md - Single-file skill:
~/.openaf-mini-a/skills/<name>.md
Run skills with either /<name> ...args... or $<name> ...args.... Use /skills (or /skills <prefix>) to list discovered skills.
Load additional skill directories with:
mini-a extraskills=/path/to/shared-skills,/path/to/project-skills
Hooks
mini-a can run local hooks from ~/.openaf-mini-a/hooks/*.yaml|*.yml|*.json on events like:
before_goal,after_goalbefore_tool,after_toolbefore_shell,after_shell
Load additional hook directories with:
mini-a extrahooks=/path/to/team-hooks,/path/to/project-hooks
Non-interactive Template Execution
You can execute one command/skill template and exit:
mini-a exec="/my-command repo-a --fast"
References:
Docker Support
Run mini-a in a Docker container for full isolation, reproducible environments, and easy deployment.
Docker Compose Example
version: "3.8"
services:
mini-a:
image: openaf/mini-a
environment:
- OAF_MODEL="(type: openai, model: gpt-5.2, key: '...')"
- OAF_LC_MODEL="(type: openai, model: gpt-5-mini, key: '...')"
ports:
- "8080:8080"
volumes:
- ./workspace:/workspace
command: onport=8080
Docker containers provide a natural sandbox for shell execution — you can enable useshell=true inside the container without exposing your host system.
Security Features
mini-a is designed to be secure by default. Potentially dangerous features require explicit opt-in.
| Feature | Description | Default |
|---|---|---|
| Shell execution | Run arbitrary shell commands | Disabled |
| Read-only mode | Prevent file modifications | Available |
| Command allowlist | Only permit specific shell commands | shellallow="cmd1,cmd2" |
| Command ban list | Block specific shell commands | shellban="rm,shutdown" |
| Encrypted key storage | API keys stored encrypted via model manager | Supported |
| Docker isolation | Run in a container sandbox | Available |
| OS sandbox | Built-in OS-level sandbox for shell commands | usesandbox=off |
# Enable shell with allowlist only
mini-a useshell=true shellallow="ls,cat,grep,find"
# Enable shell but ban destructive commands
mini-a useshell=true shellban="rm,rmdir,dd,mkfs"
# Enable sandbox (auto-detects OS)
mini-a useshell=true usesandbox=auto
# Sandbox without network access
mini-a useshell=true usesandbox=auto sandboxnonetwork=true
These controls can be combined. For example, running inside Docker with a shell allowlist provides defense in depth. The built-in OS sandbox (usesandbox) adds an extra layer by wrapping shell commands in platform-specific OS-level restrictions — no separate container setup required.
Streaming Responses
mini-a supports real-time token streaming in both the console and web interfaces. Responses appear word by word as the model generates them, rather than waiting for the full response.
mini-a usestream=true
Streaming is enabled by default in most configurations. It provides a more responsive experience, especially for long-form outputs.
Conversation Management
mini-a provides several commands for managing conversation context during a session.
| Command | Description |
|---|---|
/compact [n] |
Compress older history while preserving up to the latest n exchanges (default 6) |
/summarize [n] |
Replace older history with a narrative summary and keep up to the latest n exchanges (default 6) |
/last [md] |
Reprint the most recent final answer (md for raw markdown) |
/save <path> |
Save the most recent final answer to a file |
These commands are especially useful in long sessions where context accumulates and token costs increase. Compacting a conversation can reduce context size by 40-60% while preserving the essential information the agent needs. When enough history exists, mini-a keeps at least one older entry eligible for summarization instead of preserving the entire tail.
Conversation History Persistence
Enable historykeep=true to automatically save console conversations to ~/.openaf-mini-a/history so they can be resumed in future sessions.
# Save all console sessions for later resumption
mini-a historykeep=true
Use historykeepperiod and historykeepcount to control how many saved sessions are retained:
# Delete history files older than 60 minutes, keep at most 10
mini-a historykeep=true historykeepperiod=60 historykeepcount=10
To resume a saved session, pass the history file path as the conversation input at startup. History files are stored as JSON in ~/.openaf-mini-a/history/.
Metrics & Usage Tracking
Track exactly how many tokens you are using and what they cost with built-in metrics and usage tracking.
> /stats
The /stats command displays:
- Token counts — Input and output tokens for the current session
- Cost estimates — Estimated cost based on provider pricing
- Model usage — Breakdown by main model vs. light model
- Request counts — Number of API calls made
This data helps you understand usage patterns, optimize model selection, and budget API costs.
Visual Outputs
mini-a can generate rich visual outputs directly in the terminal or web UI by enabling the appropriate flags.
| Parameter | Output Type | Example Use |
|---|---|---|
useascii=true |
ASCII art | Banners, logos, decorative text |
usesvg=true |
SVG visuals | Infographics, custom diagrams, UI mock visuals |
usediagrams=true |
Diagrams | Flowcharts, architecture diagrams, sequence diagrams |
usecharts=true |
Charts | Bar charts, histograms, data visualizations |
usevectors=true |
Vector bundle | Prefer Mermaid for structural diagrams and SVG for infographics/custom visuals |
usemaps=true |
Maps | Geographic data, network topology |
usemath=true |
Math rendering | Inline or block LaTeX formulas rendered via KaTeX in web UI |
# Enable all visual outputs
mini-a useascii=true usevectors=true usecharts=true usemaps=true usemath=true
These features instruct the LLM to include visual representations in its responses when appropriate, making outputs more informative and easier to understand at a glance. usevectors=true is the convenient bundle for vector-first output because it enables both SVG and diagram guidance together.
Ready to Try It?
Get mini-a running in under a minute and explore these features yourself.