Tool Output Design for Context Efficiency
Tool outputs consume context window space quickly. A single poorly designed API response or file read can bury the signal in noise. Design tool outputs to return minimal, focused data that LLMs can act on immediately.
TL;DR
Filter data before returning it to the LLM, not after. Return structured outputs when possible. Paginate or truncate large results with clear continuation patterns. Design tool responses to answer specific questions, not dump entire datasets into context.
Filter Before Returning
Process data inside the tool, return only what the LLM needs:
```python
# Bad: returns 10,000 lines
def get_logs():
    with open("app.log") as f:
        return f.read()

# Good: filters first
def get_error_logs(last_n=50):
    with open("app.log") as f:
        errors = [line for line in f if "ERROR" in line]
    return errors[-last_n:]
```

The LLM gets 50 relevant lines instead of 10,000 mixed ones. Context stays focused on actionable information.
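The same idea generalizes to any tool that returns raw text. Here is a minimal sketch of a generic output guard; `truncate_output` and the bracketed omission note are illustrative names, not a standard API:

```python
def truncate_output(text, max_lines=100):
    # Keep the most recent lines and tell the model what was dropped,
    # so it can ask for more instead of silently missing data.
    lines = text.splitlines()
    if len(lines) <= max_lines:
        return text
    omitted = len(lines) - max_lines
    note = f"[... {omitted} earlier lines omitted ...]"
    return "\n".join([note] + lines[-max_lines:])
```

The omission note matters: it turns silent truncation into a signal the LLM can act on.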
Structured vs Unstructured
Return structured data when the LLM needs to process multiple items:
```python
# Unstructured: hard to parse
return "User: alice, Email: alice@example.com\nUser: bob, Email: bob@example.com"

# Structured: easy to process
return [
    {"user": "alice", "email": "alice@example.com"},
    {"user": "bob", "email": "bob@example.com"},
]
```

LLMs handle structured outputs efficiently. They can filter, transform, and reason about data without parsing text.
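Structure does not have to be verbose. As a sketch (assuming the tool returns a list of dicts), compact JSON serialization keeps the structure while dropping the whitespace the model does not need:

```python
import json

def to_compact_json(rows):
    # separators=(",", ":") removes the spaces json.dumps inserts by default.
    return json.dumps(rows, separators=(",", ":"))

users = [
    {"user": "alice", "email": "alice@example.com"},
    {"user": "bob", "email": "bob@example.com"},
]
```

Over hundreds of rows, the saved whitespace adds up to a measurable reduction in tokens.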
Pagination and Truncation
For large datasets, return partial results with clear patterns:
```python
def search_codebase(query, limit=10):
    results = grep(query)
    return {
        "matches": results[:limit],
        "total": len(results),
        "truncated": len(results) > limit,
    }
```

The LLM sees whether more data exists and can request it if needed. Include metadata that enables informed decisions about fetching more.
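Truncation tells the model that more exists; a continuation cursor tells it how to get the rest. A sketch of the pattern, with a hypothetical `next_offset` field the model passes back to continue:

```python
def paginate(results, offset=0, limit=10):
    # Return one page plus the offset for the next call, or None when done.
    end = offset + limit
    return {
        "matches": results[offset:end],
        "total": len(results),
        "next_offset": end if end < len(results) else None,
    }
```

A `None` cursor is an explicit stop signal, so the model never has to guess whether another call is worthwhile.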
Design for Questions
Tools should answer specific questions, not dump data:
```python
from datetime import datetime, timedelta

# Bad: generic, unfocused
def get_database_contents():
    return db.query("SELECT * FROM users")

# Good: targeted queries
def get_users_created_today():
    today = datetime.now().date()
    return db.query(
        "SELECT id, email, created_at FROM users "
        "WHERE DATE(created_at) = ?",
        today,
    )

def get_failed_transactions(hours=24):
    cutoff = datetime.now() - timedelta(hours=hours)
    return db.query(
        "SELECT id, user_id, error, timestamp "
        "FROM transactions "
        "WHERE status = 'failed' AND timestamp > ?",
        cutoff,
    )
```

Each tool call should reduce uncertainty, not increase the amount of data the LLM must process. This pattern mirrors giving agents the map first: provide structure, let the agent request specifics.
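When the question is an aggregate ("how many failed?"), the tool can return the number itself rather than the rows. This is a minimal in-memory sketch; a real tool would push the aggregation into SQL with SELECT COUNT(*) instead of fetching rows:

```python
from datetime import datetime, timedelta

def count_failed_transactions(transactions, hours=24):
    # Answer the question with one integer instead of N rows of context.
    cutoff = datetime.now() - timedelta(hours=hours)
    return sum(
        1 for t in transactions
        if t["status"] == "failed" and t["timestamp"] > cutoff
    )
```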
References
- Anatomy of a Context Window - Understanding what fills the context window
- Give AI Agents the Map First - The pattern of structure before details