03 CORE STRATEGIES
Architectural patterns for reliable AI agent systems.
The Solution: External Memory & Structured State
The fundamental insight of context engineering is simple: Don't rely solely on the context window for state management. Instead, architect systems that externalize memory, manage state explicitly, and provide just-in-time information retrieval.
Strategy 1: System Prompt Architecture
The system prompt establishes foundational context: identity, capabilities, constraints, and behavioral guidelines.
Best Practices
- Define agent role explicitly: "You are a financial analyst assistant specialized in SaaS metrics..."
- Provide concrete examples rather than abstract rules
- Use structured delimiters: XML tags, markdown sections for clear boundaries
- Place critical constraints prominently: Models exhibit positional bias, so put important rules at the beginning and end
Example Structure
<role>You are a senior software architect...</role>
<constraints>
- Never modify database schema without explicit approval
- Always validate input data before processing
- Prefer composition over inheritance
</constraints>
<examples>
<!-- Concrete code examples demonstrating expected behavior -->
</examples>
Strategy 2: Memory Management
Effective context engineering requires active memory strategies to prevent context rot.
Summarization
Compress older exchanges while preserving key facts. After every N messages, generate a summary of decisions, facts, and state.
Implementation:
- Trigger summarization at regular intervals (e.g., every 10 exchanges)
- Store summaries in external database
- Inject relevant summaries into new context as needed
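The interval trigger above can be sketched as follows; `summarize` is a placeholder for an LLM call, and the in-memory `summaries` array stands in for the external database:

```typescript
interface Message { role: 'user' | 'assistant'; content: string; }

const SUMMARIZE_EVERY = 10; // trigger after every 10 exchanges

// Placeholder for an LLM request that compresses messages into key facts.
function summarize(messages: Message[]): string {
  return `Summary of ${messages.length} messages`;
}

class ConversationMemory {
  private recent: Message[] = [];
  private summaries: string[] = []; // would live in an external DB in practice

  add(message: Message): void {
    this.recent.push(message);
    if (this.recent.length >= SUMMARIZE_EVERY) {
      this.summaries.push(summarize(this.recent));
      this.recent = []; // compressed exchanges leave the active window
    }
  }

  // Context sent to the model: summaries first, then live messages.
  buildContext(): string[] {
    return [...this.summaries, ...this.recent.map(m => m.content)];
  }
}
```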
Windowing
Keep only recent N messages in active context. Older messages are archived but can be retrieved if needed.
Trade-offs:
- Pro: Keeps context lean and focused
- Con: May lose important historical context
- Tip: Combine with selective retention for best results
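A minimal sketch of windowing combined with selective retention: recent turns are kept, and older turns survive only if they carry a `pinned` flag (an illustrative field, set by whatever retention heuristic the system uses):

```typescript
interface Turn { content: string; pinned: boolean; }

// Keep the last `windowSize` turns, but never evict pinned ones
// (selective retention layered on top of plain windowing).
function applyWindow(history: Turn[], windowSize: number): Turn[] {
  const recent = history.slice(-windowSize);
  const pinnedOlder = history
    .slice(0, -windowSize) // everything before the window
    .filter(t => t.pinned);
  return [...pinnedOlder, ...recent];
}
```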
Selective Retention
Preserve critical information, prune conversational filler. Not all messages are equally important.
Heuristics for retention:
- User instructions and requirements
- Agent commitments and decisions
- Error messages and corrections
- Explicit memory commands ("Remember that...")
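The heuristics above can be approximated with a keyword filter; the regex lists here are illustrative, not exhaustive, and a production system would likely use an LLM or classifier instead:

```typescript
// Heuristic retention filter: keep instructions, commitments,
// errors, and explicit memory commands; drop likely filler.
function shouldRetain(message: string): boolean {
  const signals = [
    /\bremember that\b/i,             // explicit memory command
    /\berror\b|\bfailed\b/i,          // errors and corrections
    /\bmust\b|\bnever\b|\balways\b/i, // instructions and constraints
    /\bi will\b|\bdecided\b/i,        // agent commitments and decisions
  ];
  return signals.some(re => re.test(message));
}
```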
Episodic Memory
Write important state to external storage, retrieve on demand. This mirrors human long-term memory.
Architecture:
User Query → Check Episodic Memory → Retrieve Relevant Episodes →
Inject into Context → Generate Response → Store New Episode
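The retrieve-inject-store loop can be sketched as below. The in-memory `store` and keyword matching stand in for a real external database with vector search, and the response generation is a stub:

```typescript
interface Episode { text: string; keywords: string[]; }

// In-memory stand-in for an external episodic store.
const store: Episode[] = [];

function retrieveEpisodes(query: string): Episode[] {
  const terms = query.toLowerCase().split(/\s+/);
  return store.filter(ep => ep.keywords.some(k => terms.includes(k)));
}

function answer(query: string): string {
  const episodes = retrieveEpisodes(query);             // check episodic memory
  const context = episodes.map(e => e.text).join('\n'); // inject into context
  const response = `[context: ${context}] reply to: ${query}`; // generate (stub)
  store.push({ text: response, keywords: query.toLowerCase().split(/\s+/) }); // store new episode
  return response;
}
```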
Strategy 3: RAG Implementation
Retrieval Augmented Generation provides just-in-time information without bloating the context window.
Semantic Chunking
Break documents into meaningful segments that can be independently retrieved and understood.
Best practices:
- Chunk by logical boundaries (sections, paragraphs, code blocks)
- Maintain context overlap between chunks
- Include metadata (source, date, author)
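A sketch of the practices above, chunking on blank-line paragraph boundaries with one paragraph of overlap between consecutive chunks and source metadata attached to each:

```typescript
interface Chunk { text: string; meta: { source: string; index: number }; }

// Chunk by logical boundaries (blank-line-separated paragraphs) with
// one paragraph of overlap between consecutive chunks.
function chunkDocument(doc: string, source: string, parasPerChunk = 3): Chunk[] {
  const paras = doc.split(/\n\s*\n/).filter(p => p.trim().length > 0);
  const chunks: Chunk[] = [];
  const step = Math.max(1, parasPerChunk - 1); // overlap of one paragraph
  for (let i = 0; i < paras.length; i += step) {
    chunks.push({
      text: paras.slice(i, i + parasPerChunk).join('\n\n'),
      meta: { source, index: chunks.length },
    });
    if (i + parasPerChunk >= paras.length) break;
  }
  return chunks;
}
```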
Hybrid Search
Combine keyword and semantic search for optimal retrieval accuracy.
Approach:
- Keyword search: Fast, precise for exact matches
- Semantic search: Captures meaning, handles synonyms
- Fusion: Combine and rerank results
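One common fusion method is Reciprocal Rank Fusion, which scores each document by its rank in every result list; a sketch over two ranked lists of document ids (k = 60 is the smoothing constant commonly used with RRF):

```typescript
// Reciprocal Rank Fusion: combine keyword and semantic result lists.
function rrfFuse(keywordRanked: string[], semanticRanked: string[], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranked of [keywordRanked, semanticRanked]) {
    ranked.forEach((id, rank) => {
      // Documents ranked highly in either list accumulate more score.
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```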
Reranking
Order results by relevance to current context, not just similarity to query.
Techniques:
- Cross-encoder models for pairwise relevance scoring
- LLM-based reranking ("Which of these passages is most relevant?")
- Metadata filtering (recency, source authority)
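A minimal sketch combining a relevance score with metadata filtering: the base `score` field stands in for a cross-encoder's pairwise relevance output, and a recency penalty (an assumed weighting, tuned per application) demotes stale passages:

```typescript
interface Passage { text: string; score: number; ageDays: number; }

// Rerank retrieved passages by relevance adjusted for recency.
function rerank(passages: Passage[], recencyWeight = 0.01): Passage[] {
  return [...passages].sort((a, b) => {
    const scoreA = a.score - recencyWeight * a.ageDays;
    const scoreB = b.score - recencyWeight * b.ageDays;
    return scoreB - scoreA; // highest adjusted score first
  });
}
```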
Strategy 4: State Machines
Explicit workflow state tracking prevents agents from losing their place in multi-step processes.
Implementation Pattern
type WorkflowState =
| { stage: 'requirements_gathering', data: Requirements }
| { stage: 'design', data: DesignSpec }
| { stage: 'implementation', data: CodeArtifacts }
| { stage: 'testing', data: TestResults }
| { stage: 'deployment', data: DeploymentConfig };
// Agent always knows current stage and available transitions
Benefits
- Clarity: Agent knows exactly what stage it's in
- Validation: Prevent invalid state transitions
- Recovery: Easy to resume after interruption
- Debugging: Clear audit trail of state changes
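The validation benefit can be made concrete with a transition table over the stages defined above; the specific allowed transitions here (a linear pipeline, with testing able to send work back to implementation) are an illustrative policy, not prescriptive:

```typescript
type Stage = 'requirements_gathering' | 'design' | 'implementation' | 'testing' | 'deployment';

// Allowed transitions per stage.
const transitions: Record<Stage, Stage[]> = {
  requirements_gathering: ['design'],
  design: ['implementation'],
  implementation: ['testing'],
  testing: ['implementation', 'deployment'], // tests can bounce work back
  deployment: [],
};

// Reject invalid transitions instead of silently losing workflow state.
function transition(current: Stage, next: Stage): Stage {
  if (!transitions[current].includes(next)) {
    throw new Error(`invalid transition: ${current} -> ${next}`);
  }
  return next;
}
```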
Strategy 5: Context Budgets
Hard limits on token allocation per subsystem prevent any single component from monopolizing the context window.
Allocation Example
For a 128K token context window:
- System Prompt: 4K tokens (3%)
- Conversation History: 40K tokens (31%)
- Tool Definitions: 8K tokens (6%)
- Retrieved Documents: 32K tokens (25%)
- Working Memory: 16K tokens (13%)
- Response Generation: 28K tokens (22%)
Enforcement
- Monitor token usage in real-time
- Implement automatic summarization when budgets are exceeded
- Prioritize high-value information when making trade-offs
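A sketch of budget enforcement using the allocation above; real systems would count tokens with the model's tokenizer, and the ~4-characters-per-token heuristic here is a rough stand-in:

```typescript
// Per-subsystem budgets for a 128K-token window, mirroring the allocation above.
const budgets: Record<string, number> = {
  systemPrompt: 4_000,
  history: 40_000,
  tools: 8_000,
  retrieved: 32_000,
  workingMemory: 16_000,
  response: 28_000,
};

// Crude proxy for a tokenizer: roughly 4 characters per token.
function countTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Returns true if the component fits its budget; callers would trigger
// summarization or pruning when this returns false.
function withinBudget(component: string, text: string): boolean {
  return countTokens(text) <= budgets[component];
}
```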
Strategy 6: Tool Integration
Provide agents with specific, well-defined capabilities rather than expecting them to do everything through generation.
Tool Design Principles
- Single Responsibility: Each tool does one thing well
- Clear Interfaces: Explicit input/output schemas
- Error Handling: Tools return structured errors, not exceptions
- Idempotency: Safe to call multiple times with same inputs
Example Tools
- search_codebase(query: string): Semantic code search
- get_design_system(): Retrieve current design tokens
- save_decision(decision: string): Store important choices
- query_database(sql: string): Execute read-only queries
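One of the tools above can be sketched to show the structured-error principle in practice; `query_database` here is a stub illustrating the interface shape, not a real database client:

```typescript
// Tools return structured results rather than throwing exceptions,
// so the agent can reason about failures.
type ToolResult<T> =
  | { ok: true; value: T }
  | { ok: false; error: string };

function query_database(sql: string): ToolResult<string[]> {
  // Enforce the read-only contract before execution.
  if (!/^\s*select\b/i.test(sql)) {
    return { ok: false, error: 'only read-only SELECT queries are allowed' };
  }
  return { ok: true, value: [] }; // stub: a real tool would execute the query
}
```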
Putting It All Together
Effective context engineering combines these strategies into a cohesive architecture:
- System Prompt establishes identity and constraints
- Memory Management keeps context lean and relevant
- RAG provides just-in-time information access
- State Machines track workflow progress
- Context Budgets prevent resource exhaustion
- Tools extend capabilities beyond generation
The result: AI agents that maintain consistency, recall accurately, and scale to complex, long-running tasks.