03 CORE STRATEGIES

Architectural patterns for reliable AI agent systems.

The Solution: External Memory & Structured State

The fundamental insight of context engineering is simple: Don't rely solely on the context window for state management. Instead, architect systems that externalize memory, manage state explicitly, and provide just-in-time information retrieval.


Strategy 1: System Prompt Architecture

The system prompt establishes foundational context: identity, capabilities, constraints, and behavioral guidelines.

Best Practices

  • Define agent role explicitly: "You are a financial analyst assistant specialized in SaaS metrics..."
  • Provide concrete examples rather than abstract rules
  • Use structured delimiters: XML tags, markdown sections for clear boundaries
  • Place critical constraints prominently: Models exhibit positional bias, so put important rules at the beginning and end

Example Structure

<role>You are a senior software architect...</role>

<constraints>
- Never modify database schema without explicit approval
- Always validate input data before processing
- Prefer composition over inheritance
</constraints>

<examples>
<!-- Concrete code examples demonstrating expected behavior -->
</examples>

Strategy 2: Memory Management

Effective context engineering requires active memory strategies to prevent context rot.

Summarization

Compress older exchanges while preserving key facts. After every N messages, generate a summary of decisions, facts, and state.

Implementation:

  • Trigger summarization at regular intervals (e.g., every 10 exchanges)
  • Store summaries in external database
  • Inject relevant summaries into new context as needed
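The trigger-and-store loop above can be sketched as follows. The `llmSummarize` stub stands in for a real (normally async) model call, and `summaries` stands in for the external database; both are assumptions for illustration:

```typescript
interface Message { role: "user" | "assistant"; content: string }

const SUMMARY_INTERVAL = 10; // summarize every 10 exchanges

// Stand-in for a real model call that compresses a window of messages
// into a short summary of decisions, facts, and state.
function llmSummarize(messages: Message[]): string {
  return `Summary of ${messages.length} messages`;
}

class SummarizingMemory {
  private recent: Message[] = [];
  readonly summaries: string[] = []; // would live in an external database

  add(message: Message): void {
    this.recent.push(message);
    if (this.recent.length >= SUMMARY_INTERVAL) {
      // Compress the oldest window, archive the summary, reset the buffer.
      this.summaries.push(llmSummarize(this.recent));
      this.recent = [];
    }
  }

  // Inject archived summaries plus recent messages into a new context.
  buildContext(): string[] {
    return [...this.summaries, ...this.recent.map(m => m.content)];
  }
}
```
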

Windowing

Keep only recent N messages in active context. Older messages are archived but can be retrieved if needed.

Trade-offs:

  • ✅ Keeps context lean and focused
  • ❌ May lose important historical context
  • 💡 Combine with selective retention for best results
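A minimal windowing helper, assuming the window size (20 here) is a tunable parameter:

```typescript
// Keep only the most recent N messages active; everything older is
// archived and can be retrieved later if needed.
function applyWindow<T>(history: T[], windowSize = 20): { active: T[]; archived: T[] } {
  return {
    active: history.slice(-windowSize),
    archived: history.slice(0, -windowSize), // empty when history fits the window
  };
}
```
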

Selective Retention

Preserve critical information and prune conversational filler. Not all messages are equally important.

Heuristics for retention:

  • User instructions and requirements
  • Agent commitments and decisions
  • Error messages and corrections
  • Explicit memory commands ("Remember that...")
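The heuristics above can be encoded as a simple keep/prune predicate. The regex patterns here are illustrative, not exhaustive:

```typescript
interface Msg { role: string; content: string }

// Heuristic retention check mirroring the list above: keep instructions,
// decisions, errors, and explicit memory commands; prune the rest.
function shouldRetain(msg: Msg): boolean {
  const c = msg.content.toLowerCase();
  if (msg.role === "user" && /\b(must|require|need|should)\b/.test(c)) return true; // instructions
  if (/\b(decided|i will|choosing)\b/.test(c)) return true; // commitments and decisions
  if (/\b(error|exception|failed|correction)\b/.test(c)) return true; // errors and fixes
  if (c.startsWith("remember that")) return true; // explicit memory commands
  return false; // conversational filler
}
```
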

Episodic Memory

Write important state to external storage, retrieve on demand. This mirrors human long-term memory.

Architecture:

User Query → Check Episodic Memory → Retrieve Relevant Episodes →
Inject into Context → Generate Response → Store New Episode
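The store-and-retrieve halves of this pipeline might look like the sketch below. Retrieval here is naive keyword overlap; a real system would use embeddings and a vector index:

```typescript
interface Episode { id: number; text: string; keywords: string[] }

// In-memory stand-in for an external episode store.
const store: Episode[] = [];
let nextId = 0;

function storeEpisode(text: string): void {
  store.push({ id: nextId++, text, keywords: text.toLowerCase().split(/\W+/) });
}

// Score episodes by keyword overlap with the query and return the top k.
function retrieveEpisodes(query: string, k = 3): Episode[] {
  const terms = query.toLowerCase().split(/\W+/).filter(t => t.length > 0);
  return [...store]
    .map(e => ({ e, score: terms.filter(t => e.keywords.includes(t)).length }))
    .filter(({ score }) => score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map(({ e }) => e);
}
```
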

Strategy 3: RAG Implementation

Retrieval Augmented Generation provides just-in-time information without bloating the context window.

Semantic Chunking

Break documents into meaningful segments that can be independently retrieved and understood.

Best practices:

  • Chunk by logical boundaries (sections, paragraphs, code blocks)
  • Maintain context overlap between chunks
  • Include metadata (source, date, author)
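A paragraph-boundary chunker with one-paragraph overlap and source metadata might be sketched like this (the chunk size of three paragraphs is an arbitrary choice):

```typescript
interface Chunk { text: string; source: string; index: number }

// Split on blank lines (logical paragraph boundaries) and carry one
// trailing paragraph into the next chunk as overlap for continuity.
function chunkDocument(doc: string, source: string, maxParas = 3): Chunk[] {
  const paras = doc.split(/\n\s*\n/).filter(p => p.trim().length > 0);
  const chunks: Chunk[] = [];
  for (let i = 0; i < paras.length; i += maxParas) {
    const start = Math.max(0, i - 1); // one-paragraph overlap with previous chunk
    chunks.push({
      text: paras.slice(start, i + maxParas).join("\n\n"),
      source,
      index: chunks.length,
    });
  }
  return chunks;
}
```
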

Hybrid Search

Combine keyword and semantic search for optimal retrieval accuracy.

Approach:

  1. Keyword search: Fast, precise for exact matches
  2. Semantic search: Captures meaning, handles synonyms
  3. Fusion: Combine and rerank results
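The fusion step is often done with reciprocal rank fusion (RRF), which merges the two rankings without having to normalize their score scales; k = 60 is the conventional constant:

```typescript
// Reciprocal Rank Fusion: each result earns 1 / (k + rank + 1) from every
// ranking it appears in; items ranked well by both lists rise to the top.
function fuseRankings(keywordIds: string[], semanticIds: string[], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of [keywordIds, semanticIds]) {
    ranking.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()].sort((a, b) => b[1] - a[1]).map(([id]) => id);
}
```
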

Reranking

Order results by relevance to current context, not just similarity to query.

Techniques:

  • Cross-encoder models for pairwise relevance scoring
  • LLM-based reranking ("Which of these passages is most relevant?")
  • Metadata filtering (recency, source authority)
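Metadata-aware reranking can be as simple as blending similarity with recency and authority signals. The weights below are illustrative assumptions that would normally be tuned:

```typescript
interface Doc { id: string; similarity: number; ageDays: number; authority: number }

// Blend raw query similarity with an exponential recency decay and a
// source-authority score (both in [0, 1]); weights sum to 1.
function rerank(docs: Doc[]): Doc[] {
  const score = (d: Doc) =>
    0.6 * d.similarity + 0.2 * Math.exp(-d.ageDays / 365) + 0.2 * d.authority;
  return [...docs].sort((a, b) => score(b) - score(a));
}
```
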

Strategy 4: State Machines

Explicit workflow state tracking prevents agents from losing their place in multi-step processes.

Implementation Pattern

type WorkflowState = 
  | { stage: 'requirements_gathering', data: Requirements }
  | { stage: 'design', data: DesignSpec }
  | { stage: 'implementation', data: CodeArtifacts }
  | { stage: 'testing', data: TestResults }
  | { stage: 'deployment', data: DeploymentConfig };

// Agent always knows current stage and available transitions
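A transition guard on top of that union keeps the agent from jumping stages. The allowed-transition map below, including the testing-to-implementation loop for failed tests, is an assumption about this particular workflow:

```typescript
type Stage =
  | "requirements_gathering"
  | "design"
  | "implementation"
  | "testing"
  | "deployment";

// Which stage may legally follow which.
const allowed: Record<Stage, Stage[]> = {
  requirements_gathering: ["design"],
  design: ["implementation"],
  implementation: ["testing"],
  testing: ["deployment", "implementation"], // failed tests loop back
  deployment: [],
};

function transition(current: Stage, next: Stage): Stage {
  if (!allowed[current].includes(next)) {
    throw new Error(`Invalid transition: ${current} -> ${next}`);
  }
  return next; // a real system would also log this for the audit trail
}
```

Because invalid moves throw, recovery after an interruption is just reloading the last persisted stage and resuming from the map.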

Benefits

  • Clarity: Agent knows exactly what stage it's in
  • Validation: Prevent invalid state transitions
  • Recovery: Easy to resume after interruption
  • Debugging: Clear audit trail of state changes

Strategy 5: Context Budgets

Hard limits on token allocation per subsystem prevent any single component from monopolizing the context window.

Allocation Example

For a 128K token context window:

  • System Prompt: 4K tokens (3%)
  • Conversation History: 40K tokens (31%)
  • Tool Definitions: 8K tokens (6%)
  • Retrieved Documents: 32K tokens (25%)
  • Working Memory: 16K tokens (13%)
  • Response Generation: 28K tokens (22%)

Enforcement

  • Monitor token usage in real-time
  • Implement automatic summarization when budgets are exceeded
  • Prioritize high-value information when making trade-offs
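A budget check matching the allocation above might be sketched as follows. The chars-divided-by-four token estimate is a rough heuristic; a real system would use the model's actual tokenizer:

```typescript
// Token budgets per subsystem for a 128K window, matching the allocation above.
const budgets: Record<string, number> = {
  systemPrompt: 4_000,
  history: 40_000,
  tools: 8_000,
  retrieved: 32_000,
  workingMemory: 16_000,
  response: 28_000,
};

// Crude estimate: roughly four characters per token for English text.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Unknown subsystems get a zero budget, so nothing slips past the caps.
function fitsBudget(subsystem: string, text: string): boolean {
  return estimateTokens(text) <= (budgets[subsystem] ?? 0);
}
```

When `fitsBudget` fails, the enforcement policy above kicks in: summarize or prune until the subsystem is back under its cap.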

Strategy 6: Tool Integration

Provide agents with specific, well-defined capabilities rather than expecting them to do everything through generation.

Tool Design Principles

  1. Single Responsibility: Each tool does one thing well
  2. Clear Interfaces: Explicit input/output schemas
  3. Error Handling: Tools return structured errors, not exceptions
  4. Idempotency: Safe to call multiple times with same inputs

Example Tools

  • search_codebase(query: string): Semantic code search
  • get_design_system(): Retrieve current design tokens
  • save_decision(decision: string): Store important choices
  • query_database(sql: string): Execute read-only queries
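These principles can be captured in a shared tool shape. The `query_database` stub below is hypothetical; the point is the structured `ToolResult`, which returns errors as data instead of throwing:

```typescript
// Structured result type: tools return errors as values, never throw.
type ToolResult<T> = { ok: true; value: T } | { ok: false; error: string };

interface Tool<I, O> {
  name: string;
  description: string;
  run(input: I): ToolResult<O>;
}

// Hypothetical read-only query tool: single responsibility, explicit
// interface, structured errors, and idempotent (reads only).
const queryDatabase: Tool<string, string[]> = {
  name: "query_database",
  description: "Execute read-only SQL queries",
  run(sql) {
    if (!/^\s*select\b/i.test(sql)) {
      return { ok: false, error: "Only SELECT statements are allowed" };
    }
    return { ok: true, value: [] }; // stub: no real database attached
  },
};
```
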

Putting It All Together

Effective context engineering combines these strategies into a cohesive architecture:

  1. System Prompt establishes identity and constraints
  2. Memory Management keeps context lean and relevant
  3. RAG provides just-in-time information access
  4. State Machines track workflow progress
  5. Context Budgets prevent resource exhaustion
  6. Tools extend capabilities beyond generation

The result: AI agents that maintain consistency, recall accurately, and scale to complex, long-running tasks.