Chapter 11
Agent Memory, Context Management, and Context Windows
Learning Objective
Learn how agents remember useful information without overwhelming the LLM context window.
What it means
Agent memory is how an AI system stores and reuses information across steps or sessions. Context management is the practice of deciding which information to include in each model call. Because the context window holds only a fixed number of tokens, the system must select what to send rather than pass everything to the LLM.
Healthcare Example
For a clinical review assistant, short-term memory may include case ID, document type, extracted diagnosis, and missing fields. Long-term memory may include previous similar cases or policy history. The prompt should include only what is relevant to the current decision.
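One way to include only what is relevant is to filter the case state before building the prompt. The sketch below is a minimal illustration, not a prescribed design: `select_context` and `render_prompt` are hypothetical helpers, and the choice of which fields a given decision needs is an assumption made for the example.

```python
# Hypothetical helper: keep only the fields the current decision needs.
def select_context(state, relevant_keys):
    """Return a copy of state restricted to the relevant keys that are present."""
    return {key: state[key] for key in relevant_keys if key in state}


def render_prompt(task, context):
    """Format the task and the selected fields as plain text for a model call."""
    lines = [f"Task: {task}", "Case context:"]
    for key, value in context.items():
        lines.append(f"- {key}: {value}")
    return "\n".join(lines)


case_state = {
    "case_id": "CASE-1001",
    "document_type": "clinical_note",
    "diagnosis_code": "E11.9",
    "missing_fields": [],
    "confidence": 0.91,
    "next_action": "review",
}

# Assumed for illustration: this check needs only these three fields.
context = select_context(case_state, ["case_id", "document_type", "diagnosis_code"])
prompt = render_prompt("Check that the diagnosis code matches the note.", context)
print(prompt)
```

Fields such as `confidence` and `next_action` stay in workflow state and never reach the model unless a later step asks for them.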
Types of Memory
| Memory Type | Purpose | Storage |
|---|---|---|
| Short-term workflow memory | Current case state | LangGraph state, Redis |
| Long-term memory | Historical knowledge or preferences | SQL database, vector database |
| Conversation memory | Recent chat history | Session storage, summaries |
| Audit memory | Trace of actions and decisions | Logs, database records |
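The audit memory row in the table can be sketched as an append-only list of timestamped records. This is a minimal sketch under stated assumptions: `record_audit_event` and the record fields are hypothetical, and the in-memory list stands in for whatever log or database table a real system would use.

```python
import json
from datetime import datetime, timezone


def record_audit_event(audit_log, case_id, action, detail):
    """Append one record describing an action taken on a case and return it."""
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "case_id": case_id,
        "action": action,
        "detail": detail,
    }
    audit_log.append(event)
    return event


audit_log = []  # stand-in for a log file or database table
record_audit_event(audit_log, "CASE-1001", "extract_diagnosis", "diagnosis_code=E11.9")
record_audit_event(audit_log, "CASE-1001", "route", "next_action=review")

# Emit one JSON object per line, a common shape for log records.
for event in audit_log:
    print(json.dumps(event))
```

Keeping audit records separate from workflow state means the trace survives even when the working state is overwritten or discarded.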
Code: Workflow State Management
```python
# Workflow state for one case, held as a plain dictionary.
case_state = {
    "case_id": "CASE-1001",
    "document_type": "clinical_note",
    "diagnosis_code": "E11.9",
    "missing_fields": [],
    "confidence": 0.91,
    "next_action": "review",
}


def update_state(state, key, value):
    """Set one key on the state in place and return the same dictionary."""
    state[key] = value
    return state


case_state = update_state(case_state, "review_reason", "Confidence below threshold")
print(case_state)
```
Common Mistakes
- Putting all history into every prompt.
- No separation between workflow state and long-term memory.
- No summarization strategy.
- No data retention policy.
- Storing sensitive data without access controls.
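As an illustration of the retention point in the list above, the sketch below drops stored records that are older than a fixed window. The 30-day value, the `stored_at` field, and the `purge_expired` helper are all assumptions for the example; a real retention period would come from the organization's compliance requirements.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)  # assumed value, not a recommendation


def purge_expired(records, now, retention=RETENTION):
    """Return only the records whose stored_at falls inside the retention window."""
    cutoff = now - retention
    return [r for r in records if r["stored_at"] >= cutoff]


now = datetime(2024, 6, 30, tzinfo=timezone.utc)
records = [
    {"case_id": "CASE-0900", "stored_at": datetime(2024, 5, 1, tzinfo=timezone.utc)},
    {"case_id": "CASE-1001", "stored_at": datetime(2024, 6, 20, tzinfo=timezone.utc)},
]

kept = purge_expired(records, now)
print([r["case_id"] for r in kept])  # prints ['CASE-1001']
```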
Interview Q&A
Q: How do you manage agent memory?
A: I separate workflow state, short-term conversation context, long-term knowledge, and audit history. I use retrieval and summaries instead of sending everything to the model.
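The summaries mentioned in this answer can be sketched as a rolling window: the newest turns are kept verbatim and older turns are folded into a summary. This is a minimal sketch; `compact_history` is a hypothetical helper, and `naive_summarize` is a placeholder for a real summarizer, which would likely be a model call.

```python
def compact_history(turns, keep_last, summarize):
    """Keep the newest keep_last turns verbatim; fold older turns into a summary."""
    split = max(len(turns) - keep_last, 0)
    older, recent = turns[:split], turns[split:]
    summary = summarize(older) if older else ""
    return {"summary": summary, "recent": recent}


def naive_summarize(turns):
    """Placeholder summarizer used only to make the example runnable."""
    return f"{len(turns)} earlier turns omitted; first was: {turns[0]}"


turns = [
    "user: Open CASE-1001",
    "agent: Loaded clinical_note",
    "user: What is the diagnosis code?",
    "agent: E11.9",
]

memory = compact_history(turns, keep_last=2, summarize=naive_summarize)
print(memory["summary"])
print(memory["recent"])
```

Only `memory["summary"]` and `memory["recent"]` would go into the next prompt, so the prompt size stays bounded as the conversation grows.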
Q: How do you handle context window limits?
A: I use chunking, retrieval, summarization, token budgeting, and strict prompt construction.
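Token budgeting can be sketched as greedy packing: rank retrieved chunks by relevance and add them until the budget is spent. The sketch below rests on loud assumptions: `estimate_tokens` uses a rough four-characters-per-token heuristic rather than a real tokenizer, the scores are assumed to come from an earlier retrieval step, and the chunk texts are placeholders.

```python
def estimate_tokens(text):
    """Rough estimate only (assumption: about 4 characters per token)."""
    return max(1, len(text) // 4)


def pack_within_budget(chunks, budget):
    """Greedily select the highest-scoring chunks whose estimated total fits budget."""
    selected, used = [], 0
    for chunk in sorted(chunks, key=lambda c: c["score"], reverse=True):
        cost = estimate_tokens(chunk["text"])
        if used + cost <= budget:
            selected.append(chunk)
            used += cost
    return selected, used


# Placeholder chunks; scores are assumed to come from a retrieval step.
chunks = [
    {"id": "prior_case", "text": "b" * 120, "score": 0.55},
    {"id": "policy_excerpt", "text": "a" * 80, "score": 0.92},
    {"id": "general_notes", "text": "c" * 400, "score": 0.10},
]

selected, used = pack_within_budget(chunks, budget=60)
print([c["id"] for c in selected], used)
```

In practice the estimate would be replaced with the tokenizer of the model being called, and part of the budget would be reserved for instructions and the response.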
Architect Takeaway
The goal is not to remember everything. The goal is to retrieve the right information at the right time with the least risk and cost.