Environment Modeling and Perception
How an agent understands and represents its world determines its effectiveness.
Structured vs. Unstructured Inputs
Structured: JSON, database records, API responses, forms
Unstructured: Free text, images, audio, video, PDFs
Agents must parse both into a format they can reason about.
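As a minimal sketch of that idea, the snippet below maps either kind of input onto one common record. The `Observation` class and `normalize_input` function are hypothetical names chosen for illustration, and treating "valid JSON object" as the test for structured input is a simplifying assumption, not a general rule.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Observation:
    """Common representation the agent reasons over (hypothetical schema)."""
    source: str                                  # "structured" or "unstructured"
    fields: dict = field(default_factory=dict)   # named values, if any
    text: str = ""                               # raw text, if any


def normalize_input(raw: str) -> Observation:
    """Treat valid JSON objects as structured input, everything else as free text."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return Observation(source="unstructured", text=raw.strip())
    if isinstance(parsed, dict):
        return Observation(source="structured", fields=parsed)
    # JSON scalars and arrays carry no named fields, so keep them as text
    return Observation(source="unstructured", text=raw.strip())


structured = normalize_input('{"order_id": 42}')
free_text = normalize_input("  My order is late ")
```

Images, audio, and PDFs would need their own extraction step (OCR, transcription, and so on) before reaching a function like this one.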
Parsing and Preprocessing
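# Example: parsing user input into a structured format
import json

def parse_user_request(raw_input: str) -> dict:
    # Use an LLM to extract structured information.
    # `llm` is assumed to be a client object defined elsewhere.
    response = llm.generate(
        f"Extract intent and entities from: '{raw_input}'\n"
        f"Return JSON: {{intent, entities, urgency}}"
    )
    # json.loads raises json.JSONDecodeError if the reply is not valid JSON
    return json.loads(response)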
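In practice a model may wrap its JSON in extra prose or omit a field, so the parsed result is usually worth validating before the agent acts on it. The following is a minimal sketch under that assumption. `parse_llm_json` and `REQUIRED_KEYS` are hypothetical names, and the key set simply mirrors the fields requested in the prompt above.

```python
import json

# Keys the prompt asks the model to return (assumed to be mandatory here)
REQUIRED_KEYS = {"intent", "entities", "urgency"}


def parse_llm_json(response: str) -> dict:
    """Extract and validate the outermost JSON object in an LLM reply."""
    start = response.find("{")
    end = response.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in response")
    # json.JSONDecodeError is a subclass of ValueError, so callers
    # can catch ValueError for both malformed and incomplete replies
    data = json.loads(response[start:end + 1])
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data


reply = 'Sure! {"intent": "refund", "entities": ["order 42"], "urgency": "high"} Hope that helps.'
request = parse_llm_json(reply)
```

Slicing from the first `{` to the last `}` is a deliberately simple heuristic. It fails if the reply contains several separate JSON objects, in which case a stricter output format is the safer choice.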
Context Management and Memory
Agents need to manage which information is available at each scope:
- Immediate context: Current conversation/task
- Session context: Everything exchanged so far in this interaction, including earlier tasks
- Persistent context: User preferences, history across sessions
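The three scopes above can be sketched as one small container. This is an illustrative layout, not a prescribed design: `AgentContext` and its method names are hypothetical, and persistent context is held in a plain dict where a real agent would load it from storage.

```python
from dataclasses import dataclass, field


@dataclass
class AgentContext:
    immediate: list[str] = field(default_factory=list)        # current task only
    session: list[str] = field(default_factory=list)          # whole interaction
    persistent: dict[str, str] = field(default_factory=dict)  # survives sessions

    def add_turn(self, message: str) -> None:
        # Every turn belongs to both the current task and the session
        self.immediate.append(message)
        self.session.append(message)

    def start_new_task(self) -> None:
        # Immediate context resets; session history is retained
        self.immediate.clear()

    def build_prompt_context(self, max_turns: int = 5) -> str:
        # Preferences first, then the most recent turns of the current task
        prefs = [f"{key}: {value}" for key, value in sorted(self.persistent.items())]
        recent = self.immediate[-max_turns:]
        return "\n".join(prefs + recent)


ctx = AgentContext(persistent={"language": "en"})
ctx.add_turn("user: hi")
ctx.start_new_task()
ctx.add_turn("user: book a flight")
prompt_context = ctx.build_prompt_context()
```

Capping the prompt at `max_turns` is one simple way to keep the immediate context within a model's input limit. Summarizing older turns is a common alternative.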
Short-term vs. Long-term Memory
| Type | Storage | Duration | Example |
| --- | --- | --- | --- |
| Short-term | Conversation buffer | Current session | Chat history |
| Working | Scratchpad | Current task | Intermediate results |
| Long-term | Vector DB | Permanent | User preferences, past interactions |
| Episodic | Event store | Permanent | Specific past experiences |
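The four rows of the table can be sketched with standard-library containers. Everything here is a simplified assumption: `AgentMemory` is a hypothetical class, the in-memory lists stand in for a real vector database and event store, and `recall` ranks facts by word overlap rather than by embedding similarity.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    # Short-term: bounded conversation buffer (oldest turn evicted when full)
    short_term: deque = field(default_factory=lambda: deque(maxlen=3))
    # Working: scratchpad for intermediate results of the current task
    working: dict = field(default_factory=dict)
    # Long-term: stand-in for a vector DB
    long_term: list = field(default_factory=list)
    # Episodic: append-only log of (step, description) events
    episodic: list = field(default_factory=list)

    def remember_turn(self, message: str) -> None:
        self.short_term.append(message)

    def store_fact(self, fact: str) -> None:
        self.long_term.append(fact)

    def log_event(self, step: int, description: str) -> None:
        self.episodic.append((step, description))

    def finish_task(self) -> None:
        # Working memory lasts only as long as the task
        self.working.clear()

    def recall(self, query: str, top_k: int = 1) -> list:
        """Return the stored facts sharing the most words with the query."""
        query_words = set(query.lower().split())
        scored = [
            (len(query_words & set(fact.lower().split())), fact)
            for fact in self.long_term
        ]
        scored = [item for item in scored if item[0] > 0]
        scored.sort(key=lambda item: item[0], reverse=True)
        return [fact for _, fact in scored[:top_k]]


memory = AgentMemory()
for i in range(5):
    memory.remember_turn(f"turn {i}")
memory.store_fact("user prefers window seats")
memory.store_fact("user is vegetarian")
matches = memory.recall("which seats does the user like")
```

With `maxlen=3`, only the three most recent turns remain in `short_term` after the loop, which illustrates why anything worth keeping beyond the session has to be written to long-term or episodic storage explicitly.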
Key Takeaways
- Agents must handle both structured and unstructured inputs
- Preprocessing converts raw data into actionable information
- Memory architecture determines what the agent can remember and for how long
- Choose memory types based on your agent's needs