Memory and State: How Agents Remember and Keep Track

One of the most common questions people have when they start building agents is: why does it seem to forget what I told it? Or the opposite: how do I make it remember something across sessions?

The answer to both questions is memory. And memory in an agentic system is more layered than most people expect when they first encounter it.

Why memory matters more for agents than for chatbots

In a simple chatbot, memory is just the conversation. Everything that was said in the current session is available to the model. When you close the tab, it forgets.

Agents often need more than that. A task that runs across multiple steps, or a system that helps the same user across many days, needs a way to hold information that goes beyond what fits in one conversation window.

The way you design memory has a direct impact on what your agent can accomplish, how consistent it feels, and how well it handles tasks that span time.

The four types of memory

Think of memory in agents as four separate stores, each serving a different purpose.

In-context memory (short-term) This is the conversation history in the current session. Everything said so far, including tool calls and their results, is in-context memory. It is fast, immediately available to the model, and gone when the session ends. Most simple agents rely entirely on this.

The limitation is size. Every model has a maximum context window, and long-running tasks can push past it. When that happens, the model can no longer see the earlier parts of the conversation.

External memory (long-term) This is information stored in a database outside the model. A vector database, a SQL database, a file, or a key-value store. The agent writes information to it during a session and retrieves it later, either in the same session or a future one.

This is how you build an agent that remembers your preferences across weeks, tracks the status of an ongoing project, or builds up a personal knowledge base over time.

Episodic memory (what happened before) This is a record of past runs or interactions, stored somewhere the agent can look back at. Think of it as a log that the agent can consult: "What did I do the last time I processed invoices for this client?" or "What did the user ask me to change about this document last week?"

Episodic memory is useful for agents that repeat tasks, need to learn from past outcomes, or need to explain what they did and why.

Semantic memory (what the agent knows) This is background knowledge the agent can look up, distinct from the conversation. A product catalog, a company wiki, a library of past reports. Usually stored in a vector database and retrieved using similarity search.

When an agent uses RAG (retrieval-augmented generation), it is drawing on semantic memory. The agent searches the knowledge base for context relevant to the current task and pulls it into the conversation before responding.

How state flows through a multi-step task

State is the record of where a task currently stands. Which steps are done, what was found, what decisions were made, what still needs to happen.

In a short agent loop, state lives in the conversation. The model sees everything that happened and uses it to decide what to do next.

For longer tasks, state often needs to be explicitly managed. Here is what that looks like in practice:

Imagine an agent that processes job applications. The task has several steps: read each application, score it against a rubric, flag the top candidates, and draft a summary email. If the agent reads 50 applications in one run, it needs to track which ones it has read, what score each received, and which ones cleared the threshold.

That state can be stored in a simple object updated at each step:

{
  "applications_reviewed": 38,
  "applications_remaining": 12,
  "top_candidates": ["Alice W.", "James T.", "Priya K."],
  "current_step": "scoring"
}

The agent reads and updates this object as it works through the task. If it gets interrupted, it can resume from where it left off. If it needs to report progress, it has a clear record to draw from.

Practical patterns for managing memory

Pattern 1: Write a summary at the end of each session At the end of a work session, have the agent write a brief summary of what happened and what was decided. Store it in an external file or database. At the start of the next session, load the summary into the context. This is one of the simplest and most reliable ways to give an agent working memory across sessions.

Pattern 2: Use a key-value store for preferences For things that do not change often but need to be remembered across every session, a key-value store works well. The agent looks up preferences at the start of each task.

Example keys:

  • user.writing_style = "short paragraphs, no jargon"
  • user.preferred_email_tone = "warm and direct"
  • project.henderson.main_contact = "sarah@hendersonmfg.com"

Pattern 3: Store decisions with their reasons When an agent makes a choice during a task, log both the choice and why it was made. This makes the agent's behavior understandable and correctable. If it made a bad call, you can see exactly what reasoning led there.

Pattern 4: Chunk long tasks into resumable stages For any task that might take a long time or might be interrupted, break it into stages with a clear checkpoint between each one. The agent completes a stage, saves its state, and only moves to the next stage when the previous one is confirmed complete.

What to think about when designing memory for an agent

Three questions are worth asking before you decide on a memory approach:

Does this agent need to remember anything across sessions? If yes, you need at minimum a way to persist a summary or key context to an external store. In-context memory alone will not be enough.

Does this task take long enough that the context window might fill up? For tasks over a few thousand words of context, plan for this. Summarize older parts of the conversation as it grows.

What would the agent need to know to pick up where it left off after an interruption? The answer to that question is exactly what needs to be in your state object. Write it down before you build the rest.

Discussion

  • Loading…

← Back to Tutorials