How to build a memory-driven AI agent that continuously learns through contextual experience and semantic patterns to achieve long-term autonomy
In this tutorial, we explore how to use memory as a core capability for building agent systems that think beyond a single interaction. We design episodic memory to store specific experiences and semantic memory to capture long-term patterns, allowing the agent to evolve its behavior across multiple sessions. As the agent plans, acts, revises, and reflects, we watch it gradually adapt to user preferences and become more autonomous. Finally, we see how memory-driven reasoning helps us create agents that feel more contextual, consistent, and intelligent with every interaction.
import numpy as np
from collections import defaultdict
import json
from datetime import datetime
import pickle


class EpisodicMemory:
    """Stores individual (state, action, outcome) experiences as episodes."""

    def __init__(self, capacity=100):
        self.capacity = capacity
        self.episodes = []

    def store(self, state, action, outcome, timestamp=None):
        if timestamp is None:
            timestamp = datetime.now().isoformat()
        episode = {
            'state': state,
            'action': action,
            'outcome': outcome,
            'timestamp': timestamp,
            'embedding': self._embed(state, action, outcome)
        }
        self.episodes.append(episode)
        # Evict the oldest episode once capacity is exceeded (FIFO)
        if len(self.episodes) > self.capacity:
            self.episodes.pop(0)

    def _embed(self, state, action, outcome):
        # Toy stand-in for a real embedding model: a hash bucket.
        # Note: Python randomizes str hashes per process, so these
        # "embeddings" are only stable within a single run.
        text = f"{state} {action} {outcome}".lower()
        return hash(text) % 10000

    def retrieve_similar(self, query_state, k=3):
        if not self.episodes:
            return []
        query_emb = self._embed(query_state, "", "")
        scores = [(abs(ep['embedding'] - query_emb), ep) for ep in self.episodes]
        scores.sort(key=lambda x: x[0])
        return [ep for _, ep in scores[:k]]

    def get_recent(self, n=5):
        return self.episodes[-n:]


class SemanticMemory:
    """Generalizes across episodes: preferences, patterns, and success rates."""

    def __init__(self):
        self.preferences = defaultdict(float)
        self.patterns = defaultdict(list)
        self.success_rates = defaultdict(lambda: {'success': 0, 'total': 0})

    def update_preference(self, key, value, weight=1.0):
        # Exponential moving average so older preferences decay gradually
        self.preferences[key] = 0.9 * self.preferences[key] + 0.1 * weight * value

    def record_pattern(self, context, action, success):
        pattern_key = f"{context}_{action}"
        self.patterns[context].append((action, success))
        self.success_rates[pattern_key]['total'] += 1
        if success:
            self.success_rates[pattern_key]['success'] += 1

    def get_best_action(self, context):
        if context not in self.patterns:
            return None
        action_scores = defaultdict(lambda: {'success': 0, 'total': 0})
        for action, success in self.patterns[context]:
            action_scores[action]['total'] += 1
            if success:
                action_scores[action]['success'] += 1
        # Pick the action with the highest empirical success rate
        best_action = max(action_scores.items(),
                          key=lambda x: x[1]['success'] / max(x[1]['total'], 1))
        return best_action[0] if best_action[1]['total'] > 0 else None

    def get_preference(self, key):
        return self.preferences.get(key, 0.0)
We define the core memory structures the agent relies on: episodic memory to capture specific experiences and semantic memory to generalize patterns over time. With these foundations in place, the agent can learn from its interactions much as humans do.
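Before wiring these stores into an agent, it helps to see them in isolation. The following is a minimal sketch of our own (not part of the tutorial code) that exercises both classes with a few hypothetical entries:

# Quick, hypothetical smoke test of the two memory stores defined above
episodic = EpisodicMemory(capacity=10)
episodic.store("user asked for sci-fi", "recommend sci-fi", "user was happy")
episodic.store("user asked for fantasy", "recommend fantasy", "user declined")

semantic = SemanticMemory()
semantic.update_preference('genre_sci-fi', 1.0)  # EMA: 0.9*0.0 + 0.1*1.0 = 0.1
semantic.record_pattern('recommendation', 'recommend sci-fi', success=True)

similar = episodic.retrieve_similar("user asked for sci-fi", k=1)
print(similar[0]['state'])                         # nearest by the toy hash distance
print(semantic.get_preference('genre_sci-fi'))     # 0.1 after one update
print(semantic.get_best_action('recommendation'))  # 'recommend sci-fi'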
class MemoryAgent:
    def __init__(self):
        self.episodic_memory = EpisodicMemory(capacity=50)
        self.semantic_memory = SemanticMemory()
        self.current_plan = []
        self.session_count = 0

    def perceive(self, user_input):
        # Keyword-based intent detection keeps the demo dependency-free
        user_input = user_input.lower()
        if any(word in user_input for word in ['recommend', 'suggest', 'what should']):
            intent = 'recommendation'
        elif any(word in user_input for word in ['remember', 'prefer', 'like', 'favorite']):
            intent = 'preference_update'
        elif any(word in user_input for word in ['do', 'complete', 'finish', 'task']):
            intent = 'task_execution'
        else:
            intent = 'conversation'
        return {'intent': intent, 'raw': user_input}

    def plan(self, state):
        intent = state['intent']
        user_input = state['raw']
        # Retrieve related past episodes as context (not yet consumed by this simple planner)
        similar_episodes = self.episodic_memory.retrieve_similar(user_input, k=3)
        plan = []
        if intent == 'recommendation':
            genre_prefs = {k: v for k, v in self.semantic_memory.preferences.items() if 'genre_' in k}
            if genre_prefs:
                best_genre = max(genre_prefs.items(), key=lambda x: x[1])[0]
                plan.append(('recommend', best_genre.replace('genre_', '')))
            else:
                plan.append(('recommend', 'general'))
        elif intent == 'preference_update':
            genres = ['sci-fi', 'fantasy', 'mystery', 'romance', 'thriller']
            detected_genre = next((g for g in genres if g in user_input), None)
            if detected_genre:
                plan.append(('update_preference', detected_genre))
        elif intent == 'task_execution':
            best_action = self.semantic_memory.get_best_action('task')
            if best_action:
                plan.append(('execute', best_action))
            else:
                plan.append(('execute', 'default'))
        self.current_plan = plan
        return plan
We build the agent's perception and planning layers. The agent processes user input, detects intent with simple keyword matching, and consults previously formed memories to formulate a plan. This is where we begin to shape how the agent reasons and decides what to do next.
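As a hypothetical illustration of our own (not part of the tutorial code), here is how the perceive-then-plan step behaves on a fresh agent before any preferences have been learned:

# Hypothetical walk-through of perceive -> plan on a fresh agent
agent = MemoryAgent()

state = agent.perceive("I really like sci-fi books")
print(state['intent'])    # 'preference_update' ('like' matched)
print(agent.plan(state))  # [('update_preference', 'sci-fi')]

state = agent.perceive("Can you recommend something?")
print(agent.plan(state))  # [('recommend', 'general')] since no preferences exist yet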
    # MemoryAgent, continued
    def act(self, action):
        action_type, param = action
        if action_type == 'recommend':
            if param == 'general':
                return "Let me learn your preferences first! What genres do you enjoy?"
            return f"Based on your preferences, I recommend exploring {param}!"
        elif action_type == 'update_preference':
            self.semantic_memory.update_preference(f'genre_{param}', 1.0, weight=1.0)
            return f"Got it! I'll remember you enjoy {param}."
        elif action_type == 'execute':
            return f"Executing task with strategy: {param}"
        return "Action completed"

    def revise_plan(self, feedback):
        # On negative feedback, fall back to the second-ranked genre
        if 'no' in feedback.lower() or 'wrong' in feedback.lower():
            if self.current_plan:
                action_type, param = self.current_plan[0]
                if action_type == 'recommend':
                    genre_prefs = sorted(
                        [(k, v) for k, v in self.semantic_memory.preferences.items() if 'genre_' in k],
                        key=lambda x: x[1],
                        reverse=True
                    )
                    if len(genre_prefs) > 1:
                        new_genre = genre_prefs[1][0].replace('genre_', '')
                        self.current_plan = [('recommend', new_genre)]
                        return True
        return False

    def reflect(self, state, action, outcome, success):
        # Write the raw experience to episodic memory and the
        # generalized pattern to semantic memory
        self.episodic_memory.store(state['raw'], str(action), outcome)
        self.semantic_memory.record_pattern(state['intent'], str(action), success)
We define how the agent performs actions, revises its decisions when feedback contradicts expectations, and reflects by storing each experience. By letting the agent learn from every turn, we make the system adaptive and self-correcting.
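To make the revision path concrete, here is a small hypothetical example of our own showing the act, feedback, and revise loop; it assumes two genre preferences have already been seeded:

# Hypothetical act -> negative feedback -> revise_plan loop
agent = MemoryAgent()
agent.semantic_memory.update_preference('genre_sci-fi', 1.0)   # top preference
agent.semantic_memory.update_preference('genre_fantasy', 0.5)  # runner-up

state = agent.perceive("Can you recommend something?")
plan = agent.plan(state)
print(agent.act(plan[0]))  # recommends sci-fi, the highest-scoring genre

if agent.revise_plan("No, that's wrong"):
    print(agent.act(agent.current_plan[0]))  # falls back to fantasy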
    # MemoryAgent, continued
    def run_session(self, user_inputs):
        self.session_count += 1
        print(f"\n{'='*60}")
        print(f"SESSION {self.session_count}")
        print(f"{'='*60}\n")
        results = []
        for i, user_input in enumerate(user_inputs, 1):
            print(f"Turn {i}")
            print(f"User: {user_input}")
            state = self.perceive(user_input)
            plan = self.plan(state)
            if not plan:
                print("Agent: I'm not sure what to do with that.\n")
                continue
            response = self.act(plan[0])
            print(f"Agent: {response}\n")
            # Simple success heuristic for the demo: recommendations and
            # preference updates count as successful turns
            success = 'recommend' in plan[0][0] or 'update' in plan[0][0]
            self.reflect(state, plan[0], response, success)
            results.append({
                'turn': i,
                'input': user_input,
                'intent': state['intent'],
                'action': plan[0],
                'response': response
            })
        return results
We simulate realistic interactions in which the agent handles multiple user inputs within a single session. We watch the perceive → plan → act → reflect cycle unfold repeatedly, and as the session runs, we see the agent's responses gradually become more personal and intelligent.
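A short hypothetical session of our own shows the loop end to end; the second turn is personalized because the first turn wrote a preference into semantic memory:

# Hypothetical two-turn session exercising the full loop
agent = MemoryAgent()
results = agent.run_session([
    "I really like sci-fi books",
    "Can you recommend something?",
])
print([r['intent'] for r in results])  # ['preference_update', 'recommendation']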
def evaluate_memory_usage(agent):
    print("\n" + "="*60)
    print("MEMORY ANALYSIS")
    print("="*60 + "\n")
    print("Episodic Memory:")
    print(f"  Total episodes stored: {len(agent.episodic_memory.episodes)}")
    if agent.episodic_memory.episodes:
        print(f"  Oldest episode: {agent.episodic_memory.episodes[0]['timestamp']}")
        print(f"  Latest episode: {agent.episodic_memory.episodes[-1]['timestamp']}")
    print("\nSemantic Memory:")
    print(f"  Learned preferences: {len(agent.semantic_memory.preferences)}")
    for pref, value in sorted(agent.semantic_memory.preferences.items(),
                              key=lambda x: x[1], reverse=True)[:5]:
        print(f"    {pref}: {value:.3f}")
    print(f"\n  Action patterns learned: {len(agent.semantic_memory.patterns)}")
    print("\n  Success rates by context-action:")
    for key, stats in list(agent.semantic_memory.success_rates.items())[:5]:
        if stats['total'] > 0:
            rate = stats['success'] / stats['total']
            print(f"    {key}: {rate:.2%} ({stats['success']}/{stats['total']})")


def compare_sessions(results_history):
    print("\n" + "="*60)
    print("CROSS-SESSION ANALYSIS")
    print("="*60 + "\n")
    for i, results in enumerate(results_history, 1):
        # Count responses that explicitly draw on learned preferences
        recommendation_quality = sum(1 for r in results if 'preferences' in r['response'].lower())
        print(f"Session {i}:")
        print(f"  Turns: {len(results)}")
        print(f"  Personalized responses: {recommendation_quality}")
We analyze how effectively the agent uses its memory, examining stored episodes, learned preferences, and success patterns to evaluate how the agent develops across sessions.
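As a hypothetical usage example, running a brief session and then calling the analysis helpers prints the episode counts, top preferences, and per-context success rates described above:

# Hypothetical inspection run for the analysis helpers
agent = MemoryAgent()
results = agent.run_session(["I like mystery novels", "Recommend something"])
evaluate_memory_usage(agent)  # episode counts, preferences, success rates
compare_sessions([results])   # per-session personalization summary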
def run_demo():
    agent = MemoryAgent()
    print("\nSCENARIO: Agent learns user preferences over multiple sessions")
    session1_inputs = [
        "Hi, I'm looking for something to read",
        "I really like sci-fi books",
        "Can you recommend something?",
    ]
    results1 = agent.run_session(session1_inputs)
    session2_inputs = [
        "I'm bored, what should I read?",
        "Actually, I also enjoy fantasy novels",
        "Give me a recommendation",
    ]
    results2 = agent.run_session(session2_inputs)
    session3_inputs = [
        "What do you suggest for tonight?",
        "I'm in the mood for mystery too",
        "Recommend something based on what you know about me",
    ]
    results3 = agent.run_session(session3_inputs)
    evaluate_memory_usage(agent)
    compare_sessions([results1, results2, results3])
    print("\n" + "="*60)
    print("EPISODIC MEMORY RETRIEVAL TEST")
    print("="*60 + "\n")
    query = "recommend sci-fi"
    similar = agent.episodic_memory.retrieve_similar(query, k=3)
    print(f"Query: '{query}'")
    print(f"Retrieved {len(similar)} similar episodes:\n")
    for ep in similar:
        print(f"  State: {ep['state']}")
        print(f"  Action: {ep['action']}")
        print(f"  Outcome: {ep['outcome'][:50]}...")
        print()


if __name__ == "__main__":
    print("="*60)
    print("MEMORY & LONG-TERM AUTONOMY IN AGENTIC SYSTEMS")
    print("="*60)
    run_demo()
    print("\n✅ Tutorial complete! Key takeaways:")
    print("  • Episodic memory stores specific experiences")
    print("  • Semantic memory generalizes patterns")
    print("  • Agents improve recommendations over sessions")
    print("  • Memory retrieval guides future decisions")
We put everything together by running multiple sessions and testing memory retrieval. We observe the agent improving over the course of the interactions, refining its recommendations based on accumulated knowledge. This demonstration shows how the memory systems we built naturally give rise to long-term autonomy.
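One natural extension, suggested by the pickle import at the top of the script but not implemented in the tutorial, is persisting memory to disk so that learning survives restarts. Here is a minimal sketch of our own under that assumption; note that success_rates is backed by a lambda-based defaultdict, which pickle cannot serialize, so we save only the picklable parts:

# Hypothetical persistence helpers (our own addition, not part of the tutorial)
def save_agent_memory(agent, path="agent_memory.pkl"):
    # The lambda inside success_rates is not picklable, so we persist
    # only episodes and preferences here
    with open(path, "wb") as f:
        pickle.dump({
            'episodes': agent.episodic_memory.episodes,
            'preferences': dict(agent.semantic_memory.preferences),
        }, f)

def load_agent_memory(agent, path="agent_memory.pkl"):
    with open(path, "rb") as f:
        saved = pickle.load(f)
    agent.episodic_memory.episodes = saved['episodes']
    agent.semantic_memory.preferences.update(saved['preferences'])
    # Caveat: the stored embeddings are hash-based and process-specific,
    # so similarity search over reloaded episodes is approximate at best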
In summary, we recognize how the combination of episodic and semantic memory allows us to build agents that can continuously learn and make better decisions over time. We watch agents refine suggestions, adjust plans, and retrieve past experience to improve their responses session after session. Through these mechanisms, we see how long-term autonomy arises from simple yet effective memory structures.