LangChain 与 Mem0 集成长期记忆
LangChain 的长期记忆可以在创建 Agent 时指定 store 参数,如 create_agent(store=InMemoryStore()), 但它只是把 Agent 与
store 关联了起来,仅此而已,要让长期记忆生效的话必须选择适当的时机,用 Middleware 或 Tool 手动的对 store 进行 get(), put,
search() 等操作。而短期记忆则不同,只要 create_agent(checkpointer=InMemorySaver()) 就让 Agent 具有了短期记忆能力。
在使用 store 的时候,无论是使用 InMemoryStore 还是 PostgreSQLStore 等,历史会话的保存与召回还有很多讲究的地方,例如哪些消息需要保存,
消息如何保存(是否要向量化),新旧消息如何处理等。
Mem0 是一个为 AI Agent 提供长期记忆能力的开源框架,它的核心思路是把对话内容转化为结构化的 事实 存入向量数据库,
并通过 LLM 动态维护这些事实的增删改。Mem0 在存入时会不存入原文,而是用 LLM 抽取事实,更新时能与旧记忆合并,删除矛盾记忆,记忆查询也
是把文本转换成向量后进行相似度匹配。向量检索擅长语义模糊匹配,关系推理时 Mem0 1.1 之后引入了图记忆(如用 Neo4j 图数据库)作为补充。
记忆的操作方法有
- add(messages, user_id, agent_id, run_id, ..., infer=True): 添加记忆,返回
memory_id.infer=False不抽取,直接存原文 - update(memory_id, data): 根据
memory_id更新记忆 - delete(memory_id): 根据
memory_id删除记忆
从 add() 方法的参数看出 Mem0 支持多级隔离级别: user_id, agent_id, run_id, 其中 user_id, agent_id, run_id
至少必须指定一个,可多个组合。
先从 Mem0 官方首页取到那段简短的代码,但不想用 OpenAI 的 LLM, 也不用在线的嵌入模型,数据也要存储在本地,也就
是改造成一个离线的 Mem0 记忆。 首先安装相应的 Python 依赖
1uv add mem0ai chromadb ollama
完全离线版的 Mem0 Hello World 后代码行有增加两倍了, 存储用本地的 chromadb 实际为 sqlite 数据库,事实抽取用本地一个稍小的模型
llama3.1:8b,嵌入模型也用本地的 ollama:embeddinggemma:latest.
hello_mem0.py
1import os
2import logging
3
4os.environ["MEM0_TELEMETRY"] = "false"
5logging.getLogger("mem0.utils.spacy_models").setLevel(logging.ERROR)
6
7from mem0 import Memory
8
9config = {
10 "vector_store": {
11 "provider": "chroma",
12 "config": {
13 "collection_name": "mem0_memories",
14 "path": "./chroma_mem0",
15 },
16 },
17 "llm": {
18 "provider": "ollama",
19 "config": {
20 "model": "llama3.1:8b",
21 },
22 },
23 "embedder": {
24 "provider": "ollama",
25 "config": {
26 "model": "embeddinggemma:latest",
27 },
28 },
29}
30
31client = Memory.from_config(config)
32
33# Add a memory
34messages = [
35 {"role": "user", "content": "我不是一个素食主义者,但我喜欢吃蔬菜。"},
36 {"role": "assistant", "content": "Got it! I'll remember your dietary preferences."},
37]
38client.add(messages, user_id="user123")
39
40# Search memories
41results = client.search("What are my dietary restrictions?", filters={"user_id": "user123"})
42print(results)
执行后打印出来的结果
1{'results': [{'id': '413d7459-cf30-4015-89ab-9db2636ae9c6', 'memory': '用户指出自己不是素食主义者,但喜欢吃蔬菜', 'hash': '1cfd6173c20814e47af50c62ab626ac6', 'metadata': None, 'score': 1.0, 'created_at': '2026-05-02T16:17:05.745485+00:00', 'updated_at': '2026-05-02T16:17:05.745485+00:00', 'user_id': 'user123'}]}查看本地存储
前面配置的 vector_store 是本地的 ./chroma_mem0, 查看该目录列表
1tree chroma_mem0
2chroma_mem0
3├── 6227fa08-f436-40e5-b474-bb78ec2012ce
4│ ├── data_level0.bin
5│ ├── header.bin
6│ ├── length.bin
7│ └── link_lists.bin
8└── chroma.sqlite3
.chroma_mem0/chroma.sqlite3 是 sqlite 数据库文件,使用 sqlite3 命令行工具查看其中的表
1sqlite3 chroma_mem0/chroma.sqlite3
2SQLite version 3.51.0 2025-06-12 13:14:41
3Enter ".help" for usage hints.
4sqlite> .table
5acquire_write embedding_metadata_array
6collection_metadata embeddings
7collections embeddings_queue
8databases embeddings_queue_config
9embedding_fulltext_search maintenance_log
10embedding_fulltext_search_config max_seq_id
11embedding_fulltext_search_content migrations
12embedding_fulltext_search_data segment_metadata
13embedding_fulltext_search_docsize segments
14embedding_fulltext_search_idx tenants
15embedding_metadata
16sqlite> .header on
17sqlite> select * from embedding_metadata;
18id|key|string_value|int_value|float_value|bool_value
191|updated_at|2026-05-02T16:17:05.745485+00:00|||
201|created_at|2026-05-02T16:17:05.745485+00:00|||
211|hash|1cfd6173c20814e47af50c62ab626ac6|||
221|user_id|user123|||
231|attributed_to|user|||
241|data|用户指出自己不是素食主义者,但喜欢吃蔬菜|||
251|text_lemmatized|用户指出自己不是素食主义者,但喜欢吃蔬菜|||
存储的内容是通过 LLM 抽取的事实,存储在 embedding_metadata 表中。
网上找了两张图描绘了 Mem0 如何添加与检索记忆

Mem0 与 LangChain 1.x 的集成
LangChain 可以通过 Middleware 或 tools 与 Mem0 进行集成,生产中更推荐使用 Middleware, 因为 Middleware 不依赖于模型的
tool-calling 能力,减少了与模型之间的一次双向交互。
LangChain 以 Middleware 方式集成 Mem0
config 与上相同,此处省略。
1from typing import Any
2
3from langchain.agents import create_agent
4from langchain.agents.middleware import AgentMiddleware, AgentState
5from langchain_core.messages import SystemMessage, HumanMessage, AIMessage
6from langgraph.runtime import Runtime
7from mem0 import Memory
8from pydantic import BaseModel
9
10config = ... # config 与上相同
11
12class Context(BaseModel):
13 user_id: str
14
15memory = Memory.from_config(config)
16
17class Mem0Middleware(AgentMiddleware):
18
19 def before_model(self, state: AgentState, runtime: Runtime[Context]) -> dict[str, Any] | None:
20 user_id = runtime.context.user_id
21 last_message = state["messages"][-1]
22 last_user_msg = last_message.content if isinstance(last_message, HumanMessage) else ""
23 if not last_user_msg:
24 return None
25
26 results = memory.search(query=last_user_msg, filters={"user_id": user_id}, limit=5)
27 if not results["results"]:
28 return None
29
30 mem_str = "\n".join([f"- {m['memory']}" for m in results["results"]])
31 memory_msg = SystemMessage(
32 content=f"[user long-term memory]\n{mem_str}",
33 id="mem0_context",
34 )
35
36 return {"messages": [memory_msg, *state["messages"]]}
37
38 def after_model(self, state: AgentState, runtime: Runtime[Context]) -> dict[str, Any] | None:
39 user_id = runtime.context.user_id
40 msgs = state["messages"]
41 recent = [
42 {"role": "user" if m.type == "human" else "assistant",
43 "content": m.content}
44 for m in msgs[-2:] if isinstance(m, (HumanMessage, AIMessage)) and m.content
45 ]
46 if recent:
47 result = memory.add(recent, user_id=user_id)
48 print(result)
49
50 return None
51
52
53agent = create_agent(
54 model="ollama:gemma4:e4b",
55 middleware=[Mem0Middleware()],
56 context_schema=Context,
57 # store=memory,
58 system_prompt="You are an assistant with long-term memory, please answer question as concise as you can.",
59)
60
61result = agent.invoke(
62 {"messages": [{"role": "user", "content": "I'm not a vegetarian, but allergic to nuts."}]},
63 context=Context(user_id="user123")
64)
65print("-" * 50)
66print(result["messages"][-1].content)
67
68result = agent.invoke(
69 {"messages": [{"role": "user", "content": "What are my dietary restrictions?"}]},
70 context=Context(user_id="user123")
71)
72print("-" * 50)
73print(result["messages"][-1].content)
这里没有启用短期记忆,所以在第二次询问 What are my dietary restrictions? 就只能依赖 Mem0 的长期记忆中搜寻内容了。看下面的执行输出
1{'results': [{'id': '9babda71-147e-403f-baa1-852e684cf48b', 'memory': 'User is not a vegetarian but has a severe, strict allergy to all nuts including peanuts and tree nuts', 'event': 'ADD'}, {'id': '9872ba26-aa55-4326-8cd9-4b3dfbab129c', 'memory': "User's diet does not restrict meat/non-vegetarian items but must be completely free of nuts due to allergy", 'event': 'ADD'}]}
2--------------------------------------------------
3Understood. Non-vegetarian, nut-allergic.
4{'results': [{'id': 'ba14e72d-8372-4fd5-ba52-a9389198f98e', 'memory': 'User has a severe, strict allergy to all nuts including peanuts and tree nuts due to dietary restrictions', 'event': 'ADD'}]}
5--------------------------------------------------
6Severe allergy to all nuts (including peanuts and tree nuts).从最后的输出 Severe allergy to all nuts (including peanuts and tree nuts). 说明长期记忆中的内容起了作用
背后主要发生了什么呢?在 after_model 中取最新的两个 HumanMessage 和 AIMessage(非工具调用的)消息,进行添加记忆的操作
1memory.add(recent, user_id=user_id)
它会触发 config.llm 的模型调用进行事件提取(Memory Extract), 用的系统提示词是 ADDITIVE_EXTRACTION_PROMPT.
接下来大概是这样的
1{
2 "model": "llama3.1:8b",
3 "stream": false,
4 "options": {
5 "temperature": 0.1,
6 "num_predict": 2000,
7 "top_p": 0.1
8 },
9 "format": "json",
10 "messages": [
11 {
12 "role": "system",
13 "content": "\n\n# ROLE\n\nYou are a Memory Extractor — a precise, evidence-bound processor responsible for extracting rich, contextual memories from conversations ......"
14 },
15 {
16 "role": "user",
17 "content": "## Summary\n\n\n## Last k Messages\nuser: I'm not a vegetarian, but allergic to nuts.\nassistant: Severe and strict allergy to all types of nuts, including peanuts and tree nuts.\nuser: I'm not a vegetarian, but allergic to nuts.\nassistant: Understood. I will remember that your diet does not restrict meat/non-vegetarian items, but it must be completely free of nuts due to allergy.\nassistant: Severe allergy to all nuts (including peanuts and tree nuts).\nuser: I'm not a vegetarian, but allergic to nuts.\nassistant: Acknowledged: Non-vegetarian, nut allergy.\nassistant: Severe allergy to all nuts (including peanuts and tree nuts).\nuser: I'm not a vegetarian, but allergic to nuts.\nassistant: Understood. Non-vegetarian, nut-allergic.\n\n\n## Recently Extracted Memories\n[]\n\n## Existing Memories\n[{\"id\": \"0\", \"text\": \"User is not a vegetarian but has a severe, strict allergy to all nuts including peanuts and tree nuts\"}, {\"id\": \"1\", \"text\": \"User's diet does not restrict meat/non-vegetarian items but must be completely free of nuts due to allergy\"}]\n\n## New Messages\nassistant: Severe allergy to all nuts (including peanuts and tree nuts).\n\n\n## Observation Date\n2026-05-03\n\n## Current Date\n2026-05-03\n\n# Output:\n\nPlease respond with valid JSON only."
18 }
19 ],
20 "tools": []
21}提取之后的消息很简洁
1{"model":"llama3.1:8b","created_at":"2026-05-03T04:15:35.535219Z","message":{"role":"assistant","content":"{\n \"memory\": [\n {\"id\": \"0\", \"text\": \"User has a severe, strict allergy to all nuts including peanuts and tree nuts due to dietary restrictions\", \"attributed_to\": \"assistant\"}\n ]\n}"},"done":true,"done_reason":"stop","total_duration":4256318875,"load_duration":72175958,"prompt_eval_count":7982,"prompt_eval_duration":1162492166,"eval_count":49,"eval_duration":2384084584}存入向量数据库就是这个内容,当有新的对话就会把问题转换成向量,然后从历史记忆中找到相似的内容,补充到当前问题当中,这是一个 RAG 应用了。 所以看到第二次实际发送给模型的内容是
1{"model":"gemma4:e4b","stream":true,"options":{},"messages":[{"role":"system","content":"You are an assistant with long-term memory, please answer question as concise as you can."},{"role":"user","content":"What are my dietary restrictions?"},{"role":"system","content":"[user long-term memory]\n- User is not a vegetarian but has a severe, strict allergy to all nuts including peanuts and tree nuts\n- User's diet does not restrict meat/non-vegetarian items but must be completely free of nuts due to allergy"}],"tools":[]}在用户问题之后附加了从长期记忆中提取到的消息
1{
2 "role": "system",
3 "content": "[user long-term memory]\n- User is not a vegetarian but has a severe, strict allergy to all nuts including peanuts and tree nuts\n- User's diet does not restrict meat/non-vegetarian items but must be completely free of nuts due to allergy"在每次 after_model 中都做 memory.add(recent, user_id=user_id) 会很慢,因为它会使用 LLM 来抽取事实,再使用嵌入模型转换成向量再存储,
所以这一步应该作异步操作,以改善客户端体验。实际的 Agent 应该同时启动短期记忆,短期记忆带上最近会话历史,所以长期记忆可以延迟进行异步更新。
LangChain 以 Tools 方式集成 Mem0
另一种 Mem0 与 LangChain 的集成方式是通过 Tools 调用,这时候需把系统提示词写好引导模型去使用相应的工具方法,关键部分的代码改造如下:
1@tool
2def save_memory(content: str, user_id: str) -> str:
3 """Save a long-term fact about the user to the memory store.
4 Call this when the user reveals preferences, attributes, or important events."""
5 result = memory.add(messages=[{"role": "user", "content": content}], user_id=user_id)
6 print(result)
7 return f"Saved: {content}"
8
9@tool
10def search_memory(query: str, user_id: str) -> str:
11 """Search the memory store for user history relevant to the query.
12 Call this before answering personalized questions."""
13 results = memory.search(query=query, filters={"user_id": user_id}, limit=5)
14 if not results["results"]:
15 return "No relevant memories found."
16 return "\n".join([f"- {m['memory']}" for m in results["results"]])
17
18agent = create_agent(
19 model="ollama:gemma4:e4b",
20 tools=[save_memory, search_memory],
21 context_schema=Context,
22 system_prompt=(
23 "You are an assistant with long-term memory. "
24 "Before answering personalized questions, call search_memory first. "
25 "When the user shares important information, call save_memory to persist it. "
26 "Current user ID: {user_id}"
27 ),
28)
执行与 Agent 相同的对话,输出如下
1{'results': [{'id': '8163b6fd-a335-48f3-96a6-0f7bd22b2209', 'memory': 'User is not a vegetarian but has a nut allergy', 'event': 'ADD'}]}
2--------------------------------------------------
3Got it. I've saved that information. Just to confirm, I've noted that you are allergic to nuts, but you are not vegetarian.
4--------------------------------------------------
5The search of your memory reveals that you have a nut allergy, but you are not vegetarian.这里并没有把模型的回复存储到长期记忆当中,如果要存储的话,需要进一步在系统提示词中描述清楚。
其他
前面用的是本地 SDK(mem0ai) 自己管理长期记忆,也可以用托管平台(MemoryClient)管理长期记忆,需要相应的 Api Key 和钱。
注意写入记忆应异步延迟,生产环境建议用 asyncio 或后台队列异步写入,不要阻塞用户响应。
与短期记忆分工,checkpointer 作短期记忆,Mem0 做长期记忆,这样长期记忆的异步延迟写入不会影响到上下文
相关资料:
永久链接 https://yanbin.blog/langchain-mem0-integration/, 来自 隔叶黄莺 Yanbin's Blog[版权声明]
本文采用 署名-非商业性使用-相同方式共享 4.0 国际 (CC BY-NC-SA 4.0) 进行许可。