# Conversational Memory

As we mentioned in previous recipes, large language models have no internal state, i.e., they do not retain any conversational context from previous messages. A multi-turn conversation works by passing an ever longer prompt to the model, one that includes all previous messages in addition to the most recent one. There are several ways to manage this conversational context, or conversational memory, each with its own strengths and weaknesses. In this recipe, we will explore some of the most common ones.
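To make the growing prompt concrete, here is a minimal sketch in plain Python. The `fake_model` function is a hypothetical stand-in for an LLM (it is not part of LangChain or any real API): it only knows about the messages it is handed and reports how many it received.

```python
def fake_model(messages):
    """Hypothetical stand-in for a stateless LLM: it only knows what it is given."""
    return f"I can see {len(messages)} message(s)."


conversation = []
seen_counts = []

for user_input in ["Ask me a riddle!", "Give me a hint", "Is it a river?"]:
    conversation.append(("human", user_input))
    # The full history is sent on every turn, so the prompt keeps growing
    seen_counts.append(len(conversation))
    reply = fake_model(conversation)
    conversation.append(("ai", reply))

print(seen_counts)

# Without the history, the model has no idea what "it" refers to
print(fake_model([("human", "Is it a river?")]))
```

The number of messages sent grows with every turn, while a call that omits the history leaves the model with nothing but the latest message.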



In [None]:
from dotenv import find_dotenv, load_dotenv

load_dotenv(find_dotenv())

In [95]:
from langchain_dartmouth.llms import ChatDartmouth

llm = ChatDartmouth(model_name="llama-3-1-8b-instruct", temperature=0, seed=42)

## Non-persistent conversational memory 

We can think of the conversational memory as the history of all messages that have been passed to and received from the model so far. In [Prompt Basics](06-prompt-basics.ipynb), we saw that we can pass a list of messages to a chat model. We can use this mechanism to create a simple conversational memory system by appending every message (outgoing and incoming) to a list:


In [None]:
from langchain_core.messages import HumanMessage

first_message = HumanMessage("Ask me a riddle!")
conversation = [first_message]

first_response = llm.invoke(conversation)
conversation.append(first_response)

for message in conversation:
    message.pretty_print()

In [None]:
second_message = HumanMessage("Is it a unicorn?")
conversation.append(second_message)

second_response = llm.invoke(conversation)
conversation.append(second_response)

for message in conversation:
    message.pretty_print()

While this technique works for relatively simple scenarios, it is not very elegant and requires quite a bit of code to maintain the history. It can also become a problem when the LLM is part of a chain and we don't want to pass the conversation history as an input to the chain.

Instead, the LLM component should keep track of the message history internally!

Fortunately, LangChain offers a way to make that happen. We need two things for this:
- a component that keeps track of the message history (replacing the simple list above)
- a way for the LLM to interact with this list whenever a new (input or output) message arrives (replacing the list management code we wrote above)
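Conceptually, these two pieces fit together roughly as in the plain-Python sketch below. This is only an illustration of the idea, not LangChain's implementation: `MemoryWrapper` and `echo_model` are hypothetical names, and a simple list of tuples stands in for the message history.

```python
class MemoryWrapper:
    """Hypothetical illustration of wrapping a model with a message history."""

    def __init__(self, model, get_history):
        self.model = model
        self.get_history = get_history

    def invoke(self, user_input, session_id):
        history = self.get_history(session_id)
        history.append(("human", user_input))
        # The model receives the full history, not just the new message
        reply = self.model(history)
        history.append(("ai", reply))
        return reply


def echo_model(messages):
    """Stand-in for an LLM: reports how many messages it was given."""
    return f"received {len(messages)} message(s)"


store = []
wrapped = MemoryWrapper(echo_model, lambda session_id: store)

print(wrapped.invoke("Ask me a riddle!", session_id="demo"))
print(wrapped.invoke("Is it a unicorn?", session_id="demo"))
```

The caller only ever passes the newest message; the wrapper takes care of reading from and writing to the history.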

To keep track of the message history, we can use a class called `ChatMessageHistory`:

In [98]:
from langchain_community.chat_message_histories import ChatMessageHistory

history = ChatMessageHistory()

This component works very similarly to the simple list we used above, but is explicitly designed to be used with messages. For example, here is how we recreate the history from above:

In [None]:
history.add_message(first_message)
history.add_message(first_response)
history.add_message(second_message)
history.add_message(second_response)

history.messages

To make an LLM use a `ChatMessageHistory` object, we need to "attach" it to the `ChatDartmouth` component by wrapping them with a class called `RunnableWithMessageHistory`. 

This class assumes that we want to manage multiple conversation histories, as we would in a chat application. It therefore expects a function that returns a chat message history object for a given session id. In this example, we only keep track of a single conversation, so a very simple dummy function that returns the same history every time is enough:

In [100]:
history = ChatMessageHistory()


def get_history(session_id):
    return history

```{note}
We have to make sure to instantiate the history outside the function. Otherwise, the message history would not persist between calls to `get_history`!
```
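The difference is easy to see in a small plain-Python sketch. Here, ordinary lists stand in for the history object, and the function names `get_fresh_history` and `get_shared_history` are hypothetical:

```python
def get_fresh_history(session_id):
    # Pitfall: a new, empty history is created on every call
    return []


shared_history = []


def get_shared_history(session_id):
    # The same object is returned on every call, so additions persist
    return shared_history


get_fresh_history("demo").append("Tell me a riddle!")
get_shared_history("demo").append("Tell me a riddle!")

print(len(get_fresh_history("demo")))
print(len(get_shared_history("demo")))
```

The message added to the freshly created history is lost as soon as the function is called again, whereas the shared history still contains it.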

Now we have everything we need to tie it all together:

In [101]:
from langchain_core.runnables.history import RunnableWithMessageHistory


llm_with_memory = RunnableWithMessageHistory(
    runnable=llm,
    get_session_history=get_history,
)

```{hint}

LangChain calls any component that implements its standard interface, which includes the `invoke` and `stream` methods (among others), a _runnable_.
```

When we invoke this runnable, we have to specify the session id that will be passed to `get_history` (even though we don't use it here):

In [None]:
llm_with_memory.invoke(
    {
        "input": "Tell me a riddle!",
    },
    config={"configurable": {"session_id": "whatever"}},
).pretty_print()

In [None]:
llm_with_memory.invoke(
    {"input": "Give me a hint"},
    config={"configurable": {"session_id": "whatever"}},
).pretty_print()

In [None]:
llm_with_memory.invoke(
    {"input": "Is it a river?"},
    config={"configurable": {"session_id": "whatever"}},
).pretty_print()

We can check the message history object to see that it indeed keeps track of all the messages:

In [None]:
for message in history.messages:
    message.pretty_print()

Looks great, doesn't it?

One issue remains, however: Depending on our use case, we might want to persist the message history between runs of the program. Or maybe we want to be able to do something more meaningful with the session id in `get_history`, e.g., manage multiple conversations. In the next section, we will learn about ways to achieve both of those things!
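As far as the session id is concerned, one simple in-memory option is a dictionary that maps each id to its own history. The following is a minimal sketch under that assumption, with plain lists standing in for history objects and a hypothetical function name `get_history_by_session`; nothing here survives a restart of the program.

```python
histories = {}


def get_history_by_session(session_id):
    # Create an empty history the first time a session id is seen
    return histories.setdefault(session_id, [])


get_history_by_session("simons_convo").append("Hi, I am Simon!")
get_history_by_session("alex_convo").append("Hi, I am Alex!")
get_history_by_session("simons_convo").append("What's my name again?")

print(histories)
```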

## Persistent conversational memory

While we could write the message history to disk at the end of every run of our program and read it back in at the start of each run, that would be a bit cumbersome and would require additional boilerplate code. We also might want to consider different options to store the history, like a SQL database.
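For comparison, the manual approach might look roughly like the sketch below. It uses only the standard library and represents each message as a plain dictionary; the helper names `save_history` and `load_history` and the file name are assumptions made for this illustration.

```python
import json
import tempfile
from pathlib import Path


def save_history(messages, path):
    """Write a list of {"role": ..., "content": ...} dicts to a JSON file."""
    Path(path).write_text(json.dumps(messages, indent=2), encoding="utf-8")


def load_history(path):
    """Read the history back, or start fresh if the file does not exist yet."""
    path = Path(path)
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


# A temporary directory keeps the example from leaving files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    history_file = Path(tmp_dir) / "chat_history.json"

    first_run = load_history(history_file)
    first_run.append({"role": "human", "content": "Hi, I am Simon!"})
    save_history(first_run, history_file)

    # A later run of the program would pick up where the first one left off
    second_run = load_history(history_file)

print(second_run)
```

Even this small version needs its own serialization format and file handling, and it does not yet deal with multiple sessions.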

LangChain offers a variety of implementations for the [message history](https://api.python.langchain.com/en/latest/community_api_reference.html#module-langchain_community.chat_message_histories), built on different services. For example, we can store the history in a SQLite database:

In [106]:
from langchain_community.chat_message_histories import SQLChatMessageHistory

DB_NAME = "chat_history.db"


def get_history(session_id):
    return SQLChatMessageHistory(session_id, connection=f"sqlite:///{DB_NAME}")

Now every time we call `get_history`, the chat history will be retrieved from the specified SQLite database, using the session ID as a filter.

We can now manage multiple conversations by specifying a separate ID for each conversation thread:

In [None]:
llm_with_memory = RunnableWithMessageHistory(
    runnable=llm,
    get_session_history=get_history,
)

llm_with_memory.invoke(
    {"input": "Hi, I am Simon!"},
    config={"configurable": {"session_id": "simons_convo"}},
).pretty_print()

In [None]:
llm_with_memory.invoke(
    {"input": "Hi, I am Alex!"},
    config={"configurable": {"session_id": "alex_convo"}},
).pretty_print()

We can inspect the database like any other SQLite database, e.g. with Python's built-in `sqlite3` module:

In [None]:
import sqlite3

con = sqlite3.connect(DB_NAME)
cur = con.cursor()

# By default, the table name is 'message_store'
cur.execute("SELECT * FROM message_store;").fetchall()

As we can see, the conversations are organized by the session ID, so we can continue to have separate conversations by passing the respective session ID:

In [None]:
llm_with_memory.invoke(
    {"input": "What's my name again?"},
    config={"configurable": {"session_id": "simons_convo"}},
).pretty_print()

In [None]:
llm_with_memory.invoke(
    {"input": "What's my name again?"},
    config={"configurable": {"session_id": "alex_convo"}},
).pretty_print()
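To look at a single conversation directly in the database, we could filter the table by session id. The sketch below assumes that the table has the columns `id`, `session_id`, and `message`, with each message stored as JSON text; this layout is an assumption and can be checked against a real database with `PRAGMA table_info(message_store);`. So that the example runs on its own, it builds a small in-memory table with made-up rows instead of opening `chat_history.db`.

```python
import json
import sqlite3

# In-memory stand-in so the sketch runs on its own; with the real file,
# connect to "chat_history.db" and skip the CREATE TABLE / INSERT steps.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE message_store (id INTEGER PRIMARY KEY, session_id TEXT, message TEXT)"
)
rows = [
    ("simons_convo", {"type": "human", "data": {"content": "Hi, I am Simon!"}}),
    ("alex_convo", {"type": "human", "data": {"content": "Hi, I am Alex!"}}),
    ("simons_convo", {"type": "human", "data": {"content": "What's my name again?"}}),
]
con.executemany(
    "INSERT INTO message_store (session_id, message) VALUES (?, ?)",
    [(session_id, json.dumps(message)) for session_id, message in rows],
)

# Parameterized query: only the rows belonging to one session, in insertion order
cur = con.execute(
    "SELECT message FROM message_store WHERE session_id = ? ORDER BY id",
    ("simons_convo",),
)
simons_messages = [json.loads(message)["data"]["content"] for (message,) in cur]
con.close()

print(simons_messages)
```

Passing the session id as a query parameter (the `?` placeholder) avoids building SQL strings by hand.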

Since the database is stored on disk by default, the history is automatically persisted across multiple runs of the program. If you want to use one of the [other implementations](https://api.python.langchain.com/en/latest/community_api_reference.html#module-langchain_community.chat_message_histories) of the chat message history, you only need to change the `get_history` function!

## Summary

LLMs are stateless and thus need to be given the entire conversation to generate the next turn. We can keep track of the conversation manually by maintaining a list of all outgoing and incoming messages. If we want a more elegant solution that can optionally persist the message history using a variety of backends (e.g., a SQL database), we can use one of [LangChain's implementations of the chat message history](https://api.python.langchain.com/en/latest/community_api_reference.html#module-langchain_community.chat_message_histories).