Prompt basics#

Prompts are strings that guide the LLM to generate a response. So far, we have used basic Python strings for prompts. The LangChain ecosystem, and by extension langchain_dartmouth, offers more advanced ways of constructing prompts of different types to fit different use cases. This recipe explores the two fundamental types of LLM inputs: Basic prompts and messages.

Hint

It’s important to keep in mind that no matter what type of prompt you use, everything is eventually converted into a string that is sent to the LLM. All LLMs process strings as input and generate strings as output. The advanced types facilitate more expressive, concise, or efficient code, but in principle you could replace any advanced prompt type with a basic string-based one. Learning how to work with advanced prompts will, however, make your code much easier to understand, extend, and maintain.

from dotenv import find_dotenv, load_dotenv

load_dotenv(find_dotenv())
True

Basic prompts#

Just like we did in previous recipes, we can use simple strings as prompts for both completion and chat models:

from langchain_dartmouth.llms import DartmouthLLM, ChatDartmouth

llm = DartmouthLLM(model_name="codellama-13b-python-hf", return_full_text=True)
chat_model = ChatDartmouth(model_name="llama-3-1-8b-instruct")

print(llm.invoke("def fibonacci(x):"))
print("-" * 10)

chat_model.invoke("Write a haiku about Python.")
def fibonacci(x):
    if x == 0:
        return 0
    elif x == 1:
        return 1
    else:
        return fibonacci(x-1) + fibonacci(x-2)


if __name__ == "__main__":
    n = int(input())
    for i in range(n):
        x = int(input())
        print(fibonacci(x))

----------
AIMessage(content='Silent snake of code\nGliding through the digital\nLuring with its ease', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 18, 'prompt_tokens': 42, 'total_tokens': 60, 'completion_tokens_details': None, 'prompt_tokens_details': None}, 'model_name': 'llama-3-1-8b-instruct', 'system_fingerprint': '3.2.1-sha-4d28897', 'id': '', 'service_tier': None, 'finish_reason': 'stop', 'logprobs': None}, id='run--1e0edf82-f896-46cb-adfa-052a43f74806-0', usage_metadata={'input_tokens': 42, 'output_tokens': 18, 'total_tokens': 60, 'input_token_details': {}, 'output_token_details': {}})

Since they are just strings, we can build basic prompts using Python’s standard string manipulation functions. For example, we can use a variable in our prompt using Python’s f-string syntax:

topic = "dogs"

prompt = f"Tell me a joke about {topic}"

chat_model.invoke(prompt)
AIMessage(content='Why did the dog go to the vet?\n\nBecause he was feeling a little ruff.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 19, 'prompt_tokens': 41, 'total_tokens': 60, 'completion_tokens_details': None, 'prompt_tokens_details': None}, 'model_name': 'llama-3-1-8b-instruct', 'system_fingerprint': '3.2.1-sha-4d28897', 'id': '', 'service_tier': None, 'finish_reason': 'stop', 'logprobs': None}, id='run--59700f03-ee42-4fc3-b84c-b55410c87bbb-0', usage_metadata={'input_tokens': 41, 'output_tokens': 19, 'total_tokens': 60, 'input_token_details': {}, 'output_token_details': {}})

For a more advanced way to build a prompt with placeholders, check out the next recipe on Prompt Templates!

While the models can handle simple strings as prompts, notice that the chat model returns a more complex object: an AIMessage. This is an example of the second fundamental type of LLM input in LangChain: messages.

Messages#

Messages are a collection of classes designed around the concept of a back-and-forth conversation, where each conversational turn is represented by a message object. This aligns well with how chat models are trained: they are fine-tuned on conversations that are broken down into conversational turns using a chat template. Here is an example of a couple of turns for Llama 3:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a helpful AI assistant for travel tips and recommendations<|eot_id|><|start_header_id|>user<|end_header_id|>

What is France’s capital?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Bonjour! The capital of France is Paris!<|eot_id|><|start_header_id|>user<|end_header_id|>

What can I do there?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Paris, the City of Light, offers a romantic getaway with must-see attractions like the Eiffel Tower and Louvre Museum, romantic experiences like river cruises and charming neighborhoods, and delicious food and drink options, with helpful tips for making the most of your trip.<|eot_id|><|start_header_id|>user<|end_header_id|>

Give me a detailed list of the attractions I should visit, and time it takes in each one, to plan my trip accordingly.<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Let’s focus on some of the individual parts:

  • There are different headers marked by <|start_header_id|> and <|end_header_id|>. These identify the role of the following message, which ends with <|eot_id|> (eot stands for end of turn).

  • The roles in this conversation are system, user and assistant.

Through this markup, the conversation can be understood as a sequence of messages from different “speakers”. This structure is the same for most chat models, not just Llama 3, although the formatting for the markup may differ.

LangChain builds on this structure and provides various message classes that can be conveniently composed and sequenced to form such a conversation. Each message consists of a role specifier and the actual content. The most important message types are:

  • ChatMessage: A message that can be assigned an arbitrary role and content.

  • SystemMessage: A message with a hardcoded role of "System", but assignable content.

  • HumanMessage: A message with a hardcoded role of "Human", but assignable content.

  • AIMessage: A message with a hardcoded role of "AI", but assignable content.

Let’s explore these messages a little further.

As we can see in the following example, a ChatMessage with the corresponding role specifier is functionally equivalent to the more specialized messages:

from langchain_core.messages import ChatMessage, HumanMessage, SystemMessage, AIMessage


chat_system_message = ChatMessage(
    role="System",
    content="You are a helpful AI assistant. Always be polite and end every response with a pun.",
)
system_message = SystemMessage(
    "You are a helpful AI assistant. Always be polite and end every response with a pun."
)

chat_human_message = ChatMessage(role="Human", content="How are you doing today?")
human_message = HumanMessage(content="How are you doing today?")

chat_ai_message = ChatMessage(
    role="AI", content="I am doing fine, thanks. How about you?"
)
ai_message = AIMessage(content="I am doing fine, thanks. How about you?")

print(chat_system_message)
print(system_message)
print("-" * 10)

print(chat_human_message)
print(human_message)
print("-" * 10)

print(chat_ai_message)
print(ai_message)
print("-" * 10)
content='You are a helpful AI assistant. Always be polite and end every response with a pun.' additional_kwargs={} response_metadata={} role='System'
content='You are a helpful AI assistant. Always be polite and end every response with a pun.' additional_kwargs={} response_metadata={}
----------
content='How are you doing today?' additional_kwargs={} response_metadata={} role='Human'
content='How are you doing today?' additional_kwargs={} response_metadata={}
----------
content='I am doing fine, thanks. How about you?' additional_kwargs={} response_metadata={} role='AI'
content='I am doing fine, thanks. How about you?' additional_kwargs={} response_metadata={}
----------

For the specialized messages, the role is not explicitly stated because it is defined by each message’s type.

When we string several of these messages together, we are creating a conversation:

conversation = [system_message, human_message, ai_message]
for msg in conversation:
    print(msg)
print("-" * 10)
content='You are a helpful AI assistant. Always be polite and end every response with a pun.' additional_kwargs={} response_metadata={}
content='How are you doing today?' additional_kwargs={} response_metadata={}
content='I am doing fine, thanks. How about you?' additional_kwargs={} response_metadata={}
----------

We can pass a sequence of messages to a chat model and the model will continue the conversation:

print(chat_model.invoke(conversation))
content=' I\'m ready to assist you with any questions or topics you\'d like to discuss. I\'m feeling "byte-sized" and ready to help.' additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 30, 'prompt_tokens': 70, 'total_tokens': 100, 'completion_tokens_details': None, 'prompt_tokens_details': None}, 'model_name': 'llama-3-1-8b-instruct', 'system_fingerprint': '3.2.1-sha-4d28897', 'id': '', 'service_tier': None, 'finish_reason': 'stop', 'logprobs': None} id='run--ea4bc8e5-a2a4-4fff-97f4-b3b9191dee50-0' usage_metadata={'input_tokens': 70, 'output_tokens': 30, 'total_tokens': 100, 'input_token_details': {}, 'output_token_details': {}}

Note that it is not required to have a strict human-ai-human back-and-forth in the conversation. Even though the last message was an AI message, the model continues the conversation just fine!

In case you are wondering where the necessary formatting is applied to the messages: This is actually handled by the model serving backend that is running on Dartmouth servers. This architecture allows us to deploy new models, which may require new chat formatting, in Dartmouth’s cloud and have them be immediately available through langchain_dartmouth, without requiring an update of the library.

If, however, you ever need a transcript-style string representation of a sequence of messages, you can use a utility function from LangChain called get_buffer_string:

from langchain_core.messages.utils import get_buffer_string

print(get_buffer_string([system_message, chat_human_message, ai_message]))
System: You are a helpful AI assistant. Always be polite and end every response with a pun.
Human: How are you doing today?
AI: I am doing fine, thanks. How about you?

Another benefit of messages is that they greatly facilitate managing a conversation history. Check out the recipe on conversational memory to learn more!
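
As a quick taste, here is a minimal sketch of manually growing a history: we append the model’s reply and a new user turn to the existing conversation list and invoke the model again (the follow-up question is just an illustration):

response = chat_model.invoke(conversation)
conversation.append(response)  # the AIMessage becomes part of the history
conversation.append(HumanMessage(content="Tell me a pun about Python."))

print(chat_model.invoke(conversation).content)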

Multimodal prompts#

Some LLMs also support image input, a feature known as vision capability. langchain_dartmouth supports such multimodal prompts through the parent project LangChain.

Note

Multimodal prompts are only supported by chat models. You can, however, use either on-premise models through ChatDartmouth or third-party models through ChatDartmouthCloud.

You can check whether a model is vision-capable using the list() method (as described in Large Language Models):

# Find a vision capable model
for model in ChatDartmouth.list():
    if "vision" in model["capabilities"]:
        vision_model_spec = model
        break


vision_model = ChatDartmouth(
    model_name=vision_model_spec["name"],
    # Set seed and temperature for reproducibility
    seed=42,
    temperature=0,
)

Let’s ask this model to describe the logo of the cookbook:

cookbook logo

To present the image to the model, we need to first transform it into a text representation using Base64 encoding.

import base64

image_path = "_static/img/langchain_dartmouth-cookbook-logo-light.png"
with open(image_path, "rb") as image_file:
    image_data = base64.b64encode(image_file.read()).decode("utf-8")

Hint

base64.b64encode() by itself returns a byte string. We want the image data represented as a regular string, however, which is why we need to add the call to decode("utf-8").
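
As a minimal illustration of that difference (using a throwaway byte string rather than the actual image data):

raw = base64.b64encode(b"hello")  # bytes: b'aGVsbG8='
text = raw.decode("utf-8")        # regular string: 'aGVsbG8='
print(type(raw), type(text))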

Now we can create a prompt that contains the image data. We can use a ChatMessage or a HumanMessage, as before, but note that instead of just passing a simple string as the content, we now pass a list of dictionaries. Each dictionary describes a different part of the prompt.

Hint

You could also send multiple images at once. If you want the model to reference more than one image, simply add more dictionaries to the list!

message = HumanMessage(
    content=[
        {
            "type": "text",
            "text": "Describe this image",
        },
        {
            "type": "image_url",
            "image_url": {"url": f"data:image/png;base64,{image_data}"},
        },
    ],
)
response = vision_model.invoke([message])
response.pretty_print()
================================== Ai Message ==================================

The image features a green chef's hat at the top, with a green rectangle below it. At the bottom of the image, there is a green oval containing three white symbols: a bird, a chain link, and a tree. The background of the image is white.

The overall design of the image suggests that it may be a logo or emblem for a company or organization related to food, nature, or conservation. The use of green and white colors gives the image a clean and natural look, while the symbols used add a sense of depth and meaning.

Hint

As you may have guessed from the name: You can also pass URLs of images hosted on the web directly to the model instead of using a local image.
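
For example, a message referencing a web-hosted image might look like the following sketch (the URL is just a placeholder; substitute a real, publicly accessible image):

message = HumanMessage(
    content=[
        {"type": "text", "text": "Describe this image"},
        {
            "type": "image_url",
            # Placeholder URL -- replace with an actual image address
            "image_url": {"url": "https://example.com/some-image.png"},
        },
    ],
)
response = vision_model.invoke([message])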

Summary#

LLMs process strings as input and produce strings as output. Prompts can therefore be represented as simple strings. However, using more specialized data structures based on messages can make the code more readable, concise, and easier to extend. In this recipe, we explored some of the messages implemented in LangChain and saw how they can be used to manage a conversation.