Large Language Models

A number of Large Language Models (LLMs) are available in langchain_dartmouth.

LLMs in this library generally come in two flavors:

  • Baseline completion models:

    • These models are trained to simply continue the given prompt by adding the next token.

  • Instruction-tuned chat models:

    • These models are built on baseline completion models, but further trained using a specific prompt format to allow a conversational back-and-forth. Dartmouth offers limited access to various third-party commercial chat models, e.g., OpenAI’s GPT-4o or Anthropic’s Claude. Daily token limits per user apply.

Each of these model types is supported by langchain_dartmouth through a separate component.

You can find all available models using the list() method of the respective class, as we will see below.

Let’s explore these components! But before we get started, we need to load our Dartmouth API key and Dartmouth Chat API key from the .env file:

from dotenv import find_dotenv, load_dotenv

load_dotenv(find_dotenv())
True
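If a key is missing from the environment, later calls to the models will fail with authentication errors. A quick sanity check could look like the sketch below. It rests on two assumptions: DARTMOUTH_CHAT_API_KEY is the variable name mentioned later in this recipe, while DARTMOUTH_API_KEY is assumed here to be the matching name for the Dartmouth API key. The helper missing_keys is hypothetical and not part of langchain_dartmouth.

```python
import os
from typing import Mapping

# Assumed variable names: DARTMOUTH_CHAT_API_KEY appears in this recipe,
# DARTMOUTH_API_KEY is an assumption for the other key.
REQUIRED_KEYS = ("DARTMOUTH_API_KEY", "DARTMOUTH_CHAT_API_KEY")


def missing_keys(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required keys that are unset or empty in env."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]
```

Calling missing_keys() after load_dotenv() returns an empty list if both variables are set, and the names of the missing ones otherwise.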

Baseline Completion Models

Baseline completion models are trained to simply continue the given prompt by adding the next token. The continued prompt is then considered the next input to the model, which extends it by another token. This continues until a specified maximum number of tokens have been added, or until a special token called a stop token is generated.
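The generation loop described above can be sketched in a few lines of plain Python. This is only a toy illustration: the lookup table NEXT_TOKEN stands in for a real model, and the names STOP_TOKEN, toy_next_token, and generate are made up for this example.

```python
from typing import Callable

STOP_TOKEN = "<stop>"  # hypothetical stop token for this toy example

# A toy "model": it predicts the next token from the last token only.
NEXT_TOKEN = {"Hello": ",", ",": "world", "world": "!", "!": STOP_TOKEN}


def toy_next_token(tokens: list[str]) -> str:
    """Return the next token for a non-empty token sequence."""
    return NEXT_TOKEN.get(tokens[-1], STOP_TOKEN)


def generate(
    prompt_tokens: list[str],
    next_token: Callable[[list[str]], str],
    max_new_tokens: int,
) -> list[str]:
    """Extend the prompt token by token until a limit or stop token is hit."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        token = next_token(tokens)  # the extended prompt is the next input
        if token == STOP_TOKEN:
            break
        tokens.append(token)
    return tokens


print(generate(["Hello"], toy_next_token, max_new_tokens=10))
# ['Hello', ',', 'world', '!']
print(generate(["Hello"], toy_next_token, max_new_tokens=2))
# ['Hello', ',', 'world']
```

The first call ends because the stop token is produced, the second because the token limit is reached.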

A popular use case for completion models is code generation. Let’s try an example and have the LLM generate a function based on its signature!

All baseline completion models are available through the component DartmouthLLM in the submodule langchain_dartmouth.llms, so we first need to import that class:

from langchain_dartmouth.llms import DartmouthLLM

We can find out which models are available by using the static method list():

Note

A static method is a function that is defined on the class itself, not on an instance of the class. It’s essentially just a regular function, but tied to a class for grouping purposes. In practice, that means that we can call a static method without instantiating an object of the class first. That is why there are no parentheses after the class name in the next code block!

DartmouthLLM.list()
[{'name': 'llama-3-8b-instruct',
  'provider': 'meta',
  'display_name': 'Llama 3 8B Instruct',
  'tokenizer': 'meta-llama/Meta-Llama-3-8B-Instruct',
  'type': 'llm',
  'capabilities': ['chat'],
  'server': 'text-generation-inference',
  'parameters': {'max_input_tokens': 8192}},
 {'name': 'llama-3-1-8b-instruct',
  'provider': 'meta',
  'display_name': 'Llama 3.1 8B Instruct',
  'tokenizer': 'meta-llama/Llama-3.1-8B-Instruct',
  'type': 'llm',
  'capabilities': ['chat'],
  'server': 'text-generation-inference',
  'parameters': {'max_input_tokens': 8192}},
 {'name': 'llama-3-2-11b-vision-instruct',
  'provider': 'meta',
  'display_name': 'Llama 3.2 11B Vision Instruct',
  'tokenizer': 'meta-llama/Llama-3.2-11B-Vision-Instruct',
  'type': 'llm',
  'capabilities': ['chat', 'vision'],
  'server': 'text-generation-inference',
  'parameters': {'max_input_tokens': 127999}},
 {'name': 'codellama-13b-instruct-hf',
  'provider': 'meta',
  'display_name': 'CodeLlama 13B Instruct HF',
  'tokenizer': 'meta-llama/CodeLlama-13b-Instruct-hf',
  'type': 'llm',
  'capabilities': ['chat'],
  'server': 'text-generation-inference',
  'parameters': {'max_input_tokens': 6144}},
 {'name': 'codellama-13b-python-hf',
  'provider': 'meta',
  'display_name': 'CodeLlama 13B Python HF',
  'tokenizer': 'meta-llama/CodeLlama-13b-Python-hf',
  'type': 'llm',
  'capabilities': [],
  'server': 'text-generation-inference',
  'parameters': {'max_input_tokens': 2048}}]
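Since the listing is a plain list of dictionaries, it can be filtered with ordinary Python. The sketch below works on an abbreviated copy of the output above, keeping only the fields it uses. The helper names are hypothetical, and reading a missing 'chat' capability as "baseline completion model" is an assumption based on this particular listing.

```python
# Abbreviated copy of the listing above (only the fields used here).
models = [
    {"name": "llama-3-8b-instruct", "capabilities": ["chat"],
     "parameters": {"max_input_tokens": 8192}},
    {"name": "llama-3-1-8b-instruct", "capabilities": ["chat"],
     "parameters": {"max_input_tokens": 8192}},
    {"name": "llama-3-2-11b-vision-instruct", "capabilities": ["chat", "vision"],
     "parameters": {"max_input_tokens": 127999}},
    {"name": "codellama-13b-instruct-hf", "capabilities": ["chat"],
     "parameters": {"max_input_tokens": 6144}},
    {"name": "codellama-13b-python-hf", "capabilities": [],
     "parameters": {"max_input_tokens": 2048}},
]


def names_without_capability(listing: list[dict], capability: str) -> list[str]:
    """Names of all models that do not advertise the given capability."""
    return [m["name"] for m in listing if capability not in m["capabilities"]]


def names_with_min_context(listing: list[dict], min_tokens: int) -> list[str]:
    """Names of all models that accept at least min_tokens input tokens."""
    return [
        m["name"]
        for m in listing
        if m["parameters"]["max_input_tokens"] >= min_tokens
    ]


print(names_without_capability(models, "chat"))
# ['codellama-13b-python-hf']
print(names_with_min_context(models, 8192))
# ['llama-3-8b-instruct', 'llama-3-1-8b-instruct', 'llama-3-2-11b-vision-instruct']
```

In practice, the result of DartmouthLLM.list() could be passed in place of the hard-coded list, provided the entries have the same keys as shown above.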

We can now instantiate a specific LLM by specifying its name as it appears in the listing. Since the model generates only the continuation of our prompt, it usually makes sense to include the prompt itself in the response, which we can request by setting the parameter return_full_text to True:

llm = DartmouthLLM(model_name="codellama-13b-python-hf", return_full_text=True)

We can now send a prompt to the model and receive its response by using the invoke() method:

response = llm.invoke("def remove_digits(s: str) -> str:")
print(response)
def remove_digits(s: str) -> str:
    result = []
    for c in s:
        if not c.isdigit():
            result.append(c)
    return "".join(result)

Since they are only trained to continue the given prompt, completion models are not great at responding to chat-like prompts:

response = llm.invoke("How can I define a class in Python?")
print(response)
How can I define a class in Python?

To define a class, you must use the keyword class. Classes in Python are not called as functions, but are declared as a class.

A Python Class is a user-defined datatype, which is a blueprint for creating objects.

When you create an object, Python allocates memory for storing the object’s attributes (data).

You can define a class using the keyword class. The class name is followed by the parentheses, which contain the base classes of the class. If no base classes are present, you can omit the parentheses.

A class is made up of class attributes and class methods. The attributes are the data and methods are the functions. You can define attributes and methods within the class definition.

Example

# Create a class called MyClass.

class MyClass:
    x = 5

# Create an object called p1.

p1 = MyClass()

# Output: 5

print(p1.x)

# Create another object called p2.

p2 = MyClass()

# Change the value of x for p2.

p2.x = 10

# Output: 5

print(p1.x)

# Output: 10

print(p2.x)

# Class Attributes

# Class attributes are the attributes of a class and are shared by all the objects of the class.

# Class attributes are created by assigning values to the attributes outside the class.

# Example

# Create a class called MyClass.

class MyClass:
    x = 5

# Output: 5

print(MyClass.x)

# Create an object called p1.

p1 = MyClass()

# Output: 5

print(p1.x)

# Change the value of x for p1.

p1.x = 10

# Output: 10

print(p1.x)

# Output: 5

print(MyClass.x)

# Example

# Create a class called MyClass.

class MyClass:
    x = 5

# Create an object called p1.

p1 = MyClass()

# Output: 5

print(p1.x

As we can see, the model just continues the prompt in a way that is similar to what it has seen during its training. If we want to use it in a conversational way, we need to use an instruction-tuned chat model.

Instruction-Tuned Chat Models

Instruction-tuned chat models are trained to follow the instructions given in the prompt. These models can be used in conversational scenarios, where the user asks questions and the model replies with answers. The model does not just continue the prompt but also takes into account the context of the conversation preceding it. To achieve this, baseline completion models are fine-tuned (i.e., further trained) on conversational text that is formatted following a particular template. That is why we often see multiple variants of an LLM: the base model and the instruct version (see, e.g., CodeLlama).

Let’s see what happens if we ask an instruction-tuned model our question from the previous section:

llm = DartmouthLLM(model_name="codellama-13b-instruct-hf")
response = llm.invoke("How can I define a class in Python?")
print(response)
I can't remember how I define a class in Python. I want to use some classes in Python, and I can't remember the syntax.

Comment: [Here is the Python docs on classes](https://docs.python.org/3/tutorial/classes.html)

Answer: \begin{code}
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
    def speak(self):
        print(self.name + ' says "Hello"!')

# create an instance of the Person class
p = Person("Bob", 32)
p.speak()

# access the name attribute of p
print(p.name)

# access the age attribute of p
print(p.age)
\end{code}

You can also make classes private:

\begin{code}
class _Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
    def speak(self):
        print(self.name + ' says "Hello"!')
\end{code}

Comment: And if you want to make classes private (or protected, for the curious), prefix the name with an underscore.

Comment: Does this syntax work in Python 3.x?

Comment: @AkseliPalén Yes, it does. I'm running Python 3.7.3.

Comment: @AkseliPalén `print` is a function in python3.x. The parentheses are necessary.

Comment: @AkseliPalén No problem. I'm glad to help.

Comment: @AkseliPalén You can accept the answer by clicking on the grey checkmark next to it.

Comment: @AkseliPalén No problem. Glad to help.

Answer: You can also use metaclasses to create classes.

\begin{code}
class Meta(type):
    def __new__(cls, name, bases, attrs):
        attrs['x'] = 4
        return type.__new__(cls, name, bases, attrs)

class A(metaclass=Meta):
    pass

a = A()

Well, that does not seem very helpful… What went wrong here?

The problem is that the prompt we use during inference (when we invoke the model) needs to follow the same format that was used during instruction tuning. This format is not the same for every model! Let’s try our prompt again using CodeLlama’s instruction format:

response = llm.invoke("<s>[INST] How can I define a class in Python? [/INST] ")

print(response)
 In Python, you can define a class by using the `class` keyword followed by the name of the class, and then defining its methods and attributes between curly braces.

Here is an example of a simple class:
```
class Dog:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def bark(self):
        print("Woof!")
```
This class defines a `Dog` class with two attributes: `name` and `age`. It also defines a method called `bark` that prints "Woof!" when called.

You can create an instance of the `Dog` class by using the `()` syntax:
```
my_dog = Dog("Buddy", 3)
```
This creates an instance of the `Dog` class with the name "Buddy" and the age 3. You can then call the `bark` method on this instance:
```
my_dog.bark()
```
This will print "Woof!" to the console.

You can also define methods that take arguments:
```
class Dog:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def bark(self, times):
        print("Woof!")
        for i in range(times):
            print("Woof!")
```
This class defines a `Dog` class with a `bark` method that takes an argument `times` and prints "Woof!" that many times. You can call this method with a specific argument:
```
my_dog = Dog("Buddy", 3)
my_dog.bark(3)
```
This will print "Woof! Woof! Woof!".

You can also define methods that return values:
```
class Dog:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def bark(self):
        return "Woof!"
```
This class defines a `Dog` class with a `bark` method that returns the string "Woof!". You can call this method and assign

That looks a lot better!

Note

You may notice that the last sentence gets cut off. This is due to the default value for the maximum number of generated tokens, which may be too low. You can set a higher limit when you instantiate the DartmouthLLM object. Check the API reference for more information.
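The manual formatting used in the CodeLlama call above could be wrapped in a small helper, sketched below. The function name format_codellama_instruct is hypothetical, the template is copied from the prompt string used above, and the sketch only covers a single user message without a system prompt or conversation history.

```python
def format_codellama_instruct(message: str) -> str:
    """Wrap a single user message in the instruction format used above."""
    return f"<s>[INST] {message.strip()} [/INST] "


prompt = format_codellama_instruct("How can I define a class in Python?")
print(repr(prompt))
# '<s>[INST] How can I define a class in Python? [/INST] '
```

Passing this string to llm.invoke() sends the same prompt as the hand-written version above. A helper like this would have to be rewritten for every model family, since each one may use a different template.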

Managing the prompt format can quickly get tedious, especially if you want to switch between different models. Fortunately, the ChatDartmouth component handles the prompt formatting under the hood, so we can just pass the actual message when we invoke it:

from langchain_dartmouth.llms import ChatDartmouth

llm = ChatDartmouth(model_name="meta.llama-3.2-11b-vision-instruct")
response = llm.invoke("How can I define a class in Python?")

print(response.content)
**Defining a Class in Python**
================================

In Python, you can define a class using the `class` keyword followed by the name of the class. The basic syntax is as follows:

```python
class ClassName:
    # class body
```

Here is a simple example of a class definition:

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def greet(self):
        print(f"Hello, my name is {self.name} and I am {self.age} years old.")
```

**Class Body**
---------------

The class body contains the methods and variables that belong to the class. In the example above, we have two methods: `__init__` and `greet`. The `__init__` method is a special method that is called when an object is created from the class, and it is used to initialize the object's attributes. The `greet` method is a regular method that prints a greeting message.

**Attributes and Methods**
---------------------------

Attributes are variables that belong to the class, and methods are functions that belong to the class. In the example above, `name` and `age` are attributes, and `greet` is a method.

**Example Use Case**
---------------------

Here's an example of how you can use the `Person` class:

```python
# Create a new object from the Person class
person = Person("John", 30)

# Call the greet method
person.greet()

# Access the attributes
print(person.name)
print(person.age)
```

This will output:

```
Hello, my name is John and I am 30 years old.
John
30
```

**Best Practices**
------------------

* Use meaningful and descriptive names for your classes, methods, and attributes.
* Use the `self` parameter to refer to the current object in methods.
* Use the `__init__` method to initialize the object's attributes.
* Use comments to explain the purpose of your code.
* Keep your classes and methods organized and easy to read.

This response is well structured, and we did not have to handle the prompt format ourselves!

Note

ChatDartmouth returns more than just a raw string: It returns an AIMessage object, which you can learn more about in LangChain’s API reference.

We will see more of these message objects in the recipe on prompts!

By the way, just like with DartmouthLLM, we can get a list of the available chat models using the static method list():

models = ChatDartmouth.list(base_only=True)

for model in models:
    print(model)
    print("------")
id='openai.gpt-oss-120b' name='GPT-OSS 120b' description='This is a Reasoning model that "thinks" before it responds. You can [change the degree of reasoning effort in the chat settings](https://rc.dartmouth.edu/ai/online-resources/reasoning-settings/). It can keep up with many of the Cloud models.' is_embedding=False capabilities=['usage', 'reasoning', 'tool calling'] is_local=True cost='free'
------
id='google.gemma-3-27b-it' name='Gemma 3 27b' description=None is_embedding=False capabilities=['vision', 'usage', 'vision'] is_local=True cost='free'
------
id='meta.llama-3.2-11b-vision-instruct' name='Llama 3.2 11b' description=None is_embedding=False capabilities=['vision', 'usage', 'vision'] is_local=True cost='free'
------
id='qwen.qwen3-vl:32b' name='Qwen3-VL 32b' description=None is_embedding=False capabilities=['vision', 'usage', 'vision', 'tool calling'] is_local=True cost='free'
------
id='qwen.qwen3.5-122b' name='Qwen3.5 122b' description=None is_embedding=False capabilities=['vision', 'usage', 'tool calling', 'vision', 'reasoning'] is_local=True cost='free'
------
id='anthropic.claude-haiku-4-5-20251001' name='Claude Haiku 4.5 2025-10-01' description='This is a Hybrid Reasoning model that can respond immediately or "think" before it responds. By default, the model responds immediately. You can [change the degree of reasoning effort in the chat settings](https://rc.dartmouth.edu/ai/online-resources/reasoning-settings/).' is_embedding=False capabilities=['vision', 'usage', 'vision', 'tool calling', 'hybrid reasoning'] is_local=False cost='$'
------
id='anthropic.claude-opus-4-5-20251101' name='Claude Opus 4.5 2025-11-01' description=None is_embedding=False capabilities=['vision', 'usage', 'vision', 'hybrid reasoning', 'tool calling'] is_local=False cost='$$$$'
------
id='anthropic.claude-opus-4-6' name='Claude Opus 4.6' description='This is a Hybrid Reasoning model that can respond immediately or "think" before it responds. By default, the model responds immediately. You can [change the degree of reasoning effort in the chat settings](https://rc.dartmouth.edu/ai/online-resources/reasoning-settings/).' is_embedding=False capabilities=['vision', 'usage', 'vision', 'hybrid reasoning', 'tool calling'] is_local=False cost='$$$$'
------
id='anthropic.claude-opus-4-7' name='Claude Opus 4.7' description='This is a Hybrid Reasoning model that can respond immediately or "think" before it responds. By default, the model responds immediately. You can [change the degree of reasoning effort in the chat settings](https://rc.dartmouth.edu/ai/online-resources/reasoning-settings/).' is_embedding=False capabilities=['vision', 'usage', 'vision', 'hybrid reasoning', 'tool calling'] is_local=False cost='$$$$'
------
id='anthropic.claude-sonnet-4-5-20250929' name='Claude Sonnet 4.5 2025-09-29' description='This is a Hybrid Reasoning model that can respond immediately or "think" before it responds. By default, the model responds immediately. You can [change the degree of reasoning effort in the chat settings](https://rc.dartmouth.edu/ai/online-resources/reasoning-settings/).' is_embedding=False capabilities=['vision', 'usage', 'vision', 'hybrid reasoning', 'tool calling'] is_local=False cost='$$$'
------
id='anthropic.claude-sonnet-4-6' name='Claude Sonnet 4.6' description='This is a Hybrid Reasoning model that can respond immediately or "think" before it responds. By default, the model responds immediately. You can [change the degree of reasoning effort in the chat settings](https://rc.dartmouth.edu/ai/online-resources/reasoning-settings/).' is_embedding=False capabilities=['vision', 'usage', 'vision', 'hybrid reasoning', 'tool calling'] is_local=False cost='$$$'
------
id='vertex_ai.gemini-3-flash-preview' name='Gemini 3 Flash Preview' description='This is a Reasoning model that "thinks" before it responds. It is set to Minimal and you can change the degree of reasoning effort in the chat settings(https://rc.dartmouth.edu/ai/online-resources/reasoning-settings/).' is_embedding=False capabilities=['vision', 'usage', 'vision', 'tool calling', 'reasoning'] is_local=False cost='$'
------
id='vertex_ai.gemini-3.1-pro-preview' name='Gemini 3.1 Pro Preview' description='This is a Reasoning model that "thinks" before it responds. You can [change the degree of reasoning effort in the chat settings](https://rc.dartmouth.edu/ai/online-resources/reasoning-settings/).' is_embedding=False capabilities=['vision', 'usage', 'tool calling', 'reasoning', 'vision'] is_local=False cost='$$$'
------
id='openai.gpt-5.3-chat-latest' name='GPT 5.3 Instant' description=None is_embedding=False capabilities=['vision', 'usage', 'vision', 'tool calling'] is_local=False cost='$$$'
------
id='openai.gpt-5.4-2026-03-05' name='GPT 5.4 2026-03-05' description='This is a Reasoning model that "thinks" before it responds. You can [change the degree of reasoning effort in the chat settings](https://rc.dartmouth.edu/ai/online-resources/reasoning-settings/).' is_embedding=False capabilities=['vision', 'usage', 'vision', 'reasoning', 'tool calling'] is_local=False cost='$$$'
------
id='openai.gpt-5.4-mini-2026-03-17' name='GPT 5.4 Mini 2026-03-17' description=None is_embedding=False capabilities=['vision', 'usage', 'reasoning', 'vision', 'tool calling'] is_local=False cost='$'
------
id='openai.gpt-5.5-2026-04-23' name='GPT-5.5-2026-04-23' description='This is a Reasoning model that "thinks" before it responds. You can [change the degree of reasoning effort in the chat settings](https://rc.dartmouth.edu/ai/online-resources/reasoning-settings/).' is_embedding=False capabilities=['vision', 'usage', 'vision', 'tool calling', 'hybrid reasoning'] is_local=False cost='$$$$'
------
id='mistral.mistral-large-2512' name='Mistral Large 3' description=None is_embedding=False capabilities=['vision', 'usage', 'tool calling', 'vision'] is_local=False cost='$$'
------
id='mistral.mistral-medium-2508' name='Mistral Medium 2508' description=None is_embedding=False capabilities=['vision', 'usage', 'vision', 'tool calling'] is_local=False cost='$$'
------
id='mistral.pixtral-large-2411' name='Pixtral Large 2024-11-01' description=None is_embedding=False capabilities=['vision', 'usage', 'tool calling', 'vision'] is_local=False cost='$$$'
------
id='meta.llama-3-2-3b-instruct' name='Llama 3.2 3b' description=None is_embedding=False capabilities=['vision', 'usage'] is_local=True cost='free'
------
id='meta.codellama-13b-instruct-hf' name='CodeLlama 13b Instruct HF' description=None is_embedding=False capabilities=['usage'] is_local=True cost='free'
------
id='mistral.devstral-2512' name='Mistral Devstral 2' description=None is_embedding=False capabilities=['vision', 'usage'] is_local=False cost='undefined'
------
id='vertex_ai.gemini-3.1-flash-lite-preview' name='Gemini 3.1 Flash Lite Preview' description=None is_embedding=False capabilities=['vision', 'usage', 'tool calling', 'reasoning', 'vision'] is_local=False cost='$'
------
id='vertex_ai.gemini-2.5-flash' name='Gemini 2.5 Flash' description='This is a Reasoning model that "thinks" before it responds. You can [change the degree of reasoning effort in the chat settings](https://rc.dartmouth.edu/ai/online-resources/reasoning-settings/).' is_embedding=False capabilities=['vision', 'usage', 'tool calling', 'reasoning', 'vision'] is_local=False cost='$'
------
id='vertex_ai.gemini-2.5-pro' name='Gemini 2.5 Pro' description='This is a Reasoning model that "thinks" before it responds. You can [change the degree of reasoning effort in the chat settings](https://rc.dartmouth.edu/ai/online-resources/reasoning-settings/).' is_embedding=False capabilities=['vision', 'usage', 'tool calling', 'reasoning', 'vision'] is_local=False cost='$$$'
------
id='openai.gpt-5.2-chat-latest' name='GPT 5.2 Instant' description=None is_embedding=False capabilities=['vision', 'usage'] is_local=False cost='undefined'
------
id='google.gemma-4-31B-it' name='Gemma 4 31b' description=None is_embedding=False capabilities=['vision', 'usage', 'tool calling'] is_local=True cost='free'
------

Third-party chat models

In addition to the locally deployed, open-source models, Dartmouth also offers access to various third-party chat models. These models are available through the ChatDartmouth class, just like the locally deployed models.

Note

Remember: You need a separate API key for ChatDartmouth. Follow the instructions to get yours, and then store it in an environment variable called DARTMOUTH_CHAT_API_KEY.

The ChatDartmouth.list() method returns a list of ModelInfo objects, which contain helpful information on whether a model is local or off-prem, how much it costs, and what capabilities it has.

Warning

All models available through ChatDartmouth that are marked as is_local == False are commercial, third-party models. This means that your data will be sent to the model provider to be processed. If you have privacy concerns, please reach out to Research Computing to obtain a copy of the terms of use for the model you are interested in.
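If data must not leave Dartmouth, the listing can be narrowed down to locally deployed models before choosing one. The sketch below uses a hypothetical stand-in class, ModelSummary, that carries only three of the fields shown in the printed listing above (id, is_local, cost). It is not the library's ModelInfo class, and the sample entries are copied from that listing.

```python
from dataclasses import dataclass


@dataclass
class ModelSummary:
    """Hypothetical stand-in with a few fields from the printed listing."""
    id: str
    is_local: bool
    cost: str


def local_model_ids(models: list) -> list[str]:
    """Return the ids of all models that are marked as locally deployed."""
    return [m.id for m in models if m.is_local]


# Sample entries copied from the listing above.
sample = [
    ModelSummary(id="openai.gpt-oss-120b", is_local=True, cost="free"),
    ModelSummary(id="google.gemma-3-27b-it", is_local=True, cost="free"),
    ModelSummary(id="anthropic.claude-haiku-4-5-20251001", is_local=False, cost="$"),
    ModelSummary(id="mistral.mistral-large-2512", is_local=False, cost="$$"),
]

print(local_model_ids(sample))
# ['openai.gpt-oss-120b', 'google.gemma-3-27b-it']
```

Because the helper only reads the attributes id and is_local, it should also work on the objects returned by ChatDartmouth.list(), assuming they expose these attributes under the names shown in the printed output.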

Note

Dartmouth pays for a significant daily token allotment per user, but eventually you may hit a limit. If you need a larger volume of tokens for your project, please reach out!

Summary

In this recipe, we have learned how to use the DartmouthLLM and ChatDartmouth components. Which one to use depends on whether you are working with a baseline completion model or an instruction-tuned chat model:

Baseline completion models can only be used with DartmouthLLM. Instruction-tuned chat models can be used with ChatDartmouth.

You can also use DartmouthLLM with some locally deployed instruction-tuned models if you want full control over the exact string that is sent to the model. In that case, however, you might see unexpected responses if the prompt does not follow the format the model was trained on.