Large Language Models#

A number of Large Language Models (LLMs) are available in langchain_dartmouth.

LLMs in this library generally come in two flavors:

  • Baseline completion models:

    • These models are trained to simply continue the given prompt by adding the next token.

  • Instruction-tuned chat models:

    • These models are built on baseline completion models, but further trained using a specific prompt format to allow a conversational back-and-forth. Dartmouth offers limited access to various third-party commercial chat models, e.g., OpenAI’s GPT-4o or Anthropic’s Claude. Daily token limits per user apply.

Each of these model types is supported by langchain_dartmouth through a separate component.

You can find all available models using the list() method of the respective class, as we will see below.

Let’s explore these components! But before we get started, we need to load our Dartmouth API key and Dartmouth Chat API key from the .env file:

from dotenv import find_dotenv, load_dotenv

load_dotenv(find_dotenv())
True

Baseline Completion Models#

Baseline completion models are trained to simply continue the given prompt by adding the next token. The continued prompt is then considered the next input to the model, which extends it by another token. This continues until a specified maximum number of tokens have been added, or until a special token called a stop token is generated.
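The generation loop described above can be sketched with a toy stand-in "model" (here just a lookup table predicting the next token from the last one; everything in this block is illustrative and not part of langchain_dartmouth):

```python
# Toy stand-in for a completion model: maps the last token seen
# to a "predicted" next token.
NEXT_TOKEN = {
    "def": "add", "add": "(", "(": "a", "a": ",", ",": "b", "b": ")", ")": "<stop>",
}

def complete(prompt_tokens, max_new_tokens=10, stop_token="<stop>"):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        # The extended sequence is the next input; here only the
        # last token matters for our toy "model".
        next_tok = NEXT_TOKEN.get(tokens[-1], stop_token)
        if next_tok == stop_token:  # generating the stop token ends the loop
            break
        tokens.append(next_tok)
    return tokens

print(complete(["def"]))  # ['def', 'add', '(', 'a', ',', 'b', ')']
```

A real model predicts the next token from the entire sequence, of course, but the stopping conditions (token budget or stop token) work just like this.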

A popular use-case for completion models is to generate code. Let’s try an example and have the LLM generate a function based on its signature!

All baseline completion models are available through the component DartmouthLLM in the submodule langchain_dartmouth.llms, so we first need to import that class:

from langchain_dartmouth.llms import DartmouthLLM

We can find out which models are available by using the static method list():

Note

A static method is a function that is defined on the class itself, not on an instance of the class. It’s essentially just a regular function, but tied to a class for grouping purposes. In practice, that means that we can call a static method without instantiating an object of the class first. That is why there are no parentheses after the class name in the next code block!
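Here is a minimal illustration of a static method (the class is hypothetical, not part of langchain_dartmouth):

```python
class Catalog:
    @staticmethod
    def list():
        # Defined on the class itself, so no Catalog() instance is needed
        return ["model-a", "model-b"]

# Note: Catalog.list(), not Catalog().list()
print(Catalog.list())  # ['model-a', 'model-b']
```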

DartmouthLLM.list()
[{'name': 'llama-3-8b-instruct',
  'provider': 'meta',
  'display_name': 'Llama 3 8B Instruct',
  'tokenizer': 'meta-llama/Meta-Llama-3-8B-Instruct',
  'type': 'llm',
  'capabilities': ['chat'],
  'server': 'text-generation-inference',
  'parameters': {'max_input_tokens': 8192}},
 {'name': 'llama-3-1-8b-instruct',
  'provider': 'meta',
  'display_name': 'Llama 3.1 8B Instruct',
  'tokenizer': 'meta-llama/Llama-3.1-8B-Instruct',
  'type': 'llm',
  'capabilities': ['chat'],
  'server': 'text-generation-inference',
  'parameters': {'max_input_tokens': 8192}},
 {'name': 'llama-3-2-11b-vision-instruct',
  'provider': 'meta',
  'display_name': 'Llama 3.2 11B Vision Instruct',
  'tokenizer': 'meta-llama/Llama-3.2-11B-Vision-Instruct',
  'type': 'llm',
  'capabilities': ['chat', 'vision'],
  'server': 'text-generation-inference',
  'parameters': {'max_input_tokens': 127999}},
 {'name': 'codellama-13b-instruct-hf',
  'provider': 'meta',
  'display_name': 'CodeLlama 13B Instruct HF',
  'tokenizer': 'meta-llama/CodeLlama-13b-Instruct-hf',
  'type': 'llm',
  'capabilities': ['chat'],
  'server': 'text-generation-inference',
  'parameters': {'max_input_tokens': 6144}},
 {'name': 'codellama-13b-python-hf',
  'provider': 'meta',
  'display_name': 'CodeLlama 13B Python HF',
  'tokenizer': 'meta-llama/CodeLlama-13b-Python-hf',
  'type': 'llm',
  'capabilities': [],
  'server': 'text-generation-inference',
  'parameters': {'max_input_tokens': 2048}}]
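Since list() returns plain dictionaries, you can filter the listing with ordinary Python. For example, to keep only models with a 'chat' capability (using two abbreviated entries copied from the listing above):

```python
# Abbreviated entries from DartmouthLLM.list()
models = [
    {"name": "codellama-13b-instruct-hf", "capabilities": ["chat"]},
    {"name": "codellama-13b-python-hf", "capabilities": []},
]

# Keep only the names of models that support chat
chat_models = [m["name"] for m in models if "chat" in m["capabilities"]]
print(chat_models)  # ['codellama-13b-instruct-hf']
```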

We can now instantiate a specific LLM by specifying its name as it appears in the listing. Since the model generates a continuation of our prompt, it often makes sense to include the prompt in the response, which we can request by setting the parameter return_full_text to True:

llm = DartmouthLLM(model_name="codellama-13b-python-hf", return_full_text=True)

We can now send a prompt to the model and receive its response by using the invoke() method:

response = llm.invoke("def remove_digits(s: str) -> str:")
print(response)
def remove_digits(s: str) -> str:
    digits = set("1234567890")
    res = ""
    for c in s:
        if c not in digits:
            res += c
    return res


assert remove_digits("123abc111") == "abc"
assert remove_digits("") == ""
assert remove_digits("p7h02hf2h10") == "php"

Since they are only trained to continue the given prompt, completion models are not great at responding to chat-like prompts:

response = llm.invoke("How can I define a class in Python?")
print(response)
How can I define a class in Python?

What is the equivalent of `classname` in Java in Python?

Answer: \begin{code}
class ClassName:
\end{code}

Note that `ClassName` must be a valid Python identifier (i.e., it cannot contain spaces, etc.). Also, it is recommended to use CamelCase rather than all uppercase, since uppercase is usually reserved for constants.

For more information, check out the Python tutorial [here](https://docs.python.org/3/tutorial/classes.html).

Answer: \begin{code}
class Foo:
    def __init__(self):
        self.bar = 10
\end{code}

Comment: Thanks for the answer. I guess my question should have been "How can I define a class in Python" :)

Comment: @Faisal, no problem, I updated the title of your question to match the body.

Answer: \begin{code}
class Foo:
    def __init__(self):
        self.bar = 10
\end{code}

As we can see, the model just continues the prompt in a way that is similar to what it has seen during its training. If we want to use it in a conversational way, we need to use an instruction-tuned chat model.

Instruction-Tuned Chat Models#

Instruction-tuned chat models are trained to follow instructions given in the prompt. These models can be used in conversational scenarios, where the user asks questions and the model replies with answers. The model does not just continue the prompt, but also takes into account the context of the conversation preceding it. To achieve this, baseline completion models are fine-tuned (i.e., further trained) on conversational text material that is formatted following a particular template. That is why we often see multiple variants of an LLM: the base model and the instruct version (see, e.g., CodeLlama).
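To make the template idea concrete, here is a small helper that wraps a user message in CodeLlama's instruction format (the `<s>[INST] ... [/INST]` markers). The helper itself is illustrative and not part of langchain_dartmouth; other model families use different templates:

```python
def format_codellama_prompt(user_message: str) -> str:
    # Wrap the raw message in the special tokens that CodeLlama's
    # instruction tuning expects. The trailing space marks where the
    # model's answer should begin.
    return f"<s>[INST] {user_message} [/INST] "

print(format_codellama_prompt("How can I define a class in Python?"))
# <s>[INST] How can I define a class in Python? [/INST]
```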

Let’s see what happens if we ask an instruction-tuned model our question from the previous section:

llm = DartmouthLLM(model_name="codellama-13b-instruct-hf")
response = llm.invoke("How can I define a class in Python?")
print(response)
\begin{code}
class MyClass(object):
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z
\end{code}

Or

\begin{code}
class MyClass:
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z
\end{code}

Which is the correct way?

Answer: Your second example is correct. It is a new-style class. Your first example is an old-style class.

In Python 2.1 and newer, all classes are new-style classes. In Python 2.0 and older, new-style classes are classes that inherit from `object`. Old-style classes do not inherit from `object` and are considered legacy.

All new-style classes are subclasses of `object`, and therefore support [methods](https://docs.python.org/3/tutorial/classes.html#class-objects) (e.g. `__init__`) and [properties](https://docs.python.org/3/tutorial/classes.html#property-objects) (e.g. `__getitem__`, `__getattr__`, `__setattr__`, `__delattr__`) that are not defined in their parent class.

Old-style classes do not support these features.

The `object` class is the base class of all classes, and is the only class in Python that is not a subclass of itself.

Comment: In other words, if you want to use `property` or `__getitem__` or any other special methods, you have to make sure your class inherits from `object`.

Comment: And also for __getattr__, __setattr__, __delattr__

Comment: @MikeC.: `__getattr__`, `__setattr__` and `__delattr__` are all special methods of the `object` class, not of the `property` class.

Comment: Ah, my bad. I meant for the class as a whole.

Answer: The second example is the correct way.

The first example uses a syntax from before Python 2.2.  Before that version, all classes were [

Well, that does not seem very helpful… What went wrong here?

The problem is that the prompt we use during inference (when we invoke the model) needs to follow the same format that was used during instruction tuning. This format is not the same for every model! Let’s try our prompt again using CodeLlama’s instruction format:

response = llm.invoke("<s>[INST] How can I define a class in Python? [/INST] ")

print(response)
 In Python, you can define a class using the `class` keyword followed by the name of the class and the body of the class enclosed in curly braces. Here is an example:
```
class MyClass:
    pass
```
This defines a class called `MyClass` with no attributes or methods. You can add attributes and methods to the class by using the `def` keyword to define functions inside the class. Here is an example:
```
class MyClass:
    def __init__(self, name):
        self.name = name

    def say_hello(self):
        print("Hello, my name is", self.name)
```
This defines a class called `MyClass` with an `__init__` method that takes a `name` parameter and sets it as an attribute of the class, and a `say_hello` method that prints a greeting message using the `name` attribute.

You can also use inheritance to define a class that inherits from another class. Here is an example:
```
class Animal:
    def __init__(self, name):
        self.name = name

    def say_hello(self):
        print("Hello, my name is", self.name)

class Dog(Animal):
    def bark(self):
        print("Woof!")
```
This defines a class called `Dog` that inherits from the `Animal` class and adds a `bark` method.

You can also use abstract classes, which are classes that cannot be instantiated and are used to define a common interface for multiple classes. Here is an example:
```
from abc import ABC, abstractmethod

class Animal(ABC):
    @abstractmethod
    def say_hello(self):
        pass

class Dog(Animal):
    def say_hello(self):
        print("Woof!")
```
This defines a class called `Animal` that is abstract and has an abstract `say_hello` method, and a class called `Dog` that inherits from `Animal` and defines a concrete `say_hello` method.

I hope this helps! Let me know if you have any questions.

That looks a lot better!

Note

You may notice that the last sentence gets cut off. This is due to the default value for the maximum number of generated tokens, which may be too low. You can set a higher limit when you instantiate the DartmouthLLM object. Check the API reference for more information.

Managing the prompt format can quickly get tedious, especially if you want to switch between different models. Fortunately, the ChatDartmouth component handles the prompt formatting under the hood, and we can just pass the actual message when we invoke it:

from langchain_dartmouth.llms import ChatDartmouth

llm = ChatDartmouth(model_name="meta.llama-3.2-11b-vision-instruct")
response = llm.invoke("How can I define a class in Python?")

print(response.content)
**Defining a Class in Python**
=====================================

In Python, you can define a class using the `class` keyword followed by the name of the class. Here's a basic example:

```python
class MyClass:
    # Class attributes and methods go here
```

**Class Structure**
-------------------

A class typically consists of:

1. **Class attributes**: These are variables that are shared by all instances of the class. They are defined inside the class definition, but outside any method.
2. **Methods**: These are functions that are part of the class and can be called on instances of the class.

**Example Class**
-----------------

Here's an example of a simple class called `Person`:

```python
class Person:
    def __init__(self, name, age):
        """Initialize a new Person instance"""
        self.name = name
        self.age = age

    def greet(self):
        """Print a greeting message"""
        print(f"Hello, my name is {self.name} and I am {self.age} years old.")

    def increment_age(self):
        """Increment the person's age by 1 year"""
        self.age += 1
```

**Instantiating a Class**
-------------------------

To create a new instance of a class, you use the `()` operator:

```python
person = Person("John", 30)
```

**Accessing Class Attributes and Methods**
-----------------------------------------

You can access class attributes and methods using the dot notation:

```python
print(person.name)  # Output: John
person.greet()  # Output: Hello, my name is John and I am 30 years old.
person.increment_age()
print(person.age)  # Output: 31
```

**Class Inheritance**
--------------------

Python supports inheritance, which allows you to create a new class that is a modified version of an existing class. You can use the `(ParentClass)` syntax to inherit from a parent class:

```python
class Employee(Person):
    def __init__(self, name, age, department):
        super().__init__(name, age)
        self.department = department

    def greet(self):
        """Print a greeting message with department"""
        print(f"Hello, my name is {self.name} and I am {self.age} years old. I work in the {self.department} department.")
```

Note that `super()` is used to call the parent class's `__init__` method.

**Best Practices**
------------------

* Use meaningful

That also looks great, and we did not have to worry about the prompt format!

Note

ChatDartmouth returns more than just a raw string: It returns an AIMessage object, which you can learn more about in LangChain’s API reference.

We will see more of these message objects in the recipe on prompts!

By the way, just like with DartmouthLLM, we can get a list of the available chat models using the static method list():

models = ChatDartmouth.list(base_only=True)

for model in models:
    print(model)
    print("------")
id='openai.gpt-oss-120b' name='GPT-OSS 120b' description='This is a Reasoning model that "thinks" before it responds. You can [change the degree of reasoning effort in the chat settings](https://rc.dartmouth.edu/ai/online-resources/reasoning-settings/). It can keep up with many of the Cloud models.' is_embedding=False capabilities=['usage', 'reasoning', 'tool calling'] is_local=True cost='free'
------
id='google.gemma-3-27b-it' name='Gemma 3 27b' description=None is_embedding=False capabilities=['vision', 'usage', 'vision'] is_local=True cost='free'
------
id='meta.llama-3.2-11b-vision-instruct' name='Llama 3.2 11b' description=None is_embedding=False capabilities=['vision', 'usage', 'vision'] is_local=True cost='free'
------
id='qwen.qwen3-vl-32b-instruct-fp8' name='Qwen3-VL 32b' description=None is_embedding=False capabilities=['vision', 'usage', 'vision', 'tool calling'] is_local=True cost='free'
------
id='openai_responses.gpt-5.1-chat-latest' name='GPT-5.1 Instant' description='**Please note:** The GPT-5 family of models does not [produce formatted text](https://dartgo.org/dchat-gpt5) by default.' is_embedding=False capabilities=['vision', 'usage', 'tool calling', 'vision'] is_local=False cost='$$$'
------
id='openai_responses.gpt-5.1-2025-11-13' name='GPT-5.1 Thinking 2025-11-13' description='**Please note:** The GPT-5 family of models does not [produce formatted text](https://dartgo.org/dchat-gpt5) by default.\nThis is a Reasoning model that "thinks" before it responds. You can [change the degree of reasoning effort in the chat settings](https://rc.dartmouth.edu/ai/online-resources/reasoning-settings/).' is_embedding=False capabilities=['vision', 'usage', 'vision', 'hybrid reasoning', 'tool calling'] is_local=False cost='$$$'
------
id='openai_responses.gpt-5.2-chat-latest' name='GPT-5.2 Instant' description='This model is optimized for low-latency chat while retaining strong general intelligence. It uses adaptive reasoning to selectively "think" on harder queries in an attempt to improve accuracy on math, coding, and multi-step tasks without slowing down typical conversations. The model is trained to be warmer and more conversational by default, with better instruction following and more stable short-form reasoning.' is_embedding=False capabilities=['vision', 'usage', 'tool calling', 'vision'] is_local=False cost='$$$'
------
id='openai_responses.gpt-5.2-2025-12-11' name='GPT-5.2 Thinking 2025-12-11' description='This is a Hybrid Reasoning model that "thinks" before it responds. You can [change the degree of reasoning effort in the chat settings](https://rc.dartmouth.edu/ai/online-resources/reasoning-settings/). Possible degrees are "none", "low" (default), "medium", and "high".' is_embedding=False capabilities=['vision', 'usage', 'tool calling', 'hybrid reasoning', 'vision'] is_local=False cost='$$$'
------
id='openai.gpt-4.1-mini-2025-04-14' name='GPT 4.1 mini 2025-04-14' description='**Looking for GPT-5 or another frontier model?** Select all the latest reasoning and open-weight models from the model selector, and click *Set as default* to make that selection stick.\nGPT-4.1 mini is our recommended default model because it offers a good compromise between speed and capability.' is_embedding=False capabilities=['vision', 'usage', 'tool calling', 'vision'] is_local=False cost='$'
------
id='openai.gpt-4.1-2025-04-14' name='GPT 4.1 2025-04-14' description=None is_embedding=False capabilities=['vision', 'usage', 'vision'] is_local=False cost='$$$'
------
id='openai_responses.gpt-5-mini-2025-08-07' name='GPT-5 mini 2025-08-07' description='**Please note:** The GPT-5 family of models does not [produce formatted text](https://dartgo.org/dchat-gpt5) by default.\nThis is a Reasoning model that "thinks" before it responds. You can [change the degree of reasoning effort in the chat settings](https://rc.dartmouth.edu/ai/online-resources/reasoning-settings/).' is_embedding=False capabilities=['vision', 'usage'] is_local=False cost='undefined'
------
id='openai_responses.gpt-5-2025-08-07' name='GPT-5 2025-08-07' description='**Please note:** The GPT-5 family of models does not [produce formatted text](https://dartgo.org/dchat-gpt5) by default.\nThis is a Reasoning model that "thinks" before it responds. You can [change the degree of reasoning effort in the chat settings](https://rc.dartmouth.edu/ai/online-resources/reasoning-settings/).' is_embedding=False capabilities=['vision', 'usage', 'vision', 'reasoning', 'tool calling'] is_local=False cost='$$$'
------
id='anthropic.claude-3-5-haiku-20241022' name='Claude 3.5 Haiku 2024-10-22' description=None is_embedding=False capabilities=['vision', 'usage', 'tool calling', 'vision'] is_local=False cost='$'
------
id='anthropic.claude-haiku-4-5-20251001' name='Claude Haiku 4.5 2025-10-01' description='This is a Hybrid Reasoning model that can respond immediately or "think" before it responds. By default, the model responds immediately. You can [change the degree of reasoning effort in the chat settings](https://rc.dartmouth.edu/ai/online-resources/reasoning-settings/).' is_embedding=False capabilities=['vision', 'usage', 'vision', 'tool calling', 'hybrid reasoning'] is_local=False cost='$'
------
id='anthropic.claude-3-7-sonnet-20250219' name='Claude 3.7 Sonnet 2025-02-19' description='This is a Hybrid Reasoning model that can respond immediately or "think" before it responds. By default, the model responds immediately. You can [change the degree of reasoning effort in the chat settings](https://rc.dartmouth.edu/ai/online-resources/reasoning-settings/).' is_embedding=False capabilities=['vision', 'usage', 'hybrid reasoning', 'tool calling', 'vision'] is_local=False cost='$$$$'
------
id='anthropic.claude-opus-4-5-20251101' name='Claude Opus 4.5 2025-11-01' description=None is_embedding=False capabilities=['vision', 'usage', 'vision', 'hybrid reasoning', 'tool calling'] is_local=False cost='$$$$'
------
id='anthropic.claude-sonnet-4-20250514' name='Claude Sonnet 4 2025-05-14' description='This is a Hybrid Reasoning model that can respond immediately or "think" before it responds. By default, the model responds immediately. You can [change the degree of reasoning effort in the chat settings](https://rc.dartmouth.edu/ai/online-resources/reasoning-settings/).' is_embedding=False capabilities=['vision', 'usage', 'hybrid reasoning', 'tool calling', 'vision'] is_local=False cost='$$$'
------
id='anthropic.claude-sonnet-4-5-20250929' name='Claude Sonnet 4.5 2025-09-29' description='This is a Hybrid Reasoning model that can respond immediately or "think" before it responds. By default, the model responds immediately. You can [change the degree of reasoning effort in the chat settings](https://rc.dartmouth.edu/ai/online-resources/reasoning-settings/).' is_embedding=False capabilities=['vision', 'usage', 'vision', 'hybrid reasoning', 'tool calling'] is_local=False cost='$$$'
------
id='vertex_ai.gemini-2.0-flash-001' name='Gemini 2.0 Flash 2025-02-05' description=None is_embedding=False capabilities=['vision', 'usage', 'tool calling', 'vision'] is_local=False cost='$'
------
id='vertex_ai.gemini-2.5-flash' name='Gemini 2.5 Flash' description='This is a Reasoning model that "thinks" before it responds. You can [change the degree of reasoning effort in the chat settings](https://rc.dartmouth.edu/ai/online-resources/reasoning-settings/).' is_embedding=False capabilities=['vision', 'usage', 'tool calling', 'reasoning', 'vision'] is_local=False cost='$'
------
id='vertex_ai.gemini-2.5-pro' name='Gemini 2.5 Pro' description='This is a Reasoning model that "thinks" before it responds. You can [change the degree of reasoning effort in the chat settings](https://rc.dartmouth.edu/ai/online-resources/reasoning-settings/).' is_embedding=False capabilities=['vision', 'usage', 'tool calling', 'reasoning', 'vision'] is_local=False cost='$$$'
------
id='mistral.mistral-large-2512' name='Mistral Large 3' description=None is_embedding=False capabilities=['vision', 'usage', 'tool calling', 'vision'] is_local=False cost='$$'
------
id='mistral.mistral-medium-2508' name='Mistral Medium 2508' description=None is_embedding=False capabilities=['vision', 'usage', 'vision', 'tool calling'] is_local=False cost='$$'
------
id='mistral.pixtral-large-2411' name='Pixtral Large 2024-11-01' description=None is_embedding=False capabilities=['vision', 'usage', 'tool calling', 'vision'] is_local=False cost='$$$'
------
id='meta.llama-3-2-3b-instruct' name='Llama 3.2 3b' description=None is_embedding=False capabilities=['vision', 'usage'] is_local=True cost='free'
------
id='meta.codellama-13b-instruct-hf' name='CodeLlama 13b Instruct HF' description=None is_embedding=False capabilities=['usage'] is_local=True cost='free'
------
id='openai_responses.gpt-5.1-codex' name='GPT-5.1 Codex' description=None is_embedding=False capabilities=['vision', 'usage', 'vision', 'tool calling'] is_local=False cost='$$$'
------
id='openai_responses.gpt-5.1-codex-mini' name='GPT-5.1 Codex Mini' description=None is_embedding=False capabilities=['vision', 'usage', 'vision', 'tool calling'] is_local=False cost='$'
------
id='mistral.devstral-2512' name='Mistral Devstral 2' description=None is_embedding=False capabilities=['vision', 'usage'] is_local=False cost='undefined'
------

Third-party chat models#

In addition to the locally deployed, open-source models, Dartmouth also offers access to various third-party chat models. These models are available through the ChatDartmouth class, just like the locally deployed models.

Note

Remember: You need a separate API key for ChatDartmouth. Follow the instructions to get yours, and then store it in an environment variable called DARTMOUTH_CHAT_API_KEY.

The ChatDartmouth.list() method returns a list of ModelInfo objects, which contain helpful information on whether a model is local or off-prem, how much it costs, and what capabilities it has.
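For example, you could use that information to filter for locally hosted models. The sketch below uses a stand-in dataclass mirroring the fields shown in the listing above, so it is self-contained; with the real library you would filter the ModelInfo objects returned by ChatDartmouth.list() the same way:

```python
from dataclasses import dataclass, field

# Stand-in for ModelInfo, mirroring fields from the listing above
@dataclass
class ModelInfo:
    id: str
    is_local: bool
    capabilities: list = field(default_factory=list)

models = [
    ModelInfo("openai.gpt-oss-120b", True, ["reasoning", "tool calling"]),
    ModelInfo("anthropic.claude-sonnet-4-5-20250929", False, ["vision", "tool calling"]),
]

# Keep only models that run on local infrastructure
local_models = [m.id for m in models if m.is_local]
print(local_models)  # ['openai.gpt-oss-120b']
```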

Warning

All models available through ChatDartmouth that are marked as is_local == False are commercial, third-party models. This means that your data will be sent to the model provider to be processed. If you have privacy concerns, please reach out to Research Computing to obtain a copy of the terms of use for the model you are interested in.

Note

Dartmouth pays for a significant daily token allotment per user, but eventually you may hit a limit. If you need a larger volume of tokens for your project, please reach out!

Summary#

In this recipe, we have learned how to use the DartmouthLLM and ChatDartmouth components. Which one to use depends on whether you are working with a baseline completion model or an instruction-tuned chat model:

Baseline completion models can only be used with DartmouthLLM. Instruction-tuned chat models can be used with ChatDartmouth.

You can also use DartmouthLLM with a locally deployed instruction-tuned model if you want full control over the exact string that is sent to the model. In that case, however, you might see unexpected responses if the prompt format is not correct.