Embeddings
- class langchain_dartmouth.embeddings.DartmouthEmbeddings
Embedding models deployed on Dartmouth’s cluster.
- Parameters:
model_name (str, optional) – The name of the embedding model to use, defaults to "baai.bge-large-en-v1-5".
model_kwargs (dict, optional) – Keyword arguments to pass to the model.
dimensions (int, optional) – The number of dimensions the resulting output embeddings should have. Not supported by all models.
dartmouth_chat_api_key (str, optional) – A Dartmouth Chat API key (obtainable from https://chat.dartmouth.edu). If not specified, it is inferred from the environment variable DARTMOUTH_CHAT_API_KEY.
embeddings_server_url (str, optional) – URL pointing to an embeddings endpoint, defaults to "https://chat.dartmouth.edu/api/".
Example
With an environment variable named DARTMOUTH_CHAT_API_KEY pointing to your key obtained from Dartmouth Chat, using an embedding model only takes a few lines of code:

from langchain_dartmouth.embeddings import DartmouthEmbeddings

embeddings = DartmouthEmbeddings()
response = embeddings.embed_query("Hello? Is there anybody in there?")
print(response)
- static list(dartmouth_chat_api_key=None, url='https://chat.dartmouth.edu/api/')
List the models available through DartmouthEmbeddings.
- Parameters:
dartmouth_chat_api_key (str, optional) – A Dartmouth Chat API key (obtainable from https://chat.dartmouth.edu). If not specified, it is inferred from the environment variable DARTMOUTH_CHAT_API_KEY.
url (str, optional) – URL of the listing server
- Returns:
A list of descriptions of the available models
- Return type:
list[ModelInfo]
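For example, to print the identifier of each available embedding model, a minimal sketch (the entries are ModelInfo objects, documented under Model Listing below):
>>> from langchain_dartmouth.embeddings import DartmouthEmbeddings
>>> for model in DartmouthEmbeddings.list():
...     print(model.id)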
- async aembed_documents(texts, chunk_size=None, **kwargs)
Async call to the embedding endpoint to retrieve the embeddings of multiple texts.
- Parameters:
texts (List[str]) – The list of texts to embed.
chunk_size (int | None)
kwargs (Any)
- Returns:
Embeddings for the texts.
- Return type:
List[List[float]]
- async aembed_query(text, **kwargs)
Async call to the embedding endpoint to retrieve the embedding of the query text.
- Parameters:
text (str) – The text to embed.
kwargs (Any)
- Returns:
Embeddings for the text.
- Return type:
List[float]
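A minimal sketch of calling this coroutine from synchronous code, assuming the same environment-variable setup as the class-level example:
>>> import asyncio
>>> from langchain_dartmouth.embeddings import DartmouthEmbeddings
>>> embeddings = DartmouthEmbeddings()
>>> vector = asyncio.run(embeddings.aembed_query("Hello? Is there anybody in there?"))
>>> len(vector) > 0
True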
- embed_documents(texts, chunk_size=None, **kwargs)
Call out to the embedding endpoint to retrieve the embeddings of multiple texts.
- Parameters:
texts (List[str]) – The list of texts to embed.
chunk_size (int | None)
kwargs (Any)
- Returns:
Embeddings for the texts.
- Return type:
List[List[float]]
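For example, embedding two texts returns one vector per text (the sample strings are illustrative):
>>> from langchain_dartmouth.embeddings import DartmouthEmbeddings
>>> embeddings = DartmouthEmbeddings()
>>> vectors = embeddings.embed_documents(["First document.", "Second document."])
>>> len(vectors)
2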
- embed_query(text, **kwargs)
Call out to the embedding endpoint to retrieve the embedding of the query text.
- Parameters:
text (str) – The text to embed.
kwargs (Any)
- Returns:
Embeddings for the text.
- Return type:
List[float]
Large Language Models
- class langchain_dartmouth.llms.DartmouthLLM
Dartmouth-deployed Large Language Models. Use this class for non-chat models (e.g., CodeLlama 13B).
This class does not format the prompt to adhere to any required templates. The string you pass to it is exactly the string received by the LLM. If the desired model requires a chat template (e.g., Llama 3.1 Instruct), you may want to use
ChatDartmouth instead.
- Parameters:
model_name (str, optional) – Name of the model to use, defaults to "codellama-13b-python-hf".
temperature (float, optional) – Temperature to use for sampling (higher temperature means more varied outputs), defaults to 0.8.
max_new_tokens (int) – Maximum number of generated tokens, defaults to 512.
streaming (bool) – Whether to generate a stream of tokens asynchronously, defaults to False.
top_k (int, optional) – The number of highest-probability vocabulary tokens to keep for top-k filtering.
top_p (float, optional) – If set to < 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation, defaults to 0.95.
typical_p (float, optional) – Typical decoding mass. See Typical Decoding for Natural Language Generation for more information, defaults to 0.95.
repetition_penalty (float, optional) – The parameter for repetition penalty. 1.0 means no penalty. See this paper for more details.
return_full_text (bool) – Whether to prepend the prompt to the generated text, defaults to False.
truncate (int, optional) – Truncate input tokens to the given size.
stop_sequences (List[str], optional) – Stop generating tokens if a member of stop_sequences is generated.
seed (int, optional) – Random sampling seed.
do_sample (bool) – Activate logits sampling, defaults to False.
watermark (bool) – Watermarking with A Watermark for Large Language Models, defaults to False.
model_kwargs (dict, optional) – Parameters to pass to the model (see the documentation of LangChain's HuggingFaceTextGenInference class).
dartmouth_api_key (str, optional) – A Dartmouth API key (obtainable from https://developer.dartmouth.edu). If not specified, it is inferred from the environment variable DARTMOUTH_API_KEY.
authenticator (Callable, optional) – A Callable returning a JSON Web Token (JWT) for authentication.
jwt_url (str, optional) – URL of the Dartmouth API endpoint returning a JSON Web Token (JWT).
inference_server_url (str) – URL pointing to an inference endpoint, defaults to "https://ai-api.dartmouth.edu/tgi/".
timeout (int) – Timeout in seconds, defaults to 120.
server_kwargs (dict, optional) – Holds any text-generation-inference server parameters not explicitly specified.
**_ – Additional keyword arguments are silently discarded. This is to ensure interface compatibility with other LangChain components.
Example
With an environment variable named DARTMOUTH_API_KEY pointing to your key obtained from https://developer.dartmouth.edu, using a Dartmouth-hosted LLM only takes a few lines of code:

from langchain_dartmouth.llms import DartmouthLLM

llm = DartmouthLLM(model_name="codellama-13b-hf")
response = llm.invoke("Write a Python script to swap two variables.")
print(response)
- static list(dartmouth_api_key=None, url='https://api.dartmouth.edu/api/ai/models/')
List the models available through DartmouthLLM.
- Parameters:
dartmouth_api_key (str, optional) – A Dartmouth API key (obtainable from https://developer.dartmouth.edu). If not specified, it is inferred from the environment variable DARTMOUTH_API_KEY.
url (str, optional) – URL of the listing server
- Returns:
A list of descriptions of the available models
- Return type:
list[dict]
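For example, to print the identifier of each available model, a minimal sketch (the entries are plain dicts, matching the DartmouthModelListing examples later in this document):
>>> from langchain_dartmouth.llms import DartmouthLLM
>>> for model in DartmouthLLM.list():
...     print(model["id"])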
- async ainvoke(*args, **kwargs)
Asynchronously transforms a single input into an output.
See LangChain’s API documentation for details on how to use this method.
- Returns:
The LLM’s completion of the input string.
- Return type:
str
- invoke(*args, **kwargs)
Transforms a single input into an output.
See LangChain’s API documentation for details on how to use this method.
- Returns:
The LLM’s completion of the input string.
- Return type:
str
- class langchain_dartmouth.llms.ChatDartmouth
Chat models made available by Dartmouth.
Use this class if you want to use any chat model, e.g., Anthropic’s Claude or OpenAI’s GPT, made accessible by Dartmouth.
Both free on-premises models and paid third-party models are available.
To see which models are available, which features they support, and how much they cost, run ChatDartmouth.list().
- Parameters:
model_name (str) – Name of the model to use, defaults to "openai.gpt-oss-120b".
streaming (bool) – Whether to stream the results or not, defaults to False.
temperature (float) – Temperature to use for sampling (higher temperature means more varied outputs), defaults to 0.7.
max_tokens (int) – Maximum number of tokens to generate, defaults to 512.
logprobs (bool, optional) – Whether to return logprobs.
stream_usage (bool) – Whether to include usage metadata in streaming output. If True, additional message chunks will be generated during the stream including usage metadata, defaults to False.
presence_penalty (float, optional) – Penalizes repeated tokens.
frequency_penalty (float, optional) – Penalizes repeated tokens according to frequency.
seed (int, optional) – Seed for generation.
top_logprobs (int, optional) – Number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True if this parameter is used.
logit_bias (dict, optional) – Modify the likelihood of specified tokens appearing in the completion.
n (int, optional) – Number of chat completions to generate for each prompt, defaults to None (i.e., use upstream default).
top_p (float, optional) – Total probability mass of tokens to consider at each step.
model_kwargs (dict, optional) – Holds any model parameters valid for the create call not explicitly specified.
dartmouth_chat_api_key (str, optional) – A Dartmouth Chat API key (obtainable from https://chat.dartmouth.edu). If not specified, it is inferred from the environment variable DARTMOUTH_CHAT_API_KEY.
inference_server_url (str, optional) – The URL of the inference server (e.g., https://chat.dartmouth.edu/api/).
**_ – Additional keyword arguments are silently discarded. This is to ensure interface compatibility with other LangChain components.
Example
With an environment variable named DARTMOUTH_CHAT_API_KEY pointing to your key obtained from https://chat.dartmouth.edu, using a third-party LLM provided by Dartmouth only takes a few lines of code:

from langchain_dartmouth.llms import ChatDartmouth

llm = ChatDartmouth(model_name="openai.gpt-oss-120b")
response = llm.invoke("Hi there!")
print(response.content)
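ChatDartmouth also supports token-by-token streaming; a minimal sketch, assuming LangChain's generic stream() runnable method (not specific to this package) and an illustrative prompt:

from langchain_dartmouth.llms import ChatDartmouth

llm = ChatDartmouth(model_name="openai.gpt-oss-120b")
for chunk in llm.stream("Write a haiku about Dartmouth."):
    print(chunk.content, end="", flush=True)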
Note
Paid cloud models are billed by token consumption using different pricing depending on their complexity. Dartmouth pays for the use, but a daily token limit per user applies. Your token budget is the same as in Dartmouth Chat. Learn more about credits in Dartmouth Chat’s documentation.
- validator validate_temperature » all fields
Currently, OpenAI's o-series and GPT-5 models only allow temperature=1.
- Parameters:
values (dict[str, Any])
- Return type:
Any
- static list(dartmouth_chat_api_key=None, base_only=True, url='https://chat.dartmouth.edu/api/')
List the models available through ChatDartmouth.
- Parameters:
dartmouth_chat_api_key (str, optional) – A Dartmouth Chat API key (obtainable from https://chat.dartmouth.edu). If not specified, it is inferred from the environment variable DARTMOUTH_CHAT_API_KEY.
base_only (bool, optional) – If True, only regular Large Language Models are returned. If False, Workspace models are also returned.
url (str, optional) – URL of the listing server
- Returns:
A list of descriptions of the available models
- Return type:
list[ModelInfo]
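Since the returned entries are ModelInfo objects (see Model Listing below), you can filter on their fields; a minimal sketch that lists vision-capable models and their relative cost:
>>> from langchain_dartmouth.llms import ChatDartmouth
>>> for model in ChatDartmouth.list(base_only=True):
...     if "vision" in (model.capabilities or []):
...         print(f"{model.name}: {model.cost}")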
- async ainvoke(*args, **kwargs)
Asynchronously invokes the model to get a response to a query.
See LangChain’s API documentation for details on how to use this method.
- Returns:
The LLM’s response to the prompt.
- Return type:
BaseMessage
- invoke(*args, **kwargs)
Invokes the model to get a response to a query.
See LangChain’s API documentation for details on how to use this method.
- Returns:
The LLM’s response to the prompt.
- Return type:
BaseMessage
Reranking
- class langchain_dartmouth.retrievers.document_compressors.DartmouthReranker
Reranks documents using a reranking model deployed in the Dartmouth cloud.
- Parameters:
model_name (str, optional) – The name of the reranking model to use, defaults to "bge-reranker-v2-m3".
top_n (int) – Number of documents to return, defaults to 3.
dartmouth_api_key (str, optional) – A Dartmouth API key (obtainable from https://developer.dartmouth.edu). If not specified, it is inferred from the environment variable DARTMOUTH_API_KEY.
authenticator (Callable, optional) – A Callable returning a JSON Web Token (JWT) for authentication.
jwt_url (str, optional) – URL of the Dartmouth API endpoint returning a JSON Web Token (JWT).
embeddings_server_url (str, optional) – URL pointing to an embeddings endpoint, defaults to "https://ai-api.dartmouth.edu/tei/".
Example
With an environment variable named DARTMOUTH_API_KEY pointing to your key obtained from https://developer.dartmouth.edu, using a Dartmouth-hosted reranker only takes a few lines of code:

from langchain.docstore.document import Document
from langchain_dartmouth.retrievers.document_compressors import DartmouthReranker

docs = [
    Document(page_content="Deep Learning is not..."),
    Document(page_content="Deep learning is..."),
]
query = "What is Deep Learning?"
reranker = DartmouthReranker()
ranked_docs = reranker.compress_documents(query=query, documents=docs)
print(ranked_docs)
- static list(dartmouth_api_key=None, url='https://api.dartmouth.edu/api/ai/models/')
List the models available through DartmouthReranker.
- Parameters:
dartmouth_api_key (str, optional) – A Dartmouth API key (obtainable from https://developer.dartmouth.edu). If not specified, it is inferred from the environment variable DARTMOUTH_API_KEY.
url (str, optional) – URL of the listing server
- Returns:
A list of descriptions of the available models
- Return type:
list[dict]
- compress_documents(documents, query, callbacks=None)
Returns the most relevant documents with respect to a query.
- Parameters:
documents (Sequence[Document]) – Documents to compress.
query (str) – Query to consider.
callbacks (Callbacks, optional) – Callbacks to run during the compression process, defaults to None
- Returns:
The top_n highest-ranked documents
- Return type:
Sequence[Document]
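Since DartmouthReranker implements the LangChain document-compressor interface, it can also be combined with LangChain's ContextualCompressionRetriever to rerank retrieval results. A minimal sketch, assuming base_retriever is an existing retriever (e.g., built from a vector store) that is not shown here:

from langchain.retrievers import ContextualCompressionRetriever
from langchain_dartmouth.retrievers.document_compressors import DartmouthReranker

reranker = DartmouthReranker(top_n=3)
retriever = ContextualCompressionRetriever(
    base_compressor=reranker,
    base_retriever=base_retriever,  # assumed to exist, e.g., vector_store.as_retriever()
)
relevant_docs = retriever.invoke("What is Deep Learning?")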
Model Listing
Model Listing and Information Classes.
This module provides classes and utilities for discovering and querying available AI models through Dartmouth’s infrastructure. It includes functionality for listing models from both on-premises and cloud-based services, along with detailed metadata about each model’s capabilities, costs, and properties.
The primary classes in this module are:
- ModelInfo: A Pydantic model representing detailed information about a specific AI model, including its capabilities, hosting location, and cost.
- DartmouthModelListing: Interface for listing on-premises models hosted by Dartmouth.
- CloudModelListing: Interface for listing cloud-based models available through Dartmouth Chat.
Model listing functions are integrated into the respective model class interfaces.
For example, you can call ChatDartmouth.list() or
DartmouthEmbeddings.list() to discover available models for those specific
use cases.
Examples
Listing available cloud models:
>>> from langchain_dartmouth.model_listing import CloudModelListing
>>> import os
>>> listing = CloudModelListing(
... api_key=os.environ["DARTMOUTH_CHAT_API_KEY"],
... url=os.environ["LCD_CLOUD_BASE_URL"]
... )
>>> models = listing.list(base_only=True)
>>> for model in models:
... print(f"{model.name}: {model.description}")
Accessing model information:
>>> model = models[0]
>>> print(f"Model ID: {model.id}")
>>> print(f"Capabilities: {model.capabilities}")
>>> print(f"Cost: {model.cost}")
>>> print(f"Local hosting: {model.is_local}")
Notes
The model listing functionality requires appropriate API credentials and access to Dartmouth’s AI infrastructure. Model availability and metadata may vary based on your access level and the current deployment configuration.
- class langchain_dartmouth.model_listing.ModelInfo
A class representing information about a model.
This class encapsulates metadata about language models, embedding models, and other AI models available through Dartmouth’s infrastructure. It provides a structured way to access model properties, capabilities, and configuration details.
The class automatically processes and validates model metadata from API responses, extracting relevant information from nested structures and tags.
- id
Unique identifier used to access the model in API calls.
- Type:
str
- name
Human-readable name of the model for display purposes.
- Type:
str | None
- description
Detailed description of the model’s purpose and characteristics, as displayed in the Dartmouth Chat interface.
- Type:
str | None
- is_embedding
Flag indicating whether this model can be used for generating embeddings.
- Type:
bool | None
- capabilities
List of model capabilities such as ‘vision’, ‘tool calling’, ‘reasoning’, etc.
- Type:
list[str] | None
- is_local
Indicates model hosting location:
- True: Model is hosted on-premises by Dartmouth
- False: Model is hosted off-premises by a third-party provider
- Type:
bool | None
- cost
Relative cost indicator for model usage:
- "free": No cost for usage
- "$" to "$$$$": Increasing cost levels
- "undefined": Cost information not available
- Type:
Literal["undefined", "free", "$", "$$", "$$$", "$$$$"] | None
Examples
>>> model = ModelInfo(
...     id="gpt-4",
...     name="GPT-4",
...     description="Advanced language model",
...     is_embedding=False,
...     capabilities=["vision", "tool calling"],
...     is_local=False,
...     cost="$$$"
... )
>>> print(model.id)
gpt-4
>>> print(model.capabilities)
['vision', 'tool calling']
- class langchain_dartmouth.model_listing.DartmouthModelListing(api_key, url)
Interface for listing on-premises models hosted by Dartmouth.
This class provides access to models hosted on Dartmouth’s on-premises infrastructure. It handles authentication using Dartmouth API keys and supports filtering models by various criteria.
- Parameters:
api_key (str) – Dartmouth API key for authentication.
url (str) – Base URL of the Dartmouth model listing API.
Examples
>>> from langchain_dartmouth.model_listing import DartmouthModelListing
>>> import os
>>> listing = DartmouthModelListing(
...     api_key=os.environ["DARTMOUTH_API_KEY"],
...     url="https://api.dartmouth.edu/models/"
... )
>>> models = listing.list(type="llm")
>>> for model in models:
...     print(model["id"])
Notes
Authentication is performed using JWT tokens obtained via the dartmouth_auth package. The token is automatically refreshed if a request fails due to authentication issues.
- list(**kwargs)
Get a list of available on-premises models.
Retrieves models from Dartmouth’s on-premises infrastructure, with optional filtering by server, type, or capabilities.
- Parameters:
**kwargs (dict) –
Optional filtering parameters:
- server (str) – Filter by specific server name
- type (str) – Filter by model type (e.g., "llm", "embedding")
- capabilities (str or list[str]) – Filter by model capabilities
- Returns:
List of model descriptions as dictionaries. Each dictionary contains model metadata including id, name, capabilities, etc.
- Return type:
List[dict]
- Raises:
requests.HTTPError – If the API request fails after retry with re-authentication.
Examples
List all models:
>>> models = listing.list()
Filter by model type:
>>> llm_models = listing.list(type="llm")
Filter by capabilities:
>>> vision_models = listing.list(capabilities="vision")
Notes
If the initial request fails, the method automatically attempts to re-authenticate and retry the request once.
- class langchain_dartmouth.model_listing.CloudModelListing(api_key, url)
Interface for listing cloud-based models available through Dartmouth Chat.
This class provides access to models available through Dartmouth’s cloud infrastructure, including both base models and customized/fine-tuned variants. It returns structured ModelInfo objects with detailed metadata.
- Parameters:
api_key (str) – API key for Dartmouth Chat authentication.
url (str) – Base URL of the cloud model listing API.
Examples
>>> from langchain_dartmouth.model_listing import CloudModelListing
>>> import os
>>> listing = CloudModelListing(
...     api_key=os.environ["DARTMOUTH_CHAT_API_KEY"],
...     url=os.environ["LCD_CLOUD_BASE_URL"]
... )
>>> models = listing.list(base_only=True)
>>> for model in models:
...     print(f"{model.name} - Cost: {model.cost}")
Notes
Cloud models are accessed through bearer token authentication. The returned ModelInfo objects provide structured access to model metadata including capabilities, costs, and hosting details.
- list(base_only=False)
Get a list of available cloud models.
Retrieves models from Dartmouth’s cloud infrastructure, with the option to filter for only base models or include customized variants.
- Parameters:
base_only (bool, optional) – If True, return only base models. If False (default), return both base models and customized/fine-tuned variants.
- Returns:
List of ModelInfo objects containing detailed metadata about each available model. Only active models are included.
- Return type:
List[ModelInfo]
- Raises:
requests.HTTPError – If the API request fails.
Examples
List all available models:
>>> all_models = listing.list()
List only base models:
>>> base_models = listing.list(base_only=True)
Access model details:
>>> for model in base_models:
...     if "vision" in model.capabilities:
...         print(f"{model.name} supports vision")
Notes
The method automatically filters out inactive models from the results. Model metadata is validated and structured using the ModelInfo Pydantic model, which extracts capabilities, costs, and other properties from the API response.
Base URLs
Configuration definitions for langchain_dartmouth.
This module contains base URLs and configuration constants used throughout
the langchain_dartmouth library. All URLs can be overridden via environment
variables.
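For example, to point the library at a different Chat endpoint, you would set the corresponding variable before importing the library (the URL below is hypothetical; since these are module-level constants, they are presumably read at import time):

import os

# Hypothetical endpoint; set the variable before importing langchain_dartmouth
os.environ["LCD_CLOUD_BASE_URL"] = "https://chat.example.edu/api/"

from langchain_dartmouth.llms import ChatDartmouth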
- langchain_dartmouth.definitions.EMBEDDINGS_BASE_URL = 'https://api.dartmouth.edu/api/ai/tei/'
Base URL for the embeddings API endpoint.
Can be overridden by setting the LCD_EMBEDDINGS_BASE_URL environment variable. Defaults to https://api.dartmouth.edu/api/ai/tei/.
- Type:
str
- langchain_dartmouth.definitions.RERANK_BASE_URL = 'https://api.dartmouth.edu/api/ai/tei/'
Base URL for the reranking API endpoint.
Can be overridden by setting the LCD_RERANK_BASE_URL environment variable. Defaults to https://api.dartmouth.edu/api/ai/tei/.
- Type:
str
- langchain_dartmouth.definitions.LLM_BASE_URL = 'https://api.dartmouth.edu/api/ai/tgi/'
Base URL for the Large Language Model API endpoint.
Can be overridden by setting the LCD_LLM_BASE_URL environment variable. Defaults to https://api.dartmouth.edu/api/ai/tgi/.
- Type:
str
- langchain_dartmouth.definitions.CLOUD_BASE_URL = 'https://chat.dartmouth.edu/api/'
Base URL for the Dartmouth Chat API endpoint.
Can be overridden by setting the LCD_CLOUD_BASE_URL environment variable. Defaults to https://chat.dartmouth.edu/api/.
- Type:
str
- langchain_dartmouth.definitions.MODEL_LISTING_BASE_URL = 'https://api.dartmouth.edu/api/ai/models/'
Base URL for the model listings API endpoint.
Can be overridden by setting the LCD_MODEL_LISTINGS_BASE_URL environment variable. Defaults to https://api.dartmouth.edu/api/ai/models/.
- Type:
str