Building Chains#
Applications built around Large Language Models (LLMs) often have a pipeline structure: Input is pre-processed into a prompt, the model is invoked using this prompt, and the model’s response may be post-processed to generate the desired output data structure or format.
This pattern can be represented abstractly as a chain of transformations, where each transformation is handled by invoking a component. Let's re-use one of the chains from the previous recipe on output parsing, this time applying a prompt template instead of basic string manipulation.
from dotenv import find_dotenv, load_dotenv
load_dotenv(find_dotenv())
True
from langchain_dartmouth.llms import ChatDartmouth
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import JsonOutputParser
UNSTRUCTURED_TEXT = """The original, historic library building is the Fisher Ames Baker Memorial Library; it opened in 1928 with a collection of 240,000 volumes. The building was designed by Jens Fredrick Larson, modeled after Independence Hall in Philadelphia, and funded by a gift to Dartmouth College by George Fisher Baker in memory of his uncle, Fisher Ames Baker, Dartmouth class of 1859. The facility was expanded in 1941 and 1957–1958 and received its one millionth volume in 1970.
In 1992, John Berry and the Baker family donated US $30 million for the construction of a new facility, the Berry Library designed by architect Robert Venturi, adjoining the Baker Library. The new complex, the Baker-Berry Library, opened in 2000 and was completed in 2002.[6] The Dartmouth College libraries presently hold over 2 million volumes in their collections."""
# Setting up the components
prompt = PromptTemplate(
    template="Extract a succinct timeline of events directly related to the Library from the following text. Return the timeline as a list of dictionaries, where each dictionary has two keys: 'year' and 'event'. Format your output as JSON. The text: \n\n{unstructured_text}"
)
llm = ChatDartmouth(model_name="llama-3-1-8b-instruct")
parser = JsonOutputParser()
# Invoking the components in sequence
formatted_prompt = prompt.invoke({"unstructured_text": UNSTRUCTURED_TEXT})
llm_response = llm.invoke(formatted_prompt)
timeline = parser.invoke(llm_response)
# Print the events in the timeline
for event in timeline:
    print(event)
{'year': 1928, 'event': 'Fisher Ames Baker Memorial Library opened with a collection of 240,000 volumes'}
{'year': 1941, 'event': 'Library expansion'}
{'year': 1957, 'event': 'Library expansion started'}
{'year': 1958, 'event': 'Library expansion completed'}
{'year': 1970, 'event': 'Library received its one millionth volume'}
{'year': 1992, 'event': 'John Berry and the Baker family donated $30 million for a new library'}
{'year': 2000, 'event': 'Baker-Berry Library opened'}
{'year': 2002, 'event': 'Baker-Berry Library completed'}
As we can see, the final result is indeed generated by invoking each component with the output of the previous component. To make this sequential processing more elegant, LangChain offers the concept of chains. We compose chains by concatenating the components with the | operator:
timeline_extraction_chain = prompt | llm | parser
The | operator works very similarly to the pipe operator in a Unix/Linux shell: the output of the component to the left of the operator is passed (or piped) as input to the component on the right.
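To make this correspondence explicit, here is a quick sketch (re-using the components defined above; nested_timeline is just a throwaway name for illustration) showing that the chain performs the same processing as nesting the individual invoke calls from earlier:
# Piping is equivalent to nesting the individual invoke calls:
# the prompt's output feeds the LLM, and the LLM's output feeds the parser.
nested_timeline = parser.invoke(
    llm.invoke(prompt.invoke({"unstructured_text": UNSTRUCTURED_TEXT}))
)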
The chain we just created can now be invoked with the input required by the first component and produces the output of the final component:
timeline_extraction_chain.invoke({"unstructured_text": UNSTRUCTURED_TEXT})
[{'year': 1928,
'event': 'Fisher Ames Baker Memorial Library opens with a collection of 240,000 volumes'},
{'year': 1941, 'event': 'Building expansion'},
{'year': 1957, 'event': 'First year of building expansion'},
{'year': 1958, 'event': 'Second year of building expansion'},
{'year': 1970, 'event': 'Reaching one million volumes'},
{'year': 1992,
'event': 'John Berry and the Baker family donated $30 million for a new facility'},
{'year': 2000, 'event': 'Baker-Berry Library opens'},
{'year': 2002, 'event': 'Baker-Berry Library completion'}]
Summary#
When stringing together components with the | operator, we can replace a sequence of calls to the various components' invoke methods with a single call to chain.invoke. This is the main benefit of using chains: they allow us to compose multiple components into a single pipeline that can be invoked in one call.
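As a recap, here is the manual sequence next to the chain-based version, using only the components defined earlier in this recipe:
# Without a chain: call each component's invoke method and pass the result along
formatted_prompt = prompt.invoke({"unstructured_text": UNSTRUCTURED_TEXT})
llm_response = llm.invoke(formatted_prompt)
timeline = parser.invoke(llm_response)

# With a chain: compose the components once, then run the whole pipeline in a single call
timeline_extraction_chain = prompt | llm | parser
timeline = timeline_extraction_chain.invoke({"unstructured_text": UNSTRUCTURED_TEXT})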