
How to Develop Apps Using LangChain and LLMs

LangChain is the glue that connects LLMs to your data. We explain Chains, Prompts, and Agents, and how to build your first app.

Abstract Algorithms · 5 min read

TLDR: LangChain is a framework that simplifies building LLM applications. It provides abstractions for Chains (linking steps), Memory (remembering chat history), and Agents (using tools). It turns raw API calls into composable building blocks.


📖 Lego Bricks for LLM Apps

Building with the raw OpenAI API means writing the same boilerplate endlessly: formatting prompts, managing conversation history, parsing outputs, calling tools when needed.

LangChain is the Lego set: pre-assembled pieces (prompt templates, memory stores, output parsers, tool wrappers) that snap together so you can focus on logic rather than plumbing.

| Raw API | LangChain |
| --- | --- |
| Manual string formatting | ChatPromptTemplate |
| Manual history appending | ConversationBufferMemory |
| Manual tool-calling logic | AgentExecutor |
| Manual output parsing | StrOutputParser, JsonOutputParser |
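To make the comparison concrete, here is a pure-Python sketch of the boilerplate the left column implies. The helper names (`format_prompt`, `add_turn`, `parse_json_output`) are illustrative, not part of any SDK:

```python
import json

# Manual prompt formatting: what ChatPromptTemplate automates.
def format_prompt(template: str, **kwargs) -> str:
    return template.format(**kwargs)

# Manual history appending: what ConversationBufferMemory automates.
history = []
def add_turn(role: str, content: str) -> None:
    history.append({"role": role, "content": content})

# Manual output parsing: what JsonOutputParser automates.
def parse_json_output(raw: str) -> dict:
    # Models often wrap JSON in markdown fences; strip them first.
    raw = raw.strip().removeprefix("```json").removesuffix("```").strip()
    return json.loads(raw)

add_turn("user", format_prompt("Translate to French: {text}", text="Hello"))
parsed = parse_json_output('```json\n{"translation": "Bonjour"}\n```')
```

Every one of these helpers must be rewritten, tested, and maintained per project; LangChain ships them as reusable components.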

🔢 The Three Core Abstractions

A. Chains: Linking Steps

A Chain connects: User Input → Prompt Template → LLM → Output Parser.

The | operator in LCEL (LangChain Expression Language) pipes the output of one step into the next:

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

model = ChatOpenAI(model="gpt-4o")
prompt = ChatPromptTemplate.from_template("Translate to French: {text}")
chain = prompt | model | StrOutputParser()

result = chain.invoke({"text": "Hello, how are you?"})
# "Bonjour, comment allez-vous ?"

Chains are composable: the output of one chain can be piped into another chain.
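Under the hood, `|` works because every LCEL component implements Python's `__or__`. A minimal sketch of the mechanism, using a toy class and a fake LLM rather than LangChain's actual `Runnable`:

```python
class Runnable:
    """Toy version of an LCEL component: wraps a function, supports |."""
    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # a | b returns a new Runnable that runs a, then feeds b.
        return Runnable(lambda value: other.invoke(self.invoke(value)))

fill_prompt = Runnable(lambda d: f"Translate to French: {d['text']}")
fake_llm = Runnable(lambda p: {"content": "Bonjour"})
parse = Runnable(lambda msg: msg["content"])

chain = fill_prompt | fake_llm | parse
result = chain.invoke({"text": "Hello"})
# result == "Bonjour"
```

Because each stage only agrees on "output of one step is input of the next," any stage can be swapped without touching the others.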

B. Memory: State Across Turns

LLMs are stateless: each API call starts fresh. LangChain's Memory objects inject conversation history into the next prompt automatically.

from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain

memory = ConversationBufferMemory()
conversation = ConversationChain(llm=model, memory=memory)

conversation.predict(input="My name is Alice.")
conversation.predict(input="What is my name?")
# "Your name is Alice."

| Memory Type | Keeps | Best For |
| --- | --- | --- |
| ConversationBufferMemory | Full history | Short sessions |
| ConversationSummaryMemory | LLM-generated summary | Long sessions |
| ConversationBufferWindowMemory | Last N turns | Chatbots with context limits |
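The windowing strategy can be sketched in a few lines of plain Python: keep only the last k exchanges when building the next prompt. This is a simplified model of what ConversationBufferWindowMemory does, not the library's implementation:

```python
from collections import deque

class WindowMemory:
    """Keep only the last k (human, ai) turns, like a k-sized buffer window."""
    def __init__(self, k: int):
        self.turns = deque(maxlen=k)  # old turns fall off automatically

    def save_context(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def load_history(self) -> str:
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)

memory = WindowMemory(k=2)
memory.save_context("My name is Alice.", "Nice to meet you, Alice!")
memory.save_context("I like tea.", "Noted.")
memory.save_context("What is my name?", "Your name is Alice.")
# Only the last 2 exchanges survive; the first one has been dropped.
```

The trade-off is visible here: a small window caps token cost, but facts stated before the window (Alice's name) are lost unless a summary strategy preserves them.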

C. Agents: LLMs That Use Tools

An Agent is an LLM that can decide which tools to call based on the user's question.

from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

tools = [WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())]

# The agent prompt needs an agent_scratchpad slot for intermediate tool calls.
agent_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful research assistant."),
    ("human", "{input}"),
    MessagesPlaceholder("agent_scratchpad"),
])

agent = create_openai_tools_agent(model, tools, agent_prompt)
executor = AgentExecutor(agent=agent, tools=tools)

executor.invoke({"input": "What is the boiling point of mercury?"})
# Agent calls Wikipedia → reads result → returns answer

The Agent loop:

flowchart TD
    Q["User Question"] --> LLM["LLM: Choose Action"]
    LLM -->|calls tool| Tool["Tool (Wikipedia, Calculator, DB)"]
    Tool --> Observation["Observation (result)"]
    Observation --> LLM
    LLM -->|has enough info| Answer["Final Answer"]
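The loop above can be sketched in plain Python with a scripted stand-in for the LLM. In a real agent the model chooses the action; here the decision function and tool are hard-coded for illustration:

```python
def run_agent(llm_decide, tools: dict, question: str, max_steps: int = 5):
    """Toy agent loop: ask the 'LLM' for an action until it gives a final answer."""
    observations = []
    for _ in range(max_steps):
        action = llm_decide(question, observations)
        if action["type"] == "final":
            return action["answer"]
        # Call the chosen tool and feed the observation back into the loop.
        result = tools[action["tool"]](action["input"])
        observations.append(result)
    raise RuntimeError("Agent exceeded max_steps without answering")

# Scripted decision function standing in for the model.
def scripted_llm(question, observations):
    if not observations:
        return {"type": "tool", "tool": "wikipedia", "input": "boiling point of mercury"}
    return {"type": "final", "answer": f"According to Wikipedia: {observations[0]}"}

tools = {"wikipedia": lambda q: "Mercury boils at 356.7 °C."}
answer = run_agent(scripted_llm, tools, "What is the boiling point of mercury?")
```

The `max_steps` cap matters in practice: without it, an agent that never reaches "enough info" loops (and bills) indefinitely, which is why AgentExecutor exposes a similar `max_iterations` limit.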

โš™๏ธ Building a RAG Pipeline with LangChain

Retrieval-Augmented Generation (RAG) is the most common real-world LangChain pattern: load documents → embed them → retrieve relevant chunks → answer with context.

from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.chains import RetrievalQA

# 1. Load and split documents
loader = TextLoader("my_docs.txt")
docs = loader.load()
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(docs)

# 2. Embed and store
vectorstore = Chroma.from_documents(chunks, OpenAIEmbeddings())

# 3. Build the QA chain
qa = RetrievalQA.from_chain_type(
    llm=model,
    retriever=vectorstore.as_retriever(search_kwargs={"k": 4})
)

qa.invoke("What is the refund policy?")

flowchart LR
    Q["User Question"]
    Embed["Embed Question"]
    VDB["Vector Store\n(Chroma/FAISS)"]
    Chunks["Top-K Chunks"]
    LLM["LLM + Context"]
    A["Answer"]

    Q --> Embed --> VDB --> Chunks --> LLM --> A
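The retrieval step in the diagram is nearest-neighbor search over embedding vectors. A minimal sketch with toy 2-D vectors and cosine similarity; in the real pipeline the vectors come from OpenAIEmbeddings and the search runs inside Chroma:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "embeddings": in practice an embedding model produces these vectors.
chunks = {
    "Refunds are issued within 30 days.": [0.9, 0.1],
    "Our office is in Berlin.":           [0.1, 0.9],
    "Refund requests go to support.":     [0.8, 0.3],
}

def retrieve(query_vec, k=2):
    ranked = sorted(chunks, key=lambda c: cosine(chunks[c], query_vec), reverse=True)
    return ranked[:k]

top = retrieve([1.0, 0.2], k=2)  # toy query vector for "refund policy"
```

The two refund-related chunks rank above the unrelated one; those top-k chunks are what gets pasted into the LLM prompt as context (the `k=4` in `search_kwargs` above controls exactly this).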

🧠 LangSmith: Observability for LLM Chains

In production, you need to debug why a chain produced a wrong answer. LangSmith (LangChain's tracing backend) records every step:

  • Which prompt was sent.
  • What the LLM returned.
  • Which tool was called and with what arguments.
  • Total latency and token cost per step.

Enable tracing:

import os
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "your_key"

All chain invocations are now automatically traced.


โš–๏ธ LangChain Trade-offs

| Benefit | Risk |
| --- | --- |
| Rapid prototyping with composable building blocks | Adds abstraction layers that can obscure errors |
| Built-in integrations (100+ LLMs, vector stores, tools) | Version churn: the API changes frequently |
| Memory management out of the box | Token cost grows if the memory strategy is not tuned |
| Tracing via LangSmith | Production overhead if not carefully sampled |

When to skip LangChain: If your use case is a single LLM call with a fixed prompt, the raw API (OpenAI SDK) is simpler and more debuggable. LangChain pays off when you have multi-step chains, conditional tool use, or complex memory strategies.
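For comparison, the single-call case with the raw OpenAI SDK. This assumes the `openai` package and an `OPENAI_API_KEY` environment variable; the network call is guarded so the sketch can be read and run without credentials:

```python
import os

# The fixed prompt, built by hand: no templates, memory, or parsers needed.
messages = [
    {"role": "system", "content": "You are a translator."},
    {"role": "user", "content": "Translate to French: Hello, how are you?"},
]

if os.environ.get("OPENAI_API_KEY"):
    from openai import OpenAI

    client = OpenAI()  # reads the API key from the environment
    response = client.chat.completions.create(model="gpt-4o", messages=messages)
    print(response.choices[0].message.content)
```

Everything is visible in one screen: if the answer is wrong, the prompt you sent is right there, with no abstraction layers to dig through.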


📌 Summary

  • Chains (LCEL): Compose prompt → LLM → parser pipelines with the | operator.
  • Memory: Inject conversation history automatically. Choose the right memory type for session length.
  • Agents: LLMs that call tools in a loop until they have enough information to answer.
  • RAG: Load → chunk → embed → retrieve → answer. The most common production pattern.
  • LangSmith: Trace every chain step for debugging and cost analysis.

๐Ÿ“ Practice Quiz

  1. What does the | operator do in LangChain Expression Language (LCEL)?

    • A) It's a bitwise OR operation.
    • B) It chains the output of one step to the input of the next in a pipeline.
    • C) It runs two chains in parallel.
      Answer: B
  2. An LLM chatbot loses context after a few turns. Which LangChain component solves this?

    • A) OutputParser.
    • B) Memory (e.g., ConversationBufferMemory).
    • C) AgentExecutor.
      Answer: B
  3. When should you prefer the raw OpenAI SDK over LangChain?

    • A) Always; LangChain is too slow.
    • B) For simple single-call applications where the abstraction adds more complexity than it saves.
    • C) Only when deploying to AWS.
      Answer: B

Written by Abstract Algorithms (@abstractalgorithms)