
AI Agents Explained: When LLMs Start Using Tools

An LLM can talk, but an AI Agent can *act*. We explain how Agents use the ReAct framework to browse the web, run code, and call APIs until the goal is reached.

Abstract Algorithms · 5 min read

TLDR: A standard LLM is a brain in a jar — it can reason but cannot act. An AI Agent connects that brain to tools (web search, code execution, APIs). Instead of just answering a question, an agent executes a loop of Thought → Action → Observation until the goal is reached.


📖 Brain in a Jar vs Brain with Arms

A plain LLM generates text. Give it "What is the weather in Tokyo today?" and it will:

  • Answer from training data (which is months or years old).
  • Confidently hallucinate a plausible-sounding answer.

An AI agent would:

  1. Recognize it needs current weather data.
  2. Call a weather API tool.
  3. Return the real, live answer.

The difference: the agent can act on the world, not just describe it.
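The contrast can be sketched in a few lines of toy Python. All names here are hypothetical stand-ins (there is no real API behind `fake_weather_api`); the point is only the control flow: the agent detects that the question needs live data and calls a tool before answering.

```python
# Toy sketch: plain LLM vs. agent (hypothetical names, no real APIs).

def fake_weather_api(city: str) -> str:
    """Stand-in for a real weather API call."""
    return f"Sunny, 22°C in {city}"

def plain_llm(question: str) -> str:
    # A plain LLM answers from (stale) training data.
    return "It is probably mild in Tokyo this time of year."

def agent(question: str) -> str:
    # Step 1: recognize the question needs current data.
    if "weather" in question.lower():
        # Step 2: call the tool.
        observation = fake_weather_api("Tokyo")
        # Step 3: return the real, live answer.
        return f"Live data: {observation}"
    return plain_llm(question)

print(agent("What is the weather in Tokyo today?"))
```

A real agent replaces the `if "weather"` check with the model's own reasoning, but the shape of the decision is the same.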


⚙️ The ReAct Loop: Thought → Action → Observation

The dominant pattern for agents is ReAct (Reasoning + Acting). The model cycles through three steps until the task is complete:

| Step | Type | Content |
| --- | --- | --- |
| 1 | Thought | I need to find out when the movie Titanic was released. |
| 2 | Action | `search("Titanic movie release date")` |
| 3 | Observation | "Titanic was released in December 1997." |
| 4 | Thought | Now I need to find who was US president in December 1997. |
| 5 | Action | `search("US President December 1997")` |
| 6 | Observation | "Bill Clinton was US President in December 1997." |
| 7 | Thought | I have all the information. I can answer. |
| 8 | Final Answer | Bill Clinton was president when Titanic was released. |
```mermaid
flowchart TD
    Start([User Goal]) --> T[Thought: What do I need?]
    T --> A[Action: Call a Tool]
    A --> O[Observation: Tool Result]
    O --> D{Goal reached?}
    D -- No --> T
    D -- Yes --> Answer([Return Final Answer])
```

This loop continues until the model decides it has enough information to answer.
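To make the loop concrete, here is a minimal hand-rolled version with no framework at all. The "model" is a hard-coded script so the example actually runs; in a real agent, each `(thought, action, argument)` tuple would come from an LLM call, and the tool results would be fed back into the next prompt.

```python
# Minimal ReAct loop sketch: scripted "model", canned tool results.

TOOLS = {
    "search": lambda q: {
        "Titanic movie release date": "Titanic was released in December 1997.",
        "US President December 1997": "Bill Clinton was US President in December 1997.",
    }.get(q, "No results."),
}

# Scripted model output: (thought, action, argument); action=None means done.
SCRIPT = [
    ("I need the Titanic release date.", "search", "Titanic movie release date"),
    ("Now I need the president at that time.", "search", "US President December 1997"),
    ("I have all the information. I can answer.", None, None),
]

def react_loop(goal: str, max_iterations: int = 10) -> str:
    history = [f"Goal: {goal}"]
    for thought, action, arg in SCRIPT[:max_iterations]:
        history.append(f"Thought: {thought}")
        if action is None:                    # goal reached: exit the loop
            return "Bill Clinton was president when Titanic was released."
        observation = TOOLS[action](arg)      # Action -> Observation
        history.append(f"Action: {action}({arg!r})")
        history.append(f"Observation: {observation}")
    return "Stopped: iteration limit reached."

print(react_loop("Who was US president when Titanic was released?"))
```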


🔢 Tool Definitions: How an Agent Knows What It Can Do

A tool is a function the model can call. In LangChain you define tools with a name, description, and input schema:

```python
from langchain.tools import tool

@tool
def search_web(query: str) -> str:
    """Search the web for current information. Use this for recent events or facts."""
    return web_search_api(query)   # placeholder: your actual search backend

@tool
def run_python(code: str) -> str:
    """Execute Python code and return the output. Use this for calculations."""
    return exec_sandbox(code)      # placeholder: your sandboxed code executor
```

The model receives the tool descriptions in its system prompt and decides which to call (and with what arguments) based on the task. It never sees the implementation — only the name and docstring.
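The mechanism is simple enough to sketch without a framework. This is a hypothetical miniature (the `tool` decorator and `render_system_prompt` here are our own, not LangChain's API, though LangChain does the equivalent with JSON schemas): a decorator records each function's name, docstring, and argument names in a registry, and the registry is rendered into the system prompt.

```python
# Sketch: how tool metadata (name + docstring + args) reaches the model.
import inspect

REGISTRY = {}

def tool(fn):
    """Register a function's metadata; the model only ever sees this."""
    REGISTRY[fn.__name__] = {
        "description": inspect.getdoc(fn),
        "args": list(inspect.signature(fn).parameters),
        "fn": fn,  # the implementation stays server-side
    }
    return fn

@tool
def search_web(query: str) -> str:
    """Search the web for current information."""
    return f"results for {query}"

def render_system_prompt() -> str:
    lines = ["You can call these tools:"]
    for name, meta in REGISTRY.items():
        lines.append(f"- {name}({', '.join(meta['args'])}): {meta['description']}")
    return "\n".join(lines)

print(render_system_prompt())
```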


🧠 Building a Simple Agent with LangChain

```python
from langchain.agents import create_react_agent, AgentExecutor
from langchain_openai import ChatOpenAI
from langchain import hub

llm = ChatOpenAI(model="gpt-4o-mini")
tools = [search_web, run_python]

prompt = hub.pull("hwchase17/react")          # standard ReAct prompt
agent = create_react_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

result = executor.invoke({"input": "Who was US president when Titanic was released?"})
print(result["output"])
```

The `verbose=True` flag prints the full Thought/Action/Observation chain as the agent runs — invaluable for debugging.


🌍 Real-World Agent Use Cases

| Use case | Tools used |
| --- | --- |
| Customer support triage | CRM lookup, ticket creation, knowledge base search |
| Data analyst bot | SQL runner, Python executor, chart renderer |
| Code reviewer agent | GitHub file reader, linter, test runner |
| Travel booking | Flight search API, hotel API, calendar API |
| Research assistant | Web search, PDF reader, citation manager |

⚖️ When Agents Fail: Hallucinations, Loops, and Cost Blowouts

Agents introduce new failure modes beyond plain LLMs:

Hallucinated tool calls — the model invents arguments or calls a non-existent tool. Fix: validate tool schemas strictly; use structured outputs.

Infinite loops — the agent gets stuck in Thought→Action→Observation cycles with no progress. Fix: set a hard max_iterations limit.

Cost explosion — each loop iteration is an API call + tool call. A task that needs 15 iterations with GPT-4 can cost $1 per query. Fix: use cheaper models for planning steps; cache repeated tool results.

Context overflow — long observation histories can push earlier context out of the window. Fix: summarize or prune old observations periodically.
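Three of these guards (iteration cap, tool-result caching, observation pruning) fit in a few lines. This is a framework-free sketch with hypothetical helper names; `cached_search` stands in for a real paid API call.

```python
# Guardrail sketch: hard iteration cap, cached tool calls, pruned context.
from functools import lru_cache

MAX_ITERATIONS = 8     # hard stop against infinite loops and cost blowouts
MAX_OBSERVATIONS = 4   # keep only recent observations in the context window

@lru_cache(maxsize=256)  # identical tool calls hit the cache, not the API
def cached_search(query: str) -> str:
    return f"results for {query}"   # stand-in for a real (paid) API call

def run_agent(goal: str, queries: list[str]) -> list[str]:
    observations: list[str] = []
    for i, q in enumerate(queries):
        if i >= MAX_ITERATIONS:     # loop / cost guard
            break
        observations.append(cached_search(q))
        observations = observations[-MAX_OBSERVATIONS:]  # prune old context
    return observations

obs = run_agent("demo", ["a", "b", "a", "c", "d", "e"])
print(len(obs), cached_search.cache_info().hits)
```

In production you would also log per-iteration token and tool costs, not just cap the loop count.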


📌 Key Takeaways

  • A plain LLM generates text; an agent generates text and calls tools to act.
  • The dominant loop is ReAct: Thought → Action → Observation, repeated until the task is complete.
  • Tools are functions with a name and description; the LLM decides when and how to call them.
  • Key failure modes: hallucinated tool calls, infinite loops, cost explosion, and context overflow.
  • Always set max_iterations and monitor tool call costs in production.

🧩 Test Your Understanding

  1. What is the difference between an LLM and an AI agent?
  2. In the ReAct pattern, what triggers the agent to stop looping?
  3. Why are tool descriptions (docstrings) so important for agent reliability?
  4. Name two ways to prevent an agent from running up an unexpectedly large API bill.
