Building Your First AI Agent with Claude Agent SDK: A Practical Guide

As a Senior Software Engineer with a decade of experience across Frontend, Web3, and now deeply immersed in AI, I've had the privilege of witnessing – and contributing to – some incredible technological shifts. Generative AI, particularly the explosion of sophisticated Large Language Models (LLMs) like Claude, represents one of the most exciting. But here's the thing: while LLMs are powerful, their true potential often lies not just in their ability to generate text, but in their capacity to act. This is where AI agents come in.

An AI agent, in essence, is an autonomous or semi-autonomous program that can perceive its environment, make decisions, and take actions to achieve specific goals. Imagine a personal assistant that not only understands your requests but can also browse the web, write code, interact with APIs, and complete multi-step tasks – all by itself. Sound futuristic? Not anymore.

Today, I’m thrilled to guide you through building your very first AI agent using Anthropic's Claude Agent SDK. We’re going to move beyond just prompting and dive into practical agentic workflows. By the end of this post, you'll have a functional agent capable of leveraging tools to achieve tasks autonomously.

Why Agents? Why Claude Agent SDK?

You might be asking, "Why bother with agents when I can just prompt Claude directly?" The answer lies in complexity and autonomy. Single prompts are great for single-turn tasks. But for multi-step problems, where information needs to be gathered, processed, and acted upon sequentially – often involving external tools – agents are indispensable. They provide:

Autonomy: The agent can decide the next best action, not you.
Tool Use: Agents can be equipped with various tools (web search, code interpreters, APIs) to extend their capabilities beyond pure text generation.
Persistence: They can maintain context across multiple turns and interactions.
Problem-Solving: Agents can break down complex problems into smaller, manageable sub-tasks.

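The Claude Agent SDK, built on top of Anthropic's Claude models, offers a solid starting point. In this post we work with the @anthropic-ai/sdk TypeScript package and the tool-use support in its Messages API: tools are defined as JSON schemas, conversation state lives in a messages array, and we write the loop that orchestrates the agent's decision-making ourselves.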

Getting Started: Setup and Prerequisites

Before we dive into code, let's get our environment ready. You'll need:

Node.js (v18 or higher): For running our TypeScript/JavaScript code.
npm or yarn: Package managers.
Anthropic API Key: You can obtain one from the Anthropic console. Ensure you have access to models like Claude 3 Opus, Sonnet, or Haiku. Opus is generally preferred for agents due to its superior reasoning capabilities.

Let's create a new project directory and initialize it:

mkdir my-first-claude-agent
cd my-first-claude-agent
npm init -y
npm install @anthropic-ai/sdk dotenv
npm install -D typescript ts-node @types/node
npx tsc --init

Now, open your tsconfig.json and ensure target is set to es2020 or higher, and moduleResolution to node.

Create a .env file in your project root and add your API key:

ANTHROPIC_API_KEY="sk-YOUR_ANTHROPIC_API_KEY_HERE"
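A missing or empty key tends to surface later as a confusing authentication error, so it can help to fail early. Here is a minimal sketch of that idea; requireEnv is a hypothetical helper of my own, not something the Anthropic SDK or dotenv provides, and it takes the environment as an argument so it stays easy to test.

```typescript
// Hypothetical helper (not part of the Anthropic SDK or dotenv):
// read a required variable and fail early with a clear message.
export function requireEnv(
  name: string,
  env: Record<string, string | undefined>,
): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(
      `Missing environment variable ${name}. Did you create the .env file?`,
    );
  }
  return value;
}

// Usage, after `import "dotenv/config"` has loaded the .env file:
// const apiKey = requireEnv("ANTHROPIC_API_KEY", process.env);
```

Passing the result to the client constructor instead of reading process.env directly means a typo in the variable name stops the program before any API call is made.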

Defining Our Agent's Tools

The power of an agent comes from its tools. For our first agent, let's give it two essential capabilities:

Web Search: To gather information from the internet.
Code Interpreter: To execute JavaScript/TypeScript code, useful for calculations, data manipulation, and simple scripting.

1. Implementing the Web Search Tool

We'll use a simple mock function for web search to keep things focused. In a real application, you'd integrate with a search API (e.g., Google Custom Search, SerpAPI, Tavily API).

Create a file src/tools.ts:

// src/tools.ts
import Anthropic from "@anthropic-ai/sdk";

// The tool definition type is exposed under the SDK's Messages namespace
type Tool = Anthropic.Messages.Tool;

export const webSearchTool: Tool = {
  name: "web_search",
  description: "Searches the internet for information based on a query.",
  input_schema: {
    type: "object",
    properties: {
      query: {
        type: "string",
        description: "The search query.",
      },
    },
    required: ["query"],
  },
};

// A mock implementation of the web search function
export async function executeWebSearch(query: string): Promise<string> {
  console.log(`Executing web search for: "${query}"`);
  // In a real application, you'd call a search API here
  // For demonstration, we'll return a hardcoded response
  if (query.toLowerCase().includes("current weather in new york")) {
    return "The current temperature in New York City is 68°F and partly cloudy. (Mock data)";
  } else if (query.toLowerCase().includes("population of france")) {
    return "The estimated population of France in 2023 is approximately 68 million people. (Mock data)";
  } else if (query.toLowerCase().includes("capital of australia")) {
    return "The capital of Australia is Canberra. (Mock data)";
  }
  return `Search result for "${query}": No specific information found in mock data. (Mock data)`;
}

2. Implementing the Code Interpreter Tool

This tool will allow our agent to run simple JavaScript code. Be extremely cautious when implementing and using a real code interpreter in a production environment: the code is written by the model, so it should be treated as untrusted input. For this demo, we'll use a very basic eval, which is not secure for untrusted input.

Add to src/tools.ts:

// src/tools.ts (continued)

export const codeInterpreterTool: Tool = {
  name: "code_interpreter",
  description: "Executes JavaScript code and returns the output. Useful for calculations, data manipulation, or scripting. Be careful with security.",
  input_schema: {
    type: "object",
    properties: {
      code: {
        type: "string",
        description: "The JavaScript code to execute.",
      },
    },
    required: ["code"],
  },
};

// A highly insecure but simple implementation of a code interpreter
// DO NOT USE IN PRODUCTION WITH UNTRUSTED INPUT
export async function executeCodeInterpreter(code: string): Promise<string> {
  console.log(`Executing code:\n${code}`);
  try {
    const result = eval(code); // UNSAFE for production environments!
    return String(result);
  } catch (error: any) {
    return `Error executing code: ${error.message}`;
  }
}
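If you want something a little more contained than a bare eval while experimenting, one option is Node's built-in vm module. The sketch below is an assumption of mine rather than part of the original agent: runInIsolatedContext is a hypothetical name, and the Node documentation is explicit that vm is not a security mechanism. It only keeps the code away from this module's local variables and bounds how long synchronous code may run.

```typescript
import * as vm from "node:vm";

// Hypothetical alternative to eval. NOTE: node:vm is NOT a security
// boundary; do not rely on it to run hostile code.
export function runInIsolatedContext(code: string, timeoutMs = 1000): string {
  try {
    // Fresh, empty global object per call; the timeout stops runaway loops
    const result = vm.runInNewContext(code, {}, { timeout: timeoutMs });
    return String(result);
  } catch (error) {
    // Errors created inside the new context are not `instanceof Error`
    // in this realm, so read the message property structurally.
    const message =
      typeof error === "object" && error !== null && "message" in error
        ? String((error as { message: unknown }).message)
        : String(error);
    return `Error executing code: ${message}`;
  }
}
```

Like eval, runInNewContext returns the value of the last expression, so it could be swapped in for the body of executeCodeInterpreter without changing what the agent sees. For anything beyond a demo, a separate process or container with no credentials is the safer direction.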

Building Our Claude Agent

Now, let's put it all together. We'll create our agent in src/agent.ts.

// src/agent.ts
import Anthropic from "@anthropic-ai/sdk";
import "dotenv/config"; // Load environment variables
import { executeWebSearch, executeCodeInterpreter, webSearchTool, codeInterpreterTool } from "./tools";

// Initialize Anthropic client
const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

async function runAgent(prompt: string) {
  console.log(`\n--- Agent Starting: Task - "${prompt}" ---`);
  let messages: Anthropic.Messages.MessageParam[] = [
    {
      role: "user",
      content: prompt,
    },
  ];

  const availableTools = [
    webSearchTool,
    codeInterpreterTool,
  ];

  for (let i = 0; i < 10; i++) { // Limit iterations to prevent infinite loops
    console.log(`\n--- Agent Iteration ${i + 1} ---`);
    const response = await anthropic.messages.create({
      model: "claude-3-opus-20240229", // Or claude-3-sonnet-20240229, claude-3-haiku-20240307
      max_tokens: 4096,
      // The system prompt is a top-level parameter; "system" is not a valid message role
      system:
        "You are a helpful AI assistant. Use the provided tools when they help you answer. " +
        "When you have enough information to answer the user's request thoroughly, respond with the final answer as plain text.",
      messages, // Full history: user prompt, assistant turns, and tool results
      tools: availableTools,
    });

    const toolUseBlocks = response.content.filter(
      (block): block is Anthropic.Messages.ToolUseBlock => block.type === "tool_use"
    );

    if (toolUseBlocks.length > 0) {
      console.log("Agent wants to use tools:");
      // Echo the assistant turn back unchanged so every tool_use id stays in the history
      messages.push({ role: "assistant", content: response.content });

      const toolResults: Anthropic.Messages.ToolResultBlockParam[] = [];
      for (const toolUse of toolUseBlocks) {
        console.log(` - Tool: ${toolUse.name}, Input: ${JSON.stringify(toolUse.input)}`);

        let toolResult: string;
        // The input follows the tool's input_schema, but is typed as unknown by the SDK
        const toolInput = toolUse.input as { query?: string; code?: string };

        if (toolUse.name === "web_search") {
          toolResult = await executeWebSearch(toolInput.query ?? "");
        } else if (toolUse.name === "code_interpreter") {
          toolResult = await executeCodeInterpreter(toolInput.code ?? "");
        } else {
          toolResult = `Error: Unknown tool "${toolUse.name}"`;
        }

        // Each result references the id of the tool_use block it answers
        toolResults.push({
          type: "tool_result",
          tool_use_id: toolUse.id,
          content: toolResult,
        });
      }

      // Tool results go back to Claude together, in a single user message
      messages.push({ role: "user", content: toolResults });
    } else {
      // No tool calls: treat the text block as the final answer
      const textBlock = response.content.find(block => block.type === "text");
      if (textBlock && textBlock.type === "text") {
        console.log(`\nAgent Final Answer:\n${textBlock.text}`);
        return textBlock.text;
      }
      console.log("Agent completed turn without text response or tool use. stop_reason:", response.stop_reason);
      return "Agent completed its turn. No further action or text response. Task might be effectively completed.";
    }
  }
  }

  return "Agent reached iteration limit without completing the task or providing a final answer.";
}

// Example usage
(async () => {
  // Task 1: Find information using web search
  await runAgent("What is the capital of Australia?");

  // Task 2: Perform a calculation using the code interpreter
  await runAgent("What is the result of (123 * 45) + 678?");

  // Task 3: Combine web search and a simple calculation
  await runAgent("What is the current weather in New York City, and if it's 68°F, what is that in Celsius? (C = (F - 32) * 5/9)");

  // Task 4: A more complex code interpretation
  await runAgent("I have a list of numbers: [10, 25, 30, 45, 50]. Please write JS code to filter out numbers greater than 30 and then sum the remaining numbers.");

})();

Let's break down the runAgent function's logic:

Initialization: We set up the Anthropic client and initialize messages with the user's prompt. We also define our availableTools.
Iterative Loop: The agent operates in a loop, allowing it to take multiple steps. We cap the loop at 10 iterations to prevent runaway loops.
Claude Call: Inside the loop, we call anthropic.messages.create.

We provide a clear system prompt instructing Claude on its role and how to use tools.
The messages array keeps the conversation history, including previous user prompts, assistant responses, and tool results.
Crucially, we pass our availableTools to Claude. This is what allows Claude to choose to use them.
Model: claude-3-opus-20240229 is recommended for its powerful reasoning, but Sonnet or Haiku can also work depending on complexity.

Parsing Response:

If Claude decides to use a tool (stop_reason is tool_use and the response contains tool_use blocks), we iterate through each tool_use block.
We execute the corresponding function (executeWebSearch, executeCodeInterpreter).
The result of the tool execution is then added back to the messages history as a tool_result where the role is user. This is crucial: the agent receives the tool's output as if the user provided it, allowing it to continue its reasoning.

Final Answer: If Claude's response includes a text block (and no tool_use), it means it believes it has completed the task and is providing the final answer. The function returns that text, which ends the loop.
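The parsing step above can also be pulled out into a small function that is testable without calling the API. This is a sketch under my own assumptions: runTools and the local ToolUse, ToolResult, and ToolHandler types are hypothetical stand-ins that mirror the block shapes used in this post (in the real agent the types come from @anthropic-ai/sdk), and is_error is the optional flag a tool_result block can carry to mark a failed call.

```typescript
// Local stand-ins for the SDK's block types (assumptions for this sketch)
interface ToolUse {
  type: "tool_use";
  id: string;
  name: string;
  input: unknown;
}
interface ToolResult {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error?: boolean;
}
type ToolHandler = (input: Record<string, unknown>) => Promise<string>;

// Hypothetical helper: run each requested tool and build its tool_result.
export async function runTools(
  toolUses: ToolUse[],
  handlers: Map<string, ToolHandler>,
): Promise<ToolResult[]> {
  const results: ToolResult[] = [];
  for (const toolUse of toolUses) {
    const base = { type: "tool_result" as const, tool_use_id: toolUse.id };
    const handler = handlers.get(toolUse.name);
    if (!handler) {
      results.push({ ...base, content: `Error: Unknown tool "${toolUse.name}"`, is_error: true });
      continue;
    }
    try {
      const input = (toolUse.input ?? {}) as Record<string, unknown>;
      results.push({ ...base, content: await handler(input) });
    } catch (error) {
      // Report failures back to the model instead of crashing the loop
      const message = error instanceof Error ? error.message : String(error);
      results.push({ ...base, content: `Error: ${message}`, is_error: true });
    }
  }
  return results;
}
```

With a Map of handlers, adding a third tool means registering one more entry rather than extending an if/else chain, and the returned array can be pushed as the content of a single user message.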

Running Our Agent

To run this, use the ts-node we installed earlier as a dev dependency (npx picks up the local copy, so no global install is needed), or compile with tsc first.

npx ts-node src/agent.ts

You'll see a lot of console output as the agent thinks, uses tools, and processes results.

Expected Output (demonstrating agent thought process):

--- Agent Starting: Task - "What is the capital of Australia?" ---

--- Agent Iteration 1 ---
Agent wants to use tools:
 - Tool: web_search, Input: {"query":"capital of Australia"}
Executing web search for: "capital of Australia"

--- Agent Iteration 2 ---
Agent Final Answer:
The capital of Australia is Canberra.

--- Agent Starting: Task - "What is the result of (123 * 45) + 678?" ---

--- Agent Iteration 1 ---
Agent wants to use tools:
 - Tool: code_interpreter, Input: {"code":"(123 * 45) + 678"}
Executing code:
(123 * 45) + 678

--- Agent Iteration 2 ---
Agent Final Answer:
The result of (123 * 45) + 678 is 6213.

--- Agent Starting: Task - "What is the current weather in New York City, and if it's 68°F, what is that in Celsius? (C = (F - 32) * 5/9)" ---

--- Agent Iteration 1 ---
Agent wants to use tools:
 - Tool: web_search, Input: {"query":"current weather in New York City"}
Executing web search for: "current weather in New York City"

--- Agent Iteration 2 ---
Agent wants to use tools:
 - Tool: code_interpreter, Input: {"code":"(68 - 32) * 5/9"}
Executing code:
(68 - 32) * 5/9

--- Agent Iteration 3 ---
Agent Final Answer:
The current temperature in New York City is 68°F and partly cloudy. In Celsius, 68°F is approximately 20°C.

... (and so on for the other tasks)

Notice how the agent correctly identifies when to use web_search and when to use code_interpreter. For the New York weather task, it performs a search first, then uses the code interpreter to convert the temperature, demonstrating a multi-step thought process.
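To make that multi-step flow concrete, here is a sketch of what the message history could look like by the end of the weather task, together with a small check of the rule that makes the loop work: every tool_result must answer a tool_use from the assistant message right before it. The types, the toolResultsAreAnswered helper, and the toolu_01 and toolu_02 ids are all hypothetical; real ids are generated by the API.

```typescript
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown }
  | { type: "tool_result"; tool_use_id: string; content: string };

interface Message {
  role: "user" | "assistant";
  content: string | Block[];
}

// Hypothetical check: each tool_result must reference a tool_use id
// found in the assistant message immediately before it.
export function toolResultsAreAnswered(messages: Message[]): boolean {
  return messages.every((message, index) => {
    if (typeof message.content === "string") return true;
    const results = message.content.filter((b) => b.type === "tool_result");
    if (results.length === 0) return true;
    const previous = messages[index - 1];
    if (message.role !== "user" || !previous || previous.role !== "assistant") return false;
    if (typeof previous.content === "string") return false;
    const ids = new Set<string>();
    for (const block of previous.content) {
      if (block.type === "tool_use") ids.add(block.id);
    }
    return results.every((r) => r.type === "tool_result" && ids.has(r.tool_use_id));
  });
}

// Illustrative transcript for the weather task (ids are made up)
export const weatherTranscript: Message[] = [
  { role: "user", content: "What is the current weather in New York City, and what is that in Celsius?" },
  { role: "assistant", content: [
    { type: "tool_use", id: "toolu_01", name: "web_search", input: { query: "current weather in New York City" } },
  ] },
  { role: "user", content: [
    { type: "tool_result", tool_use_id: "toolu_01", content: "The current temperature in New York City is 68°F and partly cloudy. (Mock data)" },
  ] },
  { role: "assistant", content: [
    { type: "tool_use", id: "toolu_02", name: "code_interpreter", input: { code: "(68 - 32) * 5/9" } },
  ] },
  { role: "user", content: [
    { type: "tool_result", tool_use_id: "toolu_02", content: "20" },
  ] },
];
```

Each tool call adds one assistant message and one user message to the history, which is why the third iteration has everything it needs to write the final answer.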

For the last task, even though the code_interpreter is insecure and simple, Claude can formulate correct JS code to achieve the goal:

const numbers = [10, 25, 30, 45, 50];
const filteredNumbers = numbers.filter(num => num <= 30);
const sum = filteredNumbers.reduce((acc, curr) => acc + curr, 0);
sum;

This showcases the agent's ability to reason about a problem and decompose it into tool calls and then finally synthesize the information.

Next Steps and Considerations

Congratulations!