The problem: LLMs hallucinate JSON
Anyone who has shipped an LLM-powered feature knows the pain. You ask the model for "a JSON object with fields name, email, score" and 99% of the time you get exactly that. The other 1% of the time you get:
- A leading `` ```json `` code fence the parser chokes on.
- A trailing comment like `// note: score estimated`.
- A trailing comma that breaks `JSON.parse`.
- A nested string with an unescaped quote.
- A field renamed to `e_mail` because the model thought it looked nicer.
- Half the response in markdown explaining what the JSON means.
At a million calls a month, that 1% is ten thousand silent production failures. Free-form text generation is fundamentally a poor fit for machine-to-machine contracts.
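A quick way to see why these failures hurt: every one of them is a hard parser error, not a soft degradation. A minimal sketch (the strings are illustrative, not real model output):

```python
import json

FENCE = "`" * 3  # a literal triple-backtick code fence

# Near-JSON strings mirroring the failure modes listed above.
bad_outputs = [
    '{"name": "Ada", "email": "ada@example.com", "score": 9,}',  # trailing comma
    FENCE + 'json\n{"name": "Ada", "score": 9}\n' + FENCE,       # code fence
    '{"name": "Ada", "score": 9} // note: score estimated',      # trailing comment
]

for s in bad_outputs:
    try:
        json.loads(s)
    except json.JSONDecodeError as e:
        print(f"parse failed: {e}")
```

Each string looks close enough to JSON to pass a casual eyeball review, and every one of them raises `json.JSONDecodeError`.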
OpenAI shipped response_format: { type: "json_schema", strict: true } in 2024 and effectively solved this on their stack — the model's decoding is constrained at the token level so the output is grammar-guaranteed valid against the schema. Anthropic took a different route. As of 2026 there is still no json_mode flag on the native Messages API. Instead, Anthropic asks you to lean on the feature they already had: tool use.
This article is the production pattern we recommend for getting 100% valid, schema-conformant JSON out of Claude — covering the native API via tool_use forcing, the OpenAI-compatible response_format shim, Pydantic and Zod schema generation, validation + retry loops, and a comparison with GPT‑4o's native json_schema mode.
All examples point at the Claudexia gateway base URL https://api.claudexia.tech/v1, which exposes both the native Anthropic Messages API and an OpenAI-compatible Chat Completions endpoint. Pricing and model coverage are in our Claude API pricing 2026 post.
Anthropic's answer: forced tool_use
Claude's tool use feature was designed for agentic workflows — letting the model call get_weather or search_db. But the mechanism it uses to emit tool calls is exactly what we want for structured output: the model produces a tool_use block whose input field is always a valid JSON object matching the tool's input_schema.
Anthropic's decoder enforces this. The model is not "asked nicely" to produce JSON; the tool input is the JSON. Combine that with tool_choice: { type: "tool", name: "..." } to force the model to call exactly that one tool, and you have a structured-output API in everything but name.
The recipe:
- Define your output shape as a Pydantic model (Python) or a Zod schema (TypeScript).
- Convert it to a JSON Schema.
- Wrap it as a single tool with `input_schema = <your JSON Schema>`.
- Send the request with `tools=[that_tool]` and `tool_choice={"type": "tool", "name": "<your tool name>"}`.
- Read `response.content[0].input` — that's your validated object.
Python: Pydantic → JSON Schema → tool_use
import anthropic
from pydantic import BaseModel, Field
from typing import Literal

class InvoiceLineItem(BaseModel):
    description: str
    quantity: int = Field(ge=1)
    unit_price_cents: int = Field(ge=0)

class Invoice(BaseModel):
    invoice_number: str
    issued_on: str = Field(description="ISO-8601 date, e.g. 2026-04-03")
    currency: Literal["USD", "EUR", "RUB", "GBP"]
    vendor_name: str
    line_items: list[InvoiceLineItem]
    total_cents: int

client = anthropic.Anthropic(
    base_url="https://api.claudexia.tech/v1",
    api_key="sk-cxa-...",
)

extract_tool = {
    "name": "record_invoice",
    "description": "Record a parsed invoice into the accounting system.",
    "input_schema": Invoice.model_json_schema(),
}

resp = client.messages.create(
    model="claude-sonnet-4.5",
    max_tokens=2048,
    tools=[extract_tool],
    tool_choice={"type": "tool", "name": "record_invoice"},
    messages=[
        {"role": "user", "content": f"Extract the invoice:\n\n{raw_invoice_text}"}
    ],
)

raw = resp.content[0].input            # already a dict, already valid against the schema
invoice = Invoice.model_validate(raw)  # belt-and-suspenders Pydantic check
The resp.content[0].input value comes back as a Python dict that already parses cleanly and already satisfies the JSON Schema you sent. The extra Invoice.model_validate(raw) call is defensive — it gives you Pydantic's type coercion (e.g., string "42" → int 42) and triggers your custom validators.
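That coercion is easy to see in isolation. A minimal sketch, independent of any API call (Pydantic v2's default "lax" validation mode):

```python
from pydantic import BaseModel

class Score(BaseModel):
    value: int

# Lax mode coerces the numeric string "42" to the int 42.
parsed = Score.model_validate({"value": "42"})
assert parsed.value == 42
```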
TypeScript: Zod → JSON Schema → tool_use
import Anthropic from "@anthropic-ai/sdk";
import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";

const SupportTicket = z.object({
  category: z.enum(["billing", "bug", "feature_request", "abuse", "other"]),
  urgency: z.enum(["low", "normal", "high", "critical"]),
  summary: z.string().max(280),
  suggested_owner_team: z.enum(["payments", "platform", "growth", "trust_safety"]),
  contains_pii: z.boolean(),
});

const client = new Anthropic({
  baseURL: "https://api.claudexia.tech/v1",
  apiKey: process.env.CLAUDEXIA_KEY!,
});

const resp = await client.messages.create({
  model: "claude-sonnet-4.5",
  max_tokens: 1024,
  tools: [{
    name: "classify_ticket",
    description: "Classify an inbound customer support ticket.",
    input_schema: zodToJsonSchema(SupportTicket, { target: "openAi" }) as any,
  }],
  tool_choice: { type: "tool", name: "classify_ticket" },
  messages: [{ role: "user", content: ticketBody }],
});

// Type guard so TypeScript narrows the content union to a tool_use block.
const block = resp.content.find(
  (b): b is Anthropic.ToolUseBlock => b.type === "tool_use",
);
const ticket = SupportTicket.parse(block?.input);
Note target: "openAi" on zodToJsonSchema: it produces a JSON Schema variant Claude (and OpenAI) accept without complaint. The default Zod target emits $ref patterns the Anthropic API will reject.
OpenAI-compatible: response_format on the Claudexia gateway
If you have an existing codebase built against the OpenAI SDK, Claudexia's gateway accepts the OpenAI Chat Completions shape and translates response_format to forced tool_use under the hood. You don't have to rewrite anything.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.claudexia.tech/v1",
    api_key="sk-cxa-...",
)

resp = client.chat.completions.create(
    model="claude-sonnet-4.5",
    messages=[{"role": "user", "content": resume_text}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "Resume",
            "strict": True,
            "schema": Resume.model_json_schema(),
        },
    },
)

resume = Resume.model_validate_json(resp.choices[0].message.content)
This is the path of least resistance if you're migrating off OpenAI. Same SDK, same response_format semantics, Claude-quality output.
Three concrete examples
1. Extract an invoice
Input: a noisy PDF-extracted text of a vendor invoice. Output: the Invoice model from the Python snippet above. Forced tool_use means you never have to write a regex to scrape line items again — the model returns a typed list and your downstream code consumes invoice.line_items[0].unit_price_cents directly.
2. Classify a support ticket
Use the SupportTicket Zod schema. The category and urgency enums constrain the model so you can route on the result without a defensive .toLowerCase() everywhere. contains_pii is a boolean you can plumb straight into a redaction pipeline.
3. Parse a resume
class WorkExperience(BaseModel):
    company: str
    title: str
    started_on: str = Field(description="YYYY-MM")
    ended_on: str | None = Field(description="YYYY-MM, or null if current")
    highlights: list[str]

class Resume(BaseModel):
    full_name: str
    email: str | None
    phone: str | None
    years_of_experience: int = Field(ge=0, le=60)
    skills: list[str]
    experience: list[WorkExperience]
Drop in the same tool_use plumbing and you have a structured-resume API.
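The plumbing is identical to the invoice example. A sketch with the Resume trimmed to two fields for brevity (`record_resume` is a tool name chosen here for illustration, not anything fixed by the API):

```python
from pydantic import BaseModel

class Resume(BaseModel):  # trimmed; the full model is defined above
    full_name: str
    skills: list[str]

resume_tool = {
    "name": "record_resume",  # hypothetical tool name
    "description": "Record a parsed resume into the candidate database.",
    "input_schema": Resume.model_json_schema(),
}

request_kwargs = dict(
    model="claude-sonnet-4.5",
    max_tokens=2048,
    tools=[resume_tool],
    tool_choice={"type": "tool", "name": "record_resume"},
)
# client.messages.create(**request_kwargs, messages=[...])  # as in the invoice example
```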
Schema design tips
After shipping a few dozen of these in production, a handful of patterns matter more than the rest.
- Use `enum` aggressively. Every field whose value belongs to a finite set should be an enum. The model is dramatically more accurate at picking from a list than at free-form generation.
- Descriptions are not decoration. Claude reads `description` strings on every field. "ISO-8601 date" and "phone number in E.164 format" change the output. Treat them like prompts.
- Keep nested depth shallow. Two levels deep is the sweet spot. At three or four levels you start seeing the model lose track of which `{` it's inside, even with forced tool use.
- Prefer string enums over booleans for tri-state. `status: "approved" | "rejected" | "needs_review"` reasons better than two separate booleans.
- Mark optional fields explicitly. `Optional[str]` in Pydantic / `.optional()` in Zod. The model will leave them out cleanly instead of inventing `"unknown"`.
- Don't ask for free-text inside structured fields. A `summary: str` field at the end is fine. A `summary` field followed by more structured fields tends to bleed prose into the structured ones.
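Several of these tips combined in one model — a hypothetical moderation schema, not one of the examples above:

```python
from enum import Enum
from typing import Optional

from pydantic import BaseModel, Field

class Status(str, Enum):  # tri-state string enum instead of two booleans
    approved = "approved"
    rejected = "rejected"
    needs_review = "needs_review"

class ModerationResult(BaseModel):
    status: Status = Field(description="Overall outcome of the review")
    reason_code: Optional[str] = Field(
        default=None,
        description="Short machine-readable code, e.g. 'spam'; omit if approved",
    )
    # Free text goes last, after all the structured fields.
    summary: str = Field(description="One-sentence rationale for the decision")
```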
Validation and retry loop
Even with forced tool_use, your business validators (e.g., "total_cents must equal sum of line items") can still fail. The pattern is straightforward:
from pydantic import ValidationError

def call_with_retry(messages, max_attempts=3):
    for attempt in range(max_attempts):
        resp = client.messages.create(
            model="claude-sonnet-4.5",
            max_tokens=2048,
            tools=[extract_tool],
            tool_choice={"type": "tool", "name": "record_invoice"},
            messages=messages,
        )
        raw = resp.content[0].input
        try:
            return Invoice.model_validate(raw)
        except ValidationError as e:
            messages.append({"role": "assistant", "content": resp.content})
            messages.append({
                "role": "user",
                "content": f"That output failed validation: {e}. Please call the tool again with corrected values.",
            })
    raise RuntimeError("Schema validation failed after retries")
A single retry usually fixes anything, and the second attempt sees the original model output plus the validation error in context — Claude will correct itself reliably.
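The "total_cents must equal the sum of line items" check can live directly on the model, so the retry loop picks it up through the same `ValidationError` path. A sketch with the Invoice trimmed to the relevant fields:

```python
from pydantic import BaseModel, Field, ValidationError, model_validator

class LineItem(BaseModel):
    description: str
    quantity: int = Field(ge=1)
    unit_price_cents: int = Field(ge=0)

class InvoiceTotals(BaseModel):  # trimmed Invoice, cross-field check only
    line_items: list[LineItem]
    total_cents: int

    @model_validator(mode="after")
    def total_matches_line_items(self):
        expected = sum(li.quantity * li.unit_price_cents for li in self.line_items)
        if self.total_cents != expected:
            raise ValueError(
                f"total_cents is {self.total_cents} but line items sum to {expected}"
            )
        return self
```

The error message raised here is exactly what gets fed back to the model on retry, so make it specific enough to act on.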
GPT‑4o json_schema vs Claude forced tool_use
Both approaches deliver schema-conformant JSON in production. Practical differences:
- Guarantee surface. OpenAI's `strict: true` is enforced by constrained decoding at the token level — the model literally cannot emit invalid JSON. Anthropic's forced `tool_use` is enforced by a validator on the tool input; in our experience the difference is invisible at the API surface (both return valid JSON every time), but the failure mode if Claude does fail is "the call errors out" rather than "the JSON is malformed".
- Schema feature support. OpenAI's strict mode disallows `oneOf`, `not`, recursive refs, and a few other JSON Schema features. Claude's `tool_use` accepts a broader subset, including some `$ref` patterns (when emitted with the right Zod target) and richer `pattern` strings.
- Description usage. Claude weighs field descriptions more heavily than GPT‑4o does. If your schema has rich `description` fields, expect Claude to take more advantage of them.
- Latency. Forced `tool_use` adds a small fixed overhead vs free-form generation but is on par with GPT‑4o's `json_schema` mode in our benchmarks.
- Migration cost. Through the Claudexia gateway's OpenAI-compatible endpoint, you change one line — the `model` parameter — and `response_format` keeps working.
Wrap-up
Claude does not need a json_mode flag, because tool use already gives you something stronger: a typed contract enforced by the API. Define your schema once in Pydantic or Zod, force a single tool, and you have a structured-output endpoint that is as reliable as anything in the industry. The OpenAI-compatible shim on the Claudexia gateway means you can drop Claude into an existing response_format codebase without changing a line of business logic. Combine that with a retry loop on your custom validators and you have a JSON pipeline that simply does not break.