
Small Models, Big UX: Using Haiku to Make Smart Choices From Your Data

How smaller AI models like Claude Haiku can pick from preselected data sets to improve UX - fast, cheap, and surprisingly reliable.

You Don't Always Need the Biggest Model

When developers reach for an AI model, the instinct is to grab the most powerful one. But for many UX decisions, a small model like Claude Haiku is not just good enough - it's actually the better choice. Faster responses, lower costs, and when you constrain the output space, accuracy that rivals much larger models.

The key insight: if you already know the valid answers, let the small model pick from them.

The Pattern: Preselected Sets + Small Model

Instead of asking a model to generate free-form output, you give it a closed set of options and ask it to select the best match. This is fundamentally different from open-ended generation - you're turning a creative task into a classification task, and small models are great at classification.

User input (messy, unstructured)
  → Small model (fast, cheap)
    → Pick from known options (constrained, reliable)
      → Great UX (instant, accurate)
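The flow above can be condensed into one generic helper. This is a sketch, not an SDK API: the `complete` callback stands in for whatever model call you make (a Haiku request, in this article's examples), and the prompt wording is an assumption.

```typescript
// A sketch of the pattern: constrain a model to a closed option set.
// `complete` is a stand-in for your model call; its signature and the
// prompt format here are assumptions, not part of any SDK.
type Complete = (prompt: string) => Promise<string>

async function selectFromOptions<T extends string>(
  complete: Complete,
  input: string,
  options: readonly T[],
  fallback: T,
): Promise<T> {
  const prompt = `Pick exactly one of: ${options.join(', ')}

Input: "${input}"

Reply with only your pick, nothing else.`

  // Normalize the reply, then only accept answers from the known set.
  const raw = (await complete(prompt)).trim().toLowerCase()
  return options.find(o => o.toLowerCase() === raw) ?? fallback
}
```

Because the model call is injected, the selection logic is trivially testable with a stubbed completer, and every use case below is just this helper with a different option set.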

Use Case 1: Auto-Selecting Options for Users

Imagine a project management app where users create tasks. Instead of making them manually pick a category from a dropdown, you can infer it from their task description.

import Anthropic from '@anthropic-ai/sdk'

const client = new Anthropic()

const categories = ['bug', 'feature', 'docs', 'refactor', 'test', 'chore'] as const

async function categorizeTask(description: string) {
  const response = await client.messages.create({
    model: 'claude-haiku-4-5-20251001',
    max_tokens: 20,
    messages: [{
      role: 'user',
      content: `Categorize this task into exactly one of: ${categories.join(', ')}

Task: "${description}"

Reply with only the category name, nothing else.`,
    }],
  })

  const text = response.content[0].type === 'text'
    ? response.content[0].text.trim().toLowerCase()
    : ''

  // Only accept a known category; fall back to a safe default otherwise
  return categories.find(c => c === text) ?? 'chore'
}

The user types "fix the login button not responding on mobile" and instantly sees "bug" pre-selected. They can still change it, but 90% of the time they won't need to.

Use Case 2: Matching Free-Text to Known Values

Users type things in messy, inconsistent ways. A small model can map that chaos to your structured data.

const timezones = [
  'America/New_York',
  'America/Chicago',
  'America/Denver',
  'America/Los_Angeles',
  'Europe/London',
  'Europe/Paris',
  'Europe/Berlin',
  'Asia/Tokyo',
  'Asia/Shanghai',
  'Australia/Sydney',
]

async function matchTimezone(userInput: string) {
  const response = await client.messages.create({
    model: 'claude-haiku-4-5-20251001',
    max_tokens: 40,
    messages: [{
      role: 'user',
      content: `Match this user input to the closest timezone from the list.

Input: "${userInput}"
Options: ${timezones.join(', ')}

Reply with only the timezone identifier.`,
    }],
  })

  const text = response.content[0].type === 'text'
    ? response.content[0].text.trim()
    : ''

  return timezones.find(tz => tz === text) ?? null
}

Now "I'm in Paris", "CET", "france time", and "UTC+1" all resolve to Europe/Paris. No regex, no lookup table maintenance - just a model that understands language.

Use Case 3: Smart Defaults and Personalization

Given a user's recent activity, pick which dashboard widgets to show first, which notification settings to suggest, or which template to pre-select.

const templates = [
  { id: 'blank', name: 'Blank Document' },
  { id: 'meeting-notes', name: 'Meeting Notes' },
  { id: 'project-brief', name: 'Project Brief' },
  { id: 'weekly-update', name: 'Weekly Status Update' },
  { id: 'bug-report', name: 'Bug Report' },
]

async function suggestTemplate(context: string) {
  const response = await client.messages.create({
    model: 'claude-haiku-4-5-20251001',
    max_tokens: 30,
    messages: [{
      role: 'user',
      content: `Based on this context, which template would be most useful?

Context: "${context}"
Templates: ${templates.map(t => t.id).join(', ')}

Reply with only the template id.`,
    }],
  })

  const text = response.content[0].type === 'text'
    ? response.content[0].text.trim()
    : ''

  return templates.find(t => t.id === text)?.id ?? 'blank'
}

User opens the app on Monday morning after a sprint review? Suggest "weekly-update". They just came from a Jira ticket? Suggest "bug-report". The model reads the context, you get a smart default.

Why This Works

Three things make this pattern effective:

  1. Constrained output = higher accuracy. When the model only has to pick from 5-10 options instead of generating arbitrary text, error rates drop dramatically. You're playing to the model's strengths.
  2. Lower cost. Haiku is more than an order of magnitude cheaper than Opus. When you're making these calls per-user-action, that difference matters. A categorization call costs fractions of a cent.
  3. Faster response. Small models respond in ~200ms. That's fast enough to feel like a UI interaction, not an AI call. Users see a smart default appear before they even reach for the dropdown.

The fallback pattern is important too - always have a sensible default when the model's response doesn't match your expected options. This makes the system gracefully degrade rather than break.
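That fallback guard is worth pulling into one small helper. A sketch, where the quote-stripping and normalization rules are assumptions about how replies tend to come back (stray whitespace, quoted answers), not a spec:

```typescript
// Normalize a model reply and coerce it into a known option set.
// The trimming/quote-stripping below reflects typical model output
// quirks; adjust to what you actually observe.
function coerceToOption<T extends string>(
  raw: string,
  options: readonly T[],
  fallback: T,
): T {
  const cleaned = raw
    .trim()
    .replace(/^["'`]+|["'`]+$/g, '') // strip wrapping quotes/backticks
    .toLowerCase()
  return options.find(o => o.toLowerCase() === cleaned) ?? fallback
}
```

Routing every reply through a guard like this means an off-script answer degrades to the default instead of breaking the UI.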

When to Reach for a Bigger Model

This pattern has limits. If the decision requires deep reasoning, nuanced context understanding, or the option set is very large and ambiguous, you might need a more capable model. But start with Haiku and only upgrade when you see actual accuracy problems - you'll be surprised how rarely that happens with well-constrained choices.

The Takeaway

Stop thinking of AI models as "smart or dumb." Think of them as tools with different cost/speed/capability profiles. For picking from known options, small models are the right tool. Your users get faster, smarter interfaces, and you get a lower bill. That's a good trade.