Few-Shot Prompting: Teaching by Example
Few-shot prompting is the practice of including worked examples directly in your prompt to demonstrate the exact input-output pattern you want. It's one of the oldest and most reliable prompting techniques — and when done well, it can transform a mediocre result into a precise, consistently formatted output that requires no post-processing.
This guide covers how to construct effective few-shot prompts, how to select and sequence examples, and when few-shot outperforms zero-shot (and vice versa).
What Is Few-Shot Prompting?
In few-shot prompting, you provide the model with n input-output pairs before the actual query. These examples act as an implicit specification: they show rather than tell.
Zero-shot:
Extract the company name and dollar amount from this press release:
[press release text]
Few-shot (3-shot):
Extract the company name and dollar amount from each press release.
Press release: "Acme Corp announced today it has raised $42M in Series B funding..."
Output: {"company": "Acme Corp", "amount": "$42M", "round": "Series B"}
Press release: "DataFlow, a Berlin-based startup, closed a €15M seed round..."
Output: {"company": "DataFlow", "amount": "€15M", "round": "Seed"}
Press release: "TechWave said it completed an $8.5 million bridge financing..."
Output: {"company": "TechWave", "amount": "$8.5M", "round": "Bridge"}
Press release: [YOUR TEXT HERE]
Output:
The few-shot version communicates the output schema, handles currency variants, normalizes amounts ("$8.5 million" becomes "$8.5M"), and even infers round type, all without any explicit instruction beyond the one-line task description.
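In code, a few-shot prompt like the one above is just string assembly. A minimal sketch (the `build_prompt` helper and `EXAMPLES` structure are illustrative names, not a library API):

```python
# Sketch: assemble a few-shot prompt from (input, output) example pairs.
# `build_prompt` is an illustrative helper, not a library function.

EXAMPLES = [
    ('"Acme Corp announced today it has raised $42M in Series B funding..."',
     '{"company": "Acme Corp", "amount": "$42M", "round": "Series B"}'),
    ('"DataFlow, a Berlin-based startup, closed a €15M seed round..."',
     '{"company": "DataFlow", "amount": "€15M", "round": "Seed"}'),
]

def build_prompt(task: str, examples, query: str) -> str:
    """Join the task instruction, worked examples, and the final query."""
    parts = [task]
    for press_release, output in examples:
        parts.append(f"Press release: {press_release}\nOutput: {output}")
    # The query reuses the same labels, leaving "Output:" open for the model.
    parts.append(f"Press release: {query}\nOutput:")
    return "\n\n".join(parts)

prompt = build_prompt(
    "Extract the company name and dollar amount from each press release.",
    EXAMPLES,
    '"TechWave said it completed an $8.5 million bridge financing..."',
)
```

Keeping examples in a data structure rather than a hardcoded string makes it easy to add, reorder, or A/B test examples later.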
When Few-Shot Beats Zero-Shot
Few-shot outperforms zero-shot when:
- Output format is non-standard — JSON with specific keys, CSV, custom markup
- The task has implicit conventions — domain-specific classification, style matching
- Edge cases matter — demonstrate how to handle ambiguous or unusual inputs
- Consistency is critical — you need the same format across hundreds of calls
- The task is novel to the model — rare domains or invented schemas
Zero-shot is preferable when:
- The task is simple and well-understood (sentiment: positive/negative/neutral)
- Latency and cost are constraints (each example adds tokens)
- You want maximum flexibility in the response
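The cost point is worth making concrete: every example's tokens are re-sent on every call. A back-of-envelope sketch (the ~4 characters per token ratio is a rough heuristic for English text, not a real tokenizer):

```python
def approx_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    # A real tokenizer would give exact counts; this is a sketch only.
    return max(1, len(text) // 4)

example = (
    'Press release: "Acme Corp announced today it has raised $42M..."\n'
    'Output: {"company": "Acme Corp", "amount": "$42M", "round": "Series B"}'
)
per_example = approx_tokens(example)
# With 3 examples and 10,000 calls, the overhead is paid on every call:
total_overhead = per_example * 3 * 10_000
```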
Anatomy of a Good Example
Each few-shot example should have three properties:
1. Representative Input
Choose inputs that reflect the real distribution of queries you'll receive, not just the easy, clean cases. If 20% of real inputs will be messy or ambiguous, at least one of your examples should be too.
2. Correct, Ideal Output
The output must be exactly what you want. Sloppy examples teach sloppy behavior. If your output is JSON, it must be valid JSON with consistent key names. If it's prose, it must match the tone and length you want.
3. A Consistent Format Separator
Use consistent delimiters between examples and between input/output. Common patterns:
# Pattern 1: Label-based
Input: ...
Output: ...
# Pattern 2: XML-style
<example>
<input>...</input>
<output>...</output>
</example>
# Pattern 3: Q/A style
Q: ...
A: ...
Pick one and use it consistently across all examples and the final query.
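One way to enforce that consistency is to centralize the separator in a single formatting function, so every example and the final query go through the same code path. A sketch mirroring the three patterns above:

```python
# Sketch: one formatter per delimiter pattern. Routing all examples
# through the same formatter guarantees identical separators.

def label_style(inp: str, out: str) -> str:
    return f"Input: {inp}\nOutput: {out}"

def xml_style(inp: str, out: str) -> str:
    return f"<example>\n<input>{inp}</input>\n<output>{out}</output>\n</example>"

def qa_style(inp: str, out: str) -> str:
    return f"Q: {inp}\nA: {out}"

def render(examples, fmt) -> str:
    # Apply the *same* formatter to every pair; mixing styles within
    # one prompt is a common source of inconsistent model outputs.
    return "\n\n".join(fmt(i, o) for i, o in examples)
```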
How Many Examples?
More is not always better. Research and practice suggest:
- 1-shot — enough for simple format demonstration
- 3-shot — the sweet spot for most classification and extraction tasks
- 5-10 shot — needed for tasks with meaningful variation or complex reasoning
- 10+ shot — diminishing returns; consider fine-tuning instead
For tasks with multiple classes or output types, aim for at least one example per class.
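The one-example-per-class rule can be enforced mechanically. A sketch, assuming you maintain a pool of labeled candidate examples (the `select_examples` helper is illustrative, not a library function):

```python
import random

def select_examples(pool, k=3, seed=0):
    """Pick at least one example per label, then fill up to k total.

    Sketch only: assumes `pool` is a list of (text, label) tuples.
    """
    rng = random.Random(seed)
    by_label = {}
    for text, label in pool:
        by_label.setdefault(label, []).append((text, label))
    # One example per class first, so no class is unrepresented...
    chosen = [rng.choice(items) for items in by_label.values()]
    # ...then fill any remaining slots from the rest of the pool.
    remaining = [p for p in pool if p not in chosen]
    rng.shuffle(remaining)
    chosen.extend(remaining[: max(0, k - len(chosen))])
    return chosen
```

Seeding the random generator keeps the prompt stable across calls, which matters when you want consistent behavior in production.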
Example Selection Strategy
This is the highest-leverage decision in few-shot prompting.
Diversity Over Repetition
Don't pick three examples that are nearly identical. Cover different facets of the task:
# BAD: Three nearly identical positive reviews
Input: "Great product, loved it!"
Output: positive
Input: "Amazing quality, highly recommend"
Output: positive
Input: "Best purchase I've made this year"
Output: positive
# GOOD: Covers different cases
Input: "Great product, loved it!"
Output: positive
Input: "Arrived damaged and customer service was unhelpful"
Output: negative
Input: "It's fine, does what it says, nothing special"
Output: neutral
Recency Bias
Models weight recent context more heavily. Put your most important or representative example last — immediately before the actual query.
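The ordering rule is simple enough to encode directly; a tiny sketch (the helper name is illustrative):

```python
def order_for_recency(examples, most_representative):
    """Place the most representative example last, just before the query.

    Sketch: `most_representative` is one of the (input, output) pairs.
    """
    rest = [e for e in examples if e != most_representative]
    return rest + [most_representative]
```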
Show Don't Tell on Edge Cases
If there's a tricky case users will encounter, demonstrate it explicitly rather than describing it in instructions:
# Showing how to handle missing data
Input: "TechCorp is exploring fundraising options but has not announced a final amount"
Output: {"company": "TechCorp", "amount": null, "round": "unknown"}
Few-Shot for Different Task Types
Classification
Classify the support ticket priority.
Ticket: "I can't log in to my account at all"
Priority: High
Ticket: "The dark mode toggle is slightly misaligned"
Priority: Low
Ticket: "Payment processing is failing for all enterprise customers"
Priority: Critical
Ticket: [NEW TICKET]
Priority:
Structured Extraction
Extract action items as a JSON array.
Meeting notes: "John will update the API docs by Friday. Sarah is owning the
design review and needs it done before the sprint ends. No deadline on the
backlog cleanup — just when we get to it."
Action items: [
{"owner": "John", "task": "Update API docs", "deadline": "Friday"},
{"owner": "Sarah", "task": "Design review", "deadline": "End of sprint"},
{"owner": null, "task": "Backlog cleanup", "deadline": null}
]
Meeting notes: [NEW NOTES]
Action items:
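On the consuming side, the model's reply for a prompt like this still needs validation before use. A defensive parsing sketch (the fence-stripping step and the required keys are assumptions based on the example above, not guaranteed model behavior):

```python
import json

def parse_action_items(reply: str):
    """Parse a model reply into a list of action-item dicts.

    Sketch: strips an optional markdown code fence, then checks that
    each item has the keys shown in the few-shot examples.
    """
    text = reply.strip()
    if text.startswith("```"):
        # Drop a leading ```json line and a trailing ``` if present.
        text = text.split("\n", 1)[1].rsplit("```", 1)[0]
    items = json.loads(text)
    required = {"owner", "task", "deadline"}
    for item in items:
        missing = required - item.keys()
        if missing:
            raise ValueError(f"action item missing keys: {missing}")
    return items
```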
Style Transfer
Rewrite the sentence in the style of Ernest Hemingway.
Original: "The meteorological conditions were characterized by significant
precipitation and a considerable reduction in ambient temperature."
Hemingway: "It rained hard and turned cold."
Original: "She experienced a profound emotional response upon encountering
her former romantic partner in an unexpected context."
Hemingway: "She saw him and felt something she didn't name."
Original: [YOUR SENTENCE]
Hemingway:
Combining Few-Shot with System Prompts
For production use, combine a system prompt (for role and constraints) with few-shot examples (for format and edge cases):
System: You are a data extraction assistant. Always return valid JSON.
Never include null fields — omit them entirely. Be conservative:
if you're not certain about a value, omit it.
User:
Extract entities from: "Microsoft acquired Activision for $68.7B in 2023."
Output: {"acquirer": "Microsoft", "target": "Activision", "value": "$68.7B", "year": 2023}
Extract entities from: "The merger was called off before completion."
Output: {}
Extract entities from: [YOUR TEXT]
Output:
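In a chat-style API this typically becomes a messages list, with the system prompt kept separate from the few-shot content. A provider-agnostic sketch (the role/content dict shape matches common chat APIs, but check your provider's documentation):

```python
# Sketch: combine a system prompt with few-shot examples in one user turn.
# The messages shape is an assumption based on common chat APIs.

SYSTEM = (
    "You are a data extraction assistant. Always return valid JSON. "
    "Never include null fields; omit them entirely. Be conservative: "
    "if you're not certain about a value, omit it."
)

FEW_SHOT = [
    ('Microsoft acquired Activision for $68.7B in 2023.',
     '{"acquirer": "Microsoft", "target": "Activision", "value": "$68.7B", "year": 2023}'),
    ('The merger was called off before completion.',
     '{}'),
]

def build_messages(text: str):
    """Assemble a chat messages list: system prompt + few-shot + query."""
    body = "\n\n".join(
        f'Extract entities from: "{inp}"\nOutput: {out}' for inp, out in FEW_SHOT
    )
    body += f'\n\nExtract entities from: "{text}"\nOutput:'
    return [
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": body},
    ]
```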
Debugging Few-Shot Prompts
When few-shot results are inconsistent:
- Check example quality — is the output in each example exactly correct?
- Check format consistency — are all delimiters identical?
- Add more diversity — are you missing an example that covers the failing case?
- Check for contradictions — do your examples imply conflicting rules?
- Verify ordering — move the most representative example to last position
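The format-consistency check in particular is automatable. A sketch that flags mismatched or near-miss labels, assuming the label-based pattern from earlier (the helper is illustrative):

```python
def check_format(prompt: str, input_label="Press release:", output_label="Output:"):
    """Return a list of format-consistency problems in a few-shot prompt.

    Sketch: checks that input and output labels appear in equal numbers,
    and that no line uses a near-miss label (e.g. 'output:' vs 'Output:').
    """
    problems = []
    n_in = prompt.count(input_label)
    n_out = prompt.count(output_label)
    if n_in != n_out:
        problems.append(f"{n_in} '{input_label}' vs {n_out} '{output_label}'")
    for line in prompt.splitlines():
        stripped = line.strip()
        for label in (input_label, output_label):
            # Same label ignoring case, but different casing: a near-miss.
            if stripped.lower().startswith(label.lower()) and not stripped.startswith(label):
                problems.append(f"case mismatch: {stripped[:30]!r}")
    return problems
```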
Key Takeaways
- Few-shot works by demonstrating patterns, not describing them
- 3 diverse, high-quality examples beat 10 repetitive ones
- Consistent formatting across all examples is non-negotiable
- Show edge cases explicitly — don't describe them
- Combine with system prompts for production reliability
- When few-shot fails, debug the examples before anything else