Few-Shot Prompting

Giving the AI examples before the real question — the Restricted Section version of 'here's how this works, now you try.'

Few-shot prompting improves LLM performance by including examples of the desired behavior directly in the prompt, ahead of the actual task. A few-shot classification prompt might include three example texts with their correct sentiment labels, followed by the text to classify. A few-shot extraction prompt might show three example documents with the correctly extracted fields, then present the actual document. The model infers the task structure, output format, and judgment criteria from the pattern, and often produces more consistent and accurate results than a zero-shot prompt that describes the same task in words alone. The name "few-shot" reflects that a small number of examples (typically 3-10) is often enough to improve performance noticeably; beyond that, additional examples tend to yield diminishing returns while consuming context window space.
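As a rough illustration of the classification case, the sketch below assembles a few-shot sentiment prompt as a plain string. The `Example` class, the `build_few_shot_prompt` helper, the `Text:` / `Sentiment:` layout, and the sample texts are all assumptions made for this example, not a required format; any consistent layout should serve the same purpose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    """One labeled demonstration shown to the model (hypothetical structure)."""
    text: str
    label: str


def build_few_shot_prompt(instruction: str, examples: list[Example], query: str) -> str:
    """Place the labeled examples before the real input, all in one consistent format."""
    lines = [instruction.strip(), ""]
    for ex in examples:
        lines += [f"Text: {ex.text}", f"Sentiment: {ex.label}", ""]
    # The final block repeats the format but leaves the label for the model to fill in.
    lines += [f"Text: {query}", "Sentiment:"]
    return "\n".join(lines)


# Illustrative examples only; real ones should come from the actual task data.
EXAMPLES = [
    Example("The onboarding was smooth and support replied within minutes.", "positive"),
    Example("The dashboard crashes every time I export a report.", "negative"),
    Example("The invoice arrived on the 3rd, as scheduled.", "neutral"),
]

prompt = build_few_shot_prompt(
    "Classify the sentiment of each text as positive, negative, or neutral.",
    EXAMPLES,
    "Setup took a while, but the results were worth it.",
)
print(prompt)
```

The resulting string would then be sent to whichever model the team uses. Ending the prompt on an unfinished `Sentiment:` line nudges the model to answer in the same one-word format as the examples.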

The selection and quality of few-shot examples matter significantly. Good examples cover representative variation in the task: different input types, edge cases that illustrate key judgment calls, and cases where the correct answer is non-obvious. Poorly chosen examples (all drawn from the same narrow subcategory, or missing important edge cases) can mislead the model by implying that the task is narrower than it really is. Combining few-shot with chain-of-thought, where each example shows the reasoning process as well as the final answer, tends to improve performance further on complex reasoning tasks, because the examples demonstrate not just what the answer should look like but how to arrive at it.
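To make the chain-of-thought variant concrete, here is a minimal sketch in which each example carries its reasoning alongside the answer. The ticket-priority task, the `ReasonedExample` class, and the `Q:` / `Reasoning:` / `Answer:` layout are hypothetical choices for illustration. The two examples are deliberately drawn from different ends of the task so they do not imply a narrower problem than the real one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReasonedExample:
    """A demonstration that shows the reasoning as well as the final answer."""
    question: str
    reasoning: str
    answer: str


def format_cot_prompt(examples: list[ReasonedExample], question: str) -> str:
    """Render worked examples, then the new question with the reasoning left open."""
    blocks = [
        f"Q: {ex.question}\nReasoning: {ex.reasoning}\nAnswer: {ex.answer}"
        for ex in examples
    ]
    # Stopping at "Reasoning:" invites the model to reason before it answers.
    blocks.append(f"Q: {question}\nReasoning:")
    return "\n\n".join(blocks)


# Illustrative examples covering two different kinds of ticket.
COT_EXAMPLES = [
    ReasonedExample(
        question="A customer reports that checkout has failed for all users since this morning.",
        reasoning="Checkout is revenue-critical and the failure affects every user, so this is an outage.",
        answer="urgent",
    ),
    ReasonedExample(
        question="A customer asks whether dark mode is planned.",
        reasoning="This is a feature inquiry; nothing is broken and no deadline is mentioned.",
        answer="low",
    ),
]

cot_prompt = format_cot_prompt(
    COT_EXAMPLES,
    "A customer cannot log in after resetting their password twice.",
)
print(cot_prompt)
```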

For B2B teams, few-shot prompting is the practical bridge between zero-shot prompting (no examples, fastest to set up, sometimes inconsistent) and fine-tuning (highest investment, most robust for specific tasks). When building AI features, the prompt development workflow typically looks like this: write a zero-shot prompt, test it on 20-30 representative inputs, identify patterns in the failures or inconsistencies, add 3-5 examples that address those patterns, retest, and iterate. This cycle, moving from zero-shot to few-shot with examples targeted at observed failures, resolves the majority of prompt performance issues without the infrastructure investment of fine-tuning. Document the final few-shot prompt and its examples as carefully as application code; for AI-powered features, they are functionally equivalent to code.
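The test-and-inspect step of this cycle can be sketched as a small evaluation harness. Everything here is an assumption for illustration: `stub_model` is a keyword-based stand-in for a real model call, and `CASES` is a toy set far smaller than the 20-30 inputs suggested above. The point is only the shape of the loop: run the prompt over labeled inputs, collect the failures, and group them to see which patterns the new examples should target.

```python
from collections import Counter
from typing import Callable

Case = tuple[str, str]           # (input text, expected label)
Failure = tuple[str, str, str]   # (input text, expected label, predicted label)


def evaluate(predict: Callable[[str], str], cases: list[Case]) -> tuple[float, list[Failure]]:
    """Run the predictor over labeled cases; return accuracy and the failing cases."""
    failures = []
    for text, expected in cases:
        got = predict(text)
        if got != expected:
            failures.append((text, expected, got))
    accuracy = 1 - len(failures) / len(cases)
    return accuracy, failures


def failure_patterns(failures: list[Failure]) -> Counter:
    """Count failures by (expected, predicted) pair to surface recurring confusions."""
    return Counter((expected, got) for _, expected, got in failures)


def stub_model(text: str) -> str:
    """Stand-in for a real model call (hypothetical): a naive keyword rule."""
    return "negative" if "not" in text.lower() else "positive"


# Toy test set; a real one would hold 20-30 representative inputs.
CASES = [
    ("Great product, works perfectly.", "positive"),
    ("Not worth the money.", "negative"),
    ("I can't recommend this.", "negative"),
    ("Not bad at all, pleasantly surprised.", "positive"),
]

accuracy, failures = evaluate(stub_model, CASES)
print(f"accuracy: {accuracy:.2f}")
for (expected, got), count in failure_patterns(failures).items():
    print(f"expected {expected!r} but got {got!r}: {count} case(s)")
```

In a real workflow, the failing inputs found this way are natural candidates for the 3-5 targeted examples, and rerunning the same harness after each prompt change shows whether the additions helped.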

Tags: few-shot, prompt engineering, LLM, prompting, examples, AI
