Chain-of-Thought Prompting: Getting Models to Show Their Work
Chain-of-thought prompting dramatically improves LLM performance on reasoning tasks by instructing the model to think step by step before giving a final answer. Here is how it works and when to use it.
PromptProcessor Team
April 6, 2025
What Is Chain-of-Thought Prompting?
Chain-of-thought (CoT) prompting is a technique that instructs a large language model to produce intermediate reasoning steps before arriving at a final answer. Instead of jumping directly to a conclusion, the model "thinks out loud" — working through the problem step by step in a way that is visible in the output.
The simplest version of CoT prompting is adding the phrase "Let's think step by step" to the end of a prompt. This single instruction was shown to dramatically improve performance on arithmetic, commonsense reasoning, and symbolic reasoning tasks — often matching or exceeding the performance of much larger models on the same tasks.
Why It Works
LLMs generate text token by token. When a model is asked to produce a final answer directly, it has to "compress" all of its reasoning into a single prediction. For complex problems, this compression loses important intermediate steps.
By generating intermediate steps explicitly, the model effectively gives itself more tokens to work with. Each step in the reasoning chain becomes part of the context for the next step, allowing the model to maintain coherent logic across a longer sequence of inferences.
Zero-Shot CoT
The simplest form of CoT prompting requires no examples — just an instruction to reason step by step:
Q: A store has 48 apples. They sell 50% of them in the morning and receive a new shipment of 24 apples in the afternoon. How many apples does the store have at the end of the day?
Let's think step by step.
The model will then generate a step-by-step solution before giving the final answer. This zero-shot approach works surprisingly well for mathematical and logical reasoning tasks.
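To make this concrete, here is a minimal zero-shot CoT sketch in Python. It assumes the OpenAI Python SDK; the model name and the zero_shot_cot helper are illustrative, and any chat-capable model and client would work the same way:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

COT_SUFFIX = "\n\nLet's think step by step."

def zero_shot_cot(question: str, model: str = "gpt-4o-mini") -> str:
    # Append the zero-shot CoT trigger so the model reasons before answering.
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": f"Q: {question}{COT_SUFFIX}"}],
    )
    return response.choices[0].message.content

print(zero_shot_cot(
    "A store has 48 apples. They sell 50% of them in the morning "
    "and receive a new shipment of 24 apples in the afternoon. "
    "How many apples does the store have at the end of the day?"
))

The returned text contains the full reasoning chain followed by the answer; if you only need the answer, see the extraction pattern in the batch-processing section below.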
Few-Shot CoT
For more reliable results, you can combine chain-of-thought with few-shot prompting by providing examples that include the reasoning chain:
Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 balls. How many tennis balls does he have now?
A: Roger starts with 5 balls. He buys 2 cans x 3 balls = 6 balls. Total: 5 + 6 = 11 tennis balls.
Q: {{math_problem}}
A:
The example shows the model exactly how to structure its reasoning, making the output more consistent and easier to parse.
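When you build these prompts programmatically, a small helper keeps the format uniform across a whole dataset. The sketch below is plain Python string templating; the EXAMPLES list and function name are our own, not part of any SDK:

EXAMPLES = [
    (
        "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
        "Each can has 3 balls. How many tennis balls does he have now?",
        "Roger starts with 5 balls. He buys 2 cans x 3 balls = 6 balls. "
        "Total: 5 + 6 = 11 tennis balls.",
    ),
]

def build_few_shot_cot_prompt(problem: str) -> str:
    # Prepend worked Q/A pairs so the model imitates their reasoning format.
    shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in EXAMPLES)
    return f"{shots}\n\nQ: {problem}\nA:"

print(build_few_shot_cot_prompt(
    "A train travels 120 miles in 2 hours. What is its average speed?"
))

Adding two or three diverse worked examples usually stabilizes the output format further, at the cost of a longer prompt.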
When to Use Chain-of-Thought
CoT prompting is most valuable for tasks that require multi-step reasoning:
- Mathematical word problems — Any task involving arithmetic, algebra, or quantitative reasoning.
- Logical deduction — Tasks where the answer depends on a chain of if-then inferences.
- Complex classification — When the classification decision requires weighing multiple factors.
- Code generation — Asking the model to reason about the algorithm before writing code often produces better implementations.
- Fact verification — Breaking down a claim into checkable sub-claims before rendering a verdict.
CoT is less useful for simple factual lookups, direct translation, or tasks where the answer is immediate and does not require intermediate reasoning.
CoT in Batch Processing
Chain-of-thought prompting works well in batch processing contexts, but with one important consideration: CoT outputs are longer than direct-answer outputs. This means higher token consumption per prompt and potentially longer processing times.
For batch processing, consider a two-stage approach:
- Stage 1: Run a CoT prompt against your dataset to generate reasoning chains plus answers.
- Stage 2: Run a second prompt that extracts just the final answer from the Stage 1 output.
This separation keeps your final output clean while still benefiting from the improved reasoning quality that CoT provides.
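Here is a rough sketch of that two-stage flow, again assuming the OpenAI Python SDK; the ask helper, the prompt wording, and the model name are illustrative:

from openai import OpenAI

client = OpenAI()

def ask(prompt: str, model: str = "gpt-4o-mini") -> str:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

def answer_with_cot(question: str) -> str:
    # Stage 1: generate the full (verbose) reasoning chain.
    chain = ask(f"Q: {question}\n\nLet's think step by step.")
    # Stage 2: a short follow-up prompt distills the bare answer.
    return ask(
        "Extract only the final answer from the reasoning below. "
        "Reply with the answer alone, no explanation.\n\n" + chain
    )

for question in [
    "A store has 48 apples. They sell 50% of them in the morning and "
    "receive 24 more in the afternoon. How many are left at day's end?",
]:
    print(answer_with_cot(question))

Stage 2 can often run on a smaller, cheaper model than Stage 1, since extracting an answer from explicit reasoning is a far easier task than the reasoning itself.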
Ready to put this into practice?
Try the free Batch Prompt Processor — run your prompt template against hundreds of variables in seconds, right in your browser.
Open the Tool
Related Articles
Prompt Chaining: Breaking Complex Tasks into Reliable Steps
Prompt chaining is the technique of splitting a complex task into a sequence of smaller prompts, where each output feeds into the next. It dramatically improves reliability on tasks that are too complex for a single prompt.
Role Prompting: How to Get Expert-Level Outputs from Any Model
Assigning a specific role or persona to a language model is one of the most underrated techniques in prompt engineering. Done correctly, it shifts vocabulary, tone, and reasoning style in ways that dramatically improve output quality.
Few-Shot Prompting: Teaching Models by Example
Few-shot prompting is one of the most reliable techniques for improving LLM output quality. By including examples directly in your prompt, you can teach the model exactly what you want — without any fine-tuning.