<aside>
TL;DR: Prompting makes structured output more likely. Schema-constrained decoding makes invalid continuations impossible by masking out tokens that would violate a JSON/schema/grammar contract.
</aside>
In real LLM workflows, you’ll ask for output like:
```json
{
  "verdict": "pass",
  "confidence": "high",
  "reason": "The answer satisfies the rubric."
}
```
And then you’ll see failures such as:
- `Here is the JSON:` tacked on before the object
- `"high confidence"` instead of `"high"`

A human reader can often “see what it meant.” A parser can’t.
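To make that concrete, here is a minimal Python sketch of what a strict consumer sees. The enum values `medium` and `low` are assumptions added for illustration; only `high` appears in the example above.

```python
import json

# Two typical "almost right" outputs from a model that was only prompted, not constrained.
with_preamble = 'Here is the JSON: {"verdict": "pass", "confidence": "high"}'
wrong_enum = '{"verdict": "pass", "confidence": "high confidence"}'

try:
    json.loads(with_preamble)  # the prose preamble makes this invalid JSON
except json.JSONDecodeError as err:
    print("parse error:", err)

parsed = json.loads(wrong_enum)  # this one parses fine...
if parsed["confidence"] not in {"high", "medium", "low"}:  # ...but violates the assumed enum
    print("schema violation:", parsed["confidence"])
```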
Prompting affects what the model is likely to generate.
Schema-constrained decoding affects what the decoder is allowed to generate.
- Prompting = changes token probabilities
- Constrained decoding = removes format-breaking tokens from the candidate set
This difference is the whole mechanism.
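Here is a minimal, self-contained sketch of that candidate-set masking, not any particular library's API. The toy `allowed_tokens` helper stands in for a real grammar/schema state machine; every token it rejects has its logit set to negative infinity, so it can never be chosen no matter how much the model prefers it.

```python
import math

# Toy vocabulary: each "token" is a string piece (a real tokenizer is finer-grained).
VOCAB = ['"high"', '"medium"', '"low"', '"high confidence"', 'Here is the JSON: ', '{', '}']

def allowed_tokens(partial_output: str) -> set[str]:
    # Hypothetical schema state: right after the "confidence" key, only enum values may follow.
    if partial_output.endswith('"confidence": '):
        return {'"high"', '"medium"', '"low"'}
    return set(VOCAB)

def constrained_pick(logits: dict[str, float], partial_output: str) -> str:
    allowed = allowed_tokens(partial_output)
    # Tokens outside the contract get -inf: they are removed from the candidate set entirely.
    masked = {tok: (score if tok in allowed else -math.inf) for tok, score in logits.items()}
    return max(masked, key=masked.get)  # greedy pick; sampling works the same way over the mask

# Even though the model scores the invalid continuation highest, it is simply not a candidate.
logits = {'"high confidence"': 5.0, '"high"': 2.0, '"medium"': 0.1, '"low"': -1.0,
          'Here is the JSON: ': 3.0, '{': 0.0, '}': 0.0}
print(constrained_pick(logits, '{"verdict": "pass", "confidence": '))  # -> "high"
```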