Why patch-based color constancy experiments are structurally flawed
With this post I want to argue that treating color constancy as a tri-stimulus mapping is not just an oversimplification, but a mischaracterization of the perceptual problem. A key reason this persists is the widespread reliance on patch-based experiments using a single stimulus on a background.
These experiments unintentionally force observers to collapse perceptual dimensions that are otherwise distinct.
The hidden constraint in single-patch paradigms
In a typical patch-based experiment, an observer is shown a single colored patch against a background and asked to make a match or judgment. Under these conditions, there is simply not enough image structure to support multiple perceptual interpretations.
With only one patch:
- The visual system cannot reliably separate surface color from illumination or other global constraints.
- The observer must treat the stimulus as a single fused percept.
- Any percept of “color cast” is effectively absorbed into the color of the patch itself.
Forced collapse is not evidence of perceptual collapse
When observers succeed in matching colors across illuminants in patch-based tasks, this is often taken as evidence that the visual system normalizes illumination and represents color in a corrected tri-stimulus space.
But this inference is invalid.
The task demands a single output, so the observer must collapse:
- surface-related color information, and
- illumination- or cast-related color information
into one reportable variable.
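The collapse can be illustrated with a toy calculation. The sketch below assumes a simple diagonal (von Kries-style) image formation in which each channel of the signal is reflectance times illuminant; that formation model and all numbers are illustrative assumptions, not part of any experiment discussed here. It shows two different surface/illuminant pairs producing the same single-patch signal, so one reported value cannot say which pair was seen.

```python
# Toy diagonal image formation: signal = reflectance * illuminant, per channel.
# This is an illustrative assumption; all values below are made up.

def signal(reflectance, illuminant):
    """Per-channel product of a surface reflectance and an illuminant."""
    return tuple(r * e for r, e in zip(reflectance, illuminant))

def close(u, v, tol=1e-9):
    """True if two triplets agree channel by channel within tol."""
    return all(abs(a - b) <= tol for a, b in zip(u, v))

# Scenario A: a gray surface under a yellowish light.
gray_surface = (0.5, 0.5, 0.5)
yellow_light = (1.0, 0.9, 0.4)

# Scenario B: a yellowish surface under a neutral light.
yellow_surface = (0.5, 0.45, 0.2)
neutral_light = (1.0, 1.0, 1.0)

patch_a = signal(gray_surface, yellow_light)
patch_b = signal(yellow_surface, neutral_light)

# Both scenarios deliver the same triplet to the eye.
same_signal = close(patch_a, patch_b)
print(patch_a, patch_b, same_signal)
```

With a single patch, the surface and the cast are only available as their product, which is why a successful match says little about how the two are represented.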
Why multiple patches under a single cast change everything
When an image contains many patches under the same color cast, the situation changes qualitatively.
Now there is sufficient structure to support:
- a stable relational organization between patches,
- the identification of achromatic references,
- and the perception of a coherent illumination layer affecting the entire scene.
Crucially, observers can still identify:
- which regions are achromatic,
- which regions are chromatic,
- and what the color of the illumination or bias itself is.
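A minimal sketch of why several patches help, again under an assumed diagonal model (signal = reflectance times cast, per channel). The cast is estimated with the max-RGB (white-patch) heuristic, which stands in here purely for illustration and is not the model described in this post. The surface values and the cast are hypothetical.

```python
# Toy diagonal model: signal = reflectance * cast, per channel.
# Reflectances and cast are made-up illustration values.

def signal(reflectance, cast):
    return tuple(r * c for r, c in zip(reflectance, cast))

def ratio(u, v):
    """Channel-wise ratio between two triplets."""
    return tuple(a / b for a, b in zip(u, v))

def close(u, v, tol=1e-9):
    return all(abs(a - b) <= tol for a, b in zip(u, v))

surfaces = {
    "white": (0.9, 0.9, 0.9),  # achromatic reference
    "red": (0.6, 0.2, 0.2),
    "green": (0.2, 0.6, 0.3),
}
cast = (1.0, 0.9, 0.4)  # hypothetical yellowish cast shared by all patches

scene = {name: signal(refl, cast) for name, refl in surfaces.items()}

# 1. Relational organization: patch-to-patch ratios are unaffected by the cast.
ratio_under_cast = ratio(scene["red"], scene["white"])
ratio_without_cast = ratio(surfaces["red"], surfaces["white"])

# 2. Color of the cast: max-RGB takes the per-channel maximum over all
#    patches and normalizes it, giving the cast up to overall intensity.
channel_max = tuple(max(s[i] for s in scene.values()) for i in range(3))
peak = max(channel_max)
estimated_cast = tuple(c / peak for c in channel_max)

# 3. Achromatic vs. chromatic regions: divide the cast out and check
#    whether the three channels of each patch agree.
corrected = {name: ratio(s, estimated_cast) for name, s in scene.items()}
achromatic = {name: max(c) - min(c) < 1e-9 for name, c in corrected.items()}

print(estimated_cast, achromatic)
```

In this toy setup the scene yields three separate answers at once: the relations between patches, the color of the cast, and which patches are neutral. None of these is recoverable from any one patch in isolation.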
What patch-based experiments actually measure
Patch-based paradigms do not measure the dimensionality of color perception. They measure the best strategy an observer can adopt under information-poor conditions.
They test how perception behaves when:
- illumination/bias and surface are deliberately confounded,
- structural cues are removed,
- and observers are forced to give a single answer.
Implication for color models
If a model is validated primarily on patch-based experiments, it will inevitably favor low-dimensional mappings. This does not mean the model is correct. It means it is well-tuned to a constrained task.
Once richer image structure is introduced (multiple surfaces, shared illumination, relational cues), the limitations of tri-stimulus mappings become apparent.
At that point, the need for higher-dimensional perceptual representations is no longer theoretical. It becomes empirically unavoidable.
Conclusion
Single-patch color constancy experiments do not reveal the true structure of color perception. They suppress it.
Only when a multitude of patches is viewed under a shared color cast does the perceptual system reveal what it is actually encoding: not just colors, but the relationship between colors and the light or bias that binds them.
Any theory of color that cannot represent both simultaneously is missing a fundamental part of the percept. This is where my chromatic adaptation model enters the scene. It explicitly models color percepts as a neutral anchor plus a bias, which can be described post hoc as the percept of colors seen through a veil. It estimates a set of stimuli that, when combined with the estimated bias or veil, result in predictable and stable percepts. Here the model was applied locally to a known variegated field to predict the stimuli that produce the expected percepts of uniformly colored red, green, and blue rectangles under a variegated bright yellow and dark blue color cast:
A theory of color must in my view be judged not by how well it performs under artificially collapsed conditions, but by whether it can represent what observers actually perceive when the visual system is given enough information to do its job. Models that explicitly encode both surface and bias are not adding complexity for its own sake; they are responding to empirical necessity. Ignoring one half of this perceptual state does not simplify the problem. It underrepresents it.