11.1 What do Purves and Lotto's Illusions Actually Show?

Figure 11.1.1. After Figure 6.10 from Dale Purves and R. Beau Lotto's book Why We See What We Do: An Empirical Theory of Vision (2003, revised 2011). The areas of the image that depict the blue tiles on the top of the cube in (A) and the yellow tiles on the top of the cube in (B) are identical in psychophysical colour, and appear middle grey against a neutral background.

This remarkable illustration is from Dale Purves and R. Beau Lotto's book Why We See What We Do: An Empirical Theory of Vision (2003, revised 2011). It is one of many presented by the authors in support of their view that visual perception is not, or is only accidentally, veridical (truthful). It is important, however, to be very clear about the senses in which visual perception is not veridical.
Although Purves and Lotto describe the two scenes as being "consistent with illumination by spectrally different light sources", the effect depicted is specifically that of viewing the scenes through two differently coloured translucent masks. As is standard practice for magic tricks, the authors misdirect us by saying in their caption that "the 'blue' tiles on the top of the cube in (A) are physically identical to the 'yellow' tiles on the cube in (B)". What is physically identical, of course, is the digital paint used to depict the tiles, not the tiles themselves, which do not exist physically but only as virtual objects depicted in the scenes. Throughout their book the authors repeatedly conflate distal and proximal (image) properties in this way: the colours/greys/sizes/angles etc of A and B appear to be different but are "actually" the same. To take another example, in their illustration of the Shepard tabletop illusion, what differ are the perceived lengths of two virtual tables depicted in the scene; what are the same are the lengths of the images of the tables.
What these demonstrations show is that when we look at an image of a three-dimensional scene, we tend to notice the perceived properties of the virtual objects depicted in the scene, and it is remarkably difficult, at least initially, to see (that is, attend to) properties of the image itself. In this sense our perception of this image is not veridical: we perceive yellow and blue in areas painted with grey paint. Paradoxically, however, these images also demonstrate our visual system's remarkable capacity (called perceptual constancy) to extract more or less reliable information about the properties of objects in the environment despite variations in illumination, atmosphere and viewpoint. If the image were a photograph of a physical setup, the light reaching the camera from each tile would be an additive-averaging mixture of light reflected from the tile and light contributed by the translucent mask. For the light reaching the camera to be achromatic, and thus be recorded as a grey image colour, the colour of the tile would need to be the additive complementary of the colour of the mask. Remarkably, our visual system automatically, immediately and seemingly effortlessly presents the tiles depicted by grey pixels as having the colours they would need to exhibit in a physical setup that created this visual appearance.
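The complementary relationship described above can be checked with a little arithmetic. The following is only a minimal sketch, assuming that the additive-averaging mixture can be modelled as a per-channel weighted average in linear RGB. The mask colours, the mixing weight and the function names are hypothetical values chosen for illustration; none of them is taken from Purves and Lotto's figure.

```python
# Sketch: what tile colour would be needed for the camera to record grey?
# Assumed model (not from the source): in linear RGB, for each channel,
#   pixel = (1 - weight) * tile + weight * mask

def mix(tile, mask, weight):
    """Additive-averaging mixture of tile light and mask light."""
    return tuple((1 - weight) * t + weight * m for t, m in zip(tile, mask))

def tile_needed(pixel, mask, weight):
    """Invert the mixture: the tile colour that yields `pixel` under `mask`."""
    return tuple((p - weight * m) / (1 - weight) for p, m in zip(pixel, mask))

GREY = (0.5, 0.5, 0.5)          # the achromatic image colour of the tiles
YELLOW_MASK = (0.7, 0.7, 0.1)   # hypothetical yellowish mask, as in scene (A)
BLUE_MASK = (0.3, 0.3, 0.9)     # hypothetical bluish mask, as in scene (B)
WEIGHT = 0.5                    # hypothetical share of light from the mask

tile_a = tile_needed(GREY, YELLOW_MASK, WEIGHT)
tile_b = tile_needed(GREY, BLUE_MASK, WEIGHT)

print("Tile needed under yellowish mask:", tile_a)
print("Tile needed under bluish mask:   ", tile_b)

# Mixing each tile with its mask again returns the same grey pixel.
print("Recorded under yellowish mask:", mix(tile_a, YELLOW_MASK, WEIGHT))
print("Recorded under bluish mask:   ", mix(tile_b, BLUE_MASK, WEIGHT))
```

With these assumed values the tile needed under the yellowish mask comes out bluish and the tile needed under the bluish mask comes out yellowish, even though both are recorded as the same grey. This matches the colours we perceive the tiles to have in (A) and (B).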

Figure 11.1.2. Simple simultaneous contrast: central squares matching the grey image colour of the tiles in question in Fig. 11.1.1 are set in a surround that averages the colours surrounding them in that image. The inaccuracy of our perception of their dissimilarity is of a much lower order than in Fig. 11.1.1. (Interestingly, the contrast effect, though weak when this array is viewed globally, increases when it is viewed with a fixed gaze.)

From the point of view that perceived colours are just ways of seeing certain physical properties rather than physical properties as such, all perceptions of colour are, in this very specific sense, not veridical. Being "not veridical" in this restricted sense does not preclude object colours from being more or less reliable though coarse-grained perceptions of the spectral reflectance of objects. The extent to which these perceptions of spectral properties can be inaccurate is revealed by simple simultaneous contrast, as in the relatively extreme example I posted last week. These perceptual inaccuracies are typically much smaller than those of image colours in representational images (Fig. 11.1.2).
From an evolutionary point of view there is nothing "peculiar" about the fact that perceived object colours are so salient in images. The main selective value of colour vision for us is in enabling us to perceive properties of objects rather than properties of the visual field itself. We did not have to concern ourselves with the latter at all until we began to paint mimetically. But just because we do not immediately notice the properties of an image or of the visual field does not mean that these properties are permanently "inaccessible" to us. If you enlarge the left-hand image (A) and stare fixedly at the area occupied by just one of the blue tiles for a period of time (twenty seconds should be enough), you may begin to see the grey image colour, and similarly for the image area occupied by one of the yellow tiles in the right-hand image (B). In other examples it can be necessary to break the representational spell of the image, by connecting up the areas or by masking the rest of the image, in order to perceive the similarity of the image areas veridically. In nature we can make comparisons of the properties of the visual field by using painters' tools like squinting, viewing through an opaque screen with two apertures, or (for the Shepard table illusion) measuring with a pencil etc held at arm's length.
In their book Purves and Lotto employ this and other illustrations supposedly to demonstrate that "what we see deviates from physical measurements of objects and conditions in the real world". What these illustrations in fact show, and show vividly, is that our visual system is remarkably good at arriving at generally reliable though coarse-grained perceptions of the spectral reflectances of objects (as object colours) despite variations in illumination and atmosphere. Paradoxically, this capacity is so instantaneous and automatic that it can be very difficult to see (that is, attend to) the physical properties of the components of representational images of the world. We experience visual information pre-processed into estimates of spectral reflectances, illumination and atmosphere, and it requires effort to "flatten" these superimposed perceptions into a two-dimensional, camera-like visual image.
To sum up, there are three senses in which visual perception is not veridical (i.e. precise and/or accurate).
1. Colours are ways of seeing physical properties rather than physical properties in themselves, and these ways of seeing are imprecise in the sense that they represent only the direction and amount of bias between the long-, middle- and short-wavelength components, rather than all the details, of a spectral distribution.
2. Perceived colour can be a somewhat inaccurate representation of physical properties, as seen in demonstrations of simple simultaneous contrast.
3. Perceived colours, lengths etc in an image can be very inaccurate representations of the physical properties of that image, at least initially, because our perceptions of the properties of the virtual objects depicted in the image are so salient that it can be very difficult to attend to the properties of the image itself.
Precisely this difficulty also troubles painters learning to translate a visual image into paint.
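The first of these senses, the coarse-grained character of colour, can be illustrated numerically. The sketch below is illustrative only: it assumes three Gaussian sensitivity curves as hypothetical stand-ins for the long-, middle- and short-wavelength responses (they are not measured cone fundamentals), and it builds two different spectral distributions that nevertheless produce the same three responses. All names and numbers in it are assumptions made for the example.

```python
import math

# Wavelength samples in nanometres (assumed range and spacing).
WAVELENGTHS = range(400, 701, 10)

def gaussian(peak, width):
    """A bell-shaped curve sampled at WAVELENGTHS."""
    return [math.exp(-(((w - peak) / width) ** 2)) for w in WAVELENGTHS]

# Hypothetical long-, middle- and short-wavelength sensitivities.
SENSITIVITIES = [gaussian(565, 50), gaussian(540, 45), gaussian(445, 30)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def responses(spectrum):
    """The three numbers that are all this model 'sees' of a spectrum."""
    return [dot(s, spectrum) for s in SENSITIVITIES]

def remove_projections(vector, directions):
    """Strip from `vector` everything the `directions` can detect
    (Gram-Schmidt orthogonalisation)."""
    basis = []
    for d in directions:
        v = list(d)
        for b in basis:
            c = dot(v, b)
            v = [x - c * y for x, y in zip(v, b)]
        norm = math.sqrt(dot(v, v))
        basis.append([x / norm for x in v])
    v = list(vector)
    for b in basis:
        c = dot(v, b)
        v = [x - c * y for x, y in zip(v, b)]
    return v

# A flat spectrum, and a second spectrum that adds a ripple
# which none of the three sensitivities responds to.
flat = [0.5] * len(WAVELENGTHS)
ripple = [math.sin(w / 15.0) for w in WAVELENGTHS]
hidden = remove_projections(ripple, SENSITIVITIES)
scale = 0.4 / max(abs(x) for x in hidden)  # keep values between 0 and 1
metamer = [f + scale * h for f, h in zip(flat, hidden)]

print("Responses to flat spectrum:  ", responses(flat))
print("Responses to rippled spectrum:", responses(metamer))
```

The two spectra differ substantially from wavelength to wavelength, yet their three responses agree to within rounding error. Under these assumptions the detail of the spectral distribution is simply not represented in the three-component signal.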

Page added October 25, 2017.