Pictures and words are everywhere. We watch television; read newspapers and books; talk on the phone; send emails and post pictures and written comments on Facebook. Pictures and words are also in our minds. We can picture an upcoming vacation in our “mind’s eye,” or use our “mind’s mouth” to make a list of things we need to pack. How does visual versus verbal thinking affect our beliefs, expectations, judgments and behavior? My research aims to understand the nature and distinctive influences of visual and verbal thought.
THE ADAPTIVE FUNCTIONS OF VISUAL AND VERBAL PROCESSING
Pictures and words are symbols used to represent real objects, events, and actions. Pictures are concrete representations that in nearly all cases physically resemble their referents; they are analogs of the real world. In contrast, words in nearly all cases are abstract representations that have an arbitrary relationship with their corresponding referents. Each word (excluding proper names) refers to a broad category of concrete objects (Glaser, 1992; Paivio, 1986). Thus, there are infinitely many ways to draw a “chair”, a “kitchen chair” (a subordinate representation; cf. Rosch et al., 1976), or even a “kitchen chair with four legs”. A picture, in contrast, fixes the referent object’s most salient features.
That pictures and words reliably differ in their level of abstraction is well known. However, these differences have implications for thought and behavior that we are only beginning to understand. My doctoral dissertation provided evidence that people use visual thinking when they think about something that is psychologically proximal to them (temporally, geographically, or socially). In contrast, people use verbal thinking when they think about something that is psychologically distal from them. Thus, for example, if you think about a vacation that begins tomorrow, you might think about it using vivid mental imagery. In contrast, if you think about a vacation that will take place a year from now, you might think about it using inner speech.
The rationale for this hypothesis is simple. Because they are abstract, words preserve the essential properties of a stimulus across momentary changes in appearance and through changes in space and time. For example, with respect to spatial distance, the visual image of an object changes as one changes viewing position, but the name for that object remains the same. Words fulfill this function due to their inclusive semantic nature. And because words are emancipated from the particularity of the situation, they are able to represent it across (temporal, spatial, and social) distances. Pictures, by contrast, preserve the event in minute detail, with relevant and irrelevant parts being given equal attention, to serve for immediate or early use (Amit, Algom, Trope, & Liberman, 2009a).
Thus, it is adaptive to represent distal things in words and proximal things in pictures. To physically interact with an object one needs representations of its temporary properties. Which way is it facing? What is it sitting next to? What is blocking it? What is its exact shape? However, to deal with objects at a distance, such information is often not necessary. If you are on your way to the store to buy a jug of maple syrup, you do not need to know which way the jug handles are facing. But when it comes time to take one off the shelf, you do. Thus, you typically need visual information up close, but not from far away. And from far away it is typically more efficient to represent things in a more sketchy, verbal way, rather than burdening yourself with unnecessary details. Nevertheless, we can think about distant things visually, and we can use words to represent things that are close. This leeway gives room for a lot of variation in how we think about the world, and the possibility that some types of thinking may be better or worse for different purposes.
To test the medium/distance hypothesis I use a variety of methods, including cognitive (e.g., Stroop task), social, and brain-imaging techniques. These methods provide converging evidence that people prefer, and are better at processing, visual information that represents proximal items and verbal information that represents distal items, rather than vice versa.