Friday, September 22, 2006

Are Images Flat?

Okay, here’s another post about the spatial properties (or not) of visual images. I seem to be on a kick!

Pete Mandik reminded me of this issue when he said something in his comments on my last post that seemed (perhaps only seemed?) to imply that he regarded images as generally two-dimensional.

We certainly talk, sometimes, as though they are. Most tellingly, I think, we call images “pictures” (in the mind’s eye), not (say) “sculptures”. Stephen Kosslyn, in his seminal 1980 book on imagery, describes the imagery space as “roughly circular” and compares its horizontal and vertical dimensions. He does not (that I recall) discuss its depth. Likewise, he says that position in the imagery matrix can be indicated by a pair of co-ordinates (polar co-ordinates r and theta) – not, as would be necessary for a three-dimensional imagery space, a triplet of co-ordinates (such as the Cartesian x,y,z or the polar r, theta, and phi). His sample portrayals of images never indicate depth.

In my posts here and here and my article on the question of whether things look flat, I suggest that our tendency to think of circular objects viewed obliquely as “looking elliptical” and distant objects as “looking small” derives primarily from an over-analogizing of visual experience to flat media such as paintings and pictures. I won’t rehearse those arguments again; but if that’s right, then perhaps our (at least some of our) tendency to think of images as flat derives from a similar over-analogizing and should be treated with similar skepticism.

One can accept that images are often (or even typically?) three-dimensional without going so far as to say that we can imagine something from multiple perspectives at once -- just as we can say that our visual representations and visual experience are fundamentally and ineliminably marked out in three-dimensional space (even monocularly) without saying that we can see from more than one angle at once.

So I’m wary of our too easily supposing that our imagery appears as if on a flat plane. But maybe, nonetheless, it does. I wonder about readers' introspective sense of this; and I’d be interested to hear also if you have reflections on neuroscientific or behavioral tests that might shine light on the question.

21 comments:

Pete Mandik said...

Eric,

I think images are what Marr calls 2-and-a-half-D. They are view-point dependent and a lot of that can be characterized by the not quite accurate model of planar projection. But, of course, we have stereoscopic depth info via integrating info from two eyes. And, as you rightly have pointed out, our retinas are importantly curvy.

I think your points about being skeptical of overanalogizing are very good.

Re: tests, there's a pretty big literature relevant to this stuff concerning whether visual memories of 3-D objects are viewpoint dependent or not. Biederman is representative of the view-point independence camp, relying on experiments whereby rotational differences between target and cue objects don't give rise to corresponding differences in accuracy and reaction time (as long as sufficiently diagnostic features are present to enable "geon" extraction). In contrast are people like Tarr who present experiments in which such differences do obtain. Of course all of this stuff is inspired by the rotation matching experiments of Shepard.

Eric Schwitzgebel said...

Ah yes, 2-1/2-D -- the perfect compromise? Maybe I understand what that means computationally, but phenomenologically I'm less sure! (Like a relief carving? -- that doesn't seem quite right....)

That's an interesting suggestion, too, Pete, about the literature on viewpoint dependence in memory (a literature I don't know very well). I think there are a few leaps to make to connect it with the question of the dimensionality of imagery (e.g., the connection between memory and imagery; the connection between dimensionality of the represented object and dimensionality of the medium of representation), but maybe those leaps are plausible....

Pete Mandik said...

I didn't think to invoke relief carvings, but that's quite nice. I think as far as representational content goes, visual imagery (and visual experience) is like a relief carving: they both represent the bulginess of objects, but only the forward facing bulginess. Backsides remain unrepresented. Now, as far as representational vehicles go, relief carvings literally are bulgy in the front and flat in the rear. Not so for the vehicular properties of visual imagery (and visual experience).

The above remarks lead me to wonder, Eric, what you think of this talk of contents and vehicles. When you post stuff with titles like "Are Images Flat?" Are you talking about literal flatness? Represented flatness? Some other kind of flatness?

Eric Schwitzgebel said...

Since a flat thing can represent a three-dimensional one, I don't mean that images represent things as flat. Nor do I think the brain processes involved are flat (what would that mean?). I mean only: does visual imagery present objects in experience as though upon a flat plane, like a picture or photograph? So, for example, if you imagine a coin viewed at an oblique angle, is there something elliptical in your imagery experience, as though that circle were projected obliquely upon a flat plane? The attribution of such "projective distortions" (as I call them in my essay "Do Things Look Flat?") is, I think, a telltale sign that the experience in question is (implicitly or explicitly) taken to be flat.

It's a good question, Pete. Does that answer it?

Tanasije Gjorgoski said...

So, for example, if you imagine a coin viewed at an oblique angle, is there something elliptical in your imagery experience, as though that circle were projected obliquely upon a flat plane?

It might go both ways though.
We can say that a coin viewed at an oblique angle looks like an elliptical two-dimensional projection at a right angle. But we can also say that a two-dimensional projection at a right angle looks like a coin viewed at an oblique angle.

Why give primacy to one of those? Why conclude that because A looks like B, B is somehow representative of our phenomenological experience?

Eric Schwitzgebel said...

Thanks for the comment Tanasije! I agree about the symmetry here. But here's my thought: Unless there's some respect in which imagery is flat, how does ellipticality come into it at all?

I've been working through the early introspective physiologist, Purkinje, by the way, and I'm posting a bit of translation in the Underblog. One striking thing -- striking to me, but perhaps not striking to those who implicitly accept that visual experience and visual imagery are flat -- is that his descriptions of his eyes-closed visual experiences are always descriptions of planar configurations.

Tanasije Gjorgoski said...

Eric,
One could see a coin from a certain angle, and then see something elliptical.
One can then say that both things look the same (or are similar).
And even if I haven't seen such an elliptical thing, I can imagine an elliptical thing that would look like the coin from that angle.

Why ask how ellipticality comes into it, and not "how does the coin-at-certain-angleity come into it?" (in the symmetrical case)

Quirinius_Quine said...

"But here's my thought: Unless there's some respect in which imagery is flat, how does ellipticality come into it at all?"

I agree with you that the coin doesn't look flat, but it does look elliptical. To me, vision seems like a 3d construction in perspective: things look like they do shrink the farther away they are, and lines really do seem to converge to a point on the horizon, yet everything still looks 3d. Try to imagine a model scene constructed so that when viewed from above the lines still look to converge toward the horizon, though not as sharply as when viewed head on. If you've seen the original Willy Wonka and the Chocolate Factory, try to remember the scene where they walk through the hallway that really does get smaller the closer they get to the tiny door at the end, and then you'll have some idea of what I'm trying to say.

Eric Schwitzgebel said...

Thanks for the comments Tanasije and Mr. Quine!

On your point, Tanasije: I'm happy to ask how the coin-at-an-angle-ity comes into it. It does because the experience has a three-dimensionality to it. Similarly, I think the ellipticality comes in only if the experience (also?) has a two-dimensionality to it. But I'm not much inclined to think it does.

Quirinius, I'm embarrassed to confess that I'm not sure I understand what you're proposing - that it's experienced as three-dimensional but a bit squashed? Or...?

Pete Mandik said...

Eric,

I'm not sure I exactly get what your answer is, so let me ask my question in a slightly different way. Before I get to that, though, I'm first going to assert a bunch of stuff without argument.

Everything there is to say about the mind can be said in terms of representational contents or representational vehicles and there is no third option. Putative third options, like non-representational (non-content, non-vehicle) phenomenal properties don't exist. Further, all considerations that seem to favor them stem from content-vehicle confusions.

Ok, now that I'm done begging a whole bunch of huge questions, here's my question to you: when flatness is attributed to images, is that code for saying that things are represented as flat (the content option), the representational vehicles are themselves flat (the vehicle option), or is there some third flatness in play, a "phenomenal flatness" which is neither the literal flatness of representational vehicles nor the representation of flatness of the objects perceived or imagined?

In your answer, you initially pretty plainly reject the content and vehicle options. But when you go on to say what you think is going on, it's not clear that you are giving a genuine third option (as opposed to, say, a blend of the first two).

Here's what I'd say about all this stuff:

First, leaving the mind to the side a bit, let's look at the conventions of pictorial representation in painting and photography. One thing that happens frequently is that objects are represented as being farther from the viewer by having the representations of the objects be smaller and higher. So, two equally sized trees are represented as being differentially positioned with respect to the viewer by having the one tree representation be smaller than the other and higher up in the picture. The spatial property of distance from the viewer here is a represented property or content property. The spatial properties of smaller-than or higher-up invoked here are vehicular properties.

Sticking with pictures and moving on to coins, how does one represent a circular object as having its edges not being equally far away from the viewer? By having the representational vehicle be an ellipse. The representational content involves a circle (which has some parts closer to the viewer than others).

So why are ellipses good for representing circles and small high things good for representing big distant things? Because pictures are flat and the flatness here is vehicular flatness.

Moving back to the mind, what's the flatness attributed to images? My answer to that question is that it's vehicular flatness: there's a pretty literal sense in which neural elements in two d arrays are representing stuff the way pixels in a digital image represent stuff.

Sorry to go on so long!

Eric Schwitzgebel said...

Thanks for the very thoughtful comment, Pete -- not too long, but exactly as long as is necessary!

I'm not a representationalist about the mind, so I'm not going to accept your starting premises, I'm afraid. And yet I'm not sure that this topic is where I'd want to fight my anti-representationalist battles. (I fight them, instead, regarding matters of the discreteness and determinateness of representational structure and the non-discreteness and indeterminateness of the mind.)

In my earlier response to you, I didn't quite reject the vehicle interpretation. (I only said that brain processes weren't flat.) Maybe the supposed flatness (and remember, I'm not endorsing the flatness view, so I'm elaborating a position with which I'm not sympathetic) is something like the flatness of pictures. Yet I'm not sure that's entirely vehicle flatness, or that there's a clean content-vehicle distinction here. The canvas is of course flat, and the ellipse is a planar figure; but the canvas also, in some sense, represents an ellipse, no? -- It represents the coin as viewed obliquely by representing it elliptically. So in a sense there are primary and secondary representations. Or maybe that's not the right way to think of it?

Anyhow, my main argument is very simple: If there's somehow an ellipticality in the phenomenology, that implies some sort of planar projection is involved. And if phenomenology is determined (in part) by projection onto a plane, then there's a sense in which things look flat. No?

Pete Mandik said...

Thanks Eric, this helps focus things quite nicely. Of particular interest is your statement that the coin is represented elliptically and this is not vehicular ellipticality (apologies for any grotesque neologizing!). I wish I had an actual argument to offer, but it seems pretty clear to me that "represented elliptically" in the case of pictures is attributing the property of being elliptical to the representation, not to what the representation is a representation of. I'm not sure how to settle this, but it seems like we can boil down the debate to the following two choices.

Since I take it we don't disagree that a literally elliptical blob of paint on a differently colored background is doing the representing, the choices are:

Pete Choice: An elliptical blob is representing a tilted circle

Eric Choice: An elliptical blob is representing an ellipse and a tilted circle

Is this a fair way of putting it? I confess that I may very well have spelled the choices out to make the Pete Choice look better.

Tanasije Gjorgoski said...

Hi Eric,
First, I guess I didn't understand that you don't endorse the flatness view, so sorry for giving arguments against the view as if you accepted it. Anyway, I'm with you on this, though I wouldn't say there isn't anything flat about experiencing an oblique coin, but that the experience is of the "being in a 3d world, and looking at a 3d coin from a certain place" type.

Pete, as for the symmetry I mentioned previously...
1. Elliptical blob can represent a tilted circle
2. Tilted circle can represent an elliptical blob
Would you argue that
a) just 1 is possible, or
b) both are possible, but just that it is a contingent fact of our cognition that 1 is the case

Eric Schwitzgebel said...

I'm inclined to agree with your "being in a 3-D world" statement, Tanasije. That puts it nicely. But I don't want to be dogmatic about it.

On your question in the second part of your post: (1) seems very possible, quite common. (2) I would also say is possible -- practically anything can represent practically anything if the background circumstances and conventions are right -- but highly unusual! Is there something in particular you're inclined to draw from that?

Tanasije Gjorgoski said...

Eric,
I was not going anywhere with the second part. I found Pete's example (analogy?) of pictorial representations in photographs helpful, but I wondered whether he can give further arguments for why one should think that it is that kind of relation (2d vs 3d) that holds in phenomenal experience between the vehicular properties and content properties.

In my thinking, one might argue that the possibility of creating the illusion of a rotated coin by showing an elliptical color patch shows that that is the case. For if a 2d elliptical patch is enough to create the illusion of a rotated circle, then there is nothing in the 3d situation which is not in the 2d situation.
Though of course one might argue that the illusion is only possible in this special case, and that, e.g., if we put the elliptical patch close to the eyes, the illusion will disappear, and likewise if we are looking at some bigger object in front of us (e.g. the table I'm now at).

Tanasije Gjorgoski said...

Oh, a few days ago I was reading the draft of your paper "The Unreliability of Naive Introspection", which contains a point that might also be brought to bear on the issue discussed here...
There you talk about the center vs. the periphery of visual experience, and how experience is clear only in the proximity of the center of attention.
Connected to this, it is interesting to ask whether maybe that center (to some proximity) is 2d, and whether, if we are looking at a larger object, we need to move our eyes "over it", so that we necessarily include the third dimension (because we need to change our focus, e.g. to a more distant or closer point on the surface of the object).
It is also interesting to analyze this question in the context of the periphery... The periphery seems to go "around" the observer, so it seems it can't be 2d. But then maybe it is a convex 2d plane?

Pete Mandik said...

Tanasije,

I choose B.

Cheers,

Pete

Quirinius_Quine said...

"Quirinius, I'm embarrassed to confess that I'm not sure I understand what you're proposing - that it's experienced as three-dimensional but a bit squashed? Or...?"

'Squashed' isn't a term I would have chosen, but I think it points in the right direction. It would be "squashed" only towards one end, the lines converging towards the point of greatest depth.

Tanasije Gjorgoski said...

Eric, sorry for sending so many comments, but thinking about the periphery, I wonder if something like this would work as an experiment for the 2d/3d question.
Make a V sign with the fingers of your left hand, and raise the hand in front of your eyes. Now put the index finger of your right hand behind your left hand (so that you look at the index finger through the V structure of your left hand). While keeping eye focus on the index finger, observe the vagueness of the periphery.
Next put the index finger in front of the V and, still keeping eye focus on the index finger, observe the vagueness of the periphery.
The vagueness is clearly different in the two cases, and presents some kind of information about how to move the eye focus in order to shift it from the index finger to the V structure. In the first case I need to focus on something closer, and in the second to change focus to something more distant.
Is it possible to create an illusion of those differences using 2d pictures?

Eric Schwitzgebel said...

Pete, I don't know how I missed your comment before, when I was replying to Tanasije. I'm not sure I want to commit entirely on the question you pose; "representation" is not really my preferred language. There's something attractive in the way you put it, but I'm not entirely sure (notice the hedged language here) that there isn't a sense in which that blob of paint "represents" an ellipse. Analogously, maybe we can also say that the more distant tree (in an earlier example of yours) is "represented" as smaller and higher in the painting.

I suppose this has some similarity to Alva Noe's "dual content" view of perspectival representations.

Eric Schwitzgebel said...

That's a nice little armchair experiment, Tanasije! I am inclined to think that in the case you describe (fingers in a V, finger from the other hand either in front of or behind the V), the experience is so vividly three-dimensional that it dispels whatever temptation we might have to say that things look (to quote Hume) "as though painted upon a plain surface".

HH Price, interestingly (in his 1932 book Perception), suggests that one gets lively three dimensional effects close up, but they tend to flatten out in the middle distance, and in the far distance things look genuinely flat.