xiaogang yan , ankit khambhati , lei liu and tai sing leetai/public/phy4.pdf · xiaogang yan †,...

28
Yan, Khambhati, Liu and Lee 1 Neural dynamics of image representation in the primary visual cortex Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Lee ,* Center for the Neural Basis of Cognition and Computer Science Department * Carnegie Mellon University, Pittsburgh, PA 15213, U.S.A. Address for correspondence: Tai Sing Lee Rm 115, Mellon Institute Carnegie Mellon University 4400 Fifth Avenue, Pittsburgh, PA 15238. (412) 268 1060. (412) 268 5060. tai@cnbc.cmu.edu Running head: Image Representation in V1 Key words: Color tuning, multiplicative gain, feedback, boundary, surface, figure-ground, con- textual modulation, response latency, color filling-in. Acknowledgement: This research is supported by NSF and NIH. We thank Ryan Kelly, Jason Samonds, Matt Smith for helpful comments and technical assistance. A protocol covering these studies was approved by the Institutional Animal Care and Use Committee of Carnegie Mellon University, in accord with Public Health Service guidelines for the care and use of laboratory animals.

Upload: others

Post on 12-Aug-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 1

Neural dynamics of image representation in the primary visual cortex

Xiaogang Yan†, Ankit Khambhati†, Lei Liu† and Tai Sing Lee†,∗

Center for the Neural Basis of Cognition †

and Computer Science Department∗Carnegie Mellon University, Pittsburgh, PA 15213, U.S.A.

Address for correspondence:

Tai Sing LeeRm 115, Mellon InstituteCarnegie Mellon University4400 Fifth Avenue, Pittsburgh, PA 15238.(412) 268 1060.(412) 268 [email protected]

Running head:Image Representation in V1

Key words:Color tuning, multiplicative gain, feedback, boundary, surface, figure-ground, con-textual modulation, response latency, color filling-in.

Acknowledgement:This research is supported by NSF and NIH. We thank Ryan Kelly, Jason Samonds,Matt Smith for helpful comments and technical assistance. A protocol coveringthese studies was approved by the Institutional Animal Care and Use Committeeof Carnegie Mellon University, in accord with Public Health Service guidelines forthe care and use of laboratory animals.

Page 2: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 2

Abstract

Horizontal connections in the primary visual cortex have been hypothe-sized to play a number of computational roles: association field for con-tour completion, surface interpolation, surround suppression for saliencycomputation. Here, we argue that horizontal connections might alsoserve a critical role of computing the appropriate codes for image rep-resentation. That is, the early visual cortex or V1 explicitly represents theimage we perceive. Such a role for horizontal connection is commonlyassumed in theories of efficient coding for image representation, and isimportant for redundancy reduction. Recently, a number of recent fMRIand neurophysiological studies have argued against an explicit isomor-phic representation on the basis that the neurons’ responses appear tobe non-uniform across surfaces with uniform perceived brightness (Cor-nelissen et al. 2006), and those neurons that do respond to solid color(rather than edges) do not appear to respond in a way that is consistentwith perceptual filling-in (von der Heydt et al. 2003). In this study, weinvestigate, neurophysiologically, how V1 neurons respond to uniformcolor surfaces and show that spiking activities of neurons can be decom-posed into three components: a bottom-up input, an articulation of colortuning and a saliency effect that is inversely proportional to the distanceaway from the bounding contrast border. We demonstrate that resultsderived from the image representation model are consistent with theseobservations. This suggests that the horizontal connection network re-quired for implementing explicit isomorphic image representation mightbe an important part of the circuitry in the primary visual cortex.

Introduction

When viewing a figure of solid color or brightness as Figure 1A, we perceive a colorsurface. Yet, neurons in the retina and the LGN are mostly excited by chromatic andluminance contrast borders. The earliest theoretical explanation of this perception isadvanced by the Retinex Theory (Land 1971), which argues that there is a process forrecovering reflectance or perceived lightness within a region based on contrast sig-nals provided by the retinal neurons. This model was implemented in 2D by Horn(1974) using an iterative algorithm with neighboring units propagating brightnessmeasure from contrast borders. Subsequently, Grossberg and Mingolla (1985) pro-posed the so-called feature contour system/boundary contour system model for V1in which they described propagating brightness and color signals from the lumi-nance and chromatic contrast borders to the interior of the surface with a diffusion-like process via a synctium in the brightness and color channels respectively.

Page 3: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 3

Neurophysiological evidence for isomorphic representation, particularly in the con-text of color and brightness representations have become controversial recently. Whilesome single unit neurophysiological studies (Rossi et al. 1996, Kinoshita and Ko-matsu 2001, and Roe et al. 2005) yielded results that might be consistent with bright-ness and color filling-in in V1 and V2, both von der Heydt et al.’s (2003) single unitrecordings, and Cornelissen et al.’s (2006) fMRI studies failed to find neural evidencein early visual areas that is consistent with this account. These latter authors arguedthat mental representation of solid color on a surface might be inferred by higherorder extrastriate cortex based on chromatic contrast responses in V1 to edges only,instead of having to synthesize isomorphic or explicit representation of the input im-age or mental image in the early visual cortex. One difficulty of the interpretation ofspiking activities of neurons in V1 is that there are a number of factors contributingto the response of each individual neuron. We foudn that in response to a color stim-ulus, a neuron is driven by feedforward bottom-up input which depends on conesensitivity to the different color stimulus. Over time, different neurons might ar-ticulate preference for different colors that depends on chromatic context (Wachlteret al. 2003). In addition, attention and figure-ground configuration of the stimuluscan bring in another layer of of contextual effect (e.g. Lamme 1995, Zipser et al.1996, Lee et al. 1998, 2002). Here, we use single-unit neurophysiological studies andsimulation studies to tease apart the different components of the neural responses,and argue that a computational model, based on image segmentation, implementedwith recurrent horizontal connections can potentially explain a number of earlierphysiological observations, and in fact, provide some evidence for isomorphic rep-resentation of perception in the primary visual cortex.

Neurophysiological Methods

This paper contains results from both neurophysiological experiments and compu-tational simulation. We will first describe the neurophysiological experiments, andthen present computational simulations to further understand these results. Neu-rophysiology was performed on two awake behaving monkeys. Single-unit record-ings were made transdurally from the primary visual cortex of the monkeys withepoxy-coated Tungsten electrodes through a surgically implanted well overlying theoperculum of area V1. The receptive field (RF) of each cell was first mapped with asmall bar and with a texture patch as the minimum responsive fields. The receptivefields of recorded cells were located between 0.5◦ − 4◦ eccentricity in the lower leftquadrant of the visual field, and ranged in size from 0.5◦ to 1◦, with eye movementjitter factored in. The procedure and setup are essentially the same as that in Lee etal. (2002).

During recording, the monkeys performed a standard fixation task during stimu-lus presentation, where they were trained to fixate on a dot within a 0.7◦ fixation

Page 4: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 4

window on a computer monitor, 58 cm away. After stimulus presentation within aparticular trial the fixation dot jumped to another random location, and the monkeyhad to make a saccade to the new location in order to successfully complete the trialand obtain a juice reward. Each trial lasted for a fixed amount of time. The stimulifor each condition was presented statically on the screen in each trial. Each conditionwas repeated 20-30 times/trials, randomly interleaved with other conditions.

In these experiments, we studied only those neurons with localized receptive fieldsthat exhibited some sustained responses inside a uniform (solid) color region muchlarger than the receptive field. The waveform of the spikes had a SNR of at least1.5 in peak-to-peak amplitude relative to the background noise. The analysis here isbased on single-unit activities and multi-unit activities. Single units and multi-unitsare analyzed separately, reported separately in some cases (Figure 5). But since thereare no qualitative distinction between single units and multi-units in the effect weobserved, we tend to lump them together in population analysis. Many cells thatresponded well to oriented bars but not to uniform color surfaces. These neuronswere also omitted from this study. The color-sensitive cells studied here were about15 percent of the cells encountered. We found there is a tendency for cells preferringto the same color tend to cluster together in a particular penetration.

Stimuli were presented on a Sony Multisync color video display monitor driven by aTIGAR graphics board via the Cortex data acquisition program developed by NIH.The resolution of the display was 640 x 480 pixels, and the frame rate was 60 Hz. Thescreen size was 32 x 24 cm. When viewed from a distance of 58 cm, the full screensubtended a visual field 32◦ x 24◦ of visual angle. One pixel corresponded to a visualangle of 0.05◦.

The stimuli tested typically contained a color figure in a background of contrastingcolor. Four colors were tested: a red and green contrasting color pair, and a blueand yellow contrasting pair. Fig. 1A illustrates the red-green stimulus pair. The fourcolors, plus the intertrial gray, were equiluminant at 7.4 cd/m2, as measured usingTektronix J17 Lumacolor Photometer/Colorimeter. Their values in CIE coordinates(x,y) are (0.603, 0.364) for red, (0.286, 0.617) for green, (0.454, 0.481) for yellow and(0.149, 0.066) for blue. In the LMS coordinate system, the cone contrast of our red orblue relative to the gray is about the same, and is more than twice that of our greenor yellow. The colors were chosen to be as saturated as possible subject only to theequiluminant constraint.

Note that colors used in other color tuning or color coding experiments (Wachtleret al. 2003, Friedman et al. 2003) are chosen to have equal cone contrasts relativeto the adapting gray. This constraint is important for assessing the contributionsfrom the L,M,S cone inputs to the neuron in question. However, these ‘balanced’

Page 5: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 5

colors tend to be unsaturated, pastel-looking in appearance and not very salient. Theequiluminant colors we used were not equalized in cone contrast. Thus, the colortuning curves we obtained are not based purely on color, but subject to influence ofcone-contrast as well. Our color tuning curve should be interpreted with caution inthat regard.

We consider our choice acceptable for the purpose of our questions for the followingreasons. First, it is well known that V1 neurons’ color preferences are fairly diverseand dispersed, covering a variety of color, and are no longer limited to the canonicalLMS cone axis (Lennie and Movshon 2005). Hanazawa et al. (2000) demonstratedthat V1 cells’ color tunings exhibit preference to a variety of cues and saturation, sug-gesting an elaboration of color tuning to support color perception. Thus, in theory,any set of colors that span the color space are equally legitimate for demonstratingcolor sensitivity defined in a more general sense. Second, empirically, even thoughthe red and blue stimuli are stronger in cone contrasts, many of the neurons we stud-ied exhibited tuning preferences for green or yellow. This suggests that V1 neurons’color tunings are not solely determined by the input’s cone contrast but by morecomplex factors.

Spatiotemporal Dynamics of Neural Responses to Color Stimuli

To examine evidence of filling-in, we first evaluated the neural responses to differentparts of the stimulus images that contain a 4◦ × 4◦ solid color figure in a contrastingequiluminant color background (Fig. 1A). One stimulus image contains a red figurein a green background while the other contains a green figure in a red background.Each image was presented statically on the screen for 350 msec in each trial. Theimage was shifted spatially over successive trials, placing the figure at one of the 12locations relative to the receptive field at 0.66◦ or 1◦ intervals, spanning a range ofover 10◦ (as shown in Fig. 1A). This allows monitoring of the temporal evolution ofthe neuron’s response to each particular spatial location of the image.

Fig. 1B and 1C present the temporal responses of two representative V1 neurons to12 different spatial locations of each of the image pairs shown in Fig. 1A. Neuronstended to exhibit strong and sustained responses at the boundary, but many cellsalso exhibited weak sustained responses to the solid color surface where there wereno contrast features within the receptive fields, which at 2−4◦ eccentricity, were lessthan 1◦ in spatial extent.

Many effects appeared to be superimposed in the neurons’ responses to such stimuli.First, the transition from gray screen to color figure produced a temporal chromaticcontrast signal that elicited a fast initial response in the neurons. Even though thestimuli were of equal luminance, the cones have different sensitivities to these colors,

Page 6: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 6

Figure 1: (A). A pair of complementary stimuli tested: a red figure in a green back-ground, and a green figure in a red background. The other pair is blue/yellow figurein yellow/blue background. Crosses indicate the 12 RF locations of the neuron foreach image to be sampled. (B) The spatiotemporal response profiles of a neuron inresponse to the pair of stimuli above. Each depicts the temporal evolution of neu-ral responses at the different spatial locations indicated. The response is normalizedagainst the maximum response of each cell to the pair of the complementary stim-uli for easy visualization. The center of the figure in the stimulus was located at 0◦

along the x location axis, with the borders of the figure located at−2◦ and 2◦. (C) Thespatiotemporal response profiles of another neuron to the same stimuli, showing afigure enhancement effect regardless of the color of the surface.

Page 7: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 7

0

100

200

300

400

500

6000−30 ms 20−50 ms 30−60 ms 60−90 ms

−2 0 2 40

100

200

300

400

500

600

Firi

ng r

ate

(Spi

kes

per

seco

nd)

90−120 ms

−2 0 2 4

120−150 ms

Spatial location in visual angle−2 0 2 4

150−180 ms

Spatiotemporal profile of neuronal responses to figure−ground stimuli (di071304.7, cell1)

−2 0 2 4

180−210 ms

0

50

100

150

2000−30 ms 30−60 ms 70−90 ms 90−120 ms

−2 0 2 40

50

100

150

200

Firi

ng r

ate

(Spi

kes

per

seco

nd)

120−150 ms

−2 0 2 4

150−180 ms

Spatial location in visual angle−2 0 2 4

220−250 ms

Spatiotemporal profile of neuronal responses to figure−ground stimuli (di071404.4, cell1)

−2 0 2 4

340−370 ms

Figure 2: Two neurons’s spatial response profile to the two stimuli in Figure 1 indifferent time windows.

and thus provide a differential bottom-up response to the V1 neurons. Secondly, inthe later part of the responses, 100 msec after stimulus onset at the latest, the neuronsexhibited preferences to the different colors, independent of the initial bottom-up re-sponse. Some neurons prefer red in the later part of their responses, while otherneurons prefer green in the later part of their responses. Thirdly, we found the neu-rons’ responses were stronger when their receptive fields were inside the figure thanwhen their receptive fields were inside the background. This is evident in the twoneurons shown in Figure 1B and 1C. Figure 2 shows the spatial response profile ofthese neurons in different time window post-stimulus onset, again to illustrate thestrong initial responses at the boundary, and the subsequent differential responsebetween the figure (between -2 and 2 along the x-axis in the graph) and the back-ground.

Figure-enhancement Contextual Modulation

We first investigate the differential response of the neurons to figure and backgroundin these color stimuli, analogous to the figure enhancement observed by Lamme(1995) on texture figures. Fig. 3 shows the population averaged responses of 82 V1neurons in responses to the different parts of these two test stimuli in Fig. 1.

We examined the population average temporal response of these neurons to threespecific locations of each of the two images (the figure center, the figure border,and 2◦ outside the figure border). Each cell’s peri-stimulus histogram (PSTH) wassmoothed with a 10 msec window before population averaging. The population re-sponses were stronger initially for the red figure than for the green background (Fig.

Page 8: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 8

3A), and were stronger initially for the red background than the green figure (Fig.3B), clearly indicating that the red stimulus might be a more potent bottom-up stim-ulus. However, as time evolved, the later part of the population PSTH was strongerin the green figure than in the red background (Fig. 3B). The sustained neural ac-tivity in the later part of the response thus is stronger inside the figure than in thebackground, regardless of the color of the receptive field stimulus. This is a ratherdramatic effect, not observed in neural responses to texture stimuli (Lamme 1995,Lee et al. 1998).

Figs. 3C and 3D compare the average temporal response of this population to thesame receptive field stimulus (red or green respectively) in the figure (center) condi-tion and in the background condition (2 degrees outside). It can be observed that theresponses to the two conditions start to diverge significantly at 70 msec for red and66 msec for green. We assessed the figure enhancement onset time for each neuronby performing a running T-test with a window of 20 msec, starting with stimulusonset, to determine when the responses (across different trials) at the figure centerbecame greater than the responses in the surround with statistical significance (witha p ≤ 0.05). To make the estimation robust against noise, we additionally requiredthe enhancement be statistically significant for at least 70 percent of the subsequent20 windows, shifted in 1-msec step.

This enhancement of response to identical stimulus when it is placed inside the fig-ure relative to when it is placed outside the background is called the ‘figural enhance-ment’ responses, as originally observed by Lamme for texture stimuli. However, thefigural enhancement effect for these color figure stimuli is much greater. The on-set time of the figure enhancement effect for color stimuli could be as early as 50-60msec, but on average the enhancement effect tend to emerge in the later part ( 100ms post-stimulus onset) of the responses (see Figure 3E and 3F).

Figure Enhancement and Color Selectivity

To evaluate the figural enhancement effect as a function of color, we measured 81 (54from monkey D, and 27 from monkey B) V1 neurons’s responess to the four colors(red and green stimuli as shown in Figure 1, as well as yellow and blue stimuli ofequal luminance) both at the center of the figure and in the background condition(5o away from the figure’s border), as well as an equililuminant gray screen. Figure4 shows a typical neuron’s response to the four colors in the figure condition (firstrow) and the ground condition (second row). It can be observed that the neuron’sresponses when its receptive field was inside the figure was significantly strongerthan its responses when its receptive field was in the background, for all four colors.

To compare the figure enhancement effect observed in color figures with data from

Page 9: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 9

A B

C D

! "! #!! #"! $!! $"! %!! %"!!

"

#!

#"

$!

$"

%!

%"&'()*+,-./0+/1.2./3-4/5.3-67*../-8&-53'2),)59

:/5.3-3'2.-625.19

;)2<.*-4=-1.,,5

;>[email protected]'+/>#BB@.+/>#B!C?DE!FC%

! "! #!! #"! $!! $"! %!! %"!!

"

#!

#"

$!

$"

%!

%"&'()*+,-./0+/1.2./3-4/5.3-67.8-7&-53'2),)59

:/5.3-3'2.-625.19

;)2<.*-4=-1.,,5

;>"[email protected]'+/>?A@.+/>#!%BCDE!FBF

E F

Figure 3: (A,B) Population responses of V1 neurons to three specific locations of eachone of the complementary stimulus pair. (C,D) Population responses of V1 neuronsto same RF color stimulus in the figure or the ground context. Border response isalso shown for reference. (E,F) Distribution of the figure enhancement onset time forred surface and green surface.

Page 10: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 10

0 500 10000

50

100

150

200

250Red Figure

Firin

g ra

te (s

pike

s/se

cond

)

Time post!stimulus onset (msec)0 500 1000

0

50

100

150

200

250Green Figure

0 500 10000

50

100

150

200

250Blue Figure

0 500 10000

50

100

150

200

250Yellow Figure

0 500 10000

50

100

150

200

250Red Ground

0 500 10000

50

100

150

200

250Green Ground

0 500 10000

50

100

150

200

250Blue Ground

0 500 10000

50

100

150

200

250Yellow Ground

Figure 4: A V1 neuron’s response to the four colors in the figure condition (first row)and the ground condition (second row). This neuron cell prefers yellow the most,but also responds considerably to blue. Enhancement of response is observed whenthe cell (receptive field diameter 1 degree) is inside the 4-degree figure than when itis in the background.

Page 11: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 11

−1 −0.5 0 0.5 1−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1(F−G)/(F+G)

Red

Green

sumu

−1 −0.5 0 0.5 1−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1(F−G)/(F+G)

Yellow

Blue

sumu

Figure 5: Scatter plots of figure-enhancement for the four colors. Red and Green (leftgraph). Yellow and Blue (right graph). Substantial and significant figure enhance-ment effect is observed for all colors.

earlier texture studies (Lamme 1995, Zipser et al. 1996, Lee et al. 1998, Rossi et a.2001), we computed the figure enhancement index for each color as (F-G)/(F+G),where F is the response of the cell inside the figure of a particular color, and G is theresponse inside the background of the same color. Figure 5 shows the scatter plotsof the figure enhancement indices for the four colors. The mean figure enhancementindices in this population of V1 neurons are 0.26, 0.36, 0.24, 0.38 for red, green, yel-low and blue figures respectively, which are much stronger than the mean figureenhancement index (about 0.1) observed in V1 for texture stimuli (Lee et al. 1998).We found 51% of the cells showed significant enhancement for all four colors, 27%for three, 16% for two, 6% for one color only. Thus, the figure-enhancement effectobserved in solid color figure is significantly stronger than the effect observed fortexture stimuli.

While many of our recorded cells have strong transient response to red, they tendexhibit preference for different colors in the later part of their response. Fig. 1 andFig. 2 show examples of single-cells and Fig 3B shows population averages thatinitially responded predominantly to red ground and later to green figure. This laterdominance of response to the green figure over red backgfround in Figure 2B canarise from a strong figure enhancement over the background or a development ofpreference for green color by the neurons.

To decompose these two potential factors, we analyzed the neuronal responses tored and green in either the figure context or the background context separately. Weseparate the cells into groups depending on their color preference based on the laterpart of their responses to the different colors in the figure condition. The responses of

Page 12: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 12

of the neurons that ultimately prefer green over red are the most revealing, as shownin Figure 6. Among these 81 neurons, 25 exhibited preference for green over othercolors in the later part of their responses. From their population PSTHs in responseto red and green figures (Fig. 6A), and red and green backgrounds (Fig. 6B), in bothcases we observed that the response to the green surface became greater than theresponse to the red surface at about 90 msec. We also estimated the contextual colortuning onset for each cell using the same running T-test used for estimating figureenhancement onset. Fig. 6C shows the distribution of the contextual color tuningonset, yielding a mean of 103 msec (median = 104 msec) for the figure condition, and117 msec (median = 91 msec) for the background condition.

We take this later onset of preference for green, or other colors, over red, as the factthat the articulation of color preference in V1 neurons happens in the later part oftheir responses. We applied the same method we used to estimate the figure en-hancement onset to estimate the onset of the color tuning as defined here i.e. thetime the green overtakes the red. The distribution shown in Figure 6E shows a meanof 113 msec (median = 101 msec), comparable to the distribution of color tuning on-set time observed above. Fig. 6F compares the color tuning onset and the figureenhancement onset estimated for each neuron. Except for three or four neurons,whose figure enhancement onset was significantly delayed relative to color tuningonset, the figure enhancement and the color tuning appeared to emerge almost si-multaneously for most neurons inside the 4◦ × 4◦ figure, lagging behind the borderresponse by 70 msec.

We found the neurons’ preferences for color in the later part of their responses werediverse (see Figure 7). Among the 81 selected color sensitive cells studied, 15 of themare boardly tuned. The remaining prefer one or two of the four colors strongly overthe rest. 17 of them prefer red, of which 6 also prefer yellow (‘orange’ cells) and 6also prefer blue (‘magenta’ cells). 36 cells prefer green, of which 13 prefer yellow(‘yellowish green’ cells), and 8 prefer blue (‘cyan’ cells). 28 cells prefer yellow, ofwhich 6 also like red (‘orange’ cells), and 13 also prefer green (1 ‘yellowish green’cells). Finally, 18 cells like like blue, of which 8 also prefer green (‘cyan’ cells), and 6also prefers red (‘magenta’ cells). Tuning tend to be signficantly sharper in the figurecondition than in the ground condition in large part because the neural responsestend to be much weaker in the ground condition (see Fig. 5). Color tuning wereconsistent in 54 % of the neurons, roughly exhibiting a multiplicative gain. But for46% of the neurons, the tuning tuning appeared to be changed between the twoconditions, probably due to a change in chromatic context: for example, in the figurecondition, the receptive field could be seeing red surrounded by green, while inthe ground condition, the receptive field could be seeing red surrounded by blackmonitor frame, with a green figure somewhere on the screen. The bottomline is thatV1 neurons we studied appeared to be tuned to a spectrum of color, as described in

Page 13: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 13

0 50 100 150 200 250 300 3500

50

100

150

200

250

300

350Onset time: Color vs Figure

Figural enhancement Onset (msec)

Col

or tu

ning

Ons

et (m

sec)

−100 0 100 200 3000

10

20

30

40

50

60

70

80Response to R/G Background (green cells)

Time post stimulus onset (msec)Fi

ring

rate

(sps

)

N(red)=28N(green)=28

Red GroundGreen Ground

−100 0 100 200 3000

10

20

30

40

50

60

70

80Response to R/G Figure (green cells)

Time post stimulus onset (msec)

Firin

g ra

te (s

ps)

N(red)=28N(green)=28

Red FigureGreen Figure

0 50 100 150 200 250 300 3500

2

4

6

8

10

N=25Median=101Mean=113.7+/−8.8

Green figure−ground Onset time

Onset time (msec)

Num

ber o

f cel

ls

0 50 100 150 200 250 300 3500

2

4

6

8

10

N=19Median=91Mean=117.4+/−13

Green Onset time (ground)

Onset time (msec)

Num

ber o

f cel

ls

0 50 100 150 200 250 300 3500

2

4

6

8

10

N=25Median=104Mean=103+/−4.7

Green Onset time (figure)

Onset time (msec)

Num

ber o

f cel

ls

A

C

E

B

D

F

Figure 6: (A,B) Population average PSTHs of 25 green cells in response to the greenfigure versus the red figure (A), and to the green ground verus the red ground (B).(C,D) Distribution of onset of contextual color tuning in these green neurons in thefigure condition (C) and the ground condition (D). (E) Distribution of onset timingof the figure enhancement effect for these green neurons. (D) Comparison betweenthe figure enhancement onset and the contextual color tuning onset.

Page 14: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 14

10

20

30

40

YELLOW

BLUE

GREEN RED

be323LW4d2

10

20

30

40

YELLOW

BLUE

GREEN RED

be3314BWd2

20

40

60

80

YELLOW

BLUE

GREEN RED

be3354BYd2

10

20

30

40

YELLOW

BLUE

GREEN RED

di0528041BWa2

Figure 7: Examples of color tuning in four individual neurons, prefering one of thefour colors. The thick line is the color tuning in the figure condition while the thinline is the color tuning in the ground condition.

Hanazawa et al. (2000).

Untangling Figure Enhancement and Color Coding

Our experimental results show that multiple factors likely contribute the neural ac-tivities observed in these stimuli, namely, difference in chromatic context, articula-tion of color tuning and bottom-up cone sensitivity effect, figure enhancement orsaliency effect. To dissociate these three confounding factors, we developed the fol-lowing novel asynchronous stimulus update paradigm (Fig. 8A) to test an additional55 neurons, drawn evenly from the two monkeys.

In each trial, while the monkey fixated, the color of the full screen was first changedfrom gray (which was presented between trials) to one of the four colors; 350 mseclater, the region surrounding the receptive field of the neuron was abruptly changedto a contrasting color, making visible a disk figure over the receptive field withoutchanging the RF stimulus. The stimulus remained on the screen for another 400msec and then the screen returned to gray. In addition, the ‘figure’ (rendered in thesurround update) was presented in four sizes (1, 3, 6, 9 degrees in diameter ) to ma-nipulate figure-enhancement or saliency effect by changing the figure size. For the1 degree figure, the receptive fields of the cells likely hit the chromatic border. Only12 cells were recorded for this small figure later on in this epxeriment to provide areference for the onset of boundary response. All 55 neurons were tested with the 3,6, and 9 degrees disks which are much larger than the receptive fields and have thesame chromatic context.

The whole screen of uniform color served as a baseline condition for evaluating thefigure response. It was observed that V1 neurons typically gave a transient responsewhen the entire screen changed color, followed by a weak sustained response, whichnevertheless exhibited a color tuning as before. This is analagous to the backgroundcondition in the first experiment. When the surround changed to the opponentcolor during surround update, the neurons gave a strong and robust response, even

Page 15: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 15

Monkey fixates Target color

appears Background changes

color

Target disappears,monkey makes asaccade to a dot

−600 −400 −200 0 200 4000

20

40

60

80

100

120

140

160

Time relative to surround update (msec)

Firin

g ra

te (s

ps)

Temporal evolution of responses (Green RF stimulus)

1º disk3º disk6º disk9º diskuniform

1 3 6 90

20

40

60

80

100

120Figural enhancement onset (Green)

Diameter of the disk (degrees visual angle)

Aver

age

onse

t tim

e (m

sec)

−600 −400 −200 0 200 4000

20

40

60

80

100

Time relative to surround update (msec)

Firin

g ra

te (s

ps)

Temporal evolution of responses (Red RF stimulus)

1º disk3º disk6º disk9º diskuniform

1 3 6 90

20

40

60

80

100

120

140Figural enhancement onset (Red)

Diameter of the disk (degrees visual angle)

Aver

age

onse

t tim

e (m

sec)

A

B

C

Figure 8: (A) Asynchronous update paradigm. The entire screen was turned red at-350 msec, followed by the appearance of a green background surrounding the RFat 0 msec, making visible a circular red disk over the RF. (B) Population PSTH ofcells responding to red figure (left panel) and green figure (right panel) of differentdiameters (1, 3, 5, 9) as well as full color screen. Surround update occurred at 0 msec.(C) Mean response onset to ‘red figure’ subsequnt to surround update as a functionof the figure’s diameter (left panel). Mean response onset to the ‘green figure’ (rightpanel).

Page 16: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 16

though nothing was changed within the receptive field or in its proximal surround.Note that the graph of the average response to the 1 degree figure appeared to be dif-ferent after the first update even though the stimulus is identical. This is because theaverage is based on only a subset of the neurons. The response to the 1 degree figuretend to have a very immediate onset, suggesting feedforward border contrast stim-ulus likely impringes on the receptive field. Fig. 8B shows the population PSTHsfor the cells that showed significant responses to the surround update, which tookplace at 0 msec in the graph, whereas the full screen color change (from gray to color)occurred at -350 msec on the graph.

The responses subsequent to the surround update (at 0 msec) were quite dramaticand sustained. It was statistically significant (relative to the baseline) for 27 cellsfor the 3-degree figure, 20 cells for the 6-degree figure, and 17 cells for the 9-degreefigure. Since the size of the receptive field of the neuron is typically less than 0.7◦,the responses to the surround update were considered contextual. Given the colorof the full screen can be readily perceived, and that the color tuning has already beenestablished after the first update, the sustained significant response observed afterthe surround update can be attributed to the figure enhancement effect. The colortuning curve of the neurons in the different sizes (particularly between 3 degreeand 6 degree disk) was essentially the same, differ by a multiplicative gain, thoughthey could be substantially different from color tuning in the first update due to thedifference in chromatic context.

The strength and sustained nature of this contextual signal suggests that the sus-tained signals observed inside the figure in the first experiment is primarily dueto contextual modulation. The onset of the response due to surround update wassignificantly delayed relative to the onset of the responses to border or even to thesurface. The mean onset time of the contextual responses, estimated based on thecells that showed measurable and significant responses, were 86, 112, 116 msec (red),and 62, 76, 101 msec (green) for the 3,6,9-degree diameter red and green disks re-spectively. Xiaogang, include delay for yellow and blue as well The responses was signif-icantly delayed relative to the response to the 1-degree figures (53 msec for red figure,and 36 msec for green figure) which is more likely resulted from direct stimulationby the figure’s border. Such progressive delay with an increase of figure diameter isconsistent with a propagation process rather than direct stimulation. What is the na-ture of the signal being propagated? Is it a repesentation of color, or a representationof saliency, or both?

Page 17: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 17

Figure Enhancement and Image Representation

The progressive delay of response onset to color stimulus as a function of size issimilar to what Huang and Paradiso (2008) found for luminance disk in their study.Huang and Paradiso (2008) tested the responses of V1 neurons to the center of awhite disk in a black background. As they changed the size of the white disk, theonset of the neurons’ responses became progressively delayed. They interpretedthis effect as ‘brightness’ filling-in, corresponding to what Paradiso and Nakayama(1992) observed decade earlier in a classic psychophysics experiment on the neuraldynamics of filling-in. However, without testing also the neural response to blackdisk in a white background, one cannot be certain whether the delayed neural re-sponses observed inside the figure was coding brightness or coding saliency (as infigure enhancement).

One can argue that in our second experiment (Figure 8), the neural correlate of theperception of red surface is already established after the first update, the secondwave of much strong response is likely due to contextual modulation of spatial con-figuration of the stimulus. The magnitude of the response became weaker as the sizeof the white disk increases, yet the percept of brightness or color does not appear tochange with an increase in the size of the figure. This might suggest that this secondwave can be attributed to saliency or gain modulation, rather than color coding.

However, such an argument is based implicitly on a common assumption that torepresent uniform brightness, neurons need to respond uniformly across the space.We argue here that this commonly held assumption could be false, that an isomo-metric representation of color and brightness is in fact compatible with a neural ac-tivity that is dependent on figural size or distance away from contrast border. Thatis, the edge-dependent figure enhancement signal observed in single unit activitiesreported earlier (Lamme 1995, Lee et al. 1998) and here, and in fMRI findings (Cor-nelissen et al. 2006) is compatible with models of isomorphic image representationin the primary visual cortex.

Faithful image representation is a starting assumption of the efficient sparse codingaccount of the emergent properties of simple cell receptive fields (Olshausen andField 1996, Lewicki and Olshausen 1999), as well as in hierarchical predictive coding(Rao and Ballard 1999) or hierarchical Bayesian inference model (Lee and Mumford2003). In such an image model, any arbitrary images can be represented using anovercomplete set Gabor wavelets, which model V1 simple cells, (Daugman 1988,Lee 1996):

I(x, y) =∑

i

aiφi + n (1)

Page 18: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 18

where n is noise or factors unaccounted for. The represented (or perceived) imageI attempts to explain the input image I(x, y) as a linear weighted sums of wavelets.The weight ai of each wavelet φi (presumably the firing rates of simple cells) areobtained by minimizing the following objective function, which is the square errorof how well these weight sum can explain the image (Eq. 2),

E =∑

x,y(I(x, y)− I(x, y))2

=∑

x,y(I(x, y)−∑

i aiφi)2(2)

The φ is related to the synthesis basis of a simple cell, which is complementary to theconventional filter notion of receptive field, and ai the cell’s firing rate. It has beenshown that the filters and the basis derived from independent analysis resembledthe receptive fields of V1 neurons.

The highly redundant nature of the wavelet representational elements (i.e. simplecells) implies ai cannot simply take the form of the filter response, but need to beadjusted over time. This set of ai can be obtained by minimizing the convex objectivefunction by iterative gradient descent, which can be realized by horizontal networkconnections. The connectivity of the network can be revealed by taking the partialderivative of the error function (Eq. 2) with respect to ai to arrive at the descentequation (Eq. 3).

daidt = − dE

dai

= 2∑

x,y I(x, y)φi(x, y)− 2∑n

k

∑x,y ak(x, y)φk(x, y)φi(x, y) (3)

The first term of the descent equation (Eq. 3) computes the dot product of the in-put image with the Gabor wavelet, using them as the feedforward receptive fieldsof the neurons. Here, the synthesis basis is used initially as the feedforward filters.The second term represents the lateral interactions between neuron i and any otherGabor wavelet neuron with redundancy in their representation. That is, two neu-rons inhibit each other when the dot products of their ‘receptive fields’ are positive,and facilitate each other when the dot products of their ‘receptive fields’ are nega-tive. When their receptive fields are orthogonal, there will be no interaction betweenthem for the purpose here. On the other hand, there are other long-range interac-tion between complex cells such as longitudinal facilitation along association, whichare distinct from the local interaction for redundancy removal between simple cellsmodeled here), which are not captured here.

The implemented model consists of a retinotopic map of cortical hypercolumns cov-ering an input image of 192x192 pixels. The pixel intensities of the input image wererescaled to max (white) of 127.5, min (black) of−127.5 and mean (gray) of 0. The im-age was represented by a set of Gabor wavelets specified by (Eq. 4), as in Lee (1996).

Page 19: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 19

The Gabor wavelets were constrained to a single scale with 8 orientations, at 22.5◦

intervals. In each orientation, the Gabor filters exist in both even and odd-symmetricforms, with zero d.c. response (Lee 1996).

φ(x, y, ωo, θ) = ωo√2πκ

e−ω2

o8κ2 (4(x cos θ+y sin θ)2+(−x sin θ+y cos θ)2)

·(

eiωo(x cos θ+y sin θ) − e−κ2

2

) (4)

We do not model the change in retinal or cortical receptive field size as a functionof eccentricity, i.e. the spatial resolution is assumed to be constant across the retino-topic map. While the responses of these zero-dc wavelets depend on contrast only.That is, neurons do not respond initially to uniform luminance surface, where thereare no luminance or chromatic contrast within the receptive fields. However, as wewill show, a population of such zero d.c. wavelets can nevertheless represent surfaceof uniform brightness in the later part of their responses. This model works with avariety of parameters and also on multiple scales (see also Lee 1996). Here, for sim-plicity, we only used the model with the single (finest) scale, with a spatial samplingof 2 pixels, spatial frequency of 0.314 pixels, spatial frequency bandwidth of 3.5 oc-taves, and two spatial phases, i.e. even and odd symmetric Gabor wavelets, whichis sufficient to represent uniform luminance images.

In our simulation, when a stimulus is first presented, the inital firing rates, ai, ofthe neurons are set equal to the responses of filter response of the Gabor waveletsto their corresponding image patch, relating to the initial feed-forward responses ofthe V1 simple cells. The responses of the cells are subsequently adjusted by gradientdescent to minimize the image representation error function (Eq. 2). Each iterativeapproximation of the input image is represented by an arbitrary length of time be-cause of unknown psychophysical parameters that would be required to discern ourmodel’s time-scale. Hereafter, we refer to time in units of iterations passed.

We tested the responses of the neurons to a disk of uniform brightness, and examinedthe spatial and temporal evolution of the synthesized image I . We also comparedwavelet coefficients of the basis (simple cells), as well as the sum of ther absolutevalues (complex cells) during the image synethesis process to observed neural re-sponse. The represented image were rescaled to 8-bit, grayscale, pixel intensities.Individual hypercolumn responses represent the average magnitude response of allthe wavelets φi at each scale populating that corresponding spatial location. Ourhypothesis is the spatiotemporal profile of observed neural responses (which ap-peared to be a smoothed version of the edge responses) can be understood in termsof the spatiotemporal profile of the wavelet coefficients for representing the lumi-nance disk.

Page 20: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 20

Figure 9: Sum of responses of neurons at all orientations at each position along thehorizontal midline to show the temporal evolution of the population responses. Thewhite and the black disks excited complementary sets of neurons (see Figure 9), butthe total population responses to the two stimuli were equal. There is a gradualdecrease in response away from the border toward the center in most neurons.

Page 21: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 21

Our simulation result shown in Fig. 9 demonstrates a successful synthesis of thewhite disk stimulus. The temporal evolution of the synthesized image I(x, y) showsfirst a representation of the contrast border of the white disk, followed by a filling-inof brightness in the surface of the disk. To examine the evolution of total neural re-sponses within each hypercolumn at different spatial locations across a cross-sectionof the disk that accompany the evolution of the represented image (Fig. 9), wesummed the absolute values of the coefficients of Gabor wavelets (all orientations,even and odd-symmetric) at each location. Note that all these neurons are orienta-tion tuned with zero dc components. Over time (iterations), the coefficients (firingrates) can be adjusted so that these Gabor wavelets can represent the disk perfectly.Even though a uniform brightness disk is represented at the end, the sum of neuralactivities across the hypercolumn is not uniform across space. It is also important tonote that the black disk essentially evoked the same spatiotemporal responses, at thepopulation level, as shown in Figure 9.

It is interesting to note that the temporal evolution of the synthesized image by gra-dient descent through network interaction exhibit the same brightness filling-in dy-namics observed by Paradiso and Nakayama (1991) in their psychophysical exper-iments. The final column is illustrative of the fact that a uniform brightness (color)disk can be represented by these population of wavelet neurons with a distributionof coefficients that emphasize on the edge over the interior, which become a bit moresmoothed over time. The response plotted here is the sum of neurons of all orienta-tions at each spatial location, and is comparable to the observations in Cornelissenet al.’s (2006) fMRI study, in which every voxel obviously is the sum of activities ofall neurons in the voxel.

Figure 10 dissects this population responses further, into vertical and horizontal se-lective neurons, each of the even and odd symmetric phase types (receptive fieldsare shown in the icons). It reveals that the odd symmetric vertical cells are respon-sible for the lion share of the representation. Their responses started with sharpedge responses that spread out over time. The even symmetric neurons, of bothvertical and horizontal selectivies, interestingly, started out with edge responses butexhibited uniform enhancement (or responses) inside the bright disks later on. Notethat here the on-center even-symmetric cells exhibited the enhancement, while theoff-center even-symmetric cells would exhibit the enhancement inside the figure inresponse to the black disk stimulus. Complex cells can be modeled as the sum ofthe absolute or squared values of four types of cells, and would likewise exhibitedenhancement within the figure.

Temporally, at the onset of the stimulus the maximal response was observed at thecontrast border of the disk, which rapidly declined and approached steady-stateover time. The latencies to achieve steady state response by the surface hyper-

Page 22: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 22

!"#$%"&'

()*

!"#$%"&'

()+,

!"#$%"&'

()-,

.#/'(0"$1/"&'(23&"#)4(&5'$6)7&08

,9:

;,9:,9:

;,9:

,9:

;,9:,9:

;,9:

,9+

;,9+,9+

;,9+

,9+

;,9+

,9+

;,9+

*

;*

,9<

;,9<

.#0='(0#,9+

;,9+

,9:

;,9:

.#0='(0# .#0='(0# .#0='(0#

Figure 10: Neuronal responses to the white disk at 1st, 20th and 50th iterationsare separated into the vertical oriented odd-phase (second column) and even-phase(third column) neurons’ responses and the horizontal odd-phase (fourth column)and even-phase (fifth column) neurons’ responses. The response has both positiveand negative components. It is understood that the positive components are car-ried by the neurons with the receptive field shown, while the negative componentsare carried by the neurons with complementary receptive fields, i.e. with 180 phaseshift, comparable to on and off-center cells in the retina. It shows that the odd sym-metric filters carry the lion share of response and strong edge responses spread overtime away from the initial border responses, while while the responses of the evensymmetric filters exhibit stronger responses inside the figure in the later part of theresponses. This positive even symmetric neurons exhibit an increase in response in-side this bright disk figure, and the negative even symmetric neurons will exhibitan incease in response inside the black figure. Thus, figure enhancement responsecan be observed in the even symmetric neurons, while the odd symmetric neuronsexhibit edge response spreading that is slightly stronger inside the figure as well.

Page 23: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 23

0 10 20 30 40 50 60 70 80 90 1000

10

20

30

40

50

60

Iterations

Ave

rage

d M

agni

tude

Res

pons

e

Temporal Dynamics of Gabor Wavelet Response to White Disk

Figure 11: The responses of the simulated neuronal population at each spatial loca-tion relative to the contrast border, sum of absolute values of wavelet coefficients ofwavelets of different orientations and phases at each location, show that as the dis-tance away from the border increases, the response onset increasses, and the magn-tidue of the responses decreases, as also observed in our single and multi-unit data.

columns monotonically increases with distance from the border to the center. Thesteady-state average magnitude response of the cross-sectional hypercolumns alsodecreases with increasing distance from the border (Fig. 11). These are very similarto the size-dependent onset delay we observed in our second experiment as well asin Huang and Paradiso’s experiment.

Therefore, this simulation result show that neural activities for representing uniformbrightness can exhibit the same border-distance or figure-size dependency in mag-nitude and onset delay in the observed neural responses. Therefore, the delayedresponses observed inside the luminance or color figure could in fact be the appro-priate responses for representing the stimuli, its brightness or color, rather than anadditional gain (attention) signal. So Huang and Paradiso’s (2008) responses mightbe related to brightness coding after all.

Page 24: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 24

That is not to say that the figure-enhancement effects or surround suppression effectsobserved in earlier studies can all be explained in terms of image representation. Ahallmark of the figure enhancement effect is very sharp drop of enhancement out-side the figure boundary (Lamme 1995, Lee et al. 1998, also see Figure 1). In thesimulation, this is roughly true for the even symmetric neurons, but not true forthe odd-symmetric neurons in the simulation, which experienced spreading of theboundary signals on both sides of the border. Obviously, image representation can-not be the sole objective of the rich recurrent ‘neural circuits in V1, which likely cansupport a variety of high-resolution computation such as pop-out saliency compu-tation, image segmentation, surface interpolation and contour completion as manyother studies have shown. To create such a sharp cliff in response enhancement,a center-surround mechanism (with a large inhibitory surround) operating on eachchannel, or other recurrent attention feedback conditioned on segmentation bound-ary, might be neeeded. We are only suggesting that image representation model canaccount for many (but not necessarily all) aspects of the neural observations, and iscertainly a possibility that we should entertain.

Isomorphic Image Representation and Horizontal Circuitry

This work began by investigating Lamme’s figural enhancement effect in chromaticfigures. We found the effect to be much stronger for uniformly colored figure thanfor texture figures. The most novel and dramatic finding is that contextual modu-lation, not the direct receptive field, is responsible for the lion share of a neuron’sactivities for such stimulus. We further showed that the this contextual modulationmediates multiplicative gain modulation on the tuning of the neurons. Hence theactivities might be a product of two processes: articulation of color tuning and asaliency/gain signal spreading from the contrast border. However, simulation of anetwork model motivated by image synthesis suggests these neural phenomena canpotentially be account for by a single process related to image representation. Thesimulation shows that the responses of wavelet basis required to represent the uni-formly colored stimuli exhibit a dependency on figure size and distance to border asthe observed responses of the neurons. Furthermore, the evolution of representedimage in the network resemble the filling-in phenomena observed by Paradiso andNakayama (1991) in their psychophysical experiments. These findings shift our per-spective on the nature of some of these contextual modulation neural phenomenaand consider that the contextual modulation response in this context might actuallybe the codes for color representation itself.

The implication of these findings is that image representation may be an importantfunctional objective of the horizontal recurrent circuits, and the hypothesis that theprimary visual cortex might furnish an isomorphic representation of input or per-ceived images remains a viable one. We have already shown in our discussion of

Page 25: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 25

results Cornelissen et al.’s (2006) fMRI findings are in fact very similar to the popu-lation activities at each location in our simulation of the image representation model(Figure 9). The edge-dependent spatial response profile is in fact consistent with therepresentation of a uniform brightness or colored figure. A more serious challengeon the isomorphic representation of images in the early visual cortex has been ad-vanced by Von der Heydt and colleagues’ study on color filling-in, which we willnow discuss.

Von der Heydt et al. (2005) found that neurons responsive to uniform color sur-face did not change their responses as would be predicted by color filling-in in theTroxler illusion. Puzzled, they proposed that subjective experience of solid color ina uniform surface is not represented isomorphically in the early visual cortex, andthat the V1 ‘surface neurons’ they studied are simply coding veridically the bottom-up color signals coming in from the retina. They argued that somewhere higher up,the brain can infer the solid color surface based on the edge responses, and thereis really no need to isomorphically representing our mind’s image in the early vi-sual areas. Their experiment was elegant and not having done that experiment, wecould only venture a rationalization on their negative results. One possibility is thatthey have focused their study on the neurons that respond to full screen solid color,or what they considered as surface neurons. These surface neurons do not exhibitfilling-in effect. Here, we showed by simulation that the Gabor wavelet edge sensi-tive neurons can mediate the color and brightness filling-in and represent uniformlycolored surfaces. It is possible that the neurons we studied are in fact more similar totheir ‘border’ neurons than their ‘surface’ neurons, as our neurons responded verystrongly to the chromatic border and only modestly to the background color sur-face (Fig 8). The data reported here suggested that these chromatic contrast ‘border’neurons in fact can encode surface color because of their sensitivity to the chromaticcontrast in the surround, far away from their receptive fields. Indeed, they foundtheir border neurons in fact exhibit sensitivity to the Troxler illusion: when filling-inoccurs, the neurons responnding to the chromatic contrast stop responding whenthe monkey stops to perceive those chromatic contrast. We believe an importantcontribution of this paper is to argue against the surface and boundary needs to berepresented by two distinct sets of neurons, and that uniform surface color can berepresented by border sensitive neurons.

From the perspective of isomorphic image representation model, as shown by resultsin Figures 9-11, signals for coding solid surface color will exhibit sensitivity to figuresize and distance to border in both response magnitude and onset delay. We proposethat it is important to start exploring the implication of image representation in neu-ral circuits both in theoretical and experimental terms. Image representation modelsare central to the efficient sparse coding account of the emergent properties of sim-ple cell receptive fields (Olshausen and Field 1996, Lewicki and Olshausen 1999).

Page 26: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 26

The predictive coding schemes in hierarchical visual cortex (Rao and Ballard 1999)also implicitly assume V1 can represent and is used to represent images. Lee andMumford (2003) further conjecture that V1 can act as a high-resolution buffer for vi-sual reasoning. They have argued that V1 neurons are not simply filters or waveletsfor representing or detecting edges and features in images, but in fact can support anisomorphic representation of images for high resolution visual reasoning. Because ofthe overcomplete nature of V1 simple cell representation (Daugman 1988, Lee 1996),image representation model require neurons coding similar information to engage incompetitive interactions to remove redundancy. Such a horizontal network has alsobeen extended with sparsness prior to learn receptive fields (Rozell et al. 2008). Theneurophysiological data reported here suggest that it is time to take image represen-tation model more seriously in reasoning about neuronal interaction and horizontalcircuits in V1. This circuit for image representation and synthesis is likely embeddedinside a rich set of horizontal circuits for other well known phenomena observed inV1: contour linking (Li et al. 2008, Gilbert and Sigman 2007), surface interpolation(Samonds et al. 2009), surround suppression, pop-out and saliency computation(e.g. Knierim and Van Essen 1991, Lee et al. 2002). Understanding how these mul-tiple functional horizontal and recurrent feedback circuits work together with thecircuit of image synthesis is a major challenge for future theoretical and experimen-tal research.

References

Albrechet DG, Geisler WS, Frazor RA, Crane AM (2002) Visual cortical neurons of monkeysand cats: temporal dynamics of the contrast response function. J. Neurophysiology 88: 888-913.

Angelucci, A. Bressloff, P (2006) Contribution of feedforward, lateral and feedback cnnec-tions to the classical receptive field center and extra-classical receptive field surround of pri-mate V1 neurons. Progress in Brain Research, Vol 154, Martinez-Conde, Macknik, Martinez,Alonso and Tse (Ed). 93-120.

Connor CE, Egeth HE, Yantis S (2004). Visual attention: Bottom-up vs. top-down. CurrentBiology, 14, 850-852.

Cornelissen FW, Wade, AR, Vladusich T, Dougherty RF, Wandell BA (2006) No functionalmagnetic resonance imaging evidence for brightness and color filling-in in early human vi-sual cortex. J. Neurosci 26: 3634-3641.

De Weerd P, Gattass R, Desimone R, Ungerleider LG (1995) Responses of cells in monkeyvisual cortex during perceptual filling-in of an artificial scotoma. Nature 377: 731-734.

Friedman HS, Zhou H, von der Heydt R (2003) The coding of uniform color figures in mon-key visual cortex. J Physiol (Lond) 548: 593-613.

Page 27: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 27

Gerrits HJM, Vendrik AJH (1970) Simultaneous contrast, filling-in process and informationprocessing in man’s visual system. Exp Brain Res 11: 411-430.

Gilbert CD, Sigman M (2007) Brain states: Top-down influences in sensory processing. Neu-ron. 54(5):677-696

Grossberg S, Mingolla E. (1985) Neural dynamics of form perception: boundary completion,illusory figures, and neon color spreading. Psychol Rev. 92(2):173-211.

Hanazawa A, Komatsu H, Murakami I. (2000). Neural selectivity for hue and saturation ofcolour in the primary visual cortex of the monkey. European Journal of Neuroscience, 12:1753-1763.

Huang X, Paradiso MA (2008) V1 response timing and surface filling-in. Journal of Neuro-physiology 100: 539-547.

Kinoshita M. Komatsu H. (2001) Neural representation of the luminance and brightness of auniform surface in the macaque visual cortex. J. Neurophysiology 86. 2559-2570.

Lamme VA. (1995) The neurophysiology of figure-ground segregation in primary visual cor-tex. J Neurosci. 15(2):1605-15.

Land E (1977) The retinex theory of color vision. Sci. Am. 237:108-28.

Lee TS, Mumford D, Romero R, Lamme VA. (1998) The role of the primary visual cortex inhigher level vision. Vision Res. 38(15-16):2429-54.

Lee TS, Yang CF, Romero RD, Mumford D. (2002) Neural activity in early visual cortex re-flects behavioral experience and higher-order perceptual saliency. Nat Neurosci. 5(6):589-97.

Lee TS, Mumford D (2003) Hierarchical Bayesian inference in the visual cortex. Journal ofOptical Society of America, A. . 20(7): 1434-1448.

Lennie P., Movshon J.A. (2005) Coding of color and form in the geniculostriate visual path-way. JOSA 22: 10, 2013-2033.

Li W, Piech V, Gilbert CD (2008) Learning to link visual contours. Neuron 57(3):442-451.

Luck SJ, Chelazzi L, Hillyard SA, Desimone R. (1997) Neural mechanisms of spatial selectiveattention in areas V1, V2, and V4 of macaque visual cortex. J Neurophysiol. 77(1):24-42.

McAdams CJ, Maunsell JH. (2000) Attention to both space and feature modulates neuronalresponses in macaque area V4. J Neurophysiol. 83(3):1751-5.

Marcus, DS, Van Essen DC (2002) Scene segmentation and attention in primate cortical areasV1 and V2. J. Neurophysiology, 88: 2648-2658.

Page 28: Xiaogang Yan , Ankit Khambhati , Lei Liu and Tai Sing Leetai/public/phy4.pdf · Xiaogang Yan †, Ankit Khambhati , Lei Liu† and Tai Sing Lee,∗ Center for the Neural Basis of

Yan, Khambhati, Liu and Lee 28

Maunsell JH, Gibson JR. (1992) Visual response latencies in striate cortex of the macaquemonkey. J Neurophysiol. 68(4):1332-44.

Palmer SE (1999) Vision Science : Photons to Phenomenology, MIT press.

Paradiso MA, Nakayama K (1991) Brightness perception and filling-in. Vision Research 31:1221-1236. tion and Performance, 9, 183-193.

Roe AW, Lu H, Hung CP (2005) Cortical processing of a brightness illusion. Proc Natl AcadSci USA 102:3869-3874.

Rossi AF, Desimone R, Ungerleider LG. (2001) Contextual modulation in primary visualcortex of macaques. J Neurosci. 21(5):1698-709.

Rossi AF, Rittenhouse CD, Paradiso, MA. (1996) The representation of brightness in primaryvisual cortex. Science 273, 1104-1107.

Samonds JM, Potetz B, Lee TS, (2009) Cooperative and competitive interactions facilitatestereo computations in macaque primary visual cortex J. Neuroscience 29(50):15780-15795.

Sasaki Y, Watanabe T. (2004) The primary visual cortex fills in color. PNAS, 101 (52): 18251-18256.

von der Heydt, Friedman HS, Zhou H (2003) Searching for the neural mechanism of colorfilling-in. In “Filling-in”: from perceptual completion to cortical reorganization (Pessoa L.and de Weerd P. ed.) 106-127, New York, Oxford University Press.

Wachtler T, Sejnowski TJ and Albright TD (2003) Representation of color stimuli in awakemacaque primary visual cortex. Neuron, 37, 681-691.

Zipser K, Lamme VAF, and Schiller P (1996) Contextual Modulation in Primary Visual Cor-tex. Journal of Neuroscience, 16(22): 7376-7389;