by Justin Gao
art by Dagim Estifanos
“No art passes our conscience in the way film does, and goes directly to our feelings, deep down into the dark rooms of our souls.”
- Ingmar Bergman
So much of watching a film occurs below the surface of our consciousness, where the smallest and faintest details do ‘God’s work’ in the deepest recesses of our mind. All elements of a film work synchronously to deceive our conscious minds, and as our disbelief is involuntarily suspended, our very biology becomes entangled with the hyperreal (where simulation becomes indistinguishable from reality, thus surpassing it) [1]. The manipulation of unconscious patterns of instinctive processing and error feedback paired with the physical reaction of the body to film causes one to transcend the diegesis, the world of film. The fictional world becomes actualized into a fully-formed consciousness in the brain to match that of the material world. The shapes, lines, colors, textures, depth, and movement inform the reconstruction within the mind to create a web of thoughts, ideas, emotions, and sensations. A distinct rendition of the film is constructed by the mind through the experience of film.
The magic of film at its inception in the late 19th century stemmed from the fascination and novelty behind the opportunity to witness reality compressed, transported, and displayed within the confines of a circus attraction (Edison’s earliest version of a film projector: the Kinetoscope). Indeed, the inextricable link between seeing and knowing stood at the foundation of what we knew as ‘knowledge’ and ‘reality’–a sentiment so irrevocably subverted by the advent of moving pictures. As films became more artistic and abstract, with filmmakers incorporating elements that strayed further and further from ‘reality’ (non-diegetic music, sound effects, VFX, cinematic lighting, camera angles, editing, etc.), what was expected to have been an escape from realness in turn resulted in viewing experiences that felt increasingly real and undeniably visceral. Thus, what began as a circus trick for the conscious mind quickly evolved into an embodied experience for the individual, a holistic re-representation of reality that bubbled up through the physiological to the subconscious, and that not only reflects but expands our being here on earth.
Therefore, this article seeks to answer the question: How does an art form removed from reality–in its digital trickery, illusions, absurd cutting, and unnatural lighting–produce an experience that we perceive to be so real? This article focuses on the visual processing of films and how certain methods of visual integration may elicit certain emotions.
Part I: Breaking Down the Visual World
“The simulacrum is never what hides the truth - it is truth that hides the fact that there is none. The simulacrum is true.”
- Ecclesiastes
From the infinite complexity that is the visual world, researchers have derived categories for types of visual information, largely based on the types of systems that are specialized to detect unique visual stimuli. These categories include: intensity, color, lines and shape, depth and spatial location, and complex objects (such as faces) [2].
There is a recognized ‘flow of information’ through a series of cells that form the foundations of visual processing. Visual processing begins at the retina, where photoreceptors at the back of the eye detect the intensity of light (rods) and the color of light (cones) [3]. The human retina contains around 126 million such photoreceptors, each corresponding to a specific point in the visual field [4], much like the pixels of a camera sensor. From there, the information proceeds from the photoreceptor cell to a layer of bipolar cells. Multiple photoreceptors may be connected to a single bipolar cell, whose main role is to slightly refine incoming information and pass it on to the third retinal layer: the ganglion cells [5]. Ganglion cells then carry the signal out of the eye and, by way of the lateral geniculate nucleus (LGN), to the primary visual cortex (V1) [3]. Along the way, the signal is sharpened by lateral inhibition, in which an active cell suppresses the activity of its neighbors. In the retina, this inhibition arises through interneurons, which exchange signals with multiple rods and cones and relay information to bipolar cells. By decreasing the activity of their neighboring neurons, they heighten the contrast between regions of different brightness, enhancing edge detection [6].
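The contrast-enhancing effect of lateral inhibition can be illustrated with a toy calculation. The sketch below is not a model of real retinal circuitry: it assumes a single row of cells, and the function name `lateral_inhibition` and the inhibition strength of 0.2 are hypothetical choices made only for illustration.

```python
def lateral_inhibition(signal, inhibition=0.2):
    """Return each cell's input minus a fraction of its two neighbours' input.

    Toy assumption: cells sit in a single row, and cells at the ends of the
    row treat the missing neighbour as having the same input as themselves.
    """
    last = len(signal) - 1
    output = []
    for i, centre in enumerate(signal):
        left = signal[i - 1] if i > 0 else centre
        right = signal[i + 1] if i < last else centre
        output.append(centre - inhibition * (left + right))
    return output


# A dim region (1.0) sitting next to a bright region (5.0).
brightness = [1.0] * 5 + [5.0] * 5
response = lateral_inhibition(brightness)

for raw, out in zip(brightness, response):
    print(f"input {raw:.1f} -> response {out:.2f}")
```

In this toy row, the cell just on the bright side of the border responds more strongly than the cells deeper inside the bright region, and the cell just on the dim side responds more weakly than its dim neighbours. The border is exaggerated while the uniform regions are flattened, which is the sense in which lateral inhibition favors edges.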
This system can perceive a huge amount of data. A standard Hollywood film is shot in 8k resolution (7680 x 4320 pixels), totaling just over 33 million pixels across the surface of the digital sensor. If we equate a photoreceptor to a pixel, this means the human retina detects around 4 times the raw visual information contained in a single frame of an industry-level film. While this is equal to over 60 times the information of standard HD resolution, it accounts for only about half of a frame of Nolan’s 70mm IMAX Oppenheimer, which has an estimated 18k resolution. Thus, following the activation of photoreceptors, the brain undergoes significant parallel processing, where the huge swaths of raw visual data can be refined down to manageable bits [7]. During this step, the overwhelmingly complex visual world diverges from realness towards a more comprehensible and estimated reconstruction.
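The comparison above is simple arithmetic and can be checked directly. The sketch below makes two assumptions that are not stated in the article: ‘standard HD’ is read as 1920 x 1080, and ‘18k’ is read as a frame 18,000 pixels wide at IMAX’s 1.43:1 aspect ratio. The photoreceptor count is the 126 million quoted earlier.

```python
PHOTORECEPTORS = 126_000_000  # figure quoted in the article


def pixel_count(width, height):
    """Total number of pixels in a frame of the given dimensions."""
    return width * height


hd = pixel_count(1920, 1080)       # assumption: "standard HD" means 1080p
uhd_8k = pixel_count(7680, 4320)   # dimensions given in the article
# Assumption: "18k" means 18,000 pixels wide at a 1.43:1 aspect ratio.
imax_18k = pixel_count(18_000, round(18_000 / 1.43))

ratios = {
    "HD": PHOTORECEPTORS / hd,
    "8k": PHOTORECEPTORS / uhd_8k,
    "18k IMAX": PHOTORECEPTORS / imax_18k,
}

for name, ratio in ratios.items():
    print(f"retina vs {name} frame: {ratio:.2f}x")
```

Under these assumptions the retina comes out at roughly 61 times an HD frame, a little under 4 times an 8k frame, and a little over half of an 18k IMAX frame. A different reading of ‘18k’ (for example a 16:9 frame) would change the last figure.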
This information refinement process begins with the prioritization of the detection of edges, lines and shapes, and a filtering-out of irrelevant information through lateral inhibition [8]. The amount of data communicated to the brain is reduced at the expense of small, irrelevant details. Line orientation is detected similarly, as simple cells in the primary visual cortex (V1) (single neurons that pool input from a row of neighboring points on the retina) respond to particular orientations of lines. Imagine several lateral geniculate nucleus (LGN) cells (which relay retinal signals to V1), each responding to a small circular spot of light, with their receptive fields aligned in a vertical row [9]. The simple cell in V1 that receives input from all of these LGN cells will fire strongly to a vertical bar of light that spans their combined receptive fields. This spatial arrangement of inputs means the simple cell will be most responsive when a stimulus (like a bar of light) is aligned along the preferred orientation of the cell’s receptive field (e.g., vertical or horizontal). Consequently, what may have seemed like a chaotic and vague set of individual data points is ultimately refined to the response of a single cell [9].
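The vertical-row arrangement described above can be sketched in a few lines. This is a deliberately crude illustration, not a physiological model: each pixel of a 5 x 5 image stands in for one LGN cell’s spot-shaped receptive field, the simple cell just sums the middle column, and the names `bar` and `simple_cell_response` and the firing threshold of 2.0 are hypothetical.

```python
SIZE = 5  # side length of the toy image, one LGN cell per pixel


def bar(orientation, size=SIZE):
    """Return a size x size image containing a single bar of light."""
    mid = size // 2
    image = [[0.0] * size for _ in range(size)]
    for k in range(size):
        if orientation == "vertical":
            image[k][mid] = 1.0
        elif orientation == "horizontal":
            image[mid][k] = 1.0
        else:  # any other value draws a diagonal bar
            image[k][k] = 1.0
    return image


def simple_cell_response(image, threshold=2.0):
    """Sum the LGN inputs along the middle column; fire only above threshold."""
    mid = len(image) // 2
    drive = sum(row[mid] for row in image)
    return max(0.0, drive - threshold)


for orientation in ("vertical", "horizontal", "diagonal"):
    print(orientation, simple_cell_response(bar(orientation)))
```

Only the vertical bar covers every LGN input at once, so only it pushes the cell over threshold. The horizontal and diagonal bars each cross the column at a single point and leave the cell silent. Twenty-five separate brightness values collapse into one number that answers one question: is there a vertical line here?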
The process of refinement is also apparent in higher-order processing, where it begins to form the patterns that constitute the subconscious reconstruction of a frame. Gestalt psychology categorizes these patterns into 6 Principles of Perceptual Organization: Similarity, Good Figure, Proximity, Continuity, Closure, and Common Fate [10]. While additional principles have been proposed, such as Invariance and Multistability, the basic concept still stands: when the brain receives overwhelming data from its lower-order visual processing centers, especially in cases like Figure 1, it quickly utilizes these principles to simplify the image, adding and subtracting elements for a ‘final rendition’ that is far from accurate. Consider ‘Continuity’ in Figure 1, for example, where a flock of birds is arranged in the shape of a curved line. The combined firing of LGN cells associated with these birds is at or close to the threshold for the simple cells in V1 to interpret it as a continuous line [11]. The simple cells responsible for identifying continuous lines may fire weakly, as though saying ‘good enough.’ Through these refinement patterns, complicated objects in our surroundings, as well as our surroundings themselves, become easily identifiable. This willingness of basic visual processing to refine information often results in unexpected (and sometimes helpful) misinterpretations [10].
Part II: The Element of Time
“Unlike all the other art forms, film is able to seize and render the passage of time, to stop it, almost to possess it in infinity. I’d say that film is the sculpting of time.”
- Andrei Tarkovsky
Film is a medium of time. Its uncanny realism, inescapable viscerality, and intrinsic, almost instinctual fluidity stem from the relationship between images across time (how they interact before and after) as well as the way a single image transforms over time. Hence, so much of our immersion in film depends on how film manipulates our ability to predict at the most fundamental level. We possess an innate understanding of physical laws like momentum and force, accurately estimate the effects of gravity, assume continuity and smoothness, and recognize rhythm. This basic intuition has provided us with the necessary means to navigate our physical world with high-accuracy prediction. Our inherent instincts form the a priori arsenal that ultimately constructs the scaffolds on which a holistic reconstruction of film is built.
One of the key components to refining incoming information for efficient processing is top-down processing. The linear path from the retina to V1 described in Part I is largely a bottom-up process, where information moves ‘up’ the levels of cognition [12]. Conversely, top-down processing uses pre-existing knowledge or expectations to affect the processing of ‘lower’ cognition levels [12]. For example, identifying objects placed within a semantically consistent scene, like a pot in a kitchen, occurs faster than identifying objects in semantically inconsistent scenes, like a mailbox in a kitchen. Expectations and prior knowledge are combined with contextual cues to affect the identification of the object [13].
Neurotransmitters play a key role in top-down feedback. Mainstream neuroscience has placed certain neurotransmitters, such as dopamine, at the forefront of behavioral analysis, associating them with high-order functions like motivation and addiction [14]. Yet most of what neurotransmitters do is regulate subconscious processing long before information reaches ‘conscious awareness.’ For example, the innate a priori predictions of movement can streamline visual processing through the release of glutamate and GABA, the main excitatory and inhibitory neurotransmitters respectively [15]. Feedback projections from regions like the prefrontal cortex release glutamate onto neurons in V1, enhancing their activity in line with top-down predictions. Similarly, GABA can inhibit certain neurons in V1, ensuring that the brain prioritizes information that aligns with prior expectations by reducing the processing of unexpected, irrelevant data [16]. When predictions are wrong and need to be adjusted, the brain may activate dopaminergic pathways to help modulate top-down feedback by altering the weight of priors (how strongly the brain believes a prediction) based on how accurate they have been in the past [17].
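The idea of a prior whose weight tracks its past accuracy can be made concrete with a toy update rule. The sketch below is an illustration of the bookkeeping only, not a model of dopamine signaling: the function `update_prior_weight`, the starting weight of 0.5 and the learning rate of 0.2 are all hypothetical.

```python
def update_prior_weight(weight, prediction_was_correct, learning_rate=0.2):
    """Nudge the weight of a prior towards 1 after a hit and towards 0 after a miss.

    Toy assumption: the weight is a running estimate of how often the
    prediction has been right, updated by a fixed fraction of the error.
    """
    target = 1.0 if prediction_was_correct else 0.0
    return weight + learning_rate * (target - weight)


weight = 0.5  # start out undecided about the prediction
history = []
for outcome in [True, True, False, True]:
    weight = update_prior_weight(weight, outcome)
    history.append(round(weight, 4))

print(history)
```

Two confirmed predictions raise the weight, a single violated prediction pulls it back down, and the next confirmation starts rebuilding it. A film that occasionally breaks an expectation would, on this picture, keep such weights from ever settling.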
These interactions can also be described with Bayes’ theorem, a mathematical rule for calculating probability whereby the probability of an event is adjusted in light of new information. From the Bayesian perspective, the brain arrives at the most probable interpretation of visual information by combining the new information (the conditional probability) with the information already known (the prior) [18]. Imagine a top-down shot of Leonardo DiCaprio (where we can only see the top of the head) moving 45 degrees to the right. Neurons tuned to 45 degrees will fire most strongly, neurons tuned to nearby directions (e.g. 40 or 50 degrees) will fire almost as strongly, and neurons tuned to 90 degrees will fire far less. The resulting pattern of firing rates across the population of neurons forms a distribution of activity that peaks at around 45 degrees, approximating the most likely direction of motion. This population-level code allows the brain to interpret ambiguous or noisy sensory data more accurately by combining the weak responses of multiple neurons rather than relying on a single neuron’s response. The approximation of the most likely direction of motion is then paired with top-down information on expected motion. The strong intuition mentioned above, which makes assumptions like “objects move downward due to gravity” or “objects move in smooth trajectories,” will bias the activity of neurons representing the expected direction, combining prior knowledge with incoming data.
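The population code and its combination with a prior can be sketched numerically. Everything specific here is an assumption made for illustration: the Gaussian tuning curves, their widths, the 5-degree spacing of preferred directions, and a prior expectation centred on 60 degrees. Treating the normalised firing pattern as a stand-in for the conditional probability is also a simplification, and directions are not wrapped around the circle.

```python
import math


def tuning(preferred, stimulus, width=20.0):
    """Gaussian tuning curve: response of a neuron that prefers `preferred` degrees."""
    return math.exp(-0.5 * ((stimulus - preferred) / width) ** 2)


def normalise(values):
    """Scale a list of non-negative numbers so that it sums to 1."""
    total = sum(values)
    return [v / total for v in values]


def peak_direction(distribution, directions):
    """Return the direction at which the distribution is largest."""
    best = max(range(len(distribution)), key=distribution.__getitem__)
    return directions[best]


directions = list(range(0, 181, 5))  # preferred directions: 0, 5, ..., 180

# Bottom-up evidence: population response to motion at 45 degrees.
rates = [tuning(d, 45.0) for d in directions]
likelihood = normalise(rates)

# Top-down expectation (hypothetical): motion is expected near 60 degrees.
prior = normalise([tuning(d, 60.0, width=30.0) for d in directions])

# Bayes' rule: posterior is proportional to likelihood times prior.
posterior = normalise([l * p for l, p in zip(likelihood, prior)])

print("evidence alone peaks at", peak_direction(likelihood, directions))
print("evidence plus prior peaks at", peak_direction(posterior, directions))
```

With these numbers the evidence alone peaks at 45 degrees, while the combined estimate is pulled to 50 degrees, part of the way towards the expected direction. How far it moves depends on the relative widths of the two curves: a narrower, more confident prior would pull harder.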
Often, filmmakers shoot and edit their films in a way that aligns with our expectations of motion continuity to encourage immersion. We see this conformity to our expectation of motion continuity in the ‘match cut,’ as illustrated in Figures 3 and 4. The motion of Leo throwing the green stone is matched across the two shots, making the 180° perspective switch almost unnoticeable. Other film editing techniques are also used to make this cut invisible. The first is the slight overlap of movement between the two shots, which is employed here to account for the slight delay during which the previous shot remains in the visual field for an extra 300-400 milliseconds [13]. If the editors had cut exactly halfway between the movements with no overlap, Leo’s throw would have seemed disjointed, as though he skipped forward in time.
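To get a feel for what a 300-400 millisecond delay means on the timeline, it helps to convert it into frames. The sketch below assumes a projection rate of 24 frames per second, which is not stated in the article, and the helper name `overlap_frames` is hypothetical. It only converts units and does not claim to describe how any particular cut was made.

```python
import math


def overlap_frames(persistence_ms, fps=24):
    """Smallest whole number of frames that spans `persistence_ms` milliseconds."""
    return math.ceil(persistence_ms * fps / 1000)


shortest = overlap_frames(300)
longest = overlap_frames(400)
print(f"{shortest} to {longest} frames at 24 fps")
```

At 24 frames per second the quoted delay spans 8 to 10 frames, so on this reading the repeated portion of the throw would last well under half a second of screen time.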
Part III: The Final Rendition
“Your focus determines your reality.”
- Qui-Gon Jinn
Star Wars: The Phantom Menace
So, through various prediction and estimation processes, our brain generates a very different version of the film on the screen before it reaches our conscious comprehension. It is at this point that the ‘realness’ of the so-called ‘final rendition’ comes into question. Here are some guesses as to what the rendition might look like if it were extracted from the mind and re-projected onto the movie screen. First, the rendition is unlikely to encompass the frame from edge to edge; it is likely that spots are missing, blurred, or black and white in areas where our attention is not focused. Second, in these areas of inattention, objects and lines may morph into simpler shapes and lines. Third, various simulated iterations of objects in focus may exist in front of and behind the real object as afterimages or motion predictions. The ‘realness’ of these mentally constructed objects or figures can be defined somewhat quantitatively by the level of visual processing they reach and the strength of firing at each level. At the very start of the process, the firing of the photoreceptors themselves is unchangeable and inevitable, though top-down feedback can reach horizontal cells (which begin ‘processing’ by passing information between bipolar cells) [19]. The higher up the processing chain, the more interconnected the visual information becomes, and the more likely it is for an indirectly connected cell to fire (when in fact no real sensory information is coming from the cones it is attached to) [20]. Thus, we can assume that this ‘final rendition’ will have faint outlines or simplified shapes lying in the path of an object's motion, possibly in multiple places at once. This in turn forces the brain to stay engaged with the film on a subconscious level as it attempts to confirm or reject predictions, adjust top-down feedback, and make future predictions.
On this fundamental level, film employs small imperfections, subversions, and unexpected events to force the inherently sporadic and predictive tendencies of the brain to remain engaged. Attention, the recruitment of a fixed number of mental resources, is thus adequately engaged at the level of the narrative, keeping it away from the fictitious nature of both the narrative itself and its 2D frame display. Concurrently, film employs predictive coding and perceptual organization to hide the unavoidably unrealistic aspects of film form: elements such as editing (especially across different spaces and times), unfamiliar settings, unnatural lighting, and sci-fi or otherwise unrealistic objects are hidden, or better, made real, by this manipulation of fundamental visual processing. Perhaps it is the very contradictions and chaotic illusions we are used to in the real world that the visual ‘tricks’ of film replicate, effectively reproducing the physicality and ‘realness’ of the real world within the confines of the cinema.
Conclusion
While film analysis often focuses on the narrative elements that play a conscious role in shaping the audience's expectations and reactions, “neurocinematics,” as R. L. Ceciu described it, may point towards fundamental patterns of visual stimuli as a way of consolidating the subconscious immersion of the film. By acknowledging the ability of films to accurately control subconscious processing patterns, filmmakers can be intentional about the visual elements of their films in ways that serve a deeper purpose than a conscious understanding of the plot. Experimental filmmakers can rely on neurocinematics to predict visceral reactions to certain techniques, rather than relying on intuition or trial and error alone. Consequently, with enough attention and research, film may continue to broaden its ability to provide alternate realities that expand our linear experience of life.
REFERENCES:
[1] Baudrillard, J. (1981). Simulacra and Simulations. Editions Galilee. Retrieved from https://0ducks.wordpress.com/wp-content/uploads/2014/12/simulacra-and-simulation-by-jean-baudrillard.pdf
[2] Livingstone, M. S., & Hubel, D. H. (1987). Psychophysical evidence for separate channels for the perception of form, color, movement, and depth. Journal of Neuroscience, 7(11), 3416–3468. https://doi.org/10.1523/JNEUROSCI.07-11-03416.1987
[3] Ingram, N. T., Sampath, A. P., & Fain, G. L. (2016). Why are rods more sensitive than cones? The Journal of Physiology, 594(19), 5415–5426. https://doi.org/10.1113/JP272556
[4] Molday, R. S., & Moritz, O. L. (2015). Photoreceptors at a glance. Journal of Cell Science, 128(22), 4039. https://doi.org/10.1242/jcs.175687
[5] Grossniklaus, H. E., Geisert, E. E., & Nickerson, J. M. (2015). Chapter Twenty-Two - Introduction to the Retina. In J. F. Hejtmancik & J. M. Nickerson (Eds.), Progress in Molecular Biology and Translational Science (Vol. 134, pp. 383–396). Academic Press. https://doi.org/10.1016/bs.pmbts.2015.06.001
[6] Kramer, R. H., & Davenport, C. M. (2015). Lateral Inhibition in the Vertebrate Retina: The Case of the Missing Neurotransmitter. PLOS Biology, 13(12), e1002322. https://doi.org/10.1371/journal.pbio.1002322
[7] Grimes, W. N., Songco-Aguas, A., & Rieke, F. (2018). Parallel Processing of Rod and Cone Signals: Retinal Function and Human Perception. Annual Review of Vision Science, 4, 123–141. https://doi.org/10.1146/annurev-vision-091517-034055
[8] Jernigan, M. E., & McLean, G. F. (1992). Lateral Inhibition and Image Processing. In Nonlinear Vision: Determination of Neural Receptive Fields, Function, and Networks. CRC Press.
[9] Hubel, D. H., & Wiesel, T. N. (1962). Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. The Journal of Physiology, 160(1), 106–154. https://doi.org/10.1113/jphysiol.1962.sp006837
[10] Pettersson, R. (2017). Gestalt principles. In Information Design. Routledge.
[11] Lv, C., Xu, Y., Zhang, X., Ma, S., Li, S., Xin, P., … Ma, H. (2017). Edge Detection Based on Primary Visual Pathway. In Y. Zhao, X. Kong, & D. Taubman (Eds.), Image and Graphics (pp. 430–439). Cham: Springer International Publishing. https://doi.org/10.1007/978-3-319-71589-6_37
[12] Ichinose, T., & Habib, S. (2022). On and off signaling pathways in the retina and the visual system. Frontiers. Retrieved from https://www.frontiersin.org/journals/ophthalmology/articles/10.3389/fopht.2022.989002/full
[13] Gafni, H., & Zeevi, Y. Y. (1977). A model for separation of spatial and temporal information in the visual system. Biological Cybernetics, 28(2), 73–82. https://doi.org/10.1007/BF00335287
[14] Costa, K. M., & Schoenbaum, G. (2022). Dopamine. Current Biology, 32(15), R817–R824. https://doi.org/10.1016/j.cub.2022.06.060
[15] Majumdar, S. (2023). Role of glutamate in the development of visual pathways. Frontiers. Retrieved from https://www.frontiersin.org/journals/ophthalmology/articles/10.3389/fopht.2023.1147769/full
[16] Pitchaimuthu, K., Wu, Q., Carter, O., Nguyen, B. N., Ahn, S., Egan, G. F., & McKendrick, A. M. (2017). Occipital GABA levels in older adults and their relationship to visual perceptual suppression. Scientific Reports, 7(1), 14231. https://doi.org/10.1038/s41598-017-14577-5
[17] Valdés-Baizabal, C. (2020). Dopamine modulates subcortical responses to surprising sounds. Retrieved from https://journals.plos.org/plosbiology/article?id=10.1371%2Fjournal.pbio.3000744
[18] Pazhoohi, F., & Kingstone, A. (2021). The Effect of Movie Frame Rate on Viewer Preference: An Eye Tracking Study. Augmented Human Research, 6(1), 2. https://doi.org/10.1007/s41133-020-00040-0
[19] Pazhoohi, F., & Kingstone, A. (2021). The Effect of Movie Frame Rate on Viewer Preference: An Eye Tracking Study. Augmented Human Research, 6(1), 2. https://doi.org/10.1007/s41133-020-00040-0
[20] Chen, Y. (2024). Decoding dynamic visual scenes across the brain hierarchy. PLOS Computational Biology. Retrieved from https://journals.plos.org/ploscompbiol/article?id=10.1371%2Fjournal.pcbi.1012297