Creating Sounds for Things We Cannot See: Kenny Young on VR, Music, and Guiding Attention

Kenny Young

Many forms of media depend upon controlling attention. Films decide where audiences look through editing, framing, and camera movement. Theatre guides attention through staging and movement. Conventional games frequently do something similar through interface design, visual effects, or camera behaviour. Important information rarely appears entirely by accident. Designers often decide where attention should go long before audiences realise those decisions are being made. Most of this guidance becomes invisible precisely when it works well. Players rarely stop to think about how often games quietly redirect their attention from one place towards another. Experiences simply feel natural. Objectives appear at appropriate moments, important events seem difficult to miss, and information arrives when required.

Kenny Young’s guest lecture explored what happens once some of these assumptions begin to disappear. Virtual reality introduces a simple change with far-reaching consequences: players control the camera continuously. Looking left means physically turning left. Looking upwards requires physically raising the head. Looking away from something important may simply mean missing it altogether. This sounds like a minor alteration at first, though its consequences spread surprisingly far once control over attention shifts away from designers.

Imagine something important happening behind you in a conventional game. Designers possess numerous methods for ensuring that players notice it. Cameras may shift automatically, indicators can appear around the screen, and control may even be briefly interrupted. Decades of game design have produced increasingly sophisticated methods for solving these problems. Virtual reality complicates many of these solutions. Fixed interface elements become intrusive, large overlays can weaken immersion, and information outside the player’s field of view can remain entirely unnoticed. The question then becomes how players discover important information once designers can no longer simply place it directly in front of them.

Young suggested that sound changes role at precisely this point. Human vision behaves selectively. We actively choose where to direct our eyes and ignore much surrounding information. Hearing functions rather differently. Sounds continue arriving whether or not we intentionally seek them out. A player may choose not to look towards something important, though hearing something nearby can still trigger an immediate response. Sound therefore begins moving away from a supporting role attached to visible events and towards something more active.

Players do not simply hear sounds in games. They gradually learn them. Initially a sound may exist only as another event occurring within a larger environment. Yet repeated exposure slowly changes its role. Through familiarity, sounds begin accumulating meaning. This process often happens without players consciously noticing it. A sound that initially appears neutral gradually becomes linked with expectations, actions, and outcomes. Eventually hearing the sound no longer involves interpreting something unfamiliar. Players instead recognise patterns they have already learned.

Young discussed the familiar alert sound from Metal Gear Solid as an example. During early encounters players hear a brief cue alongside visual information, though repeated exposure gradually changes the relationship. Eventually players stop hearing the sound as a sound effect at all. Instead, it begins behaving almost like language.

Language may not be entirely the right word, though the comparison becomes useful. Words themselves do not naturally contain meaning. People gradually learn relationships between sounds and ideas through repeated experience until recognition becomes almost immediate. Something similar begins happening within games. A short musical cue or brief sound effect acquires meaning through use rather than explanation. Players are not consciously translating sounds each time they hear them. Recognition simply becomes increasingly automatic.

Nobody pauses a game to explain that a particular sound means danger. The association forms through play, and listening changes in subtle ways once it has. Players stop consciously analysing what they hear, and attention begins shifting before deliberate thought catches up. Sound therefore becomes something more than feedback occurring after an event. It starts creating expectations about what might happen next.

Music introduces another layer to these learned relationships. Discussions around game music frequently focus on emotion, atmosphere, and immersion. Players may notice tension increasing during combat, emotional themes returning around familiar characters, or changing musical textures supporting movement through a world. Young explored another possibility entirely. Under certain conditions, music may also begin operating as information.

Much of his work on Tethered explored whether these kinds of relationships could be developed within virtual reality environments. Strategy games already involve unusually large amounts of simultaneous information. Resources require management, environments continue changing, threats emerge unexpectedly, and events occur across multiple locations at once. Conventional interfaces frequently solve these problems visually. Players monitor maps, indicators, menus, and notifications distributed around the screen. Translating these expectations into VR introduced a more difficult question. How can players remain aware of a world once they can comfortably see only part of it at any given moment?

In Tethered, rather than functioning purely as atmosphere or emotional support, musical phrases could gradually become learned signals recognised through repeated interaction. Certain sounds became associated with changing conditions, important events, or emerging situations. Initially these sounds carried little meaning beyond existing as recognisable musical gestures. Over time something rather different happened. Players were not simply listening to music accompanying a world. They were gradually learning the world itself.

Listening consequently develops an unusual relationship with navigation. Physical landmarks help people orient themselves within real environments, and players may construct sonic landmarks in much the same way as sounds become associated with places, behaviours, or changing conditions. In Tethered, particular musical phrases began functioning almost like landmarks within the environment, while certain combinations of sounds became associated with emerging threats or opportunities requiring attention. Over time players could respond before consciously thinking about what had changed. Listening therefore became intertwined with understanding how the world itself behaves.

Examples such as these shift the discussion slightly. Rather than asking whether music sounds appropriate or emotionally effective, we might ask a different question. How do people learn sonic environments? Under what circumstances do sounds stop behaving like sounds and begin behaving more like information? Underlying processes of this kind may already exist across many forms of game audio, even if virtual reality makes them easier to recognise.

Extending these ideas into working systems introduced additional challenges. Sounds needed to remain distinctive while fitting comfortably alongside the underlying music. Delays had to remain short enough that players still connected events with their causes, and multiple simultaneous events risked creating confusion or dissonance. Initial solutions often resolved one issue only to reveal another elsewhere. Technical constraints, musical decisions, and player behaviour continually interacted throughout development. Creative work therefore emerged less as a process of executing perfect ideas and more as a continual process of adjustment.
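One way to picture the tension between musical fit, short delays, and simultaneous events is a small cue scheduler. The sketch below is purely illustrative and is not a description of how Tethered was built. It assumes that cues are aligned to the next beat of the underlying music, that a cue plays immediately if waiting for that beat would take too long, and that only the highest-priority cues are kept when several land on the same moment. The names `Cue` and `schedule_cues`, and the parameters `beat_length`, `max_delay`, and `max_per_slot`, are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    name: str          # hypothetical label for a musical signal
    event_time: float  # when the game event happened, in seconds
    priority: int      # higher means more important to the player


def schedule_cues(cues, beat_length=0.5, max_delay=0.3, max_per_slot=1):
    """Return (play_time, cue) pairs, sorted by play time.

    Assumed policy, not taken from the lecture:
    - align each cue to the next beat so it sits with the music;
    - if that wait exceeds max_delay, play at the event time instead,
      so the player still connects the sound with its cause;
    - if several cues share a play time, keep only the top
      max_per_slot by priority to limit clutter.
    """
    slots = {}
    for cue in cues:
        next_beat = math.ceil(cue.event_time / beat_length) * beat_length
        wait = next_beat - cue.event_time
        play_time = next_beat if wait <= max_delay else cue.event_time
        # Round the key so tiny floating-point differences share a slot.
        slots.setdefault(round(play_time, 6), []).append(cue)

    scheduled = []
    for play_time in sorted(slots):
        ranked = sorted(slots[play_time], key=lambda c: -c.priority)
        scheduled.extend((play_time, c) for c in ranked[:max_per_slot])
    return scheduled


if __name__ == "__main__":
    cues = [
        Cue("resource_ready", 1.3, 1),
        Cue("threat_nearby", 1.4, 5),
        Cue("storm_incoming", 1.05, 3),
    ]
    for play_time, cue in schedule_cues(cues):
        print(f"{play_time:.2f}s  {cue.name}")
```

With these example values, the cue at 1.05 seconds plays immediately because the next beat is too far away, while the two later cues both target the beat at 1.5 seconds and only the higher-priority one survives. Dropping the lower-priority cue is just one possible policy; deferring it to a later beat would trade clarity for completeness, which is the kind of adjustment the paragraph above describes.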

Running throughout the lecture was a broader observation concerning the role of sound itself. Discussions surrounding game audio frequently emphasise realism, emotion, and atmosphere. These remain important concerns, though Young’s work suggested something slightly different. Once familiar methods for directing attention become less reliable, sound begins taking on responsibilities traditionally associated with cameras and interfaces.

Virtual reality may therefore reveal something that has existed quietly within games for much longer. Sound has rarely functioned only as decoration or atmosphere. It has also shaped where players look, what they notice, and how they organise experiences around them.

Perhaps the more interesting question is not whether sound helps players understand virtual worlds. It may instead be how much of our experience has always depended upon sound guiding us in ways we barely notice. Once designers lose many familiar methods for directing attention, sound begins moving from the background towards the centre of interaction itself.