Author: iainmcgregor

  • How Do You Design the Sound of a Blockbuster Game? Michael Caisley on Creativity, Recording, and Crafting the Sound of Call of Duty

    Michael Caisley

    How do you design the sound of a blockbuster game?

    Modern video games are built from extraordinarily complex systems. Artificial intelligence, physics, animation, graphics and networking all operate simultaneously to create worlds that respond continuously to the player’s decisions. Sound design must function within that same complexity. Unlike film, where every frame is predetermined, game audio unfolds differently every time someone plays. Thousands of individual sounds interact dynamically, responding to changing environments, player behaviour and gameplay events without losing clarity or dramatic impact. During his online guest lecture for Edinburgh Napier University, Michael Caisley drew upon his experience as Senior Sound Designer on Call of Duty: Advanced Warfare to explore how one of the industry’s largest productions approached this challenge. Throughout the session, one principle emerged repeatedly. Great game audio is designed as a complete system rather than a collection of individual sound effects.

    This philosophy shaped every stage of the project’s development. Rather than asking how individual weapons, footsteps or explosions should sound, the audio team began with a broader question. How should the player experience the world? Every recording, editing decision and implementation technique ultimately served that objective. Sound design therefore became an exercise in shaping perception rather than simply producing assets. Individual recordings remained important, though their true value emerged only through the relationships they formed with every other element of the soundtrack. The player never experiences sounds in isolation. They experience an acoustic world.

    Caisley explained that this perspective influenced one of the team’s earliest decisions. Although Call of Duty already possessed an established sonic identity developed across multiple successful titles, the audio team resisted the temptation simply to inherit those conventions. Instead, they treated Advanced Warfare as an opportunity to rethink the game’s entire sound philosophy from first principles. Existing assets, familiar production techniques and long-standing implementation methods were all reconsidered. Their ambition was not to reject the past, but to ensure that every creative decision continued to serve the experience they wanted players to have. Innovation therefore emerged through careful questioning rather than change for its own sake.

    That philosophy also transformed the relationship between sound design and implementation. In many production pipelines, sound designers create assets that are later integrated into the game by other specialists. Caisley described a markedly different approach. Sound designers remained responsible for implementation inside the game itself, allowing them to shape how recordings behaved once they became part of the interactive experience. The timing of a sound, the circumstances under which it played, the way it interacted with other events and its contribution to the overall mix all became part of the design process. Creating an excellent recording represented only the beginning. The player’s experience ultimately depended upon how successfully that recording functioned within the wider system. Implementation was therefore not separate from sound design. It was an essential part of it.

    The same systems-oriented thinking naturally extended to recording. Rather than relying primarily upon commercial sound libraries, the team invested heavily in producing original recordings specifically for the game. Specialist libraries remained valuable resources, particularly carefully curated collections produced by experienced field recordists, though Caisley consistently argued that original recording provides opportunities to discover sounds that nobody else possesses. More importantly, recording becomes a creative process rather than simply a method of gathering raw material. Unexpected textures, unusual perspectives and subtle acoustic details often emerge only when designers capture sounds for themselves. Distinctive game audio begins long before editing or implementation. It begins with listening carefully to the world.

    One particularly revealing example involved footsteps. Traditional Foley often records isolated footsteps on carefully prepared surfaces inside controlled studio environments. Caisley questioned whether this approach remained appropriate for a first-person game in which movement is experienced continuously through the player rather than observed from an external viewpoint. Instead, the team carried lightweight portable recorders into forests, hillsides and outdoor locations, capturing complete performances that naturally progressed from walking to running and sprinting. Rather than constructing movement artificially from disconnected recordings, they captured the changing rhythm, effort and momentum that emerge naturally when people move through real environments. The resulting recordings felt noticeably more convincing, illustrating that authenticity sometimes depends less upon technical precision than upon preserving the natural behaviour of the performer.

    The recording equipment itself reflected the same practical philosophy. Caisley encouraged students not to become preoccupied with expensive technology at the expense of creative opportunity. Much of the team’s field recording relied upon compact portable recorders that could be deployed quickly whenever an interesting sound presented itself. Mounted directly onto lightweight boom poles, these systems reduced handling noise while allowing recording sessions to remain flexible and spontaneous. The lesson extended far beyond the specific equipment being used. Interesting sounds rarely arrive when it is convenient to record them. Designers therefore benefit from tools that allow them to respond immediately rather than waiting for ideal conditions or elaborate recording setups. Creativity, he suggested, often rewards preparedness more than perfection.

    The same willingness to question established practice shaped the recording of weapons. Rather than organising one large recording session intended to capture every firearm in a single location, the team divided the work across numerous smaller sessions. This approach simplified logistics, though its greatest benefit proved creative rather than organisational. Each session could be reviewed afterwards, allowing the team to identify opportunities for improvement before returning to record additional material. Different environments also introduced naturally varying acoustic characteristics, providing a richer collection of perspectives than a single location could have offered. Recording therefore became an iterative process in which every session informed the next. The objective was not simply to accumulate material, but to refine the sonic identity of the game through continual experimentation.

    Perhaps the most important lesson from this stage of the lecture concerned the relationship between individual sounds and the finished player experience. Caisley observed that players rarely remember isolated recordings. They remember moments. The impact of those moments depends upon countless design decisions working together, from recording and editing through implementation, mixing and gameplay design. The audio team’s objective was therefore never to create the loudest explosion or the most detailed weapon recording. It was to build a soundtrack in which every element supported the player’s understanding of the world. Call of Duty: Advanced Warfare consequently adopted a more dynamic approach to mixing, allowing important sounds to occupy the foreground while leaving space for the rest of the soundtrack to breathe. Restraint became every bit as valuable as spectacle. The most memorable moments did not emerge from individual sound effects alone. They emerged from a coherent acoustic world in which every element strengthened the player’s belief that the environment around them was responsive, believable and alive.
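
    The idea of letting important sounds take the foreground while leaving room for everything else can be sketched as a simple priority-based ducking rule. This is only an illustration, not the mixing system used on the game: the `Voice` type, the priority scale and the `duck_per_step` and `floor` parameters are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Voice:
    name: str
    priority: int      # hypothetical scale: higher means more important
    base_gain: float   # linear gain, 0.0 to 1.0, before any ducking


def duck_mix(voices, duck_per_step=0.25, floor=0.2):
    """Return {name: gain}, attenuating voices below the top active priority.

    Each priority step below the most important active voice removes
    `duck_per_step` of a voice's gain, but the scale never drops below
    `floor`, so quieter layers stay audible rather than disappearing.
    """
    if not voices:
        return {}
    top = max(v.priority for v in voices)
    gains = {}
    for v in voices:
        steps_below = top - v.priority
        scale = max(floor, 1.0 - duck_per_step * steps_below)
        gains[v.name] = round(v.base_gain * scale, 3)
    return gains


# Assumed example scene: dialogue over gunfire over ambience.
busy = duck_mix([
    Voice("dialogue", 3, 0.8),
    Voice("gunfire", 2, 1.0),
    Voice("ambience", 0, 0.6),
])

# When the important sounds stop, the ambience returns to its full level.
quiet = duck_mix([Voice("ambience", 0, 0.6)])
```

    In this sketch the ambience is pulled down only while something more important is playing, which is one way of modelling restraint: the mix makes space for the foreground and then gives it back.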

    Having established the technical foundations of the project, Caisley turned towards the creative decisions that ultimately give a game its identity. Recording and implementation provide the raw materials, though they do not determine how a player experiences a moment. That depends upon judgement. Throughout the remainder of the session, he returned repeatedly to an idea that sounds deceptively simple but lies at the heart of professional sound design. Every sound reflects a design decision. The role of the sound designer is not merely to create convincing audio, but to decide what deserves to be heard, when it should be heard and, just as importantly, what should remain absent.

    This philosophy shaped the way Caisley approached almost every design problem. Instead of searching immediately for the perfect recording, he preferred to build what he described as palettes of possibilities. Families of related sounds sharing particular textures, movements and tonal characteristics were assembled through recording, processing and experimentation. Organic recordings of motors, impacts, machinery and environmental sounds were manipulated repeatedly, gradually forming a collection of materials from which the final design could emerge. Creativity therefore developed through exploration instead of beginning with a predetermined solution. Designers rarely know exactly what they are searching for at the start of a project. They discover it by experimenting until unexpected relationships begin to reveal themselves.

    His workflow reflected the same exploratory mindset. Projects often began in apparent disorder, with sounds accumulating rapidly as multiple ideas were investigated simultaneously. Immediate organisation was deliberately given lower priority than experimentation. Once a broad range of possibilities had been created, the process shifted towards careful refinement. Caisley compared this approach to sculpting. A sculptor begins with a block of material and gradually removes everything that does not belong until the final form becomes visible. Sound design, he suggested, often develops in exactly the same way. Instead of continually asking what should be added, designers should also ask what can be removed.

    This idea challenges one of the most common assumptions made by new sound designers. Richer sound does not necessarily result from adding more layers. As recordings accumulate, frequency masking increases, textures become crowded and important details begin to disappear. Caisley described repeatedly muting, removing and simplifying elements until only those making a genuine contribution remained. Equalisation, dynamics processing, timing adjustments and careful layering all supported this process, though none represented the objective in itself. Their purpose was to improve clarity, strengthen communication and ensure that every remaining sound justified its place within the mix. Professional sound design therefore depends less upon the quantity of material than upon the quality of the decisions shaping it.

    A particularly memorable example came from a sequence in which the player escapes across a glass roof before an ally destroys the structure beneath pursuing enemies. The obvious solution might appear to involve recording increasingly dramatic glass impacts before combining them into one spectacular crash. Caisley approached the problem very differently. The event was divided into a sequence of distinct dramatic stages. Initial bullet impacts, subtle structural weakening, growing instability and the final collapse each received their own carefully judged sonic treatment. Texture, pacing and silence changed gradually as the scene unfolded, allowing players to follow the progression of the collapse as a connected series of events rather than experiencing a single overwhelming burst of noise. The sequence derived its dramatic impact from the way the sound evolved over time, allowing the narrative of the scene to unfold naturally through listening as well as through the visuals.

    The same attention to dramatic pacing shaped Caisley’s approach to synchronisation. Students often assume that every visible action should be matched precisely by an accompanying sound. Professional practice, he suggested, is considerably more nuanced. Delaying one sound slightly, allowing another to emerge first or simplifying an otherwise crowded moment can produce a stronger dramatic effect than strict synchronisation alone. Rhythm, pacing, expectation and contrast all become compositional tools that guide the player’s attention. Instead of following every visual event mechanically, sound design helps determine what players notice, what they anticipate and how they interpret the unfolding action. Games therefore rely upon many of the same principles of dramatic storytelling found in music and cinema, while remaining responsive to player interaction.

    Equally revealing was Caisley’s discussion of realism. Throughout the lecture, he challenged the assumption that authentic sound must originate from authentic sources. Recording larger explosions does not necessarily produce better explosions, nor does striking more metal automatically create more convincing mechanical impacts. Professional sound designers routinely combine recordings whose original sources bear little resemblance to the finished result. Environmental ambiences, machinery, organic textures and countless unexpected recordings may all contribute qualities that literal recording alone cannot provide. What ultimately matters is not the origin of the sound, but whether it supports the player’s perception of the world. Believability depends upon the finished experience rather than literal accuracy.

    Technical processing formed part of this broader creative process rather than existing as an end in itself. Equalisation, compression, distortion and other processing tools undoubtedly shape the final soundtrack, though Caisley resisted presenting them as universal recipes. Every adjustment served a specific purpose within the wider composition. Heavy compression might transform an otherwise unremarkable recording into the perfect supporting layer. Subtle timing adjustments could reveal details previously hidden within the mix. Equalisation often preserved recordings that might otherwise have been discarded. Considered individually, many processed sounds appeared incomplete or even unattractive. Their value emerged only through their relationship with every other element. As throughout the lecture, the emphasis remained firmly upon systems rather than isolated sounds.

    Towards the end of the session, Caisley reflected upon the qualities that distinguish successful sound designers from merely competent technicians. Technical expertise undoubtedly matters, though he argued that curiosity, collaboration and the willingness to accept constructive criticism exert a far greater influence over long-term professional development. Working alongside experienced colleagues continually challenges assumptions and exposes designers to alternative ways of thinking. Equally valuable is the habit of listening analytically to other people’s work. Rather than deciding whether an entire game succeeds or fails, Caisley encouraged students to identify individual moments that demonstrate particularly thoughtful creative decisions. Examining one successful interaction in depth often teaches far more than making broad judgements about an entire soundtrack. Developing as a sound designer therefore depends as much upon careful listening as upon creating new sounds.

    Taken together, Caisley’s presentation revealed that blockbuster game audio is built as much through judgement as through technology. Recording, editing, implementation and mixing undoubtedly provide the necessary tools, though those tools acquire meaning only through the decisions that shape them. Every sound exists in relation to every other sound, every moment contributes to a larger dramatic experience and every creative choice influences how players understand the world around them. Sound design is not the art of creating more sound, but of making better decisions. Technology provides the tools. Careful listening, thoughtful judgement and an understanding of human perception transform those tools into interactive experiences that players instinctively accept as real.

  • How Do Robots Communicate Through Sound? Connor Moore on Audio UX, Robotics, and Designing Meaningful Interactions

    Connor Moore

    How do robots communicate through sound?

    People increasingly interact with technology through sound. Smartphones acknowledge completed payments, electric vehicles alert drivers to potential hazards, wearable devices provide subtle notifications and intelligent products communicate through a growing vocabulary of tones, chimes and alerts. Yet these sounds rarely receive the same attention as visual design. During his online guest lecture for Edinburgh Napier University, Connor Moore explored the growing discipline of Audio User Experience (Audio UX), demonstrating how carefully designed sounds help products communicate naturally, build trust and express personality. Drawing upon projects for companies including Google, Tesla and Postmates, he argued that successful product sound design extends far beyond creating attractive audio. It begins with understanding the people for whom those sounds are designed. Throughout the session, one principle emerged repeatedly. Every sound should communicate with purpose.

    Moore introduced his work by describing the remarkable breadth of modern Audio UX. Working from California’s Bay Area, he collaborates with companies developing products across robotics, automotive technology, connected devices, consumer electronics and digital services. Although these industries appear very different, they all share a common challenge. Products increasingly communicate with people through sound, requiring designers to think carefully about what those sounds communicate and how they contribute to the wider identity of a brand. Rather than approaching each project as an isolated collection of sound effects, Moore described building coherent sonic systems that extend across products, marketing, physical environments and user interactions. Individual sounds matter, though they become most effective when they form part of a larger and recognisable design language.

    This broader perspective also explains why strategy sits at the beginning of every project rather than at the end. Before designing a single sound, his team seeks to understand the objectives of the product, the identity of the organisation and the experience that users should ultimately have. Brand workshops, creative discussions and detailed reviews of existing sounds all contribute towards this early stage of development. Competitor analysis also plays an important role. Understanding how other companies sound allows designers to identify opportunities for meaningful differentiation rather than unintentionally reproducing familiar ideas. The objective is not simply to sound different. It is to create a sonic identity that genuinely reflects the values and personality of the organisation. Sound therefore becomes a strategic design material rather than a decorative addition introduced once the product has already been completed.

    One of the most thought-provoking ideas introduced during the presentation concerned what Moore described as connected audio ecosystems. Many organisations continue to commission isolated sounds for individual products or services, yet users increasingly encounter the same company across multiple devices and environments. A person may hear a notification on a smartphone, interact with a smart speaker at home, use an in-car navigation system and later encounter advertising or public installations produced by the same organisation. Rather than allowing each experience to develop independently, Moore argued that they should all share recognisable sonic characteristics. Consistent instrumentation, similar timbral qualities and carefully related musical ideas allow users to recognise a brand without needing to see a logo or screen. Sound therefore becomes another component of brand identity, working alongside visual design to create familiarity and trust.

    Google’s product ecosystem provided one of the clearest illustrations of this philosophy. Moore described how his work began during the development of Google Glass, a product that sought to make an unfamiliar technology feel approachable. Rather than emphasising futuristic electronic sounds, the design drew upon simple acoustic instruments such as piano, chimes and mallet percussion. These familiar timbres helped ground an otherwise unfamiliar experience, making the product feel more human and intuitive. As Google’s product portfolio expanded through products such as Pixel phones, Google Pay and automotive systems, this underlying sonic character evolved while remaining recognisably connected. Different products naturally demanded different technical solutions and frequency ranges, though the overall identity remained remarkably consistent. Moore argued that brands evolve in much the same way as people do. Their sonic identities should therefore develop over time while retaining a recognisable sense of continuity.

    Perhaps the most unexpected principle discussed during the session concerned silence. Designers often assume that every interaction requires another notification, another confirmation or another layer of feedback. Moore challenged this instinct directly. As products increasingly incorporate sound into everyday life, designers also acquire a responsibility not to make the world unnecessarily louder. Like negative space in graphic design, silence performs an important communicative function. It creates contrast, draws attention to genuinely important events and prevents users from becoming overwhelmed by constant auditory stimulation. Successful Audio UX therefore depends not only upon knowing which sounds should exist, but equally upon recognising which moments deserve silence instead.

    He illustrated this philosophy through the development of Sense, a sleep monitoring device designed to help users understand the quality of their sleep. Conventional alarm clocks often rely upon abrupt, attention-grabbing sounds that force people awake almost instantly. Moore saw an opportunity to rethink that experience entirely. Instead of beginning loudly, the alarms gradually evolved over time, introducing increasing musical complexity, richer timbres and subtle changes in tempo. Lighter sleepers could wake during the earliest stages, while heavier sleepers would gradually encounter a more energetic composition. Even error tones and voice interactions were designed using soft, restrained timbres that preserved the calm atmosphere of the bedroom rather than disrupting it. The project demonstrated that product sounds need not simply communicate efficiently. They can also influence the emotional quality of everyday experiences.

    Moore then introduced one of his central design philosophies: communicative and expressive design. Throughout the presentation, he repeatedly distinguished between creating sounds that merely reinforce a brand and creating sounds that genuinely help people understand what is happening. Branding undoubtedly matters, though communication always takes priority. Every sound should first convey meaning. Only then should it contribute towards a wider sonic identity. This perspective encourages designers to think carefully about urgency, expectation and human perception rather than treating every notification as another opportunity for creative expression. Product sounds exist to guide behaviour as much as they exist to establish identity.

    Tesla provided a particularly revealing case study. Moore described developing different categories of sounds according to the urgency of the information they needed to convey. Low-priority events, such as incoming calls, were designed to emerge gradually using softer timbres and lower levels of perceptual urgency. Medium-priority notifications, including seatbelt reminders, employed greater repetition and brighter timbres, encouraging users to respond without causing unnecessary stress. High-priority warnings, including forward collision alerts, demanded a very different approach. Higher frequency content, more percussive attacks and rapid repetition ensured that these sounds immediately captured attention during situations where rapid action could prevent an accident. Rather than relying upon arbitrary aesthetic decisions, Moore demonstrated how pitch, repetition, harmonic content and timbre can all be manipulated systematically to communicate different levels of urgency. Sound becomes a carefully designed language through which products communicate urgency, intention and behaviour.
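
    The three urgency tiers described above can be sketched as a small lookup from event to sound parameters. Every number here is an illustrative assumption rather than a value from any shipped vehicle, and the `CueProfile` fields, event names and helper functions are hypothetical. Only the direction of each change (higher pitch, sharper attack and faster repetition as urgency rises) follows the lecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CueProfile:
    base_pitch_hz: float    # higher pitch reads as more urgent
    attack_ms: float        # shorter attack sounds more percussive
    repeats_per_sec: float  # faster repetition raises perceived urgency
    brightness: float       # 0.0 (soft, dark) to 1.0 (bright) timbre


# Illustrative values only (assumptions, not measured from any product).
PROFILES = {
    "low": CueProfile(440.0, 80.0, 0.5, 0.2),
    "medium": CueProfile(660.0, 30.0, 1.5, 0.5),
    "high": CueProfile(1320.0, 5.0, 4.0, 0.9),
}

# The events named in the text, assigned to the tiers it describes.
EVENT_PRIORITY = {
    "incoming_call": "low",
    "seatbelt_reminder": "medium",
    "forward_collision": "high",
}


def profile_for(event):
    """Look up the cue parameters for a known event name."""
    return PROFILES[EVENT_PRIORITY[event]]


def onset_times(profile, duration_s):
    """Start time in seconds of each repetition that begins within duration_s."""
    interval = 1.0 / profile.repeats_per_sec
    count = int(duration_s * profile.repeats_per_sec)
    return [round(i * interval, 3) for i in range(count)]


collision = profile_for("forward_collision")
call = profile_for("incoming_call")
collision_onsets = onset_times(collision, 1.0)  # four onsets in one second
call_onsets = onset_times(call, 2.0)            # a single onset in two seconds
```

    Keeping the parameters in one table makes the relationships between tiers easy to audit: each step up in urgency should move every parameter in the same perceptual direction.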

    By this stage, a clear philosophy had emerged. Audio UX is not simply concerned with creating pleasant sounds or memorable sonic logos. It asks how products should communicate with the people who use them every day. Strategy, branding, silence, musical structure and perceptual psychology all contribute towards that objective, though none of them represents the ultimate goal. Every design decision serves the relationship between people and technology. Once sound is understood as a form of communication rather than decoration, the challenge shifts from asking what a product should sound like to asking what it should say. That question became even more significant when Moore turned to the rapidly developing world of robotics.

    The second half of Moore’s presentation shifted from broad design principles towards a detailed case study that demonstrated how those ideas are applied in practice. The project centred on Serve, the autonomous delivery robot developed by Postmates. Rather than treating the robot simply as another product requiring notification sounds, Moore used it to explore a far broader question. How should an intelligent machine communicate with people as it moves through shared public spaces? The answer, he suggested, depends upon much more than selecting attractive sounds. It requires understanding personality, context, expectation and human behaviour long before the first sound is ever designed.

    Like every project discussed earlier in the presentation, the design process began with strategy rather than sound. Before any recording or composition took place, the team explored what the robot represented, how people would encounter it and the personality it should express. Several distinct sonic directions were developed around different interpretations of the brand before being refined through successive design reviews and evaluated within the robot itself. Moore emphasised that successful Audio UX develops through continual iteration rather than moments of inspiration. Sounds that appear convincing inside a studio may behave very differently once reproduced by a moving robot navigating busy streets, restaurants and crowded pavements. Testing therefore becomes an integral part of the creative process rather than simply a means of checking technical performance.

    One of the most revealing aspects of the project concerned personality. Popular culture has encouraged audiences to expect robots to communicate through futuristic electronic sounds or highly expressive synthetic voices. Moore deliberately avoided both extremes. The ambition was to create a robot that felt warm, approachable and reassuring without pretending to possess human intelligence or emotional awareness. At the same time, the team resisted the temptation to rely upon recorded speech, recognising that a natural voice would create expectations that the technology could not consistently fulfil. Instead, they searched for a middle ground in which sound suggested character without imitating humanity. This balance between familiarity and honesty reflected one of the most thoughtful ideas running throughout the presentation. Good Audio UX should communicate clearly without misleading users about a product’s capabilities.

    Developing that personality required exploration rather than immediate certainty. Moore described creating several contrasting sonic directions, each expressing a different interpretation of the robot’s identity. Some embraced more mechanical qualities that acknowledged the machine’s physical presence. Others explored vocal-like synthesis capable of suggesting expression without becoming literal speech. A third direction employed simple sine-wave tones that created a calmer, softer and more abstract character. Rather than choosing a favourite instinctively, these alternatives became prototypes through which designers could observe how people responded emotionally to different sonic identities. The final design combined warmth, clarity and subtle expressiveness, producing a robot that felt approachable without becoming theatrical or sentimental. The process illustrated an important principle: successful sound design rarely emerges fully formed. It develops through comparison, evaluation and refinement.

    Attention then shifted from the robot’s overall personality to the design of individual interactions. Different situations demanded different styles of communication. Interactions inside restaurants, where staff loaded deliveries into the robot, prioritised efficiency through short, direct auditory cues that confirmed actions without interrupting the workflow. Encounters with members of the public required a gentler approach. Longer note durations, more relaxed phrasing and softer musical gestures created the impression of patience rather than urgency. In effect, the robot adapted its acoustic behaviour according to the social environment in which it operated, much as people instinctively alter their own behaviour between professional and public settings.

    A particularly memorable example centred on one of the simplest interactions imaginable: saying, “Excuse me.” Rather than relying upon recorded speech, Moore developed a brief auditory gesture that politely attracted attention before allowing the robot to continue its journey. The intention was not to surprise pedestrians or demand an immediate response. Instead, the sound functioned more like a courteous acknowledgement of another person’s presence. This small interaction captured a principle that extended throughout the presentation. Effective communication often depends upon restraint rather than intensity. Products should seek attention only when attention genuinely needs to be given.

    Safety presented a different set of priorities. Warning sounds must communicate immediately and unambiguously, leaving little room for interpretation. Here, Moore returned to ideas introduced earlier in the presentation. Instead of inventing unfamiliar sonic languages, the design frequently drew upon acoustic references that people already understood from everyday experience. Turn indicators, movement cues and other operational sounds retained familiar characteristics while remaining consistent with the robot’s wider sonic identity. This reduced the need for users to learn entirely new sonic conventions. Familiar sounds could be interpreted almost instinctively, allowing people to respond appropriately without consciously analysing what they had heard.

    Perhaps the most technically demanding challenge involved the robot’s continuous movement through public space. Moore explored several possible solutions, including humming, whistling and slowly evolving tonal textures that he described as “glowing.” Each communicated the robot’s presence in a slightly different way. Some attracted attention more effectively, while others blended more comfortably into the surrounding soundscape. Extensive user testing, including sessions involving blind participants, revealed that restrained harmonic complexity and carefully controlled modulation proved more effective than more elaborate alternatives. Yet Moore resisted the temptation to increase the amount of sound simply to improve awareness. His longer-term ambition was quite the opposite. Intelligent products should become quieter rather than louder. If a robot recognises that nobody is nearby, there may be no need for it to produce sound at all.
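
    The longer-term ambition of a robot that stays silent when nobody is nearby could be sketched as a gain that depends on the distance to the nearest detected person. This is one possible reading of the idea, not a description of the Serve robot: the function name, the distance thresholds and the linear fade are all assumptions.

```python
def presence_gain(nearest_person_m, audible_within_m=8.0, full_within_m=2.0):
    """Gain (0.0 to 1.0) for a robot's continuous presence sound.

    nearest_person_m is the distance in metres to the closest detected
    person, or None when nobody is detected. Both thresholds are
    hypothetical defaults, not values from the project.
    """
    if nearest_person_m is None or nearest_person_m >= audible_within_m:
        return 0.0  # nobody close enough to need the cue: stay silent
    if nearest_person_m <= full_within_m:
        return 1.0  # someone is close: play the cue at its designed level
    # Linear fade between the two thresholds.
    span = audible_within_m - full_within_m
    return (audible_within_m - nearest_person_m) / span


empty_street = presence_gain(None)   # 0.0, no sound at all
approaching = presence_gain(5.0)     # 0.5, halfway through the fade
alongside = presence_gain(1.0)       # 1.0, full designed level
```

    Fading the cue in rather than switching it on is a design choice in this sketch; an abrupt onset at close range would work against the courteous, unsurprising character described above.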

    This idea provides a fitting conclusion to Moore’s broader philosophy of Audio UX. The discipline is not concerned with filling products with attractive sounds or memorable sonic logos. It asks how technology can communicate clearly, respectfully and appropriately with the people who use it. Whether designing for autonomous robots, electric vehicles, smartphones or medical devices, the same principles continue to apply. Strategy comes before implementation. Communication matters more than novelty. Personality should remain authentic. Silence deserves to be designed as carefully as sound itself. When those ideas come together successfully, sound ceases to be decoration and becomes an essential part of the conversation between people and technology.

  • How Does Sound Affect Us? Julian Treasure on Listening, Wellbeing, and Designing with Our Ears

    Julian Treasure

    How does sound affect us?

    Most people think about sound only when it becomes a problem. We notice the neighbour’s loud music, the traffic outside a bedroom window, the distracting conversation in an open-plan office or the shrill alarm that interrupts an otherwise quiet day. Far less attention is paid to the countless sounds that quietly shape our emotions, influence our behaviour and affect our health from one moment to the next. During his online guest lecture for Edinburgh Napier University, Julian Treasure argued that this oversight represents one of the greatest shortcomings of modern design. Buildings, products and public spaces are often designed primarily for the eye, while the ear receives remarkably little attention. Yet sound continually influences the way people think, work, communicate and feel. Throughout the presentation, one message emerged repeatedly. If we wish to design better experiences, we must learn to design with our ears.

    Treasure began by asking what sound actually affects. The answer, he suggested, is surprisingly simple. Sound influences our happiness, our effectiveness and our wellbeing, along with those of everybody who shares the environments we create. This observation immediately shifts the discussion away from traditional concerns about noise control or acoustic specifications. Sound becomes a human issue rather than merely a technical one. The quality of an acoustic environment influences far more than whether a room sounds pleasant. It affects how effectively people communicate, how comfortably they work, how safely they respond to hazards and how they experience the spaces in which they spend their lives. Sound therefore deserves to be regarded as one of the fundamental materials of design rather than an afterthought considered once construction has already been completed.

    To explain why sound exerts such profound influence, Treasure described four principal ways in which it affects human beings. The first is physiological. Unlike vision, which depends upon the direction in which we happen to be looking, hearing continuously monitors the environment around us. Human beings cannot close their ears in the way they close their eyes. Throughout evolution, this has made hearing our primary warning system, continually searching for signs of danger beyond the limits of our vision. As a consequence, sound reaches deeply into the nervous system with remarkable immediacy. Sudden or unpleasant sounds trigger hormonal responses associated with stress and vigilance, while calmer acoustic environments encourage relaxation. Treasure illustrated this contrast using familiar examples. An unexpected loud noise immediately increases physiological arousal, whereas gentle natural sounds such as breaking waves often slow breathing and encourage a sense of calm. These responses are not matters of personal preference alone. Sound influences heart rate, hormone secretion, breathing patterns and even patterns of brain activity, quietly shaping the body’s internal rhythms throughout the day.

    The discussion then moved beyond physiology towards psychology. Music provides perhaps the most familiar illustration of this relationship. People instinctively choose particular music to celebrate, to concentrate, to relax or to reflect, recognising that different sounds evoke different emotional states. Treasure argued that natural sounds often produce similarly powerful responses. Birdsong, for example, tends to create feelings of safety and reassurance. Rather than being arbitrary preferences, these reactions may reflect deep evolutionary associations developed over thousands of generations. Birds sing when environmental conditions are relatively safe, allowing those sounds to become unconsciously associated with security. Although listeners rarely analyse these processes consciously, they nevertheless influence emotional experience in subtle yet persistent ways. For sound designers, this observation carries important implications. Designing an acoustic environment involves much more than controlling sound levels. It also requires understanding the emotional associations that different sounds naturally evoke.

    Treasure’s argument became even more relevant to contemporary workplaces when he turned to the cognitive effects of sound. Human attention is a limited resource. People often imagine that they can listen to several conversations simultaneously, though the reality proves rather different. Treasure observed that the human brain possesses only a limited capacity for processing speech, making it extremely difficult to concentrate when nearby conversations compete for attention. Open-plan offices provide a familiar example. Designers frequently value openness, flexibility and visual communication, yet the resulting soundscape often undermines the very productivity these environments seek to encourage. Nearby speech continually draws attention away from the task at hand, interrupting concentration and increasing mental effort. Research cited during the presentation suggests that productivity can fall dramatically under these conditions, illustrating that acoustic design contributes directly to cognitive performance rather than simply influencing comfort. Decisions about the sonic character of workplaces therefore become decisions about how effectively people can think.

    By this stage, a broader pattern had already become clear. Sound is not simply something that accompanies our activities. It shapes them. Physiological responses, emotional reactions and cognitive performance all depend, to varying degrees, upon the acoustic environments within which people live and work. This perspective challenges a long-standing tendency to regard sound as secondary to visual design. Treasure instead presented listening as a central consideration for architects, designers, engineers and sound professionals alike. Before deciding how a space should look, he suggested, we should also ask how it will sound, and how those sounds will influence the people who experience them every day. That question would remain at the heart of the remainder of the discussion.

    Having established that sound influences our physiology, psychology and cognition, Treasure turned to its effect upon behaviour. This influence often operates below the level of conscious awareness, making it particularly easy to overlook. Most people assume they make decisions independently of their acoustic surroundings, yet evidence suggests otherwise. Unpleasant environments encourage people to leave sooner, while attractive soundscapes invite them to remain longer. Treasure illustrated this with a striking study of consumer behaviour. In a supermarket displaying French and German wines with identical visual presentation, researchers changed nothing except the background music. On days when French music was played, French wine substantially outsold German wine. When German music replaced it, purchasing patterns reversed. Customers generally remained unaware that the music had influenced their choices, demonstrating that sound can shape behaviour without requiring conscious attention. For designers, retailers and architects alike, this example reinforced an important point. The acoustic environment is never simply a backdrop. It actively participates in shaping human decisions.

    The implications extend far beyond retail spaces. Treasure argued that every designed environment communicates through sound, whether intentionally or otherwise. A restaurant may create an atmosphere that encourages relaxed conversation, while another overwhelms diners with reverberation and competing voices. A hospital waiting room may reduce anxiety through carefully considered acoustics, or increase it through intrusive alarms and mechanical noise. An office may support concentration, or continually undermine it through poorly managed speech privacy. In each case, the acoustic environment becomes part of the overall design, influencing how people behave within the space. Designers therefore make decisions about human experience whenever they make decisions about sound, even if those decisions consist of ignoring it altogether.

    Treasure observed that this neglect reflects a broader imbalance within contemporary design practice. Buildings are routinely judged according to their appearance, products are evaluated through their visual form, and digital technologies devote enormous attention to graphical interfaces. Comparatively little thought is often given to how these same environments sound. This imbalance is surprising when one considers that hearing operates continuously. We can choose where to look, though we cannot simply decide to stop hearing the world around us. Sound therefore accompanies every activity, continually influencing perception in ways that visual design alone cannot achieve. Rather than treating acoustics as a specialist concern addressed late in a project, Treasure encouraged students to recognise listening as a fundamental design consideration from the very beginning.

    This perspective resonates strongly with professional sound design. Whether creating a film soundtrack, designing interface sounds, producing a virtual instrument or developing an interactive game, practitioners rarely add sound simply to occupy silence. Every sound communicates information, guides attention or influences emotional response. Treasure’s presentation broadened this principle beyond media production into everyday life. The same questions that sound designers ask while constructing a soundtrack also apply to architecture, product design and urban planning. What should the listener notice? Which sounds deserve emphasis? Which should remain unobtrusive? How can sound support rather than distract from the intended experience? The boundaries between sound design and environmental design begin to blur once listening itself becomes the central concern.

    Perhaps the most compelling aspect of Treasure’s argument lay in its optimism. If sound can undermine wellbeing, productivity and behaviour, it can equally improve them. Pleasant acoustic environments encourage relaxation, reduce physiological stress and support clearer thinking. Appropriate sound can strengthen communication, promote social interaction and make public spaces more welcoming. Rather than presenting acoustics as a matter of reducing unwanted noise, Treasure reframed the discussion in positive terms. The objective is not simply to remove bad sound, but to create environments in which good sound actively contributes to human wellbeing. This shift in perspective encourages designers to think creatively about the role sound can play rather than treating it solely as a problem to be controlled.

    These ideas naturally led towards a broader discussion of listening itself. If sound exerts such profound influence over human experience, then the ability to listen carefully becomes an essential professional skill rather than an incidental personal habit. Treasure suggested that hearing and listening are not the same activity. Hearing occurs automatically, while listening demands conscious attention, intention and practice. In an increasingly noisy world filled with competing sources of information, the ability to listen thoughtfully may be becoming more valuable rather than less. This distinction between passive hearing and active listening would ultimately form the foundation of his concluding message, not only for sound designers but for anyone responsible for creating environments in which other people live, work and communicate.

    Having demonstrated that sound influences physiology, emotion, cognition and behaviour, Treasure turned towards a more practical question. If sound has such profound effects upon human experience, what should designers actually do differently? His answer was strikingly optimistic. Rather than treating acoustics as a problem to be solved, he encouraged students to think of sound as a resource that can be shaped deliberately to improve people’s lives. Well-designed soundscapes do more than reduce unwanted noise. They encourage particular patterns of behaviour, support communication and create environments in which people feel healthier, calmer and more engaged. Designing with sound therefore becomes an act of positive intervention rather than damage limitation.

    Treasure illustrated this philosophy through a series of real-world projects. Airports, shopping centres and public spaces all benefited from carefully designed soundscapes that considered not only what people heard, but how those sounds influenced the way they behaved. Introducing natural sounds and thoughtfully composed musical environments increased customer satisfaction, encouraged visitors to remain longer and, in several cases, improved commercial performance. In one public space, the introduction of a biophilic soundscape was even associated with a measurable reduction in crime. These examples reinforced a central point running throughout the presentation. Sound does not merely accompany human activity. It shapes it. Decisions about the acoustic environment therefore become decisions about wellbeing, behaviour and social experience rather than simply matters of technical acoustics.

    Although these examples came from architecture and environmental design, their relevance extends directly to professional sound design. Every soundtrack contains foreground and background elements competing for the listener’s attention. Treasure encouraged students to think carefully about the role each sound should play within that wider acoustic picture. Not every sound deserves prominence, and not every moment benefits from additional music or greater complexity. Like a visual composition, an effective soundscape depends upon balance, hierarchy and clarity. He also encouraged designers to draw inspiration from natural environments, particularly through the thoughtful use of biophilic sound and adaptive or generative soundscapes that evolve over time rather than repeating mechanically. Different spaces support different activities, and their sonic character should reflect those differing purposes. Designing for concentration requires different acoustic decisions from designing for relaxation, learning or social interaction.

    The discussion naturally led back to listening itself. Treasure argued that hearing should never be confused with listening. Hearing is automatic. Listening is intentional. It requires attention, effort and continual practice. In an age characterised by constant distraction and increasingly complex acoustic environments, the ability to listen carefully becomes one of the most valuable professional skills a sound designer can develop. Technical expertise undoubtedly remains important, though it cannot substitute for careful listening. The most sophisticated recording equipment or software offers little value if the designer fails to recognise what listeners actually experience. Listening therefore becomes both a creative skill and an ethical responsibility. Before changing the sound of the world, designers must first learn to hear it properly.

    Treasure concluded by describing what he called the four foundations of effective listening: being conscious, committed, compassionate and curious. Conscious listening requires recognising that listening is an active process rather than a passive consequence of hearing. Commitment acknowledges that good listening demands time, attention and intention. Compassion encourages genuine understanding of other people through careful listening, particularly when viewpoints differ from our own. Curiosity reminds us that every sound and every conversation offers an opportunity to learn something new. Although these principles were presented in the context of listening, they also describe many of the qualities that distinguish thoughtful sound designers. Successful practitioners remain attentive, purposeful, empathetic and continually curious about how people experience the acoustic world around them.

    Treasure’s final appeal brought together everything that had preceded it. He encouraged students to become champions of listening and, above all, to “design with your ears.” This simple phrase encapsulated the wider philosophy running throughout the presentation. Sound should never be regarded as an afterthought added once visual design has been completed. It is one of the primary ways in which people experience the world. Every building, product, public space and interactive system possesses an acoustic identity that influences those who encounter it. Whether designing a film soundtrack, a hospital, a mobile application or a railway station, the same principle applies. The sounds we create shape the lives of the people who hear them.

    Taken together, Treasure’s presentation offered a compelling vision of contemporary sound design. It challenged the traditional tendency to regard sound as secondary to vision and instead positioned listening at the centre of human experience. Physiological responses, emotional wellbeing, cognitive performance, behaviour and communication all depend, to varying degrees, upon the acoustic environments we inhabit. For sound designers, this represents both an opportunity and a responsibility. Every decision about sound has consequences extending beyond aesthetics alone. Designing well therefore means more than creating compelling audio. It means understanding how people listen, recognising how profoundly sound affects everyday life and applying that knowledge to create environments in which individuals and communities can genuinely flourish.

  • How Do You Design a Virtual Instrument? Alejandro Cabrera on Sampling, Sound Design, and Building Kontakt Libraries

    Alejandro Cabrera

    How do you design a virtual instrument?

    Every virtual instrument begins long before the first note is recorded. Musicians often experience sample libraries as polished products that load instantly inside a digital audio workstation, responding naturally to every performance. Hidden behind that apparent simplicity lies an extraordinary amount of planning, recording, editing and technical development. During an online guest lecture for Edinburgh Napier University, Sound Design alumnus Alejandro Cabrera drew upon his professional experience developing sample libraries at 8Dio to reveal how professional virtual instruments are created. Although he used Kontakt to illustrate many of the techniques, his wider message extended well beyond any individual software platform. Successful sound design depends as much upon preparation, organisation and critical listening as it does upon recording itself.

    Cabrera began by challenging a common misconception. Building a virtual instrument is not simply a matter of recording every note and loading the resulting files into a sampler. Instead, it is a carefully structured process comprising pre-production, recording, editing and software development, with each stage influencing everything that follows. Recording sessions may occupy only a small proportion of the overall project, yet their success depends almost entirely upon the decisions made beforehand. Choosing the instrument, selecting an appropriate recording space, determining microphone configurations, deciding which articulations should be captured and calculating the number of samples required all take place before the recording engineer presses record. By the time the first note is performed, many of the most significant creative decisions have already been made.

    Planning emerged as one of the defining themes of Cabrera’s presentation. Recording studios are expensive environments in which every unnecessary decision consumes valuable time. Arriving without a detailed recording plan risks producing inconsistent material, overlooking essential articulations or capturing far more audio than the finished instrument will ever require. To avoid these problems, Cabrera demonstrated the production sheets used to calculate precisely how many samples each instrument will need. The combination of notes, microphone positions, dynamic layers, articulations and recorded variations quickly expands into thousands of individual files. Even a comparatively modest instrument can generate an unexpectedly large collection of audio once every variation has been considered. Careful preparation therefore becomes far more than administrative organisation. It provides the framework upon which the entire virtual instrument will later be constructed.
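
    The arithmetic behind such a production sheet can be sketched in a few lines of Python. This is only an illustration of how the factors multiply, not the sheet Cabrera showed: the class name, field names and every figure below are assumptions chosen for the example.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class RecordingPlan:
    """Hypothetical production-sheet row; every field name is illustrative."""

    notes: int           # distinct pitches to record
    mic_positions: int   # microphone positions captured simultaneously
    dynamic_layers: int  # dynamic (velocity) layers per note
    articulations: int   # playing techniques, e.g. sustain or staccato
    round_robins: int    # recorded variations of each performance

    def takes(self) -> int:
        """Performances the player has to record."""
        return prod(
            (self.notes, self.dynamic_layers, self.articulations, self.round_robins)
        )

    def files(self) -> int:
        """Audio files produced: one per microphone position per take."""
        return self.takes() * self.mic_positions


# Assumed figures for a comparatively modest instrument.
plan = RecordingPlan(
    notes=48, mic_positions=3, dynamic_layers=4, articulations=5, round_robins=3
)
print(f"{plan.takes()} takes, {plan.files()} files")
```

    With these assumed figures, a four-octave instrument already requires 2,880 takes and 8,640 files, which is why the numbers are worth settling before the studio is booked.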

    This emphasis upon preparation reflects a broader principle that extends well beyond sample library development. Whether recording Foley, ambience, dialogue or musical instruments, professional sound designers rarely begin by placing microphones in front of a source and hoping for the best. They begin by asking what the finished project needs to achieve. Every technical decision should support that objective. Microphone placement depends upon the character of the instrument, the intended listening experience and the amount of flexibility required during production. Recording an intimate acoustic instrument demands different decisions from sampling a full drum kit with multiple microphone positions, while noisy environments require different strategies from carefully controlled studio spaces. Cabrera encouraged students to think of recording not as an isolated technical exercise, but as one stage within a much larger design process in which every decision influences those that follow.

    One particularly revealing discussion centred upon how rapidly complexity increases once realism becomes the goal. Professional sample libraries rarely rely upon a single recording of each note. Different playing dynamics, alternative articulations, multiple microphone positions and repeated performances all contribute towards creating an instrument that responds naturally to the performer. Cabrera introduced concepts such as velocity layers and round robins, not simply as software features, but as perceptual design decisions. Human listeners detect repeated sounds remarkably quickly. Replaying exactly the same recording whenever a note is triggered produces an artificial, mechanical quality that immediately reveals the illusion. Recording carefully controlled variations allows the instrument to remain convincing even during repeated passages, illustrating that realism often depends less upon producing more sound than upon introducing meaningful variation. The objective is not to simulate every possible performance. It is to create enough believable variation that musicians stop thinking about the technology and simply play.
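
    The two concepts can be illustrated with a minimal sketch. It is plain Python rather than Kontakt scripting, and the layer boundaries, the number of round robins and the sample naming are all assumptions made for the example, not details from the lecture.

```python
from collections import defaultdict

# Assumed layout: three velocity layers, each with its upper MIDI velocity bound.
VELOCITY_LAYERS = [(42, "soft"), (84, "medium"), (127, "loud")]
ROUND_ROBINS = 3  # assumed number of recorded variations per note and layer


def velocity_layer(velocity: int) -> str:
    """Map a MIDI note-on velocity (1-127) to the name of a dynamic layer."""
    if not 1 <= velocity <= 127:
        raise ValueError(f"velocity out of range: {velocity}")
    # The first layer whose upper bound covers the velocity wins.
    return next(name for upper, name in VELOCITY_LAYERS if velocity <= upper)


class RoundRobinSampler:
    """Cycles through recorded variations so repeated notes never sound identical."""

    def __init__(self, round_robins: int = ROUND_ROBINS) -> None:
        self._round_robins = round_robins
        self._next = defaultdict(int)  # (note, layer) -> next variation index

    def trigger(self, note: int, velocity: int) -> str:
        """Return a hypothetical sample identifier for this note-on event."""
        layer = velocity_layer(velocity)
        key = (note, layer)
        variation = self._next[key]
        self._next[key] = (variation + 1) % self._round_robins
        return f"note{note:03d}_{layer}_rr{variation + 1}"


sampler = RoundRobinSampler()
repeated = [sampler.trigger(60, 100) for _ in range(4)]
print(repeated)
```

    Four identical key presses select variations 1, 2, 3 and then 1 again, so no recording is heard twice in succession. Each note and layer keeps its own counter, which means a soft note played in between does not disturb the cycle of the loud ones.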

    By this point, a recurring theme had become unmistakable. Building a convincing virtual instrument is not primarily a software problem. It is a sound design problem. The quality of the finished library depends upon understanding the instrument, anticipating how musicians will perform with it and making thoughtful decisions long before the first recording session begins. Technology undoubtedly provides the tools, though preparation, organisation and critical listening determine how successfully those tools can ultimately be used.

    Once the recordings have been completed, the project enters what is often the longest and least visible stage of development. Thousands of individual recordings must be reviewed, edited and organised before they can become a playable instrument. Cabrera emphasised that this work extends far beyond removing unwanted noise or trimming the beginnings and endings of files. Every sample must behave consistently alongside every other sample, allowing the finished instrument to respond naturally regardless of how it is played. Editing therefore becomes a continuation of the design process rather than a separate technical activity. Decisions made at this stage shape the responsiveness of the instrument every bit as much as the recordings themselves.

    Organisation proved equally important. A professional sample library may contain many thousands of individual audio files representing different notes, articulations, dynamic levels, microphone positions and performance variations. Without a rigorous naming convention and carefully structured file management, even relatively modest projects quickly become difficult to maintain. Cabrera demonstrated how systematic organisation supports every subsequent stage of development. Samples can be located immediately, revisions become easier to implement and future updates remain manageable long after the original recording sessions have finished. Good organisation rarely attracts attention, yet it underpins almost every successful production workflow.
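
    One way to make a naming convention rigorous is to generate and validate file names in code rather than typing them by hand. The sketch below assumes a hypothetical pattern of instrument, articulation, MIDI note, dynamic layer, round robin and microphone position. It is not the convention used at 8Dio, which the lecture summary does not specify.

```python
import re
from typing import NamedTuple


class SampleName(NamedTuple):
    """Fields encoded in a file name; the scheme itself is an assumption."""

    instrument: str
    articulation: str
    note: int         # MIDI note number, 0-127
    layer: int        # dynamic layer, 1 = softest
    round_robin: int  # variation number, starting at 1
    mic: str          # microphone position, e.g. "close" or "room"


# Example of a matching name: cello_sustain_n060_v2_rr1_close.wav
PATTERN = re.compile(
    r"^(?P<instrument>[a-z0-9]+)_(?P<articulation>[a-z0-9]+)_n(?P<note>\d{3})"
    r"_v(?P<layer>\d+)_rr(?P<round_robin>\d+)_(?P<mic>[a-z0-9]+)\.wav$"
)


def build(sample: SampleName) -> str:
    """Render the fields as a file name in the assumed convention."""
    return (
        f"{sample.instrument}_{sample.articulation}_n{sample.note:03d}"
        f"_v{sample.layer}_rr{sample.round_robin}_{sample.mic}.wav"
    )


def parse(filename: str) -> SampleName:
    """Recover the fields from a file name, rejecting anything off-convention."""
    match = PATTERN.match(filename)
    if match is None:
        raise ValueError(f"file name does not follow the convention: {filename!r}")
    fields = match.groupdict()
    return SampleName(
        fields["instrument"],
        fields["articulation"],
        int(fields["note"]),
        int(fields["layer"]),
        int(fields["round_robin"]),
        fields["mic"],
    )


original = SampleName("cello", "sustain", 60, 2, 1, "close")
filename = build(original)
print(filename, parse(filename) == original)
```

    Because every name can be parsed back into its fields, a script can list missing round robins or misnamed takes across thousands of files instead of relying on a manual check.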

    The discussion then turned to Kontakt, the software platform used to assemble these recordings into fully playable virtual instruments. Rather than presenting Kontakt as a collection of technical features, Cabrera used it to demonstrate a broader principle. Software should serve the behaviour of the instrument rather than dictate it. Every mapping decision, performance control and scripting choice exists to make the instrument respond in ways that feel intuitive to the musician. The objective is not simply to trigger recordings accurately, but to create the impression that a real instrument is responding naturally to performance. Technology becomes valuable only when it disappears behind the experience of playing.

    This philosophy also shaped Cabrera’s discussion of scripting. Many musicians never see the programming that sits beneath the graphical interface, yet these invisible systems determine how the instrument behaves. Scripts decide which recordings should be triggered, how different articulations are selected, how repeated notes vary over time and how controls respond to the performer. Much of the intelligence within a modern virtual instrument therefore lies not in the recordings themselves, but in the logic that governs their behaviour. Sound design, software engineering and user experience become closely interconnected, each contributing towards the illusion that the performer is interacting with a coherent musical instrument rather than a collection of audio files.

    Throughout the discussion, Cabrera consistently resisted the temptation to equate realism with complexity. Recording more samples, adding more controls or increasing the number of available options does not automatically produce a better instrument. Every additional recording increases editing time, complicates organisation and places greater demands upon storage, processing power and the musician using the library. The more important question concerns value rather than volume. Which additional recordings genuinely improve the playing experience, and which merely increase complexity without offering meaningful benefit? Successful virtual instruments emerge through thoughtful selection rather than unlimited accumulation.

    These decisions reflect a much broader principle within sound design. Whether recording dialogue, creating Foley, designing interactive game audio or developing sample libraries, practitioners continually shape the listener’s experience by deciding which details deserve attention and which can remain implicit. Technology undoubtedly expands the range of available possibilities, though it rarely removes the need for editorial judgement. Every successful project depends upon identifying the information that listeners or performers genuinely need, then presenting it clearly without unnecessary complication. The objective is not technical excess, but meaningful communication.

    The discussion also highlighted the collaborative nature of professional practice. Developing a virtual instrument combines disciplines that are often treated separately within education and industry. Recording engineers, musicians, software developers, editors, interface designers and producers each contribute different forms of expertise, yet the finished instrument succeeds only when those contributions work together coherently. Cabrera’s examples demonstrated that professional sound design rarely develops in isolation. The most effective solutions emerge when technical and creative perspectives continually inform one another throughout the production process rather than being treated as independent stages.

    Taken together, these discussions revealed that virtual instruments represent far more than collections of recorded sounds. They are carefully designed systems that combine acoustics, performance, recording, editing and software into a single expressive tool. Every decision, from the earliest planning documents to the final user interface, contributes towards the illusion that a performer is interacting with a living instrument rather than triggering digital recordings. For sound designers, perhaps that is the most enduring lesson. The success of a design is rarely determined by the sophistication of its technology alone. It depends upon how completely the technology disappears, allowing creativity, expression and musical performance to take centre stage.

  • How Does a Crowd Find Its Voice? David Monteath on Crowd ADR, Performance, and Creating Believable Worlds

    David Monteath

    How does a crowd find its voice?

    When audiences watch a film or television programme, their attention naturally settles upon the principal actors. Far less notice is taken of the countless background voices that transform a collection of images into a believable social world. Conversations drifting through a restaurant, murmured discussions in an office, distant arguments in a crowded street or the indistinct atmosphere of a busy marketplace all contribute to the impression that life continues beyond the central characters. Remove those voices, and even the most carefully photographed scene can feel strangely artificial. During his online guest lecture for Edinburgh Napier University, David Monteath returned to the University as a Sound Design alumnus to explore the specialised craft of crowd ADR. Drawing upon more than three decades working as an actor and voice artist, he demonstrated that believable crowd performances depend upon observation, improvisation and an understanding of dramatic context rather than simply recording large numbers of voices. One principle underpinned the discussion. Context is king.

    Rather than replacing the dialogue of principal actors, crowd ADR creates the sense that an entire world exists beyond them. A small group of performers may become the customers in a restaurant, the spectators at a football match, the passengers waiting on a railway platform or the crowd gathered in a courtroom. Individual conversations overlap, reactions ripple through the group and emotional responses emerge at precisely the right moments, creating the impression that every person visible on screen possesses a life extending beyond the immediate story. Audiences rarely notice these performances consciously, yet they immediately recognise when they are missing. Scenes that lack convincing crowd performances often feel unexpectedly empty, regardless of how carefully they have been photographed or edited.

    Monteath repeatedly challenged the assumption that this work consists simply of creating background noise. Crowd ADR is first and foremost a form of acting. Every performance responds to the circumstances of the scene, the relationships between characters and the emotional atmosphere established by the director. People waiting quietly in a hospital corridor behave differently from supporters leaving a football stadium. Conversations in an expensive restaurant differ from those heard in a busy café, while voices surrounding a royal procession carry a very different energy from those accompanying a political protest. Every reaction, interruption and fragment of conversation exists to support the dramatic reality of the scene rather than to attract attention in its own right. Authenticity emerges from understanding how people genuinely behave in different situations, not from making scenes louder or busier.

    This emphasis upon dramatic context shaped every practical discussion throughout the lecture. Monteath encouraged students to think beyond individual words and instead consider the circumstances in which those words are spoken. Before deciding how loudly to speak, how quickly to react or even what might be said, performers first need to understand where they are, who surrounds them and what is happening within the story. The same phrase may require entirely different delivery depending upon whether it takes place in a library, an airport, a football ground or the middle of a battlefield. Successful crowd performers therefore begin by observing people. Everyday behaviour, casual conversations, shared laughter, hesitation, disagreement and excitement all provide material that can later be adapted naturally within the recording studio. The objective is not to invent behaviour, but to recognise and recreate it convincingly.

    Perhaps the most revealing insight from this opening part of the lecture concerned the relationship between realism and audibility. Many beginning sound designers instinctively assume that important sounds should always be heard clearly. Monteath argued for almost the opposite approach. Crowd ADR often succeeds precisely when audiences remain largely unaware of it. Background voices should usually be felt rather than heard, contributing movement, texture and emotional energy without competing with the principal dialogue. Monteath returned repeatedly to the idea that audiences should sense the presence of a living world long before they consciously identify individual voices. Crowd ADR achieves its greatest success not when listeners admire the performance, but when they accept the world on screen without ever questioning how it came to life.

    One of the most valuable themes running through the lecture concerned the difference between sounding natural and sounding believable. These ideas are not always identical. Performers working in crowd ADR rarely speak at the same level they would use in everyday conversation, yet exaggeration can become equally unconvincing. Monteath described the continual process of judging how voices should sit within the perspective of the scene. A performer passing close to the camera requires a different vocal presence from someone crossing the background several metres away, while conversations taking place outdoors demand a different energy from those occurring in confined interior spaces. Every decision depends upon dramatic perspective rather than fixed performance rules. Context, once again, determines everything. For sound designers, these distinctions become equally important during editing and mixing. A crowd recording that sounds entirely convincing in isolation may feel unexpectedly prominent once placed alongside production dialogue, Foley and ambience. Perspective therefore emerges through the relationship between every element of the soundtrack rather than through any individual recording considered on its own.

    This attention to perspective extends beyond volume alone. Monteath discussed the subtle adjustments people make instinctively when speaking in different environments. Outside, voices naturally rise in level before settling into an appropriate projection as people unconsciously judge the surrounding space. He compared this process to a form of echolocation. Speakers continually test their surroundings, modifying projection almost instantly until their voice feels appropriate for the environment. Recording inside a studio removes many of the environmental cues that normally guide these unconscious adjustments, requiring performers to recreate them deliberately. The challenge is not simply to speak more loudly for an exterior scene, but to reproduce the natural behaviour that accompanies speaking outdoors. Audiences rarely analyse these details consciously, though they recognise immediately when they feel unconvincing. Successful crowd ADR therefore depends upon recreating patterns of human behaviour rather than merely increasing vocal intensity.

    The physical demands of crowd ADR also proved far greater than many students had expected. Scenes involving panic, conflict or large-scale action often require sustained shouting over many hours, placing considerable strain on performers’ voices. Monteath reflected upon sessions in which actors had pushed themselves to the point of temporary vocal exhaustion, particularly when recording intense battle scenes. Curiously, he observed that shouting repeatedly inside a recording studio often proves more tiring than raising the voice naturally outdoors. In everyday life, people instinctively project according to their surroundings. Within the artificial environment of a studio, performers can find themselves holding unnecessary tension in the throat in ways that feel surprisingly unnatural. Maintaining vocal health therefore becomes an important professional skill alongside acting itself. It also reflects another aspect of professional sound design that audiences rarely consider. Recordings capable of conveying fear, excitement or urgency often depend upon performers sustaining physically demanding work throughout lengthy recording sessions while preserving consistency from one take to the next.

    The discussion of large battle sequences illustrated another revealing aspect of the profession. Crowd performers may spend an entire day creating layers of screams, reactions and movement for scenes involving hundreds or even thousands of people, fully aware that much of their work will eventually disappear beneath music, sound effects and the principal action. Monteath recalled recording material for a major battle sequence in Game of Thrones, where hours of physically demanding vocal performances ultimately became almost imperceptible within the finished soundtrack. Rather than expressing disappointment, he presented this as an inevitable consequence of professional sound design. The objective was never for individual performances to stand out. Their purpose was to contribute energy, scale and credibility to the scene, even if audiences remained almost entirely unaware of their presence. The irony is that some of the hardest work in post-production often becomes the least conspicuous in the finished mix.

    Monteath’s recurring phrase, “Context is king,” captures this philosophy particularly well. Every vocal decision derives from the dramatic situation rather than from the performer. Voices rise or fall according to the surrounding environment, emotional reactions emerge in response to the unfolding action and every fragment of conversation exists to reinforce the illusion that life extends beyond the principal characters. Successful crowd ADR is therefore measured not by how clearly individual voices are heard, but by how convincingly they allow audiences to believe in the world unfolding around them. Like many aspects of professional sound design, its greatest achievement lies in remaining almost invisible while making the fictional world feel entirely real.

    The lecture concluded with a discussion that moved beyond recording techniques and towards the broader decisions that shape professional sound design. One student described the challenge of creating the atmosphere for a bank robbery scene. Adding more and more voices had seemed the obvious solution, yet the result quickly became cluttered and distracted from the drama. Monteath’s response illustrated once again why crowd ADR depends upon judgement rather than quantity. Real crowds rarely behave as a single, unified group. Even in moments of fear, surprise or excitement, different people react at different times and in different ways. Some remain silent, others whisper, a few call out, while many simply watch events unfold. Attempting to represent every visible person with an equally prominent vocal performance often produces a soundtrack that feels less realistic rather than more so. Believability emerges through carefully judged variation, allowing individual reactions to appear and disappear naturally instead of competing continuously for the listener’s attention.

    This observation extends well beyond crowd ADR. Throughout post-production, sound designers continually decide what deserves the audience’s attention and what should remain part of the wider acoustic environment. A convincing soundtrack is not created through the accumulation of detail, but through the careful organisation of that detail into a coherent dramatic experience. Crowd performances occupy a role similar to ambience, Foley and environmental sound. They establish context, scale and emotional texture without constantly demanding attention. Their purpose is not to demonstrate how much work has been carried out, but to convince audiences that the world extending beyond the principal characters already exists. Like every other element of a soundtrack, their success depends upon supporting the story rather than competing with it.

    Towards the end of the lecture, discussion turned to the growing influence of artificial intelligence within the voice industry. Monteath acknowledged that AI is already beginning to affect areas such as commercial voice-over, where some clients have started experimenting with synthetic voices. He regarded crowd ADR rather differently. While aspects of the work may eventually become automated, authentic crowd performance depends upon subtle variations that emerge naturally whenever people work together. Voices change over the course of a recording session as performers become tired. Emotional intensity shifts between takes. Individual personalities influence rhythm, timing and vocal colour in ways that are difficult to predict or reproduce consistently. These variations might appear inconvenient from a purely technical perspective, yet they contribute directly to the richness, unpredictability and authenticity that audiences instinctively recognise as human. Technology will continue to evolve, though observation, collaboration and performance remain at the heart of believable sound design.

    For sound design students, perhaps the most valuable lesson lay in the way Monteath described his profession. Crowd ADR may appear to occupy the margins of post-production, hidden beneath dialogue, music and sound effects, yet it influences how audiences perceive almost every scene they watch. Every murmur in the background of a restaurant, every distant conversation in a station concourse and every carefully judged reaction during a moment of crisis contributes to the illusion that life continues beyond the frame. These performances do not simply fill silence. They create social spaces that feel inhabited, allowing viewers to concentrate on the story without questioning the reality of the world surrounding it.

    Throughout the lecture, Monteath returned repeatedly to one deceptively simple principle: “Context is king.” Crowd ADR succeeds not through memorable performances or individually recognisable voices, but through creating the impression that every environment extends beyond the limits of the frame. Every carefully judged laugh, argument, whispered conversation and fleeting reaction reinforces a believable social world without distracting from the principal narrative. For sound designers, this represents a broader lesson that reaches far beyond dialogue replacement. Successful audio is rarely measured by how noticeable it becomes. More often, it is measured by how completely it allows audiences to believe in the world they are experiencing. Crowd ADR exemplifies that philosophy. It remains one of the least visible aspects of professional sound design, yet it is also one of the crafts that most quietly transforms moving images into convincing places inhabited by believable people.

  • How Much Sound Does a Game Really Need? Gaetan Troutet on Casual Games, Creative Restraint, and Designing for the Real World

    Gaetan Troutet

    How much sound does a game really need?

    Most players never notice the sounds that have been deliberately left out of a game. During his online guest lecture for Edinburgh Napier University, Gaetan Troutet suggested that this is often the hallmark of successful sound design. Creating an effective soundtrack is rarely about filling every moment with audio. It is about deciding what genuinely deserves to be heard. Drawing upon his work developing casual games for Global Eagle Entertainment, he demonstrated how technical limitations, player behaviour and careful editorial judgement shape almost every creative decision. A single principle underpinned the discussion. Successful sound design depends as much upon restraint as upon invention.

    The environment in which Troutet’s games are played makes these decisions particularly demanding. Unlike many commercial titles developed for dedicated gaming hardware, his work must function across a diverse collection of in-flight entertainment systems installed on aircraft across the world. Some platforms provide comparatively modern hardware with generous storage and processing resources. Others continue to rely upon considerably older systems whose limited memory and bandwidth require soundtracks to be simplified before they can be deployed. The same game may therefore exist in several different technical versions, each shaped by the capabilities of the hardware on which it will eventually run. Even then, the hardware represents only part of the challenge. Every passenger experiences the soundtrack differently. Some use the headphones supplied by the airline, others connect their own, while many later encounter the same games on mobile devices with entirely different loudspeakers. Unlike a cinema or recording studio, there is no single reference listening environment. Troutet suggested that professional sound designers should accept this uncertainty rather than attempting to eliminate it. The objective is not to produce a soundtrack that sounds perfect under ideal conditions. It is to create one that continues to communicate effectively wherever it is heard.

    Although the lecture centred upon casual games, the questions Troutet raised apply to sound design far more generally. Every project exists within practical constraints, whether they involve memory budgets, processing power, production schedules or playback systems. Rather than viewing these restrictions as obstacles to creativity, Troutet argued that they often encourage clearer thinking. Once every sound occupies valuable storage, competes for the listener’s attention and requires implementation within a functioning game, designers become far more selective about what truly matters. Working as the sole audio practitioner within his development team reinforces that perspective. Troutet moves continually between creating sound effects, composing music, recording dialogue, implementing assets and collaborating with programmers and designers. Rather than treating these activities as separate disciplines, he presented them as interconnected parts of a single design process. Creative decisions influence implementation, technical limitations shape artistic choices and production realities affect every stage of development. Sound design therefore becomes inseparable from the wider process of building the game itself.

    One of the most thought-provoking moments in the lecture centred upon what appears to be a deceptively simple question. When a player performs an action, should that action always produce a sound? Many beginning designers instinctively answer yes. Buttons receive clicks. Menus receive confirmation tones. Every movement, selection, reward and transition appears to justify another layer of feedback. Troutet challenged this assumption directly. Rather than asking which sounds could be added, he encouraged students to ask which sounds genuinely improved the experience. Every additional sound competes for the listener’s attention. Every new cue alters the perceived importance of those surrounding it. Audio that initially appears informative can rapidly become repetitive, distracting or simply exhausting when heard hundreds of times during repeated play. Casual games make this question particularly important. Players often return to them repeatedly in relatively short sessions. Sounds that seem satisfying during the first few minutes may become irritating after dozens of repetitions. Troutet therefore described restraint as an active design decision rather than the absence of creativity. Silence is not an empty space waiting to be filled. It forms part of the overall balance of the soundtrack. Choosing not to add a sound may ultimately improve clarity far more than creating another effect.

    These same principles become particularly apparent in interface design, where audio functions less as decoration than as communication. Troutet encouraged students to think of interface sounds as messages directed towards the player rather than ornamental additions to menus and buttons. A confirmation tone, warning signal or navigation sound should communicate its purpose immediately, allowing players to understand what has happened without continually consulting the screen. One particularly memorable suggestion involved imagining the interface without any graphics at all. If a player were blindfolded and heard only the sounds, could they still distinguish success from failure, confirmation from cancellation, or navigation from selection? If the answer is yes, then the sounds are performing a genuine communicative role. If not, making them louder or more elaborate is unlikely to solve the underlying problem. Rather than treating interface sounds as decorative clicks or beeps, Troutet encouraged students to think of them almost as a spoken language. Every sound should communicate intention. Players should recognise whether an action has succeeded, failed or requires further input without consciously analysing what they have heard. Well-designed interface audio reduces cognitive effort. The player understands first and reflects afterwards. In this sense, interface sounds become part of the conversation between the game and the player rather than simply another layer of feedback.

    The same philosophy shaped Troutet’s approach to creating collections of related sounds. Rather than treating every effect as an independent recording selected from unrelated libraries, he described building what he called families of sounds. Interface elements, gameplay feedback and recurring actions share common characteristics, creating a recognisable sonic vocabulary throughout the game. Individual sounds may differ substantially in pitch, duration or function, though they continue to feel as though they belong together. Players may never consciously analyse these relationships, yet they often perceive the overall soundtrack as more coherent and easier to understand. Creating these relationships frequently meant recording original material rather than relying exclusively upon commercial sound libraries. Library recordings remain valuable resources, though bespoke recordings provide greater flexibility when developing a consistent sonic identity. Variations can be created from common source material, preserving subtle similarities that would be difficult to achieve using unrelated recordings gathered from multiple collections. The objective is not originality for its own sake. It is to ensure that every sound contributes towards a coherent listening experience rather than drawing attention to itself as an isolated event.

    Troutet consistently returned to the relationship between player experience and design judgement. Recording equipment, software and implementation techniques remained important, though they were never presented as ends in themselves. Every technical decision ultimately served the same objective: helping players understand, navigate and enjoy the game. Sound design therefore became an exercise in editorial judgement rather than accumulation. The important question was no longer how another sound might be added, but whether that moment genuinely deserved sound at all. Once that decision becomes the starting point, implementation, iteration and refinement begin to look rather different, and these topics formed the focus of the remainder of the lecture.

    Implementation forms the natural continuation of Troutet’s argument. Once the decision has been made that a sound genuinely deserves to exist, another set of questions immediately follows. When should it play? Under what conditions should it remain silent? How should it respond when players behave in unexpected ways? Troutet encouraged students to recognise that creating an individual sound is only one stage of the design process. A carefully recorded asset can still fail if it appears at the wrong moment, masks more important information or becomes repetitive through excessive triggering. Implementation therefore becomes an extension of sound design rather than a separate technical activity. Decisions about timing, variation and behaviour shape the player’s experience just as profoundly as the recordings themselves. Very few sounds remain unchanged after their first implementation. Once assets begin interacting with graphics, gameplay and player behaviour, weaknesses quickly become apparent. Sounds that worked well in isolation may feel intrusive within the finished game. Others disappear beneath music or gameplay effects, while some simply occur too frequently. Rather than treating these discoveries as failures, Troutet presented them as an expected part of development. Every implementation reveals more about how players actually experience the game, allowing successive revisions to refine the soundtrack until it supports interaction naturally.
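
    The questions of when a sound should play and when it should remain silent can be illustrated with a short sketch. The Python example below is not taken from the lecture and does not represent Troutet's own tools. It is a minimal illustration under stated assumptions: the class name SoundCue, the cooldown value and the file names are all hypothetical. It shows two common ways of reducing repetition, namely suppressing a cue that fires again too quickly and avoiding the same variation twice in a row.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SoundCue:
    """Hypothetical cue: a set of variations plus simple trigger rules."""
    variations: List[str]                 # placeholder file names
    cooldown_s: float = 0.25              # assumed minimum gap between plays
    rng: random.Random = field(default_factory=random.Random)
    _last_time: Optional[float] = None    # time of the last sound actually played
    _last_choice: Optional[str] = None    # variation chosen on that occasion

    def trigger(self, now_s: float) -> Optional[str]:
        """Return the variation to play at now_s, or None to stay silent."""
        if not self.variations:
            return None
        # Stay silent if the cue played too recently (excessive triggering).
        if self._last_time is not None and now_s - self._last_time < self.cooldown_s:
            return None
        # Avoid repeating the previous variation when an alternative exists.
        candidates = [v for v in self.variations if v != self._last_choice]
        choice = self.rng.choice(candidates or self.variations)
        self._last_time = now_s
        self._last_choice = choice
        return choice


# Usage: four rapid button presses; the second falls inside the cooldown.
cue = SoundCue(["tap_a.wav", "tap_b.wav", "tap_c.wav"])
for press_time in (0.0, 0.1, 0.3, 0.6):
    print(press_time, cue.trigger(press_time))
```

    Rules of this kind are only a starting point. As the lecture makes clear, the appropriate gap, the number of variations and even whether the cue should exist at all are judgements that emerge from hearing the sound inside the finished game.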

    This willingness to revise also requires a particular creative mindset. Troutet observed that sound designers often invest considerable effort in creating individual recordings, making it tempting to defend them once they have been completed. Professional practice frequently demands the opposite approach. If a sound distracts players, interrupts the pacing of the game or simply fails to communicate effectively, attachment to the recording itself becomes irrelevant. During the lecture he summarised this philosophy with a familiar expression from creative practice: kill your babies. The phrase may sound severe, though the principle behind it is straightforward. The success of the overall experience matters more than preserving individual ideas. Removing or replacing a favourite sound is sometimes the decision that allows the remainder of the soundtrack to function more effectively. The willingness to edit critically therefore becomes every bit as important as the ability to create new material.

    The same philosophy extends beyond individual recordings into collaboration with the wider development team. Troutet repeatedly emphasised that sound design does not develop independently from programming, art or game design. Audio practitioners inherit decisions made elsewhere while simultaneously influencing the work of others. Effective collaboration therefore depends upon communicating design decisions in terms of the player’s experience rather than purely technical language. Requests for additional implementation features, changes to interface behaviour or modifications to gameplay become far easier to justify when they are framed around what players will understand, notice or enjoy. Communication, in this sense, becomes another aspect of sound design rather than an administrative task surrounding it. Professional organisation supports that collaboration in equally practical ways. Clear file names, consistent project structures and carefully maintained asset libraries rarely receive the same attention as recording or mixing, yet they influence every subsequent stage of production. Projects evolve over months or years, assets require continual revision and other members of the team must be able to locate the correct material quickly. Well-organised sessions reduce confusion, simplify implementation and ultimately create more opportunities for genuinely creative work.

    Troutet also cautioned against becoming overly attached to particular software, plug-ins or recording equipment. Digital audio workstations continue to evolve, new tools appear regularly and production techniques inevitably change across a career. These developments undoubtedly influence professional practice, though they remain only means of achieving a larger objective. The more important questions concern what the player should hear, what information deserves emphasis and how audio contributes to the overall experience of the game. The same perspective shaped his comments on sources of inspiration. Commercial sound libraries, films and existing games all provide valuable references, though they should never replace careful design thinking. A distinctive soundtrack emerges through the relationships between sounds, the pacing of interaction and a clear understanding of the audience rather than through the novelty of any individual recording. Troutet consistently returned to the idea that sound design is fundamentally a process of making informed decisions rather than collecting techniques.

    Troutet repeatedly argued that sound should guide interaction rather than compete with it. Audio may reward success, reinforce important actions or draw attention towards changing events, though it should rarely distract players from the activity itself. This philosophy connects directly to the earlier discussions of restraint, interface communication and coherent families of sounds. Every element of the soundtrack exists to support understanding. Once a sound begins attracting attention to itself rather than to the player’s experience, its purpose deserves to be questioned. The measure of successful sound design is therefore not how much audio has been added to a game, but whether every element continues to justify its presence through the experience it creates for the player.

    The lecture concluded by returning, implicitly, to the same deceptively simple question that had shaped the discussion from the beginning. How much sound does a game really need? Troutet offered no universal formula. Different genres, audiences and platforms inevitably require different solutions. Instead, he encouraged students to replace assumptions with judgement. Does this sound communicate something important? Does it improve the player’s understanding? Does it strengthen the overall experience? If the answer is no, then adding more audio is unlikely to solve the problem. Careful omission often represents a stronger design decision than continual addition. Across examples ranging from airline entertainment systems to interface design, implementation and professional collaboration, Troutet consistently presented sound design as an exercise in thoughtful selection. The defining characteristic is judgement. Choosing which sounds deserve to exist, how they relate to one another and when they should remain silent requires an understanding of perception, interaction and communication that extends far beyond recording individual effects. Successful sound design is therefore measured not by the quantity of sounds within a project, but by how effectively those sounds help players understand, navigate and enjoy the worlds they inhabit.

  • How Do You Design Great Sound for Terrible Speakers? Tracy Bush on Creative Constraints, Game Audio, and Designing for the Real World

    Tracy Bush

    How do you design great sound for terrible speakers?

    Modern games present players with remarkably convincing sonic worlds. Dialogue responds naturally to changing situations, environments feel alive with movement and atmosphere, interfaces communicate information almost instinctively, and music adapts to the pace of play. Looking at contemporary productions, it is easy to imagine that these achievements are primarily the result of increasingly powerful technology. During his online guest lecture for Edinburgh Napier University, Tracy Bush suggested something rather different. Drawing upon a career that has included Blizzard Entertainment, Sony Online Entertainment, NCSoft and Sphero, he described how some of the most effective sound design emerges when technology imposes severe limitations. Small memories, limited processors, unpredictable playback systems and tiny loudspeakers do not simply restrict creativity. They force designers to think more carefully about what listeners genuinely need to hear.

    Bush’s own career reflected the rapid evolution of the games industry itself. Music had always formed an important part of his life, though his professional background began in information technology rather than audio. While working during the day, he spent evenings performing as a pianist in bars around San Francisco. After relocating to southern California, he joined Blizzard Entertainment in an IT role. His musical interests gradually became known throughout the company, leading colleagues to involve him in audio work whenever opportunities arose. Rather than following a carefully planned route into game sound, his career developed through a willingness to solve unfamiliar problems wherever they appeared. Looking back, Bush suggested that many people entered the industry in much the same way. Studios were small, responsibilities overlapped, and individuals frequently discovered new specialisms simply by becoming the person willing to tackle the next challenge.

    The games industry of the late 1990s differed substantially from the one students encounter today. Development teams were comparatively small, production pipelines remained fluid and many working practices were still evolving. Audio departments often worked alongside programmers, artists and designers in highly collaborative environments where formal boundaries between disciplines were less rigid than they later became. Bush described an atmosphere in which experimentation emerged naturally from everyday work. New hardware appeared rapidly, production tools changed continuously and every project seemed to introduce another set of technical problems that required fresh solutions. Experience remained valuable, though it rarely eliminated uncertainty.

    The computers on which players experienced those games introduced another level of unpredictability. Audio hardware varied enormously between systems, making consistent playback almost impossible to guarantee. Different sound cards reproduced music in noticeably different ways, while MIDI playback depended heavily upon whichever synthesis hardware happened to be installed inside an individual computer. A carefully balanced piece of music created inside the studio might sound dramatically different once it reached somebody else’s machine. Sound designers could control what left the studio. They could not control how it would ultimately be heard.

    This uncertainty extended well beyond music. Dialogue, sound effects and ambience all passed through hardware whose behaviour remained largely outside the control of the development team. Rather than designing for one predictable playback system, audio professionals found themselves designing for thousands of possible listening environments. Bush described this as one of the defining characteristics of early game audio. The question was rarely how a soundtrack sounded under ideal conditions. Instead, designers learned to ask whether it continued to communicate effectively when reproduced by equipment they had never encountered. The playback system itself became part of the design problem.

    Although contemporary technology has advanced enormously, the underlying challenge remains surprisingly familiar. Players now experience games through televisions, headphones, laptops, handheld consoles, mobile phones and increasingly varied listening environments, each introducing its own acoustic character. Perfect consistency remains elusive. The responsibility of the sound designer therefore extends beyond producing interesting sounds. It includes anticipating how those sounds will survive the journey from the studio to the listener.

    Bush also reflected upon the rapid transformation of production tools during this period. Early editing systems offered comparatively limited support for assembling large projects, requiring significant manual organisation and making complex revisions both time-consuming and potentially destructive. The arrival of Pro Tools transformed those workflows, allowing audio teams to edit non-destructively, manage increasingly complex sessions and collaborate more effectively. At much the same time, improvements in virtual sampling gave composers access to increasingly expressive orchestral sounds without requiring every revision to involve live performers. These developments expanded what small audio teams could realistically achieve while allowing creative ideas to evolve throughout production rather than becoming fixed at an early stage.

    The tools available to sound designers evolved just as quickly. Bush described middleware as another important step in that development. As implementation systems became more sophisticated, audio teams gradually assumed greater responsibility for how sounds behaved inside games rather than simply supplying recordings for programmers to trigger. Interactive playback, transitions and behavioural logic increasingly became part of the sound designer’s creative role. Technology expanded the possibilities available to audio departments, though it also broadened their responsibilities. Understanding implementation became almost as important as creating the sounds themselves.

    One observation from Bush’s time at Blizzard challenged another common assumption about technological progress. Greater technical capability did not necessarily encourage increasingly elaborate soundtracks. He reflected upon how musical direction gradually changed across successive projects, with later productions often favouring greater restraint rather than greater complexity. Earlier scores frequently relied upon dense orchestral textures intended to create scale and spectacle. Later work often achieved stronger dramatic results through simpler arrangements that allowed individual musical ideas greater space to breathe. Rather than filling every available moment with sound, composers became increasingly selective about where music should lead the player’s attention and where silence or restraint could prove more effective.

    The same principle appeared throughout sound design more generally. Memory budgets restricted how many sounds could be stored. Processor limitations reduced the number that could play simultaneously. Dialogue budgets limited the amount of recorded speech available to designers. Every technical restriction demanded choices. Which sounds genuinely communicated useful information? Which could be simplified without affecting the player’s experience? Which details would most influence the way a moment was perceived? Bush’s examples repeatedly suggested that successful sound design depends less upon including everything that is technically possible than upon identifying what is genuinely important for the listener.

    By this stage of the lecture, the discussion had established a way of thinking that extended well beyond the technology of any particular decade. New hardware, new software and new production methods continually alter the practical challenges facing sound designers, yet they rarely change the underlying task. Every project begins with a listener, a playback system and a collection of technical constraints that cannot simply be ignored. The role of the sound designer is to understand those conditions and create the most convincing experience possible within them.

    The relationship between creativity and constraint became considerably more tangible during Bush’s work with Sphero, where many of the assumptions underlying conventional game audio no longer applied. Working on licensed products featuring characters such as R2-D2, BB-8 and Lightning McQueen involved far more than transferring familiar techniques onto a different platform. Every sound would eventually emerge from a miniature loudspeaker housed inside a compact plastic enclosure containing motors, batteries, gears and electronic components. The finished product would be heard in kitchens, classrooms, living rooms and gardens rather than through carefully positioned studio monitors or high-quality headphones. Under those conditions, many established production practices simply ceased to be useful. The question was no longer how a sound performed inside the studio. It became how that sound survived once it reached the device for which it had actually been designed.

    Bush described changing his workflow to reflect that reality. Rather than completing the sound design and then testing it on the finished hardware, he monitored much of his work directly through the loudspeaker installed inside the product itself. Equalisation, dynamics, tonal balance and overall character were judged using exactly the same hardware that customers would eventually hear. The acoustic behaviour of the enclosure, the resonances introduced by the plastic casing and even the mechanical sounds generated by the internal motors became part of the design process. Instead of treating these characteristics as defects to be corrected afterwards, they became factors that shaped creative decisions from the beginning.

    The approach illustrates an important principle that extends well beyond embedded devices. Playback systems are never neutral. Every loudspeaker, pair of headphones, television or mobile phone colours the material passing through it. Sound designers often devote considerable attention to recording, editing and mixing, though the listening environment ultimately contributes just as much to the audience’s experience. Bush repeatedly returned to the importance of understanding where sounds will actually be heard. A design that performs beautifully on large studio monitors may communicate surprisingly little through the hardware used by most listeners. Successful sound design therefore depends not only upon creating interesting sounds, but also upon understanding the conditions under which those sounds will be experienced.

    Tiny loudspeakers presented another unavoidable challenge. Their physical dimensions simply prevented them from reproducing deep bass with any real authority. Attempting to force low frequencies through such hardware produced distortion long before it created convincing weight. Rather than attempting to overcome those physical limitations directly, Bush exploited the way listeners perceive sound. By introducing carefully controlled upper harmonics, he encouraged the auditory system to infer the presence of frequencies that the loudspeaker itself could not reproduce. The hardware remained unchanged, though the listening experience became noticeably richer.

    The solution depended upon psychoacoustics rather than brute force. Human hearing does not operate as a simple measuring device. Listeners continually reconstruct incomplete information, using harmonic relationships, timing cues and previous experience to build coherent auditory impressions. Bush’s work demonstrated how understanding those perceptual processes can prove more valuable than pursuing technically impossible specifications. The objective was never to reproduce frequencies that the loudspeaker could not generate. It was to create a convincing impression of fullness using the resources that remained available. Throughout the lecture, this distinction emerged repeatedly. Good sound design often depends less upon reproducing reality perfectly than upon understanding how listeners interpret what they hear.
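
    The effect described above is commonly known as the missing fundamental. The lecture did not detail Bush’s actual processing, so the following is only a minimal sketch of the idea, with an assumed 80 Hz fundamental, a hypothetical 150 Hz loudspeaker limit and illustrative function names. It builds a tone from the harmonics the loudspeaker can reproduce and leaves out the fundamental itself.

```python
import math

SAMPLE_RATE = 16_000      # Hz, assumed for illustration
FUNDAMENTAL = 80.0        # Hz, assumed to lie below what the small speaker can reproduce
SPEAKER_CUTOFF = 150.0    # Hz, hypothetical lower limit of the loudspeaker


def audible_partials(fundamental, cutoff, count=6):
    """Return the harmonics of `fundamental` that lie at or above `cutoff`."""
    harmonics = [fundamental * n for n in range(1, count + 1)]
    return [f for f in harmonics if f >= cutoff]


def synthesise(partials, duration=0.5, sample_rate=SAMPLE_RATE):
    """Sum sine waves at the given frequencies, weighting higher partials less."""
    total = int(duration * sample_rate)
    weights = [1.0 / (i + 1) for i in range(len(partials))]
    scale = 1.0 / sum(weights)  # keeps the summed signal within -1..1
    samples = []
    for n in range(total):
        t = n / sample_rate
        value = sum(w * math.sin(2 * math.pi * f * t)
                    for w, f in zip(weights, partials))
        samples.append(value * scale)
    return samples


partials = audible_partials(FUNDAMENTAL, SPEAKER_CUTOFF)
signal = synthesise(partials)

# The surviving partials are spaced by the absent fundamental, which is the
# cue listeners tend to use to infer the low pitch that is not really there.
spacing = partials[1] - partials[0]
```

    In this sketch the signal contains no energy at 80 Hz, yet its partials remain 80 Hz apart. How strong the impression of bass becomes in practice depends on the listener, the level and the material, which is why auditioning on the target hardware matters.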

    Sampling rates introduced another practical compromise. Embedded devices offered only a fraction of the storage and processing power available to contemporary games, requiring careful management of bandwidth and memory. Bush explained that these restrictions became particularly noticeable when working with robotic characters such as R2-D2, whose personality depends upon bright electronic vocalisations occupying the upper regions of the frequency spectrum. Lower sampling rates inevitably reduced the highest frequencies that could be reproduced accurately, making filtering and careful spectral management essential parts of the design process. Concepts that students often encounter as digital audio theory became everyday creative decisions affecting how expressive and recognisable the finished character would become.
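
    The theory behind that compromise is the Nyquist limit: a sampled system can represent frequencies only below half its sampling rate, and anything higher folds back as an alias unless it is filtered out first. The sketch below assumes a 16 kHz device rate and a handful of invented partial frequencies for a bright electronic chirp. None of these figures come from the lecture.

```python
def nyquist_limit(sample_rate_hz):
    """Upper bound, in Hz, on the frequencies a sampling rate can represent."""
    return sample_rate_hz / 2


def alias_frequency(frequency_hz, sample_rate_hz):
    """Frequency at which an unfiltered tone would appear after sampling."""
    folded = frequency_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


def split_partials(partials_hz, sample_rate_hz):
    """Separate partials that fit below the Nyquist limit from those that do not."""
    limit = nyquist_limit(sample_rate_hz)
    kept = [f for f in partials_hz if f < limit]
    lost = [f for f in partials_hz if f >= limit]
    return kept, lost


DEVICE_RATE = 16_000                          # Hz, assumed for illustration
chirp_partials = [2_000, 6_000, 9_000, 12_000]  # Hz, hypothetical values

kept, lost = split_partials(chirp_partials, DEVICE_RATE)
aliases = {f: alias_frequency(f, DEVICE_RATE) for f in lost}
```

    With these assumed numbers the two highest partials cannot be stored. Left unfiltered, they would reappear lower in the spectrum as inharmonic tones, which is why low-pass filtering before reducing the sampling rate forms part of the design work rather than an afterthought.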

    The material supplied by Lucasfilm also revealed how much organisation underpins apparently effortless performances. Bush did not receive complete scenes or finished sequences ready to be inserted into the product. Instead, he worked with an extensive collection of individual R2-D2 vocalisations drawn from the films. These recordings were not simply organised according to pitch or duration. Their emotional character proved considerably more important. Expressions of curiosity, excitement, concern, frustration and amusement were grouped together so that the robot’s responses could reflect changing situations while remaining faithful to the personality audiences already recognised.

    Randomisation played an important role, though not in the simplistic sense of allowing any sound to play at any time. Bush described carefully controlled systems that introduced variation without sacrificing recognisability. Human listeners identify repeated patterns remarkably quickly, yet behaviour that appears completely unpredictable can feel equally artificial. Convincing interactive audio therefore occupies a position between repetition and randomness. Familiar vocalisations return often enough to establish character, while subtle variations prevent those repetitions from becoming mechanical. The objective is not to surprise the listener continually, but to create the impression of a responsive and expressive personality.

    The same balance appears throughout interactive sound design. Footsteps, interface sounds, environmental ambiences and weapon effects all benefit from controlled variation rather than unlimited randomness. Collections of related recordings, small differences in pitch or timing and carefully managed playback logic often produce more convincing results than vast libraries of unrelated sounds. Bush’s examples demonstrated that believable behaviour frequently depends upon the relationships between sounds rather than the number of sounds available.
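
    One common way to achieve this kind of controlled variation is to choose from a small pool of related clips while refusing to repeat the most recent choices, then apply a slight pitch offset. The sketch below is an assumption about how such logic might look rather than a description of Bush’s system. The class name, clip names, history length and pitch range are all hypothetical.

```python
import random


class VariationPlayer:
    """Pick from a pool of related clips, avoiding recently used ones."""

    def __init__(self, clips, pitch_range_semitones=1.0, history=2, seed=None):
        if len(clips) <= history:
            raise ValueError("need more clips than the length of the repeat history")
        self.clips = list(clips)
        self.pitch_range = pitch_range_semitones
        self.history = history
        self.recent = []
        self.rng = random.Random(seed)

    def next_clip(self):
        """Return (clip, playback_rate) for the next playback."""
        candidates = [c for c in self.clips if c not in self.recent]
        clip = self.rng.choice(candidates)
        self.recent.append(clip)
        if len(self.recent) > self.history:
            self.recent.pop(0)
        # A small random pitch offset, expressed as a playback-rate multiplier.
        semitones = self.rng.uniform(-self.pitch_range, self.pitch_range)
        return clip, 2 ** (semitones / 12)


# Hypothetical clip names grouped by emotional character.
banks = {
    "curious": VariationPlayer(
        ["curious_01", "curious_02", "curious_03", "curious_04"], seed=7),
    "excited": VariationPlayer(
        ["excited_01", "excited_02", "excited_03"], seed=7),
}
clip, rate = banks["curious"].next_clip()
```

    Keeping the pools small and the pitch range narrow is deliberate in this sketch: the same few recordings return often enough to stay recognisable, while the repeat history and pitch offset stop consecutive playbacks from sounding identical.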

    As the lecture broadened beyond embedded devices, Bush argued that creating individual sounds represents only one part of a modern sound designer’s role. Interactive media introduces challenges that simply do not exist in linear forms such as film or television. A film editor knows exactly when every line of dialogue will be heard and how every scene will unfold. Games surrender much of that control to the player. Conversations may begin unexpectedly, be interrupted, or never occur at all. One player may spend hours exploring an environment while another moves through it in minutes. The soundtrack therefore cannot be constructed as a fixed sequence of events. It has to respond continuously to changing circumstances.

    Middleware transformed this aspect of production. Earlier generations of game development relied heavily upon programmers to implement even relatively modest audio behaviour. As middleware matured, sound designers gained much greater control over how sounds responded to events within the game itself. Playback logic, transitions, priorities and interactive behaviours increasingly became part of the sound designer’s creative responsibility. Recording remained an important part of the job, though implementation became equally significant. Designing how sounds behave proved just as important as designing the sounds themselves.

    This shift also changed the relationship between audio departments and the wider development team. Bush repeatedly emphasised that sound design does not exist in isolation. Programmers determine what information becomes available. Designers establish the systems that govern player behaviour. Writers shape dialogue, animators influence timing and movement, while artists define the visual environments within which sounds operate. Audio departments respond to all of these decisions while contributing their own expertise in return. Successful interactive soundtracks emerge through continual collaboration rather than from any single discipline working independently.

    One discussion during the lecture addressed the way sound professionals are perceived within development teams. Bush reflected on labels such as “the sound guy” or “the noise boy”, expressions that dramatically underestimate the breadth of contemporary audio practice. Modern sound designers contribute far more than individual sound effects. They solve technical problems, shape interactive behaviour, collaborate across disciplines and influence how players ultimately experience the game. Titles such as Audio Director acknowledge that broader creative and technical responsibility.

    Questions from students later turned towards virtual reality, where many of these relationships become even more apparent. Convincing virtual environments depend upon much more than visual realism. Sound provides continuous information about distance, movement, scale and spatial relationships, allowing users to build coherent mental models of spaces extending beyond their immediate field of view. Carefully designed spatial audio therefore contributes directly to presence, orientation and immersion rather than acting as a decorative addition to the visual experience.

    Across subjects as varied as desktop games, embedded devices, robotic toys and virtual reality, Bush repeatedly returned to the same way of thinking. Every project began with an understanding of the available technology, the listening conditions and the perceptual abilities of the audience. The hardware changed dramatically throughout his career, though the questions facing the sound designer remained remarkably consistent. Rather than asking how to exploit every available technical capability, Bush continually asked what listeners actually needed to hear and how the available technology could communicate that experience most effectively.

    Across projects as different as Blizzard’s games, Sphero’s robotic products and emerging virtual reality systems, Bush consistently returned to the same set of design questions. Technology continued to change throughout his career, introducing new platforms, workflows and constraints, yet the underlying task remained remarkably stable. Successful sound design depended upon understanding how people listen, how technology behaves and how creative decisions bridge the gap between the two. Whether working with a full orchestral score, an interactive dialogue system or a miniature loudspeaker inside a robotic toy, the objective was never simply to produce impressive sounds. It was to create listening experiences that remained convincing under the conditions in which they would actually be heard.

  • How Does a Film Speak Every Language? George Mikrogiannakis on Film Localisation, Dubbing, and International Sound Production

    George Mikrogiannakis

    How does a film speak every language?

    Most audiences rarely stop to consider the question. A film appears in a cinema or on a streaming service, characters speak naturally in the local language, performances feel convincing, and the soundtrack appears entirely coherent. Nothing suggests that thousands of individual decisions, spread across months of work and involving specialists in numerous countries, have contributed to what appears to be a seamless experience. During his online guest lecture for Edinburgh Napier University, George Mikrogiannakis drew back the curtain on that process. With many years of experience supervising international localisation for Walt Disney Studios, DreamWorks Animation, and other major productions, he revealed that dubbing represents only one small part of a much larger undertaking. Localisation combines translation, performance, dialogue editing, sound design, recording, mixing, quality control, and project management into a production process whose success depends upon remaining almost completely invisible.

    Before discussing localisation itself, Mikrogiannakis addressed a question that many sound design students might reasonably ask. Why should someone interested in sound effects, Foley, ambience, or mixing concern themselves with dubbing? His answer challenged the assumption that localisation belongs solely to translators or dialogue editors. Film sound does not end when the final mix has been approved. Modern productions are expected to travel internationally, and that expectation influences decisions made throughout post-production. Deliverables, session organisation, music and effects mixes, dialogue editing, documentation, and recording practice all determine whether a soundtrack can later be adapted successfully into dozens of different languages. Understanding localisation therefore provides a wider understanding of professional sound production itself.

    The distinction between dubbing and localisation formed the starting point for the discussion. Dubbing describes the replacement of spoken dialogue with performances recorded in another language. Localisation encompasses everything required to ensure that a film functions naturally within another culture while preserving the creative intentions of the original production. Dialogue must communicate the same dramatic ideas, fit the visible movements of actors’ mouths, respect timing, preserve emotional performances, and integrate seamlessly into the original soundtrack. A successful localisation should never feel like a compromise. Audiences should simply experience the film as though it had always belonged in their own language.

    Commercial realities make this work indispensable. For many major studio productions, international audiences account for the majority of ticket sales. A film that performs well domestically may still depend upon worldwide distribution for its overall commercial success. Localisation therefore becomes an essential stage of production rather than an optional addition. Mikrogiannakis illustrated the scale of this work with one striking example. Pirates of the Caribbean eventually required more than six hundred separate versions to satisfy different languages, territories, exhibition formats, airlines, and distribution requirements. Once work reaches this scale, localisation no longer resembles a straightforward translation exercise. It becomes an international production pipeline operating alongside the creation of the original film.

    Maintaining consistency across so many versions requires careful coordination between numerous creative and technical disciplines. Scripts pass from translators to dialogue adaptors, from recording directors to voice actors, from editors to mixers, and through repeated rounds of quality control before final approval. Every participant contributes something different while working towards the same objective. The audience should experience the same characters, the same emotional performances, and the same dramatic pacing regardless of which language they hear.

    Translation itself proved far more creative than many students had expected. Mikrogiannakis explained that translators receive extensive supporting documentation describing characters, situations, cultural references, jokes, and dramatic context. Their task is not to reproduce individual words as literally as possible. Instead, they seek to preserve the intention behind the dialogue. Humour frequently illustrates this challenge. A joke that depends upon an English idiom or a cultural reference may simply fail when translated directly. Rather than forcing audiences to decode unfamiliar expressions, adaptors reconstruct the underlying comic idea so that viewers in another country experience a similar moment of humour, even if the dialogue itself changes substantially.

    Lip synchronisation introduces further complications. Different languages occupy different amounts of time. A short English sentence may require considerably more syllables elsewhere, while other languages express the same meaning much more concisely. Dialogue therefore undergoes continual adjustment until it satisfies several competing requirements simultaneously. It must sound natural, preserve the original dramatic meaning, fit within the available time, and remain synchronised with the visible movements of the actor’s mouth. Accuracy alone is never enough. Rhythm, emphasis, breathing, pacing, and performance all contribute to whether audiences believe what they are watching.

    The recording process reflects the same attention to detail. Unlike dramatic productions in which actors frequently perform together, dubbing sessions normally record performers individually, allowing every voice to remain completely controllable throughout the final mix. Consistency becomes one of the engineer’s principal responsibilities. Mikrogiannakis emphasised the importance of using the same recording environment, microphone, and acoustic conditions throughout an entire production. Local studios are generally expected to deliver clean recordings with minimal processing. Equalisation, dynamics processing, reverberation, and other creative treatments remain the responsibility of the originating production rather than the individual dubbing facility. Every language version therefore begins from comparable source material before being shaped into the finished soundtrack.

    One particularly memorable example demonstrated just how carefully these productions preserve even the smallest creative details. During the localisation of How to Train Your Dragon, one character briefly speaks while wearing a leather mask. Rather than leaving each territory to interpret the scene independently, the production required two separate recordings of every affected line. One version was performed normally. The second was recorded with an obstruction placed in front of the actor’s mouth to recreate the acoustic effect of speaking through the mask. Such requests may appear unusually specific, though they illustrate a broader principle running throughout the lecture. Localisation seeks to reproduce the experience of the original production as faithfully as possible, even when that requires remarkably detailed technical preparation.

    For sound design students, the most revealing discussion centred upon the music and effects mix, more commonly known as the M&E. At first glance, creating an international version might appear straightforward. Remove the original dialogue, record new voices, and place them into the existing soundtrack. Mikrogiannakis demonstrated why this assumption quickly breaks down. Production dialogue rarely contains voices alone. Clothing movement, footsteps, room reflections, environmental ambience, prop handling, breathing, incidental vocalisations, and countless other sounds often exist within the same recordings. Removing dialogue therefore removes far more than speech.

    Producing a convincing M&E requires many of these elements to be rebuilt separately before localisation can even begin. Foley artists recreate physical actions. Ambience editors restore the acoustic character of locations. Sound editors recover or redesign details that disappear when production dialogue is removed. Every reconstructed element must integrate naturally with the remaining soundtrack so that audiences remain unaware that significant parts of the scene have effectively been recreated. Localisation therefore exposes something that audiences rarely notice. Successful dialogue replacement depends upon the invisible work of many other sound professionals whose contributions make the reconstructed world feel complete.

    The same attention to detail extends into the organisation of production sessions. Mikrogiannakis explained that major studios prescribe how dialogue sessions should be structured long before recording begins. Lead characters, supporting roles, incidental dialogue, and background voices occupy predetermined locations within Pro Tools sessions so that material arriving from different countries can be assembled without confusion. Track layouts, naming conventions, file structures, and version numbers follow equally strict standards. These systems may appear administrative rather than creative, though they exist for a practical reason. Hundreds of dialogue files may pass between translators, recording studios, editors, mixers, and quality-control teams before a film reaches cinemas. Small inconsistencies introduced at the beginning of the process can rapidly become expensive problems once productions begin moving between countries.
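
    The lecture did not reproduce any studio’s actual specification, so the following is only a sketch of how a naming convention of this kind might be checked automatically before files change hands. The filename pattern, field names and example files are all invented for illustration. Only the four role categories echo the groupings mentioned above.

```python
import re

# Hypothetical convention: TITLE_language_ROLE_Character_vNN.wav
NAME_PATTERN = re.compile(
    r"^(?P<title>[A-Z0-9]+)_"
    r"(?P<language>[a-z]{2})_"
    r"(?P<role>LEAD|SUPPORT|INCIDENTAL|BACKGROUND)_"
    r"(?P<character>[A-Za-z0-9]+)_"
    r"v(?P<version>\d{2})\.wav$"
)


def parse_name(filename):
    """Return the fields encoded in a filename, or None if it breaks the convention."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        return None
    fields = match.groupdict()
    fields["version"] = int(fields["version"])
    return fields


def find_problems(filenames):
    """List every filename that does not follow the convention."""
    return [name for name in filenames if parse_name(name) is None]


delivery = [
    "FILM01_el_LEAD_Hero_v03.wav",
    "FILM01_el_BACKGROUND_Crowd_v01.wav",
    "film01-greek-hero-final.wav",      # breaks the convention
]
problems = find_problems(delivery)
```

    A check like this takes seconds to run at the point of delivery, whereas a misnamed file discovered after it has passed through several facilities can take far longer to trace.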

    For students accustomed to working alone, this offers an interesting perspective on professional practice. Large productions depend upon predictability as much as originality. Other members of the production team must be able to identify recordings immediately, locate the correct version of every file, and understand how sessions have been organised without needing lengthy explanations. Good organisation does not restrict creativity. It allows creativity to survive within projects involving hundreds of contributors working across multiple continents.

    The recording sessions themselves reflect similar priorities. Actors rarely record together, even when their characters share a conversation. Instead, every performance is captured independently under carefully controlled acoustic conditions. Recording engineers seek consistency above all else, maintaining the same microphones, recording chains, and studio environments wherever possible. Performances can then be balanced, edited, and integrated into the soundtrack with considerably greater precision than would otherwise be possible. The objective is not simply to record dialogue. It is to provide material that remains flexible throughout every subsequent stage of post-production.

    Security introduces another layer of complexity. Long before a film reaches cinemas, localisation teams may already be working on dialogue in numerous languages. Scripts, images, and recordings therefore become highly confidential. Mikrogiannakis described productions protected through extensive non-disclosure agreements, secure online workflows, watermarked media, and carefully controlled distribution systems. In particularly sensitive cases, even the picture supplied to dubbing studios may reveal only a small area surrounding a character’s mouth while the remainder of the image remains concealed. The performers receive enough visual information to synchronise their dialogue without exposing details of the story before release.

    These precautions reveal another aspect of contemporary sound production that audiences rarely encounter. Localisation frequently begins while visual effects continue to evolve, editorial changes remain possible, and marketing campaigns have yet to reveal significant elements of the film. Sound departments therefore work within productions that remain in constant development. Flexibility becomes as valuable as technical expertise. Dialogue may require revision, scenes may be shortened, and editorial decisions may continue long after recording has begun. Every change must then be reflected consistently across every language version.

    Once recording has finished, another phase begins. Every performance is reviewed against the original production to evaluate synchronisation, pronunciation, dramatic intention, technical quality, and consistency with previously approved material. Recordings that satisfy one requirement may still require revision for another. A technically perfect recording may not match the emotional intensity of the original actor. A convincing performance may reveal a slight synchronisation problem. A translation may preserve meaning while sounding unnatural when spoken aloud. Each stage of review narrows these differences until the finished soundtrack supports the same dramatic experience as the original production.

    The process depends upon specialists whose expertise is complementary rather than interchangeable. Translators evaluate language. Dialogue directors shape performances. Recording engineers concentrate on technical quality. Editors refine timing and synchronisation. Mixers integrate new dialogue into the existing soundtrack. Supervisors compare each completed version with the original production before granting approval. None of these roles can replace another. The finished film emerges through collaboration between people whose responsibilities remain distinct while contributing towards a shared creative objective.

    One aspect of the discussion resonated particularly strongly for sound design students. Many university projects naturally emphasise creating interesting sounds. Professional productions require that creativity to coexist with organisation, documentation, planning, and consistency. A beautifully designed soundtrack that cannot be delivered reliably to another department quickly becomes difficult to maintain. Localisation demonstrates this reality with unusual clarity. Every recording created during production may later support dozens of additional versions distributed across the world. Decisions made while organising sessions, preparing stems, documenting edits, or recording apparently insignificant details may continue influencing the production years after the original mix has been completed.

    The relationship between creativity and organisation also changes the way professional sound departments approach collaboration. Rather than treating editing, Foley, dialogue, sound effects, ambience, and mixing as isolated activities, localisation reveals how closely each depends upon the others. Replacing dialogue successfully requires carefully prepared music and effects mixes. Those mixes depend upon dialogue editors separating production material accurately. Dialogue editors depend upon clean recordings, consistent session management, and comprehensive documentation. Every department inherits decisions made by the departments before it. Strong workflows therefore support creative outcomes rather than competing with them.

    Audiences rarely recognise any of this work, and perhaps they should not. Successful localisation draws attention towards the story rather than the production process. Viewers become absorbed in performances, relationships, humour, and dramatic tension without considering how many different versions of the soundtrack exist or how many specialists contributed to the one they happen to hear. The technical achievement lies precisely in making reconstruction disappear.

    For sound design students, localisation offers an unusually clear picture of contemporary professional practice. It demonstrates that sound production extends well beyond recording and mixing. Projects continue to evolve after the original soundtrack has been completed, passing through new languages, cultures, technologies, and distribution platforms while preserving a coherent creative identity. Every carefully organised session, every clean recording, every reconstructed ambience, and every accurately prepared deliverable helps make that possible.

    A film may begin life in a single language, though its soundtrack is often expected to communicate with audiences across much of the world. Making that transition successfully depends upon considerably more than translation. It depends upon planning, technical precision, collaboration, and a shared commitment to preserving the creative intentions embedded within the original production. The better those foundations have been established, the more naturally the film speaks to audiences, regardless of which language they hear.

  • What Happens When We Listen to a Place? Barry Truax on Soundscapes, Soundmarks, and Acoustic Ecology

    Professor Barry Truax

    What happens when we listen to a place?

    At first glance, the question appears surprisingly simple. Places are full of sounds. Traffic passes. Birds call. Church bells ring. Doors close. Voices drift across streets and public squares. Yet during his online guest lecture for Edinburgh Napier University, composer, researcher, and acoustic ecologist Professor Barry Truax suggested that listening to a place involves far more than cataloguing the sounds it contains. Throughout a wide-ranging discussion of soundscapes, field recording, acoustic communities, oral history, environmental awareness, and soundscape composition, he repeatedly returned to a central idea. Sound is never simply physical. It is social, cultural, historical, and environmental. To listen carefully to a place is therefore to learn something about the people who inhabit it, the history that shaped it, and the relationships that continue to define it.

    Truax’s own involvement with these questions stretches back more than fifty years. Arriving at Simon Fraser University in 1973, he joined the World Soundscape Project, a pioneering research group founded by the Canadian composer R. Murray Schafer. The project emerged during a period of growing environmental awareness. Yet whereas many environmental discussions focused on landscapes, pollution, or conservation, Schafer and his colleagues became interested in the acoustic dimension of everyday life. Their concern was not simply with noise. They wanted to understand the sonic environments people inhabited and the ways those environments influenced perception, culture, memory, and community.

    The term soundscape became central to this work. Although the word had appeared occasionally before Schafer popularised it, the World Soundscape Project gave it a more systematic meaning. A soundscape was not merely a collection of sounds. Nor was it simply an acoustic environment that could be measured scientifically. What mattered equally was how those sounds were perceived and understood by people living within that environment. The same physical sound might be experienced very differently depending upon context, culture, history, or personal association. Listening therefore became a study not only of acoustics, but also of human experience.

    Vancouver provided the project’s first major laboratory. During the 1970s, members of the World Soundscape Project recorded extensively throughout the city, documenting harbour sounds, trains, ferries, bells, industrial activity, public spaces, and everyday life. At one level, the work resembled a large-scale field recording project. At another, it represented an attempt to understand how a city expressed itself acoustically. Recording became a form of investigation. What sounds defined Vancouver? Which sounds carried social meaning? Which sounds connected residents to their history? Which sounds were disappearing?

    Several examples discussed during the lecture illustrated how these questions often led in unexpected directions. Vancouver’s harbour horns, train whistles, church bells, and the distinctive Canada Horn all emerged as sounds that many residents recognised immediately. Such sounds were not important solely because they were loud or distinctive. They mattered because they connected people to place. Schafer introduced the term soundmark to describe sounds possessing particular cultural significance within a community. The concept deliberately echoed the idea of a landmark. Just as certain buildings, monuments, or geographical features help define a place visually, particular sounds may help define it acoustically.

    The Canada Horn provided an especially interesting example. Installed as part of Canada’s centennial celebrations in 1967, it sounds the opening notes of the national anthem each day. Functionally, it operates like a signal. Symbolically, however, it occupies a rather different role. Many Vancouver residents recognise the sound immediately. It has become woven into everyday life. Listening to it therefore involves more than recognising a horn. It involves recognising a piece of collective identity.

    What fascinated Truax was how such sounds often reveal broader histories. Discussions of harbour horns quickly lead towards transportation networks, migration, industry, and national development. Church bells raise questions about religion, settlement, and changing urban environments. Listening carefully to a city often reveals that sounds function as traces of social and cultural processes that remain largely invisible.

    Many of the examples discussed during the lecture also demonstrated how soundscapes change over time. One recurring theme of the World Soundscape Project involved documenting sounds that were disappearing, being replaced, or acquiring new meanings. Steam whistles gave way to electronic signals. Traditional foghorns were replaced by automated systems. Bell sounds that once travelled across large parts of a city became increasingly difficult to hear amid expanding urban development. Such changes are rarely documented in conventional histories. Buildings receive preservation orders. Photographs enter archives. Yet sounds often disappear without attracting similar attention.

    For Truax, this raises important questions about acoustic heritage. If communities value historic buildings, should they also value historically significant sounds? If a particular sound helps define a place, what happens when it vanishes? These questions do not always produce straightforward answers. Soundscapes are constantly changing. New sounds emerge while others disappear. Yet the discussion highlights an important shift in perspective. Once listening becomes a form of cultural enquiry, everyday sounds acquire a significance that might otherwise be overlooked.

    The lecture repeatedly demonstrated how listening can reveal aspects of history that remain inaccessible through other methods. One approach developed by the World Soundscape Project involved collecting what they called earwitness accounts. Residents described sounds they remembered from earlier periods of their lives. These accounts were not always precise. Memory rarely functions with the accuracy of a recording device. Yet they offered valuable insights into how people experienced changing environments. Through such recollections, researchers gained access not only to lost sounds but also to the meanings attached to them.

    One particularly memorable example came from the Scottish village of Dollar, one of several European communities studied by the project during the 1970s. There the researchers worked closely with David Graham, a former town clerk whose extraordinary memory allowed him to reconstruct entire soundscapes from decades earlier. Standing at locations throughout the village, Graham described railway sounds, station activities, signalling systems, machinery, voices, and routines that had long since disappeared. Listening to him was almost like hearing an acoustic map of the past being reconstructed in real time.

    The significance of these accounts extended beyond nostalgia. Graham was not simply recalling sounds. He was recalling relationships, activities, routines, and forms of social organisation. The sounds mattered partly because they connected people to particular ways of life. Once again, listening became a route towards understanding communities rather than merely documenting acoustics.

    Soundwalks developed as another way of exploring these relationships. Truax described soundwalks as listening walks in which participants move through an environment while paying deliberate attention to its acoustic characteristics. Although deceptively simple, the method encourages a profound shift in awareness. Many sounds that normally fade into the background become newly noticeable. Distances become easier to judge. Acoustic boundaries emerge. Patterns of activity reveal themselves. Places begin to sound different once listening becomes intentional.

    Closely related were memory walks, in which participants revisited locations associated with earlier experiences. Returning people to familiar places often stimulated recollections that might otherwise remain inaccessible. A particular street corner, railway station, church, or public square could trigger detailed memories of sounds, activities, and social interactions. Context helped unlock memory. The environment itself became part of the research method.

    What makes these approaches particularly interesting is that they position listening as an active practice rather than a passive process. Hearing happens continuously. Listening requires attention. Throughout the lecture, Truax repeatedly encouraged students to recognise how much of everyday life passes by acoustically unnoticed. Soundwalks, memory walks, and soundscape research all attempt to interrupt that habit and create opportunities for reflection.

    The final part of the lecture turned towards soundscape composition, a form of creative practice closely associated with Truax’s work. Traditional musical composition often treats sounds as materials that can be organised independently of their original contexts. Soundscape composition adopts a rather different position. Environmental context remains central. The sounds retain their connections to the places, communities, and experiences from which they originate.

    Truax described a continuum of approaches. At one end are relatively direct recordings that document environments with minimal intervention. At the other are heavily transformed compositions in which sounds are processed, stretched, layered, and reconfigured. What distinguishes soundscape composition is not the degree of manipulation but the continuing relationship between the work and its source context. Listeners are encouraged to recognise environmental references and reflect upon their meanings.

    Soundscape composition also reflects a way of thinking about recorded sound that has influenced many areas of contemporary sound design. Environmental recordings are not treated simply as raw material waiting to be transformed beyond recognition. Their origins continue to matter. A harbour horn, a railway station, or a forest path carries associations that listeners may recognise even after considerable processing. Sound designers regularly make similar decisions when balancing realism with interpretation. Recordings can be edited, layered, stretched, or filtered, yet they often retain traces of the places and experiences from which they came. Truax’s work encourages designers to consider not only how a sound functions within a composition, but also what relationships it continues to carry with the world beyond the loudspeaker.

    This emphasis on context creates an interesting contrast with many traditions of Western art music. Rather than treating context as background information, soundscape composition places it at the centre of the creative process. Environmental, social, historical, and psychological associations become part of the material with which the composer works. A harbour horn is never simply a sound. It carries histories of transportation, labour, geography, and identity. A church bell carries different associations. A train whistle carries others. The composer works not only with acoustic properties but also with layers of meaning.

    Truax illustrated this approach through examples drawn from Vancouver. Familiar sounds appeared first in recognisable forms before gradually being transformed through processes such as time stretching. The effect was not simply aesthetic. Transformation encouraged different forms of listening. Sounds that normally function as signals became objects of reflection. Their internal textures emerged. Musical qualities became apparent. At the same time, their connections to place remained intact.
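
    As a rough illustration of what time stretching involves, the sketch below lengthens a signal by overlap-adding short windowed grains. It is a minimal example under stated assumptions, not a description of the process used in the lecture examples. The function names, grain size, and hop size are all hypothetical choices made for illustration.

```python
import math


def hann(n):
    """Hann window of length n; tapers each grain so grain edges do not click."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def time_stretch(samples, factor, grain=1024, hop_out=256):
    """Lengthen (factor > 1) or shorten a signal by overlap-adding windowed grains.

    Grains are read from the input at a different hop than they are written
    to the output, so the duration changes while each grain keeps its pitch.
    """
    hop_in = hop_out / factor
    window = hann(grain)
    n_grains = max(1, int((len(samples) - grain) / hop_in) + 1)
    out = [0.0] * ((n_grains - 1) * hop_out + grain)
    norm = [0.0] * len(out)
    for g in range(n_grains):
        src = int(g * hop_in)   # where this grain starts in the input
        dst = g * hop_out       # where it lands in the output
        for i in range(grain):
            if src + i < len(samples):
                out[dst + i] += samples[src + i] * window[i]
                norm[dst + i] += window[i]
    # Divide by the summed window so overlapping grains do not change loudness.
    return [o / n if n > 1e-9 else 0.0 for o, n in zip(out, norm)]
```

    Calling time_stretch(recording, 2.0) on a list of samples returns a signal close to twice as long. Simple overlap-add of this kind smears transients and can roughen sustained tones, which is part of why stretched environmental sounds reveal textures that are not obvious at normal speed.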

    Underlying the entire lecture was a broader concern with acoustic ecology. Listening, in this context, is not merely a technical skill or an artistic technique. It is a way of understanding relationships between people and environments. Paying attention to sound reveals aspects of culture, history, memory, community, and ecology that often remain hidden. It encourages reflection upon what societies choose to preserve, what they allow to disappear, and how environments continue to shape experience.

    More than fifty years after the World Soundscape Project began, many of the questions raised by Truax and his colleagues remain unresolved. Cities continue to change. Technologies alter how people communicate, travel, and work. Familiar sounds disappear while new ones emerge. Yet sound rarely occupies the same position within public discussions of heritage and preservation as buildings, monuments, or landscapes.

    Throughout the lecture, Truax returned repeatedly to sounds that had vanished, sounds that survived, and sounds that communities continued to recognise as part of their identity. Harbour horns, church bells, railway sounds, industrial signals, and everyday activities all carried meanings that extended beyond their immediate functions. They connected people to places, histories, and shared experiences. Once lost, many could not easily be recovered.

    Soundscape research therefore asks a question that is both simple and surprisingly difficult. What should be remembered acoustically? Photographs preserve appearances. Written records preserve events. Recordings, memories, soundwalks, and earwitness accounts preserve something different. They preserve traces of how places were experienced by the people who lived within them.

    For Truax, listening is valuable partly for this reason. It draws attention towards aspects of culture and environment that often pass unnoticed. A soundscape is not merely what a place sounds like. It is one way of understanding how a place has been lived, remembered, and shared.

  • How Can Sound Become an Interface? Professor Stephen Brewster on Non-Speech Audio, Multimodal Interaction, and Designing Beyond the Screen

    Professor Stephen Brewster

    What happens when looking at a screen is no longer the best option?

    Computing has become increasingly mobile. Phones accompany people through cities, workplaces, public transport systems, shops, festivals, and countless other environments. Yet much interaction design still assumes that users can devote their attention to a display whenever information needs to be communicated. During his online guest lecture for Edinburgh Napier University, Professor Stephen Brewster challenged that assumption. Drawing on decades of research in human-computer interaction, multimodal interfaces, auditory displays, sonification, and mobile computing, he explored a deceptively simple question. What happens when information is communicated through sound rather than vision?

    Brewster began by situating the discussion within a broader problem. Human beings possess multiple senses, though much digital technology continues to privilege vision above all others. Screens dominate contemporary computing. Menus, notifications, progress indicators, maps, messages, and data visualisations typically assume that users are willing and able to look. Yet many situations challenge this assumption. Someone cycling through traffic cannot continuously monitor a display. A pedestrian navigating a crowded city may already be dividing attention between multiple tasks. Bright sunlight can render screens difficult to read. Some users experience visual impairments. Others simply have more pressing demands on their attention than a device in their hand. Rather than treating these situations as exceptions, Brewster suggested they reveal a limitation in conventional interface design. If visual attention is unavailable, how else might information be communicated?

    This question has shaped much of his research. Rather than viewing sound as decoration or enhancement, Brewster approaches it as a communication channel. Sound can operate while users look elsewhere. It can communicate information rapidly. It can support accessibility. It can function alongside vision rather than competing with it. The goal is not to replace screens entirely. Instead, it is to make fuller use of the sensory capabilities people already possess. Multimodal interaction, as Brewster described it, involves designing systems that acknowledge how people actually experience the world rather than assuming that vision should always dominate.

    Mobile devices provided an especially important motivation throughout the lecture. Traditional desktop computing emerged within relatively controlled environments. Users sat at desks, faced screens, and focused primarily on a single task. Mobile computing transformed those assumptions. People now interact with technology while moving through complex environments filled with competing demands upon their attention. A larger display cannot solve every problem. In many situations, the challenge is not the quantity of visual information available. The challenge is finding ways to communicate information without requiring users to look at all. Brewster argued that interaction design should respond to these realities rather than simply shrinking desktop interfaces onto smaller screens.

    Attention emerged as a recurring concern throughout the lecture. Many interface designs implicitly assume that information should compete for attention whenever it becomes available. Notifications flash. Windows appear. Alerts demand immediate responses. Yet everyday life rarely operates in this way. People constantly manage multiple streams of information simultaneously. Conversations continue while walking. Music plays while working. Environmental sounds remain present while attention shifts elsewhere. Brewster’s work asks whether digital systems might learn from these patterns. Rather than repeatedly interrupting users, could information move fluidly between foreground and background depending upon circumstances? Sound appears particularly well suited to this challenge. Unlike visual displays, which generally require direct attention, auditory information can remain available while users focus elsewhere. The question is not simply whether sound can communicate information. It is whether sound can communicate information without constantly demanding attention.

    One reason sound becomes attractive in this context is its efficiency. Speech can communicate detailed information, though it requires time. A spoken message unfolds word by word. Non-speech audio can often communicate information much more rapidly. Brewster compared the relationship between speech and non-speech sound to the relationship between text and icons. A paragraph may describe an object in detail. An icon can often communicate a similar idea almost instantly. Carefully designed sounds can function in much the same way. Rather than reading information aloud, they communicate status, warnings, activity, trends, or relationships through concise auditory cues.

    Much of the lecture explored different approaches to designing these cues. One of the earliest involved earcons, structured auditory messages built from musical elements such as rhythm, pitch, timbre, and tempo. Unlike everyday sounds, earcons are abstract. Their meaning must be learned. Yet this abstraction also provides flexibility. Brewster demonstrated how simple auditory components can be combined to create larger structures capable of communicating increasingly complex information. A particular rhythm might signal an error. Changes in timbre or pitch might identify different categories of error. Much like language, the system develops a vocabulary from smaller building blocks. Users invest effort in learning the code, though once acquired it can support sophisticated communication through relatively simple sounds.
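
    The building-block idea can be sketched in a few lines of code. The example below is an illustration only, assuming a hypothetical Earcon structure in which rhythm identifies a family of messages while pitch and timbre distinguish its members. The parameter values are invented and are not taken from Brewster’s studies.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Earcon:
    """An abstract auditory message described by a few musical parameters."""
    rhythm: tuple           # note durations in beats, e.g. (0.25, 0.25, 0.5)
    pitch_hz: float = 440.0
    timbre: str = "sine"    # a label only; a real system would select an instrument


# Family-level motif: every "error" earcon shares this rhythm (invented values).
ERROR = Earcon(rhythm=(0.25, 0.25, 0.5))

# Subtypes inherit the family rhythm and vary timbre or pitch.
EARCONS = {
    "error": ERROR,
    "error/file": replace(ERROR, timbre="square"),
    "error/network": replace(ERROR, timbre="sawtooth", pitch_hz=330.0),
}


def to_notes(earcon, tempo_bpm=120):
    """Expand an earcon into (frequency_hz, seconds) pairs for a synthesiser."""
    seconds_per_beat = 60.0 / tempo_bpm
    return [(earcon.pitch_hz, beats * seconds_per_beat) for beats in earcon.rhythm]


def same_family(a, b):
    """Two earcons belong to the same family when they share a rhythm."""
    return a.rhythm == b.rhythm
```

    In this sketch a listener who has learned the shared rhythm can recognise any member of the family as an error, then use timbre or pitch to tell which kind. Adding a new error type means varying one parameter rather than designing a new sound from nothing.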

    Auditory icons take a rather different approach. Instead of relying upon abstract structures, they exploit familiar sounds drawn from everyday experience. Brewster discussed William Gaver’s influential SonicFinder project, which mapped computer operations onto recognisable environmental sounds. Selecting a folder might produce the sound of paper. Dragging an object across the desktop might generate a scraping sound. Deleting a file might end with breaking glass. Such sounds often require little training. Their meaning emerges from existing associations. Yet the approach also reveals interesting limitations. Everyday life contains only a finite number of obvious metaphors. As software functions become more specialised, finding intuitive sonic equivalents becomes increasingly difficult. What sound represents copying a file rather than moving it? What sound represents a menu hierarchy? Questions such as these expose the challenges that emerge when designers depend upon metaphor alone.

    A third approach, sonification, shifts attention away from interfaces and towards data. Here, numerical values are mapped onto auditory parameters such as pitch, rhythm, or timbre. Brewster compared the process to visualisation. Graphs provide rapid access to patterns that would be difficult to identify within tables of numbers. Sonification attempts to achieve something similar through listening. By converting data into sound, listeners can often identify trends, anomalies, peaks, and relationships that might otherwise remain hidden. Rather than replacing detailed numerical information, sonification provides an overview. It allows users to perceive the broader shape of a dataset before examining specific values.

    Questions from students helped illuminate this distinction further. One example involved pollen data transformed into sound through changing pitches. The goal was not to communicate precise measurements. Instead, listeners could quickly identify whether levels were increasing, decreasing, or remaining stable. Brewster argued that this reflects the real strength of sonification. A graph rarely succeeds solely through precision. It succeeds by revealing patterns. Sonification can achieve a similar outcome through auditory perception. Numerical detail remains available when required, though sound offers a rapid way of monitoring change over time.
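
    A pitch mapping of this kind can be sketched briefly. The example below is a minimal illustration, assuming a linear mapping from data values onto MIDI note numbers. The function names, the note range, and the pollen readings are all invented for the purpose and do not come from the lecture.

```python
def sonify(values, low_midi=48, high_midi=84):
    """Map each value linearly onto a MIDI note number between low_midi and high_midi."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid dividing by zero when the series is flat
    return [round(low_midi + (v - lo) / span * (high_midi - low_midi)) for v in values]


def midi_to_hz(note):
    """Convert a MIDI note number to a frequency (note 69 is A4 at 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)


def trend(notes):
    """Summarise the overall direction a listener would hear."""
    if notes[-1] > notes[0]:
        return "rising"
    if notes[-1] < notes[0]:
        return "falling"
    return "stable"


pollen_counts = [12, 30, 55, 80, 140]  # invented daily readings, not real data
notes = sonify(pollen_counts)          # [48, 53, 60, 67, 84]
```

    The resulting notes climb steadily, so a listener hears the rise without needing any of the underlying numbers. Mapping onto note numbers rather than directly onto frequency is a design choice: equal steps in the data then sound like roughly equal musical steps.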

    Several studies discussed during the lecture demonstrated how even relatively simple sounds can influence interaction. One experiment examined numerical data entry on mobile devices. Participants entered information using either large visual buttons or substantially smaller alternatives. Predictably, performance declined when the buttons became smaller. Yet when simple auditory feedback was added, performance improved dramatically. Users working with the smaller controls performed almost as well as those using larger buttons. The sounds themselves were uncomplicated. Their value lay not in complexity but in the additional information they provided. By reducing uncertainty during interaction, they made the task easier to perform.

    Another particularly elegant example involved progress indicators. Most software communicates progress visually through bars that gradually fill across a display. Brewster and colleagues explored whether similar information could be represented spatially through sound. As a task progressed, a sound moved around the listener’s head. Position communicated completion. Movement communicated change. Without looking at a screen, users could estimate how far a process had progressed and whether activity had stalled.
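
    The mapping from completion to position can be sketched as follows. This is an illustration under stated assumptions rather than the design used in the study: it assumes progress travels one full clockwise circle starting straight ahead, and the function names and stall rule are hypothetical. The coordinates would be handed to whatever spatial audio renderer is available.

```python
import math


def progress_to_azimuth(fraction):
    """Map task progress (0.0 to 1.0) to degrees clockwise from straight ahead."""
    clamped = max(0.0, min(1.0, fraction))
    return clamped * 360.0


def azimuth_to_position(azimuth_deg, radius=1.0):
    """Convert an azimuth to (x, y): x is to the listener's right, y is in front."""
    angle = math.radians(azimuth_deg)
    return (radius * math.sin(angle), radius * math.cos(angle))


def has_stalled(samples, window=3):
    """Report a stall when the last `window` progress samples are identical."""
    recent = samples[-window:]
    return len(recent) == window and len(set(recent)) == 1
```

    With these assumptions a task that is one quarter complete places the sound directly to the listener’s right, and a sound that stops moving signals that the task has stopped advancing.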

    During the discussion period, students questioned whether such displays might become intrusive. Brewster responded by drawing attention to forms of ambient awareness that already exist within everyday life. People rarely focus continuously on air-conditioning systems, distant traffic, rainfall, or background conversations. Such sounds remain available without demanding constant attention. Auditory displays, he suggested, can function in a similar way. Information remains present when required, fading into the background when it is not. This idea runs through much of his research. Sound is not always most effective when it occupies the foreground. Sometimes its greatest strength lies in supporting awareness without interruption.

    Spatial audio appeared repeatedly throughout the lecture as a particularly rich area for exploration. Rather than treating sound as something emitted from a single speaker, Brewster investigated how information might be organised around listeners in three-dimensional space. Progress indicators could move around the head. Calendar entries could occupy positions corresponding to times of day. Menu items could exist within an auditory environment rather than a visual one. These systems exploit the human ability to localise sound sources, transforming listening into a form of navigation. Information acquires location. Interaction becomes spatial rather than purely symbolic.

    Some of the most imaginative projects discussed during the lecture extended these principles into everyday environments. AudioFeeds transformed social media activity into ambient soundscapes. Twitter, Facebook, news feeds, and other information streams occupied different locations within auditory space, represented through distinct families of sounds. Rather than repeatedly checking a screen, users could maintain a broader awareness of activity through listening. Detailed information remained available when required, though constant checking became unnecessary.

    The significance of AudioFeeds extends beyond social media. The project raises broader questions about how digital information should occupy everyday life. Many contemporary systems assume that awareness requires direct inspection. Brewster’s work suggests alternatives. Awareness may emerge gradually. Information may remain peripheral until circumstances make it relevant. In this respect, auditory displays resemble many naturally occurring environmental sounds. People rarely monitor rainfall continuously, though they remain aware that it is raining. They do not focus constantly on traffic outside a window, though they often notice when conditions change. Sound supports forms of awareness that differ from the all-or-nothing relationship often associated with visual attention.

    Pulse extended these ideas into urban environments. During the Edinburgh Festival, geolocated tweets became spatial audio cues distributed around the city. The project transformed social activity into something that could be heard rather than viewed. Participants were not presented with lists of events ranked by popularity, nor were they required to consult maps repeatedly. Instead, they developed a sense of where activity was occurring through listening.

    One of the most interesting aspects of the project is that it occupied a space between navigation and exploration. Traditional navigation systems attempt to guide users towards predetermined destinations. Pulse encouraged discovery instead. Participants moved towards sounds that suggested activity or aroused their curiosity. Information became something encountered rather than simply retrieved. In doing so, the project demonstrated how auditory displays can support forms of engagement that differ substantially from conventional graphical interfaces.

    The lecture concluded with one of Brewster’s more recent ideas: musicons. Earcons require designers to construct sounds from scratch. Musicons instead draw upon music that listeners already know. Research revealed that people consistently identify particular moments within familiar songs as especially representative. Often these moments involve vocals, choruses, or distinctive melodic features. By extracting such fragments, it becomes possible to create recognisable auditory cues from a user’s existing music collection. The appeal lies partly in familiarity. Rather than learning a completely new auditory language, users rely upon associations they already possess. Recognition emerges from memory rather than training.

    Musicons reveal another recurring theme in Brewster’s work. Successful interfaces rarely begin from technology alone. They begin from existing human abilities. Earcons ask users to learn a new auditory language. Musicons exploit knowledge that listeners already possess. A few notes from a familiar song may be recognised almost instantly. Years of listening experience become part of the interface itself.

    Looking across the different projects discussed during the lecture, it becomes clear that Brewster is addressing a much larger question than how to design better sounds. The deeper issue concerns the relationship between people and technology. Modern computing frequently competes with the surrounding world for attention. Screens draw the eye away from streets, conversations, environments, and other people. Brewster’s work suggests that alternative relationships may be possible.

    Sound occupies a distinctive position within this discussion. It can communicate information while allowing users to continue looking elsewhere. It can support awareness without requiring constant inspection. It can reveal patterns within data, provide feedback during interaction, and create new forms of accessibility. Most importantly, it can coexist with other activities rather than replacing them.

    None of this means that sound should replace vision. Brewster repeatedly emphasised the value of multimodal design rather than sensory competition. Different senses possess different strengths. The challenge for interaction designers is understanding how those strengths can complement one another. Sound becomes most useful not when it attempts to imitate visual displays, but when it contributes capabilities that vision alone cannot easily provide.

    For many people, digital interaction has become almost synonymous with looking at screens. Brewster’s lecture offered a reminder that computing does not need to be confined to vision. Human beings hear, touch, move, and orient themselves within space. Designing for those abilities opens possibilities that extend far beyond the display. In that sense, the lecture was not really about sound alone. It was about recognising the full range of ways people experience the world.