  • How Does a Crowd Find Its Voice? David Monteath on Crowd ADR, Performance, and Creating Believable Worlds

    David Monteath

    How does a crowd find its voice?

    When audiences watch a film or television programme, their attention naturally settles upon the principal actors. Far less notice is taken of the countless background voices that transform a collection of images into a believable social world. Conversations drifting through a restaurant, murmured discussions in an office, distant arguments in a crowded street or the indistinct atmosphere of a busy marketplace all contribute to the impression that life continues beyond the central characters. Remove those voices, and even the most carefully photographed scene can feel strangely artificial. During his online guest lecture for Edinburgh Napier University, David Monteath returned to the University as a Sound Design alumnus to explore the specialised craft of crowd ADR. Drawing upon more than three decades working as an actor and voice artist, he demonstrated that believable crowd performances depend upon observation, improvisation and an understanding of dramatic context rather than simply recording large numbers of voices. One principle underpinned the discussion: context is king.

    Rather than replacing the dialogue of principal actors, crowd ADR creates the sense that an entire world exists beyond them. A small group of performers may become the customers in a restaurant, the spectators at a football match, the passengers waiting on a railway platform or the crowd gathered in a courtroom. Individual conversations overlap, reactions ripple through the group and emotional responses emerge at precisely the right moments, creating the impression that every person visible on screen possesses a life extending beyond the immediate story. Audiences rarely notice these performances consciously, yet they immediately recognise when they are missing. Scenes that lack convincing crowd performances often feel unexpectedly empty, regardless of how carefully they have been photographed or edited.

    Monteath repeatedly challenged the assumption that this work consists simply of creating background noise. Crowd ADR is first and foremost a form of acting. Every performance responds to the circumstances of the scene, the relationships between characters and the emotional atmosphere established by the director. People waiting quietly in a hospital corridor behave differently from supporters leaving a football stadium. Conversations in an expensive restaurant differ from those heard in a busy café, while voices surrounding a royal procession carry a very different energy from those accompanying a political protest. Every reaction, interruption and fragment of conversation exists to support the dramatic reality of the scene rather than to attract attention in its own right. Authenticity emerges from understanding how people genuinely behave in different situations, not from making scenes louder or busier.

    This emphasis upon dramatic context shaped every practical discussion throughout the lecture. Monteath encouraged students to think beyond individual words and instead consider the circumstances in which those words are spoken. Before deciding how loudly to speak, how quickly to react or even what might be said, performers first need to understand where they are, who surrounds them and what is happening within the story. The same phrase may require entirely different delivery depending upon whether it takes place in a library, an airport, a football ground or the middle of a battlefield. Successful crowd performers therefore begin by observing people. Everyday behaviour, casual conversations, shared laughter, hesitation, disagreement and excitement all provide material that can later be adapted naturally within the recording studio. The objective is not to invent behaviour, but to recognise and recreate it convincingly.

    Perhaps the most revealing insight from this opening part of the lecture concerned the relationship between realism and audibility. Many novice sound designers instinctively assume that important sounds should always be heard clearly. Monteath argued for almost the opposite approach. Crowd ADR often succeeds precisely when audiences remain largely unaware of it. Background voices should usually be felt rather than heard, contributing movement, texture and emotional energy without competing with the principal dialogue. Monteath returned repeatedly to the idea that audiences should sense the presence of a living world long before they consciously identify individual voices. Crowd ADR achieves its greatest success not when listeners admire the performance, but when they accept the world on screen without ever questioning how it came to life.

    One of the most valuable themes running through the lecture concerned the difference between sounding natural and sounding believable. These ideas are not always identical. Performers working in crowd ADR rarely speak at the same level they would use in everyday conversation, yet exaggeration can become equally unconvincing. Monteath described the continual process of judging how voices should sit within the perspective of the scene. A performer passing close to the camera requires a different vocal presence from someone crossing the background several metres away, while conversations taking place outdoors demand a different energy from those occurring in confined interior spaces. Every decision depends upon dramatic perspective rather than fixed performance rules. Context, once again, determines everything. For sound designers, these distinctions become equally important during editing and mixing. A crowd recording that sounds entirely convincing in isolation may feel unexpectedly prominent once placed alongside production dialogue, Foley and ambience. Perspective therefore emerges through the relationship between every element of the soundtrack rather than through any individual recording considered on its own.

    This attention to perspective extends beyond volume alone. Monteath discussed the subtle adjustments people make instinctively when speaking in different environments. Outside, voices naturally rise in level before settling into an appropriate projection as people unconsciously judge the surrounding space. He compared this process to a form of echolocation. Speakers continually test their surroundings, modifying projection almost instantly until their voice feels appropriate for the environment. Recording inside a studio removes many of the environmental cues that normally guide these unconscious adjustments, requiring performers to recreate them deliberately. The challenge is not simply to speak more loudly for an exterior scene, but to reproduce the natural behaviour that accompanies speaking outdoors. Audiences rarely analyse these details consciously, though they notice immediately when something feels unconvincing. Successful crowd ADR therefore depends upon recreating patterns of human behaviour rather than merely increasing vocal intensity.

    The physical demands of crowd ADR also proved far greater than many students had expected. Scenes involving panic, conflict or large-scale action often require sustained shouting over many hours, placing considerable strain on performers’ voices. Monteath reflected upon sessions in which actors had pushed themselves to the point of temporary vocal exhaustion, particularly when recording intense battle scenes. Curiously, he observed that shouting repeatedly inside a recording studio often proves more tiring than raising the voice naturally outdoors. In everyday life, people instinctively project according to their surroundings. Within the artificial environment of a studio, performers can find themselves holding unnecessary tension in the throat in ways that feel surprisingly unnatural. Maintaining vocal health therefore becomes an important professional skill alongside acting itself. It also reflects another aspect of professional sound design that audiences rarely consider. Recordings capable of conveying fear, excitement or urgency often depend upon performers sustaining physically demanding work throughout lengthy recording sessions while preserving consistency from one take to the next.

    The discussion of large battle sequences illustrated another revealing aspect of the profession. Crowd performers may spend an entire day creating layers of screams, reactions and movement for scenes involving hundreds or even thousands of people, fully aware that much of their work will eventually disappear beneath music, sound effects and the principal action. Monteath recalled recording material for a major battle sequence in Game of Thrones, where hours of physically demanding vocal performances ultimately became almost imperceptible within the finished soundtrack. Rather than expressing disappointment, he presented this as an inevitable consequence of professional sound design. The objective was never for individual performances to stand out. Their purpose was to contribute energy, scale and credibility to the scene, even if audiences remained almost entirely unaware of their presence. The irony is that some of the hardest work in post-production often becomes the least conspicuous in the finished mix.

    Monteath’s recurring phrase, “Context is king,” captures this philosophy particularly well. Every vocal decision derives from the dramatic situation rather than from the performer. Voices rise or fall according to the surrounding environment, emotional reactions emerge in response to the unfolding action and every fragment of conversation exists to reinforce the illusion that life extends beyond the principal characters. Successful crowd ADR is therefore measured not by how clearly individual voices are heard, but by how convincingly they allow audiences to believe in the world unfolding around them. Like many aspects of professional sound design, its greatest achievement lies in remaining almost invisible while making the fictional world feel entirely real.

    The lecture concluded with a discussion that moved beyond recording techniques and towards the broader decisions that shape professional sound design. One student described the challenge of creating the atmosphere for a bank robbery scene. Adding more and more voices had seemed the obvious solution, yet the result quickly became cluttered and distracted from the drama. Monteath’s response illustrated once again why crowd ADR depends upon judgement rather than quantity. Real crowds rarely behave as a single, unified group. Even in moments of fear, surprise or excitement, different people react at different times and in different ways. Some remain silent, others whisper, a few call out, while many simply watch events unfold. Attempting to represent every visible person with an equally prominent vocal performance often produces a soundtrack that feels less realistic rather than more so. Believability emerges through carefully judged variation, allowing individual reactions to appear and disappear naturally instead of competing continuously for the listener’s attention.

    This observation extends well beyond crowd ADR. Throughout post-production, sound designers continually decide what deserves the audience’s attention and what should remain part of the wider acoustic environment. A convincing soundtrack is not created through the accumulation of detail, but through the careful organisation of that detail into a coherent dramatic experience. Crowd performances occupy a role similar to ambience, Foley and environmental sound. They establish context, scale and emotional texture without constantly demanding attention. Their purpose is not to demonstrate how much work has been carried out, but to convince audiences that the world extending beyond the principal characters already exists. Like every other element of a soundtrack, their success depends upon supporting the story rather than competing with it.

    Towards the end of the lecture, discussion turned to the growing influence of artificial intelligence within the voice industry. Monteath acknowledged that AI is already beginning to affect areas such as commercial voice-over, where some clients have started experimenting with synthetic voices. He regarded crowd ADR rather differently. While aspects of the work may eventually become automated, authentic crowd performance depends upon subtle variations that emerge naturally whenever people work together. Voices change over the course of a recording session as performers become tired. Emotional intensity shifts between takes. Individual personalities influence rhythm, timing and vocal colour in ways that are difficult to predict or reproduce consistently. These variations might appear inconvenient from a purely technical perspective, yet they contribute directly to the richness, unpredictability and authenticity that audiences instinctively recognise as human. Technology will continue to evolve, though observation, collaboration and performance remain at the heart of believable sound design.

    For sound design students, perhaps the most valuable lesson lay in the way Monteath described his profession. Crowd ADR may appear to occupy the margins of post-production, hidden beneath dialogue, music and sound effects, yet it influences how audiences perceive almost every scene they watch. Every murmur in the background of a restaurant, every distant conversation in a station concourse and every carefully judged reaction during a moment of crisis contributes to the illusion that life continues beyond the frame. These performances do not simply fill silence. They create social spaces that feel inhabited, allowing viewers to concentrate on the story without questioning the reality of the world surrounding it.

    Throughout the lecture, Monteath returned repeatedly to one deceptively simple principle: “Context is king.” Crowd ADR succeeds not through memorable performances or individually recognisable voices, but through creating the impression that every environment extends beyond the limits of the frame. Every carefully judged laugh, argument, whispered conversation and fleeting reaction reinforces a believable social world without distracting from the principal narrative. For sound designers, this represents a broader lesson that reaches far beyond dialogue replacement. Successful audio is rarely measured by how noticeable it becomes. More often, it is measured by how completely it allows audiences to believe in the world they are experiencing. Crowd ADR exemplifies that philosophy. It remains one of the least visible aspects of professional sound design, yet it is also one of the crafts that most quietly transforms moving images into convincing places inhabited by believable people.

  • How Does a Film Speak Every Language? George Mikrogiannakis on Film Localisation, Dubbing, and International Sound Production

    George Mikrogiannakis

    How does a film speak every language?

    Most audiences rarely stop to consider the question. A film appears in a cinema or on a streaming service, characters speak naturally in the local language, performances feel convincing, and the soundtrack appears entirely coherent. Nothing suggests that thousands of individual decisions, spread across months of work and involving specialists in numerous countries, have contributed to what appears to be a seamless experience. During his online guest lecture for Edinburgh Napier University, George Mikrogiannakis drew back the curtain on that process. With many years of experience supervising international localisation on major productions, including work for Walt Disney Studios and DreamWorks Animation, he revealed that dubbing represents only one small part of a much larger undertaking. Localisation combines translation, performance, dialogue editing, sound design, recording, mixing, quality control, and project management into a production process whose success depends upon remaining almost completely invisible.

    Before discussing localisation itself, Mikrogiannakis addressed a question that many sound design students might reasonably ask. Why should someone interested in sound effects, Foley, ambience, or mixing concern themselves with dubbing? His answer challenged the assumption that localisation belongs solely to translators or dialogue editors. Film sound does not end when the final mix has been approved. Modern productions are expected to travel internationally, and that expectation influences decisions made throughout post-production. Deliverables, session organisation, music and effects mixes, dialogue editing, documentation, and recording practice all determine whether a soundtrack can later be adapted successfully into dozens of different languages. Understanding localisation therefore provides a wider understanding of professional sound production itself.

    The distinction between dubbing and localisation formed the starting point for the discussion. Dubbing describes the replacement of spoken dialogue with performances recorded in another language. Localisation encompasses everything required to ensure that a film functions naturally within another culture while preserving the creative intentions of the original production. Dialogue must communicate the same dramatic ideas, fit the visible movements of actors’ mouths, respect timing, preserve emotional performances, and integrate seamlessly into the original soundtrack. A successful localisation should never feel like a compromise. Audiences should simply experience the film as though it had always belonged in their own language.

    Commercial realities make this work indispensable. For many major studio productions, international audiences account for the majority of ticket sales. A film that performs well domestically may still depend upon worldwide distribution for its overall commercial success. Localisation therefore becomes an essential stage of production rather than an optional addition. Mikrogiannakis illustrated the scale of this work with one striking example. Pirates of the Caribbean eventually required more than six hundred separate versions to satisfy different languages, territories, exhibition formats, airlines, and distribution requirements. Once work reaches this scale, localisation no longer resembles a straightforward translation exercise. It becomes an international production pipeline operating alongside the creation of the original film.

    Maintaining consistency across so many versions requires careful coordination between numerous creative and technical disciplines. Scripts pass from translators to dialogue adaptors, from recording directors to voice actors, from editors to mixers, and through repeated rounds of quality control before final approval. Every participant contributes something different while working towards the same objective. The audience should experience the same characters, the same emotional performances, and the same dramatic pacing regardless of which language they hear.

    Translation itself proved far more creative than many students had expected. Mikrogiannakis explained that translators receive extensive supporting documentation describing characters, situations, cultural references, jokes, and dramatic context. Their task is not to reproduce individual words as literally as possible. Instead, they seek to preserve the intention behind the dialogue. Humour frequently illustrates this challenge. A joke that depends upon an English idiom or a cultural reference may simply fail when translated directly. Rather than forcing audiences to decode unfamiliar expressions, adaptors reconstruct the underlying comic idea so that viewers in another country experience a similar moment of humour, even if the dialogue itself changes substantially.

    Lip synchronisation introduces further complications. Different languages occupy different amounts of time. A short English sentence may require considerably more syllables elsewhere, while other languages express the same meaning much more concisely. Dialogue therefore undergoes continual adjustment until it satisfies several competing requirements simultaneously. It must sound natural, preserve the original dramatic meaning, fit within the available time, and remain synchronised with the visible movements of the actor’s mouth. Accuracy alone is never enough. Rhythm, emphasis, breathing, pacing, and performance all contribute to whether audiences believe what they are watching.
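
    The timing constraint described above can be sketched as a rough first-pass check. This is only an illustration under stated assumptions: the function names, the default speaking rate of five syllables per second and the ten per cent tolerance are hypothetical placeholders, not figures from the lecture, and a real adaptation is still judged by ear and against the actor's visible mouth movements.

```python
def estimated_duration(syllables: int, syllables_per_second: float = 5.0) -> float:
    """Estimate how long a line takes to speak, in seconds.

    The default rate is an assumed placeholder; real rates vary by
    language, performer and dramatic context.
    """
    if syllables < 0:
        raise ValueError("syllables must be non-negative")
    if syllables_per_second <= 0:
        raise ValueError("syllables_per_second must be positive")
    return syllables / syllables_per_second


def fits_window(
    syllables: int,
    window_seconds: float,
    syllables_per_second: float = 5.0,
    tolerance: float = 0.10,
) -> bool:
    """Return True if the adapted line plausibly fits the on-screen window.

    `tolerance` is a hypothetical allowance for slightly faster delivery.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    needed = estimated_duration(syllables, syllables_per_second)
    return needed <= window_seconds * (1 + tolerance)


# An 8-syllable line against a 2-second window, then a 14-syllable adaptation.
short_line_fits = fits_window(8, 2.0)
long_line_fits = fits_window(14, 2.0)
```

    A check like this can only flag lines that are clearly too long for the available time. It says nothing about rhythm, emphasis or breathing, which the adaptor and dialogue director still have to judge.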

    The recording process reflects the same attention to detail. Unlike dramatic productions in which actors frequently perform together, dubbing sessions normally record performers individually, allowing every voice to remain completely controllable throughout the final mix. Consistency becomes one of the engineer’s principal responsibilities. Mikrogiannakis emphasised the importance of using the same recording environment, microphone, and acoustic conditions throughout an entire production. Local studios are generally expected to deliver clean recordings with minimal processing. Equalisation, dynamics processing, reverberation, and other creative treatments remain the responsibility of the originating production rather than the individual dubbing facility. Every language version therefore begins from comparable source material before being shaped into the finished soundtrack.

    One particularly memorable example demonstrated just how carefully these productions preserve even the smallest creative details. During the localisation of How to Train Your Dragon, one character briefly speaks while wearing a leather mask. Rather than leaving each territory to interpret the scene independently, the production required two separate recordings of every affected line. One version was performed normally. The second was recorded with an obstruction placed in front of the actor’s mouth to recreate the acoustic effect of speaking through the mask. Such requests may appear unusually specific, though they illustrate a broader principle running throughout the lecture. Localisation seeks to reproduce the experience of the original production as faithfully as possible, even when that requires remarkably detailed technical preparation.

    For sound design students, the most revealing discussion centred upon the music and effects mix, more commonly known as the M&E. At first glance, creating an international version might appear straightforward. Remove the original dialogue, record new voices, and place them into the existing soundtrack. Mikrogiannakis demonstrated why this assumption quickly breaks down. Production dialogue rarely contains voices alone. Clothing movement, footsteps, room reflections, environmental ambience, prop handling, breathing, incidental vocalisations, and countless other sounds often exist within the same recordings. Removing dialogue therefore removes far more than speech.

    Producing a convincing M&E requires many of these elements to be rebuilt separately before localisation can even begin. Foley artists recreate physical actions. Ambience editors restore the acoustic character of locations. Sound editors recover or redesign details that disappear when production dialogue is removed. Every reconstructed element must integrate naturally with the remaining soundtrack so that audiences remain unaware that significant parts of the scene have effectively been recreated. Localisation therefore exposes something that audiences rarely notice. Successful dialogue replacement depends upon the invisible work of many other sound professionals whose contributions make the reconstructed world feel complete.

    The same attention to detail extends into the organisation of production sessions. Mikrogiannakis explained that major studios prescribe how dialogue sessions should be structured long before recording begins. Lead characters, supporting roles, incidental dialogue, and background voices occupy predetermined locations within Pro Tools sessions so that material arriving from different countries can be assembled without confusion. Track layouts, naming conventions, file structures, and version numbers follow equally strict standards. These systems may appear administrative rather than creative, though they exist for a practical reason. Hundreds of dialogue files may pass between translators, recording studios, editors, mixers, and quality-control teams before a film reaches cinemas. Small inconsistencies introduced at the beginning of the process can rapidly become expensive problems once productions begin moving between countries.
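
    A naming convention of this kind can be checked automatically before files leave a facility. The sketch below assumes an entirely hypothetical pattern (title, three-letter language code, character, reel and version); actual studio specifications were not given in the lecture and will differ.

```python
import re
from typing import Dict, List, Optional

# Hypothetical convention, for illustration only:
#   <TITLE>_<LANG>_<CHARACTER>_R<reel>_v<version>.wav
FILENAME_PATTERN = re.compile(
    r"^(?P<title>[A-Z0-9]+)_"
    r"(?P<lang>[A-Z]{3})_"
    r"(?P<character>[A-Z0-9]+)_"
    r"R(?P<reel>\d{2})_"
    r"v(?P<version>\d{3})\.wav$"
)


def parse_dialogue_filename(name: str) -> Optional[Dict[str, object]]:
    """Split a conforming file name into its fields, or return None."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        return None
    parsed: Dict[str, object] = dict(match.groupdict())
    # Reel and version are numeric, so convert them for sorting and comparison.
    parsed["reel"] = int(match.group("reel"))
    parsed["version"] = int(match.group("version"))
    return parsed


def find_nonconforming(names: List[str]) -> List[str]:
    """Return the file names that do not follow the assumed convention."""
    return [n for n in names if parse_dialogue_filename(n) is None]


delivery = ["DEMOFILM_FRA_HERO_R03_v002.wav", "hero final FINAL.wav"]
rejected = find_nonconforming(delivery)
```

    Rejecting a badly named file at the point of delivery is far cheaper than tracing it once it has been copied into sessions in several countries.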

    For students accustomed to working alone, this offers an interesting perspective on professional practice. Large productions depend upon predictability as much as originality. Other members of the production team must be able to identify recordings immediately, locate the correct version of every file, and understand how sessions have been organised without needing lengthy explanations. Good organisation does not restrict creativity. It allows creativity to survive within projects involving hundreds of contributors working across multiple continents.

    The recording sessions themselves reflect similar priorities. As noted earlier, actors rarely record together, even when their characters share a conversation. Instead, every performance is captured independently under carefully controlled acoustic conditions. Recording engineers seek consistency above all else, maintaining the same microphones, recording chains, and studio environments wherever possible. Performances can then be balanced, edited, and integrated into the soundtrack with considerably greater precision than would otherwise be possible. The objective is not simply to record dialogue. It is to provide material that remains flexible throughout every subsequent stage of post-production.

    Security introduces another layer of complexity. Long before a film reaches cinemas, localisation teams may already be working on dialogue in numerous languages. Scripts, images, and recordings therefore become highly confidential. Mikrogiannakis described productions protected through extensive non-disclosure agreements, secure online workflows, watermarked media, and carefully controlled distribution systems. In particularly sensitive cases, even the picture supplied to dubbing studios may reveal only a small area surrounding a character’s mouth while the remainder of the image remains concealed. The performers receive enough visual information to synchronise their dialogue without exposing details of the story before release.

    These precautions reveal another aspect of contemporary sound production that audiences rarely encounter. Localisation frequently begins while visual effects continue to evolve, editorial changes remain possible, and marketing campaigns have yet to reveal significant elements of the film. Sound departments therefore work within productions that remain in constant development. Flexibility becomes as valuable as technical expertise. Dialogue may require revision, scenes may be shortened, and editorial decisions may continue long after recording has begun. Every change must then be reflected consistently across every language version.

    Once recording has finished, another phase begins. Every performance is reviewed against the original production to evaluate synchronisation, pronunciation, dramatic intention, technical quality, and consistency with previously approved material. Recordings that satisfy one requirement may still require revision for another. A technically perfect recording may not match the emotional intensity of the original actor. A convincing performance may reveal a slight synchronisation problem. A translation may preserve meaning while sounding unnatural when spoken aloud. Each stage of review narrows these differences until the finished soundtrack supports the same dramatic experience as the original production.
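
    The point that a take can pass one check and still fail another can be modelled as a simple record in which approval requires every criterion to pass. The five field names follow the criteria listed above; the class itself and its methods are a hypothetical sketch, not a description of any studio's review tool.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class TakeReview:
    """Outcome of reviewing one recorded take against the original."""

    synchronisation: bool
    pronunciation: bool
    dramatic_intention: bool
    technical_quality: bool
    consistency: bool

    def failed_checks(self) -> List[str]:
        """Names of the criteria this take did not satisfy, in field order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    @property
    def approved(self) -> bool:
        """A take is approved only when no criterion has failed."""
        return not self.failed_checks()


# A technically clean take that misses the emotional intensity of the original.
review = TakeReview(
    synchronisation=True,
    pronunciation=True,
    dramatic_intention=False,
    technical_quality=True,
    consistency=True,
)
needs_revision = review.failed_checks()
```

    Keeping the failed criteria by name, rather than a single pass or fail flag, tells the next reviewer which specialist needs to look at the take again.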

    The process depends upon specialists whose expertise overlaps rather than duplicates. Translators evaluate language. Dialogue directors shape performances. Recording engineers concentrate on technical quality. Editors refine timing and synchronisation. Mixers integrate new dialogue into the existing soundtrack. Supervisors compare each completed version with the original production before granting approval. None of these roles can replace another. The finished film emerges through collaboration between people whose responsibilities remain distinct while contributing towards a shared creative objective.

    One aspect of the discussion resonated particularly strongly for sound design students. Many university projects naturally emphasise creating interesting sounds. Professional productions require that creativity to coexist with organisation, documentation, planning, and consistency. A beautifully designed soundtrack that cannot be delivered reliably to another department quickly becomes difficult to maintain. Localisation demonstrates this reality with unusual clarity. Every recording created during production may later support dozens of additional versions distributed across the world. Decisions made while organising sessions, preparing stems, documenting edits, or recording apparently insignificant details may continue influencing the production years after the original mix has been completed.

    The relationship between creativity and organisation also changes the way professional sound departments approach collaboration. Rather than treating editing, Foley, dialogue, sound effects, ambience, and mixing as isolated activities, localisation reveals how closely each depends upon the others. Replacing dialogue successfully requires carefully prepared music and effects mixes. Those mixes depend upon dialogue editors separating production material accurately. Dialogue editors depend upon clean recordings, consistent session management, and comprehensive documentation. Every department inherits decisions made by the departments before it. Strong workflows therefore support creative outcomes rather than competing with them.

    Audiences rarely recognise any of this work, and perhaps they should not. Successful localisation draws attention towards the story rather than the production process. Viewers become absorbed in performances, relationships, humour, and dramatic tension without considering how many different versions of the soundtrack exist or how many specialists contributed to the one they happen to hear. The technical achievement lies precisely in making reconstruction disappear.

    For sound design students, localisation offers an unusually clear picture of contemporary professional practice. It demonstrates that sound production extends well beyond recording and mixing. Projects continue to evolve after the original soundtrack has been completed, passing through new languages, cultures, technologies, and distribution platforms while preserving a coherent creative identity. Every carefully organised session, every clean recording, every reconstructed ambience, and every accurately prepared deliverable helps make that possible.

    A film may begin life in a single language, though its soundtrack is often expected to communicate with audiences across much of the world. Making that transition successfully depends upon considerably more than translation. It depends upon planning, technical precision, collaboration, and a shared commitment to preserving the creative intentions embedded within the original production. The better those foundations have been established, the more naturally the film speaks to audiences, regardless of which language they hear.