  • Why Has Pitch Remained So Difficult to Explain? Prof. William Yost on Hearing Science’s Enduring Mystery

    Few aspects of hearing feel more straightforward than pitch. Sounds seem high or low, rising or falling, stable or changing. We recognise familiar voices, detect changes in intonation, distinguish one alarm from another, and effortlessly judge whether one sound is higher than the next. Most of the time, pitch appears so natural that it barely attracts attention. It feels less like an interpretation and more like a property of the world itself. Prof. William Yost’s guest lecture revolved around a deceptively simple question: if pitch feels so obvious, why has it proved so difficult to explain?

    The question has occupied researchers for more than two thousand years. During that time, scientific understanding of hearing has advanced dramatically. Researchers can measure acoustic signals with extraordinary precision, investigate the mechanics of the inner ear in remarkable detail, record neural activity throughout the auditory pathway, and construct increasingly sophisticated computational models of perception. Yet despite all of this progress, pitch remains one of the most debated topics in hearing science. What makes this especially intriguing is that the difficulty does not arise from a shortage of ideas. The history of pitch perception is filled with elegant theories. Again and again, researchers have proposed explanations that appeared capable of accounting for the available evidence. Again and again, new observations have emerged that complicated the picture. The resulting history is not one of scientific failure. It is a story of a percept that repeatedly proves more complicated than researchers initially expect.

    Discussion of pitch often begins with Pythagoras. Observations of vibrating strings suggested that simple mathematical relationships corresponded to differences in perceived pitch, encouraging the belief that pitch might ultimately be explained through measurable physical properties. Centuries later, researchers such as Hermann von Helmholtz developed increasingly sophisticated accounts linking acoustics, physiology, and perception. Progress seemed to move steadily towards a complete explanation. Yet the history presented by Prof. Yost repeatedly demonstrated that confidence in any single account rarely lasted for long. New experiments continued revealing sounds that behaved in unexpected ways, exposing limitations in explanations that had previously appeared convincing.

    Among the most influential examples is the phenomenon now known as the missing fundamental. Listeners can perceive a pitch corresponding to a fundamental frequency that is physically absent from the sound, provided that enough of its harmonics remain. Telecommunication systems have long benefited from this. Traditional telephone channels transmit little energy below about 300 Hz, which lies above the fundamental frequency of many voices, yet listeners still hear the pitch of the voice through the remaining harmonics. At first encounter, the phenomenon sounds almost impossible. How can listeners hear a pitch that is not present within the signal? Its importance extends far beyond its novelty. The missing fundamental revealed that pitch could not be explained simply by identifying which frequencies were physically present. The auditory system appeared capable of generating a stable perceptual experience even when information that seemed essential was absent. A phenomenon that initially appeared to be a curious exception gradually became evidence that researchers might be asking the wrong questions. Observations of this kind repeatedly changed the direction of pitch research. Their significance lay not merely in producing unusual percepts but in exposing hidden assumptions about how hearing operates.
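
    The missing fundamental can be illustrated with a small numerical sketch. The Python below is not taken from the lecture: the sample rate, the 200 Hz fundamental, the choice of harmonics, and all function names are assumptions made for illustration. It builds a tone from harmonics 2 to 5 only, checks that the signal carries essentially no energy at 200 Hz, and then applies a plain autocorrelation estimate, which is only one of several candidate periodicity accounts and not a claim about how the auditory system works.

```python
import math

SAMPLE_RATE = 8000        # samples per second (illustrative value)
F0 = 200.0                # fundamental in Hz, deliberately left out of the tone
HARMONICS = (2, 3, 4, 5)  # only these multiples of F0 are synthesised
N_SAMPLES = 400           # 50 ms, exactly ten periods of F0


def harmonic_complex(f0, harmonics, n_samples, sample_rate):
    """Sum equal-amplitude sinusoids at the given multiples of f0."""
    return [
        sum(math.sin(2 * math.pi * h * f0 * n / sample_rate) for h in harmonics)
        for n in range(n_samples)
    ]


def bin_magnitude(signal, freq, sample_rate):
    """Magnitude of the signal's correlation with a sinusoid at `freq`."""
    re = sum(x * math.cos(2 * math.pi * freq * n / sample_rate)
             for n, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * n / sample_rate)
             for n, x in enumerate(signal))
    return math.hypot(re, im)


def autocorrelation_pitch(signal, sample_rate, fmin=80.0, fmax=500.0):
    """Return sample_rate / lag for the lag with the largest autocorrelation."""
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)

    def acf(lag):
        return sum(signal[n] * signal[n + lag] for n in range(len(signal) - lag))

    best_lag = max(range(min_lag, max_lag + 1), key=acf)
    return sample_rate / best_lag


tone = harmonic_complex(F0, HARMONICS, N_SAMPLES, SAMPLE_RATE)
energy_at_f0 = bin_magnitude(tone, F0, SAMPLE_RATE)          # effectively zero
energy_at_2f0 = bin_magnitude(tone, 2 * F0, SAMPLE_RATE)     # clearly non-zero
estimated_pitch = autocorrelation_pitch(tone, SAMPLE_RATE)   # periodicity estimate
```

    In this sketch the waveform still repeats every 1/200 of a second even though no 200 Hz component is present, so the periodicity estimate lands on 200 Hz. That is the sense in which a purely "which frequencies are present" account falls short.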

    Researchers naturally searched for the critical variable that would finally explain pitch. Sometimes that variable appeared to be frequency. Sometimes it appeared to be temporal periodicity. Sometimes it appeared to be harmonic structure. Each candidate captured something important about hearing. Each eventually encountered observations that it struggled to explain. Scientific explanations often become more satisfying as they become simpler, and much of the history of pitch research can be understood as a search for a unifying principle capable of accounting for a wide range of perceptual experiences. Every successful theory illuminated part of the phenomenon while leaving other aspects unresolved. Rather than steadily converging towards a single universally accepted explanation, the field gradually accumulated evidence that several different forms of information contribute to what listeners experience as pitch.

    Contemporary hearing science reflects this growing recognition. Spectral information contributes to perception. Temporal patterns contribute as well. Envelope cues can also play a role under certain conditions. Each source of information appears capable of supporting aspects of pitch perception, though each also possesses limitations. The challenge therefore becomes less about identifying a single cue and more about understanding how different forms of information interact to produce a percept that listeners experience as unified and stable. That challenge becomes even greater once the nature of sound itself is considered. Sound unfolds rather than existing all at once. A listener cannot determine the intonation of a spoken sentence from a single instant of sound. The rising contour that transforms a statement into a question only becomes apparent once enough of the signal has unfolded. Similar constraints apply more broadly throughout hearing. A pressure wave travels through the environment, reaches the ear, interacts with the auditory system, becomes encoded into neural activity, and undergoes further processing before contributing to conscious experience. Many of the acoustic cues associated with pitch require information to accumulate before they become meaningful. The auditory system therefore cannot rely upon an instantaneous snapshot of the world. Instead, it must integrate information distributed across time while maintaining perceptual stability.

    Listening never occurs under perfectly controlled conditions. Every listener brings a unique physiology, a unique listening history, and a unique set of experiences to the task of perception. Consider how differently a trained musician, a young child, and an older listener with age-related hearing loss may encounter the same sound. Languages emphasise different pitch patterns. Hearing changes across the lifespan. Exposure to speech and environmental sounds varies considerably. Acoustic environments introduce further variability through reverberation, masking, reflections, distance, and background activity. Signals reaching the ear are therefore rarely identical from one situation to the next. Yet despite this variation, listeners often arrive at strikingly similar perceptual judgements. Understanding how such stability emerges remains part of the broader challenge confronting pitch research.

    Relative pitch offers another perspective on why the problem remains difficult. Most people can readily determine whether one sound is higher or lower than another, recognise familiar patterns across changing circumstances, and detect subtle shifts in vocal expression. A familiar voice remains recognisable whether the speaker is tired, excited, whispering, or shouting. A simple melody can remain recognisable when sung by different people starting on different notes. The acoustic details change, though listeners continue perceiving stable relationships. Absolute pitch, by contrast, appears comparatively uncommon. Hearing therefore seems especially effective at identifying patterns and relationships rather than fixed reference points. Such relational perception proves enormously useful in a world where sounds vary continuously according to source, context, and environment, though it complicates attempts to explain pitch through simple correspondences between physical signals and perceptual experience.

    Contemporary research continues uncovering percepts that challenge established assumptions. Prof. Yost’s discussion of iterated rippled noise provided a particularly striking example. Such stimuli can generate robust pitch percepts despite possessing characteristics that many traditional theories would not predict. Listeners report clear pitches even when the sounds themselves bear little resemblance to the simple tones often associated with textbook demonstrations. Their importance lies in showing that the history of pitch research is not merely a story of earlier misunderstandings corrected by later discoveries. New observations continue emerging. New stimuli continue revealing unexpected aspects of perception. More than two thousand years after researchers first began asking how pitch works, the auditory system still has surprises to offer.
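
    For readers unfamiliar with the stimulus, the sketch below shows the construction usually described for iterated rippled noise: a noise is delayed, added back to itself, and the process is repeated. It is an illustration under assumptions rather than a reproduction of anything shown in the lecture. The sample rate, delay, number of iterations, gain, and function names are all hypothetical choices, and this is only one variant of the delay-and-add network, in which each stage feeds the next.

```python
import random

SAMPLE_RATE = 8000   # samples per second (illustrative value)
DELAY = 40           # delay in samples: 40 / 8000 s = 5 ms, the period of 200 Hz


def iterated_rippled_noise(noise, delay, iterations, gain=1.0):
    """Apply `iterations` delay-and-add stages to a noise sequence."""
    signal = list(noise)
    for _ in range(iterations):
        # Each stage adds a delayed copy of the previous stage's output.
        signal = [
            x + (gain * signal[n - delay] if n >= delay else 0.0)
            for n, x in enumerate(signal)
        ]
    return signal


def normalised_autocorrelation(signal, lag):
    """Autocorrelation at `lag`, divided by the signal energy."""
    energy = sum(x * x for x in signal)
    shifted = sum(signal[n] * signal[n + lag] for n in range(len(signal) - lag))
    return shifted / energy


rng = random.Random(0)                       # fixed seed so the run is repeatable
noise = [rng.gauss(0.0, 1.0) for _ in range(4000)]
irn = iterated_rippled_noise(noise, DELAY, iterations=8)

noise_r = normalised_autocorrelation(noise, DELAY)   # near zero for plain noise
irn_r = normalised_autocorrelation(irn, DELAY)       # much larger after iteration
```

    The result remains a noise with no discrete tonal components, yet it acquires a strong temporal regularity at the delay. Listeners typically report a pitch near the reciprocal of that delay, here around 200 Hz.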

    Scientific uncertainty occupied an unusually prominent place throughout the lecture. Many guest lectures focus on a particular discovery, method, or contribution. Prof. Yost approached the subject differently. Rather than presenting pitch as a problem approaching resolution, he presented it as a continuing scientific conversation extending across centuries. Researchers separated by generations appeared not as competitors replacing one another but as participants in a shared effort to understand a percept that repeatedly exceeds expectations. There was a striking absence of triumphalism in this account. Successive theories were not presented as failures to be discarded, nor as final answers waiting to be celebrated. Instead, they became stages in an ongoing attempt to understand one of the most familiar yet elusive aspects of hearing. New methods produce new insights while simultaneously revealing new complications. Better measurements reduce some uncertainties while exposing others. The enduring value of pitch research therefore lies not only in the answers it has generated but also in the questions it continues to produce.

    Seen from this perspective, pitch becomes something larger than a specialised topic within psychoacoustics. Researchers can measure sounds with extraordinary accuracy. They can investigate increasingly detailed aspects of auditory physiology and neural processing. Connecting those physical processes to lived perceptual experience remains challenging. More than two thousand years of investigation have not diminished the significance of the problem. If anything, they have expanded it. New methods reveal additional layers of hearing, while new explanations illuminate important aspects of auditory processing without resolving every question.

    What emerged most clearly from Prof. Yost’s lecture was that the enduring importance of pitch lies not merely in discovering which theory ultimately proves correct. Its value lies in what the search has revealed about hearing itself. The history of pitch perception is often described as a succession of competing theories, though the lecture presented something richer than that. It revealed generations of researchers grappling with the same fundamental puzzle, each contributing part of a much larger conversation. The closer researchers look at pitch, the less it resembles a single problem waiting to be solved and the more it resembles a window into the sophistication of auditory perception. Few scientific questions have persisted for so long. Fewer still continue generating new experiments, new explanations, and new debates. That persistence suggests that researchers are not simply refining an existing answer. They are continuing to uncover new dimensions of the question itself.

  • Playing Along: When Music Is Part of the Game World

    “We talk about music that originates from within the diegesis — and not from some non-diegetic player outside of it.”
    — Axel Berndt

    In a guest lecture on game audio, Dr.-Ing. Axel Berndt examined the role of diegetic music — music that exists within a game’s fictional world and can be heard, performed, or even disrupted by its characters. This kind of music, Berndt argued, is not background or emotional subtext. It is part of the world itself.

    Berndt is a member of the Center of Music and Film Informatics at the Detmold University of Music, where he works at the intersection of sound design, musical interaction, and adaptive systems. His lecture brought together commercial examples, music-theoretic distinctions, and design considerations to illustrate how music behaves differently when it belongs to the world rather than framing it from outside.

    Inside the World: What Makes Music Diegetic

    Diegetic music refers to music that originates within the game’s diegesis — its fictional environment. Berndt described it as everything “within this world”: sounds that characters can hear and react to, including wind, speech, and music performed or played through in-world devices.

    “If someone switches the radio on, triggers the music box, sings a song, or plays an instrument… their music is also diegetic.”

    Examples included a street musician in The Patrician, a pipe player at a party, and the bard at the start of Conquest of the Longbow. In Doom 3, a gaming machine plays music within the scene; in Oceanarium, a robot performs in a clearly defined virtual space. These are not aesthetic flourishes — they anchor music in the logic of the world.

    Berndt contrasted this with non-diegetic music, which accompanies a scene without being part of it — such as a film score swelling during a battle. “There is no orchestra sitting on an asteroid during the space battle,” he remarked, highlighting the artificiality of non-diegetic scoring in game environments that otherwise strive for realism.

    Sound That Can Be Interrupted

    Once music is part of the world, it becomes subject to physical space, interruption, and interaction.

    “The simplest type of interaction may be to switch a radio on and off, but there is much more possible.”

    Berndt categorised musical interactions as either destructive, where the player disrupts a performance, or constructive, where player input enriches or alters the musical output. In Monkey Island 3 (The Curse of Monkey Island), players must stop their crew from singing an extended shanty by choosing responses that are woven into the rhyme scheme. Each interruption is both a musical event and a gameplay decision.

    “The sequential order of verses and interludes is arranged according to the multiple choice decisions the player makes.”

    Such scenes turn performance into a mechanic. Music is not a layer applied to gameplay — it is the gameplay.

    When Music Isn’t Polished — And Why That Matters

    Berndt emphasised that diegetic music should not always sound flawless. Live performance in reality includes irregularities: tuning fluctuations, missed notes, imperfect timing. Simulating this can enhance believability.

    “Fluctuations of intonation, rhythmic asynchrony, wrong notes — these things simply happen in life situations. Including them brings a gain of authenticity.”

    He cited the harmonica player in Gabriel Knight, whose wavering tone subtly reinforces the impression of a street musician with limited technical control. Imperfection isn’t failure — it is context-aware design.
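
    One way to approximate such imperfections is to perturb a clean score before playback. The following Python is a minimal sketch under assumptions: the `Note` type, the `humanise` function, and the default ranges for timing jitter, detuning, and wrong-note probability are illustrative values, not figures given in the lecture.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Note:
    onset: float               # seconds from the start of the phrase
    midi_pitch: int            # 60 = middle C
    detune_cents: float = 0.0  # deviation from equal temperament


def humanise(notes, rng, timing_jitter=0.03, max_detune=15.0, wrong_note_prob=0.05):
    """Return a copy of `notes` with small, random performance flaws."""
    performed = []
    for note in notes:
        # Rhythmic asynchrony: shift each onset by up to +/- timing_jitter seconds.
        onset = max(0.0, note.onset + rng.uniform(-timing_jitter, timing_jitter))
        # Occasional wrong note: slip a semitone up or down.
        pitch = note.midi_pitch
        if rng.random() < wrong_note_prob:
            pitch += rng.choice((-1, 1))
        # Intonation fluctuation: detune by up to +/- max_detune cents.
        detune = rng.uniform(-max_detune, max_detune)
        performed.append(replace(note, onset=onset, midi_pitch=pitch, detune_cents=detune))
    return performed


score = [Note(onset=0.5 * i, midi_pitch=60 + i) for i in range(8)]
performance = humanise(score, random.Random(1))
```

    Passing in the random generator keeps the flaws reproducible for testing, while a fresh seed per performance would make each rendition slightly different.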

    Berndt also warned against repetitive loops that expose the limits of a system. When the player leaves and re-enters a scene, and the same music starts again from the beginning, the world appears frozen. “We reached the end of the world,” he said. “There is nothing more to come.”

    To counter this, he advocated techniques such as generative variation, asynchronous playback, and music that continues even when not audible — preserving the impression of an autonomous, living environment.
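
    The last of these techniques, music that keeps going while nobody is listening, can be sketched by deriving the playback position from a shared world clock instead of restarting the track whenever the player comes back into range. The Python below is a hypothetical illustration: `DiegeticSource`, its fields, and the linear distance falloff are assumptions, not an engine API.

```python
from dataclasses import dataclass


@dataclass
class DiegeticSource:
    track_length: float           # length of the looping track in seconds
    started_at: float = 0.0       # world-clock time when the performance began
    audible_radius: float = 20.0  # distance beyond which the source is silent

    def playback_position(self, world_time):
        """Position within the track, whether or not anyone is listening."""
        return (world_time - self.started_at) % self.track_length

    def gain(self, listener_distance):
        """Simple linear falloff; zero once the listener is out of range."""
        if listener_distance >= self.audible_radius:
            return 0.0
        return 1.0 - listener_distance / self.audible_radius


radio = DiegeticSource(track_length=180.0)
position_on_return = radio.playback_position(250.0)  # player re-enters at t = 250 s
gain_nearby = radio.gain(5.0)
gain_far_away = radio.gain(25.0)
```

    When the player returns, the track resumes from wherever the world clock says it should be, so the scene does not appear to have been frozen in the player's absence.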

    Games Where Music Is the Environment

    Berndt’s second category of diegetic music is visualised music, where players engage not just with music in the scene but with music as the environment itself. This includes rhythm games like Guitar Hero, Dance Dance Revolution, and Crypt of the NecroDancer, where music structures time, space, and action.

    “What we actually interact with is music itself. The visuals are just a transformation — an interface that eases our visually coined interaction techniques.”

    In Audiosurf, players import their own tracks and race through colour-coded lanes shaped by the waveform. In Rez, players shoot targets that trigger rhythmic events. These games represent a shift from music as accompaniment to music as system.

    “The diegesis is the domain of musical possibilities. The visual layer follows the routines of the music.”

    Berndt emphasised that this kind of interaction demands careful timing, expressive range, and sometimes even simplification to make musical gameplay accessible.

    From Instruments to Systems

    Not all music-based interaction takes the form of traditional games. Electroplankton allowed Nintendo DS users to create sound patterns through direct manipulation — drawing curves, arranging nodes, or triggering plankton-like agents.

    “Interestingly, all these concepts don’t really need introduction. Give it to the players, let them try it out, and they will soon find out by themselves how it works.”

    Berndt distinguished between note-level interaction (e.g. triggering individual sounds, as in Donkey Konga) and structural interaction, where players influence arrangement, progression, or generative systems. Both approaches are valid, but they ask different things of the player — and of the designer.

    Designing with Music in Mind

    Berndt’s lecture underscored a recurring principle: if music is situated in the world, it should behave accordingly. It must continue when out of frame, shift based on player presence, and reflect changes in the environment. When music is visualised or systematised, it should offer feedback and form, not simply decoration.

    “Music as part of the world has to be interactive, too.”

    This is not a stylistic preference — it is a design commitment. When music is embedded in the rules of the world, it becomes not only more believable, but more meaningful. It can reflect character, reinforce consequence, and establish rhythm within both narrative and mechanics.

    Berndt’s examples — from Monkey Island to Rez, from ambient performance to interactive music toys — show how music can operate on multiple levels at once: as texture, mechanic, and presence. His lecture made clear that diegetic music in games is not a solved problem or a historical curiosity. It remains a rich site for experimentation and design.