The video explores the complexities of our sense of hearing using various audio illusions. It highlights how our brains interpret sound beyond just detecting vibrations between 20 Hz and 20,000 Hz, explaining phenomena like the missing fundamental and the Shepard tone illusion. These illustrate how our brains perceive music, tones, and pitches in intriguing ways that differ from pure physical explanations.
The video further delves into historical and modern applications of sound perception, such as how organists use harmonics to create full soundscapes and how technological advances enhanced sound localization during warfare. Additionally, it covers cognitive processes like the phantom word illusion and the cocktail party effect, showcasing our brain's remarkable ability to pick out specific auditory information from seemingly chaotic environments.
Key Vocabularies and Common Phrases:
1. sine wave [saɪn weɪv] - (noun) - A mathematical curve that describes a smooth periodic oscillation. - Synonyms: (sinusoidal wave, trigonometric function, wave function)
To me, sound A is clearly higher, but that's strange because sound A was just a 100 Hz sine wave.
2. overtones [ˈoʊvərˌtoʊnz] - (noun) - Higher frequencies produced along with the fundamental frequency of a sound, which contribute to its timbre. - Synonyms: (harmonics, resonance, accompanying tones)
That's because each pipe produces a distinct set of higher frequencies called overtones.
3. timbre [ˈtæmbər] - (noun) - The character or quality of a musical sound or voice as distinct from its pitch and intensity. - Synonyms: (tone color, sound quality, texture)
They affect the quality of the sound called timbre.
4. harmonics [hɑːrˈmɑːnɪks] - (noun) - Frequencies that are integer multiples of the fundamental frequency, contributing to the sound's timbre and perceived pitch. - Synonyms: (resonance, overtones, sound spectra)
For a lot of instruments, the most common overtones are integer multiples of the fundamental frequency. These are known as harmonics.
5. mondegreens [ˈmɒndɪˌɡriːnz] - (noun) - Misheard song lyrics or phrases that give an incorrect interpretation. - Synonyms: (mishearings, auditory illusions, misinterpretations)
These are called mondegreens after a misheard poem in which there's a line, they have slain the Earl of Moray and Lady Mondegreen.
6. Shepard tone [ˈʃɛpərd toʊn] - (noun) - An auditory illusion of a tone that continually ascends in pitch, yet never gets any higher. - Synonyms: (auditory illusion, perpetual pitch, sound illusion)
Now listen carefully to the music. The scale sounds like it keeps going up and up and up, just like the endless staircase. This is the Shepard tone illusion.
7. glissando [ɡlɪˈsændoʊ] - (noun) - A continuous slide upward or downward between two notes. - Synonyms: (slide, sweep, scale)
And here's a Shepard glissando on its own.
8. cocktail party effect [ˈkɑːktɛɪl ˈpɑːrti ɪˈfɛkt] - (noun) - The ability to focus one's auditory attention on a particular stimulus while filtering out a range of other stimuli. - Synonyms: (auditory discrimination, selective hearing, perceptual focus)
So researchers started looking into the so called cocktail party effect because this problem resembled focusing on a single voice in a noisy room.
9. spatial audio [ˈspeɪʃəl ˈɔːdioʊ] - (noun) - Sound that is perceived as coming from different directions relative to the listener. - Synonyms: (3D audio, surround sound, immersive audio)
Pinna shape is so key to an immersive sound experience in virtual reality that companies like Apple and Sony actually scan your ears to create personalized spatial audio.
10. binaural beats [baɪˈnɔrəl bits] - (noun) - An auditory illusion perceived when two slightly different frequencies are presented separately to each ear. - Synonyms: (audio illusion, brainwave synchronization, neural beats)
When your brain mixes these frequencies together, it's called binaural beats.
These Illusions Fool Almost Everyone
I want you to listen to these two sounds and decide which is higher. So this is sound A and this is sound B. Okay. So to me, sound A is clearly higher, but that's strange because sound A was just a 100 Hz sine wave. Sound B had that same 100 Hz frequency, but also 150 Hz. So we added higher frequencies, but the sound was lower. How does that work? I think there's this idea that what our ears do is simply detect the frequency of vibrations in our environment that are between 20 Hz and 20,000 Hz. But there is so much more to hearing than that.
And in this video, we're going to go through a series of audio illusions that illustrate how our sense of hearing actually works. Most of these effects will work on a phone or laptop speakers, but if you have headphones handy, well, I'd recommend putting them on for the full experience. It's like a whole body instrument, isn't it? Absolutely. Yeah. Yeah. This is the Sydney Town Hall pipe organ. When it was built in 1890, it was the largest organ in the world. Something I didn't realize about organs is that they were meant to sound like many different instruments playing together. Organs are sort of a one person orchestra.
Very flutey, right? Yeah, you can tell. Compare that to a trumpet. The oboe sound. So you can hear the orchestral sounds on the organ. We could get inside the instrument, too. Should we go look? Let's have a look. Yeah. Okay. For each instrument, there are a series of pipes in the organ which play all the different notes for that instrument. I mean, there are 8000 pipes in this organ. 8000. 8000, yeah. Why do you need that many? To create all the different sounds of the orchestra. What you see on the outside is just a tiny fraction of the organ itself.
Whoa, look at all these. They're all hidden in here. They are, yeah. And some are wooden, some are metal, some have resonators at the bottom of them to create the more reedy sounds, the brassy sounds. But then these wooden ones are more of the deep, fluty sounds as well. So this is like what, a keyboard? It's a keyboard, yeah, that's right. A keyboard layout. Yeah, of pipes. When two pipes of the same length vibrate, they both play the same note. That's because they're both producing the same fundamental frequency. That is the lowest and usually highest amplitude vibration they produce.
But if the pipes are made of different materials, they will sound different. So you can tell they are different instruments, and that's because each one produces a distinct set of higher frequencies called overtones. They're not as loud as the fundamental and we don't hear them as distinct tones, but they affect the quality of the sound called timbre. It's how you can tell apart a trumpet from, say, a flute. They have overtones of different frequencies and relative amplitudes. For a lot of instruments, the most common overtones are integer multiples of the fundamental frequency. These are known as harmonics.
If this was your fundamental note, the notes that you're going to be hearing with it would be. So all of those notes are within that fundamental note. Now, harmonics can be useful when you're trying to play really low notes. The Sydney Town Hall pipe organ is one of only two in the world that has a 64-foot-long pipe. It's actually so large that it has to be folded over itself. Where is the 64? Can we, can we see it? You got the grand question that I don't know. I know it's somewhere here. Like, that's a really big chunk of wood right here. Oh, yeah. Could that be it?
This pipe is used to produce a frequency of 8 Hz. When you get to that level, it's more something that you feel rather than something that you hear. For sure. The lowest note most big pipe organs can play is 16 Hz, which is just at the limit of human hearing. But even this requires a 32-foot pipe, which is too big or expensive for many organs. Do you know where the 32-footers are? Well, the ones at the front, they're the ones that you see at the front. That one. That's the 32-footer. Yeah. So that's nearly 10 meters. Nearly 10 meters, yeah. Pretty scary to think about from the top of it, actually.
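As a rough check on the pipe lengths and frequencies quoted above, here is a minimal sketch assuming an idealized pipe open at both ends, whose fundamental is the speed of sound divided by twice the pipe length. The speed of sound value and the function name `open_pipe_fundamental_hz` are assumptions for illustration; real organ pipes have end corrections and nominal stop lengths, so the results only land near the 16 Hz and 8 Hz figures mentioned in the transcript.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly room temperature
FOOT_IN_METERS = 0.3048


def open_pipe_fundamental_hz(length_ft: float) -> float:
    """Fundamental of an idealized open-open pipe: f = c / (2 * L)."""
    length_m = length_ft * FOOT_IN_METERS
    return SPEED_OF_SOUND_M_S / (2.0 * length_m)


# Doubling the pipe length halves the fundamental frequency.
for feet in (16, 32, 64):
    print(f"{feet}-foot pipe: about {open_pipe_fundamental_hz(feet):.1f} Hz")
```

Under these assumptions the 32-foot pipe comes out a little above 16 Hz and the 64-foot pipe a little above 8 Hz, which is the same ballpark as the numbers in the video.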
In 18th century Europe, Georg Joseph Vogler was a popular organist. He wanted to tour the continent, but that would require building a compact, portable organ that he could take with him on the road. He obviously couldn't haul around the huge 32-foot pipe required to produce 16 Hz. So how could he still create the low frequencies that make the organ so powerful? Vogler realized that if he played the harmonics of 16 Hz using shorter pipes, your brain would hear this missing fundamental. Can we try the trick and see if. Yeah, sure, yeah. The fifth is that sort of most common fundamental, which you're going to hear to get the low sound.
But basically, the quint gets used with a 16-foot and it creates the lower resultant tone. So that's just the 16 metal. Mmm. It's so funny because you add it, and I do not hear it going up. You're playing a fifth above. Yes, but I'm hearing it go down. Like you just pull that out. And I'm like, oh yeah, the note dropped. That's the trick. Yeah. With the two sounds I played at the beginning, the first was a pure 100 Hz, but the second sound was made up of the harmonics of 50 Hz. So you actually heard this fundamental frequency even though it wasn't there.
That's how higher frequencies together can sound lower than low frequencies if they are harmonics of a low fundamental. Now, this might not be as weird as it seems. If you look at the waveform of the harmonics, you find that adding the higher frequencies changes the period of the sound. It makes the period longer so that it's actually the same as the missing fundamental. If you kind of recreate some of those harmonic pitches, you're actually going to bring out more bass in the sound. So the idea of like, you could play the harmonics and hear the fundamental. Exactly, even if you're not playing the fundamental? That's right, yeah.
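The point about the period can be checked numerically. The sketch below is a simplification, not a model of how the brain extracts pitch: it treats the implied fundamental as the greatest common divisor of the component frequencies and then confirms that a mix of 100 Hz and 150 Hz (the two frequencies named for sound B) repeats every 1/50 of a second. The function names and the equal amplitudes are assumptions for illustration.

```python
from functools import reduce
from math import gcd, pi, sin


def implied_fundamental_hz(partials_hz):
    """Greatest common divisor of integer partial frequencies, in Hz."""
    return reduce(gcd, partials_hz)


def waveform(partials_hz, t):
    """Equal-amplitude sum of sine partials at time t (seconds)."""
    return sum(sin(2 * pi * f * t) for f in partials_hz)


partials = [100, 150]                    # sound B's stated components
f0 = implied_fundamental_hz(partials)    # no 50 Hz component is actually present
period = 1 / f0

# Sample times chosen so they do not fall on zero crossings of the 150 Hz partial.
samples = [k * 0.0007 for k in range(1, 20)]

# The mixture repeats after 1/f0 seconds...
repeats_at_f0 = all(
    abs(waveform(partials, t) - waveform(partials, t + period)) < 1e-9
    for t in samples
)
# ...but not after 1/100 s, the period of the pure 100 Hz tone.
repeats_at_100 = all(
    abs(waveform(partials, t) - waveform(partials, t + 0.01)) < 1e-9
    for t in samples
)
print(f0, repeats_at_f0, repeats_at_100)
```

Running this prints `50 True False`: adding the 150 Hz component doubles the period of the waveform compared with the pure 100 Hz tone, matching the period of the absent 50 Hz fundamental.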
So different frequency sounds can combine to make notes that aren't there, but they can also do something even... Hello, it's me, Mario. In Super Mario 64, there's a staircase that seems to go on forever. Players can't level up until they collect enough coins. Now listen carefully to the music. The scale sounds like it keeps going up and up and up, just like the endless staircase. This is the Shepard tone illusion. And here's a Shepard glissando on its own.
An ever-increasing tone should be impossible because we can't hear anything beyond the 20,000 Hz limit. And yet this sound keeps going, always ascending. The trick is a Shepard tone isn't just one note. There are multiple frequencies being played, all separated by octaves. All of these frequencies are increasing, but as they do, their volumes change so the high notes get quieter and the low notes get louder. High notes soon fade out and new low notes are faded in. This gives the illusion of an ever-rising pitch.
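The construction just described can be sketched as a list of octave-spaced components whose loudness depends on where they sit in the frequency range. This is a hedged illustration only: the lowest frequency, the number of octaves, and the bell-shaped loudness curve are assumptions chosen for the example, not values from the video.

```python
import math

BASE_HZ = 27.5             # assumed lowest component
NUM_OCTAVES = 9            # assumed number of octave-spaced components
CENTER = NUM_OCTAVES / 2   # loudest point, in octaves above BASE_HZ
WIDTH = NUM_OCTAVES / 6    # hypothetical width of the loudness curve


def shepard_components(phase):
    """Return (frequency_hz, amplitude) pairs for one point in the cycle.

    phase runs from 0 to 1; at 1 every component has risen by one octave.
    """
    components = []
    for k in range(NUM_OCTAVES):
        octave_pos = k + phase
        freq = BASE_HZ * 2 ** octave_pos
        # Bell-shaped loudness over log-frequency: quiet at both extremes,
        # so the top component fades out as a new bottom one fades in.
        amp = math.exp(-0.5 * ((octave_pos - CENTER) / WIDTH) ** 2)
        components.append((freq, amp))
    return components


start = shepard_components(0.0)
end = shepard_components(1.0)
# After one full cycle, component k has become what component k + 1 was at
# the start, so the cycle can restart without an obvious jump.
print(end[0] == start[1], round(end[-1][1], 3))
```

Because the top component is nearly silent by the time it leaves the range, swapping it for a new, equally quiet component at the bottom is hard to notice, and the rise can loop indefinitely.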
The effect is like the audio version of a barbershop pole. Shepard tones can also evoke emotional or physical responses in some listeners. A 2016 study found that after listening to Shepard tones, participants reported feeling nervous, anxious, and disturbed. Perhaps that's why during an intense bombing scene in the film Dunkirk, Shepard tones feature in the accompanying score. Hopefully this won't make you uneasy, but I want you to try to figure out which well-known tune this is. All of the notes have been kept the same, but they've been mixed up into different octaves.
Did you recognize the song? Well, here is the unscrambled melody. But now that you've heard that, can you follow the scrambled version? To me, it's fascinating how the second time I heard the scrambled melody, the tune seemed obvious, which is very different from how it sounded the first time. Our brains can find patterns in random sounds, too. This is the phantom word illusion created by Doctor Diana Deutsch. Listen to this audio and try to figure out which words are being said. You can put what you heard in the comments.
When one speaker plays a word, the second speaker plays a different word at the same time. According to Doctor Deutsch, because the signals are mixed in the air before they reach your ears, you're given a pile of sounds to choose from so you can create words in your mind. A lot of what we hear depends not on the frequencies of sound, but on how our brains process them. Doctor Deutsch noticed that when she played this illusion near exam week, students reported hearing words like no brain, I'm tired, or no time. And we can actually prime the brain to hear what we want it to hear, for example, using text.
Take the case of this crowd chanting. You're primed to hear the lyrics you see. These are called mondegreens after a misheard poem in which there's a line, they have slain the Earl of Moray and Lady Mondegreen. Except in the real poem, the earl dies alone, and his killers actually laid him on the green. Sometimes mondegreens happen when sounds are divided logically but incorrectly, such as hearing pullet surprise instead of Pulitzer Prize. Language familiarity would help you hear the correct one from the start. So while UK football fans might hear the common chant, that is embarrassing, an American football fan might not.
What's even more amazing is how subtle visual cues can affect what we hear. What am I saying in this clip? Bear, bear, bear. If you heard the word bear, that's because that's what I was saying. But what am I saying in this clip? Bear, bear, bear. Now, I bet you heard fair. But if you play back both those clips without looking, you'll find it's the exact same audio. All we changed was the mouth movement. And I can prove it to you by playing those two clips at the same time, and what you hear will change depending on which clip you focus on. Bear, bear, bear. Bear, bear, bear.
So what we see affects what we hear. And the reverse is also true. In this illusion, if no sound is played, it looks like the two circles are passing through each other. But add a sound when they intersect, and immediately it seems like they're bouncing off each other. What we see and hear are intrinsically linked, because in the real world, one sense can reliably inform the other. But what if there are no visual cues to go on? In the 1950s, air traffic controllers were communicating with multiple pilots simultaneously in the same room.
Unfortunately, messages from all of the pilots would play from a single loudspeaker, and the overlapping audio made it really difficult to pick out just one voice. So researchers started looking into the so-called cocktail party effect because this problem resembled focusing on a single voice in a noisy room. Most of us can do this with little effort, but how? It's kind of like taking the recording of the entire party and pulling out a particular voice's waveform. The sound waves interfere with each other before reaching your ears, so this should be a difficult task.
In this recording, try to find the voice talking about a flight in this crowd. I find that really hard. But if you hear the voice first, then the rest of the conversation is easier to follow. This is much easier because you can predict what words will come next based on context and language structure. The second way we can focus on one voice is by identifying where the sound is coming from. Listen again, but this time focus on the pilot played in your left ear. In a cocktail party, you can focus on your friend by ignoring sounds that come from other locations.
Once researchers realized this, they advocated that different pilots be broadcast through different speakers spread out throughout the control room. This allowed air traffic controllers to more successfully tune in to their pilot. But how do we actually locate the source of a sound? Well, I'm gonna put on this blindfold and ask my wife to walk around me and clap in different locations. And I am going to try to point to the location of the sound. So let's give it a try. Normally, you can pinpoint a sound to within a degree or two, and there are actually four different cues that help me identify the location of the sound.
How was it? The first is volume. A sound on my right will be louder in my right ear. My head sort of casts a sound shadow over my left ear. And the second cue is that this shadow attenuates higher frequencies more than low frequencies. It's kind of like when your neighbor is having a party. You can't really hear the high frequencies like the lyrics, but you can hear the bass because low frequencies are less attenuated by distance and obstacles. The third is time delay. It takes a sound half a millisecond to cross your head, so sound will usually arrive at one ear before the other. Listen to a beep on your left and then on your right.
Now, as the delay between those two beeps is shortened, it's less of an echo and more just one sound that's really on your left. The fourth cue we use to identify the source of a sound is at what point in the wave cycle the sound arrives at each ear or the phase of the wave. Is it arriving at a peak or a trough? The phase of the wave at one ear will typically be different than the phase at the other ear. Now you run into a bit of trouble when the sound is either directly in front or behind you or on any point in a vertical plane that passes through the middle of your head.
And that's because the distance from the sound to both of your ears is the same and therefore those four cues aren't very useful. Owls solve this issue with asymmetrical ears. Their left ear is actually lower on their head than their right ear, so sounds from below are louder in their lower left ear. Humans typically have symmetric ears, but their shapes are important. This is where the outer part of your ear comes in. I mean, what we'd normally just refer to as the ear. Technically this is called the pinna. Depending on the location and the frequency of sound, it will bounce off these ridges and bumps on your ear and end up inside your ear actually going into the eardrum.
And those reflections will actually change some frequencies differently than others depending on the location. Scientists placed tiny microphones inside volunteers' ears to measure this. They could see, for example, that a 6000 Hz sound located above you might be amplified by ten decibels, but that same sound below you would be attenuated by ten decibels. These figures depend on the unique bumps and ridges of cartilage in your ear. So each person's ears have a unique response curve to different frequencies at different locations. And over the course of our lives, our brains learn the way different frequencies reflect off our ears and we use that information to identify the source of the sound.
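The time-delay cue mentioned earlier, and the reason it fails for sounds directly in front or behind, can be illustrated with a simple geometric sketch. It assumes a straight-line path difference equal to the ear spacing times the sine of the source angle; the ear spacing, the speed of sound, and the function name are assumptions for illustration, and a real head bends sound around it in more complicated ways.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly room temperature
EAR_SPACING_M = 0.18        # assumed distance between the ears


def interaural_delay_ms(azimuth_deg):
    """Simplified arrival-time difference between the ears, in milliseconds.

    azimuth_deg is measured from straight ahead: 0 is in front,
    90 is directly to the right, 180 is directly behind.
    """
    extra_path_m = EAR_SPACING_M * math.sin(math.radians(azimuth_deg))
    return 1000.0 * extra_path_m / SPEED_OF_SOUND_M_S


for angle in (0, 30, 90, 150, 180):
    print(f"{angle:3d} degrees: {interaural_delay_ms(angle):.3f} ms")
```

With these assumed values, a source directly to one side gives a delay of about half a millisecond, while sources straight ahead and straight behind both give essentially zero. A source at 30 degrees and one at 150 degrees produce the same delay, which is the front-back ambiguity that the pinna's frequency-dependent reflections help resolve.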
Now every person has a unique pinna shape. So what if our ears changed? In a 1998 study, researchers placed small molds into the ears of a group of participants, changing the shape of their pinnas. Here's one subject's data. The rigid background grid represents where sounds were actually played. The dots are the subject's guesses and the darker warped grid is the average of those guesses. Before the study, they were fairly good at locating sounds, but after changing their pinna shape, they were downright terrible.
Over a series of days and weeks with their new pinnas, the participants all adjusted and became better at locating sound. So it is something your brain can adapt to. Thankfully, after the molds were removed, participants had no trouble reverting back to their original ears. Pinna shape is so key to an immersive sound experience in virtual reality that companies like Apple and Sony actually scan your ears to create personalized spatial audio. And for a long time, people have been trying to harness and amplify our ability to locate sounds. In 1880, Professor Alfred Mayer presented a device called a topophone to locate ships in the fog.
It was made from two adjustable hearing cones. By changing the distance and angle between them, sailors could narrow down the direction of a ship's foghorn. Unfortunately, they weren't very useful because sound waves interact with fog. But then during World War I, locating bombing planes on approach was of central importance. So armies developed special equipment called sound mirrors to amplify sound. In Britain, sound mirror stations coordinated together to locate an enemy up to 15 minutes in advance. But as planes became faster, sound mirrors couldn't detect them early enough and they were eventually abandoned after the invention of radar.
But even though the technology became obsolete, the system was not. The radar team used the idea of coordinating stations that was first developed in the sound mirror program. Linked radar stations were a critical defense in the Battle of Britain.
This is the Vox Angelica on the Sydney Town Hall organ. When two pipes are slightly out of tune, there's this pulsing effect to the sound. You can hear that in a more pronounced way if I play pure tones. Here is a pure 261 Hz sine wave and a pure 263 Hz sine wave. When both of these tones play, those compression and rarefaction waves interfere with each other. Sometimes the peaks line up to produce a louder sound, and when a peak lines up with a trough they cancel out. Because these frequencies are separated by 2 Hz, you hear two louder pulses every second. This is known as beating.
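The two-pulses-per-second figure follows from the sum-to-product identity for sines: two equal-amplitude tones add up to a tone at the average frequency whose loudness is shaped by a slow cosine envelope. The sketch below checks this for the 261 Hz and 263 Hz tones; the function names and the equal amplitudes are assumptions for illustration.

```python
import math

F1_HZ = 261.0
F2_HZ = 263.0


def mixed(t):
    """Sum of the two equal-amplitude sine tones at time t (seconds)."""
    return math.sin(2 * math.pi * F1_HZ * t) + math.sin(2 * math.pi * F2_HZ * t)


def envelope(t):
    """Slow loudness envelope: |2 * cos(pi * (f2 - f1) * t)|."""
    return abs(2 * math.cos(math.pi * (F2_HZ - F1_HZ) * t))


# The envelope peaks once per cycle of the frequency difference.
beat_rate_hz = abs(F2_HZ - F1_HZ)

for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"t = {t:.2f} s, envelope = {envelope(t):.3f}")
print("beats per second:", beat_rate_hz)
```

The envelope is at its maximum at 0, 0.5, and 1 second and drops to zero at 0.25 and 0.75 seconds, so the combined sound swells twice each second, as described.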
Now the beats are really clear, and this makes sense when the two waves are interfering in the air. But what happens if a 261 Hz tone is played in one ear and a 263 Hz tone is played in the other? What did you hear? Well, the tones never had a chance to interact, but you can still hear some subtle beating. Your brain is firing at a rate corresponding to the phase difference, causing the beat perception. When your brain mixes these frequencies together, it's called binaural beats. And maybe you've already heard of binaural beats, as a quick search of YouTube shows that some people claim they can improve focus or memory.
But a 2023 review was inconclusive and emphasized the need for more standardized testing methods. Audio illusions aren't a sign that our sense of hearing is faulty. I mean, the world is a messy, noisy place, and our brains have developed complex methods to deal with ambiguity. You fill in the gaps with your past experiences or expectations. Without your brain making these subconscious adjustments, a cocktail party would always just sound like a total mess. Audio illusions show us where our perception goes wrong, but the system as a whole is pretty good at getting to the truth.
Now illusions remind us that we can't always take the world at face value. And while our unconscious minds might fill in the gaps from time to time, it's our critical thinking skills that do the heavy lifting of separating fact from fiction.
Science, Technology, Education, Audio Illusions, Hearing Perception, Sound Waves, Veritasium