Thinking about doing a music tech and engineering degree? Here is part of an exemplar report from Hugo - who studies music tech and Engineering.
"Acoustics & Psychoacoustics - Room Design Exercise
A brief guide to room acoustics & acoustic treatment
How sound works
Sound travels as a longitudinal wave moving in a series of compressions and rarefactions
through its medium, in most cases air particles. Sound is generated initially through the
vibration of an object, for example in the case of a drum, the drum skin vibrates, which
causes the air particles directly surrounding the material to vibrate and collide with their
neighbouring particles. This pattern of air particles colliding moves through the air in the
way of the "golf ball and spring" model, as shown in figure 1. In this case the golf balls
themselves are the air particles (the propagating medium) and the springs are the
intermolecular forces between the particles. When there is a collision between particles,
or a point of compression, in air there will be a point of rarefaction that either follows or
precedes it so that the medium may return to its original state. In each of these collisions,
energy is exchanged from one particle to another, with some energy lost in each
collision, until the sound entirely dissipates when the energy transmitted
is negligible. It will also take time for the compression to propagate through the medium
and therefore there will be a lag from source to receiver. However, when a material with
boundaries is introduced, such as a guitar string, the vibration that generates the sound
moves laterally, at right angles to the length of the string, as a transverse wave on the
string. The moving string still pushes on the surrounding air particles, so the sound that
reaches the receiver propagates through the air as a longitudinal wave, as described above.
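The lag from source to receiver, and the wavelength that matters later for room modes, both follow from the speed of sound. The sketch below is illustrative only: the value of 343 m/s (air at roughly 20 °C) and the function names are assumptions, not figures taken from the report.

```python
# Minimal sketch: propagation delay and wavelength of a sound wave in air.
# SPEED_OF_SOUND is an assumed value for air at roughly 20 degrees C.
SPEED_OF_SOUND = 343.0  # metres per second


def propagation_delay(distance_m, c=SPEED_OF_SOUND):
    """Time in seconds for a compression to travel distance_m metres."""
    return distance_m / c


def wavelength(frequency_hz, c=SPEED_OF_SOUND):
    """Wavelength in metres of a tone at frequency_hz."""
    return c / frequency_hz


if __name__ == "__main__":
    # Delay over a 7 m path and the wavelength of a 100 Hz tone.
    print(f"delay over 7 m: {propagation_delay(7.0) * 1000:.1f} ms")
    print(f"wavelength at 100 Hz: {wavelength(100.0):.2f} m")
```

Low frequencies have wavelengths of several metres, which is comparable to the dimensions of a small room.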
Sound amplitude is mainly measured in terms of its Sound Pressure Level (SPL), or the
amplitude of the pressure of the sound wave. This is because the human ear is sensitive
to pressure.
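As a rough illustration of how SPL relates to pressure, the sketch below uses the standard definition of SPL in decibels relative to 20 micropascals. The function name is hypothetical and the input is assumed to be an RMS pressure in pascals.

```python
import math

# Standard reference pressure for SPL in air (20 micropascals).
P_REF = 20e-6


def spl_db(p_rms, p_ref=P_REF):
    """Sound pressure level in dB for an RMS pressure given in pascals."""
    if p_rms <= 0:
        raise ValueError("RMS pressure must be positive")
    return 20.0 * math.log10(p_rms / p_ref)


if __name__ == "__main__":
    # Doubling the pressure raises the level by about 6 dB.
    print(f"{spl_db(1.0):.1f} dB SPL at 1 Pa")
    print(f"{spl_db(2.0) - spl_db(1.0):.2f} dB increase for doubled pressure")
```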
Room Acoustics
Room acoustics are crucial to the way in which sound is experienced as they greatly
affect the tone of the sound that is in the room. The main acoustic characteristics of a
room are made up of a balance of the direct sound (sound taking the direct path from
source to receiver), early reflections (the shortest reflected paths to the receiver around
the room and therefore the amplitude of reflections is still fairly large) - see figure 2 - and
the reverberant field (sound at the receiver has been reflected a great number of times
and is therefore much more diffuse) - see figure 3. The balance of these reflections
Figure 1 - Diagram of the golf ball and
spring model of sound propagation [1]
determines the tone of the sound at the receiver, for example in a large church, such as
the York Minster [3], the reverberant field has a great contribution to the overall sound,
since the size of the church means there will be a huge number of reflections, and as
such the receiver will hear a very reverberant version of the sound transmitted by the
source. This is incredibly important when considering the purpose for which rooms are
designed as a large reverberant field may not be suitable for all purposes, for example it
will make a choir sound much more powerful, but it is not suitable for conferences as a
large number of reflections will reduce how clearly the sound is received. This means that,
when designing a room, the factor of room size must be considered for purpose, but also
the material that the room is made out of, as harder materials will have a lower absorption
coefficient (the amount of sound that is absorbed by a material) and will therefore
encourage more reflection and reverberation. For example, in the prior example of a
church, it is constructed from stone which will have a very low absorption coefficient.
Another factor that means room acoustics are very important in the experience of sound
is the presence of room modes. There are 3 kinds of room modes, axial, tangential and
oblique modes, as shown in figure 4. Room modes occur when a reflection path has a
length that is a whole-number multiple of a half-wavelength of a frequency of the sound
that was transmitted into the room. This means that a standing wave will be formed at that
frequency. Where the waves are in phase they interfere constructively, so that any sound
produced in that room will have points at which there is an antinode (a point of maximum
amplitude) where sound at that frequency is louder. Similarly, where the waves are in
anti-phase there will be a node in the room, where sound at that frequency will be much
quieter. This is also the reason why rooms such as the York Minster are desirable in many
cases, as its complicated, irregular shape makes it much less likely for standing waves
to form.
Acoustic Treatment
Acoustic treatment can be used to improve the quality of sound through the targeting of
problem areas in the acoustic characteristics of the room. For example, in the case of a
studio, room modes can greatly affect the monitoring quality as they can ruin the desired
frequency response of high quality speakers. Moreover, in the studio it is not desirable for
there to be much, if any, reverberation. Therefore, in acoustically treating the room, as
shown in figure 5, the vast majority of reflections can be absorbed and therefore
room modes will also be largely suppressed.
Figure 2 - Diagram of early reflected paths taken from source to receiver [1]
Figure 3 - Diagram of the reflected paths taken from source to receiver in the reverberant field [1]
The main method of acoustic treatment is
through using absorbers and diffusers
positioned specifically so that they cancel
or reduce reflections. This is shown in figure
5, as absorber panels are placed at the
points at which sound reflects off the room
surfaces to remove reflections. There are 2
main types of absorber, porous absorbers
and resonant absorbers. Porous absorbers
are materials such as carpets and curtains,
where the sound wave loses energy due to
work done by the friction in the absorbers -
the propagation of a sound wave is reduced
as the collisions between air particles lose
energy due to the work done by the friction
in the porous material. Since these frictional
losses grow with particle velocity, and velocity
increases with frequency, the porous absorber
will absorb more energy at high frequencies.
Resonant absorbers work by the sound energy
causing the absorber structure itself to vibrate,
so that there are frictional losses within the
absorbing material in the resonant absorber
(see figure 6). As such, they are sensitive to
the pressure amplitude of the sound wave
rather than the sound velocity, as in porous
absorbers. Diffusers
work in the opposite way, in that they
spread sound into many directions (see figure 7). This is done using uneven convex
surfaces on panels that allow the reverberation of a room to be maintained while
preventing room modes from occurring, as standing waves can only form when waves
are reflected directly back in the direction they were moving previously.
Figure 5 - Model showing how acoustic
treatment can reduce reflections [4]
Figure 6 - Structure of a resonant absorber [1]
Figure 7 - Convex structure of a
A guide for treating different types of rooms
Things to consider:
- The materials which the room is made from - e.g. if the floor is made from concrete -
and how reflective these are
- The shape of the room - a more complex shaped room favours a more diffuse reverb in
the room, if the room is a cuboid shape (especially for small rooms) this will cause room
modes (see Room Modes section) to form between the parallel walls. These modes can
be treated through absorptive or diffusive panels.
- The materials inside the room - consider their size and how reflective or absorptive the
surfaces are. In cuboid rooms or rooms with parallel walls large objects in the room can
help to break up the room modes - e.g. a church has very diffuse reverb partly
because lots of objects such as supporting pillars reflect the sound waves in different
directions.
- The purpose of the room - this can be widened by simply considering whether the
room is meant for speech or music. In the case of a medium sized cuboid room, room
modes are treated differently for speech and music - for speech generally using
absorptive panels is best whereas for music diffusion is preferable as a lack of
reverberation can reduce the effect of music greatly.
- The current features of the acoustics of the room - this means identifying whether there
is any part of sound that is more prominent when in the room, such as low or high pitch
ranges, or whether there is a flutter echo in the room. By identifying this it will give an
idea of which room modes are causing the most trouble in the acoustics of the room
and therefore where to put treatment to stop them.
An example:
In a studio (usually situated in small cuboid rooms) generally you want the acoustic
response to be as flat as possible in the frequency range, and to have a very short
reverberation time so that the features of music played in the room can be monitored to
the highest possible quality. This means that absorptive panels are usually the preferred
treatment, positioned especially in the vicinity of the monitors to prevent early reflections
from them into the room as they will distort the frequency range of the sound transmitted
into the room most.
Further Reading
- D Howard, J Angus - Acoustics & Psychoacoustics, 3rd Edition - for more detailed
reading
- Studio SOS Guide To Monitoring & Acoustic Treatment, https://www.soundonsound.com/techniques/studio-sos-guide-monitoring-acoustic-treatment
- practical tips on acoustic treatment in a studio environment
- Studio SOS: Building A DIY Vocal Booth, https://www.soundonsound.com/techniques/
studio-sos-building-diy-vocal-booth - An in depth example of identifying the current
acoustic characteristics of a room, and their conversion to a specific end
Technical Report on Proposed Acoustic Treatment
Introduction
In order for this to be a complete guide to room acoustics and acoustic treatment, an
example room will be analysed and an impulse response measurement will be taken, so
that as well as objectively analysing the properties of that room physically, the subjective
sound experience that results can also be analysed through convolving the impulse
response with anechoic speech and instrumental recordings.
The Room
The room analysed was moderately sized, of dimensions 7 : 5.3 : 2.3 m (Length : Width :
Height), intended as a lecture room for maths students. The room has thin (1/2" thick)
medium sized plasterboard suspended ceiling panels on the ceiling and the walls are also
made entirely out of thin (although more robust than the ceiling) plasterboard panelling.
Given that this thin panelling has hollow space behind it, the walls and ceiling will act
partially like a resonant absorber in that they will resonate with the incoming waves to an
extent (although not very much due to the size of the materials used) and therefore, as in
a resonant absorber, it will absorb some of the low end frequencies. Besides this, given
that the panelling is a fairly hard surface it will have a low absorption coefficient and
therefore will reflect a lot of the sound in the room. Table 1 shows this pattern in the
absorption coefficients of the walls and ceiling, as above 250 Hz their absorption
coefficient decreases greatly.

Material                                          Surface      Absorption coefficient at frequency (Hz)
                                                  area (m²)    125    250    500    1k     2k     4k
Carpet on concrete [Floor]                        37.1         0.02   0.06   0.15   0.4    0.6    0.6
Glass window                                      0.19         0.3    0.2    0.2    0.1    0.07   0.04
Concrete block (painted) [Wall]                   12.9         0.1    0.05   0.06   0.07   0.1    0.1
Plasterboard (suspended ceiling grid) [Ceiling]   37.1         0.15   0.11   0.04   0.04   0.07   0.08
Plasterboard (panelling on studs) [Walls]         39.1         0.29   0.1    0.06   0.05   0.04   0.04

Table 1 - Table of absorption coefficients of materials at relevant frequencies

The wall on the far side on entry (holding the windows) is
more solid, made out of painted concrete, and therefore has a more level absorption
coefficient across the frequency spectrum, in that it is a very reflective surface at all
frequencies with a maximum coefficient of 0.1. The floor is glue-down carpet (this is
generally roughly 1 cm thick) on concrete. This has a much higher absorption coefficient,
specifically at high frequencies, as friction in the carpet causes the sound to lose
energy, and will be the largest factor in dissipating reverberation in the room
- above 2 kHz it has an absorption coefficient of 0.6. Two of the walls are covered with
chalkboards, and there are a number of synthetic wooden desks and chairs, as well as a
computer and desk at the front. Although these will serve, to an extent, to break up room
modes, they are extremely reflective. Finally, on the left side of the room there are 4 glass
windows (roughly 0.25 m wide) made up of 2 panels, extending from floor to ceiling. These
are relatively reflective, but given that they are quite old there may be significant
leakage between the panels.
Estimation of first 10
room mode frequencies
(Hz)
1) - 2250
2) - 3236
3) - 3945
4) - 4513
5) - 5553
6) - 6472
7) - 6769
8) - 6854
9) - 7457
10) - 7503
There is a high concentration of
room modes around 6500 and
7500Hz. This will be problematic as it will give quite a large boost to frequencies in the
upper end of the medium range of human hearing, which will be tangible in any
reverberation in the room. Furthermore, a flutter echo was quickly identified, as the room
is made up of parallel hard, low absorption surfaces.
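Room mode frequencies for an idealised cuboid room are commonly estimated with the rectangular-room formula f = (c/2) * sqrt((nx/L)^2 + (ny/W)^2 + (nz/H)^2). The sketch below applies it to the dimensions quoted for this room. The speed of sound (343 m/s), the function name and the maximum mode order are assumptions, and it is not necessarily the method used to produce the estimates listed above. For the stated dimensions this formula places the lowest modes in the tens of hertz, so its output may not be directly comparable with the listed figures if their units or scaling differ.

```python
import math
from itertools import product


def room_modes(length, width, height, c=343.0, max_order=3):
    """Return sorted (frequency_hz, (nx, ny, nz), kind) for a cuboid room.

    kind is 'axial' (one non-zero index), 'tangential' (two) or
    'oblique' (three). c is an assumed speed of sound in m/s.
    """
    kinds = {1: "axial", 2: "tangential", 3: "oblique"}
    modes = []
    for nx, ny, nz in product(range(max_order + 1), repeat=3):
        if (nx, ny, nz) == (0, 0, 0):
            continue  # skip the trivial non-mode
        f = (c / 2.0) * math.sqrt(
            (nx / length) ** 2 + (ny / width) ** 2 + (nz / height) ** 2
        )
        nonzero = sum(1 for n in (nx, ny, nz) if n)
        modes.append((f, (nx, ny, nz), kinds[nonzero]))
    return sorted(modes)


if __name__ == "__main__":
    # Dimensions quoted in the report: 7 x 5.3 x 2.3 m.
    for f, indices, kind in room_modes(7.0, 5.3, 2.3)[:10]:
        print(f"{f:7.2f} Hz  {indices}  {kind}")
```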
Calculated overall reverberation time
The RT60 can be calculated using the Sabine formula:
$$RT60 = \frac{0.161 \, V}{Sa}$$
Figure 8 - Plot of all room modes for the room analysed, where blue
represents axial modes, red represents tangential modes and orange
oblique modes
Where V is the volume of the room and Sa is the total absorption of the room's surfaces
(each surface area multiplied by its absorption coefficient, summed over all surfaces) at
500 Hz, since this is the industry standard. This gives an RT60 of 1.35 seconds, which is
quite long for a small room, demonstrating how reflective the room is, although at higher
frequencies this will decrease as higher frequency sound energy is absorbed more due to
a higher velocity component and therefore greater frictional losses.
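The Sabine calculation can be reproduced from the surface areas and absorption coefficients in Table 1. The sketch below is an illustration under the assumption that the volume is simply length x width x height from the stated dimensions; the data structure and function name are hypothetical.

```python
# Octave-band centre frequencies and Table 1 data:
# (material, surface area in m^2, absorption coefficients per band).
BANDS_HZ = [125, 250, 500, 1000, 2000, 4000]
TABLE_1 = [
    ("Carpet on concrete [Floor]", 37.1, [0.02, 0.06, 0.15, 0.4, 0.6, 0.6]),
    ("Glass window", 0.19, [0.3, 0.2, 0.2, 0.1, 0.07, 0.04]),
    ("Concrete block (painted) [Wall]", 12.9, [0.1, 0.05, 0.06, 0.07, 0.1, 0.1]),
    ("Plasterboard [Ceiling]", 37.1, [0.15, 0.11, 0.04, 0.04, 0.07, 0.08]),
    ("Plasterboard [Walls]", 39.1, [0.29, 0.1, 0.06, 0.05, 0.04, 0.04]),
]


def sabine_rt60(volume_m3, table, band_hz):
    """Sabine reverberation time (s) for one octave band."""
    i = BANDS_HZ.index(band_hz)
    # Total absorption: sum of area x coefficient over all surfaces.
    total_absorption = sum(area * coeffs[i] for _, area, coeffs in table)
    return 0.161 * volume_m3 / total_absorption


if __name__ == "__main__":
    volume = 7.0 * 5.3 * 2.3  # assumed: stated room dimensions in metres
    for band in BANDS_HZ:
        print(f"{band:5d} Hz: RT60 = {sabine_rt60(volume, TABLE_1, band):.2f} s")
```

With these inputs the 500 Hz band comes out at roughly 1.35 s, in line with the figure quoted, and the higher bands give shorter times because the carpet absorbs more there.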
The main acoustic properties of the room are, firstly, the lack of low and low-mid
frequency presence in the impulse response recorded, which can be attributed to the
resonant nature of the walls and ceiling mentioned previously. There is also a notable
difference between the tangible early reflections and the reverberant field, as there is a
marked loss in low and low-mid frequencies across the impulse response as the balance
of sound moves from direct sound and early reflections to the reverberant field
dominating. The room has a relatively long reverberation time for a room of the size
measured. This is largely because of the low absorption coefficients of the walls, floor and
ceiling of the room. Furthermore, since it is a small room, the SPL of the reverberant field
is also quite high. As shown by Pop and Cabrera [6], this is one of the defining factors in
terms of the human perception of differences between a large and small room, since the
sound field level decreases with distance (generally -6 dB as the distance doubles). The
balance of the 3 main features of room acoustics also largely determines human
perception of room size. This is because the main methods with which the human ear can
localise sound are Interaural Intensity Difference (IID) and Interaural Time
Difference (ITD). ITD is the method through which the time difference between the arrival
of sound at each ear allows the brain to locate sound: sound arriving earlier at one ear
means the source is closer to that ear, and the extent of the difference determines the
source's position compared to the head. Similarly IID is based on the idea that the
intensity of sound will decay with distance and therefore if the sound is less intense at
one ear than the other then this means it must have travelled further to reach it, therefore
the source must be positioned closer to the ear it arrived at with greater intensity. In the
case of direct sound, the brain will immediately be able to isolate its position based on
these methods, and similarly the early reflections of sound will be distinct in arriving from
the general direction of the source. By contrast, the reverberant field consists of sound
arriving that has been reflected multiple times and therefore will come from all directions,
so the brain cannot distinguish the source direction from this. Through this sense of
"surrounding" and its balance against the direct sound and early reflection level the brain
can perceive the size of a room.
Figure 9 - Plot of values for C50 and D50 from 125 Hz to 8 kHz and A, C and L frequency weightings [16]
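Two of the quantities in this discussion can be put into numbers with very simple models. The sketch below is illustrative only: the level change assumes a point source in a free field (the "-6 dB per doubling" rule), and the ITD uses a basic path-difference model with an assumed ear spacing of 0.18 m and speed of sound of 343 m/s, neither of which comes from the report.

```python
import math


def level_change_db(r1_m, r2_m):
    """Change in direct sound level when moving from distance r1 to r2.

    Assumes a point source in a free field (inverse-square behaviour).
    """
    return -20.0 * math.log10(r2_m / r1_m)


def itd_seconds(azimuth_deg, ear_spacing_m=0.18, c=343.0):
    """Interaural time difference for a distant source.

    Simple path-difference model: extra path = ear_spacing * sin(azimuth),
    with azimuth 0 straight ahead and 90 directly to one side.
    """
    return ear_spacing_m * math.sin(math.radians(azimuth_deg)) / c


if __name__ == "__main__":
    print(f"doubling distance: {level_change_db(1.0, 2.0):.2f} dB")
    for angle in (0, 30, 60, 90):
        print(f"azimuth {angle:2d} deg: ITD = {itd_seconds(angle) * 1e6:.0f} us")
```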
Clarity is the measure of the ratio of early (shorter than 50ms) to late (greater than 50ms)
sound energy, derived on the basis that late reflections (those arriving after 50ms) cause
speech to merge making it unclear, whereas those before 50ms will constructively add to
the speech intelligibility. Given that the clarity index (or C50, as seen on figure 9) across
the main range of human speech (the main harmonics of which lie between 1 kHz and 2.5
kHz depending on the fundamental frequency) is between 6.4 dB and 7.2 dB this means
that the room has high clarity, since generally the minimum required clarity for speech is
roughly 3 dB [7]. The value of C50 increases with frequency, as absorption increases with
frequency and therefore there is less reverberation, meaning that the early to late sound
ratio is higher, and therefore at low frequencies the clarity is very poor (below 3 dB) but
the human voice does not have much presence at these frequencies so it is not greatly
important.
Definition (D50 on figure 9) is similarly the measure of early sound energy (from 0 to 50ms)
as a ratio of the total sound energy of the impulse response, expressed as a percentage.
It is used to analyse the suitability of a space for music, i.e. a higher definition means
that tones are clearer to the listener. A space intended for music does not need such
high values as one intended for speech, as the same level of intelligibility is not
required as for speech. Similar to C50, the D50 value is moderately poor at low
frequencies, with values ranging from 52% to 68% from 125 to 500 Hz, and good at the
mid-high range with values around and above 82% from 1 kHz. The listener experience is
dominated by the range from 500 Hz to 4 kHz, with any D50 value below 50% considered
poor, but ideally D50 values should be above 85% [15]. However, unlike in speech, the
low frequency range is much more important for music as there are many instruments
that mainly play within the low frequency range, while the low end frequency spectrum
also lends richness of tone to many other instruments.
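Both parameters come from the same split of the impulse response energy at 50 ms, so they can be sketched together. The code below is a minimal illustration, assuming the impulse response is a list of samples that starts at the arrival of the direct sound; the synthetic exponential decay and all function names are hypothetical and stand in for a measured response.

```python
import math


def early_late_energy(ir, fs, split_ms=50.0):
    """Split impulse response energy at split_ms (default 50 ms)."""
    split = int(round(fs * split_ms / 1000.0))
    early = sum(x * x for x in ir[:split])
    late = sum(x * x for x in ir[split:])
    return early, late


def c50_db(ir, fs):
    """Clarity index: early-to-late energy ratio in dB."""
    early, late = early_late_energy(ir, fs)
    return 10.0 * math.log10(early / late)


def d50_percent(ir, fs):
    """Definition: early energy as a percentage of total energy."""
    early, late = early_late_energy(ir, fs)
    return 100.0 * early / (early + late)


def d50_to_c50(d50):
    """Convert a D50 percentage to the equivalent C50 in dB."""
    d = d50 / 100.0
    return 10.0 * math.log10(d / (1.0 - d))


def synthetic_ir(fs, rt60_s, duration_s=1.0):
    """Idealised decay whose level falls by 60 dB over rt60_s seconds."""
    k = math.log(1000.0) / rt60_s
    return [math.exp(-k * n / fs) for n in range(int(fs * duration_s))]


if __name__ == "__main__":
    fs = 8000
    ir = synthetic_ir(fs, rt60_s=0.5)
    print(f"C50 = {c50_db(ir, fs):.2f} dB, D50 = {d50_percent(ir, fs):.1f} %")
    print(f"D50 of 82 % corresponds to C50 of {d50_to_c50(82.0):.2f} dB")
```

Because both use the same early and late energies, a D50 of 82% corresponds to a C50 of about 6.6 dB, which sits within the 6.4 dB to 7.2 dB range reported for the speech band.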
High values for the Clarity and Definition parameters are to be expected in a small room
such as that analysed, since there are not as many late reflections (those arriving after
50 ms) to reduce the intelligibility of the input signal to the room as
in a large space. Therefore, although the space is functional as a classroom (as it was
intended to be on construction) for students to understand a lecturer, it is not pleasant or
desirable for any kind of music, largely due to the flutter echo that is present in the room.
The flutter echo is identified in the convolved "Maths Claps" [8] audio (auralised clap
recording in the room) as a hollow sounding repeated reflection at a set tone. This occurs
when sound played into a room with parallel walls is reflected and trapped in a repeating
pattern between opposite parallel walls, which causes the sound received to fluctuate, or
"flutter". It is also more common in spaces, such as this one, where the parallel walls are
less absorptive than the floor/ceiling, as this means that oblique and tangential modes are
damped where axial modes are not and as such the axial modes are more prominent. In
terms of listener perception of this flutter echo, it is immediately identified as seeming
unnatural for the room because it so obviously stands out from the natural reverberant field
level. This is also partially due to the fact that the reverberation is largely set up in the
horizontal plane in the room, and therefore the reverberant field is not diffuse.
Treatment
Since the room itself has satisfactory clarity and definition values, clearly any acoustic
treatment on the space to turn it into a small music venue would be focused on reducing
or removing the flutter echo and generating a frequency response of the room itself to suit
music, i.e. music is often more desirable with more prominent mid band frequencies and
flattened high and low frequency spectra.
Firstly, flutter echoes can in general be treated through the use of diffusion or absorber
panels placed on the parallel walls from which the echo arises, making sure not to
increase absorption on the floor or ceiling as it can actually increase the prominence of
the flutter echo in the room [10]. Diffusion panels on the walls can remove the mid-high
band-pass filter effect that is associated with flutter echoes, as mid-high
frequency signals are much more susceptible to small unevenness in the surface of
the wall (as they have shorter wavelengths) and therefore shallow diffusion panels, with
modulation of around 1", can remove the flutter echo. Similarly, porous absorbers, as
described in the introduction to acoustic treatment, are known to do more work on high
frequency sound waves and as such absorb more high frequency sound, as well as
reducing the reverberation time and prominence of the flutter echo. In converting this
space to a small music venue, it is assumed that the performer would stand at the front of
the room furthest from the entrance and listeners would be collected facing the performer
around the rest of the room. Therefore, ideally, absorber panels would be dispersed
around each of the front 3 walls such that the likelihood of room modes occurring is
reduced, with less absorption and more diffusion in the rear of the room. This is so that
listeners are not exposed to as much of the early reflections from the performer, and more
of a diffuse reverberant field is allowed to collect in the rear of the room and circulate.
However, the front and back walls of the room (from the perspective on entry to the room)
are largely taken up by chalkboards and therefore positions of panels on these walls are
limited. To make up for this, I would place 3 absorptive panels of moderate size regularly
spaced along the right side wall of the room, with 2 on the left wall (with one in each gap
between the windows) and bass traps in the corners behind the performer (see figure 10).
Since low frequencies build at hard boundary surfaces, in the corners (the meeting
between two hard boundaries) bass frequencies will intensify. The velocity component of
the sound wave is 0 at the instant it hits the corner and reaches a maximum a quarter
wavelength from the corner. Therefore, placing a bass trap panel at 45 degrees across
the corner, a quarter wavelength from it, will allow maximum bass absorption. On
the opposite side of the room, a diffuser panel will be placed beneath the chalkboard in
order to prevent axial modes occurring directly between the parallel walls at the ends of
the room. If diffuser panels were placed at the sides of the chalkboard, this could possibly
cause problems as it may encourage sound to reflect into the corner of the room, which
will boost bass frequencies in the room. Since a single diffuser panel only gives limited
diffusion in the rear of the room, it may also be beneficial to place a diffusion panel
centrally on the ceiling to encourage this rear room diffuse field, as well as to discourage
room modes from occurring between the floor and ceiling.
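The quarter-wavelength spacing described for the bass traps depends on the frequency being targeted. The sketch below shows the relationship d = c / (4f), assuming a speed of sound of 343 m/s; the function name and the example frequencies are illustrative and not taken from the report.

```python
def quarter_wavelength_m(frequency_hz, c=343.0):
    """Distance from a hard boundary at which particle velocity peaks.

    c is an assumed speed of sound in m/s.
    """
    return c / (4.0 * frequency_hz)


if __name__ == "__main__":
    # Illustrative low frequencies only.
    for f in (50.0, 100.0, 200.0):
        print(f"{f:5.0f} Hz: {quarter_wavelength_m(f):.2f} m from the corner")
```

The distance grows quickly as frequency falls, which is one reason low frequency absorption tends to need deep or spaced treatment.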
Cost and Effect
The specific absorber panel to be used would ideally be similar to that designed by GiK
Acoustics, the 242 Acoustic Panel [10]. These are framed panels where the absorber
within the panel is fibreglass and are designed to absorb a wide range of frequencies -
the space behind the panel allows some resonance and therefore low frequency sound
absorption. Alongside the diffusion panel, which largely treats the mid-high frequency
bandpass filter associated with the flutter echo, this allows a balanced response across
the room so that all frequencies are absorbed equally. The 600 x 600 mm panel would be
used for the 5 absorber panels, which are £36.50 each from GiK Acoustics, so the total
price of the absorber panels is £182.50. The bass traps used would be similar to the
Tri-Trap Corner Bass Trap [11], preferably using GiK Acoustics' range limiter option so
that it does not affect upper range frequencies as much. These cost £109 each, so the 2
together would be £218. The diffuser panels used would be similar to the GiK Acoustics
VersiFusor [12], as, although individually the panels will only scatter sound in one axis, by
placing multiple panels in different orientations next to each other, the panels will diffuse
sound in all axes. This panel also works over a large range of frequencies, from 600 Hz to
7.5 kHz. The panels come in a pack of 4, costing £131 (£32.75 each), which can be used
for the ceiling panel as well as beneath the chalkboard. Therefore the total price of
acoustic treatment should be £531.50.
As a result of this acoustic treatment, the most obvious desired change to the
characteristics of the room is the removal of the flutter echo. However, perhaps more
significant for the room's functionality is the general improvement to the acoustics. By
introducing treatment to prevent room modes this means that unevenness in the
acoustics of the room will be largely removed, which is important for the room particularly
in its everyday use. For example, since room modes will produce both nodes and
antinodes in different positions in the room, this means that if a student is positioned at the
point of a node they will have significantly reduced reception of sound, which is clearly
detracting from learning. In terms of treatment to the end of converting the room to a
small venue, the removal of room modes will reduce resonance in the room, while diffuser
panels will also lend spaciousness to the room so that music played in it will be much
more enjoyable. In a study by Mo, Wu and Horner [13] it was found that a greater
reverberant field has the effect of increasing the emotion of music, specifically when
listeners were positioned at the back of a large hall. This is largely the aim of the
imbalance in the acoustic treatment of the room, in having absorber panels at the front of
the room and diffuser panels in the back, as it will give a greater reverberant field and,
importantly for the listener, will simulate the more natural diffuse field that is given by large
spaces that lends this emotion to music.
Figure 10 - Prospective floor plan of room with acoustic treatment implemented
Benefits of Acoustic Treatment
Clearly acoustic treatment is beneficial to this space as it widens the possible uses of the
room greatly, particularly since features of the room currently such as the mid-high band
frequency resonance of the room mean that for uses with high noise levels it will not be
workable. For example, debates and discussions where multiple people will be talking at
once would cause high levels of resonance and as a result what is a high clarity index will
be masked by the noise caused by this resonance. Acoustic treatment allows control over
the reverberation time and the specific treatment used will allow a diffuse field over the
whole room, rather than the field being dominated by the vertical and horizontal axes, and
so will have a lower SPL and therefore direct sound will be more prominent in debate
scenarios and similar. Furthermore, for the room to be used as a teaching
space/discussion area it requires a much lower acoustic quality than for a music
venue: even if the acoustics are bad, in a teaching space this is undesirable but not
catastrophic. By contrast, the sole purpose of a music venue is for music to be heard in
detail, and therefore it is imperative that it is heard to the highest possible quality and in
the most favouring circumstances. Depending on the genre of music performed in the
venue, e.g. rock or classical, acoustic treatment is needed for different reasons. For
example, in the case of rock music, generally a gig will include the setup of drums, bass
guitar, guitar and vocals, with the bass and guitar going through amplifiers which,
combined, will have a large low frequency component. In an untreated, cuboid, small
room, axial modes at low frequencies are often the most prominent, as they have the
highest pressure component, which means that the bass frequencies will be unevenly
boosted and this distortion will be very prominent.
Auralisation of speech & music in the room
When studying the accompanying auralisation of a bassoon recording the effect of room
reverberation is not as prominent as identified when initially recording the impulse
response (this was done by clapping in the room and seeing the response). This is largely
because the note builds up more gradually than a clap input to the room, and the
reverberation of the room is masked by the steady state tone that follows the initial note.
However, the reverberation is still noticeable in that it gives quite a hollow quality to the
bassoon tone, which is largely due to the bandpass filter effect caused by the flutter echo
in the room. Regardless of this, the tone of the bassoon is still perceived as "nasal" and
"rough" by the human ear. According to an experiment by John Grey (1977) [14], timbre is
perceived and grouped through 3 factors:
1 - The spectral energy distribution, i.e. where energy is greatest within the harmonic
spectrum of a tone and how much it is distributed across the harmonic spectrum
2 - The synchronicity of the harmonics against each other across onset, steady-state and
offset periods
3 - The amount of high frequency, low amplitude energy during the onset phase
Various combinations of these 3 factors determine listener perceptions of how sounds
should be grouped together and how they can be described in terms of scales such as
from dull to bright. In the case of the bassoon tone, the spectral energy is distributed
widely such that the harmonics are largely distinct up to 2 kHz, as well as having a very
synchronous onset. It is important to note that the human ear will only resolve up to the
5th or 6th harmonic as separate sine waves, as above this the spacing between adjacent
harmonics (equal to the fundamental frequency) becomes smaller than the bandwidth of
the ear's auditory filters, and so 2 adjacent harmonics will be resolved into the same filter
in the basilar membrane. This means that when there are a great number of harmonics
distributed above the 5th or 6th, as in the case of the bassoon, they will be perceived as
"roughness" in the tone as they form striations rather than distinct harmonics when
received by the human ear (demonstrated in figure 13 - striations are shown in the form of
vertical lines in the spectrogram). The synchronous
onset of the tone is shown in figure 14, where it shows that prior to the steady state, the
harmonics present in the onset phase are very weakly represented, apart from the
fundamental frequency, before the sound enters its steady state phase very quickly and
synchronously, where it appears almost simultaneous that the full range of harmonics are
represented. However, research shows that the onset phase is crucial for differentiation
between 2 tones of the same pitch, and the onset here shows that odd harmonics are
more present within the tone (in the onset phase), which causes the onset phase of the
note's timbre to be perceived as nasal. The cause of the characteristics of the tone can
be demonstrated through comparison with a tuba tone's (A#2) perceived timbre and
relevant spectrogram, shown in figure 12. The tuba is largely perceived as a rich, full
sound in comparison to the rough, nasal perception of the bassoon tone. By contrast to
the bassoon, figure 12 shows that the tuba has a high concentration of harmonics in the
lower harmonics and it is not very widely distributed, and so the lower harmonics
dominate the tone giving it a more rich sound.
It is noticeable, when compared with the auralisation of speech in the room, how much
more prominent reverberation of the room is over speech than music. In the same way
that when a bassoon was auralised in the room the reverberation was lessened by the
slow attack of the notes in the recording and the following steady state, the opposite is
true of speech. This is because, generally, speech is made up of a pattern of heavily
weighted syllables followed by lesser syllables. This pattern encourages the high attack
on the weighted syllables to cause greater reverberation in the room to mask the following
syllables. Specifically in the recording where there is a brief overlap of 2 voices this
significantly reduces the intelligibility of each individually, which serves to demonstrate the
need for acoustic treatment in the room as previously described. Furthermore, through
comparison of the spectrograms of the auralised bassoon tone (fig. 13) and the 3 second
clip of auralised speech (fig. 14), there is clearly a much higher intensity of energy in the
higher harmonics in the speech recording. Given the earlier explanation that more
harmonics above the 5th or 6th will increase the intensity of striations in the sound, it
follows that the perceived tone will increase in "roughness" proportionally to the intensity
of the striations. The effect of the acoustics of this particular room is to magnify the sound
energy within a set bandwidth in the mid-high frequency range and, as a result, the
striations within any speech in the room will be similarly magnified to dominate the
listener's perception of speech in the room. In addition to this, the increased presence
alone of higher harmonics in the
speech recording compared to the
bassoon tone and their respective
resultant magnification by the room
acoustics could partially explain why
(alongside previous explanations)
reverberation in the room seems much
more prominent over speech than the
recorded bassoon.
Figure 12 - Spectrogram of onset, steady state and offset of
auralised tuba A#2"