Music Tech and Engineering Degree - Example Work

Thinking about doing a music tech and engineering degree? Here is part of an exemplar report from Hugo, who studies Music Tech and Engineering.

"Acoustics & Psychoacoustics - Room Design Exercise

A brief guide to room acoustics & acoustic treatment

How sound works

Sound travels as a longitudinal wave moving in a series of compressions and rarefactions

through its medium, in most cases air particles. Sound is generated initially through the

vibration of an object, for example in the case of a drum, the drum skin vibrates, which

causes the air particles directly surrounding the material to vibrate and collide with their

neighbouring particles. This pattern of air particles colliding moves through the air in the

way of the ‘golf ball and spring’ model, as shown in figure 1. In this case the golf balls

themselves are the air particles (the propagating medium) and the springs are the

intermolecular forces between the particles. When there is a collision between particles,

or a point of compression, in air there will be a point of rarefaction that either follows or

precedes it so that the medium may return to its original state. In each of these collisions,

energy is exchanged from one particle to another, with some energy lost in various

mediums in each collision until the sound entirely dissipates when the energy transmitted

is negligible. It will also take time for the compression to propagate through the medium

and therefore there will be a lag from source to receiver. However, when a material with

boundaries is introduced, such as a guitar string, then the vibration that stimulates sound

is a transverse wave in the material itself: the string is displaced at right angles to its

length. The disturbance this passes on to the surrounding air particles still propagates

through the air as a longitudinal wave.

Sound amplitude is mainly measured in terms of its Sound Pressure Level (SPL), or the

amplitude of the pressure of the sound wave. This is because the human ear is sensitive

to pressure.
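
As a rough illustration of how a pressure amplitude maps onto an SPL figure, the sketch below uses the usual decibel definition with a reference pressure of 20 micropascals. The formula and reference value are standard acoustics rather than something stated in this report, and the name spl_db is made up for the example.

```python
import math

# Assumed reference pressure: 20 micropascals, the conventional reference in air
P_REF = 20e-6

def spl_db(p_rms):
    """Sound pressure level in dB (re 20 uPa) for an RMS pressure in pascals."""
    return 20 * math.log10(p_rms / P_REF)

print(round(spl_db(P_REF), 1))  # 0.0, the reference pressure itself
print(round(spl_db(1.0), 1))    # 94.0, an RMS pressure of 1 Pa
```

Because the scale is logarithmic, each tenfold increase in pressure adds 20 dB.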

Room Acoustics

Room acoustics are crucial to the way in which sound is experienced as they greatly

affect the tone of the sound that is in the room. The main acoustic characteristics of a

room are made up of a balance of the direct sound (sound taking the direct path from

source to receiver), early reflections (the shortest reflected paths to the receiver around

the room and therefore the amplitude of reflections is still fairly large) - see figure 2 - and

the reverberant field (sound at the receiver has been reflected a great number of times

and is therefore much more diffuse) - see figure 3.

Figure 1 - Diagram of the golf ball and spring model of sound propagation [1]

The balance of these reflections determines the tone of the sound at the receiver, for

example in a large church, such as

the York Minster [3], the reverberant field has a great contribution to the overall sound,

since the size of the church means there will be a huge number of reflections, and as

such the receiver will hear a very reverberant version of the sound transmitted by the

source. This is incredibly important when considering the purpose for which rooms are

designed as a large reverberant field may not be suitable for all purposes, for example it

will make a choir sound much more powerful, but it is not suitable for conferences as a

large number of reflections will reduce how clearly the sound is received. This means that,

when designing a room, the factor of room size must be considered for purpose, but also

the material that the room is made out of, as harder materials will have a lower absorption

coefficient (the amount of sound that is absorbed by a material) and will therefore

encourage more reflection and reverberation. For example, in the prior example of a

church, it is constructed from stone which will have a very low absorption coefficient.

Another factor that means room acoustics are very important in the experience of sound

is the presence of room modes. There are 3 kinds of room modes, axial, tangential and

oblique modes, as shown in figure 4. Room modes occur when a reflection path is of a

length that is a whole-number multiple of a half-wavelength of a frequency of the sound

that was transmitted into the room. This means that a standing wave will be formed at that

frequency, such that, where the waves are in phase they will interfere constructively so

that any sound that is produced in that room will have a point at which there is an antinode

(a point of maximum amplitude) where sound at that frequency is louder. Similarly, where

the waves are in anti-phase there will be a node in the room, where sound at that

frequency will be much quieter. This is also the reason why rooms such as the York

Minster are desirable in many cases, as it is a complicated shape and therefore it is much

more unlikely for standing waves to form due to its irregular shape.
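
The half-wavelength condition described above can be written out for the simplest case, a reflection path between two parallel walls. The symbols L (wall spacing), c (speed of sound), n (mode number) and f_n (mode frequency) are introduced here for illustration and do not appear in the original report.

```latex
L = n\,\frac{\lambda_n}{2}, \quad n = 1, 2, 3, \dots
\qquad\Longrightarrow\qquad
f_n = \frac{c}{\lambda_n} = \frac{n\,c}{2L}
```

For example, assuming c = 343 m/s and walls 5 m apart, the first such mode would sit at 343 / (2 x 5) = 34.3 Hz, with further modes at whole-number multiples of that frequency.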

Acoustic Treatment

Acoustic treatment can be used to improve the quality of sound through the targeting of

problem areas in the acoustic characteristics of the room. For example, in the case of a

studio, room modes can greatly affect the monitoring quality as they can ruin the desired

frequency response of high quality speakers. Moreover, in the studio it is not desirable for

there to be much, if any, reverberation. Therefore, in acoustically treating the room, as

shown in figure 5, the vast majority of reflections can be absorbed and therefore

room modes will also be largely suppressed.

Figure 2 - Diagram of early reflected paths taken from source to receiver [1]

Figure 3 - Diagram of the reflected paths taken from source to receiver in the reverberant field

The main method of acoustic treatment is

through using absorbers and diffusers

positioned specifically so that they cancel

or reduce reflections. This is shown in figure

5, as absorber panels are placed at the

points at which sound reflects off the room

surfaces to remove reflections. There are 2

main types of absorber, porous absorbers

and resonant absorbers. Porous absorbers

are materials such as carpets and curtains,

where the sound wave loses energy due to

work done by the friction in the absorbers -

the propagation of a sound wave is reduced

as the collisions between air particles lose

energy due to the work done by the friction

in the porous material. This means at higher

velocities, given that velocity increases with

frequency, the porous absorber will absorb

more energy at high frequencies. Resonant

absorbers work by the sound energy

causing the absorber structure itself to vibrate, and as

such there are then frictional losses

within the absorbing material in the

resonant absorber (see figure 6). As

such, they are sensitive to the

pressure amplitude of the sound

wave rather than the sound velocity,

as in porous absorbers. Diffusers

work in the opposite way, in that they

spread sound into many directions (see figure 7). This is done using uneven convex

surfaces on panels that allow the reverberation of a room to be maintained while

preventing room modes from occurring, as standing waves cannot form when the sound

is no longer reflected directly back in the direction it came from.

Figure 5 - Model showing how acoustic

treatment can reduce reflections [4]

Figure 6 - Structure of a resonant absorber [1]

Figure 7 - Convex structure of a

A guide for treating different types of rooms

Things to consider:

- The materials which the room is made from - e.g. if the floor is made from concrete -

and how reflective these are

- The shape of the room - a more complex shaped room favours a more diffuse reverb in

the room, if the room is a cuboid shape (especially for small rooms) this will cause room

modes (see Room Modes section) to form between the parallel walls. These modes can

be treated through absorptive or diffusive panels.

- The materials inside the room - consider their size and how reflective or absorptive the

surfaces are. In cuboid rooms or rooms with parallel walls large objects in the room can

help to break up the room modes - e.g. a church has very diffuse reverb partly

because lots of objects such as supporting pillars reflect the sound waves in different

directions.

- The purpose of the room - this can be widened by simply considering whether the

room is meant for speech or music. In the case of a medium sized cuboid room, room

modes are treated differently for speech and music - for speech generally using

absorptive panels is best whereas for music diffusion is preferable as a lack of

reverberation can reduce the effect of music greatly.

- The current features of the acoustics of the room - this means identifying whether there

is any part of sound that is more prominent when in the room, such as low or high pitch

ranges, or whether there is a flutter echo in the room. Identifying this will give an

idea of which room modes are causing the most trouble in the acoustics of the room

and therefore where to put treatment to stop them.

An example:

In a studio (usually situated in small cuboid rooms) generally you want the acoustic

response to be as flat as possible in the frequency range, and to have a very short

reverberation time so that the features of music played in the room can be monitored to

the highest possible quality. This means that absorptive panels are usually the preferred

treatment, positioned especially in the vicinity of the monitors to prevent early reflections

from them into the room as they will distort the frequency range of the sound transmitted

into the room most.

Further Reading

- D Howard, J Angus - Acoustics & Psychoacoustics, 3rd Edition - for more detailed


- Studio SOS Guide To Monitoring & Acoustic Treatment, https://

— practical tips on acoustic treatment in a studio environment

- Studio SOS: Building A DIY Vocal Booth,

studio-sos-building-diy-vocal-booth - An in depth example of identifying the current

acoustic characteristics of a room, and their conversion to a specific end

Technical Report on Proposed Acoustic Treatment


In order for this to be a complete guide to room acoustics and acoustic treatment, an

example room will be analysed and an impulse response measurement will be taken, so

that as well as objectively analysing the properties of that room physically, the subjective

sound experience that results can also be analysed through convolving the impulse

response with anechoic speech and instrumental recordings.
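
The convolution step can be sketched in a few lines. This is a minimal direct-form version with made-up sample values, not the tool or the data used for the auralisations in this report; a real recording would normally be processed with an FFT-based routine, because the direct form is slow for long signals.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sequences of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            # Every input sample launches a scaled, delayed copy of the room response
            out[i + j] += s * h
    return out

# Hypothetical data: a dry click, and a room response made of the direct
# sound followed by two weaker reflections
dry = [1.0, 0.0, 0.0, 0.0]
room_ir = [1.0, 0.0, 0.5, 0.25]

wet = convolve(dry, room_ir)
print(wet)  # [1.0, 0.0, 0.5, 0.25, 0.0, 0.0, 0.0]
```

A single click simply reproduces the impulse response, which is why a clap in the room gives a first impression of its acoustics.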

The Room

The room analysed was moderately sized, of dimensions 7 : 5.3 : 2.3 m (Length : Width :

Height), intended as a lecture room for maths students. The room has thin (1/2” thick)

medium sized plasterboard suspended ceiling panels on the ceiling and the walls are also

made entirely out of thin (although more robust than the ceiling) plasterboard panelling.

Given that this thin panelling has hollow space behind it, the walls and ceiling will act

partially like a resonant absorber in that they will resonate with the incoming waves to an

extent (although not very much due to the size of the materials used) and therefore, as in

a resonant absorber, it will absorb some of the low end frequencies. Besides this, given

that the panelling is a fairly hard surface it will have a low absorption coefficient and

therefore will reflect a lot of the sound in the room. Table 1 shows this pattern in the

absorption coefficients of the walls and ceiling, as above 250 Hz their absorption

coefficient decreases greatly.

                                              Frequency (Hz)
Material                          Area (m2)   125    250    500    1k     2k     4k
Carpet on …                       37.1        0.02   0.06   0.15   0.4    0.6    0.6
Glass Window                      0.19        0.3    0.2    0.2    0.1    0.07   0.04
… block (painted)                 12.9        0.1    0.05   0.06   0.07   0.1    0.1
… Ceiling grid)                   37.1        0.15   0.11   0.04   0.04   0.07   0.08
… (Panelling on studs) [Walls]    39.1        0.29   0.1    0.06   0.05   0.04   0.04

Table 1 - Table of absorption coefficients of materials at relevant

The wall on the far side on entry (holding the windows) is more solid, made out of painted

concrete, and therefore has a more level absorption

coefficient across the frequency spectrum, in that it is a very reflective surface at all

frequencies with a maximum coefficient of 0.1. The floor is glue-down carpet (this is

generally roughly 1 cm thick) on concrete. This has a much higher absorption coefficient,

specifically at high frequencies, as friction within the carpet causes the sound wave to lose

energy, and will be the largest factor in dissipating reverberation in the room

- above 2 kHz it has an absorption coefficient of 0.6. Two of the walls are covered with

chalkboards, and there are a number of synthetic wooden desks, chairs as well as a

computer and desk at the front. Although these will serve, to an extent, to break up room

modes, they are extremely reflective. Finally, on the left side of the room there are 4 glass

windows (roughly 0.25m wide) made up of 2 panels, extending from floor to ceiling, which

are relatively reflective, however given they are quite old this may mean there will be

significant leakage between the panels.

Estimation of first 10 room mode frequencies


1) - 2250

2) - 3236

3) - 3945

4) - 4513

5) - 5553

6) - 6472

7) - 6769

8) - 6854

9) - 7457

10) - 7503

There is a high concentration of

room modes around 6500 and

7500Hz. This will be problematic as it will give quite a large boost to frequencies in the

upper end of the medium range of human hearing, which will be tangible in any

reverberation in the room. Furthermore, a flutter echo was quickly identified, as the room

is made up of parallel hard, low absorption surfaces.
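
For readers who want to estimate room modes themselves, the sketch below uses the standard formula for a rigid-walled rectangular room. It is not the calculation behind the list above, whose inputs are not stated; the speed of sound (343 m/s), the function name room_modes and the max_index cut-off are assumptions made for the example.

```python
import math
from itertools import product

def room_modes(length, width, height, c=343.0, max_index=4):
    """Sorted list of (frequency_hz, (nx, ny, nz), kind) for a rectangular room."""
    kinds = {1: "axial", 2: "tangential", 3: "oblique"}
    modes = []
    for nx, ny, nz in product(range(max_index + 1), repeat=3):
        if (nx, ny, nz) == (0, 0, 0):
            continue  # not a mode
        f = (c / 2.0) * math.sqrt(
            (nx / length) ** 2 + (ny / width) ** 2 + (nz / height) ** 2
        )
        # The number of non-zero indices decides the mode type
        nonzero = sum(1 for n in (nx, ny, nz) if n > 0)
        modes.append((f, (nx, ny, nz), kinds[nonzero]))
    return sorted(modes)

# Room dimensions as quoted in "The Room" section (metres)
for f, idx, kind in room_modes(7.0, 5.3, 2.3)[:10]:
    print(f"{f:7.2f} Hz  {idx}  {kind}")
```

With these inputs the lowest mode is the axial mode along the 7 m length, at 343 / (2 x 7) = 24.5 Hz.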

Figure 8 - Plot of all room modes for the room analysed, where blue represents axial

modes, red represents tangential modes and orange oblique modes

Calculated overall reverberation time

The RT60 can be calculated using the Sabine formula:

RT60 = (0.161 * V) / Sa

where V is the volume of the room and Sa is the total absorption over the surface area

of the room at 500Hz, since this is the industry standard. This gives an RT60 of 1.35

seconds, which is quite long for a small room, demonstrating how reflective the room is,

although at higher frequencies this will decrease as higher frequency sound energy is

absorbed more due to a higher velocity component and therefore greater frictional losses.
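
The 1.35 second figure can be checked against Table 1. The sketch below takes the areas and the 500 Hz coefficients from the table and the volume from the quoted room dimensions; the short surface labels are a reading of the partly truncated row names, and the function name sabine_rt60 is invented for the example.

```python
# (area in m^2, absorption coefficient at 500 Hz) taken from Table 1.
# The keys are assumed labels for the table rows, in the same order.
SURFACES_500HZ = {
    "carpet": (37.1, 0.15),
    "glass window": (0.19, 0.2),
    "painted block": (12.9, 0.06),
    "ceiling": (37.1, 0.04),
    "panelled walls": (39.1, 0.06),
}

def sabine_rt60(volume_m3, surfaces):
    """Sabine reverberation time: 0.161 * V divided by the total absorption."""
    total_absorption = sum(area * alpha for area, alpha in surfaces.values())
    return 0.161 * volume_m3 / total_absorption

volume = 7.0 * 5.3 * 2.3  # room dimensions quoted in "The Room", in metres
rt60 = sabine_rt60(volume, SURFACES_500HZ)
print(f"RT60 at 500 Hz = {rt60:.2f} s")  # RT60 at 500 Hz = 1.35 s
```

Swapping in another column of Table 1 gives the estimate for a different frequency band.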

The main acoustic properties of the room are, firstly, the lack of low and low-mid

frequency presence in the impulse response recorded, which can be attributed to the

resonant nature of the walls and ceiling mentioned previously, as well as being notable for

the difference between tangible early reflections and the reverberant field, as there is a

marked loss in low and low-mid frequencies across the impulse response as the balance

of sound moves from direct sound and early reflections to the reverberant field

dominating. The room has a relatively long reverberation time for a room of the size

measured. This is largely because of the low absorption coefficients of the walls, floor and

ceiling of the room. Furthermore, since it is a small room, the SPL of the reverberant field

is also quite high. As shown by Pop and Cabrera [6], this is one of the defining factors in

terms of the human perception of differences between a large and small room, since the

sound field level decreases with distance (generally -6 dB as the distance doubles). The

balance of the 3 main features of room acoustics also largely determines human

perception of room size. This is because the main methods with which the human ear can

localise sound are Interaural Intensity Difference (IID) and Interaural Time

Difference (ITD). ITD is the method through which the time difference between the arrival

of sound at each ear allows the brain to locate sound: sound arriving earlier at one ear

means the source is closer to that ear, and the extent of the difference determines the

source’s position compared to the head. Similarly IID is based on the idea that the

intensity of sound will decay with distance and therefore if the sound is less intense at

one ear than the other then this means it must have travelled further to reach it, therefore

the source must be positioned closer to the ear it arrived at with greater intensity. In the

case of direct sound, the brain will immediately be able to isolate its position based on

these methods, and similarly the early reflections of sound will be distinct in arriving from

the general direction of the source.

Figure 9 - Plot of values for C50 and D50 from 125 Hz to 8 kHz and A, C and L frequency

weightings [16]

By contrast, the reverberant field consists of sound arriving that has been reflected

multiple times and therefore will come from all directions,

so the brain cannot distinguish the source direction from this. Through this sense of

‘surrounding’ and its balance against the direct sound and early reflection level the brain

can perceive the size of a room.
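
The figure of -6 dB per doubling of distance quoted above applies to the direct sound from a small source radiating freely, where pressure falls in proportion to distance. The sketch below assumes exactly that idealised case; the function name level_change_db is made up for the example.

```python
import math

def level_change_db(r1, r2):
    """Change in SPL of the direct sound when moving from distance r1 to r2.

    Assumes a point source in a free field, so pressure is proportional to 1/r.
    """
    return -20 * math.log10(r2 / r1)

print(round(level_change_db(1.0, 2.0), 2))  # -6.02
print(round(level_change_db(1.0, 4.0), 2))  # -12.04
```

The reverberant field does not follow this rule, which is part of why the balance between direct and reverberant sound changes with position in the room.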

Clarity is the measure of the ratio of early (shorter than 50ms) to late (greater than 50ms)

sound energy, derived on the basis that late reflections (those arriving after 50ms) cause

speech to merge making it unclear, whereas those before 50ms will constructively add to

the speech intelligibility. Given that the clarity index (or C50, as seen on figure 9) across

the main range of human speech (the main harmonics of which lie between 1 kHz and 2.5

kHz depending on the fundamental frequency) is between 6.4 dB and 7.2 dB this means

that the room has high clarity, since generally the minimum required clarity for speech is

roughly 3 dB [7]. The value of C50 increases with frequency, as absorption increases with

frequency and therefore there is less reverberation, meaning that the early to late sound

ratio is higher, and therefore at low frequencies the clarity is very poor (below 3 dB) but

the human voice does not have much presence at these frequencies so it is not greatly


Definition (D50 on figure 9) is similarly the measure of early sound energy (from 0 to 50ms)

as a ratio of the total sound energy of the impulse response, expressed as a percentage. It

is used to analyse the suitability of a space for music, i.e. a higher definition means that

tones are clearer to the listener. A space does not need to have such high definition as

clarity if it is intended for music, as the same level of intelligibility is not

required as for speech. Similar to C50, the D50 value is moderately poor at low

frequencies, with values ranging from 52% to 68% from 125 to 500 Hz, and good at the

mid-high range with values around and above 82% from 1 kHz. The listener experience is

dominated by the range from 500 Hz to 4 kHz, with any D50 value below 50% considered

poor, but ideally D50 values should be above 85% [15]. However, unlike in speech, the

low frequency range is much more important for music as there are many instruments

that mainly play within the low frequency range, while the low end frequency spectrum

also lends richness of tone to many other instruments.
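
Both parameters come from the same split of the impulse response at 50 ms, which the sketch below makes explicit. It works on the full-band response with invented sample values; the values in figure 9 were obtained per frequency band with measurement software, so this is only an illustration of the definitions, and the function name is hypothetical.

```python
import math

def clarity_and_definition(impulse_response, sample_rate, split_ms=50.0):
    """Return (C50 in dB, D50 in percent).

    Assumes the samples start at the arrival of the direct sound and that
    there is some energy after the split point.
    """
    split = int(round(sample_rate * split_ms / 1000.0))
    energy = [x * x for x in impulse_response]
    early = sum(energy[:split])  # energy arriving in the first 50 ms
    late = sum(energy[split:])   # energy arriving after 50 ms
    c50 = 10 * math.log10(early / late)
    d50 = 100 * early / (early + late)
    return c50, d50

# Toy response sampled at 1 kHz: direct sound at 0 ms, one reflection at 50 ms
toy_ir = [2.0] + [0.0] * 49 + [1.0]
c50, d50 = clarity_and_definition(toy_ir, sample_rate=1000)
print(f"C50 = {c50:.2f} dB, D50 = {d50:.0f} %")  # C50 = 6.02 dB, D50 = 80 %
```

The two measures carry the same information in different forms: an early-to-late energy ratio of 4 to 1 gives about 6 dB of clarity and 80% definition.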

High values for Clarity and Definition parameters are to be expected in a small room such as

that analysed, since there are not as many late reflections (those arriving after 50 ms for

clarity and definition) to reduce the intelligibility of the input signal to the room as

in a large space. Therefore, although the space is functional as a classroom (as it was

intended to be on construction) for students to understand a lecturer, it is not pleasant or

desirable for any kind of music, largely due to the flutter echo that is present in the room.

The flutter echo is identified in the convolved ‘Maths Claps’ [8] audio (auralised clap

recording in the room) as a hollow sounding repeated reflection at a set tone. This occurs

when sound played into a room with parallel walls is reflected and trapped in a repeating

pattern between opposite parallel walls, which causes the sound received to fluctuate, or

‘flutter’. It is also more common in spaces, such as this one, where the parallel walls are

less absorptive than the floor/ceiling, as this means that oblique and tangential modes are

damped where axial modes are not and as such the axial modes are more prominent. In

terms of listener perception of this flutter echo, it is immediately identified as seeming

unnatural for the room because it stands out obviously from the natural reverberant field

level. This is also partially due to the fact that the reverberation is largely set up in the

horizontal plane in the room, and therefore the reverberant field is not diffuse.
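
The repetition in a flutter echo is set by the time sound takes to cross the room and return. The sketch below works this out for both pairs of parallel walls using the quoted room dimensions; the report does not say which pair produces the flutter, and the speed of sound of 343 m/s and the function name are assumptions.

```python
def flutter_period_ms(wall_spacing_m, c=343.0):
    """Time for sound to travel to the opposite wall and back, in milliseconds."""
    return 2.0 * wall_spacing_m / c * 1000.0

# Wall spacings taken from the room's quoted width and length (metres)
for name, spacing in (("side walls", 5.3), ("end walls", 7.0)):
    period = flutter_period_ms(spacing)
    print(f"{name}: {period:.1f} ms per round trip")
```

Round trips of this length, a few tens of milliseconds, repeat a few tens of times per second, which is consistent with the echo being heard as a pitched, hollow tone rather than as separate echoes.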


Since the room itself has satisfactory clarity and definition values, clearly any acoustic

treatment on the space to turn it into a small music venue would be focused on reducing

or removing the flutter echo and generating a frequency response of the room itself to suit

music, i.e. music is often more desirable with more prominent mid band frequencies and

flattened high and low frequency spectra.

Firstly, flutter echoes can in general be treated through the use of diffusion or absorber

panels placed on the parallel walls from which the echo arises, making sure not to

increase absorption on the floor or ceiling as it can actually increase the prominence of

the flutter echo in the room [10]. Diffusion panels on the walls can remove the mid-high

band pass filter that is associated with flutter echoes, as generally mid-high

frequency signals are much more susceptible to small unevenness in the surface of

the wall (as they have shorter wavelengths) and therefore shallow diffusion panels, with

modulation of around 1” can remove the flutter echo. Similarly, porous absorbers, as

described in the introduction to acoustic treatment, are known to do more work on high

frequency sound waves and as such absorb more high frequency sound, as well as

reducing the reverberation time and prominence of the flutter echo. In converting this

space to a small music venue, it is assumed that the performer would stand at the front of

the room furthest from the entrance and listeners would be collected facing the performer

around the rest of the room. Therefore, ideally, absorber panels would be dispersed

around each of the front 3 walls such that the likelihood of room modes occurring is

reduced, with less absorption and more diffusion in the rear of the room. This is so that

listeners are not exposed to as much of the early reflections from the performer, and more

of a diffuse reverberant field is allowed to collect in the rear of the room and circulate.

However, the front and back walls of the room (from the perspective on entry to the room)

are largely taken up by chalkboards and therefore positions of panels on these walls are

limited. To make up for this, I would place 3 absorptive panels of moderate size regularly

spaced along the right side wall of the room, with 2 on the left wall (with one in each gap

between the windows) and bass traps in the corners behind the performer (see figure 10).

Since low frequencies build at hard boundary surfaces, this means that in the corners

(meeting between two hard boundaries) bass frequencies will intensify. The bass trap is

placed at 45 degrees across this boundary (corner). Since the velocity component of the

sound wave is 0 at the corner itself and at a maximum a quarter wavelength from it,

placing a bass trap panel a quarter wavelength from the corner will allow maximum bass

absorption. On

the opposite side of the room, a diffuser panel will be placed beneath the chalkboard in

order to prevent axial modes occurring directly between the parallel walls at the ends of

the room. If diffuser panels were placed at the sides of the chalkboard, this could possibly

cause problems as it may encourage sound to reflect into the corner of the room, which

will boost bass frequencies in the room. Since a single diffuser panel only gives limited

diffusion in the rear of the room, it may also be beneficial to place a diffusion panel

centrally on the ceiling to encourage this rear room diffuse field, as well as to discourage

room modes from occurring between the floor and ceiling.
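
To give a sense of scale for the quarter-wavelength placement of the bass traps described above, the sketch below converts a few low frequencies into distances. The frequencies are chosen only for illustration, and the speed of sound of 343 m/s and the function name are assumptions.

```python
def quarter_wavelength_m(frequency_hz, c=343.0):
    """Distance from a hard boundary at which particle velocity first peaks."""
    return c / (4.0 * frequency_hz)

for f in (40, 80, 160):
    print(f"{f} Hz: {quarter_wavelength_m(f):.2f} m")
# 40 Hz: 2.14 m
# 80 Hz: 1.07 m
# 160 Hz: 0.54 m
```

The distances grow quickly as frequency falls, which is one reason why the lowest bass frequencies are the hardest to absorb in a small room.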

Cost and Effect

The specific absorber panel to be used would ideally be similar to that designed by GiK

Acoustics, the 242 Acoustic Panel [10]. These are framed panels where the absorber

within the panel is fibreglass and are designed to absorb a wide range of frequencies -

the space behind the panel allows some resonance and therefore low frequency sound

absorption. Alongside the diffusion panel, which largely treats the mid-high frequency

bandpass filter associated with the flutter echo, this allows a balanced response across

the room so that all frequencies are absorbed equally. The 600 x 600mm size would be

used, at £36.50 per panel from GiK Acoustics, so the total price of the 5

absorber panels is £182.50. The bass traps used would be similar to the Tri-Trap Corner

Bass Trap [11], preferably using GiK Acoustics' range limiter option so that it does not

affect upper range frequencies as much. These cost £109 each, so together would be

£218. The diffuser panels used would be similar to the GiK Acoustics VersiFusor [12], as,

although individually the panels will only scatter sound in one axis, by placing multiple

panels in different orientations next to each other, the panels will diffuse sound in all axes.

This panel also works over a large range of frequencies, from 600 Hz to 7.5 kHz. The panels

come in a pack of 4, costing £131 (£32.75 each) which can be used for the ceiling panel

as well as beneath the chalkboard. Therefore the total price of acoustic treatment should

be £531.50.
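
The total can be checked by adding up the items listed above. The quantities and prices are those quoted in this section (5 absorber panels, 2 bass traps and one pack of 4 diffuser panels); the item labels are shorthand invented for the example.

```python
from decimal import Decimal

# (item, quantity, unit price in pounds) as quoted in the text above
items = [
    ("absorber panel, 600 x 600mm", 5, Decimal("36.50")),
    ("corner bass trap", 2, Decimal("109.00")),
    ("diffuser panels, pack of 4", 1, Decimal("131.00")),
]

for name, qty, price in items:
    print(f"{name}: {qty} x £{price} = £{qty * price}")

total = sum(qty * price for _, qty, price in items)
print(f"Total: £{total}")  # Total: £531.50
```

Decimal is used instead of floating point so that the pence add up exactly.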

As a result of this acoustic treatment, the most obvious desired change to the

characteristics of the room is the removal of the flutter echo. However, perhaps more

significant for the room's functionality is the general improvement to the acoustics. By

introducing treatment to prevent room modes this means that unevenness in the

acoustics of the room will be largely removed, which is important for the room particularly

in its everyday use. For example, since room modes will produce both nodes and

antinodes in different positions in the room, this means that if a student is positioned at the

point of a node they will have significantly reduced reception of sound, which clearly

detracts from learning. In terms of treatment to the end of converting the room to a

small venue, the removal of room modes will reduce resonance in the room, while diffuser

panels will also lend spaciousness to the room so that music played in it will be much

more enjoyable. In a study by Mo, Wu and Horner [13] it was found that a greater

reverberant field has the effect of increasing the emotion of music, specifically when

listeners were positioned at the back of a large hall.

Figure 10 - Prospective floor plan of room with acoustic treatment implemented

This is largely the aim of the

imbalance in the acoustic treatment of the room, in having absorber panels at the front of

the room and diffuser panels in the back, as it will give a greater reverberant field and,

importantly for the listener, will simulate the more natural diffuse field that is given by large

spaces that lends this emotion to music.

Benefits of Acoustic Treatment

Clearly acoustic treatment is beneficial to this space as it widens the possible uses of the

room greatly, particularly since features of the room currently such as the mid-high band

frequency resonance of the room mean that for uses with high noise levels it will not be

workable. For example, debates and discussions where multiple people will be talking at

once would cause high levels of resonance and as a result what is a high clarity index will

be masked by the noise caused by this resonance. Acoustic treatment allows control over

the reverberation time and the specific treatment used will allow a diffuse field over the

whole room, rather than the field being dominated by the vertical and horizontal axes, and

so will have a lower SPL and therefore direct sound will be more prominent in debate

scenarios and similar. Furthermore, this ignores the fact that for the room to be used as a

teaching space/discussion area it requires a much lower acoustic quality than for a music

venue, as even if the acoustics are bad, in a teaching space this is undesirable but not

catastrophic. By contrast, the sole purpose of a music venue is for music to be heard in

detail, and therefore it is imperative that it is heard to the highest possible quality and in

the most favourable circumstances. Depending on the genre of music performed in the

venue, e.g. rock or classical, acoustic treatment is needed for different reasons. For

example, in the case of rock music, generally a gig will include the setup of drums, bass

guitar, guitar and vocals, with the bass and guitar going through amplifiers which,

combined, will have a large low frequency component. In an untreated, cuboid, small

room, axial modes at low frequencies are often the most prominent, as they have the highest

pressure component, which means that the bass frequencies will be distorted, and

this distortion will be very prominent.

Auralisation of speech & music in the room

When studying the accompanying auralisation of a bassoon recording the effect of room

reverberation is not as prominent as identified when initially recording the impulse

response (this was done by clapping in the room and seeing the response). This is largely

because the note builds up more gradually than a clap input to the room, and the

reverberation of the room is masked by the steady state tone that follows the initial note.

However, the reverberation is still noticeable in that it gives quite a hollow quality to the

bassoon tone, which is largely due to the bandpass filter effect caused by the flutter echo

in the room. Regardless of this, the tone of the bassoon is still perceived as ‘nasal’ and

‘rough’ by the human ear. According to an experiment by John Grey (1977) [14], timbre is

perceived and grouped through 3 factors:

1 - The spectral energy distribution, i.e. where energy is greatest within the harmonic

spectrum of a tone and how much it is distributed across the harmonic spectrum

2 - The synchronicity of the harmonics against each other across onset, steady-state and

offset periods

3 - The amount of high frequency, low amplitude energy during the onset phase

Various combinations of these 3 factors determine listener perceptions of how sounds

should be grouped together and how they can be described in terms of scales such as

from dull to bright. In the case of the bassoon tone, the spectral energy is distributed

widely such that the harmonics are largely distinct up to 2 kHz, as well as having a very

synchronous onset. It is important to note that the human ear will only resolve up to the

5th or 6th harmonic as separate sine waves, as above this the bandwidth of the ear's

auditory filters becomes greater than the spacing between harmonics (which equals the

fundamental frequency), and so 2 adjacent harmonics will be resolved into the same filter

in the basilar membrane. This means that when there are a

great number of harmonics distributed above the 5th or 6th, as in the case of the

bassoon, they will be perceived as ‘roughness’ in the tone as they form striations rather

than distinct harmonics when received by the human ear (demonstrated in figure 13 -

striations are shown in the form of vertical lines in the spectrogram). The synchronous

onset of the tone is shown in figure 14, where it shows that prior to the steady state, the

harmonics present in the onset phase are very weakly represented, apart from the

fundamental frequency, before the sound enters its steady state phase very quickly and

synchronously, with the full range of harmonics appearing almost simultaneously.

However, research shows that the onset phase is crucial for differentiation

between 2 tones of the same pitch, and the onset here shows that odd harmonics are

more present within the tone (in the onset phase), which causes the onset phase of the

note’s timbre to be perceived as nasal. The cause of the characteristics of the tone can

be demonstrated through comparison with a tuba tone’s (A#2) perceived timbre and

relevant spectrogram, shown in figure 12. The tuba is largely perceived as a rich, full

sound in comparison to the rough, nasal perception of the bassoon tone. By contrast to

the bassoon, figure 12 shows that the tuba has a high concentration of energy in the

lower harmonics and that this energy is not very widely distributed, and so the lower

harmonics dominate the tone giving it a richer sound.

It is noticeable, when compared with the auralisation of speech in the room, how much

more prominent reverberation of the room is over speech than music. In the same way

that when a bassoon was auralised in the room the reverberation was lessened by the

slow attack of the notes in the recording and the following steady state, the opposite is

true of speech. This is because, generally, speech is made up of a pattern of heavily

weighted syllables followed by lesser syllables. With this pattern, the strong attack

on the weighted syllables causes greater reverberation in the room, which masks the following

syllables. Specifically in the recording where there is a brief overlap of 2 voices this

significantly reduces the intelligibility of each individually, which serves to demonstrate the

need for acoustic treatment in the room as previously described. Furthermore, through

comparison of the spectrograms of the auralised bassoon tone (fig. 13) and the 3 second

clip of auralised speech (fig. 14), there is clearly a much higher intensity of energy in the

higher harmonics in the speech recording. Given the earlier explanation that more

harmonics above the 7th will increase the intensity of striations in the sound, it follows

that the perceived tone will increase in ‘roughness’ proportionally to the intensity of the

striations. The effect of the acoustics of this particular room is to magnify the sound

energy within a set bandwidth in the mid-high frequency range and, as a result, the

striations within any speech in the room will be similarly magnified to dominate the

listener’s perception of speech in the room. In addition to this, the increased presence

alone of higher harmonics in the

speech recording compared to the

bassoon tone and their respective

resultant magnification by the room

acoustics could partially explain why

(alongside previous explanations)

reverberation in the room seems much

more prominent over speech than the

recorded bassoon.

Figure 12 - Spectrogram of onset, steady state and offset of

auralised tuba A#2"