Chapter 4: ENHANCING THE EMOTIONAL IMPACT OF A TEXT THROUGH ELECTRONIC MANIPULATION.
Emotional connections between music and language
Textual manipulation of speech for emotional effect
Approaches to creating musical settings for text
Examples of electronic manipulation of speech to enhance the emotional content of text
The Vanity of Words by Roger Reynolds
The computer and composition processes used in Under An Open Minded Sky
The ISPW MAX™ patches created for Under An Open Minded Sky
Creating the performance space for Under An Open Minded Sky
Editing of the affected voice segments and of the reading, for presentation on compact disc
This chapter looks at the
process of creating a musical composition to be performed in conjunction
with a poetry reading. The result is a text setting in which the composition
and the accompanying reading share equal importance in performance. As with Someone,
discussed above in Chapter Three, the IRCAM Signal Processing Workstation (ISPW)
was used. However, there are some significant differences in both
programming techniques and philosophies between the use of the ISPW in Under
An Open Minded Sky and in Someone. In Someone a single
algorithm, made up of a number of sub-algorithms, was used to create the piece.
In Under An Open Minded Sky, however, a number of different,
purpose-specific algorithms were created in order to produce very specific results.
The entire sound track was made
up of segments of readings of the text and other vocal utterances by Felix
Nobis. These segments were adjusted in the ISPW Max program, using pitch
shifting, ring modulating, sampling and delaying algorithms. The resulting
modified voice sounds were assembled using the digital audio editing program
ProTools™ to create the final piece.
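The patches used in the piece were built graphically in ISPW Max and are not reproduced here. As a rough text-based illustration only, two of the processes named above, ring modulation and delay, can be sketched in Python. The function names, the sample rate and all parameter values are assumptions made for this sketch, not settings taken from the piece.

```python
import math

SAMPLE_RATE = 44100  # assumed sample rate in Hz; not stated in the text


def ring_modulate(signal, carrier_hz, sample_rate=SAMPLE_RATE):
    """Ring modulation: multiply each input sample by a sine-wave carrier."""
    return [
        s * math.sin(2.0 * math.pi * carrier_hz * n / sample_rate)
        for n, s in enumerate(signal)
    ]


def delay(signal, delay_samples, feedback=0.0, mix=0.5):
    """Mix the input with a delayed copy of itself.

    delay_samples: length of the delay line in samples (must be positive).
    feedback: portion of the delayed signal fed back into the delay line.
    mix: 0.0 gives only the dry input, 1.0 only the delayed signal.
    """
    if delay_samples <= 0:
        raise ValueError("delay_samples must be positive")
    line = [0.0] * delay_samples  # circular delay line
    out = []
    for n, s in enumerate(signal):
        idx = n % delay_samples
        delayed = line[idx]                 # sample written delay_samples ago
        line[idx] = s + feedback * delayed  # write input plus feedback
        out.append((1.0 - mix) * s + mix * delayed)
    return out
```

In this sketch a spoken segment would be held as a list of floating-point samples; chaining the two functions, for example `delay(ring_modulate(voice, 220.0), 11025)`, corresponds loosely to patching one process into the next.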
Under An Open Minded Sky is
a poem written by Felix Nobis specifically to be presented as part of an
electro-acoustic performance. It was commissioned for the opening of the
Hawthorn Literary Festival in 1995. The piece is designed to be performed live
with a degree of improvisation by the reader and the audio technician.
In all performances so far the
sound track has been presented on a single four-track open-reel tape, to ensure
proper coordination between the tracks. In each performance the tape
was played through a four channel mixer which allowed individual panning,
channel selection, speaker selection, volume and equalisation for each track. A
diagram of the stage design is given below. The rendition presented here is a
mix of a studio recording of the text and the soundscape used in performance.
This rendition is based on the performance given at the opening of the
Australian Computer Music Conference, Melba Hall, 1995.
The goal of this piece is to
produce a composition in which the emotional aspects of the text are reinforced
and commented on through computer-based electro-acoustic manipulation.
Words, when uttered, carry certain meanings not
related to their lexical meanings. Physical, emotional and mental states affect
the manner of utterances and therefore their perceived meaning. These effects
are usually found in the pitch, timing and spectrum of the utterance, that is, the
musical aspects of the utterance. For complex communication, differences in
pitch, timing and spectrum alone are not as useful as utterances containing
meanings agreed upon by the communicants. However, it is often the pitch, timing
and spectrum of an utterance that are most effective and most noticed when
speech is the avenue of communication. This can be tested by saying "Pass
the salt please" at the highest possible amplitude, the highest possible
pitch and over the longest possible time when next dining with friends. It is
the pitch, timing and spectrum, the inflection or intonation, of speech that
give it its emotional impact.
In the paper Emotional Patterns in Intonation and Music[1], Ivan
Fónagy and Klara Magdics describe the melodic patterns of ten different
emotions or emotional attitudes. To do this they look at the similarities and
differences between pitch contours of speech in the Hungarian, German, English
and French languages when particular emotional states are being expressed.
Their finding is that there are, in general, similarities in the intonational
properties of emotional expression between these different European cultures
and languages. These intonational properties are, however, modified by the
natural constraints of the particular language of the speaker.
Fónagy and Magdics list ten attitudes, or feelings, and
correlate them to musical gestures. These ten attitudes are summarised in the
list below.
1) Joy is signified by a wide pitch range, arhythmical
stress placement and portamentos which leap up followed by a smaller leap down,
at which level the pitch of the utterance remains.
2) Tenderness is signified by a gently undulating, high
pitch level which descends slightly at the end of the phrase. The phrasing is
legato and the articulation is soft, labial and slightly nasal.
3) Longing is signified by short, legato phrasing which
rises to the stressed syllable and then falls. The tempo and amplitude are
generally restrained, with the amplitude diminishing as the melody rises. The
voice production is generally breathy.
4) Coquetry is signified by a lively tempo and staccato
phrasing, the emphasised syllables being whispered and the final syllable
gliding up.
5) Surprise is expressed by the beginning of the phrase
being stressed and gliding up or down. The tempo is restrained and the
articulation breathy. Fónagy and Magdics do not specify the type of surprise in
their text.
6) Fear is signified by a very narrow pitch range, even,
unstressed syllables and occasionally a slight rise at the end of the phrase.
7) Complaint is signified by a 'floating' melody which
slowly descends at regular levels and by intonation which ascends.
8) Scorn is signified by an even and slightly descending
melodic line, a slow tempo and long stressed syllables.
9) Anger is signified by large leaps at the beginning of
phrases, imperfect articulation and stress on the secondary syllables.
10) Sarcasm is signified by portamento and by lengthening
of the stressed syllables.
Fónagy and Magdics took these examples from the
Hungarian language. They found that other Western languages follow similar
intonational forms; however, these forms are modified by the
natural constraints of the particular language.
Johan Sundberg quotes Sedláček and Sychra as showing
that understanding a language or culture is not necessary in identifying the
emotional state of a speaker[2], and goes
on to say that the emotional state of a speaker can be found in the frequency,
breathing pattern, amplitude and the glottal voice source; that is, how the
glottis is used in speech production.
Sundberg makes a similar list of the speech
characteristics of emotional states to that of Fónagy and Magdics:
ANGER: High
phonation frequency, almost half an octave above the normal level for neutral
speech. Some syllables are pronounced with high emphasis (increased intensity
and sudden increases in phonation frequency). The articulation is almost
excessively distinct.
FEAR: The
phonation frequency is lower compared with anger. Sudden peaks and
irregularities are seen in the phonation frequency. The articulation is more
precise than in the neutral situation.
SORROW:
Little variability in the phonation frequency. The articulation is slow and
vowels, consonants, and pauses are long; irregularities can be found in the
voice (traces of hoarseness) and the phonation frequency is almost falling
monotonously towards the end of the phrase and shows traces of tremor.
NEUTRAL:
Neutral speech is generally faster than the above mentioned states of emotion.
The consonants are often pronounced imprecisely but the vowels show a well
defined pattern with few examples of those irregularities which bear witness to
lack of voice control[3].
It is involuntary physiological actions that cause
the vocal tract to alter when in different emotional states. Singers and actors
are usually aware of these actions, intuitively or consciously, and use these
actions, again intuitively or consciously, to produce inflections or other
speech characteristics in order to evoke emotional sympathy in their audiences.
English does not rely heavily on inflection, that is, on pitch and temporal
distinctions, to convey the lexical meaning of a word. For the most part
inflection conveys emotional content or the emotional state of the speaker.
There are, of course, many exceptions such as the word "pervert", where inflection is
critical to conveying the lexical content of the word. In this case accenting
the first syllable conveys a different meaning to that conveyed if the second
syllable is accented.
Tone
languages, on the other hand, require pitch and temporal distinctions to have
meaning. Tone languages make up the majority of languages used in
communication. Thai is an example of a tone language and Table 4.1 below shows
five different meanings of the utterance [naa]; each meaning depends on
the inflection used when saying the word[4].
| Utterance | Inflection | Meaning |
| naa | with a low tone | a nickname |
| naa | with a mid tone | rice paddy |
| naa | with a high tone | young maternal uncle or aunt |
| naa | with a falling tone | face |
| naa | with a rising tone | thick |
According to Levman[5] expressive
intonation, when used in the Chinese language, is independent of the tones and
dialects of the language. When Chinese and Thai children learn language,
intonational patterns for expressive purposes are learned before correct tonal
production, then comes segmental production[6]. This is
similar to the order in which children learn non-tone languages.
The work of Fónagy and Magdics, and that of Sundberg,
show that in speech there are certain recognisable and universal attributes in
intonation which convey meaning outside of the meaning of the words spoken.
Levman proposes that this non-lexical understanding is learned before
lexical understanding, even where the same intonational attributes are required
for lexical understanding. This points to an understanding of the sound of
speech that can be used in creating an emotional effect in the listener.
In her book Music, Archetype and the Writer: A
Jungian View, Bettina Knapp[7] discusses
James Joyce's use of the emotional content of particular phonemes in his short
story Eveline. Knapp's analysis reveals the use of music as a metaphor
and an organising agent in Joyce's story. She looks upon the work as an
auditory experience, to be read either aloud or within the reader's mind[8].
The musical experience of the character Eveline is
"accentuated by [Joyce's] complex system of figures of speech and by his
use of stressed and unstressed phonemes of beguiling sonority"[9]. By using
the word "system" Knapp implies that Joyce could have used musical
techniques in the structuring of his sentences. An example of this system can
be seen in the sentence: "Eveline sat at the window watching the evening
invade the avenue." Knapp sees this phrase as employing very careful use
of alliteration: "Eveline", "evening", "invade"
and "avenue" are used to encase "window" and
"watching".
Knapp places particular meaning on phonemes. She has
contrasted the w, to which she ascribes "an airy, breezy quality, indicative
of the need to move about", with the v, to which she ascribes
opposite qualities[10]. The
structure of this phrase could be seen as an inversion of the
tonic-dominant-tonic technique used in musical composition. Here the tonic, or
place of rest or release is created by the w sound and the dominant, or
place of action or tension, is created by the v sound.
The relationships of phoneme to mood are context
based and, according to Knapp, are used by Joyce to support the mood he is
trying to create in the story: first of squalor, then of hope, dreams and
expectation, and finally a return to hopelessness, the status quo for the
protagonist Eveline. The sentence "The man out of the last house passed on
his way home; she heard his footsteps clacking along the concrete pavement and
afterwards crunching along the cinder path before the new red houses"[11] reveals
more alliteration and onomatopoeia: the sounds "clacking",
"concrete" and "crunching" contribute a harsh, aggressive
and dissatisfied atmosphere, including the reader in the oppressiveness of
Eveline's situation[12].
Knapp has other, highly subjective, suggestions for
giving emotional meaning to phonemes. These include: h as being harsh
and violent; r as being free flowing; o and oo as being
reminiscent of the past; cr as choking; d as harsh, cold and
jarring[13]. These
meanings can be attributed to the physical and physiological production of the
phonemes. For example:
h is produced by opening and closing the
glottis. When the glottis closes the air stream is abruptly cut, giving a
sense of violence; the expulsion of air can also reinforce this sense of
violence;
d is voiced and produced by the tongue
releasing an explosion of air, which could create the harsh, jarring effect
Knapp alludes to;
cr sounds like the utterance one makes when choking and therefore may momentarily
create that sensation in the listener and speaker.
These
theories are highly subjective and open to argument; however it is difficult to
deny that certain speech sounds can, and usually do, elicit particular emotions
in both listeners and speakers alike. According to Knapp, Joyce deliberately
used particular sounds to elicit particular emotions to reinforce the readers'
emotional experience of the story. This process, of using particular sounds to
elicit particular emotions, is used by composers and poets as a matter of
course in the expression of their ideas.
With
the advent of electronic processes which can manipulate sounds, composers have
explored the manipulation of the human voice as a means of manipulating the
emotional responses of their audience. Three examples are given below.
Setting
music to a text is one of the most integral and important traditions in the
composition processes in the music of all ethnicities. Text setting has roots,
and is continually used as a composition strategy, in both the academic and
folk styles of any culture. It is assumed that, by adding a musical element to
a text, that text will have its meaning enhanced and its effectiveness
increased. In doing this the text is interpreted through the filter of the
composer, who applies music as a commentary on the text.
In
his paper Text and Music: Some new directions Lawrence Kramer discusses
Christian Wolff's piece, Leaning Forward. Kramer uses this piece as an
example through which to look at the parallelisms of traditional text
setting, in particular those in which there are direct and perceivable
correlations between the text and music. According to Kramer, Leaning
Forward does not "underwrite the possibility of false textual
coherence by attaching it arbitrarily to musical coherence"[14] and
therefore, according to Kramer, Wolff questions the validity of relating the
formal aspects of a given text to the formal aspects of the accompanying music[15].
In
saying this Kramer questions the validity of text setting as a compositional
assistant. He takes the opposing position, that the musical aspect gives
coherence to the textual aspect of a text set to music. This is not the usual
approach. More often than not the text is used as the structural backbone,
mostly in a semantic sense, for a musical composition in which the text is of
paramount importance.
In
some cases, such as opera and the popular song, the text is devised in such a
way as to fit the structures of the medium; it is rare that the music is
structured around the needs of the text. For example: the popular song usually
follows an a-a'-b-a'' form with each section lasting eight measures. This requires
that the text fit the constraints of the form, reaching some kind of rhythmic
and semantic conclusion every eight measures.
Kramer
does not acknowledge this opposing approach to creating a relationship between
text and music. In his conclusion he suggests that there is a continuum in this
relationship. At one end music corresponds to the structure of the text, which
Kramer calls a text driven approach, and at the opposite end music corresponds
to an understanding of the text, which Kramer calls a reading driven approach.
These
two limits to the approaches of relating text to music appear self evident. It
is assumed that the composer will make some attempt to forge an obvious
relationship between the two media.
Peter
Stacey explores different methods of text setting in his paper Towards the
analysis of the relationship of music and text in contemporary composition.
In this exploration he offers a list of eight primary techniques commonly used
in forging a relationship between text and music, that is text setting[16].
It
is impossible for any one of these eight techniques to operate in isolation in a
particular composition: each technique will inform and affect others throughout the
composition, and the technique to which a particular musical event is assigned
may be reinterpreted as the composition unfolds. These
techniques will also be interpreted differently by different listeners.
Prior
to setting off on this exploration Stacey examines language and sees that it
can be divided into three categories:
The pragmatic; concerned with use,
The scientific; concerned with description, and
The poetic; concerned with ambiguity and symbolism[17].
Below
is Stacey's list of the eight primary techniques used for relating music and
text[18], with some
amplifications and examples of the techniques. In some instances examples from
contemporary popular songs are given, because in this genre the relationships
between the text and the music are often extremely obvious.
1)
Direct Mimesis, where the mood, image or icon, rhythm, style and so on,
of the text is reflected and reinforced in the music. Stacey sees that this can operate
at a high level, where the musical accompaniment relates to,
informs, or is informed by the whole text, for example where the
structural or rhythmic aspects of the text are reiterated in the musical
accompaniment as faithfully as possible. It can also operate at a low level, relating to a
single word or phrase where, for example, a specific melodic or rhythmic motif
is used to reinforce the meaning of that word or phrase in the text.
An
example of this can be seen in the song A Wide Open Road composed by
David McComb and performed by the Triffids[19]. The mood
of the text describes the sense of being at a loose end that comes with the end
of a romantic relationship. The repeating, chant-like melody and repeating
harmonic structure of I IV I vi ii are shown in Figure 4.1 below. By avoiding
a traditional cadence the sense of being at a loose end is reflected and reinforced.
In the example only the first line of text in the chorus is shown.

2) Displaced
Mimesis: where similar notions external to the text generate features in
the music. For example: the use of a well-known melody to create the mood of
the text or to inform the text in some way.
An
example of this technique can be seen in the chorus of Sly Stone's song Everyday
People[20], performed
by Sly and the Family Stone, the first part of which is shown in Figure 4.2
below. The text of the song is about racial and class intolerance, and the main
composition device is swapping the two main melodies over different
instruments. In the chorus the melody of a schoolyard song is used to create an
ironic reference within the text.

3)
Non-mimetic Relationship: where the musical composition and the text do
not relate at any analytical level at all. In this case any relationship
between the text and the accompanying music is drawn by the listener.
4)
Arbitrary Association: where two diverse textual and musical icons,
subjects or elements are related only because they are often heard together.
Television and radio advertising use this technique to have the audience relate
a particular melody, tune or sound to the product being sold. An example of this
is the use of Elmer Bernstein's theme music for the film The Magnificent
Seven in advertising Marlboro cigarettes.
5)
Synthetic Relationship: where the two media are so closely knit as to
create the impression of a single medium. One example of this can be seen in
early liturgical music. Here the use of elision in the voice parts, resulting
in few fast amplitude attacks breaking up the continuous flow of sound, and the
use of the subsequent reverberations produced within the cathedral, create
musical compositions in which the text and the music are blended to the point
where the denotative aspect of the text is obscured. In some opinions this heightens the
semantic aspect by subjugating it to musical needs.
Another
example can be seen in Franz Schubert's setting of Johann Goethe's Erlkönig,
in which there is an almost onomatopoeic relationship between the semantic
notions of the text and the musical accompaniment. This piece has been
described as "an ideal and very rare example of music and literature combining
to form one indivisible art work"[21].
6)
Anti-contextual Relationship: where there is a deliberate contrast of
the music and the text.
7)
Incidence and Application of the Techniques: Any aspect of the text is
mapped to a musical equivalent. Two examples of this are: using the size of
typeface to affect the rendition of a composition, as in John Cage's Sixty-two
Mesostics Re Merce Cunningham, and Guido's approach of using the vowels as
markers in melodic construction for text setting.
In
Cage's approach, the typeface style and size is changed for each letter of the
text and, although there are no specific instructions, other than that the
typeface should not be prescriptive when giving a rendition of the pieces, the
changes do influence any rendition of the pieces[22].
In
Guido's approach the placement of the vowels in the text determines the
direction and interval structure of the melody set to that specific text.
8)
Contiguity and Musical Meaning: where there are semiological
correspondences between the text and the music. An example of this is Gustav
Mahler's Des Antonius von Padua Fischpredigt (St. Anthony and the Fishes) from
Des Knaben Wunderhorn[23]. In this
piece one of Mahler's techniques is to reproduce the flowing motion of water in
his musical accompaniment. He does this by using a number of undulating
melodies in the strings and woodwinds. This corresponds to the text in
enhancing its irony.
These
approaches can be used for text setting or as ways of generating and organising
compositions. By attaching a musical item, such as a pitch or a timbre, to a
lingual item, such as a word or phoneme, it is possible to use the structure of
one to create the structure of the other. How perceivable the link is can be
decided by the composer or, possibly, interpreted by the listener.
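The idea of attaching a musical item to a lingual item can be illustrated with a minimal Python sketch in the spirit of the vowel-based approach described above. The vowel-to-pitch table (given as MIDI note numbers) and the function names are hypothetical choices made for this illustration; they are not Guido's actual assignment, nor a method used in any of the pieces discussed.

```python
# Hypothetical vowel-to-pitch table (MIDI note numbers, middle C = 60).
# The assignment is an assumption for illustration only.
VOWEL_TO_MIDI = {"a": 60, "e": 62, "i": 64, "o": 65, "u": 67}


def melody_from_text(text, table=VOWEL_TO_MIDI):
    """Return one pitch per vowel in the text, in order of appearance."""
    return [table[ch] for ch in text.lower() if ch in table]


def contour(pitches):
    """Describe the melodic direction between successive pitches."""
    steps = []
    for prev, cur in zip(pitches, pitches[1:]):
        if cur > prev:
            steps.append("up")
        elif cur < prev:
            steps.append("down")
        else:
            steps.append("same")
    return steps


pitches = melody_from_text("Silent night")  # vowels i, e, i
print(pitches)           # [64, 62, 64]
print(contour(pitches))  # ['down', 'up']
```

Here the placement of the vowels alone fixes both the direction and the interval structure of the resulting line; a different table would give the same text a different melody, which is where the composer's choice enters.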
Paul Lansky applies various
sampling, synthesis and filtering techniques to the voice of Hannah MacKay to provide
a sonic, non-textual, context for Now and Then.
[MacKay] reads several dozen phrases, typically found in
many children's stories, and all of which refer to time - hence the title of
the piece. Thus they form a kind of story with no content, merely the
chronological underpinnings of one[24].
Lansky also adds percussion-like sounds and
synthesised sounds not based on MacKay's voice.
Table
4.1 divides the sonic make up of Now and Then chronologically into nine
different sections. These sections are very broad and do not account for
possible sub-sections. The sections are based on what is perceived as separate
parts of the piece, flagged by pauses, definite changes in instrumentation,
voice placement and so on. In each section the following six attributes are
considered:
Density: the number of events perceived in each section;
Voice placement: the horizontal placement of the voice in the stereo field and
the depth placement simulated through use of reverberation;
Voice treatment: occasional doubling of the voice using a slight delay;
Instrumentation: four distinct sounds are used: female voice, percussive (sharp attack and
decay) sounds, sustained synthesizer-like sounds, and a filtered voice sound
which follows the intonation patterns of the reading;
Amplitude: while there are no dramatic changes in the overall amplitude of the piece,
subtle changes are important to its structure. Amplitude is defined around the
normal amplitude of the voice, that is, the amplitude of the normal speaking
voice is considered the middle ground when comparing the amplitude of the other
instruments;
Tessitura: while there are no dramatic changes in the overall tessitura, subtle changes in
pitch direction are important to the structure of the piece. Tessitura is
defined around the normal tessitura of the speaking voice.
This table does not attempt to
be an in-depth analysis of Now and Then; it is a simplified description
of what is heard. Importantly, it does not take the actual text into account,
because the words are mostly redundant, being synonymous repetitions
regarding placement in time. Instead it concentrates on how Lansky used
computer technology to enhance and exemplify the gist of the text, that is,
placement in time.
By
using reverberation Lansky produces a sense of physical distance from the
listener. This distance is analogous to time: in the present, "Now", the
text sounds close to the listener, and in the past, "Then", the text
sounds farther away from the listener. This effect is used most dramatically at
1'20", where the intensity of the reverberation, especially in the context
of the occasional voice doubling and left to right stereo placement that has
preceded it, produces an effect of being pulled away from the present.
Voice
doubling and stereo placement produce a sense of motion and displacement. For
example, the motion in stereo space and the voice doubling increase up to
2'34", where the voice, which has previously appeared in one moving area
of the stereo space, now envelops the listener by taking up the whole stereo
space. The slight delay caused by the doubling of the voice serves to enhance
this impression.
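The three devices described here (reverberation as distance, stereo placement, and voice doubling with a slight delay) can be sketched generically in Python. This is not a reconstruction of Lansky's processing, which the text does not document; the function names, the linear dry-to-reverberant crossfade, the constant-power panning law and the length of the doubling delay are all assumptions for illustration.

```python
import math


def distance_mix(dry, wet, distance):
    """Crossfade between a dry sample and its reverberated version.

    distance 0.0 = close to the listener ("Now"),
    distance 1.0 = far from the listener ("Then").
    The reverberated sample `wet` is assumed to be produced elsewhere.
    """
    return (1.0 - distance) * dry + distance * wet


def constant_power_pan(sample, position):
    """Place a sample in the stereo field.

    position 0.0 = hard left, 1.0 = hard right. Returns (left, right).
    """
    angle = position * math.pi / 2.0
    return sample * math.cos(angle), sample * math.sin(angle)


def double_voice(signal, delay_samples):
    """Voice doubling: the original on the left, a delayed copy on the right.

    Both channels are padded to the same length. Returns (left, right).
    """
    left = list(signal) + [0.0] * delay_samples
    right = [0.0] * delay_samples + list(signal)
    return left, right
```

With a doubling delay of a few hundred samples the two copies are heard as one widened voice rather than as an echo, which is the sense in which doubling lets the voice occupy the whole stereo space.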
The instrumentation all seems to be based on voice
sounds; for example, the synthesizer sounds are reminiscent of the vowel/formant
sounds of speech and the percussion sounds are reminiscent of the clicks that
can be made by the mouth. The arpeggio-like filter used on the voice creates a
'musical' accompaniment that follows the voice inflections very closely. These
instrumentation techniques create a context for the text that removes the voice
from its natural state for the listener.
While Robert Ashley's Improvement (Don Leaves Linda) fits
within the broad heading of electronic music, many of the compositional
processes used are traditional, that is, not reliant on electronic or computer
technology for their existence. For example, there is a definite
harmonic structure for the whole piece that appears to follow the structures
common in functional harmony. This structure has been expressed through
electronic means but could easily have been expressed through traditional
instrumentation. An example of this can be seen in the basic sonority and the
harmonic function of each version of The Airline Ticket Counter: version
one, Don at the counter, is based on the dominant chord, and the other, Linda at the
counter, is based on the tonic minor chord, first inversion.
The use of rhythm is also fairly traditional in that
the two protagonists in each version of The Airline Ticket Counter, Don
and Carla, his ticket agent, and Linda and Carlo, her ticket agent, have
different basic beat divisions. Don's beat division is crotchet triplets
against the ticket agent's division of semiquavers. The basic beat division of
both Linda and her ticket agent is semiquavers; however, the two are
differentiated in that the ticket agent's rhythm remains basically static while
Linda's rhythm, in general, slows towards the end of each phrase.
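The contrast between crotchet triplets and semiquavers can be made concrete with a little arithmetic. The sketch below lists the onset times of each division, measured in crotchet beats, and finds where they coincide. The four-beat span is an assumption chosen for illustration (the metre of the piece is not stated here), and the function name is hypothetical.

```python
from fractions import Fraction


def onsets(notes, span_beats, total_beats):
    """Onset times, in crotchet beats, of `notes` equal notes per `span_beats`."""
    step = Fraction(span_beats, notes)
    times = []
    t = Fraction(0)
    while t < total_beats:
        times.append(t)
        t += step
    return times


# Crotchet triplets: three notes in the time of two crotchets.
triplets = onsets(3, 2, 4)
# Semiquavers: four notes per crotchet.
semiquavers = onsets(4, 1, 4)

# Points at which both divisions sound together.
shared = sorted(set(triplets) & set(semiquavers))
print(len(triplets), len(semiquavers))  # 6 16
print(shared)                           # [Fraction(0, 1), Fraction(2, 1)]
```

Over four beats the two divisions meet only at beats 0 and 2, so a triplet-based line is heard as rhythmically independent of a semiquaver-based one, which is how the divisions can distinguish two speakers.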
These methods of defining characters and developing
structure can and do work in non-electronic opera and text setting; it is the
electronic musical devices that Ashley uses which enhance the action of The
Airline Ticket Counter and give it a unique character. The most obvious of
these is the implied distance between the two protagonists in each version of The
Airline Ticket Counter: the ticket agent is presented as being physically
closer to the listener than either Don or Linda, who are both treated with
quite strong artificial reverberation and, particularly in Don's case, heard at
a lower amplitude. The effect of this treatment is to reinforce the separation
between Don and Linda and their previous life together.
Reynolds describes the processes used in getting the
source sounds for The Vanity of Words thus:
Philip
Larson reads and sings a text that I extracted from Milan Kundera's novel
"The Unbearable Lightness of Being". [This composition] explores the
effect of spatially controlled differentiation on musical and speech materials
both from structural and expressive perspectives. The basic materials were
recorded as performed so that it would be possible for me to capture and then
use compositionally the interpretive volition which performers superimpose upon
musical notation's objective specifications.
[Three
sections of the text are each read] in a distinctive manner (aspirate, deeply
intoned, declamatory)[26].
Reynolds
uses spatialization techniques to "mitigate the degree to which
coincident, or nearly coincident elements obscure one another."[27] The
techniques he uses (reverberation, stereo placement and amplitude) are similar
to those used by Robert Ashley in the two versions of The Airline
Ticket Counter. Reynolds also stretches vowel sounds to create a background
for the reading.
The
difference between Ashley's and Reynolds' approaches lies in their compositional
needs. Ashley was defining two separate characters while Reynolds was defining
different textural roles.
Before discussing the
composition processes used in Under An Open Minded Sky it is
important to discuss the construction of the poem itself. The full text is
given, with the poet's page layout, in Appendix 5.
Under
An Open Minded Sky looks at the effect of war on two scales, in the family
and in the world, discussing its causes and effects. The poem describes the war
memorial in Saint James Park, Hawthorn as it stands now, using it as the
starting point for its narrative.
Within
this context the narrative gives an account of one night in the lives of the
three main characters: Sam, a teenager on a night of underage drinking with his
friend Max; Sam's mother, Valerie Maynes, an abandoned victim of spousal abuse
and Mrs December, an elderly lady who dies on this night. As well as describing
the events of the night the narrator makes the audience privy to each of the
characters' histories.
These characters give the poem a point from
which to discuss the effect of the wars Australia has been involved in, as
noted on the War Memorial in St James Park, Hawthorn. It juxtaposes this
discussion with past domestic violence and its resolution, the present joy of
masculine youth, and a final return to loved ones that occurs through death.
A
web of inter-relations is built between each of the three characters, their
past, their present and their future. This is seen in the use of similar
phrases when describing the characters and their actions and an increasing use
of elision in the narration of their actions. This elision is a tool used
extensively in the reading, both in the semantic and sonic aspects of the text
and in the intonation of the reader.
The
primary role of the soundscape, from a structural point of view, is to develop
and amplify this web of inter-relations. This is done by providing a set of
similar sounds which reflect the attitudes of each character and the events of
the night. These sounds are re-used throughout the piece, continually having
their semantic role redefined as the narrative unfolds.
The
carol Silent Night, which is not from the text, but is sung by the
reader, is used to frame and create a context for the narrative. It is woven
through the piece, returning at different points and in different guises,
creating a bridge between the poem and the audience, as does the traditional
Greek chorus. The different harmonies, tempi and phrasings of the virtual
carollers represent the burghers of Hawthorn and the young and old men of the
narrative.
By
using only the voice of the reader a physical connection between the reading
and the soundscape is created. This connection serves two purposes: first, it
gives a freedom to the reader and the audio technician to improvise in
performance, with the knowledge that there will be a constant connection
between the reader's voice and the soundscape. This improvising aspect is an
essential part of the composition. Second, as all of the sounds used have one
single source, the sense of a holistic connection between the text and the
soundscape is created for the listener. This may not be immediately obvious on
a conscious level but does have an effect on an unconscious level.
The computer generated
soundscape was recorded onto four track tape for the live performance of the
piece. This was then performed with the poet reading the work. By presenting
the soundscape on this medium its continual, linear progression is ensured,
absolving the audio technician of the need to be concerned with its structural
aspects. This makes the presentation of the soundscape similar to the poem as
it is presented on the page: an immutable, concrete object of definite order
and length, which can only be altered through presentation. By using such a
concrete medium as tape, with its noted similarity to the page, another
connection is provided between the audio technician and the reader. Thus the
audio technician is given a similar degree of freedom in the performance of the
soundscape as the reader has in the performance of the text.
This freedom affords the audio
technician a similar degree of influence over the soundscape as the reader has over
the reading, enabling similar reactive and interpretive capabilities and a
similar degree of expressive influence in the performance of the piece. These
capabilities include using changes in amplitude, timbre, placement of sound and
use of silence, all of which are regularly used by readers to give dramatic
effect to the text they are reading.
There are four main IRCAM Signal
Processing Workstation MAX™ modules used in Under An Open Minded Sky.
These modules use stereo harmonising, delay, pitch shifting and ring modulating
algorithms, each of which comes as a standard signal processing algorithm within
the ISPW MAX™ programming environment and can be found in the three help pages
available in that program.
When building the patches there
was no intention to create an integrated instrument such as that used in Someone.
Instead the intention was to build patches that affect the sound of the voice
in such a way that it is still recognisable as the voice of the reader but with
the denotative, lexical qualities of the reading either removed or
significantly reduced.
The selection of text segments
to be affected was made taking into consideration first, how the particular
segment reinforced the attitude of the particular area of the poem from which
it had been extracted; and second, how it reflected the overall attitude of the
entire poem. After a number of segments had been chosen the ISPW modules were
built to affect the voice in a way that enhances the emotional attitude of the
segment.
The parameters for each of the
modules were set intuitively, with no definite structural approach in mind
regarding pitch, stereo placement, timbre, duration, or any of the other
possible effects. Each of the affected voice segments was then
treated as a unique sound source, to be used as one may use a note in the
conventional sense of musical composition. The structure of the soundscape was
then firmly rooted in the structure of the poem, from both semantic and sonic
viewpoints, with special regard to creating the web mentioned above.
As
Under An Open Minded Sky was originally devised as a performance, rather
than as a tape piece, much consideration was given to how it should be
performed. Also, it was a commission for a special event with the performance
space well defined.
It
was decided that each character, as presented in the reading, should begin with
a unique virtual position in the aural space. This position should move during
the performance of the piece, to the point where the placement of each
character has merged within the aural space. The 'narrator', or reader, could
move easily within the space, developing a relationship with each character in
their virtual position.
To
this end it was decided to present Under An Open Minded Sky in the
round, with the audience surrounding the performance space. A diagram of the
performance space is given in Figure 4.3.

For presentation on compact disc
many aspects of the composition had to be reconsidered. The most important
aspects are: that the context of performance is removed; that the use of
improvisation is removed; that the listener can revisit the performance at
their leisure; and that extra effects can be added to both the voice and the
soundscape.
When editing Under An Open
Minded Sky for presentation on compact disc there was an attempt to
simulate the ambience of a room. This was done by adding delay, or
reverberation simulated by delay, to both the reading and the soundscape
through the temporal displacement of concurrent repetitions of the text track
and the soundscape tracks. As the implementation of these effects developed,
their value, especially in affecting the text, became apparent. The attempt to
simulate a room was abandoned and the effects were used for their own merit.
The text track was divided into
segments, each corresponding to a character, or to the mood of the character or
events being described. These phrases were then temporally displaced, usually
at Fibonacci points from the first iteration of the phrase. The seed value for
the start of the Fibonacci series used was one, two or three, resulting in
wider or closer delays being heard. An example of this process is shown in
Figure 4.4 below.

In most
cases there are eight displaced iterations of each phrase.
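The displacement scheme described above can be sketched in a few lines of Python. This is an illustration only, under stated assumptions. The text does not specify the time unit of the Fibonacci points, nor exactly how the seed enters the series. The sketch assumes a Fibonacci-type series whose first two terms both equal the seed, with the repeated first term dropped so that no two iterations coincide, and it treats each term as an offset from the first iteration of the phrase. The function names and the `unit` parameter are hypothetical and do not come from the original ProTools session.

```python
def fibonacci_offsets(seed, count=8):
    """Return `count` strictly increasing Fibonacci-type offsets.

    Assumption: the series runs seed, seed, 2*seed, 3*seed, 5*seed, ...
    and the duplicated first term is dropped, so every offset is distinct.
    """
    offsets = []
    a, b = seed, 2 * seed
    while len(offsets) < count:
        offsets.append(a)
        a, b = b, a + b
    return offsets


def displaced_start_times(phrase_start, seed, unit=1.0, count=8):
    """Start times of the displaced iterations of one phrase.

    `phrase_start` is the start of the first iteration and `unit` is the
    duration of one Fibonacci step; both are in seconds (assumed unit).
    """
    return [phrase_start + unit * k for k in fibonacci_offsets(seed, count)]


if __name__ == "__main__":
    # Larger seeds spread the eight iterations further apart.
    for seed in (1, 2, 3):
        print(seed, fibonacci_offsets(seed))
```

Under these assumptions a seed of one gives offsets of 1, 2, 3, 5, 8, 13, 21 and 34 units, while a seed of three gives 3, 6, 9, 15 and so on, which is one way of reading the "wider or closer delays" noted in the text.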
The mood of each character and
each situation is enhanced by the use of non-periodic delay. Using different
delay times on each section or phrase alters different speech sounds, for
example by stretching out the sibilances and adding chorusing or phasing
effects to the vowel spectra. These effects then affect the emotional response
in the listener in the ways suggested by Bettina Knapp and others above.
A sense of depth and breadth is
added to the soundscape by the use of periodic delay. This simulates the effect
of reverberation to some degree but does not attempt to sound like the natural
reverberation of a room. Instead it is used as another effect along with those
used in creating the soundscape.
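For comparison with the non-periodic delays used on the text, a periodic delay can be sketched as a sum of equally spaced copies of a signal. This is a minimal illustration, not the process used in the ProTools session: the function name, the sample-based delay time and the `decay` factor applied to each repeat are all assumptions, since the text gives no delay times or levels.

```python
def periodic_delay(samples, delay, repeats, decay=0.5):
    """Mix a signal with `repeats` equally spaced, quieter copies of itself.

    `samples` is a list of float sample values, `delay` is the spacing
    between copies in samples, and `decay` (an assumed parameter) is the
    gain ratio between one copy and the next.
    """
    out = [0.0] * (len(samples) + delay * repeats)
    for r in range(repeats + 1):      # r == 0 is the undelayed signal
        gain = decay ** r
        shift = r * delay
        for i, s in enumerate(samples):
            out[shift + i] += gain * s
    return out


if __name__ == "__main__":
    # A single impulse shows the regular spacing of the repeats.
    print(periodic_delay([1.0], delay=2, repeats=2))
```

Because the copies are evenly spaced, the result suggests the regular reflections of a space without modelling any particular room, which matches the intention stated above.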
Panning, with different degrees
of vigour and motion from left to right, was also added. The soundscape tracks
were treated in the same fashion, but without the panning effect being added. This
allows those tracks to maintain the stereo placements used on the original four
track tape.
A
number of versions of Under An Open Minded Sky were made. The final
version for presentation is track one of Compact Disc Three. Excerpts of two
other versions are included, for reference, as tracks two and three on Compact
Disc Three. Both of these examples show the development of the process of
adding delay to the text and soundscape tracks. They also explore the
relationship between the reading and the soundscape, especially regarding which
of the two is more prominent in the mix.
Excerpt
one, track two, has the text almost fully obscured by the soundscape. Here the
voice becomes another aspect of an almost entirely musical, as opposed to
textual and musical, composition. The rhythm of the reading becomes more
pronounced as the lexical aspect is obscured, adding to the music of the piece
rather than placing a non-musical aspect on top of it. In this version delay is
used on the speech track to dull the textual nature of the piece and enhance
the musical nature.
Excerpt
two, track three, takes the opposite approach. In this version the text is most
prominent, taking the listener's attention and giving the soundscape a more
supportive, secondary, role. While this version is appropriate for many
situations it does not support the idea of a true integration of the two media,
text and music, nor does it highlight the use of particular voice sounds in
reinforcing or enhancing the emotional content of the text.
The
final version, track one on Compact Disc Three, is presented in its entirety.
This version takes aspects of the other two versions to make a piece in which
the text and soundscape are fully integrated. Here the text can be discerned
from the soundscape without being the focal point. Its rhythms add to the musical
nature of the piece, punctuating the soundscape with a more diverse set of
transient envelopes and timbres.
Because the text acts as a
continuous point of reference for the listener, Under An Open Minded Sky
is more easily reconciled with the traditions of combining music and text than
either ZOOMING IN or Someone. For this reason, evaluating the
piece within the traditions of text setting is an easy path to take; however,
it is not the most appropriate path when considering how effective the piece is
in discussing, and then illustrating, the notion that the emotional impact of a
text can be enhanced through electronic manipulation.
The delay effects placed on the
text track produce different senses of proximity, creating moods ranging from
intimacy to discomfort, while the panning effect creates a sense of
disorientation. This movement from intimacy to discomfort and the sense of
disorientation reflects the moods of the characters and the text. The
soundscape also has these attributes, primarily providing a sense of place and
enhancing the intentions for the text.
The ideas of Knapp, Fónagy and
Magdics, and Sundberg are explored and used successfully, especially in the
soundscape. For example, in the discussion of valour between approximately
8'40" and 10 minutes of the piece, the high frequency portamentos and the
increasing intensity and repetition of the word "why" reflect
Sundberg's description of speech behaviour when the emotion of anger is being
expressed.
The approaches to sound
manipulation of Lansky, Ashley and Reynolds are also reflected in Under An
Open Minded Sky. The delay used on the text reading simulates the
reverberation used in Now and Then and The Vanity of Words to
place the text in time. Panning also is used to place the characters and to
give them a sense of motion through the piece. Ashley's use of amplitude and
panning to place different characters can be seen in the placement of Sam and
Max in the stereo field and the use of amplitude for distance placement.
[1] Ivan Fónagy and
Klara Magdics, 'Emotional Patterns in Intonation and Music.' Intonation.
Ed. Dwight Bolinger. Ringwood Victoria, Penguin, 1972, pp. 286-312.
[2] Johan Sundberg,
'Speech, Song, and Emotions.' Music, Mind and Brain. Ed. Manfred Clynes.
New York, Plenum Press, 1982, p. 138.
[3] ibid., p. 143.
[4] Victoria
Fromkin, Robert Rodman and Peter Collins, et al. An Introduction to Language.
Sydney, Holt, Rinehart and Winston, 1990, p. 85.
[5] Bryan G Levman,
'The Genesis of Music and Language.' Ethnomusicology 36.2, Spring/Summer
1992, p. 152.
[6] ibid., p. 161.
[7] Bettina L.
Knapp. Music, Archetype and the Writer: A Jungian View. Pennsylvania,
The Pennsylvania State University Press, 1988.
[8] The discussion
of Knapp's work given here expands on previous work submitted for the Graduate
Diploma of Music Technology in my dissertation titled Interrelating the Use
of Computers in Music and Language, 1992, pp. 17-19.
[9] Knapp, op.
cit., p. 95.
[10] ibid., p. 96.
[11] James Joyce, The
Dubliners. Ed. Robert Scholes. New York, Penguin, 1983, p. 35.
[12] Knapp, op. cit.,
p. 97.
[13] ibid., pp.
100-103.
[14] Lawrence
Kramer, 'Text and Music: some new directions.' Contemporary Music Review.
5, 1989, p. 145.
[15] This discussion
of Kramer's work expands on previous work submitted for the Graduate Diploma of
Music Technology, La Trobe University, in my dissertation titled Interrelating
the Use of Computers in Music and Language, 1992, p. 19.
[16] The discussion
of Stacey's work given here expands on previous work submitted for the
Graduate Diploma of Music Technology, La Trobe University, in my dissertation
titled Interrelating the Use of Computers in Music and Language, 1992,
pp. 16-17.
[17] Peter F.
Stacey, 'Towards the analysis of the relationship of music and text in
contemporary composition.' Contemporary Music Review 5, 1989, p. 17.
[18] ibid., p. 22.
[19] David McComb,
'Wide Open Road.' Stockholm. Mushroom Records International, 1989.
[20] S.Stone.
'Everyday People.' Greatest Hits: Sly Stone and the Family Stone. Sony
Music Entertainment, 1992.
[21] Stephen Davies,
Musical Meaning and Expression. London: Cornell University Press, 1994,
p. 117.
[22] John Cage, 'Sixty-two
Mesostics re Merce Cunningham for Voice Unaccompanied using Microphone.'
Henmar Press Inc, 1971.
[23] Gustav Mahler,
'Des Antonius von Padua Fischpredigt (St. Anthony and the Fishes).' Des
Knaben Wunderhorn. Vanguard Classics, 1991.
[24] Paul Lansky,
'Now and Then.' Homebrew. Bridge Records, Inc., 1992, Liner notes.
[25] Robert Ashley,
'The Airline Ticket Counter.' Improvement (Don Leaves Linda). Elektra
Entertainment, 1992, Liner notes.
[26] Roger Reynolds, 'The Vanity of Words.' Computer
Music Currents 4. Wergo, 1986, Liner notes, p. 20.
[27] ibid., p. 20.