
Chapter 3: COMPUTER MANIPULATIONS OF A DIGITIZED AUDIO PERFORMANCE OF A POETRY READING.

Aspects of text reading and music performance

Electro-acoustic examples of speech manipulation

Direct voice: Come Out

Altered voice: I am sitting in a room

Enhanced voice: Smalltalk and Late August

Composition of the piece Someone

The computer and composition processes used in Someone

The algorithms used for Someone and their functions

Sound file playback

Granulation process

Glissandi process

Speed of the different glissandi used in Part One

FFT process

Comb filters

Panning

Construction of the Nine Parts

EVALUATION

Chapter 3

COMPUTER MANIPULATIONS OF A DIGITIZED AUDIO PERFORMANCE OF A POETRY READING

                This chapter looks at the process of deriving a musical composition from a poetry reading. To this end, a recording of the poem Saint Dymphna's Bells, read by its author Barry Dickins, is manipulated using algorithms built with the IRCAM Signal Processing Workstation (ISPW) to produce the musical work Someone (compact discs Three, Four, Five, Six and Seven). The whole poem and a ten-second segment from the beginning of the poem are repeated and adjusted in several ways to produce an installation of indeterminate duration.

                The installation is made up of nine separate parts of different durations, each using similar techniques in different ways; the parts and the techniques used to produce them are discussed below in the section titled The computer and composition processes used in Someone.

Aspects of text reading and music performance

                In the paper Music and speech performance: Parallels and contrasts, Rolf Carlson and others put forward that:

in speech many different prosodic factors are mixed together as one single acoustic parameter. [For example] segmental inherent pitch, word tone, sentence type, lexical stress, emphasis etc. can all be signalled in one single parameter, such as the voice fundamental frequency. In the same way, the duration of speech sounds is affected by a variety of conditions including stress, position in the utterance, and local phonetic context.

                The same applies to music. There are many different reasons to lengthen or shorten a note beyond its nominal duration [For example] emphasis, marking of phrase endings, and sharpening the contrast between categories[1].

                Lengthening or shortening note durations is just one of the many tools available to musicians when interpreting music. Other tools include varying the pitch, amplitude and/or timbre of a note or phrase. These tools allow each musician to imbue a composition with their own expressiveness.

                The degree to which a musician can interpret a composition depends on the idiom in which he or she is playing. For example, jazz musicians are expected to interpret and extemporise on a pre-existing melody while remaining idiomatically correct. The degree and palette of variations available in this idiom are quite broad, depending on the subset of the jazz idiom in which they are playing.

                On the other hand, non-improvising musicians, such as classical orchestral musicians and classical music soloists, are expected to interpret with a smaller palette. Here the art of interpretation is far more critical. A musician playing as part of an orchestra is expected to subordinate their own interpretation of a composition to that of the conductor, who is in turn subject to the composer's act of self-expression, that is, the composition.

                Table 3.1 lists some of the variables that are available to musicians and speakers in adding a degree of self-expression when performing a text or composition. Under each heading three possible variables are listed; there are, of course, other options available to the performing speaker or musician.

Table 3.1 Variables in music and speech performance and composition.

| Possible Speech Variables | Possible Musical Variables |
| --- | --- |
| 3.1.1 Amplitude variation: a) varying the emphasis placed on certain syllables within a word or phrase; b) wholly increasing or decreasing the amplitude of a word or phrase; c) varying the amplitude within a syllable. | Amplitude variation: a) varying the emphasis placed on certain notes within a section or phrase; b) wholly increasing or decreasing the amplitude of a note or phrase; c) varying the amplitude within a note. |
| 3.1.2 Pitch variation: a) varying the pitch of certain syllables within a word or phrase; b) wholly raising or lowering the pitch of a word or phrase; c) varying the pitch within a syllable. | Pitch variation: a) varying the pitch of certain notes within a section or phrase; b) wholly raising or lowering the pitch of a section or phrase; c) varying the pitch within a note. |
| 3.1.3 Rhythmic variation: a) varying the inter-onset time between certain syllables within a word or phrase; b) wholly increasing or decreasing the duration of a word or phrase; c) varying the inter-onset times or durations of syllables. | Rhythmic variation: a) varying the inter-onset time between certain notes within a section or phrase; b) wholly increasing or decreasing the duration of a note, section or phrase; c) varying the inter-onset times or durations of notes. |
| 3.1.4 Timbre variation: a) varying the timbre of certain syllables within a word, or words within a phrase; b) wholly changing the timbre of a word or phrase; c) varying the timbre of a syllable. | Timbre variation: a) varying the timbre of certain notes within a section or phrase; b) wholly changing the timbre of a note or phrase; c) varying the timbre of a note. |
| 3.1.5 Lexical variation: varying certain words or certain syllables within a word or phrase. | Melodic variation: varying the role of certain notes within a section or phrase. |

                For the most part these variations are generated intuitively by the performer. The performer's intuitions are rooted in cultural knowledge of the effect of intonation on the listener. This is well observed when listening to Eberhard Blum's performance of John Cage's Sixty-two mesostics re Merce Cunningham.[2] Here the text reading is manipulated in all of the ways given in Table 3.1 and more. Blum uses the text as a vehicle for a wide variety of vocal expressions. Phones, syllables and larger groups of vocal sounds are stretched, bent, constricted and, in general, distorted from their normal use in the English language so as to become a hybrid language existing somewhere between music and English.

Electro-acoustic examples of speech manipulation

                The manipulation of vocal sounds via electronic media has been practised since recording technology became available. Early practitioners, such as Henri Chopin, used the analogue audio tape domain with striking results. As digital electronics became available the palette became broader and composers were able to enhance and alter the vocal, speech or textual input in a wider variety of ways. This can be seen in the more contemporary works of Paul Lansky and Roger Reynolds, among others.

                The use of speech in electro-acoustic composition can be divided into three categories: direct voice, altered voice and enhanced voice. I have chosen works by three composers, each of which exemplifies one of these categories. The discussion of the pieces below serves to give a background to my approach to using voice and is not intended as a proper or definitive analysis of the pieces. The pieces discussed here are: Come Out[3], by Steve Reich; I am sitting in a room[4], by Alvin Lucier; and Smalltalk and Late August,[5] by Paul Lansky.

                In each of these pieces the composer creates an environment, through electro-acoustic media, in which the text mutates without too much guidance from the composer, with the possible exception of Smalltalk and Late August. By reducing the input of the composer during the composition of the piece the changes that occur are created by either the text or the environment used in the recording. This means that what the listener hears is not so much driven by the taste of the composer as by the text itself.

                These works are precedents to the pieces presented here and exemplify the processes of electro-acoustic composition used for my dissertation. For each piece I list important changes as they occur in the form of a timeline.

Direct voice: Come Out

                Steve Reich's Come Out uses an analogue tape recording of a man saying the sentence "I had to let the bruise blood come out to show them". His technique is to overlay repetitions of the words "come out to show them" in such a way that the layers move in and out of phase with themselves. This results in shifting rhythmic patterns which draw the listener's attention away from the lexical meaning of the words and towards the interplay of sonic patterns found within the words. Reich describes the piece thus:

The phrase 'come out to show them' was recorded on both channels, first in unison then channel 2 slowly beginning to move ahead. As the phase begins to shift, a gradually increasing reverberation is heard which slowly passes into a sort of canon or round. Eventually the two voices divide into four and then eight[6].

The main structural and driving element of the piece is the rhythmic counterpoint between the voices. As this counterpoint progresses, through the perceived addition of more iterations, the text is increasingly obscured until it becomes unintelligible as text. Figure 3.1 is a rough transcription of the rhythm and melodic contour of the main motif, "Come out to show them".

Figure 3.1 Melodic structure of Come Out.

                This rhythmic motif fits very comfortably into use as a hocket, which could be the main reason for Reich's use of it.

                As the piece progresses the three distinct sections become apparent, as Reich describes. In the first section we hear the voice gradually gain spatial depth through perceived, not actual, reverberation, then lose its textual characteristics in favour of musical characteristics as repetitions of the voice increase, or "divide". This "division" is the most important compositional process used in Come Out. The entire acoustic signal used for Come Out is made up of simply one, two, four or eight iterations of the phrase "Come out to show them". These iterations operate as a very finely displaced hocket, creating illusions of traditional signal processing devices even though signal processing plays no part in the composition.
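
                The phasing that Reich describes can be illustrated numerically. The sketch below is not a reconstruction of Reich's tape procedure: the loop length, the speed ratio between the two channels and the lag values are hypothetical figures chosen only for illustration. It shows how a small, constant speed difference makes channel 2 lead channel 1 by a steadily growing amount, and how mixing a texture with a displaced copy of itself doubles the number of voices from one to two, four and eight.

```python
def lead_seconds(elapsed_s, speed_ratio):
    """How far the faster loop has moved ahead of the slower one after elapsed_s seconds."""
    return elapsed_s * (speed_ratio - 1.0)


def voice_offsets(lags):
    """Start offsets (in seconds) of every voice after each doubling.

    Each lag in `lags` mixes the existing texture with a copy of itself
    displaced by that lag, so the number of voices doubles every time.
    """
    offsets = [0.0]
    for lag in lags:
        offsets = offsets + [offset + lag for offset in offsets]
    return sorted(offsets)


# Hypothetical values, not measured from the recording.
LOOP_S = 1.3          # assumed length of the looped phrase
SPEED_RATIO = 1.002   # assumed speed of channel 2 relative to channel 1

for elapsed in (0, 30, 60, 120):
    lead = lead_seconds(elapsed, SPEED_RATIO)
    print(f"{elapsed:>4} s: channel 2 leads by {lead * 1000:.0f} ms "
          f"({lead / LOOP_S:.1%} of the loop)")

for lags in ([], [0.12], [0.12, 0.06], [0.12, 0.06, 0.03]):
    print(len(voice_offsets(lags)), "voice(s)")
```

                Under these assumptions the lead grows linearly with time, which is one way to picture a texture that begins as a colouring of a single voice and only later separates into distinct voices.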

                Apart from choosing the sound source, setting up the tape machines and then switching them on, Reich's only other compositional input to Come Out was to decide when the piece should "gradually [pass] into a canon or round for two voices, then four voices, and finally eight"[7].

                When listening to one side of the stereo recording the doubling of the voice from one to two to four and finally eight is striking, but is obscured when listening in stereo. Table 3.2 shows an approximate timeline of the perceived changes as Come Out progresses; these changes are what the ear is drawn to over the duration of the piece.

Table 3.2 Timeline of perceived changes in Come Out.

| Time | Perceived changes |
| --- | --- |
| | Single voice |
| 0" | Complete phrase "I had to let the bruise blood come out to show them" repeated 3 times; the phasing effect is not used. |
| 21" | Two voices are now heard. |
| 21" | "Come out to show them" phrase begins and phasing effect begins. |
| | Slight shifts in stereo placement of the voice. |
| | Flanging slowly transferring into a delay. |
| | Depth is added through the phasing technique simulating 'reverberation'. |
| 1' 45" | Two voices appear, but their role as distinct voices is not apparent. |
| 1' 50" | "sh" sound becomes prominent. |
| 2' 0" | The two voices become distinct as two voices. |
| 2' 19" | The tempo and beat division of the phrase becomes ambiguous, seeming to slightly slow down and speed up regularly. |
| 2' 59" | Four voices are now heard. |
| 3' 0" | Placement of voice moves in stereo field. |
| 3' 20" | "Come out" and "show them" become two distinct phrases. |
| 3' 55" | The whole piece begins to be heard in a more 'reverberant' space. |
| 4' 30" | Text becomes hard to distinguish, gradually losing its meaning. |
| 4' 71" | 'Reverberation' becomes an important part of the overall composition. |
| 5' 6" | The text becomes less intelligible and more like a musical sound source. |
| 6' 0" | Two similar rhythmic motifs appear: come-a-come out and show-de-show them, forming the most obvious hocket. |
| 6' 57" | The two motifs move in stereo space. |
| 7' 10" | "sh" sound becomes prominent again. |
| 7' 20" | The rhythmic motifs fracture. |
| 8' 0" | Each phone has a rhythmic pattern of its own, the piece uses repetition to build intensity and its hocket nature becomes less of a driving force. |
| 8' 37" | Eight voices are now heard. |
| 8' 40" | Each phone, particularly the voiced and vowel phones of the text, glissandos downward. |
| 9' 10" | The downward glissandi begin to sound more scale-like. |
| 11' 0" | The piece sounds more like a pulsing timbre than a succession of musical events; the effect of the phasing has reached its peak and a very gradual fade begins. |
| 12' 54" | Piece ends. |

                This descriptive, foreground analysis of Come Out shows that Reich's process exhibits features associated with more conventional musical composition processes, namely the use of motivic and phrase repetition and variation.

                While there are the obvious repetitions of the text, other aspects of the sonic palette are also repeated. 'Reverberation', or proximity to the listener, is the foreground feature at 3' 55", 4' 17"; the perception of different numbers of voices at 2' 0", 3' 20", 6' 0"; motion in stereo space at 21" and 6' 57"; the "sh" sound at 1' 50" and 7' 10".

                According to Richard Boulanger "an important aspect of Reich's Come Out is that the natural declamation of the text is preserved yet the speech undergoes a unique and significant transformation"[8]. By taking this approach Reich has maintained the integrity of the text and its reading and in doing so produced music which has evolved out of speech.

                Reich's repetition of a short phrase in Come Out influenced the composition of the installation Someone, which is presented here. In Come Out shifting repetitions of a phrase eventually obscure the textual meaning of the phrase, and mutate the phrase from language into music. This method of transformation by repetition of a single phrase is used and extended in Someone. In Someone there are many more iterations of the phrase and the phrase is repeated in many different ways. Its speed, its pitch, its stereo placement and the number of phrases being repeated at the same time all vary greatly when compared to Come Out, where there are only two iterations of the phrase being repeated, no pitch or speed variations, and each phrase is panned hard left or hard right.

                Throughout Someone the text is heard in part and, in some sections, in full. The repetitions are heard in and out of phase with one another, just as the repetitions of the text are heard in and then out of phase in Come Out. This process serves to spread the text over the listening area both spatially and temporally: segments are heard in one area and then repeated in another, or a series of segments is heard concurrently, depending on the listener's position. Segments may follow in the order of the text or be reordered so as to lose the intended flow of the text. Segments may also play in close temporal proximity to each other, creating an effect similar to that of Come Out, depending on the placement of the listener.

Altered voice: I am sitting in a room

                This piece is built on an analogue tape recording of Alvin Lucier describing what he is doing and why. The process Lucier used was:

to record his voice onto one tape player, play it back on another tape player through a loudspeaker and record that rendition onto the first tape player[9].

This process was carried out nine times for the performance of I am sitting in a room, as it is presented in Source: music of the avant garde. The recording was done in Lucier's living room.

                The purpose of Lucier's composition is described in the composition itself. Lucier uses this description as the underlying and driving element of the composition. It is what he says into the first tape recorder, creating the seed of the composition, and is given below.

I am sitting in a room different from the one you are in now. I am recording the sound of my speaking voice and I am going to play it back into the room again and again until the resonant frequencies of the room reinforce themselves so that any semblance of my speech, with perhaps the exception of rhythm, is destroyed. What you will hear, then, are the natural resonant frequencies of the room articulated by speech. I regard this activity not so much as a demonstration of a physical fact, but, more as a way to smooth out any irregularities my speech might have[10].
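
                The reinforcement of resonant frequencies that Lucier describes can be sketched as repeated filtering. The model below is a deliberate simplification and not a description of Lucier's equipment or room: the room is reduced to a hypothetical gain per frequency band, every pass through the loudspeaker and room multiplies each band by that gain, and the result is rescaled so that its strongest band stays at full level, as if the recording level were reset each time. All band values are invented for illustration.

```python
def rerecord(spectrum, room_gain, passes):
    """Play the signal back into the room `passes` times.

    Each pass multiplies every band by the room's gain for that band and
    then rescales so the strongest band sits at 1.0.
    """
    current = list(spectrum)
    for _ in range(passes):
        current = [level * gain for level, gain in zip(current, room_gain)]
        peak = max(current)
        current = [level / peak for level in current]
    return current


# Hypothetical band levels: a flat "speech" spectrum and an uneven room.
speech = [1.0, 1.0, 1.0, 1.0]
room = [0.6, 1.0, 0.8, 0.9]   # band 1 is the assumed strongest resonance

for passes in (0, 1, 4, 8):
    levels = rerecord(speech, room, passes)
    print(passes, [round(level, 3) for level in levels])
```

                In this model a band whose gain is g relative to the strongest band is left at g to the power n after n passes, so even a mildly uneven room comes to dominate the signal as the passes accumulate.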

                As each further rendition is heard the sense of the text disappears. The cascading effect of the resonances first causes distinct pitches to be heard; these form motifs, which then metamorphose into a musical composition. Table 3.3 below looks in more detail at each section, taking each rendition of the text as a section. References to pitches are approximate.

Table 3.3 Timeline of perceived changes in I am sitting in a room.

| Rendition | Perceived changes |
| --- | --- |
| Rendition 1 | Normal text reading is recorded. |
| Rendition 2 | Reverberation of the room becomes apparent, and the mid-range frequencies of the voice are accentuated. The sound of Lucier's sibilance is also accentuated. |
| Rendition 3 | Room reverberation increases, creating an impression of distance; it is now a distinctive part of the sound of the voice. A single pitch is now heard, triggered by the voice, which creates a harmony for the reading. |
| Rendition 4 | The background noise becomes a feature in the piece. The text is becoming obscured but is intelligible. More pitches are heard, still obviously triggered by the voice, which create a melodic motif, or "accompaniment", around the voice. |
| Rendition 5 | Text is barely intelligible; whatever remaining intelligibility there is may be a result of previous exposure to the text. The intonational pitch changes of the reading now more obviously affect the melodic pitch changes of the "accompaniment". |
| Rendition 6 | The background noise is now a prominent feature and appears to have a number of distinct pitches, forming a non-tempered cluster around C# 5. (F# and C# seem to be the most resonant pitches of the room.) The "accompaniment" is now the main feature of the piece; the text serves as part of the timbre of the "notes" of the "accompaniment". |
| | From this point on it becomes difficult to separate the sections one from the other. Between sections six and seven there is a tape glitch or break which defines the beginning of the new section. |
| Rendition 7 | The background noise now has two distinct pitches which it glides between regularly. The text is completely obscured but the rhythm of the reading continues to drive the piece. Three distinct parts now run simultaneously: the background noise, the "accompaniment" and an adjunct to the "accompaniment", which follows it but seems to have a different motif. |
| Rendition 8 and Rendition 9 | From here there is a general smoothing of the overall sound of the piece. The text has degraded to the point of being unintelligible and it is now difficult to hear even the intonational aspects, which are now obscured as the piece takes on all the aspects of music and loses all aspects of speech. |

                Alvin Lucier uses repetition as the main structural element in I am sitting in a room. He uses the many and changing resonances that are produced within a room by the intonational aspects of his text reading to create music from speech. This transformation of speech into music using resonance is reflected in Someone. Here forty artificial rooms have been created for the whole text and text segment to resonate in. As in I am sitting in a room, the pitch and duration of each resonance are affected by the intonational aspects of the reading, creating melodies and harmonies which transform the text reading into a musical piece. In both Someone and I am sitting in a room the resonances created by the voice are amplified through repetition.

                The difference in Someone is in the characteristics of the artificial rooms. The dimensions of each room change as the intonational aspects (rhythm, timbre, pitch and amplitude) of the reading change, thus changing the qualities of each resonance and its effect on the reading. By doing this the reading becomes the only causal aspect in the piece. In I am sitting in a room the reading and the room have equal roles in creating the piece; in Someone the reading has become the only agent in creating the piece.

                I am sitting in a room influences Someone in two ways: first, in the repetition of a long phrase (in the case of Someone the whole poem is repeated); and second, in the use of the intonational aspects of the text reading as the trigger to alter the sound of both the whole poem and the segment of the poem. This process is discussed later in this chapter under the heading "The computer and composition processes used in Someone."

Enhanced voice: Smalltalk and Late August

                Paul Lansky's Smalltalk and Late August also alter the voice so that it becomes unintelligible in a lexical sense. According to Lansky:

Conversation [has the] ability to change its nature when one no longer concentrates on the meaning of the words. [What is heard is the] intonations, rhythms and contours of the speech.[11]

                Smalltalk is based on a recording of a domestic conversation between Lansky and his wife, Hannah MacKay. The recording was treated on a DEC Micro Vax II running software Lansky wrote for the project. This resulted in obscuring "the words we spoke while capturing the rhythms, pitches and contours of our conversation"[12].

                Late August resulted from Lansky wondering

what would happen if I tried the same sort of thing with another language, say Chinese, in which pitch and contour have different meanings. [The result] is similar to Smalltalk on the surface, but quite different in substance the sound world of the Chinese language led to a very different kind of music[13].

                While the music may be of a different substance and kind, this surface similarity can make distinguishing the two pieces difficult, especially on the first few hearings. This may be due to Lansky's use of conversational text rather than highly structured and stylised text, such as a poem. His desire to create tonal centres that do not appear to be based in the pitch field of the voices also obscures the difference between the substance of Smalltalk and Late August.

                Table 3.4 describes the perceived changes over time in both Smalltalk and Late August. In this description of the changes over time in both pieces it is not essential that any of the textual foreground be mentioned. This aspect of the pieces is subject to and obscured by the effects that Lansky applies to it. To include a more detailed description of the foreground melody or harmonic changes would also obscure the description of Lansky's large scale structural composition of the pieces.

Table 3.4 Timeline comparing perceived changes in Smalltalk and Late August.

Smalltalk

Late August

Time

Perceived changes

Time

Perceived changes

0'0"

Melodic foreground and middle tessitura.

0'0"

Melodic foreground with high, wide tessitura. The accompaniment is based on aspirant-like sounds.

38"

Faint single note accompaniment background. This accompaniment is based on back vowel-like sounds[14].

 

 

1'13"

Accompaniment background becomes lower in pitch.

 

 

1'22"

Return to original accompaniment pitch.

 

 

1'35"

Accompaniment becomes more dense.

 

 

1'45"

Accompaniment background an octave lower.

 

 

2'07"

Complete change in background, moves to a different scale/key.

 

 

2'35"

Background accompaniment increases in activity.

 

 

 

 

 

 

 

 

3'02"

Large change in accompaniment.

3'29"

Return to original background accompaniment harmony/scale.

 

 

3'54"

Accompaniment shift.