Chapter 2: USING THE PHONEMIC STRUCTURE OF A POEM AS THE BASIS FOR MUSICAL COMPOSITION
The note and phoneme relationship
Using the rules of a language in poetry
Similarities and differences between phonemes and notes
Two approaches to phoneme and note relationships, Nattiez and Wishart
Creating a musical composition from the phonemic structure of a poem
Process 1: Translation of the text to phoneme symbols
Process 2: Phoneme symbol to number conversion
Process 3: Table lookup of note data using an ID number
Process 4: Creating the source cantus firmus
Process 5: Cantus firmus to core melody modification
Process 6: Composing a polyphonic texture from the modified melody
This chapter looks at the
process taken in deriving musical compositions based on the phonemic string of
the poem ZOOMING IN, by Alex Skovron. The phonemic string is translated
into MIDI information which is manipulated through algorithms built in Opcode's
MAX environment. The processes shown here can be used with any text; ZOOMING
IN was chosen for illustrative purposes only.
ZOOMING IN ©Alex
Skovron 1991
dot \ sphere \ planet \ continent \
coastline \ shadow \ city \ chess \
block \ building \ window \ chamber \
doorway \ table \ figure \ face \
eye \ iris \ pupil \ blood \
cell \ atom \ darkness \ dot
face \ forehead \ bone \ blood \
matter \ motion \ energy \ bliss \
energy / hatred \ terror \ darkness \
energy / light \ bliss \ dot
Skovron describes ZOOMING IN as a sonatina. It
is part of a body of works using the same structural framework.
The focus here is on using the surface structure of a textual expression, in this case a poem, as the basis for musical composition. That is, the rules governing the relationships of the phonemes used in a poem are applied to create the background structure from which a set of musical compositions is generated. To this end I look at the sonic aspects of music and language that can be represented with a notation, in this case phonemes and notes.
The
notation of phones is highly developed as a method for describing speech
sounds. A subset of phones called phonemes defines the more significant aspects
of the phonetic pool of a given language. This subset is used as a basis for
analysing the relationships between speech sounds in a given language,
discovering the rules that underlie the language and then, if wanted,
generating new phoneme groups that fit within the given language.
The
notation of music is also highly developed as a method for describing the
sounds to be produced by a musical instrument. Because musical notation is not
instrument specific (that is, the same note can be played on a variety of
musical instruments), it does not give much information about the actual sound
to be produced. By contrast, phonetic notation is instrument specific, in that
it refers only to the voice. Musical notation can, however, be used as a tool
in analysing the relationships of pitches within musical compositions. It is in
the area of analysis that the two notation methods share common ground.
The
sound of the poem can be quite musical. This is obvious when the listener
either disregards the meaning of the text or listens to a poem in an unknown
language. The sonic structure of the text is an integral part of what the poet
is expressing to the listener and the poet's specialised attention to this is
essential, I believe, in order for the text to fit within the realm of poetry.
However, the poet's attention to the sonic structure is often obscured by the
listener’s focus on the lexical meaning of the text. This attention is seen in
the poet's use of alliterations, rhymes and rhythms to enhance the intended
meaning of the text. It is these three aspects which give the musical element
to poetry.
Each
language has a finite set of rules regarding the relations of the sounds within
it and these rules must be adhered to in order for the language to communicate
effectively; therefore the poet is forced to use alliterations, rhymes and
rhythms within the language being used. This means that the sonic structure of
the poem, an important expressive device, is bound by the lexical meaning of
the text. It can be said, when considering only structural aspects, that a
composer of music has an easier job in that he or she does not necessarily have
to contend with reconciling two methods of communication, the lexical and the
sonic.
These
boundaries do not affect areas of textual work such as concrete poetry, where
the emphasis is on the appearance of the text in enhancing the lexical meaning,
or sound poetry, where the emphasis is on the sounds of the text rather than
the lexical meaning.
The
uses of notes and phonemes are, for the most part, quite different. Phonemes
are a set of universally accepted and understood symbols used to describe the
sounds of a language as it is spoken. These symbols are used to describe a
language for analysis. In the act of musical composition the note is often used
as a starting point of the composition from which the instrumentalist generates
the music. This process can be seen in the composition of serial music.
Most
instrumental music can be described as an ordered succession of discrete
sounds. To the casual listener, hearing music as a coherent set of musical
phrases, this may not be immediately obvious. However these phrases are made up
of discrete sounds, each of which can be observed through the use of a visual
representation, such as a score; or, in the case of electronically realised
music, a frequency or amplitude against time graph. Both these forms of
representation, score and graph, come notionally after the fact.
Representing
something in a medium other than its own creates an abstract and symbolic
representation. In this case a purely aural form of expression is represented in
a purely visual way. This creates a continuum from an abstract, conceptual
point, where the object exists only as a description, to an actual point, where
the object exists as a finite, describable, entity.
Notated
music has an interesting place in the abstract to actual continuum. The set of
symbols used to describe a composition can also be used to generate it. This
can be seen in serial composition, where the tone row and its subsequent music
may grow from an abstract, non-musical concept, such as translating the birth
dates of the composer's children into interval sets. Another approach could be
where the composer observes interesting visual relationships between notes
written on a stave, which he or she then develops into a musical composition.
In these two cases the concept of the composition is developed from and
dependent on the symbols used to represent it. On the other hand it is possible
for the composer to notate a melody that springs whimsically from their
imagination and then develops into a serial composition.
Jean-Jacques
Nattiez describes Pierre Schaeffer's "morphology of sound objects … [as
being] founded on a descriptive inventory of their characteristics"[1]. This inventory
is similar to the inventory of descriptions used in phonetics, in that the
inventory comes after the fact. Nattiez discusses this need for a well defined
set of graphic symbols, similar to the symbols used in phonetics, as being
essential for the musical analysis of non-scored music. For anyone attempting
to analyse electronic/computer music a set of universally understood
descriptive symbols would be useful.
Later
Nattiez describes the phoneme and note as "discretized" units rather
than discrete units[2]. This is
true in that each phoneme is removed from its context and has very little
meaning on its own, just as a note has very little meaning when removed from
the context of a piece of music.
Trevor
Wishart draws relationships between the characteristics of music and language,
saying that:
the melodic stream is
pitch-disjunct and may be articulated by timbrel colouration. [And that the]
language stream is timbre-disjunct and may be articulated by pitch
inflections[3].
He goes on to compare the timbre fields of languages
with the harmonic fields of music, equating the chromatic field of tonal,
tempered instruments with the phoneme field of a single language and the
harmonic field of non-tempered instruments with the phoneme field of other
languages[4].
Wishart
offers a brief analysis of an extract of Kurt Schwitters' Ursonata[5], which can
be described as a sound poem or, by looking at the structure of the text in a
musical sense, as a monophonic sonata. In his analysis he looks at the shifting
use of vowel and consonantal "resonances". After the analysis Wishart
goes on to say:
All this differs from our
perception of field characteristics in pitch-lattice music (apart from the
obvious pitch stream/timbre stream distinction) in a number of ways which are,
however, not intrinsic to text sound composition. First of all there is no
counterpoint or chorusing. Secondly, there is no indication of rhythm (which
might however be implied from the printed spacing) or tempo. Adding these, and
other, dimensions we can imagine a sophisticated contrapuntal art based on the
articulation of a multi-levelled timbre or timbre-motif (possibly phonemic)
field structure[6].
Nattiez
and Wishart's approaches deal with the sonic relationships between the phoneme,
as the "discretized" unit of language, and the note, as the
"discretized" unit of music. In doing this they create a continuum of
articulated sounds or utterances in which language, as it is heard, and music,
as it is heard, exist. This continuum is shown in Figure 2.1.

The relationships drawn by Nattiez and Wishart are to do with the audible surfaces of music and language. This is a very tangible relationship and treats the gamut of articulated sounds as a possible palette for musical or text-based sonic composition. While increasing the possible sound palette, neither Nattiez nor Wishart moves into the broader area of the possible composition tools made available by the note/phoneme relationship; one of these tools is discussed below.
The
idea that music and language have useful commonalities at the structural level
has been of interest to composers and analysts over the history of music. One
of the early uses of language structure as an aid in the composition of music
was applied by Guido of Arezzo[7] in the
early eleventh century. Here Guido used a part of the phonemic structure of a text
as a structural device when composing plainsong. He supposed that if the text
itself was well constructed it followed that music built on that text would
also be well constructed. Guido’s system was to attach a number of pitches to
each vowel and then to select from the pitches available to that vowel when it is
encountered in the text. These vowel and pitch assignments are shown in Table 2.1.
| vowels: | a | e | i | o | u |
| pitches: | G | A | B | C | D |
| | E | F | G | A | B |
| | C | D | E | F | G |
| | A | | | | |
While
Guido's process does not account for rhythm, amplitude or timbre it is a very practical
approach for the writing of plainsong. The style of plainsong composition is
rooted in the voice and defines the vocal idiom in Western music, therefore
attending only to the vowel sounds of the text is appropriate as these are the
sounds most useable in singing.
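The vowel lookup in Table 2.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `GUIDO_TABLE` and `guido_melody` are hypothetical, the pitch letters are copied from Table 2.1 without octave information, and the random choice stands in for a decision that Guido's method leaves to the composer.

```python
import random

# Pitches available to each vowel, read column by column from Table 2.1.
GUIDO_TABLE = {
    "a": ["G", "E", "C", "A"],
    "e": ["A", "F", "D"],
    "i": ["B", "G", "E"],
    "o": ["C", "A", "F"],
    "u": ["D", "B", "G"],
}


def guido_melody(text, rng=None):
    """Return one pitch per vowel in the text; consonants are ignored.

    Choosing at random is an assumption made for this sketch only.
    """
    rng = rng or random.Random()
    return [rng.choice(GUIDO_TABLE[ch]) for ch in text.lower() if ch in GUIDO_TABLE]
```

For example, `guido_melody("dot sphere planet")` returns five pitches, one for each of the vowels o, e, e, a and e.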
The
approach I am taking here expands on Guido’s method by including the aspects of
rhythm, amplitude and timbre. This is done by applying a unique MIDI[9] note-event
to each unique phoneme used in the poem ZOOMING IN. For example:
when the phoneme /d/ is encountered in the poem it triggers the MIDI note
number 23 with a MIDI velocity of 12, a duration of 28 units of time and an
inter-onset time of 41 units of time. A unit of time can be anything from a
millisecond to however long the composer sees as appropriate for his or her
composition.
These note-events are created using the
computer algorithm "ROWMAKER". How these note-event attributes are
generated is explained under Process 3 below. The duration and inter-onset
times can be multiplied to fit a range more acceptable to the compositional
requirements of the piece.
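A note-event of this kind can be modelled as a small record with four fields. The sketch below is an assumption-laden illustration rather than the MAX implementation: the class name `NoteEvent` and the method `scaled` are hypothetical, and the values for /d/ are the ones quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NoteEvent:
    pitch: int        # MIDI note number
    velocity: int     # MIDI velocity
    duration: int     # length of the note, in time units
    inter_onset: int  # time until the next attack, in time units

    def scaled(self, time_factor):
        """Multiply both time values, leaving pitch and velocity untouched."""
        return NoteEvent(
            self.pitch,
            self.velocity,
            self.duration * time_factor,
            self.inter_onset * time_factor,
        )


# The note-event attached to the phoneme /d/ in the text above.
D_EVENT = NoteEvent(pitch=23, velocity=12, duration=28, inter_onset=41)
```

With a time unit of one millisecond, `D_EVENT.scaled(10)` would give a duration of 280 ms and an inter-onset time of 410 ms.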
By
using this process the phoneme string of the poem forms a single line melody,
or cantus firmus, which is used as the structural core of the compositions.
This cantus firmus is filtered through, and enhanced by, two computer
algorithms, called "COMPOSE" and "CANON", to produce the
final compositions.
Table
2.2 below shows the processes that are used in the left column, and the reasons
and the methods for using these processes in the right column. Each section of
the table is explained in greater detail further on in the chapter.
| PROCESSES | METHODS |
| Process 1: Translation of the text to phoneme symbols. | A set of symbols using keyboard characters available in any computer is used here. |
| Process 2: Phoneme symbol to number conversion. | Each phoneme symbol is assigned a unique ID number. The benefit of using ID numbers is that they are more suitable for manipulation by computers. They can be assigned to any type of event, such as a graphic event or a sample playback, not just a note-event. |
| Process 3: Table lookup of note data using ID numbers. | The ID number is used to select a unique pitch, velocity, duration and inter-onset time from a table, thus creating a unique note-event for each phoneme character. The pitches, velocities, durations and inter-onset times are created using an algorithm called ROWMAKER. |
| Process 4: Creating the source cantus firmus. | The list of phoneme characters is stepped through in the order in which they appear in the poem, triggering their accompanying labels which in turn trigger each note-event. This succession of note-events produces the cantus firmus. It is this core melody which is used as the structural basis for the compositions. |
| Process 5: Cantus firmus to core melody modification. | The cantus firmus is fed into the COMPOSE algorithm. Here the composer applies constraints to the pitch, velocity, duration and inter-onset time to produce a modified melody. |
| Process 6: Composing a polyphonic texture from the modified melody. | The modified melody is fed into the CANON algorithm, which creates four other related voices. Adding a counterpoint creates a harmonic and rhythmic context for the melody. |
| Process 7: Playback. | The melody and the other related voices are sent to a MIDI instrument for playback. |
The translation of text to phoneme symbols used here is designed to make converting speech sounds into symbols that are easily used in a computer MIDI environment as simple as possible. I use the set of phones given by Alfred Blatter[10] as the set of speech sounds. The symbols I chose follow the standard International Phonetic Alphabet (IPA), as given in Blatter, as far as is practical. Where I deviate from the IPA I have tried to use symbols which relate in some way to the speech sounds heard.
The reasons for choosing this set of characters instead of a set of standard phonetic characters, such as the IPA, are that:
(1) no additional software is required;
(2) a minimum number of keystrokes is required for each character. For example: the key strokes required for the symbol /ä/, representing the /ir/ sound in bird, are option-shift-R; and
(3) the ASCII[11] numbers for each keyboard character can be easily translated into MIDI/computer information. For example: the ASCII number for /ä/ is 228.
Many
decisions must be made when transcribing a written text into symbols that
represent the sounds of the text. The transcription used here reflects the way
I speak the English language (with an Australian accent), and the way the poem
forces me to speak that language. The poet's use of alliteration, and the
positioning of the text on the page suggests to me a steady, rhythmical reading
style. My choice of phonemes is based on this style.
The alliterations, and positioning style, can be seen in the first and second lines:
dot \ sphere \ planet \ continent \
The words dot, planet and continent are similar sounding: they each begin and end in stop plosives[12], and the vowels and the nasals are mostly produced in the front of the mouth. Sphere has none of these attributes: it is made up of continuant fricatives and the vowels are produced in the back of the mouth.
The second line:
coastline \ shadow \ city \ chess \
uses only three plosives, shifting the predominant consonantal sound from stop plosives to fricatives, and the predominant movement in the mouth transfers from front to back.
A
translation of the poem into the phoneme symbols used here is shown on the
following page and a full list of each phoneme, its sound and its symbol is
given further below in Table 2.3.
dot \ sphere \ planet \ continent \
dot---sfEr---planet---kontinent---
coastline \ shadow \ city \ chess \
kOstlIn---SadO---siti---Ces---
block \ building \ window \ chamber \
blok---bildiN---windO---CAmbP---
doorway \ table \ figure \ face \
dHwA---tAbl---figP---fAs---
eye \ iris \ pupil \ blood \
I---Iris---pUpil---blud---
cell \ atom \ darkness \ dot
sel---atom---dRknes---dot---
face \ forehead \ bone \ blood \
fAs---fHhed---bOn---blud---
matter \ motion \ energy \ bliss \
matP---mOSn---enPji---blis---
energy / hatred \ terror \ darkness \
enPji---hAtred---terP---dRknes---
energy / light \ bliss \ dot
enPji---lIt---blis---dot---
The three dashes between each translated word refer
to the whitespace backslash whitespace used in the poem's layout.
This conversion is done by simply selecting an upper or lower case alphabet character for each phoneme and subtracting 53 from the ASCII number of that character. The benefits of subtracting 53 from the ASCII number are:
(1) it puts the labels in the lower end of the MIDI spectrum, allowing the labels to be used as MIDI information in their own right; and
(2) the lower number makes the labels easier to manipulate when ordering them for playback. For example: the symbol A (ASCII number 65) is attached to the label 12, which means it can be used as MIDI note C.
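The conversion just described amounts to one subtraction per character. The following Python sketch shows it; the names `ID_OFFSET`, `symbol_to_id` and `ids_for` are hypothetical and are not taken from the MAX patches.

```python
ID_OFFSET = 53  # the constant subtracted from each ASCII number


def symbol_to_id(symbol):
    """Return the ID number for a single phoneme symbol."""
    if len(symbol) != 1:
        raise ValueError("expected exactly one character")
    return ord(symbol) - ID_OFFSET


def ids_for(transcription):
    """Convert a transcribed word, such as 'dot', to a list of ID numbers."""
    return [symbol_to_id(ch) for ch in transcription]
```

Under this rule the symbol A becomes 12 and the word "dot" becomes 47, 58, 63. A space character (ASCII 32) becomes -21, which agrees with the ID number given for white space in Table 2.4.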
The symbol, ASCII number and identification number attached to each phoneme are shown in Table 2.3 below.
| Phoneme sound | Symbol | ASCII# | ID# | Phoneme sound | Symbol | ASCII# | ID# |
| FRONT VOWELS | | | | NASALS | | | |
| ee - seed | E* | 69 | 16 | m - mow | m | 109 | 56 |
| i - slid | i | 105 | 52 | n - no | n | 110 | 57 |
| a - spade | A* | 65 | 12 | ng - sing | N* | 78 | 25 |
| e - sled | e | 101 | 48 | DIPHTHONGS | | | |
| a - had | a* | 97 | 44 | o - no | x* | 120 | 67 |
| a - lamb | L* | 76 | 23 | ou - pound | W* | 87 | 34 |
| STOP PLOSIVES | | | | ai - pail | B* | 66 | 13 |
| t - to | t | 116 | 63 | i - pile | I* | 73 | 20 |
| p - pat | p | 112 | 59 | oy - toy | Y* | 89 | 36 |
| d - do | d | 100 | 47 | SEMI VOWELS | | | |
| b - bat | b | 98 | 45 | w - witch | w | 119 | 66 |
| g - gone | g | 103 | 50 | wh - which | M* | 77 | 24 |
| c - cast | k | 107 | 54 | y - you | y | 121 | 68 |
| BACK VOWELS | | | | l - law | l | 108 | 55 |
| a - palm | R* | 82 | 29 | r - raw | r | 114 | 61 |
| o - hot | o* | 111 | 58 | CONTINUANT FRICATIVES | | | |
| aw - paw | H* | 72 | 19 | f - file | f | 102 | 49 |
| oo - look | K* | 75 | 22 | v - five | v | 118 | 65 |
| oo - boot | U* | 85 | 32 | th - thy | T* | 84 | 31 |
| o - float | O* | 79 | 26 | th - bath | F* | 70 | 17 |
| | | | | s - sue | s | 115 | 62 |
| CENTRAL VOWELS | | | | h - hat | h | 104 | 51 |
| ir - bird | D* | 68 | 15 | ss - mission | Z* | 90 | 37 |
| er - brother | P* | 80 | 27 | s - vision | J* | 74 | 21 |
| u - mud | u* | 117 | 64 | z - zip | z | 122 | 69 |
* These symbols are not used by the IPA.
The speech sound /Q/ is used in the process shown
here, not the traditional two phoneme symbols /k/ and /w/.
The note-event table was created using a MAX[13] patch called "ROWMAKER". "ROWMAKER" generates a random set of n unique note-events.
Each note-event consists of four aspects:
(1) a unique pitch, as opposed to pitch class; for example, the pitch C2 is distinct from the pitch C3;
(2) a unique velocity, the velocity of a MIDI keyboard key when struck (MIDI velocities 0 to 127 are used);
(3) a unique duration, the time length of a note; and
(4) a unique inter-onset time, the time length between note attacks.
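The MAX patch itself is not reproduced in this chapter, so the following Python sketch only shows one plausible way to generate such a table. It assumes that each of the four attributes is drawn without repetition from a fixed range; the default range of 0 to 69 is a guess chosen for the example, and the name `make_rows` is hypothetical.

```python
import random


def make_rows(ids, value_range=range(70), rng=None):
    """Return {id: (pitch, velocity, duration, inter_onset)}.

    Each attribute column is sampled without replacement, so no two IDs
    share a pitch, a velocity, a duration or an inter-onset time.
    """
    rng = rng or random.Random()
    ids = list(ids)
    # Four independent samples, one per attribute column.
    columns = [rng.sample(value_range, len(ids)) for _ in range(4)]
    return {i: (p, v, d, t) for i, p, v, d, t in zip(ids, *columns)}
```

Calling `make_rows(range(12, 70))` produces one note-event for each ID number from 12 to 69. Passing a seeded `random.Random` makes a given table reproducible.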
In Table 2.4 below many ID numbers and their note-events are not attached to phoneme symbols. These ID numbers and note-events are not used in the pieces presented here. The negative numbers -21, -200 and -200 are applied to the ID number, pitch and velocity used for white space and punctuation to ensure that the equivalent of a rest is generated when punctuation or white space is encountered in the text.
| Phoneme symbol | ID number | Pitch | Velocity | Duration units | Inter-onset time units |
| A | 12 | 22 | 1 | 66 | 53 |
| B | 13 | 3 | 41 | 17 | 52 |
| C | 14 | 64 | 30 | 4 | 22 |
| | 15 | 17 | 39 | 8 | 40 |
| E | 16 | 61 | 33 | 16 | 34 |
| F | 17 | 11 | 26 | 25 | 29 |
| | 18 | 54 | 6 | 48 | 36 |
| H | 19 | 20 | 16 | 23 | 17 |
| I | 20 | 45 | 2 | 62 | 21 |
| J | 21 | 16 | 65 | 47 | 27 |
| | 22 | 56 | 47 | 54 | 58 |
| L | 23 | 52 | 9 | 19 | 11 |
| M | 24 | 15 | 56 | 20 | 9 |
| N | 25 | 69 | 13 | 50 | 39 |
| O | 26 | 27 | 62 | 52 | 30 |
| P | 27 | 31 | 45 | 69 | 32 |
| | 28 | 66 | 10 | 33 | 55 |
| R | 29 | 26 | 52 | 55 | 42 |
| S | 30 | 49 | 49 | 6 | 33 |
| T | 31 | 44 | 40 | 38 | 6 |
| U | 32 | 47 | 19 | 21 | 13 |
| | 33 | 21 | 29 | 39 | 20 |
| W | 34 | 29 | 42 | 34 | 31 |
| | 35 | 25 | 38 | 26 | 2 |
| Y | 36 | 59 | 58 | 61 | 43 |
| Z | 37 | 48 | 11 | 41 | 62 |
| | 38 | 13 | 57 | 7 | 18 |
| | 39 | 12 | 18 | 58 | 24 |
| | 40 | 36 | 35 | 35 | 26 |
| | 41 | 38 | 46 | 24 | 51 |
| | 42 | 58 | 63 | 56 | 4 |
| | 43 | 1 | 14 | 2 | 37 |
| a | 44 | 43 | 43 | 10 | 38 |
| b | 45 | 51 | 55 | 15 | 68 |
| c | 46 | 5 | 25 | 51 | 0 |
| d | 47 | 23 | 12 | 28 | 41 |
| e | 48 | 67 | 0 | 3 | 67 |
| f | 49 | 63 | 5 | 13 | 44 |
| g | 50 | 6 | 7 | 45 | 60 |
| h | 51 | 35 | 61 | 1 | 1 |
| i | 52 | 41 | 69 | 42 | 61 |
| j | 53 | 46 | 48 | 0 | 3 |
| k | 54 | 8 | 66 | 14 | 49 |
| l | 55 | 50 | 17 | 44 | 23 |
| m | 56 | 30 | 8 | 59 | 16 |
| n | 57 | 0 | 36 | 60 | 28 |
| o | 58 | 18 | 23 | 11 | 12 |
| p | 59 | 9 | 44 | 31 | 5 |
| | 60 | 4 | 50 | 46 | 50 |
| r | 61 | 62 | 59 | 64 | 59 |
| s | 62 | 10 | 68 | 9 | 48 |
| t | 63 | 2 | 31 | 32 | 63 |
| u | 64 | 65 | 34 | 57 | 56 |
| v | 65 | 33 | 53 | 37 | 65 |
| w | 66 | 7 | 64 | 18 | 7 |
| x | 67 | 40 | 3 | 63 | 15 |
| y | 68 | 32 | 67 | 30 | 46 |
| z | 69 | 55 | 20 | 22 | 69 |
| Punctuation and white space | -21 | -200 | -200 | 10 | 10 |
To
create the source cantus firmus a phoneme ID number is attached to each
note-event in the table of note-events shown in Table 2.4.
This
composition process borrows heavily from middle to late twentieth century
serial processes in that aspects of each note are pre-ordained and immutable.
However, the pieces do not follow the serial method of cycling through a series
of note-events. Instead, the note-events which make up the source single line
melody are selected from the pool of note-events shown in Table 2.4. As each
phoneme ID is encountered the note-event attached to that ID number is
triggered. In this way a single line melody, referred to here as the cantus
firmus, is created.
Table
2.5 shows the order in which each phoneme symbol, and therefore each ID number
and note-event is triggered. This table lists the first two lines of the poem;
the entire list can be found in Appendix 2. A space of three rests falls
between each word of the poem regardless of line breaks. Lines 175 to 180,
shown in Appendix 2, refer to the break between stanzas and this is represented
by six rests.
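The lookup that produces the cantus firmus can be sketched as follows. This is a hedged illustration, not the MAX patch: `NOTE_TABLE` holds only the seven rows of Table 2.4 needed for the first two words of the poem, each dash in the transcription is treated as one rest, and the names `NOTE_TABLE`, `REST` and `cantus_firmus` are hypothetical.

```python
# (ID number, pitch, velocity, duration, inter-onset time), copied from Table 2.4.
NOTE_TABLE = {
    "d": (47, 23, 12, 28, 41),
    "o": (58, 18, 23, 11, 12),
    "t": (63, 2, 31, 32, 63),
    "s": (62, 10, 68, 9, 48),
    "f": (49, 63, 5, 13, 44),
    "E": (16, 61, 33, 16, 34),
    "r": (61, 62, 59, 64, 59),
}

# The values Table 2.4 gives for punctuation and white space.
REST = (-21, -200, -200, 10, 10)


def cantus_firmus(transcription):
    """Step through the transcription in order, one note-event per symbol."""
    events = []
    for symbol in transcription:
        events.append(REST if symbol == "-" else NOTE_TABLE[symbol])
    return events
```

Calling `cantus_firmus("dot---sfEr---")` gives thirteen events: three notes, three rests, four notes and three rests, in the order listed for those two words in Table 2.5.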
| Event order | Phoneme symbol | ID number | Pitch | Velocity | Duration | Inter-onset time |
| 1 | d (dot) | 47 | 23 | 12 | 28 | 41 |
| 2 | o | 58 | 18 | 23 | 11 | 12 |
| 3 | t | 63 | 2 | 31 | 32 | 63 |
| 4 | rest | | | | 10 | 10 |
| 5 | rest | | | | 10 | 10 |
| 6 | rest | | | | 10 | 10 |
| 7 | s (sphere) | 62 | 10 | 68 | 9 | 48 |
| 8 | f | 49 | 63 | 5 | 13 | 44 |
| 9 | E | 16 | 61 | 33 | 16 | 34 |
| 10 | r | 61 | 62 | 59 | 64 | 59 |
| 11 | rest | | | | 10 | 10 |
| 12 | rest | | | | 10 | 10 |
| 13 | rest | | | | 10 | 10 |
| 14 | p (planet) | 59 | 9 | 44 | 31 | 5 |
| 15 | l | 55 | 50 | 17 | 44 | 23 |
| 16 | a | 44 | 43 | 43 | 10 | 38 |
| 17 | n | 57 | 0 | 36 | 60 | 28 |
| 18 | e | 48 | 67 | 0 | 3 | 67 |
| 19 | t | 63 | 2 | 31 | 32 | 63 |
| 20 | rest | | | | 10 | 10 |
| 21 | rest | | | | 10 | 10 |
| 22 | rest | | | | 10 | 10 |
| 23 | k (continent) | 54 | 8 | 66 | 14 | 49 |
| 24 | o | 58 | 18 | 23 | 11 | 12 |
| 25 | n | 57 | 0 | 36 | 60 | 28 |
| 26 | t | 63 | 2 | 31 | 32 | 63 |
| 27 | i | 52 | 41 | 69 | 42 | 61 |
| 28 | n | 57 | 0 | 36 | 60 | 28 |
| 29 | e | 48 | 67 | 0 | 3 | 67 |
| 30 | n | 57 | 0 | 36 | 60 | 28 |
| 31 | t | 63 | 2 | 31 | 32 | 63 |
| 32 | rest | | | | 10 | 10 |
| 33 | rest | | | | 10 | 10 |
| 34 | rest | | | | 10 | 10 |
| 35 | k (coastline) | 54 | 8 | 66 | 14 | 49 |
| 36 | O | 26 | 27 | 62 | 52 | 30 |
| 37 | s | 62 | 10 | 68 | 9 | 48 |
| 38 | t | 63 | 2 | 31 | 32 | 63 |
| 39 | l | 55 | 50 | 17 | 44 | 23 |
| 40 | I | 20 | 45 | 2 | 62 | 21 |
| 41 | n | 57 | 0 | 36 | 60 | 28 |
| 42 | rest | | | | 10 | 10 |
| 43 | rest | | | | 10 | 10 |
| 44 | rest | | | | 10 | 10 |
| 45 | S (shadow) | 30 | 49 | 49 | 6 | 33 |
| 46 | a | 44 | 43 | 43 | 10 | 38 |
| 47 | d | 47 | 23 | 12 | 28 | 41 |
| 48 | O | 26 | 27 | 62 | 52 | 30 |
| 49 | rest | | | | 10 | 10 |
| 50 | rest | | | | 10 | 10 |
| 51 | rest | | | | 10 | 10 |
| 52 | s (city) | 62 | 10 | 68 | 9 | 48 |
| 53 | i | 52 | 41 | 69 | 42 | 61 |
| 54 | t | 63 | 2 | 31 | 32 | 63 |
| 55 | i | 52 | 41 | 69 | 42 | 61 |
| 56 | rest | | | | 10 | 10 |
| 57 | rest | | | | 10 | 10 |
| 58 | rest | | | | 10 | 10 |
| 59 | C (chess) | 14 | 64 | 30 | 4 | 22 |
| 60 | e | 48 | 67 | 0 | 3 | 67 |
| 61 | s | 62 | 10 | 68 | 9 | 48 |
| 62 | rest | | | | 10 | 10 |
| 63 | rest | | | | 10 | 10 |
| 64 | rest | | | | 10 | 10 |
The resulting cantus firmus is passed through the algorithm "COMPOSE". Here each attribute of each note-event (the pitch, velocity, duration and inter-onset time) is multiplied by a constant, reduced by a modulus, and then shifted by adding or subtracting a constant. The following paragraph gives an example in which only the pitch attribute is affected:
Imposing a multiplication of 3, a modulus of 12 and a transposition of 60 on the pitch attribute of the cantus firmus results in pitch numbers between MIDI note 60 and MIDI note 71, regardless of the pitch numbers sent into the algorithm. Because a multiple of 3 reduced modulo 12 can only be 0, 3, 6 or 9, the resulting pitch numbers are: 60, 63, 66 and 69. This process is shown in Table 2.6.
| Original pitch | Multiplication | Result | Modulus | Result | Addition | Resulting pitch |
| 23 | 3 | 69 | 12 | 9 | 60 | 69 |
| 64 | 3 | 192 | 12 | 0 | 60 | 60 |
| 85 | 3 | 255 | 12 | 3 | 60 | 63 |
| 46 | 3 | 138 | 12 | 6 | 60 | 66 |
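The arithmetic in Table 2.6 can be written as a single expression. The Python sketch below reproduces only that arithmetic; the function name `compose_value` and its parameter names are hypothetical, and the handling of rests (pitch -200) is left out.

```python
def compose_value(value, multiplier=3, modulus=12, offset=60):
    """Multiply, reduce by the modulus, then add the offset (transposition)."""
    return (value * multiplier) % modulus + offset


# The four original pitches listed in Table 2.6.
table_2_6 = [compose_value(p) for p in (23, 64, 85, 46)]
```

Here `table_2_6` holds 69, 60, 63 and 66, matching the last column of the table. The same function could be applied to velocity, duration and inter-onset time with other settings.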
Using Table 2.4, the speech sound /d/ is assigned the ID number 47; the pitch 23; the velocity 12; the duration 28 time units; and the inter-onset time 41 time units. The inter-onset time and the duration can be multiplied to create longer time lengths. The same types of adjustments as those imposed on the pitches may be imposed on the other attributes of the cantus firmus. By doing this a melody based on the cantus firmus is produced. Appendix 3 shows the interface for the COMPOSE and CANON algorithms and the process for triggering events.
The resulting modified melody then passes through an algorithm called "CANON", which creates an arpeggio based on the modified melody. The processes this algorithm uses belong to two main types: firstly, defining the pitch interval between selected note-events; and secondly, defining the inter-onset time between selected note-events. Here the pitch interval process types are listed, beginning with 6.1.1 through to 6.1.7.
6.1.1: periodic selection of note-events from the modified melody; for example, selecting every third note-event;
6.1.2: defining the pitch intervals between those selected note-events; for example, if the first selected note is MIDI note 60 and the next selected note is MIDI note 72 then the pitch interval is 12 (72 - 60); if the second note is instead MIDI note 48 then the pitch interval is -12 (48 - 60);
6.1.3: multiply the interval numbers by a floating point number or a whole number; for example, an interval of -12 multiplied by 0.25 gives an interval of -3, or if -12 is multiplied by 2 the interval is -24;
6.1.4: transpose the resulting interval numbers up or down by adding or subtracting a number from the resulting interval number; for example, if a transposition level of 24 is made, by adding 24 to the interval numbers, then all the interval numbers are raised by 24;
6.1.5: store the final interval numbers;
6.1.6: apply the four intervals to the next selected pitch; for example, if the next selected pitch is 60 and the intervals are -3, 3, 6 and 12, then a chord made up of the MIDI notes 57 (A), 60 (C), 63 (D#), 66 (F#) and 72 (C) is produced, with 60 (C) being the pitch around which the chord is built;
6.1.7: output the chord pitches at the times specified by the inter-onset time selection process. This creates arpeggios or chords depending on the multiplication of the note-event inter-onset times.
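Steps 6.1.2 to 6.1.6 can be sketched in Python as three small functions. This is an illustration under assumptions, not the CANON patch: all function names are hypothetical, and rounding a non-integer interval to the nearest whole number is a choice made for this sketch, since the text does not say how fractions are handled.

```python
def melodic_intervals(selected_pitches):
    """Step 6.1.2: signed intervals between successive selected pitches."""
    return [b - a for a, b in zip(selected_pitches, selected_pitches[1:])]


def transform_interval(interval, multiplier=1.0, transposition=0):
    """Steps 6.1.3 and 6.1.4: scale the interval, then transpose it."""
    # Rounding is an assumption of this sketch.
    return round(interval * multiplier) + transposition


def build_chord(root, intervals):
    """Step 6.1.6: apply the stored intervals to the next selected pitch."""
    return sorted([root] + [root + i for i in intervals])
```

With these definitions `build_chord(60, [-3, 3, 6, 12])` gives the MIDI notes 57, 60, 63, 66 and 72 from the example in 6.1.6, and `transform_interval(-12, 0.25)` gives -3 as in 6.1.3.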
This
process of pitch selection and generating the arpeggio pitches from the selected
pitches is repeated four times to produce a four note-event arpeggio or chord.
The processes for inter-onset time selection and generating the arpeggio
inter-onset time intervals are shown below, starting with 6.2.1. through to
6.2.4.
6.2.1: periodic selection of note-events from the core melody, as in 6.1.1;
6.2.2: multiply the inter-onset time of the selected note-events by a whole number or a floating point number. This multiplication can be by numbers chosen and preset by the composer, as shown in Figure 2.2a, or by a ratio between two numbers chosen by the composer, as shown in Figure 2.2b. For example, if an inter-onset time of 1000 milliseconds is multiplied by 4 it then lasts for 4 seconds; if it is multiplied by 0.25 then the resulting inter-onset time is 250 milliseconds; or, if the inter-onset time is multiplied by 0, a chord is produced;
Figure 2.2a Using a preset to set arpeggio delays. Inter-onset preset 1 is chosen, resulting in a delay time of 0 msecs. between each note of the arpeggio, that is, a chord.
Figure 2.2b Using a ratio to set arpeggio delays. Inter-onset * is chosen. The time is set at 1 then each successive arpeggio delay is multiplied by 0.5.
6.2.3: store the inter-onset times to be used on the next selected note.
This creates a polyphony built on the intervals of the
elaborated core melody. An example is shown in Figure 2.3a.
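The two ways of setting arpeggio delays described in 6.2.2 can be sketched as follows. Both functions are hypothetical illustrations: `scale_inter_onset` covers the plain multiplication, and `ratio_delays` is one reading of the ratio setting in Figure 2.2b, assuming that each successive delay is the previous one multiplied by the ratio.

```python
def scale_inter_onset(inter_onset_ms, multiplier):
    """Multiply an inter-onset time; a multiplier of 0 yields a chord."""
    return inter_onset_ms * multiplier


def ratio_delays(first_delay, ratio, count=4):
    """Delays for `count` arpeggio notes, each `ratio` times the one before."""
    return [first_delay * ratio ** n for n in range(count)]
```

For an inter-onset time of 1000 milliseconds, multipliers of 4, 0.25 and 0 give 4000, 250 and 0 milliseconds, as in the example in 6.2.2. With a first delay of 1 and a ratio of 0.5, `ratio_delays` gives 1, 0.5, 0.25 and 0.125.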
The
inter-onset times between notes can be reflected in the generated notes, as
shown in Figure 2.3b. The hollow note head is the note generated by
"CANON".

The
"CANON" algorithm can place up to four pitches alongside a melody
pitch. These pitches can create a chord, as in Figure 2.3 a, or can be time
displaced in relation to the time displacement of the chosen melody pitches, as
in Figure 2.3 b. Here the displacement is similar to the displacement of the
chosen pitches, that is, one 1/4 note. This time displacement can be lengthened
or shortened to fit the needs of the composer.
The
philosophy behind the "CANON" algorithm is that if the supporting
polyphonic and harmonic background of a melody reflects the pitch and time
intervals of that melody a cohesive and predictable relationship between the
foreground and background is produced. This then provides a holistic and
tightly integrated musical event for the listener. Appendix 3 describes how
"COMPOSE" and "CANON" are used in the three studies
presented here.
In
Studies 1 through to 7 (audio tracks one to seven on compact disc One), a
Korg 05R/W MIDI module is used for playback. The piano sound was chosen because
it brings forward the rhythmic attributes of the compositions, due to the very
fast attack of the piano sound, and the harmonic attributes by the interplay of
sustained strings. Using the Korg 05R/W means that the compositions are not
re-interpreted by a musician. Therefore the timing of each note and the
velocity with which each key is triggered reflects the computer composition as
accurately as possible.
The seven studies given here are firmly rooted in the serial, twelve-tone
musical traditions. As mentioned above, the phoneme string provides a melody
that is difficult to reconcile with western musical expectations and
traditions, including the serial traditions of the middle and late twentieth
century. The sixty-nine possible pitches used here far exceed the range of
twelve possible pitch classes commonly used in serial music. This is also the
case for the other three elements used in the pieces: amplitude, duration and
inter-onset time.
Another complicating factor in relating this music to its roots is that the
order of note-events is not repeated as regularly or predictably here as in
traditional serial music. The main effect of this wide range and lack of
regular repetition is that the point around which listeners can orient
themselves is elusive when compared to more traditional musical structures.
The
problem of the listener's orientation is partly overcome by the use of the
CANON algorithm. By creating other note streams which reflect the core melody
either as a simultaneous, vertical harmony, as in studies two and three, or as
an arpeggio, or horizontal harmony, as in study one, a context based on the
core melody is provided. This process is most valuable in study two where the
full range of the core melody is used.
Problems
resulting from the large range of pitch, velocity, duration and inter-onset times
are partly overcome in the COMPOSE algorithm. This is done by simply reducing
the range of some of the elements. In each of the studies presented the
velocity of the core melody remains intact. This maintains one unique attribute
in each of the note-events in the core melody.
By
reducing the range of the note elements used in the core melody, as in studies
one and three, there is greater repetition of pitch, duration and inter-onset
times. This results in a more predictable rhythmic structure, particularly in
study three, where note durations and inter-onset times are based on
traditional metric subdivisions.
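One way to picture this kind of range reduction is to snap each inter-onset
time to the nearest value in a small set of metric subdivisions. The sketch
below is an assumption made for illustration: the reduction method actually
used in "COMPOSE" is not specified here, and the function names, the tempo and
the sample values are hypothetical.

```python
def metric_grid_ms(beat_ms):
    """Durations (ms) of whole, half, quarter, eighth and sixteenth notes."""
    return [beat_ms * factor for factor in (4, 2, 1, 0.5, 0.25)]


def snap_to_grid(values_ms, grid_ms):
    """Replace each value with the nearest grid value (first wins on a tie)."""
    return [min(grid_ms, key=lambda g: abs(g - v)) for v in values_ms]


grid = metric_grid_ms(500)                 # quarter note = 500 ms (assumed)
raw = [130, 480, 610, 1900, 260, 530]      # hypothetical inter-onset times
reduced = snap_to_grid(raw, grid)
print(reduced)                             # [125.0, 500, 500, 2000, 250.0, 500]
print(len(set(raw)), len(set(reduced)))    # 6 4
```

Six distinct inter-onset times collapse to four, so values recur more often,
which is the source of the greater predictability described above.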
The
main success of this process is the non-mechanised, almost improvised nature of
each piece. This resonates with the improvised nature of speech, even when it
is constrained by the stylistic requirements of poetry.
On compact disc One there is the full version of ZOOMING IN and extracts of
music based on two other poems which have been played through the same
algorithms. These two poems are Ambit, again by Alex Skovron, and Not
Yet One, by another Australian poet, Earl Livings; they are given, in full,
in Appendix 5. Only the first stanza of Ambit is used in the example
here. These are provided as examples of how this process of creating music from
a text works with other texts. In each case exactly the same algorithms are
used.
Areas for expanding this process and exploring it further include testing it
with poems which use more predictable rhyme and rhythm schemes, such as
limericks. This is explored to some extent in Ambit and Not Yet One, both of
which have more predictable rhyme schemes than ZOOMING IN. Doing this could
create more repetition and redundancy in the core melody, making it easier for
listeners to orient themselves.
[1] Jean-Jacques
Nattiez, op.cit., p. 80.
[2] ibid., p.81.
[3] Trevor Wishart,
On Sonic Art. York, Imagineering Press, 1985, p. 156. Wishart does not
number pages with diagrams or figures, resulting in some possible confusion as
to exact page numbers.
[4] ibid., p. 157,
facing page.
[5] ibid., p. 158,
facing page.
[6] ibid., p. 158.
[7] Robert Rowe, Interactive
Music Systems: Machine Listening and Composition. Cambridge, Mass., MIT
Press, 1993, pp. 32-36.
[8] It is
interesting to note how the first three notes in each pitch column, i.e. C, E,
G, when taken from the bottom up, produce the first five chords of the major
scale: I ii iii IV V. The next group of three notes, i.e. A, C, E, form the vi
chord and the omitted B would form a vii chord. This aspect was not relevant in
Guido's time as these concepts of harmony were not considered.
[9] MIDI is the
acronym for Musical Instrument Digital Interface, a standard serial interface
for most commercial electronic instruments which allows communication between
instruments of differing manufacturers and between computers and electronic
instruments.
[10] Alfred Blatter,
Instrumentation/Orchestration. New York, Longman, 1980, p. 411. This is
not a comprehensive set of phonetic symbols; in fact it is more a set of
phoneme symbols, but it is sufficient for the purposes here.
[11] ASCII is the
acronym for the American Standard Code for Information Interchange. It
gives a standard number to each key or combination of key strokes on the
computer keyboard.
[12] Examples of linguistic terms such as stop plosive,
fricative, nasal and so on are given in Table 3.3.
[13] All computer
algorithms used here were created using Opcode's © MAX program, version 2.5.2.
The MAX patches use only the standard MAX library and sub-patches created by me
from that library.