Volume 1
Charles Koechlin (1867-1950): His Life and Works
Robert Orledge
Volume 2
Pierre Boulez—A World of Harmony
Lev Koblyakov
Volume 3
Bruno Maderna
Raymond Fearn
Volume 4
What's the Matter with Today's Experimental Music? Organized Sound Too Rarely Heard
Leigh Landy
Volume 5
Linguistics and Semiotics in Music
Raymond Monelle
Volume 6
Music, Myth and Nature, or The Dolphins of Arion
François-Bernard Mâche
Volume 7
The Tone Clock
Peter Schat
Volume 8
Edison Denisov
Yuri Kholopov and Valeria Tsenova
Volume 9
Hanns Eisler—A Miscellany
Edited by David Blake
Volume 10
Brian Ferneyhough—Collected Writings
Edited by James Boros and Richard Toop
On Sonic Art
by
Trevor Wishart
A new and revised edition
Edited by
Simon Emmerson
Copyright © 1996 OPA (Overseas Publishers Association) N.V. Published
by license under the Harwood Academic Publishers imprint, part of The
Gordon and Breach Publishing Group.
Reprinted in 2002
by Routledge
2 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN
Wishart, Trevor
On Sonic Art. – New and rev. ed. –
(Contemporary music studies; V. 12)
1. Computer composition 2. Computer music
3. Music – Philosophy and aesthetics
I. Title II. Series III. Emmerson, Simon
781.3’4
PRELUDE
Chapter 1 What is Sonic Art?
PART 2: LANDSCAPE
Chapter 7 Sound landscape
Chapter 8 Sound-image as metaphor: music and myth
Chapter 9 Is there a natural morphology of sounds?
Chapter 10 Spatial motion
PART 3: UTTERANCE
Chapter 11 Utterance
Chapter 12 The human repertoire
Chapter 13 Phonemic objects
Chapter 14 Language stream and paralanguage
Chapter 15 The group
CODA
Chapter 16 Beyond the instrument: sound models
Bibliography
Music Examples
Music References
Index
INTRODUCTION TO THE SERIES
1 Audible Design (A Plain and Easy Introduction to Practical Sound Composition). York: Orpheus the Pantomime Ltd., 1995.
2 For this reason added footnotes only reference more recent developments where this helps clarify a point.
PREFACE1
This book was written in a period of six weeks whilst in residence as the
Queen's Quest Visiting Scholar in the Music Department of Queen's
University, Kingston, Ontario. It is a very much expanded version of a
series of lectures given in the Department on the subject of electronic
music, though in fact it ranges over a field much wider than that normally
encompassed by this term. The book grows out of my own musical
experience over the past twenty years. Some of the ideas which I had been
developing were fully confirmed during my experience of the course in
computer music at the Institut de Recherche et Coordination
Acoustique/Musique (IRCAM) in Paris during the summer of 1981.
I would particularly like to thank the Queen's University Music
Department for their generous invitation to me which made the writing of
this book possible. In addition I am particularly indebted to Professor Istvan
Anhalt and to Jean-Paul Curtay for the sharing of ideas and insights into the
field of human utterance. I am also indebted to the Yorkshire Electro-
Acoustic Composers’ Group (previously the York Electronic Studio
Composers’ Group) for the various debates and discussions on musical
aesthetics in which I have been involved over the years.
In a way this book has grown out of a profound disagreement with
my friend and fellow-composer, Tom Endrich, whose very thorough
aesthetic research is founded primarily upon the properties of pitch and
duration organization in different musical styles, both within the Western
tradition and from different musical cultures. I hope that this book will
present a rigorous complement to those ideas and look forward to further
intense debate.
In addition, I would particularly like to thank Richard Orton, Peter
Coleman, Philip Holmes, Simon Emmerson, David Keane, my wife Jackie
for her continuing support and Jane Allen who typed the [original edition of
this] book.
Trevor Wishart
York, 1985
1 The original edition of this book was produced entirely by the author. There were two additional
paragraphs in the Preface of 1985. The first included a pessimistic prediction about the development
of (open access) computer music facilities in Britain which Wishart's own participation in the
foundation and development of the Composers’ Desktop Project in subsequent years was at least
partly to prove wrong. A final paragraph apologized for some of the literary and editing problems
inherent in a self-produced publication which we trust this new edition has addressed. (Ed.)
ACKNOWLEDGEMENTS
The Editor acknowledges with thanks the assistance of Hugh Davies in the
location of material by Jean-Paul Curtay, and of Bob Cobbing for the loan
of the rare Kostelanetz Text-Sound Texts; also the continued support of
Harwood Academic Publishers.
Prelude
Chapter 1
WHAT IS SONIC ART?
This is certainly a good definition to open our minds to the new possibilities
but unfortunately it is much too wide to offer us any advice or sense of
direction in our approach to the vast new world of sounds at our disposal.
At the other extreme Lejaren Hiller remarks, perhaps inadvertently, in an
article in Computer Music Journal:
[…] computer-composed music involves composition, that is note-selection.
(Hiller 1981: 7)
In this book I will suggest that the logic of this assertion is inverted: it is notatability which determines the importance of pitch, rhythm and duration, and not vice versa, and that much can be learned by looking at musical cultures without a system of notation.
What is the series? The series is—in very general terms—the germ of a developing hierarchy
based on certain psycho-physiological acoustical properties, and endowed with a greater or
lesser selectivity, with a view to organising a FINITE ensemble of creative possibilities
connected by predominant affinities, in relation to a given character; […]. (Boulez 1971: 35)
In this book, I will suggest that we do not need to deal with a finite set of
possibilities. The idea that music has to be built upon a finite lattice and the
related idea that permutational procedures are a valid way to proceed will
be criticised here and a musical methodology developed for dealing with a
continuum using the concept of transformation.
When noise is used without any kind of hierarchic plan, this also leads, even involuntarily, to
the ‘anecdotal’, because of its reference to reality. […] Any sound which has too evident an
affinity with the noises of everyday life […], any sound of this kind, with its anecdotal
connotations, becomes completely isolated from its context; it could never be integrated,
since the hierarchy of composition demands materials supple enough to be bent to its own
ends, and neutral enough for the appearance of their characteristics to be adapted to each
new function which organises them. Any allusive element breaks up the dialectic of form and
morphology and its unyielding incompatibility makes the relating of partial to global
structures a problematical task.
(Boulez 1971: 22–23)
This is a rather eloquent example of the ideology of instrumental puritanism
—thou shalt not represent anything in music. In this book I will propose:
As has already been pointed out, sound-art can no longer be confined to the
organisation of notes. Even this original conception had already been
broadened to include at least three areas:
Writing, speaking
Since very ancient times, human thought and communication have been
inextricably bound up with the use of the written word. So much so that it
becomes almost impossible for us to disentangle ourselves for a moment
from the web of written wisdom and consider the problems of meaning and
communication in vitro, so to speak. Ever since the ancient Egyptians
developed pictures into a viable form of hieroglyphic notation, our world
has been dominated by a class of scribes, capable of mastering and hence
capable, or deemed capable, of controlling what was to be written down and
stored in the historical record. Although this function was often delimited or
occasionally usurped by illiterate or semi-literate political supremos, such
tyrants have usually succumbed to the literate scribehood's cultural web as
evidenced by the ‘barbarian’ invasions of the Roman and Chinese empires
and to some extent by the Moslem conquest of Persia and Byzantium which
generated a novel cultural epoch by throwing together the divergent
scribehoods of these two long-established cultures under the unifying
banner of Islam.
In the long era of scribery, all people regarding themselves as
‘cultured’ or ‘civilised’, as opposed to illiterate peasants or craftsmen, have
lived within the confines of an enormous library whose volumes have laid
down what was socially acceptable and, in effect, possible to know and to
mean. Whilst those lying on the margins of ‘civilisations’ retained some
subcultural independence—variously labelled as ‘ignorance’,
‘backwardness’, ‘superstition’, ‘folklore’ or ‘folkculture’—they equally had
no access to the pages of history, and hence whatever the significance of
their cultural world, it was devalued by default. The vast growth in literacy
in the last century, with its numerous undoubted social advantages, has,
however, further increased the dominance of our conception and perception
of the world through that which can be written down.
So here we are in a library, and I would like to convey to you what I
mean. If, for a moment, we could put all these volumes of words on one
side, if we could face each other across a table and engage in the immediate
dialectic of facial and bodily gestures which accompany face-to-face speech
communication, perhaps you could appreciate that what I intend to mean is
not necessarily reducible to the apparent meanings of the words I employ
during the interchange; perhaps you could reach through my words to my
meanings.
Writing, originally a clever mnemonic device for recording the verbal
part of important speech communications between real individuals, soon
grew to such a degree as to dominate, to become normative upon, what
might properly be said. Divorced from the immediate reality of face-to-face
communication, it became objectified, generalised, and above all, permitted
the new class of scribes (whether priests, bureaucrats or academics) to
define and control what might ‘objectively’ be meant. Max Weber's
conception of the advance of Western civilisation, spearheaded by a
specialist rational bureaucracy, is a natural outgrowth of this simple
development. In fact, Weber devoted a small volume to a discussion of the
‘rationalisation’ of musical systems embodied in the Western European
tempered scale (Weber 1958).
For Plato, the idea of the object, which took on a new historical
permanence in its notation in the written word, came to have more ‘reality’
than the object-as-experienced. The commonplace tables and chairs which
we experience in the course of our everyday life were mere pale reflections
of the ideal table and chair existing in some Platonic heaven. (This heaven
in fact was to be found between the covers of books.) This radically new
stance reflects a permanent tendency of scribe-dominated cultures towards
the reification of ideas and the undervaluing of immediate non-verbal
experience, which has special relevance to the history of music. Even for
the average literate individual it might at first sight appear that what we can
think is commensurate with what we can say, and hence to appear verbally
confused or elliptical is easily interpreted as a failure of clear thought,
rather than a difficulty of verbal formulation of a perfectly clear non-verbal
idea. For example, the idea of a good ‘break’ in improvised musical
performance is clearly understood by any practitioner but has never been
adequately reduced to a verbal description.
I am going to propose that words never ‘mean’ anything at all. Only
people ‘mean’ and words merely contribute towards signifying peoples’
meanings. For the scribe meaning appears to result as the product of a
combinatorial process; broadly speaking, various words with more or less
clearly defined reference or function are strung together in a linear
combination to form sentences, paragraphs, etc., which have a resultant
clearly specified meaning. For the individual speaker, however, meaning is
a synthetic activity. She or he means. Not merely the combination of words
but a choice from an infinitude of possible inflections, tones of voice and
accents for their delivery, together with possibilities of movement, gesture
and even song, enter into the synthesis of the speech-act which attempts to
convey what he or she means. In this way a speech act may uniquely
convey quantities of information about the state of mind of the speaker and
his relationship to what is said (for example irony and so on) which would
be entirely lost if merely the words used were transcribed, but is certainly
not lost on the person spoken to. It is clear that not meaning, but
signification, resides in the words and that the mode and context of use of
these significations all contribute towards the speaker's meaning. These two
quite different conceptions of the meaning of words contribute differently to
our experience. The idea of meaning as a synthetic activity is most
significant in direct communications with other human beings, which might
be mediated through musical instruments or recording. The idea of meaning
as a structural property of written words governed by rules of combination
is the basis for the operation of our system of law. Law codes are in a sense
seen as existing transcendentally and having a meaning independent of the
original creators of the legal documents—though of course this does in time
lead to difficulties of interpretation.
Now immediately we become aware of a problem, for all that
remains of what we or anyone else ever meant, once committed to
parchment or print, is these marks on the paper. Here in the library, we see
love, tragedy, joy, despair, lying silently on the shelves, the entire history of
the word. Occasionally, a gifted scholar does appear to question the very
basis of a writing-dominated world-view. Lao Tse, the Chinese philosopher,
resorted to extreme verbal ellipsis in a late attempt to notate his
philosophical stance. At the other extreme, Marx, whose principal
commitment lay outside the scholarly profession, still felt impelled to
justify his world-view before the international scribehood and committed to
paper the astonishing theory that the world is shaped by human activity,
whilst talking, writing and the resulting development of ideas, constitute
only one particular type of human activity, and this of secondary
importance to materially productive economic activity. What had usually
been regarded as history-as-such was, in his view, merely one particular
reified result of human activity. The enscribed verbalisations of certain
mortals with certain preconceptions, economic interests and systems of
relevance.
Unfortunately, Marx's great scholarly erudition won for his radical
works a more or less permanent place on the library shelves, but in so doing
it delivered his work into the hands of the scribehood, who would
promulgate his writings, but not very often their significance. The up-and-coming would-be radical scholar would learn about ‘praxis’ as a concept in ‘Marxist epistemology’; his understanding of alienation or class-consciousness would be judged by his verbal competence.
At the other extreme, we have music! Ever since the world library opened,
there have been problems in this department. Somehow it seemed that
music could mean something to people, judging by their reactions, but this
something rarely seemed reducible to any definite verbal equivalent. Music
as an alternative mode of communication, however, has always threatened
the hegemony of writing and the resultant dominance of the scribehood's
world-view. Therefore, from the earliest times, attempts have been made to
lay down what could and could not be accepted as ‘correct’ musical
practice. Both Plato and Confucius recognised the threat posed by
uncontrolled musical experience to the ‘moral fibre’ of the rationalistic
scribe state, and advised the exclusive adoption of forms of music which
seemed to them to be orderly in some kind of verbally explicable way. As,
for the moment, there was no way of capturing music in the same way as
speech—no notation procedure—it seemed safest to adhere absolutely to
previous musical practice, while often ensuring that the music itself was
subservient to an approved text. The codification and standardisation of
church chant by Pope Gregory in post-Roman Europe may be seen as but
one example of a tendency which is exemplified by the Chinese emperor's
great concern for the ‘correct’ tuning of the imperial pitch-pipes at the
beginning of his reign, the execution of performers who made mistakes
during ceremonial performances in the Aztec world and in many other
cultures, and so on.
With the appearance of musical notation, new factors came into play.
However, a rapid glance at the syllabuses of most Western universities
(centres of writing-dominated culture) will reveal the tremendous emphasis
placed upon the study of composers who employed a clearly, rationally
codifiable (verbalisable) musical praxis, in particular the work of Palestrina
(the champion of the Council of Trent), J. S. Bach and, of course,
Schoenberg and his ‘12-tone technique’. Even so, music continued to
convey its alternative messages and holy men (like St. Augustine) were
obliged to admonish themselves before God for being seduced by the ‘mere
sensuous quality of musical sounds’. This feeling that attention to aspects of
sound beyond those which are capable of description, and hence
prescription, in writing (and later in musical notation), is lascivious or
morally harmful is a recurring theme of scribe-dominated societies.
Committed verbalists will not be convinced by anything I have to say
about the separation between ‘meaning’ and ‘signification’. For the
linguistic philosopher all problems are reducible to problems of
signification within language and such a philosopher will merely deny the
validity of our problem. However, if you are capable of imagining that
talking to your lover is not merely an exchange of syntactically-related
arbitrary signs and bodily gestures, but an essentially non-verbal
communion between two people, mediated and articulated through word
and gesture, but not constituted by them, then you may understand what I
have to say.
Firstly, if this communion exists, surely it can be named. This is
perfectly true; however, the point remains that its articulation is not the
articulation of signs. We must not assume that we can notate its articulation
by attaching signs to different parts of it and then articulating the signs.
Written language constitutes what I will call a discrete/combinatorial
system. Written words are strictly delimited, distinct and repeatable entities
which form the finite elements of a combinatorial process of structure-
building. Our internal ‘state’ (whether a ‘bio-emotional state’ or
‘intellectual-physiological state’—but let us not be deceived by a label)
constitutes a holistic/processual system. The distinction between these two
systems can be hinted at by reference to analogies. First of all we have the
distinction between an analogue and a digital system. In an analogue system
the state of the system can be represented by continuously varying
parameters (corresponding to the holistic/processual system) whereas in the
digital system the state is broken up into discrete samples which have
discrete values (corresponding to a discrete/combinatorial system). Of
course, with modern digital technology, the discrete states can be made so close together, particularly in terms of time, that the distinction between a discrete and a holistic representation ceases to be of importance. However,
on the grosser level of representation that we find in the
discrete/combinatorial system of language, the distinction is absolutely
crucial. A second, though more tenuous, analogy might be seen in the
distinction between particulate and wave descriptions of phenomena such as
the behaviour of light, though again these have a point of reconciliation in
modern quantum theories.
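The analogue/digital analogy can be made concrete with a short sketch (the function names and parameter choices here are mine, purely for illustration): a continuously varying signal is reduced to discrete time-points taking discrete values, and as the sampling becomes fine enough the discrete representation becomes, as the text notes, indistinguishable in practice from the continuum.

```python
import math

def sample_and_quantize(signal, duration, sample_rate, levels):
    """Represent a continuous-time signal as discrete samples
    taking discrete values (a discrete/combinatorial representation)."""
    step = 2.0 / (levels - 1)  # spacing of the discrete value lattice in [-1, 1]
    samples = []
    for i in range(int(duration * sample_rate)):
        t = i / sample_rate                  # discrete time-point
        value = signal(t)                    # 'analogue' value, anywhere in [-1, 1]
        samples.append(round((value + 1.0) / step) * step - 1.0)
    return samples

# A continuous 'analogue' signal: a 2 Hz sine wave.
wave = lambda t: math.sin(2 * math.pi * 2 * t)

coarse = sample_and_quantize(wave, 1.0, 8, 4)        # grossly discrete
fine = sample_and_quantize(wave, 1.0, 44100, 65536)  # perceptually continuous
```

At eight samples per second and four amplitude levels the discreteness is unmissable; at CD-like resolution the maximum deviation from the continuous signal is far below audibility, which is the reconciliation the text describes.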
The distinction between these two systems is perhaps one reason why
our vocabulary for referring to internal states is so vague and ill-defined.
Furthermore, there is an important distinction between the experience (the
state) as the state of ourselves, and the mere notations of it, the arbitrary
labels assigned to bits of the ongoing process; or between the most
immediate reality of me, now, and the reality of socially interdefinable
name-plates and syntactic laws. We may reach some agreement on how to
use these name-plates, but that does not touch the heart of the matter. This
problem is with us as soon as we begin to speak. But it is writing, with the
consequent reification of ideas in written reportage and the scribal control
of world-view that forces the problem to the centre of civilisation. Very
soon we are beginning to deny the existence of any sub-label reality at all,
and such things that we have called ‘the emotions’, or the highly articulate
gestural response in improvised music which we may vaguely refer to as
‘spontaneity’, become as mysterious as Platonic ideals.
What the aural-tradition musician takes on faith is that music does
touch the heart of the matter. With language, the actual medium may not be
of special significance; it may be spoken (sound), written (visual), touched
(Braille) and so forth. In a certain sense, a significant part of the message
transcends the immediate concrete experience of the medium which carries
it. Music, however, cannot be divorced from the medium of sound1 and
enters into our experience as part of an immediate concrete reality; it
impinges on us and in so doing it affects our state. Furthermore, as Susanne
Langer remarks in Feeling and Form, in its articulation of the time-
continuum of concrete experience, it corresponds directly with the
continuum of our experiencing, the continuous flux of our response-state
(Langer 1953: chapter 7).
Hence, our pre-notation musician takes on faith that the way his
musical articulation of sound impinges upon his own state is in many ways
similar to the way it impinges upon the state of others. He seeks no verbal
confirmation (except indirectly), understanding that there can be none. We
might say that there is no divorce between the syntax of musical activity
and the syntax of musical experience. Whatever is played is directly
monitored, by the ears, by the player's immediate response to it. There is an
immediate dialectic of musical action and experience by which music
reaches directly to us in a way which language can never do,
communicating powerful messages which are not refutable within the
socially-approved categorical systems of any scribe-culture. It is music's
intrinsic irrefutability, its going behind the back of language, which has
caused it to be viewed with so much suspicion and disdain by guardians of
socially-approved order.
Musical gesture
We are now in a position to describe the concept of a lattice and its bearing
on conventional music theory. As anyone who has heard a pitch portamento or a tempo accelerando will know, both pitch and tempo can take on an
infinitude of possible values and may vary continuously over this
continuum. Notation, however, imposes a finite state logic upon the two
domains. The result is that music, at least as seen in the score, appears to
take place on a two-dimensional lattice (see Figure 2.4a). Two things should
be said about this lattice formulation. First of all it is our conception of what
constitutes a valid musical object which forces ‘musical sounds’ onto this
lattice; secondly, despite our intentions, the lattice only remains an
approximate representation of what takes place in actual sound experience
(except in the extremely controlled conditions of a synthesis studio).
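The finite state logic that notation imposes on the pitch continuum can be sketched in a few lines (an illustration of the general point, assuming the 12-note equal-tempered scale as the lattice and A4 = 440 Hz as reference):

```python
import math

A4 = 440.0  # assumed reference pitch in Hz

def snap_to_lattice(freq_hz):
    """Snap a continuously variable frequency onto the 12-note
    equal-tempered pitch lattice that staff notation can record."""
    semitones = 12 * math.log2(freq_hz / A4)   # position on the pitch continuum
    return A4 * 2 ** (round(semitones) / 12)   # nearest lattice point

# A portamento sweeps the continuum in quarter-semitone steps over one octave;
# the lattice reduces the sweep to thirteen discrete pitches.
glide = [220.0 * 2 ** (i / 48) for i in range(49)]
notated = [snap_to_lattice(f) for f in glide]
```

The forty-nine distinct frequencies of the glide collapse onto only thirteen lattice values (A3 to A4), which is precisely the sense in which the score remains an approximate representation of the sounding event.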
The technology of instrument design underlines and reinforces this
lattice conception of musical architecture. First of all, on keyed, holed or
fretted instruments, the discrete logic of the pitch lattice is imposed on the
production mechanism of sound-objects. Secondly, the concept of the
instrument itself further expands the lattice notion. Conceptually, at least, an
instrument is a source of stable timbre, but variable pitch. The essential
function of an instrument is to hold timbre stable and to articulate the pitch
parameter. This conception contributes to the myth of the primacy of pitch
(and duration) in musical architecture. The grouping of instruments into
families of distinct timbral type and the development of music based upon
fixed-timbre (or instrumental) streaming develops the lattice one stage
further.
Figure 2.3 Summative rhythm: each note value can be expressed as the sum of smaller equal note
values.
Figure 2.4a Music on a two-dimensional lattice (schematic representation).
However, the fact that Schoenberg felt unable to write the third act of his opera Moses und Aron, in which Moses’ view triumphs, has, in the light of the present
discussion, far-reaching significance for an understanding of contemporary
avant-garde music.
Where concrete musical relationships—at least originally based on
their experiential success—are represented by their notations in the score,
and study and conception focuses upon this structure, divorced from the
experiential immediacy of the sound itself, these relationships, as
rediscovered in the score, may be mistaken for conventional relationships.
In other words, what to direct gestural experience may appear as a
necessary relationship—in that it is only through that particular musical
structure that a successful communication of the kind intended can take
place—can come to appear in the score as merely arbitrary permutations of
‘notes’ and ‘time-values’. On the timeless flat surface of the score the
visual-spatial relationships of the notes (used to represent real time) may be
changed at will to produce arbitrarily arrived-at visual-spatial structures, all
having equal validity in visual space, but not necessarily so in experiential
time.
Once, however, we demand that music be heard in terms of the score,
then it is no longer experiential success which justifies notational visual-
spatial arrangements, but notational arrangements become their own
justification. Hence, ‘musical form’ may become freed from any restriction
of direct experiential success in our original terms. This leads ultimately to
a rational formalism in music. The composer establishes certain visual
relationships between entities in his notation, the musical scholar is trained
to listen for these relationships, he hears them and a successful ‘musical’
communication is declared to have taken place.
This beautifully closed rationalist view of music is the ultimate in
scribal sophistication; it is complete and completely unassailable in its own
terms. Music is hence completely socially definable and musical success
may almost be measured with a slide rule. How much more tidy and
convenient such a norm-adherent view of music than one bringing in the
messy business of inter-personal, yet unverbalisable, gestural dialectics.
The rationalist view of music fits ideally into a technocratic age with its
linguistic and positivist ideologies. What we cannot talk about we cannot
know, only that which we can talk about is real—so much for music!
Thus, ultimately, the score becomes its own rationale. It is what it is,
and there is nothing more to say about it. The composer cannot be in
error.13 We see this spatial score-based focus in preoccupations with such
two-dimensional visual forms as the golden section in analytical articles,
but it is permutationalism which is the ultimate notation-abstracted
procedure. Because musical notation presents music to us outside of time in
an essentially two-dimensional scannable score, it does not seem
immediately unreasonable to extract various parameters of the sound and
arrange these into various other patterns. The most thorough-going way of
going about this is the technique of the permutation of parameters as used in
much serial composition. The technique of permuting objects is very
general and is in fact a principle of ordering which does not relate to the
materials being permuted directly. We may permute pitches, dynamic levels
or, for that matter, sizes of shoes, using exactly the same criteria.
Applications of the principle can be very sophisticated, based upon analysis
of the nature of sound-objects. The principal problem from our point of
view is that being an outside-time procedure, there is no reason why the
resulting sequences of sounds should have any dynamism. The parameters,
separated through a lattice-based conception of musical structure, cease to
have any meaningful linkage or gestural continuity and serve merely as
evidence that the permutational procedure has taken place. This abstract
architecture, therefore, reduces all objects which it touches to the same
rather empty non-dynamic experience. There is no rationale beyond the
arrangement of the bricks; the nature of the bricks becomes irrelevant so
long as they fit into the pattern. The committed permutationalist is the
musical (or artistic) equivalent of the linguistic philosopher. He or she
cannot understand that there is a problem inherent in this approach.
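The material-indifference of permutational ordering described above can be shown in a few lines of Python (a sketch of the general principle only, not of any particular serial technique): the same procedure orders pitches, dynamic levels, or shoe sizes without ever consulting their nature.

```python
from itertools import permutations

def permute(elements):
    """Generate every ordering of a finite collection. The procedure
    never inspects what the elements actually are."""
    return list(permutations(elements))

pitches = ["C", "E", "G"]
dynamics = ["pp", "mf", "ff"]
shoe_sizes = [7, 9, 11]

# Identical criteria, indifferent to the material: each yields six orderings.
for material in (pitches, dynamics, shoe_sizes):
    assert len(permute(material)) == 6
```

That the code runs unchanged on all three lists is the point: nothing in the procedure links the resulting sequences to any property of the sounds themselves.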
A much more sophisticated and satisfactory approach can be seen in
the work of Brian Ferneyhough. Ferneyhough is clearly (from my listening
to the music) concerned with musical gesture and in a piece such as his
Second String Quartet the interaction of musical gestures between the four
players is of primary importance in our apprehension of the music. In works
for a greater number of performers, however, (such as Time and Motion
Study III for sixteen amplified voices) the sheer density of musical gestures
leads to a process of self-cancellation. The individual details of each part
are extremely interesting but an overall sense of direction is lost in the
welter of interaction. In 1981 I had the pleasure of meeting Brian
Ferneyhough over dinner in Paris and the ensuing conversation may serve
as an interesting footnote to our discussions of idealism and materialism in
the conception of music. Ferneyhough and myself both declared that we
were anarchists but on further discussion it transpired that our conceptions
of anarchism could not have been more different. Ferneyhough's view was
that he could take the strongest stand against the system by not voting. In
this way he symbolically denied the relevance of the system and therefore
in some way negated it. My more pragmatic view was that it was important
to vote in order to keep out the worst possible contender. These conflicting
idealist and materialist views of anarchist action had an interesting parallel
in our discussions about musical structure. Ferneyhough noted that a
particular passage in one of his works sounded pretty well aleatoric and that
this was interesting because it was the result of a multi-layered process of
complex compositional decisions. He seemed to be saying that the
methodology of composition was the principal object, not the effect on the
listener. The composition is more like a document which evidences the
composer's methodology and it is evident in the particular case under
discussion that the methodology will only become apparent through
detached analytical study of the document, not directly through the effect of
the music. Thus a priori design, not directed pragmatically to some
practical sonorous end, has become the principal focus of the composer's
interest. The concept of musical experience has been redefined as
rediscovering the composer's methodology through the musical experience
(or rather through the score), rather than feeling, in time, the gestural
structure of the music in the listening experience and hence directly
understanding the composer's design through that gestural experience.
For me, on the other hand, a musical experience which appears
aleatoric is aleatoric. The experience that the listener has is the music and
the composer's methodology, no matter how rational it may be, is a heuristic
device realising the source of that experience in sound. In Ferneyhough's
case it would seem that music is an idealist object defined essentially by the
composer's intention (just as the political stance is defined by the intention
of the act of not voting). In my case, music is a material entity which is
socially defined and judged by its results (similarly the political act must be
an action taken in the world that will be judged by its success there). This
being said, one must not confuse materialism with populism, but that is the
subject of another essay and I will not pursue it here.
A fundamental thesis of this book is that, in order to understand and
control the musical continuum, we will have to appeal to time-based
notions like gesture and not only at the level of the individual musical
event. Although a formalist, permutationalist approach can be applied to
literally anything, including a particular classification of gestural types, we
cannot ultimately learn anything from it because it is not open to any
experiential verification (except in the tautologous sense that it evidences
the permutations made). What I am searching for in this book are criteria
for composing music with non-lattice materials which ‘work’ in some
experientially verifiable sense that is not merely circular.
A final comment: it is clear that the separation of notation and
actuality is of great value for the purposes of scholarship, even though it
does lead to a distortion of our understanding of the object of study. The
advent of digital recording and analysis of sound opens up a wonderful new
opportunity for such scholarship. In one sense it can be very negative as this
is a heaven-sent opportunity for formalism to run riot with a new ultra-
powerful permutational tool. To date, however, computer technology seems
to have been used in a much more sensitive way in the exploration and
understanding of the inner details of sound architecture. The preliminary
results of this exploration have been a source of inspiration for this
particular book and the control which the computer will give us over this
inner architecture makes the control of the details of gestural structure a
compositional possibility for the first time. With musical sensitivity we may
allow the computer to do the number-crunching and with real-time, or at
least acoustic, feedback we can begin to make more refined aesthetic
decisions about the gestural structure of sound on its most refined levels.
1 Though various scribe-philosophers and aesthetes have attempted to declare that music is
essentially abstract—we shall return to this point later.
2 An interesting example of this is to be found in the neumic notation of Tibetan chant (see
Kaufmann 1967), in which a single curvilinear neume might indicate changes in pitch, duration and
timbre. See Figure 2.2a.
3 A fuller discussion of these issues is to be found in Goody and Watt (1963).
4 See the quote from Lejaren Hiller in Chapter 1 (Hiller 1981: 7).
5 This situation has only partly been alleviated in the years since this was written and was certainly
reinforced by the MIDI protocol, relatively new at the time of writing (Ed.).
6 Contemporary instrumental composers have, of course, sought to counteract the stranglehold of
technological rationalisation by exploring non-conventional modes of sound production on the
Western instrument (flutter-tonguing, key-slapping, whistle-tones etc.).
7 Bach's Art of Fugue, arguably one of the finest achievements of the traditional art music of Europe,
illustrates our thesis in an interesting way. Bach confines himself to the notation of pitch and
‘summative’ rhythm, leaving unspecified dynamics and even timbre (instrumentation), both of which
are usually notated (or at least indicated in the score). Although this approach may appear to
approximate very closely to the ‘abstract’ view of music, we would argue that the work is, however,
not an illustration of ‘rational formalism’ as discussed below, as the score notates sets of relations
between sound-qualities which are experientially valid (see text), even if the range of possibilities is
necessarily restricted by the nature of the notation system itself.
8 See the quotation in Chapter 1 (Boulez 1971: 37).
9 Figure 2.6 is from Trevor Wishart's PhD Thesis (University of York, UK 1973).
10 Such as paper bags, soft trumpets (Martin Mayes, Trevor Wishart), amplified springs (Hugh
Davies), or long pieces of elastic (Paul Burwell) (see discography).
11 For an explanation of Stockhausen's ‘+/-’ notation system and some examples of its application
see the introduction to the score of Spiral (Universal Edition).
12 I am indebted to Jan Steele for the following line of argument concerning the problem of musical
retrogrades.
13 Although of course, not all composers, even today, accept this absurd view! There are often other
criteria involved in composition, even where composers refuse, in a strictly positivist way, to talk
about them.
Chapter 3
PYTHAGORAS, FOURIER, HELMHOLTZ:
TOWARDS A PHENOMENOLOGY OF SOUND
Pythagoras, that grave and venerable personage, reproved all judgement of Musick which is
by the eare, for he said that the intelligence and vertue thereof was verie subtile and slender,
and therefore he judged thereof, not by hearing, but by proportionall harmonie: and he
thought it sufficient to proceed as farre as to Diapason, and there to stay the knowledge of
Musick.
[Plutarch]1
The preceding chapter sought to account for the ideology of musical praxis
which sees pitch and duration as primary musical qualities, timbre as a
distinct and secondary musical quality and takes instrumental streaming and
the generation of music on a lattice for granted. The philosophy of the
musical practice based upon establishing elementary relationships between
stable vibrating systems has a very ancient and respectable pedigree.
Pythagoras himself is believed to have first noted the fact that there is a
simple relationship between the lengths of vibrating strings and the
perceived quality of musical consonance between them. Given two strings2
of the same material at constant tension then if one is stopped exactly half-
way along its length it will produce a note sounding one octave above the
other string (which is not stopped). The octave itself is qualitatively
perceived to be the most stable or consonant of musical intervals. With a
length ratio of 3:2 the musical interval of a fifth is perceived, which is also
very stable and consonant. In general, the relative consonance of an interval
is seen to be directly relatable to the simplicity of the ratio of lengths of the
strings (or columns of air) which produce it (see Figure 3.1). This was not
only the first important contribution to music theory but also had a
significant role to play in the development of a scientific view of the world.
It was the first clear demonstration that qualitative aspects of nature could
be reduced (apparently) to simple numerical relationships. This general
view has been exceedingly fruitful but on occasions misleading. It led
indirectly to the concept of the Harmony of the Spheres, one of the most
persistent and misleading conceptions ever to animate the human mind: the
heavenly bodies were assumed to be transported around the earth on giant
spheres whose motion generated a heavenly music, governed in some way
by the Pythagorean laws of proportion. The desire to find ‘celestial
harmony’ in the world of nature persists even into the work of the
astronomer Kepler, who was obsessed by the desire to fit the (assumed)
spheres of planetary motion (Figure 3.2) around the Platonic solids (Figure
3.3).
Figure 3.1 Relation between string length and ‘consonantly’ related tones.
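The arithmetic of these length ratios is easily sketched. The following fragment (a minimal illustration, not drawn from the original text) converts a string-length ratio to an interval size in cents, using the fact that sounding frequency varies inversely with sounding length:

```python
from math import log2

def cents(length_ratio):
    # Frequency varies inversely with sounding length, so a length
    # ratio of 2:1 doubles the frequency; 1200 * log2 converts the
    # frequency ratio to cents (1200 cents = one octave).
    return 1200 * log2(length_ratio)

print(round(cents(2/1)))   # 1200 -> the octave
print(round(cents(3/2)))   # 702  -> the fifth (700 in equal temperament)
```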
Helmholtz and others, working with what they called ‘musical’ tones,
i.e. the sounds of conventional musical instruments, proposed a simple
theory of pitch and timbre perception. A sound perceived as a single pitch
was found to be made up of various sine wave components (through Fourier
analysis). These bore a simple harmonic relationship to one another. The
frequency of the higher sine tones were integral multiples of the frequency
of the lowest (which for the moment we will assume to be the fundamental)
frequency. The pitch of a sound corresponded directly with the frequency of
the fundamental, the timbre was the result of the presence (relative
amplitude) or absence of the other sine tones (the partials). Before going on
to criticise and comment upon this theory, we should note that it seemed to
absolutely confirm the ruling musical ideology that pitch was primary and
timbre secondary. Pitch could be seen as fundamentally related to frequency
and timbre as merely a secondary phenomenon arising from the
combination of the frequencies of the constituent sine tones. Timbre
appeared to be thus almost a fused chord over a fundamental pitch.
However, the fact that Helmholtz's theory appeared to confirm the ruling
musical ideology should come as no surprise. It was framed within a culture
which took that system of musical thinking for granted. Helmholtz confined
himself to the analysis of what he arbitrarily defined to be ‘musical’ tones,
i.e. sounds forced onto the pitch-timbre-duration lattice by preconceptions
of the musical and their realisation in instrument technology. Furthermore,
the assumptions firstly that timbre was a unitary phenomenon and secondly
that pitch and timbre were clearly separable qualities, were taken for
granted directly from preconceptions of the musical tradition.
The first question we must ask about Fourier analysis is, although it is
clearly a very powerful mathematical tool, does it bear any relationship at
all to our perception of sonic reality? Is the Fourier analysis of sonic events
into sine tones unique, or is there any other alternative analytic breakdown?
It turns out that other systems of orthogonal functions can be defined and
used to represent arbitrary mathematical functions. One such system is that
of Walsh functions illustrated in Figure 3.6. With the advent of digital
technology some programs have already been developed for the synthesis
of sounds using Walsh functions rather than the more usual sine tones.
However, whereas the Walsh function analysis of a sound-object seems to
bear no clear intuitive relationship to our aural experience, Fourier analysis
relates very clearly to what we hear. It has been shown in fact that the
human ear is a kind of Fourier analyser so that we may assume that up to a
point the mathematics of Fourier analysis has some direct relationship with
our perceptual experience.
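The Walsh functions mentioned above are easily generated by machine. The sketch below (an illustrative aside, assuming the so-called natural ordering rather than the sequency ordering usually pictured) builds the first eight as rows of a Hadamard matrix; each row is a square-shaped function taking only the values +1 and -1:

```python
def walsh_natural(n):
    # Sylvester construction of the 2**n x 2**n Hadamard matrix; its
    # rows are Walsh functions (natural, not sequency, ordering).
    H = [[1]]
    for _ in range(n):
        H = ([row + row for row in H] +
             [row + [-v for v in row] for row in H])
    return H

for row in walsh_natural(3):   # the first eight Walsh functions
    print(''.join('+' if v > 0 else '-' for v in row))
```

Like the sine waves of Fourier analysis, these rows are mutually orthogonal, which is what allows them to serve as an alternative basis for analysing arbitrary signals.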
The result of Fourier analysis is what is called a Fourier transform.
The mathematics of the Fourier analysis convert information about the
variation of amplitude with time (time-domain information) into
information about the variation of amplitude with frequency (frequency-
domain information). Simply put, we start off with a graph of amplitude
against time and we end up with a graph of amplitude against frequency
(see Figure 3.7). The inverse Fourier transform performs the opposite
function, turning information about frequency and amplitude into
information about amplitude and time.
Figure 3.6 The first eight harmonics (sine waves) and the first eight Walsh functions.
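The time-domain to frequency-domain conversion described above can be sketched with a naive discrete Fourier transform (a schematic illustration only; practical analysis uses the fast Fourier transform). Starting from amplitude-against-time samples containing two sine components, the transform yields amplitude-against-frequency information in which those components appear as peaks:

```python
import cmath, math

def dft(samples):
    # Discrete Fourier transform: amplitude-against-time in,
    # amplitude-against-frequency out.
    N = len(samples)
    return [sum(samples[n] * cmath.exp(-2j * math.pi * k * n / N)
                for n in range(N))
            for k in range(N)]

N = 64   # one 'second' of sound, sampled (very coarsely) at 64 Hz
x = [math.sin(2 * math.pi * 5 * n / N)            # a 5 Hz component
     + 0.5 * math.sin(2 * math.pi * 12 * n / N)   # a quieter 12 Hz component
     for n in range(N)]

mags = [abs(b) for b in dft(x)[:N // 2]]   # the amplitude spectrum
print(mags.index(max(mags)))               # 5 -> the strongest frequency bin
```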
It must be said immediately that the notion that somehow frequency
(periodicity) is more physically real than spectral information is hard to
justify. In a simple instrumental tone, periodicity can certainly be more
easily seen from a graph of amplitude against time than can any spectral
information. However, this is partly the nature of the beast being analysed
and when we consider more complex musical objects (see the section on
noise below) we will find that the graph of amplitude against frequency (i.e.
the spectrum of the sound) is far more lucid than the amplitude against time
graph (which may be totally aperiodic). In fact, to be entirely reductionist
for a moment, all that really exists is the amplitude of displacement of the
air or the ear-drum and its variation in time. Both periodicity and spectral
information are higher-level derived entities.
A more important problem is simply that the ear is unable to function
as a Fourier analyser above frequencies of around 4,000 Hz although
frequency information above this threshold can be very important in our
perception of timbre. In relation to these I quote from Schouten, Ritsma and
Cardozo:
[...] there may exist one or more percepts (residues) not corresponding with any
individual Fourier component. These do [however] correspond with a group of Fourier
components. Such a percept (residue) has an impure, sharp timbre and, if tonal, a pitch
corresponding to the periodicity of the time pattern of the unresolved spectral
components.
(Schouten, Ritsma and Cardozo 1962: 1419, emphasis added)
A number of factors appear to govern whether separate components fuse
into a single aural image:
(1) components having the same (or very similar) overall amplitude
envelope, and, in particular, components whose onset characteristics
coincide will tend to be grouped together;
(2) components having parallel frequency modulation (either regular in the
form of vibrato or irregular in the form of jitter) will be grouped
together;
(3) sounds having the same formant characteristics will be grouped
together;
(4) sounds having the same apparent spatial location will be grouped
together.
Any or all of these factors may enter into the process of separating one
aural image from another. The importance of onset synchrony is
demonstrated in Example 3.19, where the various constituents of a sound are
separated by increasing time-intervals and the time-intervals are then
successively reduced until there is again complete synchrony. The sound-
image will be heard to split into its component parts and then recohere. The
importance of frequency-modulation information has been most eloquently
demonstrated in work by Roger Reynolds5 (Example 3.20). Data from a
phase vocoder analysis was used to resynthesize an oboe tone, elongating it
as well. The regenerated oboe tone was projected from two loudspeakers,
the odd harmonics on one side, the even on the other. These two groups of
partials were each coherently, but differently, frequency modulated.
Because the even set was modulated at a rate corresponding to vocal
vibrato, and the odd necessarily had a clarinet-like sound, the listener
experiences a distinctive composite as the amplitude of modulation
increases: clarinet on one side, voice on the other at the octave, and the
sum, an oboe sound, in the centre. In this way we can contemplate playing
with the aural imaging process and not merely destroying the convention of
instrumental streaming.
Conversely, we may use these aural imaging factors compositionally
to integrate sound materials which might otherwise not cohere into objects.
Thus, by imposing artificial attack and amplitude characteristics on a
complex of sounds (e.g. a small group of people laughing), by artificially
synchronising the onset of two or more normally quite separate sound-
objects, by artificially synchronising the vibrato and jitter on two or more
normally quite separate sound-objects we may create coherent composite
sound-objects. A recently popular example of this approach is the use of the
vocoder where the evolution of the formant characteristics of a speaking
voice is imposed on an otherwise non-coherent source (e.g. the sounds of a
large group of people speaking before a concert as in Michael McNabb's
Dreamsong). This further opens up our conception of what might be
considered a coherent musical object.
With all these potential sound-materials at our disposal a further
problem arises. Music is normally concerned with establishing relationships
between various kinds of material. The question is what determines whether
we perceive a particular piece of sound-material as related to another.
Speaking of sound organisation in the broadest possible sense, the answer to
this question will clearly depend partly on context (upon which aspects of
sonic organisation are being focused upon—pitch, spectral type, formant
streaming etc.). But whichever approach we take there will be a point
beyond which the manipulated sound-material will cease to have any
audible relation to its source. This can already be perceived in conventional
music where in some types of complex serial organisation, the concept of
the derivation of material from a source set becomes meaningless. In the
studio it is seductive to assume that a sound derived from another by some
technical process is hence derived from the original in a musical sense. A
simple example of this may be given as follows. Suppose that we start with
a sustained orchestral sound, the sound of a large crowd and the sound of a
stable sine tone. Let us now take each sound, put it on a tape recorder and
switch the tape recorder onto fast wind so that the sound accelerates from
speed zero to very fast and as the tape recorder reaches its maximum speed
fade out the sound to nothing.6 Having done this with all the sounds, let us
now speed them up to at least sixteen times their original speed. In what
sense are the resultant sounds related to the originals? What we perceive in
each case is a brief, high frequency glissando. Furthermore, and most
striking, all three sounds now appear very closely related whereas the
sounds from which they originated have no relationship whatsoever. At this
distance of derivation it is the overall morphology of the sound-structures
which predominates. We may learn two lessons from this. First of all, with
sound-objects having a dynamic morphology, it is this morphology that
dominates our perception of relatedness or unrelatedness, rather than spectral
or even more general timbral considerations. Secondly, if the organisation
of our music is to be based on the audible reality of the listening experience,
the music must be organised according to perceived relationships of
materials or perceived processes of derivation of materials. In order to
accomplish the former we need an analysis of sound-materials based upon
their perceived properties, a phenomenological analysis of sounds.
The phenomenology of sound-objects
1 The author can no longer recall where he came upon this quaint translation into Elizabethan
English (Ed.).
2 The argument applies equally well to the lengths of columns of air.
3 Strictly ‘second partial’ as the piano spectrum is substantially inharmonic (Ed.).
4 ‘Linearly’ in terms of the perception of loudness (roughly speaking dBs); not amplitude which is
decaying exponentially with time. (Ed.)
5 Working with Steven McAdams and Thierry Lancino at IRCAM. This example is based on band 2e
on the IRCAM LP 0001 which is described somewhat differently on the sleeve note. It is from his
work Archipelago. (Ed.)
6 Unedited from 1983, this phenomenon may be simulated in digital systems (Ed.).
7 The Solfège is still available on three cassettes from the INA/GRM (Paris) at time of press but the
trilingual printout of the recorded French commentary which accompanied the LP version appears to
have been discontinued (Ed.).
Chapter 4
THE NATURE OF SONIC SPACE
It might be assumed, wrongly, from Chapter 3 that one only runs into new,
non-lattice-based, conceptions of musical ordering when dealing with
sounds having complex mass or noise characteristics. This is not the case,
as we shall explain in the next chapter, but we must begin exploration with
a closer analysis of the Pythagorean concept of consonance or harmonicity.
A simple reading of Pythagoras’ theory would seem to imply that an
interval is more consonant the simpler the ratio of the frequencies of its
components. In fact, we can define a measure for this simplicity as follows:
calculate the ratio of the frequencies, reducing it to the simplest fractional
representation (thus 600/350 becomes 12/7); then add together the
numerator and denominator of the fraction (for this example, 19). Confining
ourselves to intervals contained within a single octave this procedure gives
us a reasonable measure of the simplicity of a given frequency ratio.
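This measure is easily mechanised. The sketch below (an illustrative fragment, not part of the original text) reduces the frequency ratio to lowest terms with Python's Fraction type and sums numerator and denominator, reproducing the worked example above:

```python
from fractions import Fraction

def simplicity(f1, f2):
    # Reduce the frequency ratio to lowest terms, then sum numerator
    # and denominator; smaller values indicate simpler (more
    # 'consonant') Pythagorean ratios.
    r = Fraction(f1, f2)
    return r.numerator + r.denominator

print(simplicity(600, 350))   # 19 (600/350 reduces to 12/7)
print(simplicity(3, 2))       # 5  (the fifth)
print(simplicity(2, 1))       # 3  (the octave)
```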
It might now seem merely a simple matter to plot the simplicity of the
ratio (corresponding to the degree of consonance) against the interval.
However, anyone familiar with the difference between rational and
irrational numbers will be immediately aware of the following paradox. If,
for example, we take the interval of a fifth with the Pythagorean ratio 3:2
and hence a consonance value of 5 and we shift the upper tone by an
infinitesimal amount either upwards or downwards in frequency, the ratio of
the two frequencies making up the new interval immediately becomes non-
simple. In fact, in general, the ratio of the two frequencies will not be
expressible as the ratio of two finite integers. The simplicity value will in
fact be infinite. If we confine ourselves merely to ratios along the line
which are expressible with finite integers, we will discover that the
simplicity-value leaps around in an extremely erratic fashion as we proceed
along the interval axis. This behaviour is illustrated for a limited number of
points in Figure 4.1. Both the rigorous and the approximate description of
the situation lead us to the bewildering conclusion that if we play a constant
tone and make an upward portamento on a second simultaneous tone
starting at the precise interval of a fifth from the first tone we should
experience a sense of very rapid shifts in consonance between the two tones
no matter how slow the portamento is made. This prediction is, of course,
only true if we stick rigorously to the Pythagorean view.
Figure 4.1 Harmonic ‘simplicity’ of intervals as predicted from Pythagorean theory and as
perceived.
Figure 4.2 Various tempered scales and their relationship to Pythagorean ratios.
It is interesting to ask just how rational the Western tempered scale is.
Why should it have twelve equal intervals rather than seven or twenty-
three? If we plot various possible equal-tempered scales against the simple
Pythagorean ratios (see Figure 4.2) we will see that a scale of nineteen
equally-spaced elements would have generated a set of intervals more
closely approximating the Pythagorean ideal. Having made these
observations we may now plot a graph of consonance against interval which
corresponds more clearly to our aural experience. In the graph I have
chosen a set of nodes which perhaps corresponds most closely to what the
typical Western listener might experience. An experienced Indian musician
might want to include several more nodal points in such a graph.
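The comparison of Figure 4.2 can be sketched numerically. In the fragment below (an illustrative calculation over an assumed sample of simple ratios, not the exact set used in the figure), each equal-tempered division of the octave is scored by the mean distance, in cents, from each simple ratio to the nearest scale step; nineteen divisions do indeed score better than twelve on this sample:

```python
from math import log2

# An assumed sample of simple frequency ratios (fifth, fourth,
# major/minor thirds and sixths).
SIMPLE_RATIOS = [3/2, 4/3, 5/4, 6/5, 5/3, 8/5]

def mean_error_cents(divisions):
    # Mean distance (in cents) from each simple ratio to the nearest
    # step of an equal-tempered scale with `divisions` steps per octave.
    step = 1200 / divisions
    errors = []
    for r in SIMPLE_RATIOS:
        target = 1200 * log2(r)
        errors.append(abs(round(target / step) * step - target))
    return sum(errors) / len(errors)

for n in (7, 12, 19, 23):
    print(n, round(mean_error_cents(n), 1))
```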
Metric structure of the Pitch-Continuum
The most important result of the perceived nodal structure of the pitch
continuum is in giving us a means of measuring distance in the dimension
of pitch. We will say that the pitch dimension has an audible metric. To
explain what this means let us consider two separate sound-systems. In the
first we deal with pairs of stable pitches. If we put the first pitch on a fixed
note and then vary the register of the other pitch, it is always possible to say
of the interval between these two pitches that it is smaller or larger than
another interval. If we now repeat this experiment with two noise-band
sources filtered so as to be of particular colours, but sufficiently wide so as
not to present any aural experience of a definite pitch, we can produce the
same result. Keeping one band fixed, while changing the register of the
second band we can always judge whether the interval between the two
bands is smaller or larger than another interval. If, however, we now change
the register of both sounds in the two sets, our experience is quite different.
First of all, we play two different pitch-sounds; and then we move both
pitches to different registers and listen again. In this case, we can still say
which interval is larger or smaller. Repeating the experiment with the noise-
bands, however—here it is very important that the noise-bands are
sufficiently wide not to present any pitch characteristic—there is no way in
which we can judge which ‘interval’ is larger because we have no frame
against which to measure the distance between the bands.
It should be stressed here that we are talking about our aural
experience. It would of course be possible to make physical measurements
with appropriate instruments and establish the frequency separation
between the central frequencies of the bands of noise and from this
establish a ratio of these frequencies which we could then compare between
different experiences. The problem is that aurally we are not able to do this.
The reason for this we can now see is simply that the dimension of pitch has
a nodal structure. Given two pitches sounding together we do not have to
rely merely on the linear separation of the two sound-objects along the
dimension of pitch (the criterion of adjacency) but we can relate them to
adjacent nodes and thus, via the principle of harmonicity, establish their
intervallic distance. In the case of the noise-bands, the dimension of ‘noise
colouration’ has no perceivable nodal structure and therefore we can only
have a sense of linear distance (principle of adjacency) between the objects.
This does not suffice for comparing intervals originating from different base
lines. We will express this difference between the two systems by saying
that, at least in our aural experience, the dimension of ‘noise colouration’
has no metric.
It is the existence of this underlying nodal structure and the resultant
ability to define an audible metric on the dimension of pitch which permits
us to establish subtly different nodal scales (as, for example, in Indian
music). It might at first seem that a mode might be exhaustively described
in terms of the intervallic distance between successive notes as one ascends
the scale. The question is, however, how does one know that a particular
interval is larger or smaller than another, especially on a very small-scale?
What accounts for our ability to perceive the subtle intervallic differences
between different modes in Indian music? The answer is that we do not
relate merely to the frequency distance between the notes but to the
underlying nodal structure of the pitch dimension. We are able to tell where
the individual notes of the mode are in relation to the nodes in the pitch
dimension. We can tell with a fair degree of subtlety whether a particular
note is very close or not quite so close to a particular nodal point. It is this
which gives us a sense of measure and enables us to distinguish subtle
differences between modal structures.
We might now consider the question, is it possible to establish nodal
structures in ‘noise-colouration’ space? Can we have modes made up out of
(unpitched) bands of coloured noise? We can, of course, artificially define
such a mode; but if we define two such modes, each with very slight
differences between the placement of certain bands, can we distinguish the
two as different musical entities? The answer to the question is probably in
most cases no, and that even where we can we will experience no
qualitative difference in the nature of the music based upon the two
different modes. This is because, as there is no underlying nodal structure to
the dimension of ‘noise colouration’ then there is no qualitative point of
reference enabling us to experience the two structures in a musically
different way. It should be said that in the world of serialist
permutationalism where the nodal structure of the pitch dimension is often
ignored and pitch levels treated as abstract permutable entities (like sizes
of shoes, having no intrinsic relationships among themselves, only the
extrinsic relationship of ordering in a set), then this distinction may be
difficult to comprehend. We are assuming, however, that formalists will
have abandoned this book after reading Chapter 1.
Two other features are of great importance in the conception of music
on the lattice. The first is that the set of nodes is finite and closed,1 in
the sense that once we reach the octave the set of nodal values is, in a
clearly definable sense, reproduced over the ensuing octaves. Music based
exclusively on the lattice is thus a finite closed system with a metric. This is
a more precise exposition of Boulez's conception of a music of hierarchic
relationships upon a finite set of possibilities.
This conception can be extended to the harmonic system of Western
tonal music. First of all let us note that one feature we have not discussed in
the definition of a mode is the ability to recognise the root (or dominant
tone). Where we have an entirely symmetrical intervallic structure (such as
in a chromatic scale or the whole-tone scale) a root can only be defined by
emphasis. Normally, however, the scales used in typical musical systems
are asymmetrical in intervallic structure. This allows us to define where we
are in the scale in relation to any particular note. If a root has been
established we can therefore relate where we are in the scale to that root,
even if the root itself has not been sounded for a very long time. The
asymmetry allows us to tell where we are in relation to an absolute point of
reference.
If our asymmetric scales are selected from an underlying symmetric
set (for example the chromatic scale) then we can define scales having
identical asymmetric structures (for example the major scale, T T S T T T
S, where T is the interval of a tone and S of a semi-tone) but with different
roots. If we now compare the members of these various scales, we will find
that the scales on certain roots will have more notes in common with the
scale on a particular root than others. This again allows us to define a
concept of harmonic distance between keys. Note that the major scale is
chosen so as to establish the closest relationship between scales whose roots
are a fifth apart. Experimentation with modal structures will reveal that it is
possible to construct scale systems where the closest relationship between
roots, as defined by numbers of notes in the scale in common, is the interval
of, for example, a minor third.
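The counting argument above can be made concrete. In this sketch (an illustrative fragment; the roots are numbered in semitones from an arbitrary reference) the major scale is represented as its T T S T T T S semitone offsets and the notes shared by scales on two roots are counted:

```python
MAJOR = [0, 2, 4, 5, 7, 9, 11]   # T T S T T T S as semitone offsets

def common_notes(root_a, root_b):
    # Count the pitch classes shared by major scales on the two roots.
    a = {(root_a + d) % 12 for d in MAJOR}
    b = {(root_b + d) % 12 for d in MAJOR}
    return len(a & b)

print(common_notes(0, 7))   # 6 -> roots a fifth apart share six of seven notes
print(common_notes(0, 1))   # 2 -> roots a semitone apart share only two
```

No pair of distinct roots shares more than six notes, confirming that the major scale makes the fifth the closest key relationship.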
We can measure harmonic distance in relation to the cycle of fifths
(in fact the cycle of approximate fifths used in the tempered scale system)
and we can aurally perceive the measure of distance between two keys. It is
interesting to note that in the relationship between major keys we might
presuppose that the simple Pythagorean relationship of a fifth between the
roots of the scales of keys was the predominant factor but when we look at
relationships between major keys and their relative minor keys, we see that
in fact common set membership is the predominant perceptual force. There
are twelve notes in the cycle of fifths, which, being a cycle, is of course
closed, hence we can see that the Western harmonic system is also a finite
closed system with a metric (see Figure 4.3).
Can we expand any of the insights we have gained from our analysis of the
structure of the pitch dimension to an understanding of the world of timbre?
Some crude attempts have of course been made to expand the ideology of
lattice-based music to the organisation of timbre but this is, I feel, merely an
a priori imposition upon the object of musical study. We can, in fact, draw
upon the insights we have already gained, but the conclusions we will reach
will be radically different from those of the formalists. The area of timbre will be
seen to have a radically different structure from the dimension of pitch. This
does not mean that we should abandon it or regard it as essentially
secondary in musical practice, but merely that we should investigate what
criteria of sonic organisation would be appropriate to this particular area.
The first obvious remark we should make about timbre is that it does not
have one single dimension, as does the pitch continuum. This finding often
surprises musicians brought up exclusively in the tradition of Western
instrumental music where timbre has been streamed in specially
acoustically-refined instruments and adapted to the logic of pitch/duration
lattice architecture. It is obvious, however, from our discussion in the
previous section that timbre is a multi-dimensional phenomenon.
David Wessel conducted some preliminary psycho-acoustic
experiments to establish whether any structure exists in this timbre space. In
one experiment timbre was plotted in a two-dimensional space in
which one dimension relates to the quality of the ‘bite’ in the attack and the
other to the placement of energy in the spectrum of the sound (its ‘brightness’)
(Wessel 1979: 49). By this means Wessel has demonstrated that there is in
fact a continuum of values existing within this space which can be
perceived by the listener (Example 4.1).2 I also recall a brief discussion at
IRCAM (undocumented) on this topic between David Wessel and Tod
Machover in which two contrary views were expressed: roughly speaking
that, on the one hand, there is the possibility that the timbre domain will be
discovered to have a structure which we can relate in some way to the
structure of the pitch dimension, and on the other that the timbre domain is
quite distinct in its structure from the pitch dimension. My musical
experience leads me to favour the latter conclusion. But not, therefore, to
come to the Boulezian conclusion that timbre is essentially secondary, of
necessity, in any conceivable musical practice.
Figure 4.3 Representation of Western harmonic system as finite and closed in which harmonic
distance can be measured.
Figure 4.12b The ‘butterfly catastrophe’ and its associated shock wave (after Thom (1975)).
A first reading of Chapter 3 may have seemed to imply that we only run
into the area of non-metric adjacency-based organisation when we leave
sounds with simple stable harmonic spectra and deal with sounds of
complex mass or significant noise structure. This would be a misreading.
Although we can define a metric on the dimension of pitch, we do not have
to do so. Let us define some sound-objects based on elementary spectra
which are not amenable to the pitch-lattice description. The simplest object
will be a sine tone with portamento. In lattice-based music such portamento
events are perceived to centre on the pitch of the start or the end of the
portamento. However, we can design a glissando in such a way that it is
very smooth and has such an envelope that the beginning and end do not
significantly stand out from the rest. A music made up entirely of such
sound-objects would fail to draw our attention to the nodal structure of the
pitch-dimension because, without imposing some very special means of
organisation upon the music, nothing in the musical structure would lead us
to focus our attention upon a point of reference which would enable us to
define nodes in the pitch dimension and hence relate sound-events to these.
Continuing with the same material we may imagine a dense texture of such
portamentoed sine tones constructed with such an average density that no
particular pitch centre was predominant. Finally, we may imagine sweeping
a filter across this texture in an arch form (see Figure 5.1). This final object
has a clearly-defined structure of pitch motion imposed upon the texture of
elementary pitches-in-motion but nowhere can we define the sense of a
pitch in its traditional lattice-based sense.
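A minimal synthesis sketch of such a sound-object — a smoothly enveloped portamento sine tone — might look as follows. This is an editorial illustration; numpy, the sample rate, and the choice of a raised-cosine envelope are assumptions:

```python
import numpy as np

SR = 44100  # assumed sample rate

def smooth_glissando(f_start, f_end, dur):
    """A sine tone gliding from f_start to f_end (Hz) over dur seconds,
    with a raised-cosine envelope so that neither endpoint stands out
    as a reference pitch."""
    t = np.arange(int(dur * SR)) / SR
    freq = np.linspace(f_start, f_end, len(t))      # linear glide
    phase = 2 * np.pi * np.cumsum(freq) / SR        # integrate frequency
    env = 0.5 * (1 - np.cos(2 * np.pi * t / dur))   # fade in and out
    return env * np.sin(phase)

tone = smooth_glissando(220.0, 440.0, 2.0)
```

Superimposing many such tones with randomised start points, ranges and durations would yield the dense nodeless texture described above, onto which a swept filter could then impose a higher-level pitch-motion.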
Let us now define the concept of dynamic morphology. An object
will be said to have a dynamic morphology if all, or most, of its properties
are in a state of change—I use the word properties rather than parameters
here, because I feel at this stage that it is important to view sound-objects as
totalities, or gestalts, with various properties, rather than as collections of
parameters. The concept of a musical event as a concatenation of
parameters arises directly from lattice-based musical thought and is
singularly inappropriate to the musical structures we are about to discuss. In
general, sound-objects with dynamic morphology can only be
comprehended in their totality and the qualities of the processes of change
will predominate in our perception over the nature of individual properties.
From here onwards we will assume that sounds with spectral glide are a
special sub-category of sounds with dynamic morphology. There is,
however, a further class of sounds to be considered: sounds of unstable
morphology. These may be conceived of as sounds which flip rapidly back
and forth between a number of distinct states. In my own writing I often
refer to these as multiplexes. Such sounds are coherent in the sense that the
overall field of possibilities remains constant but the immediate state of the
object is constantly changing in a discontinuous fashion. Example 5.3
illustrates a typical vocally-produced multiplex. To complicate matters
further, multiplexes themselves may have a dynamic morphology! In this
case, the individual components of the multiplex undergo a
process of gradual change through the timbre space so that the general field
characteristic of the multiplex changes with time. This is illustrated in
Example 5.4.
At this stage, anyone thoroughly enmeshed in the lattice-based mode
of musical thinking may feel that such objects are essentially formless and
incapable of any coherent musical organisation. In fact, however, the
morphology of such objects is a significant recognition indicator in our
everyday experience. To take two simple examples: first of all the sound of
ducks, which is normally imitated in the English language by the word
“quack”; the most striking feature of the duck call and the only real feature
which is paralleled in the word “quack” is the spectral glide characteristic
as the formant structure moves from a stressing of the lower formants to a
stressing of the higher formants (caused in the human, and presumably also
in the duck, by the gradual but rapid opening of the vocal cavity).
More significantly, morphology appears to be an essential
characteristic of recognition for certain consonant sounds in speech
discourse. In the Chant programme, developed by Xavier Rodet and
colleagues at IRCAM (see Rodet, Potard and Barriere 1984), vowels have
been successfully modelled by defining their spectral (formant)
characteristics. These models can be used without great difficulty to model
strings of vowels which imitate vocal production very precisely. The
attempt to model consonants has however met with difficulty as consonant
structures have turned out to be extremely context-dependent. Although
spectral characteristics (including noise-based aspects) are important in our
recognition of consonants, they are not sufficient. What does appear to be
preserved, however, from case to case is the shape of the motion, or the
morphology of the consonantal sound-object.
1 It should be said that in order to represent timbre accurately we will need a number of dimensions
for timbre alone so that our representation should be in at least four, if not six, dimensions. We
maintain the fiction that timbre has only a single dimension here merely in order to be able to draw a
diagram.
2 Note that the three voices involved in the latter process form a single stream to the listener, as they
are projected via a loudspeaker system as emanating from a single, moving point in space—see the
comments on aural imaging in Chapter 3—the rests in the three parts are therefore not intended to
break up the perceived stream.
3 The only parallel in the Western instrumentarium is perhaps the crude one of muting with brass
instruments or strings, an all or nothing procedure in many cases.
4 The score gives 68 verbal descriptions of the sound qualities aimed at by the performers (playing a
large tam-tam with an enormous variety of implements of many materials, picked up actively with
microphones and processed with filters and potentiometers). These range from groaning, hissing,
yelling, rustling to quacking, fluting, whimpering, and murmuring (Ed.).
5 Later digital versions of the system utilised a portable EPROM instead of the tape, hence without
the need for a filter stage (see below) to extract the control voltage (Ed.).
Chapter 6
GESTURE AND COUNTERPOINT
Contrapuntal structure
Can we establish a truly contrapuntal method of working in the continuum?
To answer this question we will need to analyse the concept of counterpoint
in lattice-based music and attempt to generalise the conception so that it is
no longer dependent on the existence of a lattice structure. We should also
examine some existing approaches to contrapuntal structure in the
continuum.
The example from Pithoprakta by Xenakis, quoted in Chapter 2,
illustrates a form of rudimentary counterpoint. The three lines of string
sounds, none of which is based on the pitch lattice (glissandi of glissandi),
are in fact streamed in terms of a timbre lattice. We differentiate the three
streams because of their different and consistent timbral qualities. The three
lines certainly coexist in the same musical space and are heard as distinct
entities, but we can hardly describe them as contrapuntal as there is no real
interaction between the parts except in their coming together in the high
register before the sustained chord material is revealed.
In my own piece, Anticredos, for six voices using extended vocal
technique, (listen to Example 6.4), a simple example of stream-divergence
is set up. When we are dealing with continuously evolving streams, we can
imagine that a single stream evolves through the continuum into two
distinctly separated streams. These may be separated in pitch or pitch area,
timbral characteristics or space (in the example given, all three). The
important point is that this division of one stream into two may take place
quite seamlessly, an effect which would be, to say the least, extremely
difficult to achieve in lattice-based instrumentally streamed music. Also, in
this particular example, the two independent streams are themselves
undergoing continuous timbral transformation as they independently circle
the audience (the live work is projected on four loudspeakers surrounding
the audience). In this case, therefore, we have streaming without
instrumental streaming of timbre. However, stream divergence is the only
‘interaction’ between the two existing musical threads.
A more complex and highly-articulated development of the concept
of seamless divergence and merging of sonic streams may be imagined (see
Figure 5.2) and this would certainly be an entirely new realm of musical
development dependent on our acceptance of the multi-dimensional
continuum as a valid substrate for physical composition. The gradual
separation or reintegration of the streams might be emphasised by the
imaginative use of spatial movement (another continuum parameter).
We may, however, develop a conception of counterpoint closer to
what we understand by that term in lattice-based music theory. Let us first
of all attempt to make a generalisable analysis of the substance of the
contrapuntal experience. For us to describe the musical experience as
contrapuntal in a more conventional sense, it is not sufficient for us to
experience the mere coexistence of a number of musical streams. These
streams must be felt to relate to one another or interact in some way during
the course of their separate development. In lattice-based counterpoint this
will involve the ebb and flow of rhythmic co-ordination and harmonic
consonance (or ‘normality’ defined in some other sense) in the relationship
between the parts. Ideally, in tonal counterpoint, we should expect to feel
that the overall musical texture is ‘going somewhere’.
Thus, in addition to the streaming of individual parts, we can
establish two criteria for our recognition of a contrapuntal structure. First of
all, an architectural principle which supplies points of reference in the
overall progression of the musical material. This corresponds to the key
structure in tonal counterpoint. Secondly, a dynamic principle which
determines the nature of the motion. In tonal note-against-note counterpoint
this is related to the ebb and flow of rhythmic co-ordination and the ebb and
flow of harmonic consonance-dissonance that we have discussed
previously, both of which arise from the way in which notes in individual
parts are placed relative to notes in other parts. The lattice structure of tonal
music allows us to develop a detailed and elaborate sense of contrapuntal
development.
1 This articulation is achieved partly through using the cupped hands as a variable filter on the
continuous vocal source.
2 It should be emphasised that the speech is speech-like articulation, the language is imaginary.
3 Akin to tutti so long as we bear in mind that we are talking here about gestural structure and not
spectral type.
Part 2
Landscape
Chapter 7
SOUND LANDSCAPE
Any sound which has too evident an affinity with the noises of everyday life, [...] any sound of
this kind, with its anecdotal connotations, becomes completely isolated from its context; it
could never be integrated, [...] Any allusive element breaks up the dialectic of form and
morphology and its unyielding incompatibility makes the relating of partial to global
structures a problematical task.
(Boulez 1971: 22-23)
I thought it had to be possible to retain absolutely the structural qualities of the old musique
concrète without throwing away the content of reality of the material which it had originally.
It had to be possible to make music and to bring into relation together the shreds of reality in
order to tell stories.
(Luc Ferrari interviewed in Pauli 1971: 41)
Usually, any sort of live recording will carry with it information about the
overall acoustic properties of the environment in which it is recorded. These
might include the particular resonances or reverberation time of a
specifically designed auditorium or the differences between moorland (lack
of echo or reverberation, sense of great distance indicated by sounds of very
low amplitude with loss of high frequency components etc.), valleys
(similar to moorlands, but lacking distance cues and possibly including
some specific image echoes) and forests (typified by increasing
reverberation with distance of the source from the listener). Such real, or
apparently real, acoustic spaces may be recreated in the studio. For
example, using the stereo projection of sounds on two loudspeakers we may
separate sound-sources along a left-right axis, creating a sense of spatial
width. We may also create a sense of spatial depth by simultaneously using
signals of smaller amplitude, with their high frequencies rolled off.
Depending on which type of acoustic space we wish to recreate, we might
also add increasing amounts of reverberation to these sources, the more
distant they appear to be. In this way, depth is added to the image and we
create an effective illusion of two-dimensional space. This illusion is
enhanced if sound-objects are made to move through the virtual space (see
Figure 7.5). (A detailed discussion of the control of spatial motion will be
found in Chapter 10.)
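The two depth cues described here — reduced amplitude and high-frequency roll-off — can be sketched in code. This is an editorial illustration only: the 1/distance gain law, the mapping from distance to cutoff frequency, and the one-pole lowpass are all assumptions chosen for simplicity:

```python
import numpy as np

def apply_depth(signal, distance, sr=44100):
    """Crude distance cues: amplitude falls off with distance and high
    frequencies are rolled off by a one-pole lowpass filter."""
    gain = 1.0 / max(distance, 1.0)         # assumed 1/d amplitude law
    cutoff = 16000.0 / max(distance, 1.0)   # assumed: farther = darker
    a = np.exp(-2.0 * np.pi * cutoff / sr)  # one-pole feedback coefficient
    out = np.empty(len(signal))
    y = 0.0
    for i, x in enumerate(signal):
        y = (1.0 - a) * x + a * y           # lowpass the sample
        out[i] = gain * y
    return out
```

As the text suggests, a fuller illusion would also mix in an increasing proportion of reverberation for more distant sources, depending on the type of acoustic space being simulated.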
The digital technique known as convolution allows us to impose in a
very precise manner the acoustic characteristics of any preanalysed sound-
environment upon a given sound-object. Ideally, the sound-object itself
would be recorded in an anechoic environment. To implement the process
of convolution we begin by measuring the impulse response of the acoustic
environment we wish to recreate. This involves playing a single very brief
impulse (Figure 7.6a1) with a very broad and flat spectrum (see Chapter 3),
e.g. a gunshot, into the natural acoustic environment and recording the
result; this is represented in Figure 7.6a2 as a series of ‘instantaneous’
digital samples - the impulse response of the environment. Provided all
audible frequencies are equally represented in the initial impulse, the
resulting recorded signal should indicate the overall resonance
characteristics of the environment. (If the impulse is specifically pitched we
will be measuring only the resonance characteristics of the environment at
some particular frequency.) Let us now assume that we have a digital
recording of our sound-object. Because digital recording involves a
sampling process, the sound-object may be regarded as a collection of
instantaneous impulses which taken together define the overall waveform of
the sound-object. If we now replace each individual sample in the digital
representation by the graph of its impulse-response (which will be
magnified or reduced according to the amplitude of each impulse) and then
sum at each sampling instant the resultant values (see Figure 7.6b4) we
obtain a waveform corresponding to the sound perceived as if the sound-
object had been recorded in the sound-environment which we analysed.
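The sample-by-sample procedure just described is precisely direct-form convolution, and can be sketched as follows. This is an illustrative toy (an editorial addition, far too slow for real use; practical convolution reverbs use FFT-based fast convolution):

```python
import numpy as np

def convolve_with_room(dry, impulse_response):
    """Replace each input sample with a copy of the impulse response,
    scaled by that sample's amplitude, and sum the overlapping copies
    at each sampling instant."""
    ir = np.asarray(impulse_response, dtype=float)
    out = np.zeros(len(dry) + len(ir) - 1)
    for n, sample in enumerate(dry):
        out[n:n + len(ir)] += sample * ir
    return out
```

The output is longer than the input by the length of the impulse response minus one sample — the 'tail' of the room's reverberation after the dry sound has ended.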
Figure 7.5 Representation of depth in a stereo field.
Figure 7.9 Mental reconstruction of an image from masked data.
Figure 7.10 Schematic representation of ‘book-slam’ transforming into ‘door-slam’ (from Red Bird).
Figure 7.11 Transformation of ‘Liss-’ (from ‘listen’) into birdsong (from Red Bird).
The intrinsic ambiguity of aural space also means that certain kinds
of transformations may be effected in aural space which it is very difficult
to relate in any way to a visual analogue. In the piece I am sitting in a room
by Alvin Lucier (Example 7.11) the initial sound-image is that of a voice
speaking in a room with a given acoustic (at this stage our attention is not
drawn to the room acoustics). The voice is then recorded and the recording
played back into the room. This process is repeated over and over again. As
this process proceeds the recording becomes increasingly coloured by the
room acoustic until finally at the end of the piece we hear essentially the
room resonance vaguely articulated by the amplitude fluctuations of the
voice. In this case our perception of what is the sound-object and what is
the acoustic space in which it is projected have been conflated. At the
beginning of the piece we would unreservedly state that the sound-object is
the voice. At the end of the piece the sound-object is clearly a more
‘abstract’ entity whose characteristics derive from the room acoustic.
Somewhere in between these extremes our perception passes over from one
interpretation to the other. Not only, therefore, can we control the
dimensions of, on the one hand, simple recognition/non-recognition and on
the other hand recognition-as-A/recognition-as-B, but also the dimension
acoustic-space/‘sound-object within an acoustic space’.
From what has been said so far about the intrinsic ambiguity of aural
space, it might seem unlikely that we could generate an aural image which is
specifically ambiguous, i.e. which has two very specific interpretations.
This, however, can be quite simply achieved in certain cases, particularly
where one of the sound-sources is the human voice. Thus we may use the
vocoder to impose the articulatory structure of unvoiced speech onto, for
example, the sound of the sea. The two recognisable aural images remain
simultaneously perceptible in the resulting aural stream. Similarly the
digital technique of cross-synthesis allows us to transfer certain
characteristics (e.g. the changing formant structure) of one recognisable
sound-source onto another recognisable source, creating sound-images
which demand two simultaneous landscape interpretations.
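One crude frame-wise version of such a cross-synthesis can be sketched in code. This is an editorial illustration, not the specific technique referred to in the text: real implementations work on overlapping windowed frames over time and usually smooth the imposed spectral envelope, whereas this sketch treats a single frame:

```python
import numpy as np

def cross_frame(source, modulator):
    """Impose the spectral magnitude envelope of one frame of
    `modulator` on the corresponding frame of `source`, keeping the
    phase structure of `source`."""
    n = len(source)
    win = np.hanning(n)                       # reduce edge discontinuities
    S = np.fft.rfft(source * win)
    M = np.fft.rfft(modulator * win)
    hybrid = np.abs(M) * np.exp(1j * np.angle(S))
    return np.fft.irfft(hybrid, n=n)
```

Applied frame by frame to, say, the sea and a voice, the result carries the articulatory contour of one source within the spectral substance of the other — hence the two simultaneous landscape interpretations.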
One final comment on ambiguity and recognisability: we might ask
the question, what enables us to recognise a sound as ‘like’ another? What
is the aural basis of mimicry? If we are capable of perceiving that certain
sounds, even so-called abstract sounds, are similar to other, possibly
concrete, sounds, does this not then affect our perception of all sonic
structures? Is there any kind of relationship between our perception of the
morphological properties of the natural sonic landscape and our
appreciation of composed sound-events and sound-structures? This point
will be discussed more fully in the following chapter.
The true answer is to be found, I think, in the characteristic that myth and music share of
both being languages which, in their different ways, transcend articulate expression, while at
the same time—like articulate speech, but unlike painting—requiring a temporal dimension
in which to unfold.
(Lévi-Strauss 1970: 15)
Sound-image as metaphor
Figure 8.2 Comparison of music-drama with electroacoustic sound-image composition.
Figure 8.6 The three stages of evolution of screamed “rea-” (from reason) into a clock tick.
At the same time animal, bird and human vocal sounds are integrated into
the mechanical as the squeaks and squeals of the machinery. These may
then:
By comparison with the barbaric challenge of the sea, the wind is devious and equivocal.
Without its tactile pressure on the face or body we cannot even tell from what direction it
blows. The wind is therefore not to be trusted. [...] Jung speaks of the wind as the breath of
the spirit. ‘Man's descent to the water is needed in order to evoke the miracle of its coming to
life. But the breath of the spirit rushing over the dark water is uncanny, like everything whose
cause we do not know—since it is not ourselves. It hints at an unseen presence, a numen, to
which neither human expectations nor the machinations of the will have given life. It lives of
itself, and a shudder runs through the man who thought that “spirit” was merely what he
believes, what he makes himself, what is said in books, or what people talk about’. (Schafer
1977: 171)
Schafer goes on, however, to point out that modern man, living in cities,
sheltered from the elements in air-conditioned buildings and travelling
between continents in aeroplanes, tends not to perpetuate this primeval
symbolism. The sea, for example, becomes a romantic image associated
with holidays. How, then, can we use any metaphor with the certainty that it
will be understood? The answer, I think, lies in the embedding of the
metaphor in a structure of interrelationships and transformations, as in Red
Bird, such that various oppositions and distinctions are established. This is
very much the way that musical objects (e.g. motifs) operate. The
significance of the symbolisation is clarified through its relation to other
symbols. Through suitable structures we could establish either the primeval
or the romantic symbolism of the sea or in fact both, and generate subtle
resonances and transformations between the two interpretations.
1 The syllable ‘lisss’ is understood to be from the phrase ‘listen to reason’; in Red Bird this is
established by context.
2 In fact, the programme Chant now makes this transformation quite straightforward and we may
anticipate that problems of sound-transformation will become increasingly tractable as our
experience with computer synthesis and analysis increases.
Chapter 9
IS THERE A NATURAL MORPHOLOGY OF
SOUNDS?
The reason for making this distinction is simply that the imposed
morphology tells us something about the energy input to the system and
ultimately relates to what we have called the gestural structure of sounds.
Clearly we can gain more information about this energy input where it is
continuous and least where it is in the form of an initiating impulse. Where
energy (mechanical, hydraulic, aerodynamic or electrical) is continuously
applied to the system, we can follow its ongoing subtle fluctuations. The
sounding system is gesturally responsive. Where a sound-event is initiated
by an impulse (e.g. a drum or bell-stroke), however, very little gestural
information can be conveyed—effectively, only a difference in loudness
relating to the force of the impulse. Iterative continuation is ambiguous in
this respect. Iteration may be entirely an aspect of the applied force (as in
the case of the xylophone ‘trill’), purely an aspect of the physical nature of
the medium (vocal fry or slack double bass strings), or an interacting
mixture of the two (a drum-roll).
Clearly, on a synthesiser we can generate events of any kind without
actually supplying any immediate energy input from our own bodies. Two
things need to be said about this. First of all, the mode of continuation (and
attack-structure, articulation etc.) of a sound will tend to be read in terms of
the physical categories I have described. The distinction between, for
example, continuous and impulse-based excitation is not a mere technical
distinction but relates to our entire acoustic experience and ‘tells us
something’ about the sound-object even though it may have been generated
by an electrical procedure set up in an entirely cerebral manner. We can, of
course, transcend these categories of the physical experience of sound-
events, but I would suggest that we do so in the knowledge that this
background exists. In a similar way, for example, we may generate
glossolalia through the statistical analysis of letter frequencies in texts.
Hayes (1983) generated the following examples from analyses of Latin
(Virgil), Italian (Dante) and French (Flaubert) respectively:
AD CON LUM VIN INUS EDIRA INUNUBICIRCUM OMPRO VERIAE TE
IUNTINTEMENEIS MENSAE ALTORUM PRONS FATQUE ANUM ROPET PARED LA
TUSAQUE CEA ERDITEREM [...]
QUALTA'L VOL POETA FU’ OFFERA MAL ME ALE E'L QUELE ME’ E PESTI FOCONT
E'L M'AN STI LA L'ILI PIOI PAURA MOSE ANGO SPER FINCIO D'EL CHI SE CHE CHE
DE’ PARDI MAGION[...]
PONT JOURE DIGNIENC DESTION MIS TROID PUYAIT LAILLE DOUS FEMPRIS ETIN
COMBRUIT MAIT LE SERRES AVAI AULE VOIR ILLA PARD OUR SOUSES LES
NIRAPPENT [...] (Hayes 1983: 19)
But the reader will always hear or read the results against the background of
his knowledge of one or several languages. The forms of sound-objects are
not arbitrary and cannot be arbitrarily interrelated.
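A procedure of the kind Hayes used can be approximated with a simple letter-trigram model. This is an editorial reconstruction for illustration; the actual analysis method is not specified in detail here:

```python
import random
from collections import defaultdict

def train(text):
    """Record which letters follow each two-letter context in the text."""
    model = defaultdict(list)
    for i in range(len(text) - 2):
        model[text[i:i + 2]].append(text[i + 2])
    return model

def glossolalia(model, length, seed, rng=None):
    """Generate pseudo-text whose short-range letter statistics mimic
    the source corpus."""
    rng = rng or random.Random(0)
    out = seed
    while len(out) < length:
        followers = model.get(out[-2:])
        if not followers:
            break                      # dead end: context never seen
        out += rng.choice(followers)
    return out
```

Trained on a Latin, Italian or French corpus, such a model emits strings with the flavour of those languages — yet, as the text insists, a reader always hears the output against the background of languages actually known.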
Composers who have weighted their activities towards live
electronics rather than studio-based synthesis seem to me to have been
strongly affected by the fact that a morphology imposed upon electronic
sound-objects through the monitoring of performance gesture can be much
more refined and subtle than that resulting from intellectual decisions made
in the studio. The directness of physiological-intellectual gestural behaviour
carries with it ‘unspoken’ knowledge of morphological subtlety which a
more distanced intellectual approach may not be aware of. This is not to say
that theorising cannot lead to interesting results, but that it can lead to a loss
of contact with the realities of the acoustic landscape.
Even where the imposed morphology is a mere impulse, the loudness
of the sound carries information about the force of that impulse. The
association of loudness with power is not a mere cultural convention
although loud sounds have often been used to project political potency
(massed orchestras and choruses, the musicians of the Turkish army etc.).
As far as we know, continuous changes in overall dynamic level
(crescendos, diminuendos) were an invention of the Mannheim school of
symphonic composition in the eighteenth century (though of course it was
possible to differentiate different dynamic levels on instruments such as the
organ in previous ages). The formalistic assignment of a series of different
dynamic levels to musical objects, which was experimented with in the total
serial aesthetic, leaves a sense of arbitrariness or agitation (neither of which
is usually intended) because it ignores the landscape basis of our perception
of loudness.
Sounds undergoing continuous excitation can carry a great deal of
information about the exciting source. This is why sounds generated by
continuous physiological human action (such as bowing or blowing) are
more ‘lively’ than sounds emanating, unmediated, from electrical circuits in
synthesisers. The two natural environmental sounds, not of human origin,
which indicate continuous excitation—the sound of the sea and that of the
wind—tend to have an interesting ongoing morphology which may relate to
the symbolic associations of these sounds (see the previous chapter). In the
case of the sea, the excitation (the pull of the moon's gravity) may be
regular but the form of the object (the varying depth of the sea) results in a
somewhat unstably evolving (intrinsic) morphology. The sound ‘of the
wind’ is usually in fact the sound of something else animated by the motion
of the wind. In this case it is the exciting force (the wind itself) which varies
unpredictably in energy giving the sound its interestingly evolving
(imposed) morphology. Murray Schafer has pointed out in his book The
Tuning of the World that it is only in our present technological society that
continuous sounds which are completely stable (the hum of electrical
generators etc.) have come to be commonplace (Schafer 1977, Chapters 5
and 6). The ability of the synthesiser to generate a continual stream of
sounds says something about our society's ability to control energy sources;
but if we take this continuous streaming for granted, like the humming of
machinery, it tends to lose any impact it might have had on the listener. The
machine has no intentions and therefore it inputs no gestures to its sound.
The synthesiser can sound the same way!
Some conclusions
We are going to analyse the situation in which the listener is situated at the
centre of a virtual acoustic space so that sound-objects may appear in front,
to the left, to the right and behind the listener (see Figure 10.7). The
listener, in fact, forms a frame of reference for this space which allows us to
talk about ‘in front’, ‘behind’, ‘left’, ‘right’. From a purely geometric point
of view the space is entirely symmetrical and there are no preferred
directions. For the listener, however, certain directions have different
psychological implications to others so that the frame of reference we are
imposing on the space is not just a convention related to the ear-geometry
of the head but a psychological/aesthetic aspect of our perception.
Figure 10.6 Non-connected and connected paths.
Figure 10.10 Object and frame rotations: distinction of the mathematical and the perceptual.
Direct motions
It is of course possible to take any direct motion and retrace the path
in the opposite direction, thereafter repeating this cycle. I would, however,
prefer to call this motion an oscillation. The motion is partly defined by its
two end points and essentially oscillates between these two positions. There
is no such sense of oscillation in circular motion. As all points along the
circle are equivalent, there is no ‘turning point’. Again, this is not a
mathematical or a semantic distinction but a question of the aesthetic import
of such motions. We might liken circular motion to the motion of a Shepard
tone which, though apparently continually rising, never in fact moves out of
its initial tessitura. An oscillation, on the other hand, is much more like a
trill or vibrato. If we imagine a circular motion in which the diameter of the
circle successively decreases and increases in a cyclic fashion, then the
circular motion would take on the character of an oscillation. These
distinctions begin to blur when we consider eccentric circular motion or
narrow eccentric ellipses (see Figure 10.20).
A related movement type is spiral motion (see Figure 10.21). An
inward spiral which approaches the centre slowly may be perceived as a
circular motion in which the frame is contracting towards the centre (see
Figure 10.22). More commonly, however, spiral motion will be perceived as
direct, as it has a definite start and end point: it is not cyclic. This is
particularly evident in the case of a very shallow spiral (Figure 10.23)
which is more like a mellifluous spatial ornamentation of linear motion. In
between the two extremes, the spiral displays some characteristics of both
circular and direct motion. Like circular motion it tends to negate the
orientation of the space, making all directions equivalent. In its place,
however, and unlike circular motion, it establishes inwards and outwards
motion as significant. Motions which spiral inward and then outward or
vice versa (see Figure 10.24) should also be distinguished. Where this
motion is extended into an oscillation (inwards to outwards to inwards to
outwards etc.) we have the oscillating circular motion discussed previously.
It seems to me unlikely that in two-dimensional acoustic space, spiral
motion which is not centred on the listener's head can effectively convey
the vortex feeling of spiralling.
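Spiral motion of the kinds just distinguished can be sketched in the same parametric spirit. This is a minimal illustration under assumed parameter names: the radius changes linearly while the angle winds round; swapping the start and end radii gives the outward spiral, and a small radius change over few turns approximates the 'shallow' spiral.

```python
import math

def spiral(t, turns=3.0, r_start=1.0, r_end=0.1):
    """Inward spiral: radius shrinks linearly while the angle winds round.

    t runs from 0 to 1. With r_end > r_start the same code gives an
    outward spiral; few turns and a small radius change give the
    'shallow' spiral that reads as ornamented linear motion.
    Parameter names are illustrative, not from the text.
    """
    r = r_start + (r_end - r_start) * t
    a = 2 * math.pi * turns * t
    return (r * math.cos(a), r * math.sin(a))
```

Unlike the circle, this path has a definite start and end point, which is why it tends to be heard as a direct motion.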
Double motion
Figure 10.39 Pulsating loop motion.
Figure 10.40 Internal and external additions of circle and ellipse motions.
Figure 10.41 Internal and external additions of circle and zig-zag motions.
Figure 10.42 An elaboration of additions of circle and zig-zag motions.
Figure 10.43 Addition of circular motions.
Figure 10.44 A rotating single loop.
Figure 10.45 Four-cloverleaf formations.
Figure 10.46 Three-cloverleaf formations.
Irregular motion
Figure 10.47 Two types of four-cloverleaf formations.
Figure 10.48 ‘Butterfly’ pathways.
Figure 10.49 Irregular oscillating circular motion.
Figure 10.50 Localised and unlocalised irregular paths.
Figure 10.51 Time-weighted irregular paths.
Figure 10.52 Addition of irregular and zig-zag motions.
Figure 10.53 Addition of irregular and circular motions.
Figure 10.54 Addition of irregular and pulsating looped motions.
Time
A motion is characterised not only by its path in space but also by its
behaviour in time. We may distinguish the first order time properties
(different speeds of motion) and second order properties (the way in which
the speed changes through time, the acceleration or deceleration of the
motion). We might even consider in some cases third order properties of the
motion (the way in which the acceleration or deceleration changes through
time) but for the moment we will assume that this degree of precision is not
generally audible.
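These orders of time-property can be made concrete with finite differences over a sampled path. A minimal one-dimensional sketch (the names are illustrative): the first-order property is the speed between successive samples, the second-order property the rate of change of that speed.

```python
def speeds(positions, dt):
    """First-order time property: speed between successive path samples."""
    return [abs(b - a) / dt for a, b in zip(positions, positions[1:])]

def accelerations(positions, dt):
    """Second-order time property: rate of change of the speed."""
    v = speeds(positions, dt)
    return [(b - a) / dt for a, b in zip(v, v[1:])]

# A path traversed at constant speed shows (near-)zero acceleration.
steady = [0.1 * i for i in range(10)]
assert all(abs(a) < 1e-9 for a in accelerations(steady, dt=0.01))
```

A third-order property would simply be one further difference taken over the accelerations.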
Figure 10.55 Pulsating looped motion with two degrees of randomness.
The absolute speed of the motion will determine its perceived
aesthetic character. A very slow motion will be experienced as a mere
relocation of position or even as ‘drift’ rather than a movement with some
definiteness or ‘intention’. As the speed of the motion increases the
apparent energy associated with that motion is increased. Motion at
intermediate speeds has a feeling of definiteness or ‘purposefulness’, an
intention to get from one location to another. Fast motions carry a feeling of
urgency or energy. Where fast motion is introduced suddenly into a
relatively static frame, there is a sense of sudden surprise. The similar
introduction of a very slow motion into a static frame may induce a sense of
gentle disorientation. Very fast motion in a circle may even induce a sense
of head-spinning dizziness.
Considering now different categories of speed change, we may
broadly differentiate six classes of motion (see Figure 10.56). Accelerating
motions, with their sense of rushing towards a final position, increase
in spatial ‘definiteness’ or ‘intention’ and point to the significance of their
target point. They are a kind of spatial ‘anacrusis’. Decelerating motions, on
the other hand, have exactly the opposite effect, a definiteness in leaving
their point of origin and a sense of coming to rest at their target, a calming
or spatial ‘resolution’.
Accelerating-and-decelerating or decelerating-and-accelerating
motions allow us to define some new types of linear motion. Figure 10.57
defines a whole class of there-and-back linear (or narrow elliptical)
motions. Where these have a decelerating-accelerating time-contour they
are perceived as ‘thrown’ elastic motions. It is as if the sound-object is
thrown out from its point of origin on an elastic thread whose tension slows
down its motion and then causes it to accelerate back towards the source.
Simple constant speed motion along any of these paths would usually break
down in our perception into two separate motions, one in the outward and
the other in the inward direction. The time-contour, however, gives the
whole motion a special kind of unity. Conversely, the accelerating-
decelerating time-contour gives the feeling of ‘bounced’ elastic motion, the
motion gathering energy and then being forcibly repelled by the edge of the
space it defines, losing energy as it returns. Again, the overall there-and-
back motion is unified by the time-contour.
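The two elastic time-contours can be sketched as position functions for a single there-and-back motion over a normalised time span. A minimal illustration, with the cosine and quadratic shapes chosen purely for convenience (they are not specified in the text):

```python
import math

def thrown_position(t):
    """'Thrown' elastic (decelerating-accelerating): the object leaves
    fast, slows to a halt at the far point, then accelerates home.
    t runs from 0 to 1."""
    return math.sin(math.pi * t)

def bounced_position(t):
    """'Bounced' elastic (accelerating-decelerating): the object gathers
    speed outward, is abruptly repelled at the edge, and slows on return."""
    if t <= 0.5:
        return (2.0 * t) ** 2
    return (2.0 - 2.0 * t) ** 2
```

Both functions start and end at the origin with the turning point at t = 0.5; the difference lies entirely in where the speed is concentrated, which is what unifies each there-and-back path into a single gesture.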
Where a motion cyclically accelerates and decelerates, our aesthetic
interpretation may depend on our position in relation to the motion.
Consider the maximally-swung elliptic four-cloverleaf motion of Figure
10.47. We may apply a synchronised pattern of accelerating and
decelerating motion to this path in one of two ways. In the first case the
movement on the elliptic outer loops will accelerate whilst the motion close
to the observer will be slow. As the sound-object will therefore spend most
of its time circling slowly around the observer's head, the motion will
appear rooted in the centre but making dramatic swings out into the distant
space. The motion will thence appear ‘bounced’ elastic. In the opposite
case, however, the motion along the outer ellipses will be slow, accelerating
towards the centre and moving very quickly around the observer's head.
Here the sound-object will spend most of its time on the outer edges of the
space, making sudden (and perhaps disturbing) close loops around the
listener. The motion will then appear ‘bounced’ elastic but in the opposite
direction (inwards) to the first case (see Figure 10.58). We can imagine a
third situation in which the motion around the listener's head is at a medium
pace, suddenly accelerating before it moves off along the outer ellipses
where it decelerates. In this particular case the motion at the centre has a
stable phase (where it is moving at a medium rate) and the listener may thus
feel that the sound is rooted in the centre of the space but ejected to the
edges by ‘thrown’ elastic motion. This example illustrates the way in which
subtle interactions between motion contour and spatial path may influence
the aesthetic impact of a particular spatial motion. As another example,
consider the inward spiral (see Figure 10.21). Where this motion accelerates
towards the centre we have a sense of the sound-object rushing towards, or
being sucked into, the centre of the vortex. Conversely where the motion
decelerates there is a feeling of the sound-object coming to rest at the centre
of motion.
Frame motions
Figure 10.59 Time-weighted circling-looping motion.
Figure 10.64 Further two-dimensional frame motions (translations).
Two types of frame motion are of particular interest: the first (see
Figure 10.66) involves the expansion of sound-objects from the centre into
the surrounding space. If this is accompanied by an accelerating motion it
should give a sense of growth, whereas if accompanied by a decelerating
motion which is initially quite fast, a sense of exploding will be conveyed.
Conversely (see Figure 10.67) all the elements in a space may collapse into
the central position and, if this is achieved with an accelerating motion, a
sense of imploding will be created. In more complex situations we may
imagine most of the objects in the acoustic space undergoing a symmetrical
rotation whilst a single object pursues an independent course. How we
perceptually group the objects in these situations will depend partly on the
various relative motions of the objects and partly on various landscape
aspects of our perception (for example, recognition or sonic relatedness of
the sound-objects).
Figure 10.65 Two-dimensional frame distortions.
Figure 10.66 Frame motion (expansion).
Figure 10.67 Frame motion (contraction).
Some principles
We may draw the following set of conclusions to this part of our discussion:
(1) acoustic space is an oriented space: in particular, front and back are to
be clearly distinguished from one another;
(2) individual motions in the space may be direct or cyclical/oscillatory;
(3) motions may have more than one characteristic;
(4) a degree of irregularity may be imposed internally or externally on any
basic pattern of movement;
(5) the temporal characteristics of a motion will significantly affect its
character: with direct motion (or cyclical motions which have directed
characteristics such as the cloverleaf) the motion contour will
determine the ‘gestural’ feel of the motion, while with cyclical double
and random motions the motion contour will contribute to the spatial
structure of the path;
(6) in certain cases we may consider a one-dimensional or the entire two-
dimensional frame of reference to move.
Figure 10.68 Gestural interactions of spatial motions.
Figure 10.69 Synchronised motions with different spatial contour.
Figure 10.70 Multiple motions: symmetry considerations.
Figure 10.71 Spatial division and orientation from cyclic motion.
Figure 10.72 Spatial ‘harmonics’.
Figure 10.73 Temporal ‘harmonics’.
Figure 10.74 Spatial coordination of two motions.
In Figure 10.74 the two objects move on different paths. The two
paths, however, circulate around the space in the same direction (always
anticlockwise); they are therefore in some kind of spatial ‘harmony’ with
one another. If the cycle times are also coordinated so that, for example,
both objects reach the centre rear of the acoustic space at the same
moment, a further temporal ‘harmony’ is achieved between the two motions.
In Figure 10.75 a group of sound-objects rotates around the centre of the
space. If they all preserve the same angular velocity we hear merely a
rotation of a one-dimensional frame around the centre. If, however, they all
have the same linear velocity, the outer objects will gradually lag behind
the inner objects. The motions of the various objects are, however, still
spatially ‘harmonised’ with each other, or at least they set up a particular
feeling or structuring of the space which is more vaguely akin to an
inharmonic resonance. Examples of this type may be multiplied ad libitum.
Furthermore, gestural motions may be superimposed on these situations:
through the movement of other objects, through gestural articulation of the
cyclic motions, or through the consecutive use of gestural and ‘harmonic’
modes of organisation. The counterpoint of spatial motions is thus in itself
an extremely rich field for the sonic artist to explore.
Figure 10.75 Coordination of rotations.
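The distinction drawn above between equal angular velocity (a pure frame rotation) and equal linear velocity (outer objects lagging behind) can be made concrete in a small sketch; the function and mode names here are illustrative only.

```python
def rotation_angles(radii, t, mode="angular", speed=1.0):
    """Return (radius, angle) for each object circling the centre at time t.

    mode="angular": all objects share one angular velocity, so the whole
    one-dimensional frame rotates and relative positions are frozen.
    mode="linear": all objects share one tangential speed, so angular
    velocity is speed/radius and the outer objects gradually lag behind.
    """
    if mode == "angular":
        return [(r, speed * t) for r in radii]
    return [(r, (speed / r) * t) for r in radii]
```

With radii 1, 2 and 4 at t = 1, the angular mode returns equal angles for all three objects, while the linear mode returns angles 1, 0.5 and 0.25 respectively.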
Conclusion
There is clearly even more we can say about spatial motion. We have not
yet considered the up-down dimension; we have not considered oscillating
motions which are so fast as to produce amplitude modulation of the signal
(the timbral effects of spatial motion); we have not considered analogies
with the sphere of dance.
As the technology which permits us to analyse and control the
parameters for accurately locating sounds in space is further developed,
and as reproduction facilities and acoustics are improved,
we can expect this analysis to be expanded and refined; certainly at this
stage it cannot claim to be complete. The organisation of spatial motion is
undoubtedly a growth area in sonic art.
Utterance
Chapter 11
UTTERANCE
Man’s languages have objective status as internally organised systems that are independent
of the people who speak them, whereas animal communication is precisely the social
interaction of the animals. Notice that this difference is not just a matter of the cultural rather
than genetical transmission of human languages, for many animal signaling actions,
references, or significances may be culturally acquired either separately or together. [...]
Because languages are, as such, not behaviours, their properties cannot be compared with
the properties of animal communication.
(Bastian 1968: 589)
Imagine that we switch on the radio and tune into the sound of an orchestra
playing a familiar piece. The music proceeds for a short while but then we
begin to hear the sound of falling masonry. Performance of the music
becomes disorganised and stops. We hear the sound of chairs being knocked
over and running footsteps and then people screaming. How do we interpret
this sequence of events? The most likely interpretation is as follows: when
we begin to listen we hear the soundstream as music. For many listeners
(though not for all) even the landscape of ‘people playing instruments’ will
be ‘bracketed-out’, their attention will be focused on the syntax of pitch
relations. Once the masonry starts to fall, however, people's attention will
switch very rapidly to the landscape. It will be apparent that some event is
taking place in a location. The landscape of the musical performance can no
longer be bracketed-out. Finally when the screams are heard we perceive an
utterance, that is we assume the sounds that the people are making have
some fairly immediate intent and are not just some new development in
avant-garde musical technique. It may, of course, be objected that the
original musical performance is a type of utterance but it is clear even at
this stage that there is a marked distinction between the two types of sound-
event. In the first (orchestral) case, the utterance is highly formalised, the
sound is patterned according to conventional syntactic rules and we do not
even ascribe the patterning to the performers themselves. They are (up to a
point) merely the agents involved in producing the structure. In the latter
case (screaming) we assume, however, that the individuals involved ‘mean’
their utterances in some immediate sense, i.e. fear, danger. Although we can
make no absolute distinction, for the moment at least we will not describe
the conventionalised case as an utterance. As we proceed with our argument
these distinctions and their ramifications will become clearer.
Now imagine that all of a sudden the pandemonium ceases and an
announcer comes on the air to inform us that we have just been listening to
a new electroacoustic work by a certain young composer. In a sense, with
hindsight, we can now say that the entire experience was a formalised
utterance, a piece of clever tape montage created by a composer, but clearly
it is not as simple as this. Even where we know that we are dealing with a
conventionalised situation, we cannot normally completely obliterate from
our minds the interpretation in terms of (direct) utterance. If we compare
two extreme cases, for example: I tread on my dog's foot and it yelps;
someone says something which he does not mean, which is then quoted in a
play by an actor, a recording of which is then used as the basis for a tape
composition which is overheard on the radio by someone who speaks a
different language. Here we feel we can make a clear distinction between
what is a direct utterance and what is a completely different kind of sonic
‘communication’.
In most normal cases, however, where human beings are heard to
produce sounds, we will tend to impute intention to the sonic event.
We will hear it at some level as an utterance. In particular, whenever the
human voice is used as a source of sound in whatever context, the concept
of utterance will enter at some level into our apprehension of the event.
This becomes particularly important in the sphere of electro-acoustic music
projected in the virtual space of loudspeakers where we can no longer rely
on the physical and social cues of the concert hall to conventionalise and
sanitise the vocal events. In general, sounds produced by individual
creatures may be taken to indicate or express something about internal state,
reactions to environmental events, responses to utterances by other
creatures and so on, becoming more involved, convoluted and to some
extent detached as we move up the cerebral hierarchy, finally reaching the
etiolated heights of artistic manifestation. At whatever level, the sense of
utterance, whether as indicator, signal, symbol, sign or syntactic or
semantic-syntactic stream, enters into our perception of the events.
The study of utterance will have a bearing on sonic art in two related
ways. First of all, wherever voices enter into sonic structures, we will have
to deal with the special characteristics which pertain to the sonic
architecture of utterance, for example, universal utterance-gestures, para-
language, phonemic objects, language-stream articulation. At the same
time, aspects of utterance may be observed in, or structured into, the
morphology of other sound-objects and events. Just as we can imagine a
landscape containing an utterance, so we can imagine an utterance
containing a landscape (a crude example would be vocoded sea sounds).
Either of these may be aspects of an essentially musical composition.
The higher we go in the animal kingdom, the more diffuse and heterogeneous become the
motor zones, introducing a notion of the degrees of freedom. The production of complex
signals depends upon numerous centers which interfere with each other, and thus no longer
permit the ‘all or none’ responses of invertebrates or lower vertebrates. In mammals, zones
corresponding to a specific signal are not found. Instead, generalised phonation zones can be
described which are diversely activated by other centers concerned with different emotional
behavior patterns.
(Busnel 1968: 135)
The range of sound emission organs found in the animal kingdom is quite varied; they are
usually bilateral in invertebrates and very often unpaired in vertebrates. They may be
restricted to one sex, or they may present a considerable sexual dimorphism. They are found
on all different parts of the body. For example, the following may be found functioning as
sound emission organs in invertebrates: chitinous toothed files which, by friction, stimulate a
vibrating body—wing, elytra, antenna, thorax, leg, abdomen (Orthoptera, Crustaceans);
friction or vibration of nonspecialised organs such as the wings (mosquitoes and some
moths); semi-rigid plates on a resonant cavity stimulated by neuromuscular contractions
(Tymbal method—cicada); reed-like organs which function by aspiration and expiration of
air (death's-head hawk moth, Sphinx atropos). [...] two species have been found [...] which
can automate legs. These species have no special stridulatory organ; however, when the legs
are separated from the body, they emit sounds. When they are intact, they are silent. The
hypothesis is that the noise emitted by the leg attracts predators, leaving the animal free to
flee [...].
In lower vertebrates, [...] nonspecialised organs may produce friction, as do
vomerine teeth in certain fish; osseous, rattle-type apparatuses may be found, made up of
moving, oscillating parts which knock each other when agitated, as in rattlesnakes; whistling
or vibrating apparatuses which function by air expulsion through a more or less
differentiated tube (larynx) ending in an aperture (glottis) with more or less functional lips.
The expelled air is supplied by the lungs themselves or by being in contact with an air pocket
reserve [...] (vocal sac of some amphibians); and finally, membranes may be stretched over
resonating pockets (as is the swim bladder of fish). These apparatuses are activated either by
external percussion (fin beating) or by contraction of muscles disposed in different ways
around the cavity.
Sounds produced by nonspecialised organs are also found in higher vertebrates.
These include breast-beating in the gorilla, organ-clapping, such as wing-beating of the
wood pigeon, drum-rolling in the hazel grouse and gold-collared manakins and trembling of
remiges (primary) and rectrices (tail feathers) in the woodcock and snipe. Owls and storks
use their beaks, and some bats [...] and some insectivores [...] use their tongues. In many
higher vertebrates specialised organs are found, usually working by propelled or aspirated
air in a more or less differentiated tube equipped with modulating membranes or slit systems.
These organs are vocal cords, muscular glottal lips, the larynx [...] and the bird larynx and
syrinx. These apparatuses often have additional organs which form air reservoirs or
resonators (clavicular and cervical air sacs) as found in the bustard, ostrich, crane and
morse. In some monkeys these features are found in the thyroid cavity, as is the gibbon's vocal
sac or the hyoid bone resonating chamber of the New World howler monkey. Curious
peripheral sound organs are also found, such as the fifteen-spined sound apparatus in
tenrecs, [...] and the tail bell of the Bornean rattle porcupine.
(Busnel 1968:131–132)
Indicators
No cry leopard frogs emit in the laboratory, however, is comparable in intensity to the scream
heard near midnight on one occasion when a mixed chorus of over a dozen species of frogs
called from a single pond. Microphones were being disconnected when a piercing scream
came from a smaller pond nearby. A beam of light disclosed a raccoon scarcely twenty feet
away carrying a leopard frog in its jaws. The raccoon had seized its prey in shallow water
where numerous other frogs continued to call as though oblivious of its presence.
(Bogert 1960: 204)
Totemics
Rowell (1962) has been able to show that these nine sounds actually constitute one system,
linked by a continuous series of intermediates. Moreover there is one example of multi-
dimensional variation, the pant-threat grading independently into three other calls [...]
Rowell's descriptions also demonstrate correlations with a continuously varying set of social
and environmental situations [...]
(Marler 1965: 561)
Hence, just as with a sound-object of dynamic morphology we are able to
make gestural articulations in different dimensions simultaneously, the
rhesus monkey is able to ‘present’ its internal state in a multi-dimensional
field of utterance. We might imagine a single articulated signal carrying
information simultaneously about, for example, dependency, sexual arousal,
aggression and territoriality, a rich communication medium without the
benefit of the arbitrary sign of a language system. This state of affairs also
reflects the fact that the ‘internal states’ of such organisms have themselves
become multi-dimensional, complex worlds and further underlines a point
made in an earlier chapter about the inadequacy of a discrete verbal
vocabulary of ‘emotional states’ as a means of describing what is going on
within a being.
In addition, however, the repertoire of sound signals may be
sequenced in particular orders and gestural information conveyed through
this sequencing. Most animal studies are heavily concerned with the
development of denotative and arbitrary signs; they are searching for the
roots of human language and there is a tendency to assume that syntax
(rules of sequencing) implies semantics (in the sense of language). As every
musician knows, however, this does not follow. Just as previously we were
able to make a distinction between an indicator and a signal, we must now
note that there is some confusion between a signal and a symbol. For example,
a certain combination of fear, aggression and sexuality may produce a
particular level of arousal and a particular articulation of the internal state
causing the vocal apparatus to emit sounds of a particular form. As another
organism of the same species will recognise these sound-objects as if it
itself had emitted them, they may be taken to symbolise the particular state
of the first organism. However, we cannot therefore assume that the emitter
intended this symbolisation. Apart from the bringing into action of the
vocal apparatus as a whole, the resulting evolution of the sound-objects
may have been substantially involuntary, a direct utterance.
For the emission to be a symbol to the producer, an act of self-
mimesis is necessary. Mimesis (the imitation of sounds external to the
organism) can be observed in a number of animal and bird species. For
example Indian Hill Mynahs can mimic almost any noise presented to them,
mocking birds incorporate an enormous variety of other bird-sounds into
their repertoire and parrots may be taught to imitate human speech (mimetic
factors in fact enter into natural human languages; see the next chapter). We
may imagine now that a creature may mimic the fear cries of its fellow
creatures and finally that it may mimic its own fear cries, in other words
that it may pretend fear. At this point the signal “fear” becomes a symbol
for fear. This, however, is a difficult point to define. On seeing a predator a
monkey might emit a particular sound which we might take to be a signal of
fear or a symbol of the predator. If the predator is nearer the emission may
be louder. Does this mean, therefore, more-fear or predator-nearer? Where
does the signal end and the symbol begin? We may assume in fact that at
least until the emergence of the arbitrary linguistic sign, there is no absolute
separation. All symbols carry with them an element of physiological-
intellectual signalling to which other creatures respond in a very direct
manner. Once, however, we are able to sequence different signals, we may
convey gestural information in the overall sequence and contour of the
expression. The more this is the case, the less need the individual sonic
gesture units carry immediate signalling information. We generate a
separation (not unknown in music) between the microstructure and the
macrostructure of an utterance. Here, however, nothing is referential in the
sense of the arbitrary sign of language and although the microstructure
gestural units no longer carry such intense and immediate internal gestural
information, there is still a remnant of that original physiological-
intellectual response which allows us to differentiate and respond to them.
We have thus created an articulate gestural syntax which exists on (at least)
two levels.
In such a way we can evolve a multi-level syntax without the
linguistic ‘arbitrary sign’. We can even imagine referring to hypothetical
situations within the context of a real situation, once this operation of
syntactic levels is generated. I am not suggesting here that I know of any
non-human animals which have evolved such a system! Only that such a
system, without language, is conceivable.
Invention; convention
The Western classical art music tradition is often noteworthy for its
rejection of the concept of utterance. This may in some respects be traced to
the totemic function of music within society to emphasise group solidarity
in various ways. In situations where the activities of a large group of
musicians are co-ordinated to fulfil a certain predetermined musical end
(for example, fixed ritual observances associated with religious practice),
individual utterance is intentionally negated for the furtherance of a group
utterance manifested through the organisation of the musical materials re-
presented. Here also a second level of conventionalisation arises. Not
merely is a conventionalised structure of musical gesture used, but our
attention is directed away from the personal intent of the performers. In
music of the standard Western repertoire, the conventionalisation of
utterance is many-layered. The composer, conductor and the individual
performers each contribute a level of conventionalised utterance to the
overall sonic experience.
Utterance in the special sense in which we have used it in this chapter
can occasionally come to the forefront of our perception of the musical
experience. The display of virtuosity draws our attention to the technical
expertise or ebullience of the individual performer. The veil of convention
is broken. In the gospel singing of Mahalia Jackson or Aretha Franklin,
conventional musical syntax is gesturally articulated in an extremely
elaborate way which suggests a sense of immediacy (rather than
hypothesis) in the utterance. In contrast a typical Lieder recital normally
presents a sense of distancing; the utterance is clearly of a hypothetical
nature. The singer is not directly involved in the actions or internal states
suggested by texts or musical architecture. In the case of opera singing,
however, a further state is reached. Here (just as an actor adopts a persona
in a play), the singer attempts to present musical material as if it were the
direct utterance of the character represented. In fact the situation is even
more complex, because we of course know that the character represented
would probably not sing about his or her grief, joy, etc. but would more
likely speak about it. We might then schematically represent the situation as
follows: (conventional utterance—opera singer (direct utterance—character
represented (conventional utterance—singing))).
Even without the further level of characterisation we have in opera,
music may be presented in such a way that the utterance aspect is played
down or pushed forward. It is played down in, for example, Xenakis’
Pithoprakta, where the large-scale structure is dominated by slowly-evolving
events, many of which seem gesturally neutral, while the activities of
individual players are amassed in dense or semi-random textures which
negate the possibility of individual articulations emerging through the total
structure. It is pushed forward in Schoenberg's Erwartung, a monodrama
about a frightened woman lost in a wood, sung by a single female singer.
The use of the voice in modern
Western popular music presents an interesting case. Whereas in the classical
tradition the singer strives towards the perfection of a particular kind of
voice which is a social convention and is felt to be transferable from one
work or one expressive context to another (liturgical, concert etc.), popular
music projects the idiosyncratic features of the individual singer's voice.2
The audience is assumed to be more interested in music as a personal
utterance rather than as a socially conventionalised utterance. We are
clearly not dealing here with direct utterance in our original sense of the
term. The popular singer adopts many levels of conventional utterance-
structures in order to communicate with an audience. It is usually assumed,
however (whether or not it is justified), that at some level the singer is not
‘acting’, that the conventionalisation stops and that the singer is presenting
his or her personal utterance. This is pretty obvious in the case of protest
singers, but may be much more indirect. For example, the idiosyncratic
features of the particular voice may be felt to carry the mark of personal
tragedy, grief or difficulty (if in a somewhat distilled format), for example,
Edith Piaf, Judy Garland or Janis Joplin.
Often such personalised utterances will be expressed through widely-
known popular and often totemic song-structures. This is taken to an
extreme in the case of blues, where an almost claustrophobically rigid
structure of music and text is used as a vehicle for sophisticated gestural
expression. This is akin to the highly elaborate gestural articulation of
‘stock phrases’ in vernacular speech where the linguistic content can be the
least significant communicative element. The concept of ‘sincerity’ in the
world of popular music can only be understood in relation to these ideas
about utterance.
A more interesting interrelation between conventional and direct
utterance can be observed in ecstatic behaviour. The state of ecstasy
achieved in various religious rites and sometimes in music or dance
improvisation is experienced as a loss of conscious control. In glossolalic
speech, possessed dance, ecstatic gospel music, etc. the performer is able to
articulate the voice or the body to a degree or extent and with a fluency
which is not possible where the conscious mind retains control over the
intellectual-physiological sphere. However, this articulation usually takes
place over a field of conventionally-established possibilities (phonemic
strings, dance movements, musical scales): an intense and immediate
utterance swirls upwards through the conventional structures. How can we
explain this?
When a child begins to walk, it must learn how to do so. It begins
with difficulty to co-ordinate the necessary muscular movements and the
signals about balance and posture received by the brain from the inner ear.
Eventually, however, all these activities become ‘second nature’ and we are
able to do all sorts of intricate tricks (avoiding objects, hopping over things,
changing our pace to match another person) without consciously thinking
about any of these. Although there is probably a lot of genetic input into our
development of the walking skill, the development of second-nature skills
does not stop when our ontogenetic development ceases. Thus the motions
of the fingers and the fingering patterns required by an experienced concert
pianist are not normally thought about as such in detail during a
performance. They are second-nature. Furthermore, it can be argued that
speech itself (except perhaps among heavily contemplative intellectuals) is
a second-nature activity. High level conscious control is only required at the
most general semantic level. Once this is released, ecstatic glossolalic
speech becomes possible.
One recurring trend of Western art is the movement away from any kind of
direct and ecstatic utterance towards the conventionalisation of all
parameters of the event. The conventionalisation may be an aspect of social
distancing where the conventions are generally understood and accepted as
a medium through which social messages may be transmitted. They may
also, however, be personal conventionalisations of the artist, ways of
distancing himself or herself from the social conventionalisations and even
the implications of direct utterance. Thus the sound poet may plan and
execute a sequence of rhythmic screams or sobs during the performance.
Similarly, a composer like Berio may sit in the studio and coolly edit
together segments of tape carrying verbal gestures which are erotic, funny,
terrifying and so on. Although we know of the artist's detachment from such
utterances, we do not normally distance ourselves entirely from the
utterance-implications of the sounds involved. There is, however, a fine
balance to be preserved between distancing from and involvement in the
utterance whether by the artist or the listener. To quote Gregory Bateson:
I suggest that this separate burgeoning evolution of kinesics and paralanguage alongside the
evolution of verbal language indicates that our iconic communication serves functions totally
different from those of language and, indeed, performs functions which verbal language is
unsuited to perform. [...] There are people—professional actors, confidence tricksters, and
others—who are able to use kinesics and paralinguistic communication with a degree of
voluntary control comparable to that voluntary control which we all think we have over the
use of words. For these people, who can lie with kinesics, the special usefulness of non-verbal
communication is reduced. It is a little more difficult for them to be sincere and still more
difficult for them to be believed to be sincere. They are caught in a process of diminishing
returns such that, when distrusted, they try to improve their skill in simulating paralinguistic
and kinesic sincerity. But this is the very skill which led others to distrust them.
It seems that the discourse of non-verbal communication is precisely concerned
with matters of relationship—love, hate, respect, fear, dependency, etc.—between self and
vis-à-vis or between self and environment and that the nature of human society is such that
falsification of this discourse rapidly becomes pathogenic. From an adaptive point of view, it
is therefore important that this discourse be carried on by techniques which are relatively
unconscious and only imperfectly subject to voluntary control. [...]
If this general view of the matter be correct, it must follow that to translate kinesics
or paralinguistic messages into words is likely to introduce gross falsification due [...]
especially to the fact that all such translation must give to the more or less unconscious and
involuntary iconic message the appearance of conscious intent.
(Bateson 1968: 615)
The point at which the artistic manipulation of materials collapses into
formalism (in the listener's perception) is very difficult to judge. The type of
artist we are discussing needs to be sufficiently removed from the
immediate utterance implications of his or her materials to explore new
areas of statement or expression. If these implications are ignored, however,
the artistic result is likely to be perceived in some way not intended by the
artist or, worse still, it will be rejected as the artistic equivalent of a
‘confidence trick’.
The problem of detachment has particular significance in Western
society. As an aspect of a professional pursuit, particularly the pursuit of
science, it has proved highly socially fruitful. A detachment from the social
sphere, however, is normally (except in the case of politicians and military
personnel) regarded as a form of mental illness. Mental detachment in
science is useful because it enables us to develop instruments which may
then be useful to the social body. Social detachment in the research which
precedes an artistic work may also be useful in that it enables us to look at
our materials in new ways. Social detachment in the artistic work itself,
however, makes it intrinsically meaningless except as a solipsistic activity
for the artist or an interesting intellectual game for analysts. There is a
certain psychopathology in the scientific method where it is applied to other
beings such as in the pseudo-science of behaviourism and in the pseudo-art
of the notational formalists.
Towards language
1 At least at the time of writing (1983) during the apartheid era [Ed.].
2 See Roland Barthes’ essay ‘The Grain of the Voice’ (Barthes 1977) for a parallel view (Ed.).
Chapter 12
THE HUMAN REPERTOIRE
We are now at the point where we can describe the human repertoire,
the fund of possible sonic objects and their articulations which is available
to the human utterer. The following taxonomy is based on an earlier
analysis of mine published in Book of Lost Voices (Wishart 1979). Recent
discussions with the lettriste poet Jean-Paul Curtay have clarified a number
of physiological and other distinctions enabling me to present a more
systematic classification of the sound-materials discussed in that
publication. Curtay has presented a physiological analysis of vocal
technique (Curtay 1981) whereas my own approach is oriented to a
description in terms of sound-objects. This leads to two different notational
approaches to the sounds (both of which will be discussed below).
Furthermore, the analysis here will be confined to sounds related to the
vocal tract. Body slaps, hand rubbing and other sounds may of course be
produced by human beings (and these are discussed in Curtay (1981)) and
no aesthetic preference is implied by their omission here. Nor will I
claim that this is a complete analysis, as I have been discovering
new sounds almost every week since the completion of Book of Lost Voices.
This listing does include several sounds not mentioned in the earlier
publication.
The analysis treats sounds of intrinsically short duration
separately from sounds which can be sustained. Of course any of the
sustained sounds can be made into a short sound merely by curtailing its
duration. Furthermore, I have not made the distinction between ‘iterations’
and ‘vibrations’ simply because any iterated sound becomes a vibration if it
is sufficiently speeded up. Some sounds (e.g. rolled-r, lip-flabber) appear at
first sight to be intrinsically iterative as we normally produce them in a
range where we can hear the individual pulsations; however, as will be
demonstrated, these sounds can be made to rise into the normal audio
vibration range (the first by increasing tongue pressure on the roof of the
mouth, the second by hand tensioning the lips) and even at their normal rate
of iteration a pitch can be perceived (particularly if it is stabilised). Slow
iteration may be looked on as sub-audio vibration, for example, the glottis
(vocal cords) can be made to vibrate in a sub-audio mode and even to emit
individual impulses (particularly when activated on inhaling).
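The claim that slow iteration is simply sub-audio vibration can be illustrated numerically: once a pulse train is fast enough to fuse, its perceived pitch is just its repetition rate, and an autocorrelation recovers the same period whether the rate is sub-audio or audible. The sketch below (Python/NumPy; the function names are mine and purely illustrative) makes the point.

```python
import numpy as np

SR = 22050          # sample rate (Hz)
DUR = 0.25          # seconds of signal


def pulse_train(rate_hz, dur_s=DUR, sr=SR):
    """A unit impulse repeated every 1/rate_hz seconds."""
    x = np.zeros(int(dur_s * sr))
    x[::int(sr / rate_hz)] = 1.0
    return x


def perceived_period(x):
    """Lag (in samples) of strongest self-similarity: for a fused
    pulse train this is simply the pitch period."""
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    ac[0] = 0.0                     # ignore the trivial zero lag
    return int(np.argmax(ac))


# At 8 Hz we hear separate pulses; at 160 Hz the same iteration
# fuses into a pitch whose period is just the repetition period.
slow = pulse_train(8.0)
fast = pulse_train(160.0)
```

The same measure returns the repetition period in both cases: only the rate, not the mechanism, distinguishes ‘iteration’ from ‘vibration’.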
For the moment we will assume that the flow of air is outwards from the
lungs (exhaled). The principal sound-sources in the vocal tract are physical
oscillators in the sense that they produce sound by physically moving to and
fro (just like the reed in a reed instrument). Certain other parts of the vocal
tract can be made to resonate and hence produce pitched sounds.
(1) The glottis is the source of the normal human singing voice. Sounds are
produced by oscillations of the vocal cords, and we will refer to these
sounds either as glottal vibrations, or vibrations of the larynx. Vibrating
the larynx produces an impulse which is normally iterated (Example
12.1). At normal rates of vibration this is heard as a pitched sound
(Example 12.2). At least two separate registers can be perceived within
the range of pitch produced by the larynx. For the male voice these are
usually referred to as normal voice and falsetto voice. A completely
seamless transition can be made between the two registers. It is also
suggested that there is a further break in the voice permitting an even
higher range of pitches to be obtained. My own (male) voice will at the
moment reach down to the G two octaves and a fourth below middle C (the
lowest part of the range is very relaxed and quiet) and as far as the G
one octave and a fifth above middle C (in falsetto).
(2) The windpipe. If air is expelled very forcibly from the lungs a low pitch
is produced which does not originate in the larynx but somewhere
below it (Example 12.3). The sub-glottal vibration may be stabilised on
a definite pitch. This pitch, as far as I can tell, cannot be altered and
may be combined with glottal pitches. Note that this sound is quite
different from sub-harmonics (see below) and is the basis of the famous
‘Satchmo’ gravel-voice. A similar effect may be observed in the
windpipe above the larynx (Example 12.4). Although I have no direct
medical evidence that these sounds are produced where I suggest, they
are quite distinct from glottal vibrations because it is possible to
combine them with glottal vibrations. Examples 12.5–12.8 illustrate the
sound below the larynx, the same combined with the sound of the
larynx, the sound above the larynx and the same sound combined with
the sound of the larynx.
(3) Subharmonics. If glottal production is made very relaxed, it is possible
to instigate a note one octave below the original glottal note and
sounding simultaneously (Example 12.9). This note varies in pitch
along with the original glottal note, always remaining at the interval of
one octave and is produced either by the larynx or by the false vocal
folds resonating (half) in step with the larynx. With practice, a note one
octave and a fifth, or even two octaves, below the glottal note can be
produced.
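The octave subharmonic can be thought of as period doubling: if every second glottal pulse differs from its neighbours, the true repetition period of the waveform doubles and a spectral component appears an octave below the original pitch. A minimal numerical sketch (not a physiological model; the 1.5 weighting of alternate pulses is an arbitrary illustration):

```python
import numpy as np

SR = 8000
F0 = 200.0                      # glottal fundamental (Hz)
n = SR                          # one second of signal
period = int(SR / F0)           # samples per glottal pulse

pulses = np.zeros(n)
pulses[::period] = 1.0          # plain pulse train at F0

# Strengthen every second pulse: the waveform now repeats at F0/2
sub = pulses.copy()
sub[::2 * period] *= 1.5


def level_at(x, freq, sr=SR):
    """Normalised spectral magnitude at `freq`."""
    spectrum = np.abs(np.fft.rfft(x)) / len(x)
    return spectrum[int(round(freq * len(x) / sr))]
```

The plain train has no energy at 100 Hz; the alternating train acquires a clear component there, a ‘note one octave below the original glottal note’ sounding simultaneously with it.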
(4) The oesophagus is resonated during belching. The sound is used as a
basis for speech by people who have had their larynx removed for
medical reasons.
(5) The tongue may be vibrated against the roof of the mouth. This may be
done using the tip of the tongue towards the front of the mouth
(Example 12.10), upwards onto the soft palate (Example 12.11) or
strongly retroflexed (Example 12.12). Alternatively the tongue may be
arched so that the middle of the tongue is in contact with the roof of the
mouth, producing the characteristic French ‘r’, either in the middle of
the mouth (Example 12.13) or towards the back of the mouth (Example
12.14) and finally against the uvula, the sound associated with snoring
(Example 12.15). By suitable use of tongue pressure and placement
these sounds can be brought into the range of normal sung tones,
particularly the arched tongue type (Example 12.16).
(6) The lips may be made to vibrate either in normal position (Example
12.17) or strongly folded inwards towards the teeth (Example 12.18) or
strongly pouted outwards (Example 12.19). Pitch formation with the
lips may be assisted by using the hands to stretch or relax the lips. This
also helps to stabilise the vibrations so that lip notes can be sustained
for long periods (Example 12.20). The available aperture and tension of
the lips can also be manually controlled to produce a variety of
different kinds of oscillations (Example 12.21) and in particular two
independent sets of vibrations can be set in motion at different corners
of the mouth simultaneously (Example 12.22).
(7) The cheeks may be vibrated independently of the lips and the pitch
controlled by varying tension by use of the hands (Example 12.23).
These vibrations may be sub-audio (Example 12.24). The two cheeks
may also be set in vibration independently (Example 12.25) with
possibilities such as producing patterns of beats between two sub-audio
frequencies.
Filters
Sounds produced within the vocal tract have not only a fundamental
frequency but also formants, frequency areas within the spectrum where
energy is concentrated. Formants are generated by various resonances
within the oral and nasal cavities and it is quite possible to vary these in a
continuous fashion. This may be done in four ways: by varying the size of
the oral cavity, by varying the position of the tongue's arch, by greater or
lesser rounding of the lips and by greater or lesser nasalisation (i.e. varying
the amount of sound which is passed by the nasal passages). The latter will
be discussed separately as it is of less importance and applies only to glottal
and windpipe sounds. (Vowels, Example 12.26.)
Sounds produced prior to the oral cavity (glottal, oesophageal and
windpipe sounds) may have their formant structure altered by any of these
four methods. The oral (and nasal) cavity may thus be regarded as a complex
filtering device. As an initial approximation we will omit lip-rounding from
our analysis. Fortunately there is a notation immediately to hand for
specific formant types as the vowels of human languages are determined by
their formant colour. As a first approximation we may draw a two-
dimensional map (see Figure 12.1) of the ‘formant space’ available. Note
that we may move from the open a sound of English ‘father’ to the small
aperture vowel with the tongue arched against the front of the mouth
(German ü) by either first closing down the mouth aperture to reach the
vowel u as in North of England ‘mud’ and then moving the tongue arch
position forward, or we may begin by moving the tongue arch forward,
passing through the vowel e of English ‘red’ and arriving at the vowel i of
‘she’ and then closing down the mouth aperture. Between these two
extremes there are any number of routes from the a to the ü through the
vowel space (Example 12.27).
Figure 12.2 Some vowel signs from the International Phonetic Alphabet.
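In signal-processing terms this account of the oral and nasal cavities is the classic source-filter model: a glottal pulse train excites resonators whose centre frequencies are the formants, and moving those centres moves the vowel. A sketch, assuming rough illustrative formant values (my own choices, not measurements) for two vowels:

```python
import numpy as np

SR = 16000


def resonator(x, freq, bandwidth, sr=SR):
    """Two-pole resonant filter: concentrates energy near `freq`."""
    r = np.exp(-np.pi * bandwidth / sr)
    theta = 2.0 * np.pi * freq / sr
    a1, a2 = -2.0 * r * np.cos(theta), r * r
    y = np.zeros_like(x)
    for i in range(len(x)):
        y[i] = x[i] - a1 * y[i - 1] - a2 * y[i - 2]
    return y


def band_energy(x, lo, hi, sr=SR):
    """Total spectral energy between lo and hi Hz."""
    spectrum = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), 1.0 / sr)
    return spectrum[(freqs >= lo) & (freqs < hi)].sum()


source = np.zeros(SR // 2)
source[::SR // 100] = 1.0       # glottal source: 100 Hz pulse train

# Rough (F1, F2) pairs: open 'a' versus close front 'i'
ah = resonator(resonator(source, 700.0, 120.0), 1100.0, 120.0)
ee = resonator(resonator(source, 300.0, 120.0), 2300.0, 120.0)
```

Same source, different filters: the ‘a’ version carries its energy near 700 Hz, the ‘i’ version near 2300 Hz, and sliding the resonator frequencies traces exactly the continuous formant space the vowel chart maps.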
Noise
The types of sound available are not as simple as this chart might
suggest; the particular nature of the individual oscillations, their relative
amplitude and pitch and the extent to which they interact may vary or be
varied, thus producing quite distinct classes of sound-objects. Thus, when
we combine glottal vibrations with windpipe vibrations, the latter may be of
the stable low-frequency type which gives us the ‘gravel voice’ or the less
clear and more forced ‘air’ type which interacts with the larynx sound to
produce a sound-complex like a roar or bark (Example 12.46).
Where glottal sounds are combined with tongue oscillations in speech
discourse, the amplitude of the glottal vibration is usually much higher than
that of the tongue oscillation which thus appears as a mere colouring of the
glottal pitch. However, if the strength of the tongue vibration is increased
we become more strongly aware of the amplitude modulation which is
taking place (the tongue is an audio rate amplitude modulator of the glottal
pitch stream). If the tongue vibration is made even louder and the glottal
vibration reduced in level the glottal pitch becomes a mere decoration of the
strong tongue vibration. In addition the two oscillators may be tuned to one
another. Where they are of equal strength and in similar register this tends
to take place ‘naturally’. As the physical system of the oral cavity seeks
positions of lowest energy, we find ourselves producing octaves and
fifths almost automatically and it is much more difficult physically to
produce other intervals. By slightly detuning one of the oscillators, beating
effects can be produced which may be physically felt within the mouth (or
on the lips) (Examples 12.47–12.50). In glottal/cheek and glottal/lip
vibrations there may be a remarkable difference between ‘rounded’ slow-
stream impulses with glottal pitch colouration and chordal effects produced
by the inter-modulation of mid-frequency glottal and lip or cheek vibrations
(Examples 12.51–12.53).
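The observation that the tongue acts as an audio-rate amplitude modulator of the glottal pitch stream has a definite spectral signature: modulation at rate fm places sidebands at fc ± fm around the carrier frequency fc, which is where the added colouration comes from. A sketch with arbitrary frequencies (not measurements of any vocal sound):

```python
import numpy as np

SR = 8000
t = np.arange(SR) / SR          # one second

FC = 440.0                      # 'glottal' carrier (Hz)
FM = 30.0                      # 'tongue' modulation rate (Hz)

carrier = np.sin(2 * np.pi * FC * t)
# Full-depth amplitude modulation by the tongue's oscillation
modulated = (0.5 + 0.5 * np.cos(2 * np.pi * FM * t)) * carrier


def level_at(x, freq, sr=SR):
    """Normalised spectral magnitude at `freq`."""
    spectrum = np.abs(np.fft.rfft(x)) / len(x)
    return spectrum[int(round(freq * len(x) / sr))]
```

The modulated tone has components at 410, 440 and 470 Hz; as the modulation deepens, the sidebands grow relative to the carrier, matching the progression described above from mere ‘colouring’ of the glottal pitch to its ‘decoration’ of the tongue vibration.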
If a highish glottal note and a whistle are tuned to an octave and then
the whistle tone moves slightly away from the octave intermodulation will
produce chord-like colouration of the resulting sound. This process is
exactly analogous to what takes place (electrically) in a synthesiser. If on
the other hand a very high s-whistled note is produced fairly quietly against
a deep (male) glottal note, an almost grain-like amplitude modulation is
imparted to the note (Example 12.54).
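The synthesiser analogy here is ring modulation: when two tones interact in a nonlinear system, components appear at the sum and difference of their frequencies, so a whistle pulled slightly off the octave generates new, inharmonic partials — the chord-like colouration. A sketch using multiplication as the simplest possible nonlinearity (frequencies arbitrary):

```python
import numpy as np

SR = 8000
t = np.arange(SR) / SR          # one second

F1 = 220.0                      # glottal note (Hz)
F2 = 447.0                      # whistle, slightly sharp of the octave (440 Hz)

# Ring modulation: the product contains only F2 - F1 and F2 + F1
product = np.sin(2 * np.pi * F1 * t) * np.sin(2 * np.pi * F2 * t)


def level_at(x, freq, sr=SR):
    """Normalised spectral magnitude at `freq`."""
    spectrum = np.abs(np.fft.rfft(x)) / len(x)
    return spectrum[int(round(freq * len(x) / sr))]
```

With the whistle exactly on the octave the products (220 and 660 Hz) would be harmonics of the glottal note; at 447 Hz they fall at 227 and 667 Hz, inharmonic neighbours that colour the sound as a chord does.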
Windpipe sounds can be combined with other articulations but
currently I would not recommend anyone to try these too much! It is also
possible to vibrate the tongue in two modes (rear arch and tip)
simultaneously (Example 12.55). The forward vibration tends to be a
subharmonic of the rear vibration. In a similar fashion the lips and tongue
can be simultaneously vibrated. Tongue-tip vibrations can be easily
combined with loose-lip and manually-stretched-lip vibrations, often
producing intermodulations which can be felt in the lips (Example 12.56)
and either low register or high register pitches may be tuned. Noise bands
are easily and effectively amplitude modulated by low-frequency tongue,
uvula and lip vibrations, and noise may also be made to colour the
higher-frequency vibrations of these types (Examples 12.57–12.64 give some
examples of these combinations). Even whistling can be amplitude
modulated by tongue-vibrations (easier with rear-arched tongue than with
tip of tongue). The lip modulation of whistling is the basis of the
‘trimphone’ imitation which became a craze in Great Britain in the early
1980s (Example 12.65).
We can go beyond this and generate three or even four sounds at
once, for example noise turbulence, tongue vibration, lip vibration and
glottis vibration may all be activated simultaneously and controlled
independently of one another (Example 12.66). Some special cases can be
observed, for example, if a glottal/tongue-tip vibration in the male low
falsetto register is passed through pouted lips, they may be caused to vibrate
in a very subtle way, producing a trumpet-like colouration of the sound
(Example 12.67).
Water effects
Saliva (or externally introduced water) may interact with the articulation of
vibrations. Gargling is the most obvious example of this. Saliva often
affects the sound quality of arched-tongue vibrations (Example 12.84). In
particular, the noise sound x has a great number of possible modes when it
is combined with saliva-water sounds (Example 12.85). The sound may be
filtered in various ways (Example 12.86). It may be half-lunged and then
filtered again (Example 12.87). It may be plosively attacked with a k and a
short rush of air to produce the children's ‘gun’ sound (Example 12.88). It
may be half-lunged, filtered to produce very high partials and produced
staccato and plosive (Example 12.89). A rational notation for the
distribution of harmonics in this sound is very difficult to achieve because
although the high partials are strongly emphasised, it is clearly still possible
to vary the resonance of the mouth cavity and hence the sound produced. The notation
shown in Figure 12.9 uses the ‘stave of harmonics’ to indicate that this is a
high partial sound but a simpler mnemonic is proposed. This sound
incidentally can be combined with tongue-tip vibration (Example 12.90).
Saliva-water effects also account for the pitch content of the sounds
indicated in Figure 12.10 (Example 12.91).
Most of these half-lunged water sounds can also be produced inhaled.
Inhaling, however, also generates a number of other sounds such as the
modifying of inhaled lip vibrations by water held behind the lips (Example
12.92) and various sounds around the sides of the tongue (Example 12.93).
Transformations
Inhaled sounds
Sounds may also be produced when air is inhaled. In many cases these are
the same or quite similar to those produced on exhaling, but in a number of
cases quite different sounds are produced. Many of the sounds produced in
this way exhibit instabilities, either in pitch, spectral content or sub-audio
attack rates.
Vibrating the lips by inhaling can produce pure tones, trains of pulses
or multiphonics (Example 12.100). The vibrations may be controlled and
articulated by using the heels of the hands (Example 12.101). All these
sounds are unlunged. The tongue may be made to vibrate, both in retroflex
position unlunged (Example 12.102) or at the uvula as in snoring (Example
12.103).
Finally, the glottis (and possibly the windpipe) may be made to
vibrate while inhaling. If a lot of air is drawn inwards the effects produced
by (I believe) the larynx and windpipe are heard. A better method of
production, however, is to draw air in regularly and slowly (as air would be
expelled during normal singing). By varying the tension of (I believe) the
larynx and the filtering in the oral cavity a great number of different kinds
of sounds can be produced: from pure tones (Example 12.104) which may
be outside the normal range, click trains (Example 12.105), sub-harmonics
(Example 12.106), more complex multiphonics (Example 12.107) to
complex and unstable oscillations (usually produced at the end of a long in-
breath when the pressure inwards is difficult to maintain) (Example
12.108). The instabilities in these latter sounds can be felt as a kind of
irregular beating in the larynx. In the various complex sounds different
aspects of the complex spectrum can be emphasised by the filtering process
(Example 12.109).
As some of these inhaled sounds are unlunged it is possible to
simultaneously produce inhaled and exhaled sounds. For example, one can
produce inhaled lip vibrations while projecting glottal vibrations through
the nose. Furthermore, various of the flutter techniques can be applied to
the air stream.
Pulses
Notation
In Figure 12.12 various other aspects of the notational system for voice-
sounds are indicated. The vowel and consonant symbols are derived from
the international phonetic alphabet (Figure 12.13). In assembling a score a
three-level representation is used (see Figure 12.14). At the upper level
durations and loudness are indicated in the traditional musical fashion, at
the bottom level detailed phonetic and extended-phonetic notation is used to
indicate the details of the sounds. In the central level these sounds are
notated using graphic symbols which allow us to indicate pitch, pitch
motion, transformation, intermodulation and so on.
The International Phonetic Alphabet was developed for the linguistic
analysis of phonemes. In a performance situation, however, such diversity
of symbols may become confusing and it seems more practicable to use a
smaller set of symbols and methods of combining or modifying them (see
previous diagrams). In addition the phonetic alphabet has been derived from
natural languages and does not cover the whole human repertoire, therefore
modifications and extensions are required.
The system of notation developed here has a degree of redundancy. In
particular information is given both in a ‘phonetic’ format and in a graphic
format. This redundancy is useful when it comes to reading notation during
a rehearsal or performance. The notation is also somewhat eclectic, using
devices drawn from standard repertoire music, contemporary music and
phonetic research. This, however, has the advantage that we are able to pass
over into conventional musical or conventional linguistic use of the voice
without any abrupt change in the way we represent the sounds.
It is also possible to present a systematic physiological notation for
the sounds of the human vocal tract (and the body in general). This has been
developed by Jean-Paul Curtay (Figure 12.15 which is from Curtay (1981)).
This notation is in fact more systematic and is used by Curtay in his
performances. However, I would still tend to prefer the eclectic method
which retains the links with language and conventional music and allows us
to notate complex sounds such as multiplexes while using physiological
descriptions as a very useful aid to performance practice. Furthermore, just
as traditional music notation tends to channel the aesthetic possibilities
(Chapter 2), even these extended notations have some bias towards a
physiological and a sound-object-oriented perspective respectively.
Natural morphology
In the next chapter we will return to some of our ideas about a natural
morphology of sound-objects in relation to phonemes. However, even at
this level we can note certain distinctions which may relate to our natural
morphology classification. Curtay has suggested a gross classification of
the human repertoire into gaseous, liquid and solid.1 This might be given a
more general interpretation as an aspect of natural morphology. Sounds of
solid objects are generally of stable mass (or pitch)—this includes the air
resonances within fixed-shape objects, e.g. a flute. Liquid sounds on the
other hand will often have changing mass, tessitura, spectrum and other
features but this change will exhibit a specific class of form (like a bubble
archetype). Gaseous sounds however, will be varying continuously in
various parameters (particularly mass) without a definite class of
morphological forms emerging. Clearly there is very much more to be said
about this. Air columns or liquids vibrating within solid objects (the water-
in-saucepan effect) or air passing through liquids would need to be
considered but there is certainly an interesting natural morphological aspect
here. For example, the sounds of a classical synthesiser can be very stable
in their spectral properties, implying a ‘solidity’ of the source. The sounds
of the human voice, on the other hand (and of course of musical instruments
articulated by human beings), even when they attempt to be stable, in fact
contain micro-fluctuations of pitch, dynamics etc. partly because the
musculature acting as a physical source or articulator is not a rigid object.
We tend to prefer even in normal musical practice a certain small degree of
‘liquidity’ in our musical objects.
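The ‘solid’ versus ‘liquid’ distinction can be made concrete at the signal level: a fixed-frequency oscillator concentrates its energy in a single spectral line, whereas a source whose pitch wanders slightly smears that line. The sketch below uses a seeded random walk as a crude stand-in for muscular micro-fluctuation (an assumption of mine, not a model of any real voice):

```python
import numpy as np

SR = 8000
N = SR                          # one second of signal
rng = np.random.default_rng(0)  # seeded for repeatability


def tone(freq_per_sample, sr=SR):
    """Oscillator driven by an instantaneous-frequency track."""
    phase = 2 * np.pi * np.cumsum(freq_per_sample) / sr
    return np.sin(phase)


def peak_fraction(x):
    """Share of total energy in the single strongest spectral bin."""
    spectrum = np.abs(np.fft.rfft(x)) ** 2
    return spectrum.max() / spectrum.sum()


F0 = 220.0
solid = tone(np.full(N, F0))                  # synthesiser-like stability
jitter = np.cumsum(rng.normal(0.0, 0.05, N))  # slow pitch wander (Hz)
liquid = tone(F0 + jitter)
```

The stable tone puts essentially all its energy in one bin; the wandering one spreads it across neighbouring bins — the micro-fluctuation we read as a living, ‘liquid’ source.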
Moving outward
From this repertoire of human sounds we may lay out areas of sonic
discourse. Focusing on the repertoire as physiological acts and perhaps
complementing them with visceral sounds recorded from within the body
(flow of the blood, etc.) we may evolve sound-structures which become a
kind of physiological diagnosis of the state of the organism. Something of
this feel is achieved in Curtay's Abridgement (Curtay 1981) where the
physiological landscape becomes the basis for sonic exploration. We may
treat the repertoire as a class of (intertransformable) sound-objects and
organise music accordingly (for example in my Anticredos) though we
cannot entirely avoid physiological (and para-linguistic) aspects of the
landscape adhering to the events at least on a first hearing. The sound-
objects may be extended into the electronic (as in McNabb's Dreamsong) or
into the world of recognisable sound-objects such as the transformation
from ss to birdsong in Red Bird. Given a good computer model of the
human voice as in the language Chant we may manipulate the form and
structure of the voice beyond that which is physiologically possible so that,
for example, individual glottal pulses may become bells (the formants are
narrowed and ring) or the energy in the formants may be refocussed and the
pitch articulated in such a way that we produce birdsong.
We may focus upon the mimetic abilities of the human voice made
possible by its enormous repertoire. The imitation of instruments such as
drums or trumpets has been touched upon earlier. Natural morphology in
relation to phonemes will be discussed in the next chapter as will the idea of
phonemic analogy (phonemic objects which are akin to but not exactly
mimetic of other sounds). We may extend the human repertoire by the use
of external resonances; thus brass instruments amplify and stabilise lip
vibrations extending their range of loudness and controllable pitch. We may
model the voice on musical instrument technology, separating out pitch as a
parameter and developing the field of song.
We may focus upon the paralinguistic articulation of the sound-
objects. Such paralanguage may be based on (originally) involuntary
physiological states, gestural expression or linguistic conventions. In this
way we produce a kind of phoneme-free poetry. Such paralinguistic
articulations may enter into instrumental practice, particularly in relation to
pitch and timbre control on the trombone or pitch, timbre, breathiness
control on the saxophone. Finally we may select specific sound-objects
from the repertoire and combine them in particular ways to produce
phonemic objects. We then enter the sphere of language, of linguistic syntax
and semantics. In the following chapters we will look more closely at this
world from a sonic art viewpoint.
Figure 12.16 indicates some of these many possibilities. (Fine arrows
indicate areas between which there is a continuum of intermediate
possibilities or an interaction of perceived categories.) Note, however, that
these cannot truly be represented on a two-dimensional surface. The
implications of human vocal utterance are multi-dimensional. The
biological, paralinguistic, linguistic, mimetic and musical may all be
present in an utterance. In sonic art we will structure this material in order
to focus in upon one or several aspects of this amazing universe of sounds.
Although the archetype of the keyed musical instruments, fashioned in the
image of a theory of pitch, has been the dominant focus for musical
thinking in the West for at least 300 years, at this moment of enormous
technological and musical change there can be no doubt that we shall return
to the human voice for our inspiration as “voice is the original instrument”.
1 Although mentioned in the spoken introduction to ‘body music’ on the cassette (where solid is
strictly referred to as tissue), Curtay (1981) concentrates on a discussion of the method of production,
which he divides into three levels: excitation, emitter and modulation (or resonance) indications.
Wishart interviewed Curtay at the time of his visit to London in 1981 and his material has been
elaborated through this personal communication and through Curtay (1983) (Ed.).
Chapter 13
PHONEMIC OBJECTS
We must return to the innermost alchemy of the word, we must even give up the word too, to keep for poetry its last and holiest refuge. (Ball 1974: 71)
From the vast array of possible sound-objects available from the human
repertoire any natural language selects only a small proportion (its
phonemes) and combines these into phonemic objects. These are the basic
sound units of any language and correspond roughly to the notion of the syllable.
Phonemes themselves are then combined sequentially to form morphemes
(words), the smallest independently meaningful units of language. We will
not go into this in greater detail, however, as this is not a linguistic analysis
but an attempt to explore the world of phonemic objects as a special class of
sound-objects for the purposes of sonic art.
Any particular language will exclude a large number of possible
human utterances from the sphere of the phonemic. If, however, we scan all
existing (and extinct) languages, we will discover that quite a large
proportion of the sounds in the human repertoire are being (and have been)
used in human language discourse. For example, inhaled clicks are used as
consonants in a number of Southern African languages but not in any
European languages.
Phonemic objects are almost paradigmatic examples of sound-objects
of complex morphology. For example, if we consider the word ‘when’, its
written form suggests that it contains four consecutive objects. A superficial
listening might suggest that it contains three separate sound entities (a ‘w’,
an ‘e’ and an ‘n’). If, however, we speak this word very slowly we will
discover that it is one coordinated sound articulated by the opening of the
lips (and associated widening of the oral aperture) and the coordinated
motion of the tongue concluding where the tongue tip reaches the roof of
the mouth and cuts off the air stream. As a sound-object, therefore, this
event consists of a complex but continuous motion through the formant
space, most likely simultaneous with a slight movement of the fundamental
pitch. This kind of continuum exists in most verbal objects, except where it
is explicitly chopped up by stops and pauses. Thus, in the word ‘say’ it may
seem superficially that we have two distinct objects ‘s’ and ‘ay’. If,
however, we speak these two objects (even very carefully) and record them
onto tape and edit the two together we will not reproduce (except
approximately) the word ‘say’. In the speech stream there are subtle
transformations both of formants and into and out of noise turbulence. The
speech stream is thus an archetypal example of complexly evolving timbral
morphology. This will be discussed again in the next chapter.
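The tape-splice experiment can be mimicked digitally. The following sketch is purely illustrative (sine-tone segments stand in for recorded 's' and 'ay' sounds, and all names are invented for the example): a butt-splice joins the two segments directly, leaving a sample-to-sample jump at the join, while a short crossfade only smooths over, and does not recreate, the missing coarticulated transition.

```python
import math

FS = 8000  # sample rate in Hz (assumed)

def segment(freq, dur, phase=0.0):
    """A sine-tone stand-in for a recorded phoneme segment."""
    n = int(FS * dur)
    return [math.sin(phase + 2 * math.pi * freq * i / FS) for i in range(n)]

def butt_splice(a, b):
    """Tape-style edit: join the two segments end to end."""
    return a + b

def crossfade(a, b, overlap):
    """Overlap the tail of a with the head of b over `overlap` samples."""
    out = a[:-overlap]
    for i in range(overlap):
        w = i / overlap
        out.append((1 - w) * a[len(a) - overlap + i] + w * b[i])
    out.extend(b[overlap:])
    return out

s_part = segment(220, 0.05)        # stands in for the 's'
ay_part = segment(330, 0.05, 1.3)  # stands in for the 'ay' (arbitrary phase)

spliced = butt_splice(s_part, ay_part)
faded = crossfade(s_part, ay_part, 80)

# Size of the sample-to-sample jump at the butt-splice join point
join = len(s_part)
jump_at_join = abs(spliced[join] - spliced[join - 1])
```

Neither edit reproduces the subtle formant and turbulence transformations of the real speech stream; the sketch merely makes audible (and measurable) what a splice does and does not join.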
For the purposes of linguistic analysis it is necessary to separate off
the distinct units which form the basis of the ‘digital’ coding of language.
From a sound morphological point of view, however, this can be quite
misleading. For example, the computer model of the singing voice
encapsulated in the program Chant had by 1981 very successfully
modelled vowels in terms of definable and fixed formant structures which
could be reproduced by simulating the effect of a related system of filters
on an impulse stream. Modelling many of the consonants, however, proved
to be more difficult as their absolute characteristics varied very greatly with
context, both in absolute formant structure and, for example, onset time of
noise turbulence. They were thus characterised more by second order
characteristics (characteristics of the process of change itself) than by any
absolute properties. It is therefore important not to confuse the economy of
print with the reality of this speech stream.
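The source-filter idea described here can be sketched in a few lines. This is not the Chant program itself (Chant's actual synthesis method is not reproduced); it is a generic impulse-stream-through-resonators model, with formant centre frequencies and bandwidths chosen as illustrative assumptions for an 'a'-like vowel. Precisely because the resonators are fixed, the sketch can only produce vowel-like tones; the context-dependent, second-order behaviour of consonants lies outside it.

```python
import math

FS = 16000  # sample rate in Hz (assumed)

def impulse_train(f0, dur):
    """Glottal-source stand-in: one unit impulse per pitch period."""
    n = int(FS * dur)
    period = int(FS / f0)
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]

def resonator(x, centre, bandwidth):
    """Two-pole resonant filter: a crude model of one fixed formant."""
    r = math.exp(-math.pi * bandwidth / FS)
    theta = 2 * math.pi * centre / FS
    a1, a2 = 2 * r * math.cos(theta), -r * r
    y, y1, y2 = [], 0.0, 0.0
    for s in x:
        out = s + a1 * y1 + a2 * y2
        y.append(out)
        y1, y2 = out, y1
    return y

# Illustrative (assumed) formant centres and bandwidths for an 'a'-like vowel
FORMANTS = [(700, 80), (1200, 90), (2600, 120)]

source = impulse_train(110, 0.2)
vowel = [0.0] * len(source)
for centre, bw in FORMANTS:
    for i, v in enumerate(resonator(source, centre, bw)):
        vowel[i] += v
```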
A similar complexity of timbral morphology may be found in the
utterances of other creatures. For example, various kinds of birdsong which
appear superficially as trills or bubblings of fixed pitch have in fact a
complex internal structure which can be heard when they are slowed down
(Example 13.1). I would not agree, as some observers have suggested, that
these internal complexities do not exist for the human listener. On the
contrary, it is possible to predict with some accuracy what the internal
structure of a sound will be when slowed down if one develops one's ear for
dynamic morphological properties. Even without this degree of
discrimination, however, the listener can usually observe a qualitative
distinction between various songs which derive from these rapid internal
articulations, even if he or she is unable to describe how they arise.
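Slowing a recording down in the tape manner, so that tempo and pitch fall together, amounts to resampling the signal. A minimal linear-interpolation sketch, assuming the sound is held as a plain list of samples:

```python
def slow_down(signal, factor):
    """Tape-style slow-down: resample so playback lasts `factor` times
    longer (pitch falls along with tempo, as on a slowed tape)."""
    n = int(len(signal) * factor)
    out = []
    for i in range(n):
        pos = i / factor          # read position in the original signal
        j = int(pos)
        frac = pos - j
        if j + 1 < len(signal):
            out.append((1 - frac) * signal[j] + frac * signal[j + 1])
        else:
            out.append(signal[j])
    return out

# A rapid two-value alternation stands in for a fast trill
trill = [0.0, 1.0] * 100
slowed = slow_down(trill, 4)
```

Applied to a rapid alternation, a fourfold stretch spreads the internal articulations out in time, which is the listening situation described above.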
Phonemic objects, then, are interesting sound-objects from the point
of view of the musician interested in sound-objects of dynamic morphology.
Interest has also arisen, however, from an entirely different direction. In
1947, the Romanian Isidore Isou published Introduction à une Nouvelle
Poésie et à une Nouvelle Musique (Isou 1947). In this book Isou proposed a
new type of poetry based on the letter, to be known as lettrism. Isou's view
of the development of poetry is illustrated in Figure 13.1. The lettrist
movement led to many interesting developments, including aphonic poetry
(see Figure 13.2) and also to a deeper interest in the expressive possibilities
of phonemic sound-objects beneath the level of the word. Curtay's work
(discussed in the previous chapter) develops out of this tradition. This is an
interesting juncture of fields of thought about the world of sound, that
springing from music and that springing from poetry and language. As we
begin to consider larger units of language (such as words, phrases,
sentences etc.), considerations of semantic meaning (or the lack of it) will
enter increasingly into our field of view until we finally arrive in an entirely
different area of human discourse (didactic or scientific prose). At the level
of the phoneme, however, we are still deeply embedded in sonic art.
We may also note that from the human repertoire described in the
previous chapter we can create imaginary phonemic objects and in fact
construct imaginary languages and linguistic streams from these. (We can
also construct imaginary linguistic streams from ‘valid’ phonemic objects,
as we shall see.) This kind of imaginary language retains our material for
the field of sonic art as questions of semantics cannot enter into the
listener's perception (though paralinguistic and other signs and signals may
remain part of the experience). The four-voice piece Vox-I concludes with
the peroration of such an imaginary text.
Language divides the human repertoire into two distinct fields: those
sounds (or sound combinations) which may enter into the construction of
language and those sounds (or sequences) which may not. If we do
not make any restriction on the sounds we can use, the immense pliability
of the voice makes it able to mimic an enormous variety of sounds. In the
previous chapter we discussed how it was possible to use plosive
consonants and stops to imitate drums very accurately. Similarly, various
entertainers and serious investigators are able to imitate birdsong
sufficiently accurately to fool other birds, to imitate the idiosyncratic
features of another person's voice, or to imitate the sounds of natural objects
or machines. Even approximate mimesis makes the construction of the kind
of transformations into other recognisable sounds used in Red Bird a
possibility. Once, however, we restrict ourselves to those sound-events
deemed suitable for use in a particular language this type of mimesis
becomes more problematic.
Figure 13.1 Isidore Isou's view of the development of poetry (adapted from Isou (1947)).
Figure 13.2 Example of notation from Roland Sabatier's aphonic poem Histoire.
There may of course also be direct analogies with birdsong as certain birds
(such as parrots) are able to articulate clear formant structures (such as in
their imitations of human speech).
Phonemic analogy and formant tracking of pitch can also be found in
human names for animal sounds. These vary in the ‘goodness of fit’. For
example, a cow, which produces a low-pitched sound, ‘moos’. A mouse,
producing a high-pitched sound, ‘squeaks’; a wolf, which sings a sustained
pitch which then gradually falls, is represented by a formant structure which
falls as in ‘howl’ (say it slowly). On the other hand, the low frequency
glottal/windpipe multiphonic (which humans can produce) of the pig or the
lion is only loosely represented by ‘grunt’ (where the ‘gr’ hints at the
subaudio oscillation of the windpipe) and ‘roar’. Such phonemic analogy
may breach the distinctness of natural languages, such as the various words
for the sound of the cockerel (cock-a-doodle-doo in English, kikeriki in
German, kokke kokko in Japanese, kio kio in the language of the Lokele
tribe of the Congo) or of sneezing (kerchoo in American, atishoo in
English, atchum in Arabic, cheenk in Urdu, kakchun in Japanese and ach-shi in Vietnamese).
We may go one stage beyond the concept of mimesis and ask whether we
can apply the criteria of natural morphology to the sound-objects produced
in the vocal tract. Clearly these sound-events are generated by processes
which may be physically described (turbulent air-streams, plosions, opening
and closing of apertures, etc.). Is there a natural morphological description
of certain kinds of phonemic (and other) vocal objects?
Jean-Paul Curtay has approached the same problem from a slightly
different point of view. He considers both the motions and the shaping of
the vocal tract organs in his conception of mouth symbols. These two
conceptions are very close indeed and it is worth considering the slight
difference that does arise. In considering, for example, the word stop,
Curtay talks of “st- evoking a sudden interruption of movement in a rigid
vertical posture” (Curtay 1983).1 The rigid vertical posture is suggested by
the tight downward movement of the tongue and this particular st-
formation can be associated with a number of words (stake, stalk, stand,
stare, statue, staunch, stick, stiff, stop). Another way of describing this
would be that we hear a continuously sustained sound which is suddenly
interrupted by an impulse. The feeling of interruption of flow is equally
apparent from such a description and in fact, of course, the sound
morphology arises from the physical morphology of the sound production
process. If, however, we now consider the phonemic object sp-, Curtay
states that this is “evoking a circular movement” (spin, spiral, spool). The
circular movement is presumably suggested by the circularity of the lips, a
kind of spatial metaphor. Looking at it temporally (and in terms of what is
heard) we would suggest that the (air) flow of the s- is momentarily
interrupted by the constriction p- and then released into the rest of the word.
This is more evocative of the whipping of a top where a sweeping motion
of the whip is applied suddenly (the strike) to the stable spinning motion of
the top. Similarly, when a dancer spins it is necessary to tense the
musculature in a particular way and then suddenly release this energy,
allowing the body to spin as a result. The sense of stored energy and sudden
release into a stable motion is evoked by the time morphology of the word
spin.
It is interesting to consider some other examples. The phonemic
object sl- consists of a stream (or store) of energy (s) which is gently
released (l) into the stable motion of the vowel. The sense of gentle release
into movement is of course caused by the sliding of the tongue. We may
bring this motion to an abrupt end by the insertion of a stop consonant such
as p, to produce the word slip. This motion is so analogous to someone
slipping on ice, where the move into the continuous sliding motion is
abruptly interrupted by a fall that it seems unlikely to be coincidental.
Curtay quotes the related words sledge, sleigh, slide, slime, slip, slough,
slug, slant, slope, slash, slat, slit, slice, slither, slot, slender, slim. Some of
these associations are undoubtedly metaphoric or tangential (for example,
slope from slip and possibly also slay from the act of killing by sliding a
sword into someone). Furthermore, the reader can now think of many words
which do not conform to this archetype. The point being made, however, is
not that all language is somehow made up of symbolic or metaphorical
sound-objects, but that some parts of language may originate in such
symbolism.
Spr-: a flow (or store) of energy (s), passing through a constriction
(p-) and continuing but broken up into an iteration (r-), as in spray,
sprinkle and, more metaphorically, sprout, spring.
Spl-: in which a continuous air stream (s-) meets a constriction (the
initiating mouth formation for p-), leading to a double release (pl-), as in
split, splice, splinter, splay, splash and, by metaphor or association,
splendour.
Gr- and scr-: in which an impulse or contact (g-) is followed by a
non-continuous (iterative), or we might say abrasive, motion as in grate,
grind (and possibly grip) and scratch, scrape, scrawl (even perhaps
scream).
Cl-: a double, rather than a clean, attack as in clang (as opposed to
dong), clatter, clink, clash. Certain word endings are also interesting from
this point of view, such as -ng, a resonant extension of vowels in which the
higher formants are gradually filtered off—sound is gradually directed
through the nasal cavity—as in many naturally decaying resonances. This is
found in bang, clang, bing, ding, ping, ring, dong, song. ‘-ash’ is a resonance
which breaks up into turbulence, as in dash, smash, clash, splash, bash,
crash and, perhaps as a metaphor for the visual after-effect, in flash. The
word clash thus has an unclear (double) attack onto its resonance which
rapidly dissipates into turbulence.
These sound morphologies (which are illustrated graphically in
Figure 13.4) point in two directions. Curtay has suggested that the linking
of these phonemes with objects and activities in the real world is to a great
extent kinaesthetic, i.e. we feel the formations inside the mouth and thereby
associate them with activities or the shapes of objects in the external world.
This naturally leads us into the sphere of representation and language. At
the same time the sound ‘s’ which we feel as a store of tension because it is
produced by constricting the passage of air with the use of the tongue, is
also indicative of a similar or related physical situation in any natural world
event. These forms, therefore, also point towards a natural morphology of
timbral gesture.
For a language utterance to convey its meaning, the linguistic signs need
not reflect in any way the properties of the objects or activities to which
they refer. There is nothing in common between the word ‘red’ and the
property of redness. This is the famous Jakobsonian arbitrariness of the
linguistic sign and is the assumption upon which most linguistic research is
based. However, when we say that the linguistic sign need bear no relation
to the signified, we are not saying that it must bear no relation. As we have
seen in our analysis of phonemic objects, such relationships can be
established in different ways in many cases.
The language stream itself conveys meaning in many ways (in many
different sonic dimensions). Taking a minimalist view, we may describe the
significant distinguishable elements of the speech stream purely in relation
to the formants. Roughly speaking, with an unvoiced speech stream, vowels
will be distinguished by specific formant structures and consonants by a
combination of specific qualities and specific structures of change of these
qualities. A typical speech act, however, is also characterised by a number
of other properties relating to its rhythm and tempo and their articulation, its
pitch and pitch articulation, its phonemic connectedness and so on. These
other properties are also capable of conveying meaning and particularly of
modifying the significance of the semantics that might be implied from a
written version of the sentence. Furthermore, we can define sonic properties
of the language stream which have nothing to do with any of this, for
example, aspects determined by the particular physiology of the speaker.
From the point of view of sonic analysis, this distinction between
language and paralanguage (as these other aspects have been called) is
somewhat arbitrary. It is not based on a distinction between what is
semantically meaningful and what is not, but on a distinction between what
is captured in writing and what is not. This is only true of certain writing
systems, however. In the Aztec codices sometimes the ‘speech scroll’—the
balloon issuing from the mouth of a character depicted—is specially
elaborated to indicate paralinguistic aspects. In one instance (see Figure
14.1) ambassadors delivering speeches are shown with knives coming out
of their mouths (left), whilst in another a Spaniard is shown talking to
Aztecs and his speech scroll is decorated with feathers indicating the soft,
smooth words he is using (right). Approaching these things from the point
of view of sonic art, therefore, we will talk in terms of timbre fields and
articulation fields which will be explained more fully below.
Comparing, for a moment, standard repertoire use of language and
standard repertoire music, we may characterise the difference in the
articulation of the sonic stream as follows: the melodic stream is
pitch-disjunct and may be articulated by timbral colouration (either in the
choice of instrumentation or within the internal morphology of the sound-
objects of instruments). The language-stream is timbre-disjunct (bearing in
mind the qualifications on the notion of vocal disjuncture mentioned in the
previous chapter) and may be articulated by pitch inflections. It has been
argued that a music based on the complex articulation of timbre could not
be as sophisticated as that based on the articulation of pitch. However, if we
investigate the language-stream, we will discover that the human brain has
a truly amazing ability to generate and perceive rapid articulations of
timbral quality. At certain points in the speech stream (particularly in
diphthongs or within consonants) the removal of just one thirtieth of a
second of sound is clearly noticeable (as the ear is crucially sensitive to
change-continua). It is often argued that this perceptual ability is crucially
linked to semantic understanding. If, however, we consider text-sound-art
using essentially meaningless phonemic strings and take into consideration
our discussion of mouth symbols and natural morphology of phonemes,
plus our ability to perceive simultaneously a wide variety of characteristics
of the speaker (age, status, regional accent, idiolect, physiological state, and
attitude), it is clear that a sonic art based on the articulation of timbral
characteristics may be quite as subtle as any pitch-lattice-based sonics.
Figure 14.2 The symmetrical harmonic field in Webern's Symphonie op. 21.
It is easy to see how the concept of a field may be adopted for the
spheres of timbre, timbre articulation, pitch articulation, phonemes and so
on. At the most general level we can talk about the timbre field of a
particular language. When listening to a language we do not understand, we
will be particularly aware of this feature: in Japanese the extreme (very high
and very low) formant areas used in some vowel production, in Dutch the
salival fricatives (X+), in English the sense of articulatory (consonantal,
diphthongal) continuity (compared with, for example, German). Such
features in fact may often lie at the root of certain kinds of cultural
prejudice where the timbral and articulatory aspects of the language are
taken to indicate something about a spurious ‘national character’ of the
entire group of speakers. This is essentially a confusion of the normal
timbral field of a particular language with the attitudinal ‘modulations’
which may be applied to the normal timbre field of the native speaker's
language.
Just as the definition of a harmonic field on a pitch lattice allows us
to define chromaticism (the inclusion of pitches foreign to the harmonic
field originally established), so the definition of a timbre field allows us to
define ‘chromaticism’ or ‘modulation’ from normal language practice.
Figure 14.3 gives a highly schematic representation of this idea. To do
justice to the concept we would need a multi-dimensional space in which to
draw this figure. However, from it we can see that a particular language will
have a characteristic ‘normal’ set of timbre types, articulations, etc. and
(even without understanding a language) we will be able to distinguish
regional accents or idiolects (ways of speaking characteristic of the
individual speaker) by their variation from this norm. Other variations from
the norm, which might overlap with aspects of accent or idiolect, will
indicate non-neutral modes of discourse (for example, anger or ridicule).
It is more interesting, however, to look at much more specific timbral
fields. Thus, any short verbal utterance contains a particular set of timbral
objects, and these define a timbral field. We may explore the
interrelationships amongst these objects, not only through their reordering
in a linear text (an approach from poetry) but also in simultaneous and
textural orderings (a choral approach). The objects may also be grouped
into ‘timbral-motifs’ (which may, in fact, be words or phrases). Thus, just
as Lutoslawski or Berio will define a harmonic field and a class of pitch-
groupings (melodic motifs) simultaneously, it is possible to do exactly the
same thing with timbre-stream material.
Figure 14.3 Timbre field of a language and various subsets compared with a similar analysis for
harmonic fields in tonal music.
In Steve Reich's Come Out the phrase ‘come out to show them’ is
used purely as a timbral motif. Several copies of the phrase are played
initially in synchronisation and then gradually de-synchronised. As this
happens, rhythmic and timbral patterns (due partly to phasing effects) are
established, which arise directly out of the timbral properties of the sonic
object ‘come out to show them’ (or rather the specific speech utterance of
this phrase initially recorded on tape).
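The de-synchronisation process can be caricatured in code. This sketch is not Reich's technique in detail: a sine-tone loop stands in for the recorded phrase, and the drift is applied in discrete per-cycle steps rather than continuously, but it shows how summing a fixed and a drifting copy of the same loop generates patterns that arise entirely from the loop's own timbral content.

```python
import math

FS = 8000  # sample rate in Hz (assumed)
# A sine-tone loop stands in for the recorded phrase 'come out to show them'
loop = [math.sin(2 * math.pi * 440 * i / FS) for i in range(FS // 4)]

def phase_piece(loop, cycles, drift_per_cycle):
    """Sum a fixed copy of the loop with a copy whose start drifts
    further behind on each repetition."""
    out = []
    n = len(loop)
    for c in range(cycles):
        delay = c * drift_per_cycle   # accumulated drift in samples
        for i in range(n):
            fixed = loop[i]
            drifting = loop[(i - delay) % n]
            out.append(0.5 * (fixed + drifting))
    return out

piece = phase_piece(loop, cycles=8, drift_per_cycle=25)
```

On the first cycle the two copies are in unison; as the delay accumulates, comb-filter-like reinforcements and cancellations appear, the digital analogue of the phasing effects described above.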
Aspects of variation and change amongst timbral fields may be
observed in various text-sound pieces with no semantic content. Thus, in
Schwitters’ Ursonata (Schwitters 1993), for example in the ‘fourth
movement’, we can see first of all that the overall timbral field is confined
to a small number of phonemes—Grimm, glimm, gnimm, bimbimm, bumm,
bamm, Tilla, loola, tee, etc. Next we notice that there are large-scale
groupings, for example, we may divide attack-vowel resonance-m areas
(Grimm, bamm) from l-vowel resonance-l-vowel resonance areas (loola,
luula, lalla) and from attack-vowel resonance areas (Tuii, tee, bee). Within
these areas we may make further subdivisions, for example, in the first
section between areas stressing g and i, and areas stressing b and using a
number of different vowel resonances (u, i and a).
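The large-scale groupings noted above can be caricatured as pattern-matching on the printed tokens. This is purely illustrative: the regular expressions below treat the spellings as crude written proxies for the three timbral classes (attack-vowel resonance-m, l-vowel resonance-l-vowel resonance, attack-vowel resonance), and the function name is invented for the example.

```python
import re

def timbral_group(token):
    """Classify an Ursonata token by crude spelling patterns."""
    t = token.lower()
    if t.endswith('mm'):
        return 'attack-vowel-m'            # Grimm, glimm, bimbimm, bumm, bamm
    if re.fullmatch(r'l[aeiou]+l+[aeiou]+', t):
        return 'l-vowel-l-vowel'           # loola, luula, lalla
    if re.fullmatch(r'[^aeiou]+[aeiou]+', t):
        return 'attack-vowel'              # Tuii, tee, bee
    return 'other'

tokens = ['Grimm', 'glimm', 'bimbimm', 'bumm', 'bamm',
          'loola', 'luula', 'lalla', 'Tuii', 'tee', 'bee']
groups = {t: timbral_group(t) for t in tokens}
```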
All this differs from our perception of field characteristics in pitch-
lattice music (apart from the obvious pitch-stream/timbre-stream
distinction) in a number of ways which are, however, not intrinsic to text-
sound composition. First of all, there is no counterpoint or chorusing.
Secondly, there is no indication of rhythm (which might, however, be
implied from the printed spacing) or tempo. Adding these, and other,
dimensions we can imagine a sophisticated contrapuntal art based on the
articulation of a multilevelled timbre or timbre-motif (possibly phonemic)
field structure.
These conceptions have a bearing on the construction of standard
poetry. Here the timbral colouration of words may be a crucial aspect of the
poetic form. Such features are usually divided into vowel correspondences
(assonance) and more general correspondences usually involving
consonants as well. From a sonic art point of view this distinction is either
invalid (both are to do with timbral correspondences) or too narrow (there
are many more than two classes of timbral objects). The correspondences
between phonemic objects have a very long history in poetry, mainly in the
form of rhyme. Some poets, such as Gerard Manley Hopkins, have placed
particular stress upon this aspect of their writing and, in the book Phonetic
Music with Electronic Music, Robson (1981) has developed a general
theory of vowel harmony. If we free these considerations from linkage to a
linear solo text, then poetry (even poetry deeply based in semantic
meaning) and a choral or electroacoustic art of timbral articulation meet, as
pitch and harmonic field can be articulated independently of all these
parameters. A vast new area of sonic art opens up before us which has
previously been bypassed by the linguistic or pitch-lattice preoccupations of
poets and musicians respectively.
Linguistic flow
Mike walked in on the: attense of Chjazzus as they sittith softily sipping sweet okaykes H-
flowered purrhushing ‘eir goofhearty offan-on-beats, holding moisturize’-palmy sticks clad in
clamp dresses of tissue d'arab, drinks in actionem fellandi promoting protolingamations e
state of nascendi; completimented go!scene of hifibrow'n [...]
(Hans G. Helms Fa:m’ Aniesgwow quoted in Kostelanetz 1980: 20–21)
’Twas brillig, and the slithy toves
Did gyre and gimble in the wabe [...]
(Lewis Carroll, Jabberwocky)
Although almost half the words in this text are imaginary, it is quite clear
from the context that brillig is a time of day,2 a season or special occasion
and that toves are some kind of creature. Slithy is interestingly constructed
out of mouth symbols (relatable to both slimy and slither) whilst gyre we
might relate semantically to the word gyrate.
In the computer-generated mock-Latin passage quoted in Chapter 9,
there are almost no real words but a non-speaker will accept the ‘Latinness’
of this text as the frequency of occurrence of certain phonemes and their
typical ordering is reminiscent of Latin. In the poem of Hugo Ball quoted at
the beginning of the previous chapter, although no known natural language
is being used, we accept the utterance as linguistic because phonemes are
organised into polysyllables and their statistical distribution has some
similarities with typical languages. In the Schwitters Ursonata, however,
the typical statistical distribution of phonemes in language is contradicted.
We are beginning to enter into pure sonic art. Finally, the
‘rdnstrklmndrnchtkrissvrichk!’ utterance of lettrism severs our link with the
phoneme and we approach the pure percussive sound-object ‘tjak’ of the
rhythmic Ketchac monkey-chant of Bali.
The pitch and loudness of a natural speech act tend to remain quite close to
a mean value. Around this value both pitch and loudness are articulated.
Some of these articulations are conventional and contribute to or qualify the
meaning of the linguistic stream. In certain languages (such as Chinese) the
perception of pitch is integral to the recognition of a particular phoneme. A
phoneme otherwise having the same timbral characteristic will have a
different meaning according to whether it is spoken in a low, medium or
high tessitura or with a rising pitch contour or a falling pitch contour. Pitch
characteristics which are integral to the recognition of phonemes are known
as tonemes. Pitch is also used in a less specific way to indicate the end of
sentences (usually rising in French but falling in English), to indicate
questions (rising at the end of the sentence in English) and so on.
Stress, which may be articulated by loudness difference or pitch
difference, is also used as a conventional aspect of the language stream.
Thus, in English, four levels of stress may be recognised: primary (1),
secondary (2), tertiary (3), weak (4). For example: hot food (2, 1); hotter (1,
4); hotel (3, 1); contents (1, 3); operate (1, 4, 3); operation (1, 4, 3, 4).
Stress is also distributed in a semantically significant way within phrases.
Compare the difference in meaning of don't do that with the stress patterns
(1, 4, 4), (2, 1, 4), (2, 4, 1), (4, 1, 2) or (4, 4, 1).
Such aspects of conventional paralinguistic pitch and stress cannot be
completely divorced from even conventional musical practice. Thus, for
example, classical Chinese word-setting must concern itself with the level
or inflection of the tonemes. Certain languages tend to place the principal
stress on the first syllable (and not on the second). Musically we might say
that the language-stream lacks an anacrusis. This is most true of Finnish and
Hungarian and the characteristic stress pattern of Hungarian carries through
the folk music into the work of Bartók (the characteristic falling leap with
the stress on the first note).
Beyond these features we may be able to distinguish certain overall
characteristics of pitch (or stress) which are deviant from the normal mean
value. The general range may be over-high or over-low or the range through
which pitch is articulated may be over-wide. This may be a conventional
aspect of accent as in the perceived ‘sing-song’ of Welsh English. Beyond a
certain limit, however, an unusual tessitura or range of pitch-articulation
will have more universal gestural significance. For example, the expression
of anger is usually associated with high pitch and loudness, whilst low
pitch, quietness (and possibly breathiness and smoothing of the air flow)
may be associated with intimacy.
The melodic implications of intonation patterns may, in fact, be quite
sophisticated. Istvan Anhalt (1984: 159) cites Fonagy and Magdics (1963)
as having described in musical notation “what they perceive to be the
melodic patterns of certain emotions or emotional attitudes: joy, tenderness,
longing, coquetry, surprise, fear etc.”. Even the simple question-pitch-
inflection is, in fact, much more complicated and Kingdon (1958) has
distinguished between “general, particular, alternative, asking for repetition,
interrogative repetition, insistent and quizzical” intonation patterns for
questions (cited in Anhalt 1984: 159).
Such paralinguistic aspects of pitch and stress have a bearing on the
construction of vocal melody (and thence on melody in general). Thus,
where a melodic line proceeds by wide leaps, particularly where this is
outside a simple harmonic field, such as an arpeggio, and/or is associated
with a freer speech-type of rhythmic articulation, we are likely to make the
paralinguistic interpretation that the utterer is agitated, frightened or
disturbed, or at least over-emphatically expressive (the ‘Schoenberg
effect’).
The paralinguistic implications of pitch, tessitura and motion have
clearly always had a strong influence on the practice of singing (consider
for example Japanese joruri singing for the Bunraku puppet theatre, the late
sixteenth century invention of recitative in Western Europe, etc.). Even
where pitch is compositionally ordained to follow the most rigorous
instrumentally conceived pitch-lattice logic, interpretation tends to input the
paralinguistic gestures in the form of small articulations of pitch and stress.
Schoenberg further extended these connections by his development of
Sprechstimme. A schematic analysis of pitch usage (and its combination
with timbral morphology) is indicated in Figure 14.5.
The use of pitch in any vocal utterance which is not clearly
conventional language or conventional pitch-lattice music needs careful
consideration. Curtay, in his lettrist compositions, specifies pitch in only the
four (linguistically significant) registers, low, medium, high and very high,
as he wishes to avoid any over-determination of the events by the logics of
conventional music. I would consider this a much too simple view as may
be evident from the analysis above. There is, however, a technical difficulty
in integrating clearly-pitched material into a sonic structure which has
previously contained no (stable) pitches; steady-state pitches tend to ‘stick
out like a sore thumb’, producing a marked discontinuity in our perception
unless they are introduced with great subtlety. It is clear that there are all
sorts of degrees of balance between a sonic art fundamentally rooted in the
relationships of fixed pitches (to which timbral characteristics are
subservient) and a sonic art based on the sophisticated control of timbral
possibilities (to which pitch characteristics are subservient). Although we
may achieve transformations of structural organisation away from one pole
and towards the other within a single sonic composition, it is important to
be aware where the perceptual focus lies. This seems to be a problem both
for conventional musicians (for example, in assessing timbrally articulated
works based on a relatively fixed harmonic field) and for text-sound-artists
anxious to abandon linguistic reference.
Figure 14.5 A schematic analysis of vocal pitch-usage.
Personality, society
Individual interaction
It is interesting to note that this ‘group syntax’ has already evolved in such
a lowly creature. Furthermore, the normal functionalist approach to the
explanation of all animal communications (for example, in terms of sexual
bonding, status etc.) is difficult to apply when three individuals are
involved! The bou-bou shrikes of East Africa —
[...] can sing in duet with such a rapid reaction time that unless an observer is actually
standing between the two birds it is impossible to recognise that more than one bird is
singing. [...] a pair of birds can elaborate a whole repertoire of duet patterns by which they
can recognise one another in dense undergrowth and be distinguished from other pairs in the
neighbourhood. In this species either sex can start or finish and either bird can sing the
whole pattern alone in the absence of the partner. When the partner returns, the pair can
either sing in perfect unison or sing antiphonally again. Trio singing has also been observed
[...]
(Hooker 1968: 333–334) (See Figure 15.2²).
Figure 15.2 Examples of duet patterns of various African bou-bou shrikes where X and Y represent
the two birds in the pair (after Thorpe and North (1965)).
Here, then, whatever the function or ‘meaning’ of the song, its syntax can
be articulated by two or three creatures acting ‘in concert’. In the music-
making of groups of humans, this mutual solidarity function is implicit. It
may be so distanced (within the structure of a larger society) that it is not
immediately perceived as a function of the musical activity by the
participants. Alternatively, the musical act may function specifically in that
role (ritual function of music in group ceremonies). Because, however,
many more levels of distancing are involved in human social
communication and interaction, we may represent within the convention of
group music-making the concept of disorder and strife as well as various
sophisticatedly differentiated conceptions of social cohesion.
Renaissance imitative vocal polyphony, for example, presents a
particular archetype of the relationship between the individual and the
group in the way in which the similar, but different, vocal lines are
harmonically co-ordinated. The sense of balance and equality within
harmony is quite distinct in its symbolic representation of the group from
Bach's Kyrie in the Mass in B minor. Here a sense of co-ordination of
utterance and planned development towards a final resolution is articulated
over a framework of highly affective dissonance. The social metaphor is
quite different.
And both of these differ quite markedly from choral works in which
the voices are rhythmically (and perhaps harmonically) co-ordinated in their
utterance, so we perceive only the group, or in which such groups are set off
in antiphonal relationship to each other. The organisation of the group may
point to specific group roles or functions within society, such as the
simulated collective meditation by a group of individuals in Stockhausen's
Stimmung or the quasi-religious action by a large group suggested in the
Introitus of Ligeti's Requiem.
At the other extreme, particular types of rhythmic dis-co-ordination
may present the group as a ‘sea of humanity’, a multitude or mob being
manipulated or out of control. Even this social image has subtle
ramifications. In his book, Anhalt (1984) gives some interesting insights
into this. Thus, in the third movement of Lutoslawski's Trois Poèmes
d'Henri Michaux, the composer uses a ‘fan’ effect in which the chorus moves
gradually from rhythmic synchronisation to a swarm effect. Anhalt
describes this as “an allusion to the individual will, which seems to prevail
over that of the collective [...]” (Anhalt 1984: 137). In the second
movement, however, we experience ‘raw force’, ‘aggression’, ‘mob
behaviour’, the shouts of a crowd, either semi-concerted or synchronised
like the synchrony and asynchrony of the crowd cries at a great fight.
Alternatively:
The ‘Kyrie’ of Ligeti's Requiem is a powerful showing of a mass of human beings, swirling
and twisting in so many vocal currents, adding up to a turbulent sea of voices in which the
identity of an individual is painfully and irrevocably submerged on account of the number of
concurrently used similar melodic designs and overlapping registers. The canonic structures
here have a ‘blind leading the blind’ character, conveying the cumulative affect of a hopeless
predicament for the whole mass; [...]
(Anhalt 1984: 200)³
1 Anhalt is here quoting from two articles originally in German. The English translation is
presumably his (Ed.).
2 After Thorpe and North, 1965 quoted in Hooker 1968: 334.
3 ‘Affect’ sic in last sentence of quote (Ed.).
Coda
Chapter 16
BEYOND THE INSTRUMENT: SOUND-MODELS
The distinction between object and model may be understood more easily
when we move into the field of sound-sources possessing a repertoire, such
as the human voice. In this case the definition of one particular additive
synthesis or FM spectral type is obviously totally useless (except, of course,
if we wish to specify a different such type for every single vocal event in
the stream). What we can, however, specify are various invariants which
occur in the sound-source, for example, a description of the particular
placing and spread of the formants of a particular voice and also typical
articulation structures (the vocal attack-time, transition phenomena between
formants and so on) and the general spectral typology (e.g. the shape of the
individual glottal impulse).
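Such a specification of invariants might be sketched as a simple data structure. This is only an illustrative sketch: the field names and all the numerical values below are assumptions for the purpose of example, not measurements of any actual voice.

```python
from dataclasses import dataclass

@dataclass
class Formant:
    centre_hz: float     # placing of the formant
    bandwidth_hz: float  # spread of the formant
    amplitude: float     # relative strength

@dataclass
class VoiceModel:
    """(Near-)invariants of one particular voice, as listed in the text."""
    formants: list            # particular placing and spread of the formants
    attack_time_s: float      # typical vocal attack-time
    glottal_shape: str        # general spectral typology of the glottal impulse

# Illustrative values only -- not measurements of any real voice.
a_voice = VoiceModel(
    formants=[Formant(650, 80, 1.0), Formant(1080, 90, 0.5),
              Formant(2650, 120, 0.4)],
    attack_time_s=0.03,
    glottal_shape="soft",
)
```

The point of such a structure is that the values persist across every event the voice produces, while moment-to-moment articulation happens elsewhere.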
All this, in fact, applies to standard musical instruments. As a simple
example, it is already well-known that the spectral richness of a piano tone
decreases with increasing frequency. This is a particularly simple law, but
we might also specify the ways in which spectral envelope, pitch, jitter and
amplitude envelope and fluctuations are correlated during various bowing
actions.
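The piano law mentioned above can be caricatured in a few lines. The constant below is an assumed, illustrative spectral ceiling, not a measured property of any piano; the sketch only shows the shape of the rule, richness falling as the fundamental rises.

```python
def audible_partials(f0_hz, ceiling_hz=6000.0):
    """Toy version of the law that a piano tone's spectral richness
    decreases with increasing frequency: count how many partials of
    the fundamental fall below an assumed fixed spectral ceiling."""
    return max(1, int(ceiling_hz // f0_hz))

# A low note carries many more partials than a high one.
low_count = audible_partials(55.0)      # bottom A of the piano
high_count = audible_partials(1760.0)   # a high A
```

A real sound-model would replace the single constant with measured correlations between register, hammer velocity and spectral envelope.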
This specification of (near-) invariants over a field of possible sounds
is what I mean by a sound-model. It is a more general notion than the
typical limited view suggested by the concept of ‘musical instrument’. For
example, we might model the rules which govern the relationship between
spectral change, pitch change and loudness change in a metal sheet which is
being flexed. Together with other invariants this will effectively specify the
intrinsic morphology of the sound-model and the set of rules will govern
the behaviour of the sound-model when we articulate it through some input
device (which might be part of a program or a direct physiological-
intellectual input; see below).
With the computer as sound-source, however, we are not confined to
basing our sound-models on existing physical objects or systems. We may
build a model of a technologically (or even physically) impossible object.
We might specify the characteristics of the voice of an imaginary creature.
Once, however, the sound-model is specified, we are free to change the
invariants of its behaviour. We may transform it into an entirely different
sound-model.
The crucial difference between building sound-models and building
sound-objects is that the former preserves a clear and perceptually relevant
distinction between intrinsic and imposed morphology. If we articulate the
object within the rules specifying its invariants, we perceive an imposed
articulation of the sound-model. If, however, we proceed with some process
which changes those invariants, then we actually perceive the sound-model
itself to change. The intrinsic morphology changes, the perceived source
becomes ‘something else’. This is crucial to our understanding of the
perception of typical analogue synthesiser sounds. Largely because
instrument-definitions on such synthesisers are based on sound-objects and
not on sound-models, then no matter how we transform the sound-material,
we tend to perceive it as coming from a synthesiser. It was not just a lack of
detailing in the modelling of individual spectra in the voltage control
synthesiser that made its sound-world characteristically ‘synthesiser’ but
the more general lack of structuring in relation to perceptual sound-models.
Some of these concerns may be illustrated with reference to the digital
synthesis language Chant. Here the language makes a broad specification of
the (semi-) invariants of the sound-model ‘human voice’, for example, the
field of formant bands and their characteristics, typical values of vibrato,
vibrato variations, jitter and so on. Composition with this language may be
multilevelled. At a gross level we may specify merely pitch, loudness,
vowel type (a, e etc.), variations in type of vibrato, and so on. At a deeper
level we may specify particular modifications of the formant bands (for
example, to characterise a particular idiolect) with which we then compose.
At yet another level we may wish to impose transformations on the formant
bandwidths and the attack structures of individual events. This begins to
interfere with the invariants and rules which govern the sound-model
‘human voice’ and by doing so we can generate sounds of bells, drums and
so on. Defining classes of sound-objects at the level of sound-models,
therefore, has a direct relationship with our perceptual categorisation of
sound-events. Change the invariants and rules and one changes the
perceived model.
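The ‘gross level’ of such control (pitch, vowel type, vibrato, jitter) can be illustrated with a minimal additive sketch. This is not Chant's actual algorithm, and the vowel formant values are illustrative guesses; the sketch only shows how formant invariants shape a spectrum while vibrato and jitter articulate it.

```python
import math, random

# Illustrative vowel formant tables (centre Hz, bandwidth Hz) -- assumed values.
VOWELS = {
    "a": [(700, 110), (1220, 120), (2600, 160)],
    "e": [(400, 80), (2000, 120), (2800, 160)],
}

def formant_gain(freq_hz, formants):
    """Each formant contributes a simple resonance peak to the spectrum."""
    return sum(1.0 / (1.0 + ((freq_hz - c) / bw) ** 2) for c, bw in formants)

def vowel_tone(pitch_hz, vowel, dur_s=0.05, sr=8000,
               vib_hz=5.0, vib_cents=30.0, jitter_cents=3.0):
    """'Gross level' control: pitch, vowel type, vibrato and jitter."""
    formants = VOWELS[vowel]
    samples = []
    for n in range(int(dur_s * sr)):
        t = n / sr
        # vibrato plus per-sample jitter perturb the fundamental
        cents = (vib_cents * math.sin(2 * math.pi * vib_hz * t)
                 + random.uniform(-jitter_cents, jitter_cents))
        f0 = pitch_hz * 2.0 ** (cents / 1200.0)
        # harmonics weighted by the fixed formant envelope
        samples.append(sum(formant_gain(k * f0, formants)
                           * math.sin(2 * math.pi * k * f0 * t)
                           for k in range(1, 12)))
    return samples
```

Composing at a ‘deeper level’, in the sense described above, would mean editing the entries of VOWELS themselves, at which point the perceived source begins to change.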
At the time of writing (1983), Chant does not model most of the
consonants. The modelling of such structures of articulation will be a key
development in the evolution of the digital computer as a powerful tool for
sonic art. Moreover, it should lead on to developing a modelling system for
natural processes themselves. Perhaps some rigour may be brought to this
through insights and mathematical techniques from differential topology.
This would give a sound theoretical basis to the concept of natural
morphology discussed in this book and allow us to have handles on the
evolution of a sound process that corresponds to the critical parameters of
the flow.
Given such powerful modelling systems, we may bring an imposed
morphology to bear upon the sound-models through some kind of real-time
or programmed input. The imposed fluctuations may then be made to
articulate what remains a ‘solid’ (relatively stable mass) object (by
changing the overall pitch level, loudness, loudness envelope, vibrato,
vibrato width, vibrato steadiness, tremolo width and steadiness, jitter, etc.
within certain limits) to ‘liquefy’ that object or a stream of objects (by
articulating the spectral envelope and pitch contour, possibly in relation to
one another, the typical event duration and density, the spread of pitch and
so on), or merely to interact with its ‘gaseous flow’. In the latter case I am
not thinking of simple vocoder-type processes but some way of mapping
bodily or vocal gestures into the flow properties of a sound (such as speed,
density, turbulence etc.).
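The distinction between articulating a model within its invariant limits (so that it remains ‘solid’) and pushing parameters beyond them can be sketched as a simple clamp. The parameter names and limit values here are invented for illustration.

```python
def articulate(params, fluctuation, limits):
    """Impose a morphology on a sound-model's control parameters.
    Fluctuations clamped within 'limits' articulate the model while it
    remains perceptually 'solid'; values that would exceed the limits
    mark the point at which the model itself starts to change."""
    result = {}
    for name, value in params.items():
        lo, hi = limits[name]
        result[name] = min(hi, max(lo, value + fluctuation.get(name, 0.0)))
    return result

# Hypothetical parameter set: the vibrato width stays inside the
# model's invariant limits, so the object remains 'solid'.
solid = articulate({"vibrato_width": 0.3, "jitter": 0.01},
                   {"vibrato_width": 0.2},
                   {"vibrato_width": (0.0, 1.0), "jitter": (0.0, 0.05)})
```

‘Liquefying’ the object would correspond to removing or widening the limits so that the imposed fluctuations overwhelm the invariants.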
Imitation, transformation
Operational fields
Human interfacing
The other crucial feature in the application of digital technology to sonic art
is the development of sophisticated hardware and software tools to permit
human physiological-intellectual performance behaviour to be transmitted
as imposed morphology to the sound-models existing within the computer.
This connects directly with the whole area of the evolution of musical
instruments in their more conventional sense. We may anticipate in the not-
too-distant future the development of a whole generation of digital sound-
producing devices which are, to a greater or lesser extent, analogous with
existing musical instruments (in fact some are already here). The keyboard
synthesiser has been with us for a long time but now we are beginning to
see the emergence of blown and bowed synthesisers which present
themselves to the performer as analogues of conventional mechanical
instruments but in which the sound-production is entirely electronic. The
immediate advantage of this development is that we may, in fact, select the
timbre that the instrument produces by varying the program. More
significantly, perhaps, we may alter the way in which our various
articulations of the string, air column etc. appear as articulations of the
sound.
The concept of the transfer of parameters was already well-developed
in the field of live electronics using analogue synthesisers. Thus, the arm
motion speed, breath flow or, more typically, the resultant loudness
variation, pitch variation and so on, could be monitored by various
electronic devices (such as envelope followers, pitch-to-voltage converters
etc.) generating a voltage proportional in some way to the magnitude of the
input. The resultant voltages could then, however, be used to control any
desired feature of the resulting sound. Amplitude might control pitch and
pitch control amplitude. Both might control the parameters of a second
instrument or more complex features of an evolving electronic sound-
stream.
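The envelope-follower and parameter-transfer idea can be sketched digitally. The time constants and the pitch range below are arbitrary illustrative choices, not properties of any particular device.

```python
import math

def envelope_follower(signal, sr=1000, attack_s=0.01, release_s=0.1):
    """Digital stand-in for the analogue envelope follower: tracks the
    loudness contour of an input signal with separate attack and
    release time constants."""
    coeff_up = math.exp(-1.0 / (attack_s * sr))
    coeff_dn = math.exp(-1.0 / (release_s * sr))
    env, y = [], 0.0
    for x in signal:
        x = abs(x)
        a = coeff_up if x > y else coeff_dn
        y = a * y + (1.0 - a) * x
        env.append(y)
    return env

def amplitude_controls_pitch(env, low_hz=100.0, high_hz=800.0):
    """Parameter transfer: the followed loudness drives the pitch of a
    second instrument (range chosen arbitrarily here)."""
    return [low_hz + e * (high_hz - low_hz) for e in env]
```

Exactly as in the analogue case, nothing ties the monitored parameter to the controlled one: the same envelope could equally drive filter bandwidth, grain density or spatial position.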
The analytic power of the computer at least potentially gives us the
ability to monitor in several simultaneous dimensions the subtle details of
performance behaviour. A sufficiently intelligent and fast machine should
be able to sense parameters of breath flow, formant structure, glottal, tongue
and lip vibrations, noise turbulence type and so on, separating these out so
that the information from each can be applied to the control of different
parameters of a sound-event. This sound-event may, of course, have nothing
whatever in common with the characteristics of behaviour which generate
the control information. Through practice, just as with the conventional
instrument, we can imagine the performer developing a sophisticated co-
ordination between his or her performance skills and the sonic output.
The design of sophisticated inputs has been one of the major
weaknesses of the digital instrument revolution but with the generality of
the computer there seems no reason why a whole range of multi-
dimensionally sensitive input devices should not be developed. These might
involve keyboards which were sensitive not only to which key was pressed
but also to finger velocity and/or pressure and to lateral motion of the
fingers (as in the analogue Buchla synthesisers). They might be made much
more interesting than the more common approaches have been by adding
devices allowing continuous contraction or expansion of the average
interval size, the warping of intervallic uniformity (perhaps in a pre-
programmed fashion), spectral change with register, or the linking of
attack velocity to, for example, timbral stability and duration rather than
to loudness. Bowed interfaces would be sensitive not only to pressure but also to
speed of bowing, the width of bow in contact with the string, the sul tasto-sul
ponticello dimension, the temporal fluctuation of these things and so on.
Even quadrapan and pedal devices might be redesigned. We might, for
example, imagine a console with two quadrapan units which were also
capable of moving in the up-down direction and two foot pedals which
could be moved not only up and down but also from side to side and
backwards-forwards. We might even use totally novel ways of inputting
physiological-intellectual information, such as the monitoring of the many
dimensions of facial gesture.
What, however, is clear above all else is that the internal architecture
of sounds becomes both analytically and conceptually accessible and hence
available for more or less precisely defined composition and, as our ability
to monitor the subtleties of human intellectual-physiological gesture and
transfer them onto sound-materials increases, our notion of what ‘music’ is
must become much more generalised. It must embrace and systematically
investigate areas that have traditionally been regarded as the legitimate
property of psycho-acousticians, phoneticians, poets and sound-poets, of
nature recordists and audio-zoologists, of naturalistic and ‘effects’-based
film-sound engineering and much more. Musicians will concern themselves
with the affective and systematic ordering of timbre structure, sonic gesture,
sound-landscape, the subtleties of psycho-linguistic and psycho-social cues
and many other dimensions of the sound-universe, alongside the more
traditional parameters of pitch and duration. The era of a new and more
universal sonic art is only just beginning.
Postscript
It is clear that we are about to see a radical change in the nature of our
civilisation. The impact of the computer, the universal metamachine, could,
in a short time, destroy the whole basis for the work-ethic upon which most
of our present-day materialist culture is built. We can expect a rough ride
into this new world from the guardians of social orders which are no longer
relevant but when we finally arrive we may at last find the arts playing a
central role in the lives of the community in general, provided, that is, we
do not manage to commit geno-suicide in the meantime.
The possibilities opened up for musical (and all other types of art)
exploration are truly staggering. It is as if a magical dream has come true.
We have the potential to make real any sound-event we can imagine. What
will prevent us from getting to grips with this new situation is primarily our
aesthetic preconceptions and lack of sensitivity. The effect of the former is
obvious and has always been with us; the lack of sensitivity, however, may
prove to be the most debilitating. The question is simply: if one can do
absolutely anything, what precisely is worth doing? If it is not to be judged
in terms of the pre-existing criteria of available musics, we must have
enough personal musical integrity to admit that there is a distinction
between the arbitrary manipulation of materials according to some
preconceived plan and the construction or performance of valid sonic
experiences. I hope this book might open up some new pathways without
leading us into the sterile wasteland of formalism.
BIBLIOGRAPHY
Anhalt, I.
(1984) Alternative Voices: Essays on contemporary vocal and choral
composition (Toronto, University of Toronto Press).
Ball, H.
(1974) Flight Out of Time: A Dada Diary (Elderfield, J., editor; Raimes, A.,
translator) (New York, Viking Press).
Barthes, R.
(1977) ‘The Grain of the Voice’. In Image-Music-Text (Heath, S., translator)
pp. 179–189 (London, Fontana/Collins).
Bastian, J.
(1968) ‘Psychological perspectives’. In Animal Communication (Sebeok, T.
A., editor) pp. 572–591 (Bloomington, Indiana University Press).
Bateson, G.
(1968) ‘Redundancy and Coding’. In Animal Communication (Sebeok, T.
A., editor) pp. 614–626 (Bloomington, Indiana University Press).
Berio, L.
(1959) ‘Poesia e musica—un’esperienza’. Incontri Musicali 3 (1959): pp.
98ff; also Contrechamps 1 (1983) pp. 24–35 (French) and (extract in
English) sleeve note to LP recording of Omaggio a Joyce (Vox Turnabout
TV34177).
(1967) Visage: sleeve note to LP recording Vox Turnabout 34046S
Bogert, C. M.
(1960) ‘The Influence Of Sound On The Behavior Of Amphibians and
Reptiles’. In Animal Sounds and Communication (Lanyon, W. E. and
Tavolga, W. N., editors) pp. 137-320 (Washington, American Institute of
Biological Sciences).
Boulez, P.
(1971) Boulez on Music Today (London, Faber and Faber).
Busnel, R.-G.
(1968) ‘Acoustic Communication’. In Animal Communication (Sebeok, T.
A., editor) pp. 127–153 (Bloomington, Indiana University Press).
Carroll, L.
(1994) Through the Looking Glass (London, Penguin Books).
Chowning, J.
(1971) ‘The Simulation of Moving Sound Sources’. JAES 19 pp. 2–6
(reprinted in CMJ 1(3) pp. 48–52).
Curtay, J.-P.
(1974) La poésie lettriste (Paris, Seghers).
(1981) Body Music 1 (Cassette with sleeve notes) (London, Audio Arts
Editions).
(1983) Lettrism, abstract poetry, mouth symbols and more ... (Unpublished
manuscript).
Darwin, C.
(1965) The Expression of the Emotions in Man and Animals (University of
Chicago Press).
Emmerson, S.
(1976) ‘Luciano Berio talks to Simon Emmerson’. Music and Musicians
(London) (May 1976) pp. 24–26.
(1982) Analysis and the Composition of Electro-Acoustic Music (London,
City University (PhD thesis): University Microfilms International).
Erickson, R.
(1975) Sound Structure in Music (Berkeley, University of California Press).
Hall, D.
(1992) Klee (London: Phaidon Press).
Hayes, B.
(1983) ‘A progress report on the fine art of turning literature into drivel’.
Scientific American 249(5) pp. 16–24.
Helmholtz, H.
(1954) On the Sensations of Tone as a Physiological Basis for the Theory of
Music (New York, Dover).
Hiller, L.
(1981) ‘Composing with Computers a Progress Report’. CMJ 5(4) pp. 7–
21.
Hopkins, G. M.
(1959) The Journals and Papers of Gerard Manley Hopkins (House, H. and
Storey, G., editors.) (Oxford University Press).
Hooker, B. I.
(1968) ‘Birds’. In Animal Communication (Sebeok, T. A., editor) pp. 311–
337. (Bloomington, Indiana University Press).
Isou, I.
(1947) Introduction à une nouvelle poésie et à une nouvelle musique (Paris,
Gallimard).
Joyce, J.
(1939) Finnegans Wake (London, Faber and Faber).
(1960) Ulysses (London, Bodley Head).
Kaufmann, W.
(1967) Musical Notations of the Orient (Bloomington, Indiana University
Press).
Kingdon, R.
(1958) The Groundwork of English Intonation (London, Longman).
Kostelanetz, R.
(1980) Text-Sound Texts (New York, William Morrow and Company).
Langer, S.K.
(1953) Feeling and Form (London, Routledge and Kegan Paul).
Levi-Strauss, C.
(1970) The Raw and the Cooked (London, Cape).
McAdams, S.
(1982) ‘Spectral Fusion and the Creation of Auditory Images’. In Music,
Mind and Brain (Clynes, M., editor) (New York, Plenum Press) pp. 279–
298.
Marler, P.
(1965) ‘Communication in Monkeys and Apes’. In Primate Behavior: Field
Studies of Monkeys and Apes (DeVore, I., editor) (New York, Holt,
Reinhart and Winston) pp. 544–584.
Motherwell, R.
(1989) The Dada Painters and Poets: An Anthology (Harvard University
Press).
Pauli, H.
(1971) Für wen komponieren Sie eigentlich? (Frankfurt a.M., Fischer).
Robson, E.
(1981) Phonetic Music with Electronic Music (Parker Ford, Pa., Primary
Press).
Rowell, T. E.
(1962) ‘Agonistic Noises of the Rhesus Monkey (Macaca mulatta)’.
Symposium of the Zoological Society of London 8 pp. 91–6.
Schaeffer, P.
(1966) Traité des Objets Musicaux (Paris, Du Seuil).
Schafer, R. M.
(1969) The New Soundscape (BMI, Canada/Universal Edition).
(1977) The Tuning of the World (New York, Knopf).
Stuckenschmidt, H. H.
(1959) Arnold Schoenberg (London, Calder).
Thom, R.
(1975) Structural Stability and Morphogenesis (Reading, Mass., W. A.
Benjamin).
Thompson, D'A. W.
(1961) On Growth and Form (Cambridge University Press).
Weber, M.
(1958) The Rational and Social Foundations of Music (Martindale, Riedel
and Neuwirth, translators) (Carbondale, Southern Illinois University Press).
Wessel, D.
(1979) ‘Timbre Space as a Musical Control Structure’. CMJ 3(2) pp. 45–52.
Wishart, T.
(1979) Book of Lost Voices (York, Wishart (private publication)).
Xenakis, I.
(1971) Formalised Music (Bloomington, Indiana UP).
[Revised edition: Stuyvesant, Pendragon Press (1992)].
MUSIC EXAMPLES
This book is about listening. Trevor Wishart insists that only the ear can
validate or criticise music composition. His original lectures and plan for
this book included recordings of a vast array of musical examples from
many sources. Copyright problems have made the assemblage of a
complete accompanying recording prohibitively difficult.
Within the current constraints of copyright law the author and editor invite
the reader to construct an ideal series of music examples from the following
list of commercial recordings which will be referred to in the text by the
example numbers indicated. This book makes most sense if the reader has
assembled these music examples and listens to them at the relevant point in
the text. All entries are CDs unless marked as LP or Cassette. Where the
author has made reference to specific sounds, transformations etc. timings
are given with respect to the actual recording cited, whereas where
reference is to a general style or approach no specific timings are given. For
well known works of which several recordings are easily available none is
specifically cited. (Where no source is cited the example may be found on
the accompanying CD.)
Chapter 1
Chapter 2
Ex. 2.1: Japanese Joruri singing (gidayu style) (e.g. Takemoto
Tsunatayu: Kiyari Ondo).
[King Record Co. Ltd. (Japan): KICH 2008]
Ex. 2.2: North Indian singing (e.g. Sulochana Brahaspati: Khyal
(Raga Bilaskhani Todi)).
[Nimbus (UK): NI 5305]
Ex. 2.3: Jazz singing (e.g. Billie Holiday: Lady Sings the Blues).
[Verve (Polygram): 823 246–2]
Ex. 2.4: Joseph Haydn: Dona nobis pacem from Missa in Tempore
Belli (‘Paukenmesse’)
Ex. 2.5: Japanese shakuhachi playing (e.g. Kohachiro Miyata:
Shika no Tone).
[Elektra Nonesuch: 7559–72076-2]
Ex. 2.6: A classical chamber work (using wind instruments) (e.g.
Antoine Reicha: Wind Quintet in D major op. 91 no. 3).
[Hyperion Records Ltd. (UK): CDA66268]
Ex. 2.7: Traditional jazz (e.g. Louis Armstrong and His Hot Five:
West End Blues).
[BBC Enterprises (UK): BBC CD 597]
Ex. 2.8: Anton Webern: Symphonie op. 21 (1st movement).
[Sony Classical: SM3K 45845]
Ex. 2.9: Krzysztof Penderecki: Polymorphia (2.10–3.34 + 4.50–
5.36).
[Polskie Nagrania (Poland): PNCD017(A+B)]
Ex. 2.10: Iannis Xenakis: Pithoprakta (7.40–9.00).
[Le Chant du Monde (France): LDC 278368]
Ex. 2.11: Karlheinz Stockhausen: Carré (0.00–1.26).
[Stockhausen Verlag (Germany): Stockhausen 5]
Ex. 2.12: Pierre Boulez: Don (from Pli selon Pli) (3.05–5.00).
[Erato (France): 2292-45376-2]
Ex. 2.13: Spontaneous Music Ensemble (John Stevens, Trevor
Watts): Face to Face 5 (3.15–4.25).
[Emanem Records (UK): (LP) EMANEM 303]
[2.01] Ex. 2.14: Trevor Wishart: Anna's Magic Garden (2.30–3.31).
[Overhear Music (Keele, UK): Ohm 00l]
Chapter 3
[3.01] Ex. 3.1: Melody with individual elements less than five
milliseconds long.
[3.02] Ex. 3.2: Melody with individual elements ten milliseconds long.
[4.01] Ex. 3.3: A low note on the piano.
[4.02] Ex. 3.4: As 3.3 but filtering out all frequencies below the second
harmonic.
[5.01] Ex. 3.5: An example of Shepard tones.
[6.01] Ex. 3.6: White noise: (a) normal (b) double speed (no pitch shift).
[6.02] Ex. 3.7: Complex sound: (a) normal (b) double speed (only a
third shift).
[7.01] Ex. 3.8: As 3.6.
[7.02] Ex. 3.9: Melody of filtered noise.
[8.01] Ex. 3.10: (a) piano with ‘flattened envelope’ (b) flute at same
pitch.
[8.02] Ex. 3.11: (a) flute with imposed ‘piano envelope’ (b) piano at same
pitch.
[9.01] Ex. 3.12: Sound with relatively constant envelope: (a) complete (b)
with start cut.
[9.02] Ex. 3.13: Piano note: (a) normal (b) with attack cut.
[10.01] Ex. 3.14: Bell sound: (a) normal (b) with attack cut.
[10.02] Ex. 3.15: Cymbal sound: (a) normal (b) with attack cut.
[11.01] Ex. 3.16: Vibraphone sound: (a) normal (b) with attack cut (3
versions).
[12.01] Ex. 3.17: Flute note: (a) with attack cut (b) normal.
[12.02] Ex. 3.18: Trumpet note: (a) with attack cut (b) normal.
[13.01] Ex. 3.19: The influence of onset synchrony on coherence.
[14.01] Ex. 3.20: Splitting an aural image into two (Roger Reynolds).
[15.01] Ex. 3.21: Different sound objects from a single source (metal sheet
and a taut string).
[16.01] Ex. 3.22: Sound of definite mass: resistance to filtering and its
transposition.
[17.01] Ex. 3.23: Grain illustrated with electronic impulses.
[17.02] Ex. 3.24: Grain illustrated for a bassoon note.
[18.01] Ex. 3.25: Speed up of melody into grain: (a) descending scale (b)
irregular melodic pattern.
[19.01] Ex. 3.26: Speed up of string of speech sounds approaches speech
multiplex.
Chapter 4
[20.01] Ex. 4.1: Two sequences based on David Wessel's researches into
timbre space.
Chapter 5
[21.01] Ex. 5.1: Chant example: transformation from bell to male voice.
[22.01] Ex. 5.2: Trevor Wishart: Red Bird transformation ‘(Li)-sss-(ten)’
to birdsong (1.26–1.42).
[October Music: Oct 001]
[23.01] Ex. 5.3: A typical vocally-produced multiplex.
[24.01] Ex. 5.4: As 5.3 but the field characteristic of the multiplex
changes with time.
Chapter 6
Chapter 7
Chapter 8
Examples 8.1–8.12, from Trevor Wishart's Red Bird are all found on the
accompanying CD. The whole work is found on October Music (UK): Oct
001.
Chapter 10
Ex. 10.1: Jean Sibelius: Symphony No.4 (III: bars B+6 to C).
Chapter 11
Chapter 12
The sound examples from Chapter 12 were all performed by Trevor Wishart
and are to be found on the accompanying CD.
The tongue
Filters
Water effects
Inhaled sounds
Voiced pulses
Chapter 13
1 Original commissioned broadcast BBC2 Television (Sounds Different: ‘Music Outside’) 1980.
MUSIC REFERENCES
The following works are referred to in the text in addition to those specified
by the author as music examples for audition (see above). For
electroacoustic works recording details are given otherwise the publisher of
the score.
Ferneyhough, Brian
Second String Quartet [Peters Edition]
Time and Motion Study III [Peters Edition]
Ferrari, Luc
Music Promenade [(LP) Wergo 60046]
IRCAM
IRCAM un portrait [(LP) Centre Georges Pompidou (Paris) IRCAM 001]
Ligeti, György
Nouvelles Aventures [Peters Edition]
Requiem [Peters Edition]
La Barbara, Joan
Voice is the original instrument [(LP) Wizard Records (New York) RVW
2266]
Reich, Steve
Come Out [(CD) Elektra Nonesuch 979 169–2]
Schoenberg, Arnold
Erwartung [Universal Edition]
Pierrot Lunaire [Universal Edition]
Smalley, Denis
Orouboros [Unpublished]
Stockhausen, Karlheinz
Mikrophonie I [Universal Edition]
Hymnen [(CD) Stockhausen Verlag 10]
Telemusik [(CD) Stockhausen Verlag 9]
Stimmung [Universal Edition]
Wishart, Trevor
Anticredos [(CD) October Music (UK) Oct 001]
Tuba Mirum [Wishart (York)]
Vox I [(CD) Virgin Classics (UK) VC 7 91108-2]
INDEX
Allen, Jane xii
Anhalt, Istvan xi, 310–1, 315–6, 321
Armstrong, Louis 340
Auden, W.H. 175
Augustin (Saint) 15
Cage, John 5
Cardozo, B. L. 54
Carroll, Lewis 299, 309
Chowning, John 194
Cobbing, Bob xiv
Coldman, Richard 5, 340
Coleman, Peter xii
Confucius 14
Cott, Jonathan 194
Curtay, Jean-Paul xi, xiii–xiv, 184–5, 263, 280, 283–4, 289, 294–5, 307, 311
Dante 179
Darwin, Charles 249–50
Davies, Hugh xiv, 36, 353
Descartes, René 31
Eliot, T. S. 169
Emmerson, Simon xii, 113, 136
Endrich, Tom xi
Erickson, Robert 67–8, 94
Ernst, Max 137
Galilei, Galileo 85
Garland, Judy 259
Gauss 59–60
Goody, J. 20, 22
Goin 319
Gregory (Pope) 15
Kaufmann, Walter 18
Keane, David xii
Kepler, Johannes 47–8
Kerouac, Jack 308
Kingdon, R. 310
Klee, Paul 134
Koch, L. 292
Kostelanetz, Richard xiv, 6, 308–9
Machover, Todd 80
Magdics, K. 310
Mallarmé, Stéphane 290
Markov 70
Marler, Peter R. 253, 255
Marx, Karl 14
Mayes, Martin 36
McAdams, Steven 64–5
McNabb, Michael 5–6, 65, 159, 163, 285–6, 340, 343
Miyata, Kohachiro 340
Moorer, James Andy 328
Motherwell, R. 293
Palestrina, Giovanni da 15
Parker, Charlie 30
Parmegiani, Bernard 5, 111, 136–7, 155, 340, 342–3
Pauli, Hansjörg 129
Penderecki, Krzysztof 32, 341
Piaf, Edith 259
Planck, Max 54
Plato 13–14, 16, 35, 39, 47–8
Plutarch 45
Potard, Yves 96
Pythagoras 45, 47–8, 67, 71–5, 78, 129, 188