The game environment from an auditive perspective
by Axel Stockburger

1. Introduction
The point of departure for this paper is to consider the concept of spatial practice as a possible perspective for the understanding of computer and videogames. Espen Aarseth (2000, p.169) states that “[t]he problem of spatial representation is of key importance to the genre’s aesthetics”. Jesper Juul (1999, p.46) seems to confirm this position when he says: “computer games are almost exclusively set in a space”. However, he does not follow up on this important observation.
Sound is not necessarily the first thing that comes to mind when we think about spatial
representation. Nevertheless, it is a significant factor in the emergence of specific immersive environments generated by contemporary 3D computer games.

Of course sound serves other important purposes, which are not exclusively related to spatial representation: to inform and give feedback, to set the mood, rhythm and pace, and to convey the narrative. If we consider all of this, and if we agree that the majority of games are audiovisual artefacts, it is curious that sound is given so little attention in the literature. Then again, this may not come as a complete surprise if we remember that it took decades in the field of film studies to develop a deeper knowledge of the inner workings of sound practice. Indeed, the marginalisation of sound and the concentration on vision as the dominating sense can be found in theoretical approaches to all audiovisual media systems.

A recent methodological model by Lars Konzack (2002, p.89), which sets out to provide a framework for the complete analysis of any particular computer game, does not even mention sound. He has used his model to analyse the fighting game Soul Calibur (Namco 1999), and developed seven layers of analysis: hardware, program code, functionality, game play, meaning, referentiality and socio-culture. Elements of the visual aesthetics are described as part of game play, meaning and referentiality. I am convinced that the various effect sounds connected to the different weapons in a game like Soul Calibur are important elements of the gameplay, as they generate feedback about the player’s performance while they are referentially linked to the genre of martial arts films and their distinctive sound design. This is one of many examples of the marginalisation of sound in the current literature.

Whenever sound does turn up on the agenda, it is very often in relation to film and music. Poole (2000, p.80) for example argues that a superficial similarity between films and video games exists because both communicate to the eyes and ears of the audience and share methods of sound production. There are indeed some similarities between the sound practice of film and that of computer games, which makes the use of film theory relevant in this context. Of course, the complex issues of the use of sound in film are still discussed and remain controversial. However, there is one unifying element among the many unanswered questions, “[one] claim on which we all can agree: the image has been theorised earlier, longer, and more fully than sound.” (Altman, 1992, p.171).
I would like to argue that computer games have a very specific way of deploying sound,
which is different from film as we will see in the following.

According to Brenda Laurel (1991, p.161), “[t]ight linkage between visual, kinaesthetic, and auditory modalities is the key to the sense of immersion that is created by many computer games, simulations and virtual-reality systems.”
In this paper we will attempt to take a closer look at this “linkage” between the different sensual modalities in relation to the spatial nature of computer games.
Even though we take an auditory perspective on the game environment, it is crucial for our undertaking to recognise that sound cannot be analysed in isolation and that visual and kinaesthetic modalities must be taken into consideration.

First we will discuss the notions of the game environment and the user environment.
Then we will define a number of sound objects. Finally, the spatial functions which regulate the deployment of sound objects in the game environment will be described. As a basis for our examination and for demonstration purposes, we will use the game Metal Gear Solid 2: Sons of Liberty (MGS2). The reason for choosing this game is that although it can be seen as part of a larger genre, it uses sound in an innovative way. MGS2 was developed by Hideo Kojima at Konami and primarily uses a 3rd person perspective. The main theme is tactical espionage and infiltration. Kazuki Muraoka led the sound production team.

Although the technological evolution has had a significant impact on the concepts and strategies of sound design for computer games, we will deliberately avoid discussing it here. There are numerous sources covering the history of technological development. A good example is the account given on Joerg Weske’s website (Weske 2000). We will assume that the reader is familiar with the concepts of stereo sound and Dolby Surround 5.1, which is used by MGS2.

2. The distinction between user and game environment
It is important to first define the categories influencing the player’s spatial experience. One category is constituted by the actual real world space in which the player plays the game and listens to its audio. It is quite obvious that external factors such as the size and the nature of the room (game arcade, living room or Game Boy on a bus), as well as the specific hardware used (stereo headphones or home theatre), influence the auditory spatial experience of a game. The nature of this type of space, which I will in the following refer to as the user environment, has a very high variability and essentially differs from player to player. Although this could open an interesting field for research into the acoustic qualities of the “real world” surroundings of players and their influences on the game experience, e.g. in public spaces or at home, this study intends to look at the mechanics that can be observed in the game itself.
This means that for our purposes, we will assume an “ideal” user environment based on the hardware needs derived from the game manual and an “ideal” room that is free of external auditory influences.

In contrast to that, there is the space brought forward by the game itself. It can be understood as a collection of sound elements organised as culturally coded representations of space. This space, which we will call the game environment, encompasses the sounds originating from the game during play. This also includes the sound of the credits before and after the game, all the sounds related to its interface, as well as the program or software that defines their deployment or generation.
Although our perspective is directed towards sound, most computer games are audiovisual artefacts. It is therefore sensible to widen the category of the game environment to include the visual elements in a game, as well as the program defining the relations between visual and sound elements.

The game environment is usually generated by a very consistent set of elements that will not vary from player to player. As most games are packaged as a product in a specific form on a storage device, the distribution of the same data to all the players is ensured. However, there is a growing number of games that reside on dedicated internet servers and are thus altered and updated periodically. Game modifications, patches, add-ons and updates are also changing the game environment. The following is a sketch for a method of game audio analysis in relation to spatial practice, and it focuses on the elements and functions in the game environment.

3. Sound Objects
Now that we have established the game environment as the origin of sound and graphics, it is necessary to develop a way of describing and differentiating the sounds it generates. In this respect it might be interesting to consider the status given to sound in the production of games and the concepts arising from it.
The structure that defines the relations between all the elements in a game is commonly
referred to as the game architecture. Following the concept of object-oriented programming, all the separate elements of the game, such as pre-recorded sound files and textures, are understood as objects. These are organised in classes. The software that makes the interaction between these objects possible is called the game engine. A game engine is first and foremost a set of software libraries. The architecture consists of the objects within classes on the one hand, and the libraries defining how they are implemented in the game environment on the other. Following this concept, every sound in the sound library is referred to as an object and is treated on the same level as a graphic file, as one of the many objects comprising the game.
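The object/class organisation just described can be sketched in a few lines. The class and method names below are illustrative assumptions, not taken from any actual game engine:

```python
# Hypothetical sketch of the object/class structure described above.
# All names are illustrative assumptions, not engine API.

class SoundObject:
    """A discrete sound element, e.g. a pre-recorded file or a
    procedurally generated signal."""
    def __init__(self, name, duration):
        self.name = name
        self.duration = duration  # seconds
        self.volume = 1.0
        self.pitch = 1.0

class SoundLibrary:
    """The engine-side library that organises sound objects in classes
    and defines how they are deployed in the game environment."""
    def __init__(self):
        self._objects = {}

    def register(self, category, sound):
        self._objects.setdefault(category, []).append(sound)

    def lookup(self, category):
        return self._objects.get(category, [])

library = SoundLibrary()
library.register("effect", SoundObject("footstep_metal", 0.4))
library.register("speech", SoundObject("codec_call_01", 12.5))
print([s.name for s in library.lookup("effect")])
```

On this view, a footstep sample and a wall texture are structurally equivalent: both are objects retrieved from a library and deployed by the engine.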

In order to emphasise the discrete nature of the sound elements used in the game environment, we will use the term sound object for our purposes. As sounds can also be produced dynamically by a program, it is important not to limit our understanding of the term solely to the actual sound files on a CD-ROM or hard drive; they could also be generated by a dedicated algorithm for a sound chip. How is it possible to access and describe these sound objects if we do not have direct access to the game’s sound library?

Most game researchers are familiar with the dilemma of being confronted with the black box of the program on the one hand, and the output of the game on the other. Jesper Juul’s (1999, p.35) triadic model of games describes this as the interaction between program (game architecture), material (sound, text, graphics) and output. He says that “The interesting focus in a system like this regards the relationship between the represented and the rules for the combinations of material.” and claims that “[in] the computer game: the material and the program can be taken apart” (1999, p.36).
He uses this dichotomy between program and material to set computer games apart from traditional narrative text, following Aarseth’s (1997, p.94) model of ergodic text.
Juul even argues (1999, p.36) that, for the skilled player of an action game, the rules of the game are of higher importance than the material. This might hold true for a limited number of games, but it is certainly not true for a game like MGS2, which has a very idiosyncratic and aesthetically diverse material side. In other words, one could produce a new game based on the rules present in MGS2, but it would be a very different experience.

Generally, we can say that it is precisely the dynamic relation between material and program that separates sound practice in a computer game from that in other media, and thus we will concentrate on it. In our context this means that the sound objects, many of which belong to the material, can first of all be grouped according to their use in the game. Because the program defines the use of sound objects in the game, the relation between program and material has to be kept in view throughout. With the advent of RDSP (Real-time Digital Signal Processing) this dynamic relation between program and material is further emphasised, because the program has the ability to dynamically alter the aesthetic appearance of sound objects.
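The dynamic relation between program and material can be illustrated with a minimal sketch: the program alters the appearance of a stored sound object at playback time. Real engines would apply filters, reverb or pitch shifts; a simple amplitude scaling stands in for these here, and all names are assumptions for illustration:

```python
# Minimal sketch: the "program" side transforms the "material" side
# (stored samples) at playback time. Amplitude scaling stands in for
# real-time filters, reverb or pitch shifting.

def process_samples(samples, gain):
    """Apply a runtime gain to the raw samples of a sound object."""
    return [s * gain for s in samples]

# The same material yields different output depending on the program
# state at the moment of playback.
stored = [0.5, -0.25, 0.75]             # the material: stored samples
normal = process_samples(stored, 1.0)   # unaltered playback
muffled = process_samples(stored, 0.5)  # dynamically attenuated playback
print(normal, muffled)
```

The point is structural rather than acoustic: the stored data never changes, yet what the player hears is computed anew each time the program runs.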

The use of sound objects in the game architecture reveals one of the major differences
between the structuring of sound in a game and the sound practice in film. All the sounds or objects are part of a dynamic environment. Their qualities, such as pitch, volume, reverberation and other effects, as well as the relations among them are defined by a program that can also be influenced by user action. Every sound object can potentially enter a temporal and spatial relation with every other sound or graphic object. Obviously, the program differs to a great extent between different types of computer games, and the following typology might have to be adapted accordingly.

The notion of the sound object clearly resonates with meaning from the field of music
theory and its use in the game context has to be clarified. The “objet sonore” was
introduced by the French theorist and musician Pierre Schaeffer in his influential work
“Traité des Objets Musicaux” (1966). Schaeffer proposed the method of reduced listening: a way of listening that would avoid the habit of searching for the semantic properties of sounds and instead try to find ways of describing their specific properties and perceptual characteristics. It is quite obvious that a mode of reduced listening will not be achieved when we are playing an audiovisual game, simply because we are drawn to construct relations between the visual and auditory information we are receiving. However, it is possible to describe sound qualities, which generate a spatial understanding independent of an indexical connection to their source. Ulf Wilhelmsson (2001, p.119) has shown that sounds can contain orientational information, such as up-down or approaching-leaving schemes, via the change of their pitch. He claims that sounds have spatial qualities without being indexically linked to their material source and describes how changes in loudness and pitch can generate the illusion of movement. Sound objects could be analysed according to these inherent qualities, as Wilhelmsson has shown. Yet, this paper intends to focus on the use of sound objects in the game environment, and there is no room to describe the inherent qualities of the sounds themselves in detail here.

We will instead attempt to develop a typology of sound objects according to their use in the game MGS2. Pre-rendered cinematics are excluded from this analysis, as most of their characteristics could be analysed fully by taking a film theoretical approach. This has been demonstrated by Sacha Howells (2002, p.110). Following this initial typology we will try to explain how sound objects are used in relation to each other, as well as in relation to the visual elements in the game and the user action.

4. Types of sound objects in the game environment
This typology of sound objects aims to identify the inherent qualities of different types of sound objects present in the game environment. We will discuss five different types of
sound objects, namely speech, effect, zone, score and interface sound objects. This is a
preliminary classification based on the observation of MGS2, and it might be possible to
define more or other sound objects in other games.

4.1 Speech sound objects
Speech is used in a variety of ways in computer and videogames.
Mostly it is employed as an intrinsic element of the diegetic system, developing the narrative of the game. Speech will either be recorded speech, spoken by voice-over actors, or synthesized by the FM chip, producing the type of speech that is usually associated with computers or robots. In MGS2 speech sound objects are used in various ways. All the important characters in the game, such as Solid Snake, Raiden, Vamp, Rose and Otacon, are linked to speech sound objects derived from voice-over acting. It is interesting to note that speech sound objects in MGS2 are always accompanied by written text in the style of film subtitles. They are indeed the core transport elements of the game’s narrative. Most options and objectives are explained via speech. The game employs a radio/video communication device that contacts the main character on a regular basis and enables the player to reach other characters in order to receive directives or help. Even the save interface is an element of the described communication device. Each time the user saves a game state, he is drawn into a conversation with a person. This trick cleverly incorporates the save dialogue into the diegetic system of the game.

Speech sound objects are also constantly employed for the description and mapping of
locations in the game environment. They are often qualitatively transformed in order to represent other media systems and their specific aural qualities, such as telephone, radio or TV. In MGS2 this is present in the radio communication, which is accompanied by hissing and crackling noises. The communication device is also used to emphasize the spatial separation between the user and game characters. It enables the construction of complex spatial relations between speakers, who are supposed to be in separate locations in the game environment.

There are two very different ways in which speech sound objects influence the spatial practice of a game. On the one hand, spatial information can be transported by the text in the form of language. Whenever a character in MGS2 gives us directions, telling us where to move or which object or place to look for, it influences our movement in the game environment. On the other hand, speech sound objects can move through the game environment, which generates a sense of the location of characters. This is especially strong in a 3D sound environment, such as Dolby Surround. The movement of sound objects in the game environment can be understood as a spatial function, and we will discuss specific spatial functions following our typology of sound objects.

4.2 Effect sound objects
Effect sound objects are sounds that are cognitively linked by the player to visual objects or events in the game environment. They are, in other words, perceived as being produced by or attributed to visual objects or events within the diegetic part of the game environment. In this context, visual objects means all the visual elements that are part of the game environment, whether moving or static, directly interactable or not.

There are numerous examples of visual objects that are linked to effect sound objects, such as opponents in the game, consumable objects, doors, transportation devices and so on. Sometimes effect sound objects are connected to direct user action, sometimes they are synchronised to visual events in the game, and at other times they are merely used to generate the impression of an action without a visual equivalent.
The realm of the effect sound objects is generally constituted by all the sounds that are at the forefront of the user’s attention, with the exception of intelligible speech. They are often used to signal changes in the game state. They can provide feedback about changes of conditions in the game, such as the points gained, the health status, birth (spawning) or death events. Here it is important to note that they do not have to refer to a game object that is visually represented.

The sounds one hears in synchronisation with the movement of the avatar, “motoric” sounds such as footsteps or motor sounds, are included within this type. In spatial terms, effect sound objects have the ability to situate objects in the game environment.
Like speech sound objects, they can also be moved through the game environment, which is in most cases achieved by panning from left to right in a stereo setup, as well as by modulating loudness in order to generate the illusion of objects approaching or leaving. In the case of MGS2 there is a large number of effect sound objects. They could be classified as being linked to the avatar, the game characters, objects, and events. However, this system is not rigid and the placing of some of these objects in the classification could be discussed further. From this perspective, the following list is not intended to be exhaustive:

a) Effect sound objects linked to the avatar:
- External body related sounds generated by the avatar’s movement, such as footsteps,
sounds produced by fighting or martial arts moves (a swishing type of sound – cutting
through the air), the sound of the avatar swimming, the sound of knocking on walls
(intended to confuse guards), cries of pain when the player is hurt.
- Internal body sounds such as heartbeat - which is used in an intriguing way to strengthen identification with the avatar in connection with the vibration of the controller when the player is hiding in cupboards, breathing sound when smoking a cigarette.

b) Effect sounds of usable objects carried by the avatar:
- This includes all of the weapon sounds, such as a variety of guns, grenades, a sword but also objects like binoculars and different types of sensors, throwable objects and clothes.

c) Effect sound objects linked to game characters:
- Movement sounds such as footsteps, weapon sounds, a particular sound for surprise,
snoring sounds, yawning sounds.

d) Effect sound objects linked to other entities in the game environment:
- Opening and closing of doors, hatches, cupboards.
- Elevators and other transport devices, servo sounds of cameras, flying drones, helicopters, planes, ships, birds.

e) Effect sound objects linked to events in the game environment:
- Sounds produced while objects are consumed by the avatar: power up objects, ammunition objects, weapons and tools.
- Sounds produced by bombs before they explode (ticking) as well as the explosion itself.
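The movement cues mentioned above, panning between stereo channels and loudness modulation with distance, can be sketched as follows. The constant-power pan law and the linear distance falloff used here are common audio conventions assumed for illustration, not details documented for MGS2:

```python
# Illustrative sketch of stereo panning plus distance attenuation for a
# moving effect sound object. The pan law and falloff are assumptions.

import math

def pan_and_attenuate(source_x, source_dist, max_dist=30.0):
    """Return (left_gain, right_gain) for a sound source.

    source_x: -1.0 (fully left) .. +1.0 (fully right)
    source_dist: distance from the listener, in metres
    """
    # Loudness falls off with distance, creating approach/leave cues.
    attenuation = max(0.0, 1.0 - source_dist / max_dist)
    # Constant-power panning distributes the signal over both channels.
    angle = (source_x + 1.0) * math.pi / 4.0  # 0 .. pi/2
    left = math.cos(angle) * attenuation
    right = math.sin(angle) * attenuation
    return left, right

# A guard's footsteps approaching from the right stay panned right and
# grow louder as the distance shrinks:
print(pan_and_attenuate(0.8, 25.0))
print(pan_and_attenuate(0.8, 5.0))
```

Run once per footstep with the guard's current position, the two gains produce exactly the approaching-from-the-right illusion described in the text.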

4.3 Zone sound objects
Zone sound objects are sounds that are connected to locations in the game environment. Zones or locations can be understood as different spatial settings that contain a finite number of visual and sound objects in the game environment. A zone might be a whole level in a given game, or part of a set of zones constituting the level.
Zones are separated by differing visual, kinaesthetic or auditory qualities. In special cases different zones overlap. Zone sound objects aurally define zones within the game environment. They can have an indexical or non-indexical connection with the visual objects or events present in the zone. They share many qualities with the type of film sound Michel Chion describes as ambient or territory sound when he suggests that we “[…] call ambient sound, sound that envelops a scene and inhabits its space, without raising the question of the identification or visual embodiment of its source: birds singing, churchbells ringing. We might also call them Territory sounds, because they serve to identify a particular locale through their pervasive and continuous presence.” (1994, p.75).

However, in our context the spatial metaphor of the zone is preferable to the symbolic notion of the territory. MGS2 contains a number of different zone sound objects.
There are two main outside zones: an oil rig and a ship. On the platforms of the oil rig the zone sound is generated by waves, wind and the sounds of seabirds. The outside zone sound on the ship is mainly characterised by the sound of raindrops, which is particularly immersive. The inside zone sound objects are usually dominated by ambient mechanical sounds, such as the humming of an air conditioning system and, in one particular case, a conveyor belt.
A very good example of a zone sound object is the sound in the underwater level, a flooded part of the oil rig (Shell 2 Core), which has a particularly muffled sound quality that perfectly reproduces the immersive experience of being under water.
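The binding of zone sound objects to locations could be sketched as a simple lookup; the zone identifiers and sound names below are hypothetical illustrations, not taken from the game's actual data:

```python
# Hypothetical sketch: zone sound objects bound to locations, so the
# active ambient loops follow the zone the player currently occupies.
# Zone and sound names are illustrative assumptions.

ZONE_SOUNDS = {
    "oilrig_platform": ["waves", "wind", "seabirds"],
    "ship_deck": ["raindrops"],
    "shell2_core_flooded": ["muffled_underwater_ambience"],
}

def active_zone_sounds(player_zone):
    """Return the zone sound objects for the zone the player is in."""
    return ZONE_SOUNDS.get(player_zone, [])

print(active_zone_sounds("oilrig_platform"))
```

Overlapping zones, as mentioned above, would simply return the union of several such entries, cross-fading between them at the boundary.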

4.4 Score sound objects
The game score or music consists of a number of sound objects that belong to the non-diegetic part of the game environment. In numerous games the player can decide to switch the music on or off independently from the sound effects.
Game music is a very complex area and we will therefore only consider the qualities of the score, which are important for the spatial apparatus of the game.
Score sound objects can also be connected to locations in the game, which makes them significant for the spatial practice of a game. Generally the score has a huge emotional impact on the player and it can enhance the feeling of immersion. This means there should not be too many gaps within the musical score of a game, as this would threaten the immersive bond with the player. Score sound objects are often used to mask transitions and to veil load times or idle situations. There are a number of interesting aspects that remain to be analysed, of which the possible dynamic relations between the game score and game events might be the most interesting.

The score of MGS2 was produced by the Hollywood composer Harry Gregson-Williams, who has written the scores for action films such as The Rock, The Replacement Killers and Enemy of the State. It was his first game score, and in an interview on the making-of DVD (Konami, 2002) that is shipped with the game, he says that it was interesting for him to produce music without having a visual reference for it. Instead he chose to write themes linked to different actions or states in the game, such as sneaking, alert, action and ambient, as well as general feelings, like being watched or tension. Additionally he produced individual themes for the main game characters, which is a pattern reminiscent of traditional film soundtracks. He then delivered these elements as 1-minute clips, which were built into the game by Konami’s sound department. Overall the music is synthesizer based and is built around several different drum patterns related to the different states of alert in the gameplay.

4.5 Interface sound objects
Interface sound objects share most of the qualities of effect sound objects, with the notable exception that they are usually not perceived as belonging to the diegetic part of the game environment. However, it cannot be ignored that a number of games have managed to include interface elements into the diegetic part of the game environment in very clever ways. MGS2 is a perfect example of this because it manages to include load/save dialogues into the overall game narrative.
Interface sound objects are all the sounds connected with saving or loading gamestates, and with changing the settings for the game. In the case of MGS2, all the sounds one hears when changing the different game settings, such as the controller functions, image and sound settings, onscreen representation of objects carried by the user can be understood as interface sound objects. Here the sounds are short bleeps, which can be heard whenever a setting is changed.

These sound objects enhance the knowledge of the present location within a metaphorical structure and give feedback about actions. Interface designers have even coined the term earmark, the sound equivalent of the icon. Current research (Maaso, 2001) is trying to define the possibilities of adding sound options to information media systems in order to improve usability. This process is termed “sonification”.
Now that we have distinguished between different types of sound objects, we will move on to describe their interrelations, as well as how they function when they are linked to visual objects, events and user action.

5. Spatialising functions in the game environment
Without doubt, one of the most important contributions to the understanding of sound in
film has been made by the French theorist and musician Michel Chion. In a number of texts he analysed the internal workings of visual and auditive elements in film and TV. He
describes the relationship between sound and image, the “audiovisual contract”, as a sort of “symbolic pact to which the audio-spectator agrees when he or she considers the elements of sound and image to be participating in one and the same entity or world”(1994, p.222).
This notion emphasizes the aspect of construction on the one hand, and underlines on the other that we are facing cultural conventions that are subject to change. It is quite obvious that computer games offer an audiovisual contract as well. Yet, we have to ask ourselves in which way the rules constituting this contract in a computer game differ from film. We have already mentioned one major difference between the two media earlier in the text: in film, the audiovisual contract inscribes static relations between the elements, whereas in games they are dynamic and potentially user-driven. Following this logic, dynamic relations between sound objects, visual objects and user action which define the spatial practice of a game can be understood as spatialising functions. A function defines all the participants of a relation, as well as its nature, over time.

One of the most important of these spatialising functions, which we have already mentioned, defines the movement of sound objects in the game environment. In the following we will discuss how some other functions can be described. Although there are many distinctive spatial functions, we will concentrate on the two most important ones employed by MGS2, namely the dynamic acousmatic function and the spatial signature function. It is important to note in this context that the use of a sound object can be influenced by either one function or by a combination of them.

5.1 The dynamic acousmatic function
The “acousmatic” is a Pythagorean term, which considers the distance separating sounds from their origin. It refers to the situation of the disciples listening to the words of the priest while he is hidden behind a curtain. For some, the term is very precise and refers specifically to this listening situation. However, it has gained wider usage in describing a genre which, to a large extent, derives from the Musique Concrète tradition and is founded upon this specific listening situation. The acousmatic has been very important for Pierre Schaeffer in his discussion of the sound object, and Michel Chion transformed it in order to describe the specific relations of sound and vision in film. Radio, phonograph and telephone can be identified as purely acousmatic media systems. Film on the other hand has the possibility of showing the source of a sound. “In a film an acousmatic situation can develop along two different scenarios: either a sound is visualized first, and subsequently acousmatized, or it is acousmatic to start with, and it is visualized only afterwards.” (Chion 1994, p.72).

The importance of the acousmatic situation for the representation of a specific spatial setup can be understood if one considers how people localize sounds: they usually try to identify the source of a particular sound. In a perceptual sense sound surrounds us completely and, even if spatialisation is at work through the phase difference between our ears, hearing is not directional in the same way as seeing. The natural everyday action of accurately locating a sound that one hears from a place that is not part of the visual field is to move one’s head in the general direction and to try to visually locate the source of the sound. A film does not devolve that action to the viewer; it uses the apparatus of the spatially fixed container of vision, the frame, and the possibility of sound editing to create different spatial situations, which are then frozen and generate one possible temporal and spatial narrative stream or order of events.

The description of sound as being either off or on screen is used frequently to describe
these acousmatic situations in film. It has been argued that it is not really the sound that is on or off screen, because the sound can either be heard or not, but that the reference is always being made to the visual source of the sound. Christian Metz (1985, p.157) states that “[we] tend to forget that a sound in itself is never “off”: either it is audible or it doesn’t exist” and he goes on to say, “[t]he situation is clear: the language used by technicians and studios, without realizing it, conceptualizes sound in a way that makes sense only for the image.”(1985, p.158). Thus, the concept of off and on screen sound seems to be problematic, or at least confusing in our context. We will refer to sound objects that are related to visual objects in the player’s field of vision as visualized sound objects. In contrast, those sound objects linked to visual objects outside the player’s field of vision will be called acousmatised.

Contemporary computer games, and especially first person perspective 3D games, allow the user to actively visualize or acousmatise sound objects. The player of such a game constantly has to make the conscious decision whether the visual source of a sound object is worth seeing, whether the sound object should be visualized within the game context. In other words, when playing a game such as Quake (ID Software 1996), one constantly has to scan the aural field for effect sound objects that have been identified with opponents. On hearing such a sound, one will usually “turn one's head” (move the visual field of the virtual camera) in the direction of the sound in order to visualize it and to locate its source precisely.
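The act of “turning one's head” toward a sound can be modelled quite directly. The sketch below is purely illustrative (the function names `yaw_to_source` and `is_visualized` and the 90° field of view are my own assumptions, not taken from any engine): it computes the yaw a listener would need to rotate to face a sound source, and decides whether that source currently counts as visualized or acousmatised.

```python
import math

def yaw_to_source(listener_pos, listener_yaw, source_pos):
    """Yaw rotation (radians, wrapped into [-pi, pi]) needed to face a source."""
    dx = source_pos[0] - listener_pos[0]
    dy = source_pos[1] - listener_pos[1]
    delta = math.atan2(dy, dx) - listener_yaw
    # wrap so the listener turns the short way round
    return math.atan2(math.sin(delta), math.cos(delta))

def is_visualized(listener_pos, listener_yaw, source_pos, fov=math.radians(90)):
    """A sound object counts as 'visualized' if its source lies in the field of view."""
    return abs(yaw_to_source(listener_pos, listener_yaw, source_pos)) <= fov / 2
```

A footstep directly behind the listener yields a turn of about pi radians; applying that turn shifts the sound object from the acousmatised to the visualized state.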

The kinaesthetic control over the acousmatisation and visualisation of sound objects in the game environment is a key factor in creating unique spatial experiences when playing computer games. Whenever one might have wished to identify the source of a strange sound, e.g. in a horror film, one had to acknowledge that the timing of this identification is up to the film’s author. A game like MGS2, which appropriates many conventions derived from film, gives the user the option to locate and visualise the source of a sound. This is indeed an important element of the gameplay. Even if we focus only on how acousmatic situations are constructed temporally, the functional differences between films and computer games immediately take shape.
In film the relations between sound and image are usually chronologically defined after the filming has taken place. In the process of adding the sound to the image, each connection has to be defined for each frame, and the final outcome of the process is a product constituted by the fixed bond of aural and visual elements.

In a computer game we can also find a number of fixed relations between sound objects and visual objects, but the temporal process of visualization and acousmatisation of sound objects is a dynamic process that can be subject to the player’s action. It can also be different each time a game is played. This state of affairs makes it necessary to think about the mechanics at work as dynamic functions. The user controlled dynamic acousmatic function is one of the most important spatialising functions in video and computer games.
The sound objects used in these situations are in most cases either speech sound objects or effect sound objects. Dynamic acousmatic functions are very important for a stealth intrusion game like MGS2. The gameplay consists to a large extent of hiding from guards and opponents and sneaking past them without alerting them. In order to remain undetected the player has to stay hidden, which means that most of the time he cannot see his opponents. He can, however, hear effect sound objects indicating their approach, such as footsteps, as well as speech sound objects, i.e. conversations. Whenever the player has somehow attracted the interest of the guards and alerted them, the gamestate changes. This is additionally indicated by an alarm sound, and the small map that is used to show the locations of opponents in the game environment disappears. In these situations the player has no way of locating the enemy other than listening for footsteps. Another interesting acousmatic situation emerges when the avatar hides in a cupboard. It is possible to get a very limited amount of visual information through small slits in the cupboard door, but one will only see the enemy when he is directly in front of the cupboard. In these moments the acousmatic state shifts and the effect sound object of enemy footsteps is visualized. These functions are crucial for actively locating or situating objects in the game environment and are thus very important features of the game’s spatial apparatus.

The introduction of a directional microphone as a feature that can be used in particular parts of the game signifies the importance of sound in MGS2. In one short episode of the game the player has to locate a specific hostage held by terrorists in a room with thirty other hostages (Oil Rig Shell 1, Core 1 B). Due to the fact that the hostage has a heart pacemaker, a directional microphone is used to listen for any unusual cardiac pattern. In another situation (Oil Rig Shell 2, Core 1, Air Purification Room) the same microphone is used to listen to a conversation that takes place behind a wall. Because the source of the voice is moving, the player has to move the directional microphone in the right direction to listen in.
This is a perfect example of a dynamic acousmatic function based on speech sound objects, one that even defines a specific subtype of gameplay. It generates a complex spatial setup that is dynamically linked to user action: moving the microphone in the right direction. The idea of the directional microphone has since also been used in the game Splinter Cell (UbiSoft Entertainment 2002), to listen to a conversation in a moving elevator and to another one taking place inside an embassy building.
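One plausible way to model such a directional microphone is a cardioid pickup pattern raised to a power, so that the monitored voice is at full volume only when the microphone points straight at the (possibly moving) source. This is a sketch under my own assumptions (the function name, the cardioid shape and the sharpness exponent are illustrative; MGS2's actual implementation is not documented here):

```python
import math

def mic_gain(aim_angle, source_angle, pattern_sharpness=4.0):
    """Gain of a directional microphone for a source at source_angle
    when the mic is aimed at aim_angle (both in radians).

    A cardioid pattern raised to a power: gain is 1.0 on-axis
    and falls off quickly as the source moves off-axis."""
    delta = math.atan2(math.sin(source_angle - aim_angle),
                       math.cos(source_angle - aim_angle))
    cardioid = 0.5 * (1.0 + math.cos(delta))
    return cardioid ** pattern_sharpness
```

Under these assumptions a source 90° off-axis is attenuated to about 6% of its on-axis level, which is why the player must continuously track the moving speaker to keep the conversation intelligible.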

5.2 The spatial signature function
The fact that one particular sound event will have very different qualities if it is heard by different people in different environments is familiar. Rick Altman (1992, p.24) gives the example of a baseball that broke his window and sounded very different to him outdoors than to his father in the house or to his mother in the basement. In his article about the material heterogeneity of recorded sound, he deals with the problem of perspective within recorded sound and introduces the term “spatial signature” in order to explain that “[…] every recording carries elements of this spatial signature, carried in the audible signs of each hearing’s particularities” (Altman 1992, p.24). Andrea Truppin (1992, p.241) defines this phenomenon as follows: “Spatial Signature can be defined as a sound’s auditory fingerprint that is never absolute, but subject to the sound’s placement in a particular physical environment. These markers include reverb level, volume, frequency, and timbre that allow auditors to interpret the sound’s identity in terms of distance or the type of space in which it has been produced and/or is being heard.” According to this definition, it is clear that recorded sounds can have multiple signatures: there is the specific spatial context of the original sound, the space at the source of the recording, and the space of playback with its own particularities.

How does spatial signature appear in the game environment? First and foremost we can differentiate between the spatial signatures of recorded sounds (qualities inherent to the sound object itself) and the functions that allow computer games to simulate the spatial signature connected with particular environments.
Most interesting for our analysis is the fact that contemporary sound technology, and especially real-time DSP, is able to simulate certain qualities that define the spatial signature of sounds. If the amount of reverb of any given sound object (whether recorded or synthesized) is changed, this will simulate the experience of hearing the sound in a different type of environment. As large rooms suggest strong reverberation, a sound that is changed by a reverb filter will bear the spatial signature of being reflected in a large room. This simulation process literally turns the relation between source and surrounding, as observed by Rick Altman, on its head: the user follows an inductive process from the isolated sound to an assumption about the surrounding space. Such simulations of particular spatial signatures are used by many contemporary games. The most common factors are the amount of reverb or echo of sound objects, but as game architectures become more sophisticated, complex simulations such as the reflection of sound from walls or objects are being introduced. This observation hints at the relation between representation and simulation that is at work in computer games; this is, however, not the place to discuss this subject in further detail.
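The principle that more delay and more feedback suggest a larger, more reflective room can be illustrated with the simplest reverb building block, a feedback comb filter. This is a toy sketch, not the multi-stage reverbs real-time game DSP actually uses; the function name and parameter values are my own:

```python
def comb_reverb(signal, delay_samples, feedback):
    """Minimal feedback comb filter: a toy stand-in for real-time reverb.
    A larger delay and higher feedback imitate the spatial signature
    of a larger, more reflective room (a longer reverb tail)."""
    out = list(signal) + [0.0] * delay_samples * 8  # room for the tail
    for i in range(len(out)):
        if i >= delay_samples:
            out[i] += feedback * out[i - delay_samples]
    return out

# The same impulse acquires a much longer tail in the 'larger room':
impulse = [1.0] + [0.0] * 7
small_room = comb_reverb(impulse, delay_samples=2, feedback=0.3)
large_room = comb_reverb(impulse, delay_samples=4, feedback=0.7)
```

The inductive step described above runs in reverse for the listener: hearing the long decay of `large_room`, the player infers a large surrounding space from the isolated sound.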

Because the spatial signature of sound objects is defined by the space surrounding the
sound source, it can be understood as a function that operates on the level of the zone sound object. A zone within the game environment shares the rules defining its spatial signature, suggesting a coherent spatial structure. In other words, if the zone sound object is defined as a particular room in a house, the qualities of reverb and echo within that room will be shared by all the sound objects it contains. Functions defining the spatial signatures of sound objects can greatly enhance the immersive qualities of a location in the game environment. If we return to MGS2, we can note that the game does indeed present us with places that have general auditive qualities.
We have referred to them as zone sound objects above, but spatial signature functions also define the qualities of effect sound objects. For example, all the sounds we hear in the flooded level of MGS2 (effect sound objects linked to avatar movement as well as
exploding underwater mines) will be muffled in the same way and thus convey a particular spatial signature.
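The idea that every effect sound object inside a zone inherits that zone's signature can be sketched as a small data structure. Everything here (the `Zone` class, its parameters, the one-pole low-pass standing in for underwater muffling) is an illustrative assumption of mine, not the architecture of any actual engine:

```python
class Zone:
    """A zone sound object: every effect sound played inside it inherits
    the zone's spatial-signature parameters."""

    def __init__(self, name, lowpass_alpha, reverb_gain):
        self.name = name
        self.lowpass_alpha = lowpass_alpha  # 1.0 = no muffling, near 0 = heavily muffled
        self.reverb_gain = reverb_gain      # crude stand-in for the zone's reverb amount

    def render(self, samples):
        """Apply the zone's shared signature: a one-pole low-pass
        (the 'muffling' of a flooded level) plus a simple wet boost."""
        out, prev = [], 0.0
        for s in samples:
            prev = prev + self.lowpass_alpha * (s - prev)
            out.append(prev * (1.0 + self.reverb_gain))
        return out

# Every sound object in the same zone (footsteps, exploding mines)
# passes through the same signature:
flooded = Zone("flooded deck", lowpass_alpha=0.2, reverb_gain=0.4)
corridor = Zone("dry corridor", lowpass_alpha=0.9, reverb_gain=0.1)
footstep = [0.0, 1.0, 0.0, -1.0, 0.0]
```

The design point is that the signature lives on the zone, not on the individual sounds, which is what makes the place read as one coherent acoustic space.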
An interesting example, which shows the range of influence of the spatial signature function on different sound objects and which can literally be understood as a type of “audio perspective”, can be found in the outside levels of the oil rig.
MGS2 uses two optional visual perspectives, a third person perspective mode, which usually shows the avatar from behind and above, as well as a first person perspective.
The user switches between these two visual perspective modes throughout the game.
The two visual perspectives correspond to two different aural perspectives. If one switches to the first person perspective in the outside level of the oil rig, one will hear a much louder rendition of the sound produced by the wind. Essentially there are two different auditive spatial signatures at work, which are related to the virtual camera. This function greatly enhances the feeling of being in the place, while it refers to film sound conventions.
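One simple way to realise two aural perspectives tied to the virtual camera is a per-perspective mix table with a short crossfade on switching. The table below is hypothetical; the names and gain values are mine, chosen only to illustrate the idea, not taken from MGS2:

```python
# Hypothetical per-perspective mix: the same zone sound object
# (wind on the oil rig) is rendered much louder in first person.
MIX = {
    "third_person": {"wind": 0.3, "footsteps": 0.8},
    "first_person": {"wind": 0.9, "footsteps": 0.6},
}

def render_gain(perspective, sound_name):
    """Gain applied to a sound object under the current camera perspective."""
    return MIX[perspective][sound_name]

def crossfade(gain_from, gain_to, t):
    """Linear fade between two mixes while the camera switch completes (t in [0, 1])."""
    return gain_from + (gain_to - gain_from) * t
```

Binding the mix to the camera rather than to the avatar's position is what produces the film-like sense of an audio point of view.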

6. Conclusion
We have seen that sound is an important element of the spatial apparatus of contemporary computer games. The sounds in the game environment have been classified according to their use. The acousmatic spatialising function and the spatial signature function have been proposed and described. Although a lot remains to be said, I hope that this short paper has managed to generate a point of departure for an auditive perspective on games and that the research into this particular aspect will attract more interest in the future.
As sound technology is evolving with new gaming platforms we will most definitely
see more games that make innovative use of sound in the near future.

7. References
Quake (1996). ID Software.

Soul Calibur (1999). NAMCO.

Metal Gear Solid 2 Sons of Liberty (2001). Konami Computer Entertainment Japan, Inc.
Including: DVD The Making of Metal Gear Solid 2 (2001). Fun TV.

Tom Clancy's Splinter Cell (2002). Ubi Soft Entertainment.

Aarseth, E. (2001). Allegories of Space: The Question of Spatiality in Computer Games. In: Cybertext Yearbook 2000. Eds. M. Eskelinen, R. Koskimaa. Jyväskylä, Research Center for Contemporary Culture, University of Jyväskylä.

Aarseth, E. (1997). Cybertext, Perspectives on Ergodic Literature. London, Johns Hopkins University Press.

Altman, R. (1992). Material Heterogeneity of Recorded Sound. In: Sound Theory Sound Practice. Ed R. Altman. New York, Routledge.

Chion, M. (1994).Audio-vision : Sound on Screen. New York ; Chichester, Columbia University Press.

Howells, S. A. (2002). Watching a Game, Playing a Movie: When Media Collide. In: ScreenPlay, Cinema/Videogames/Interfaces. Eds. T. Krzywinska, G. King. London, Wallflower Press.

Juuls, J. (1999). A Clash between Game and Narrative, Institute of Nordic Language and Literature. Copenhagen, University of Copenhagen.

Konzack, L. (2002). Computer Game Criticism: A Method for Computer Game Analysis. CGDC Conference Proceedings. F. Mäyrä. Tampere, Tampere University Press.

Laurel, B. (1991). Computers as Theatre. Menlo Park, Ca, Addison Wesley.

Maaso, A. (2001). Sonification in web design: auditive possibilities for navigation.
Available: [12.10.]

Metz, C. (1985). Aural Objects. In: Film Sound: Theory and Practice. Eds. E. Weis, J. Belton. New York, Columbia University Press.

Poole, S. (2000). Trigger Happy. The Inner Life of Videogames. London, Fourth Estate.

Schaeffer, P. (1966). Traité des objets musicaux. Essai interdisciplines. Paris, Editions du Seuil.

Truppin, A. (1992). And Then There Was Sound: The Films of Andrei Tarkovsky. In: Sound Theory Sound Practice. Ed. R. Altman. New York, Routledge.

Weske, J. (2000). Digital Sound and Music in Computergames.
Available: [5.01.2003]

Wilhelmsson, U. (2001). Enacting the Point of Being: Computer games, interactivity and film theory. PhD Thesis. Department of Film and Mediastudies. Copenhagen, University of Copenhagen.

(C) Axel Stockburger 2003

