November 2007

From the Book of Meaning

Meanwhile, in another multi-dimensional space, there remained only two aggredeities after the (metaphorical but not necessarily allegorical) dust settled. Their interactions were not completely untranslatable, and are interpreted here as idiomatic dialog to facilitate human comprehension.

Pseu: "What'll it be, Reg? You must decide."

Reg was a coordinated coalition of consonant conscience and courtesy, and Pseu was a devious distributed dissonance of distraction and dissipation. Pseu was offering Reg a choice. This was the denouement of an epic whose span cannot be compassed by the likes of you or me. Its casualties were countable, but not finite; its plot twists, recursive and non-euclidean; its lessons, lost to all but the victor because the casualty count included all observers.

Only these two aggredeities remained, but this was unlike the final showdown one might expect for an ultimate battle between order and chaos. Reg (the definite undergod) did not have overwhelming odds stacked against it, because this 5.72-dimensioned space was completely deterministic. Probabilities need not even be considered. Reg was constrained to an end-game choice of only two options with no possible alternatives.

Reg: "What were those options again? I got distracted by trans-dimensional expository interference."

Pseu: "I could keep you alive for all eternity, suffering. I could exist very comfortably on your pain alone. Or..."

Reg: "Ah, the obligatory non-option, so that you can call the outcome my choice."

Pseu: "Or I could let you die now in peace and create from your essence a whole new universe. Your intrinsic order would live on in the rules of the new universe, but I can construct it so that the very basis of those rules is randomness. I can make it so that the tiniest changes precipitate staggering catastrophes. Everything will look like it follows a pattern, but randomness will obscure the important patterns and remote coincidences will misguide any learning. Nothing will be certain. Even causality will be impossible to confirm. And any intelligent being locked in this universe will be doomed to suffer the ongoing aching angst of uncertainty."

Reg: "But if you make the universe out of me, you won't be able to meddle with it. You won't be able to break my essential rules."

Pseu: "I won't have to. This universe will start slowly, but once it gets going it'll radiate the pain of trillions of frustrated intelligences. Tasty!"

Reg made the only reasonable moral choice possible: "I think I'll skip the eternity of torture, thanks, and go straight to the universe of pain. At least it won't be my pain."

Visualizing Music

Over in my guest book topic, Judah posed a question. I started composing a response but it got too long, so I've posted my stab at an answer here.

Hey, Virge, question for you: I was just listening to music and watching the visualizations on Windows Media Player, and they really seemed to flow with the music and mean something. It seemed like a strange thing that these randomly generated visualizations should have such a close bond with the music. I know that some of the input for the program comes from the music itself, but even so, how smart can the algorithm be? It doesn't know what's going on in my mind.


Is music able to convey relatively complex things through simple cues (volume, pitch, changes in pitch), and can those same cues be reflected in a visualization to the same effect? Or do the music and visualization convey only relatively simple things, which in turn are the cues for my mind to amplify on? Or is the whole thing a projection: what I hear in the music, I project onto the visualization, which in reality does not reflect the music except in a very simple way? What do you think?

As far as I can see, most of the Media Player visualizations are based on a combination of 5 elements:

  1. real-time spectrum analysis of the audio--working out how much sound volume is present at each frequency, from the deepest bass to the highest sizzles,
  2. a geometric mapping--mapping the frequency and volume parts of the audio spectrum to different positions on the display and/or to different shapes or colors on the display,
  3. a time-based filtering algorithm so that a short-duration change in the spectrum (e.g., the blip in the spectrum due to a single drum hit) will produce a change in the image that morphs and fades over a few seconds,
  4. some slow time-based input independent of the audio, e.g., a steady progression of the colors through parts of the palette, or a steady rotation of the whole picture,
  5. some pseudo-random input to introduce variety of images even if the audio is repetitive.
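
To make those five elements a bit more concrete, here is a minimal Python sketch of how they might fit together. It is only my guess at the general shape, not how Media Player actually does it: the function names, the bar-graph mapping, the decay factor of 0.8 and the hue step are all made up for illustration, and a real visualizer would use a fast FFT rather than this slow textbook DFT.

```python
import cmath
import math
import random


def spectrum(frame):
    """Element 1: magnitude at each frequency bin, via a naive DFT."""
    n = len(frame)
    mags = []
    for k in range(n // 2):
        total = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        mags.append(abs(total) / n)
    return mags


def to_bars(mags, n_bars):
    """Element 2: map groups of bins to screen columns, bass on the left."""
    size = len(mags) // n_bars
    return [max(mags[i * size:(i + 1) * size]) for i in range(n_bars)]


def smooth(previous, current, decay=0.8):
    """Element 3: rise instantly, fade slowly, so a drum hit lingers."""
    return [max(c, p * decay) for p, c in zip(previous, current)]


def render_frame(frame, previous_bars, frame_index, rng, n_bars=8):
    """Combine all five elements for one frame of audio samples."""
    bars = smooth(previous_bars, to_bars(spectrum(frame), n_bars))
    hue = (frame_index * 2) % 360      # element 4: slow drift, ignores audio
    jitter = rng.uniform(-0.01, 0.01)  # element 5: pseudo-random variety
    return bars, hue, jitter


if __name__ == "__main__":
    rng = random.Random(0)
    # One frame containing a pure tone, followed by a frame of silence.
    tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
    bars, hue, jitter = render_frame(tone, [0.0] * 8, 0, rng)
    print([round(b, 3) for b in bars], hue)
    bars, hue, jitter = render_frame([0.0] * 64, bars, 1, rng)
    print([round(b, 3) for b in bars], hue)
```

In the demo, the tone lights up one bar, and in the following silent frame that bar fades rather than vanishing, which is the "morphs and fades" behavior of element 3. Only `spectrum` ever looks at the sound itself.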

Of those elements, only #1 has any ability to connect the mood or feel of music to a visual effect. The spectrum is a very, very raw measure of the instantaneous timbre of the combined instruments.

This question reminds me of something I read recently (probably by Daniel Dennett, since I read two of his books over the last couple of months) about the problems of reductionism: we may eventually be able to completely explain in physical terms all the parts of the experience of music, but such a description will never be satisfying. Understanding each of the parts doesn't lead to an understanding of the whole. There are so many layers* in which a description of the micro-behavior of one level depends on the macro-behavior of the underlying level that our understanding of music as an experience cannot be usefully described in the language of acoustics. The information presented in an audio spectrum that updates in real time is a very "mechanical" description of the music. What we experience when we listen to music is overwhelmingly determined by what we already have stored in our heads from past listening experiences. As you say, music is "able to convey relatively complex things through simple cues" and this is because we have a rich set of conditioned reactions to the language of music that we've absorbed as part of our culture.

In our westernized culture we associate certain sounds with certain emotions, e.g., a kazoo or tin whistle playing in a major key = playful; a violin playing in a minor key = wistful (even if you keep the same rhythm and tempo). The difference in tonality between major and minor makes a huge difference to the mood conveyed by a piece of music, but will be completely undetectable on a spectrum-based visualization. It is possible to automatically analyze the tonal and harmonic content of music, and a visualization based on these might be able to run in real time on a current PC, but I'm pretty sure none of the Media Player visualizers do so.
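
As a toy illustration of what "analyzing the tonal content" could involve, here is a small Python sketch that labels a three-note chord as major or minor. It assumes the hard part is already done, i.e., that something has detected which notes are sounding and handed them over as MIDI note numbers. The function name `classify_triad` is my own invention, and real harmonic analysis has to cope with far messier input than this.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F",
              "F#", "G", "G#", "A", "A#", "B"]

# Semitone offsets from the root note for each chord type.
MAJOR = (0, 4, 7)
MINOR = (0, 3, 7)


def classify_triad(midi_notes):
    """Return e.g. 'C major' or 'A minor', or None if it is neither."""
    # Fold every note into one octave (0 = C, 1 = C#, ..., 11 = B).
    pitch_classes = {note % 12 for note in midi_notes}
    for root in range(12):
        if pitch_classes == {(root + step) % 12 for step in MAJOR}:
            return NOTE_NAMES[root] + " major"
        if pitch_classes == {(root + step) % 12 for step in MINOR}:
            return NOTE_NAMES[root] + " minor"
    return None


if __name__ == "__main__":
    print(classify_triad([60, 64, 67]))  # C, E, G
    print(classify_triad([60, 63, 67]))  # C, D#, G
```

The two chords in the demo differ by a single semitone in a single note, which is a tiny shift on a spectrum display but the whole difference between major and minor.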

I think you're on the right track with the idea of projection, but it's not the whole story. Two other things are also contributing to the sensation:

  1. The people who design and select visualizations are actively choosing the ones that are interesting and pleasing representations of the music. Of all the possible ways to convert spectrum information to a picture, the only ones you get to see in the end product are ones that pleased the software geeks, and the marketers, and the customer focus groups, etc. So if there are certain mappings from audio spectrum -> color/shape/movement -> human image recognition -> association/memories -> emotions that do happen to correlate with the listening experience in our culture, then these will be selected for in the product development and marketing process.
  2. You are continually learning, whether you realize it or not. When you listen to a piece of music and have an attractive pattern projected to your eyes, it doesn't take long for your brain to start recognizing patterns (subconsciously) and building associations between those visual and audio patterns. If the visualization were mostly random, the associations wouldn't develop (except by accident). Because the visualizations are based primarily on consistent algorithmic manipulations of the audio information, the patterns are consistent and your brain learns them. Even if you go to a new visualization with different transformations, there is enough commonality in the way they process audio spectra for your brain to adapt, re-use what it has learned, and condition how you react to what you see.

* Here's my over-simplified map of the layers of knowledge required: mechanical vibrations -> sound transmission -> the ear as an auto-tuning sound transducer -> neural firings -> networked interconnections that have learned to "recognize" sounds/rhythms -> combinations of recognitions -> memory associations -> emotional states.