Generating Sound Files

This section contains information about the way the software automatically generates compact disk quality sound files in just intonation. This is definitely the least interesting part of the website, as I don't think it contains any original material. Much has explicitly been borrowed from others (such as the reverberation); some other material I invented from thin air, but others must have done the same before me (such as reading out wave tables).

Wave Tables

Whenever the software starts to generate a sound file, it first computes the wave tables: for each 'organ' stop a table of 1000 numbers, representing consecutive amplitudes within one single vibration of the sound wave. This computation is based on a list of required volumes of all partial sine waves that make up the required sound quality of the stop. The first 12 of these required volumes, in turn, were taken from [OB 1971] p. 253, 254. The remaining volumes were derived from the 12th given volume, decreasing for ever higher partial sine waves.
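A minimal sketch in Python of how such a wave table could be computed. It is not the software's actual code: the function name, the zero starting phase of every partial, the reading of "volume" as a linear amplitude, and the example volumes are all assumptions made for illustration (the example values are not those from [OB 1971]).

```python
import math

TABLE_SIZE = 1000  # numbers per single vibration, as in the text


def build_wave_table(partial_volumes, table_size=TABLE_SIZE):
    """Sum sine partials into one vibration.

    partial_volumes[k] is the (assumed linear) volume of partial k + 1.
    All partials are assumed to start at phase zero.
    """
    table = []
    for i in range(table_size):
        phase = 2.0 * math.pi * i / table_size
        table.append(sum(volume * math.sin((k + 1) * phase)
                         for k, volume in enumerate(partial_volumes)))
    return table


# Hypothetical volumes for three partials, for illustration only.
example_volumes = [1.0, 0.5, 0.25]
table = build_wave_table(example_volumes)
```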

Actually, several wave tables for a stop are used - not just a single wave table for the whole stop. The reason is that the highest tones of a stop have to contain fewer partial sine waves. The highest pitch of an 8' stop of an organ is about 2050 Hz. Its 8th partial sine wave is about the highest a human can perceive: 8 x 2050 Hz = 16,400 Hz. You may think adding higher partial sine waves would not harm, but it does. Compact disk quality sound equipment only produces 44,100 samples per second per channel. That is more than enough for the human ear, which stops at about 18,000 Hz: Nyquist's sampling theorem implies that 44,100 samples per second faithfully reproduce sounds with frequencies up to 44,100/2 = 22,050 Hz. If one tries to go beyond that, an aliasing effect causes very unmusical phantom tones to emerge from the sound equipment, in addition to the wanted tones. The following fragment contains two high pitched octaves of parallel thirds with a piercing sound. In the background you hear softer and unmelodious tones that should not be there:

BEWARE: TURN VOLUME LOW!

Those unmelodious background tones are prevented by restricting partial sine waves to those lower than 22,050 Hz.
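The restriction can be illustrated with a small Python sketch. The helper names are hypothetical; the arithmetic follows from the sampling theorem as described above. The first function counts how many partials of a tone stay below 22,050 Hz, the second shows at which frequency a too-high partial would come out of the sound equipment instead.

```python
SAMPLE_RATE = 44_100
NYQUIST = SAMPLE_RATE / 2  # 22,050 Hz


def max_partials(fundamental_hz, limit_hz=NYQUIST):
    """Number of partials whose frequency stays strictly below limit_hz."""
    n = int(limit_hz // fundamental_hz)
    if n * fundamental_hz >= limit_hz:
        n -= 1
    return max(n, 0)


def alias_frequency(freq_hz, sample_rate=SAMPLE_RATE):
    """Frequency actually reproduced when freq_hz is sampled at sample_rate."""
    folded = freq_hz % sample_rate
    return min(folded, sample_rate - folded)


print(max_partials(2050.0))            # 10 partials fit below 22,050 Hz
print(alias_frequency(12 * 2050.0))    # a 24,600 Hz partial comes out at 19500.0 Hz
```

For the highest 8' pitch of about 2050 Hz, the 10th partial (20,500 Hz) still fits, while the 12th (24,600 Hz) would be folded back to an unrelated 19,500 Hz phantom tone.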

To actually generate a tone of the desired pitch, we need to deliver 44,100 samples per second from a wave table. Say we read and write all 1000 samples in the wave table in turn and keep repeating that. That would produce a tone with a pitch of 44,100 Hz / 1000 = 44.1 Hz. We could also skip every other sample in the wave table and get a pitch of 44,100 Hz / 500 = 88.2 Hz. But what if we want a pitch of, say, 73.47 Hz? The mechanism used starts with an index number 0.000 that points at the first sample in the wave table, and reads and writes that first sample. Then it increases the index number by 73.47 x 1000 / 44,100 = 1.66598639... The index number then points somewhere between the second and the third sample in the wave table. The software reads both of those and outputs a weighted average: the index number is closer to the third sample, so that sample gets the higher weight. This is repeated, and whenever the index number goes beyond the end of the wave table, 1000 is subtracted.
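The reading mechanism described above can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that the sample after the last table entry is the first entry again (the table holds exactly one vibration); the source does not spell out that detail.

```python
SAMPLE_RATE = 44_100


def read_wave_table(table, pitch_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Read n_samples from table at pitch_hz using a fractional index."""
    size = len(table)
    step = pitch_hz * size / sample_rate   # e.g. 73.47 * 1000 / 44,100
    index = 0.0
    out = []
    for _ in range(n_samples):
        i = int(index)                     # sample just below the index
        frac = index - i                   # how close we are to the next one
        nxt = table[(i + 1) % size]        # assumed wrap-around to the start
        out.append((1.0 - frac) * table[i] + frac * nxt)
        index += step
        while index >= size:               # beyond the table: subtract its size
            index -= size
    return out
```

With a table that simply counts upward (0, 1, 2, ...), the second output value equals the index number 1.66598639... itself, which makes the weighting easy to check by hand.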

Writing the Sound File

The amplitudes in the wave tables are expressed as floating point numbers. All subsequent calculations are carried out in floating point numbers as well. Only right before actually writing the sound file to disk, these floating point numbers are converted to 16 bits per sample for the .wav file format for compact disk sound quality. That poses an annoying problem: you only know what the highest floating point number will be when the calculations of the whole piece have finished. This highest number is needed so the samples can be neatly converted to 16 bits per sample before writing the .wav sound file. I could have chosen to store an entire piece of music in floating point numbers and then run an extra pass to convert those numbers to 16 bits per sample. But for a piece of four minutes (= 240 seconds), that would require 2 stereo channels x 240 seconds x 44,100 samples/second x 64 bits/floating point number = 1,354,752,000 bits = 161.5 Mbytes. My old laptop lacks that amount of random access memory. I assumed that storing this amount on a hard disk and reading it in for the second pass might take more time than simply recomputing the entire piece of music. So that is what I did. If you know of a solution that avoids the intermediate storage and also avoids recomputing the entire piece, I would be very interested. A smarter solution than buying a bigger laptop, that is.
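The two-pass approach (compute once to find the highest number, then recompute and write) might look like this in Python. This is only a sketch under assumptions: `render` is a hypothetical stand-in for the real synthesis, and scaling the peak to 32,767 is an assumed convention for the 16-bit conversion. It uses only the standard `wave` and `struct` modules.

```python
import math
import struct
import wave

SAMPLE_RATE = 44_100


def render():
    """Hypothetical stand-in for the synthesis: yields (left, right) floats."""
    for n in range(SAMPLE_RATE // 10):  # 0.1 second of a 441 Hz sine
        s = 3.7 * math.sin(2.0 * math.pi * 441.0 * n / SAMPLE_RATE)
        yield s, 0.5 * s


def write_wav_two_pass(render_fn, target):
    """Write 16-bit stereo audio to target (a file name or file object)."""
    # Pass 1: compute the whole piece only to learn the highest absolute value.
    peak = max(max(abs(left), abs(right)) for left, right in render_fn())
    scale = 32767.0 / peak if peak > 0.0 else 0.0
    # Pass 2: recompute the piece and write the scaled 16-bit samples.
    with wave.open(target, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)            # 2 bytes = 16 bits per sample
        wav.setframerate(SAMPLE_RATE)
        for left, right in render_fn():
            frame = struct.pack("<hh", round(left * scale), round(right * scale))
            wav.writeframesraw(frame)
    return peak
```

Because `render_fn` is a generator, neither pass keeps the whole piece in memory; the price is that everything is computed twice.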

Reverberation

The basic scheme of J.A. Moorer was extended as follows: each of the two stereo channels has 50 parallel delay lines, where the total of 100 delay times, measured in the number of samples delayed (at 44,100 samples per second), have all been drawn from a list of pseudo-random integers. The numbers retained are all relatively prime to one another: no two numbers have a factor in common other than one. Early echoes were also added.
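Selecting delay times that are relatively prime to one another could be sketched as below. The function name, the seed, and the range of 1000 to 5000 samples are hypothetical; the source gives neither the range nor the exact selection procedure, only the requirement that no two retained numbers share a factor.

```python
import math
import random


def coprime_delays(count, low, high, seed=0):
    """Draw pseudo-random integers, keeping only those coprime to all kept so far."""
    rng = random.Random(seed)
    kept = []
    while len(kept) < count:
        candidate = rng.randint(low, high)
        if all(math.gcd(candidate, d) == 1 for d in kept):
            kept.append(candidate)
    return kept


# Assumed range of delay times in samples; 50 delay lines per stereo channel.
delays = coprime_delays(100, 1000, 5000)
left_delays, right_delays = delays[:50], delays[50:]
```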

Manipulating Time I

I once heard a respected organist play Nun komm, der Heiden Heiland, BWV 659 by J.S. Bach on YouTube. Whenever the fast notes came, the speed fell. And it picked up again when the fast notes were over. I tried to achieve this effect automatically as follows. You can view the notes of a piece of music as a sequence of harmonic chords, each with a duration. Whenever such a duration is expressed as a half note (minim), it is normally supposed to be twice the duration of a quarter note (crotchet). But what if we make that 1.9 times instead of twice? We would then also want the same ratio of 1.9 between the durations of ♩ and ♪ etc.

The following conversion formula does this:

t_k = c * e^{a * ln(t_n)}
where a = ln (1.9) / ln (2)

t_n = time (duration) according to the notes in the score
t_k = time (duration) after conversion
c = a constant so the piece keeps its original total duration
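The conversion can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that the durations follow one another in time, so that the constant c can be found by dividing the original total duration by the total of the unscaled converted durations.

```python
import math


def stretch_durations(score_durations, ratio=1.9):
    """Apply t_k = c * e^(a * ln(t_n)) with a = ln(ratio) / ln(2)."""
    a = math.log(ratio) / math.log(2.0)
    raw = [math.exp(a * math.log(t_n)) for t_n in score_durations]
    c = sum(score_durations) / sum(raw)   # keeps the total duration unchanged
    return [c * r for r in raw]


# A half note, a quarter note and two eighth notes, in quarter-note units.
converted = stretch_durations([2.0, 1.0, 0.5, 0.5])
```

In the converted list each duration is 1.9 times the next shorter note value instead of twice, while the four durations still add up to 4 quarter notes.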

This is how the piece then sounds:

(durations manipulated)

It all sounds a bit less robot-like this way. Or you could say that the virtual organist is a bit less accomplished. A ratio of 1.9 could be too low for practical purposes though. Perhaps I shall use 1.95.

Manipulating Time II

The easiest way to synthesize a sound file from the notes of a harmonic chord is to let them all start at exactly the same microsecond. But that is not at all easy for a human organist. I once read that synthetic music would sound more natural if the note onset times had a small random variation around the moment dictated by the score. But what is small? I experimented. If the margin is 200 milliseconds, the result sounds as if the organist has been drinking too much:

(onset variation margin 200 msec)

An onset variation margin of 50 milliseconds is still too wide at times:

(onset variation margin 50 msec)

No variation at all and a robot hand seems to have taken over:

(onset variation margin 0 msec)

I chose 20 milliseconds for the sound files on this website.
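A minimal Python sketch of such an onset variation. Two things are assumptions here, because the text does not specify them: that the variation is drawn uniformly, and that "margin" means the largest shift in either direction (so 20 means plus or minus 20 milliseconds). The function name is hypothetical as well.

```python
import random


def jitter_onsets(onsets_sec, margin_ms=20.0, seed=None):
    """Shift each note onset by a random amount within +/- margin_ms."""
    rng = random.Random(seed)
    margin_sec = margin_ms / 1000.0
    # Onsets are clamped at zero so no note starts before the piece does.
    return [max(0.0, t + rng.uniform(-margin_sec, margin_sec))
            for t in onsets_sec]


# Three notes of one chord that the score places at 1.0 second.
chord = [1.0, 1.0, 1.0]
humanized = jitter_onsets(chord, margin_ms=20.0, seed=1)
```

With a margin of 0 the onsets come back unchanged, which corresponds to the robot-like version.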
