DecorreLab -- Real-time Audio Decorrelation Engine for SuperCollider
The first project I'd like to share is DecorreLab, a script I wrote back in college as a project for a couple of my classes, particularly 3D Sound and Spatial Audio. While there are apparently some VST plugins that do decorrelation, I'll provide my source so you can run with it and improve it if you wish. Plus, you'll know exactly what algorithm I use. ;)
What's SuperCollider?
It's an IDE and language (heavily based on Smalltalk) for carrying out real-time audio synthesis and effects processing. It works best on Mac OS X, and has built-in filters, sequencers, frequency generators, impulse generators, noise gates, and all sorts of goodies to let you do pretty much anything you can imagine with computer sound. You might be thinking that a purely object-oriented language such as Smalltalk would incur substantial performance penalties (i.e. how the heck is it doing real-time audio synthesis without choking all the time?), but surprisingly, it's not easy to get SuperCollider to hiccup unless you have bugs in complicated code, have memory leaks, etc.
Decorre-what?
Correlation describes how similarly two things fluctuate with respect to each other. See the examples in the table below for the three possible relationships:
Relationship | Correlation Value | When Object 1 Is... | When Object 2 Is... | Under these conditions |
---|---|---|---|---|
Correlated | 0 < x < 1 | Length of time of a smoker's habit | Chance they will get cancer | Any time |
Inversely correlated | -1 < x < 0 | Price of corn | Price of beef | When there's a drought |
Decorrelated | 0 | The number of children you have | How long you live | Any time |
Basically, things that are correlated or even inversely correlated can exhibit insight into each other because they behave in similar or opposite ways. Things that are decorrelated bear little resemblance to each other, so such statistics don't really give much insight or meaning into each other. (See Notes 1 & 2 at the bottom.)
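To make the three rows of the table concrete, here is a minimal Python sketch of the usual (Pearson) correlation coefficient. The function name `pearson` and the sample numbers are made up for illustration; they are not part of DecorreLab.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if den == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return num / den

base = [1, 2, 3, 4, 5]
print(pearson(base, [2, 4, 6, 8, 10]))   # rises together: 1.0
print(pearson(base, [10, 8, 6, 4, 2]))   # moves in opposition: -1.0
print(pearson(base, [2, 1, 4, 1, 2]))    # no linear relationship: 0.0
```

Real-world data rarely lands exactly on 1, -1, or 0; values in between indicate how strong the relationship is.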
Great. What's so special about audio decorrelation?
Our brains are wired to interpret sound in particular ways, and can be fooled when we present sound differently from what they expect. This is where audio decorrelation fits in. Naturally, there is a predictable (and usually minuscule) phase shift as sound travels past one ear and then the other. Decorrelation, as described here, takes one channel of audio (a mono signal) and completely alters the phase of a copy of it, so you get two channels of audio (a stereo signal) that have massively different characteristics even though they sound very similar. The phase (the time at which the waves at each frequency reach maximum amplitude) becomes drastically different between the two channels, and this does not tend to affect the way the audio sounds to the human ear.
This decorrelation process can be obtained in several ways, to varying degrees, and produces at least five perceptual effects on what you hear:
- Reduces perceived distortions from constructive & destructive interference between similar sound sources.
- Causes sound in headphones to sound as if it's coming from right by your ears instead of inside your head. (This only happens if you listen to mono tracks anyway.)
- Produces diffuse sound fields, similar to the rich reverberant experience in concert halls.
- Defeats the precedence effect, a psychoacoustic phenomenon where our auditory system naturally filters out echoes shorter than a particular time (50 ms or so). (You can also defeat the precedence effect by playing your recorded audio backwards.)
- Prevents sound coming from separate loudspeakers from seeming like it's coming from one speaker as you sit closer to that particular speaker (i.e. image shift).
The Program
The method I use to produce decorrelation is derived from (Kendall 1995), a fascinating paper by the class professor on the effects of decorrelation and the mechanisms that produce it. You can read more about how it works by downloading the paper. DecorreLab also produces several other interesting psychoacoustic effects, including interaural time and intensity differences between the two sound signals. By changing the time delay and/or volume of the two signals as they reach your ears, you can fool your brain into thinking the sound is coming from different places. Combine this with the effect of enriching the sound field in general, and you can make your mono recordings sound much more realistic.
Click here to download DecorreLab. You need the latest version of the SuperCollider audio processing language to run it.
SuperCollider makes it very easy to adjust parameters such as the intensity and delay of audio signals by using its wide array of unit generators. These adjustments can be applied to both channels because the original mono waveform gets copied into another buffer, then the filtering is applied to either or both buffers as prescribed.
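The copy-then-adjust idea can be sketched in a few lines of Python. This is only an illustration of delaying and scaling one of two copies of a mono signal; DecorreLab itself does this with SuperCollider unit generators, and the function name `delay_and_scale`, the 44100 Hz sample rate, and the 0.5 ms delay are assumptions made for the example.

```python
def delay_and_scale(mono, delay_samples, gain):
    """Return (left, right): left is the untouched copy, right is delayed and scaled."""
    if delay_samples < 0:
        raise ValueError("delay_samples must be non-negative")
    left = list(mono) + [0.0] * delay_samples            # pad so both channels match in length
    right = [0.0] * delay_samples + [gain * s for s in mono]
    return left, right

SAMPLE_RATE = 44100                       # assumed sample rate in Hz
delay = round(0.0005 * SAMPLE_RATE)       # a 0.5 ms time difference, in samples
left, right = delay_and_scale([1.0, 0.5, 0.25], delay, 0.7)
print(delay, len(left), len(right))
```

Because the right channel arrives later and quieter, a listener would tend to hear the source as sitting toward the left.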
As part of this project, I have included two robust convolution kernels with 256 coefficients each. Convolution is the act of applying a filter to a signal, and a kernel is the description of that filter. The two kernels shift the phases of specific frequencies in vastly different ways from each other while minimally coloring the sound. Specifically, their correlation coefficient with respect to each other is approximately 0.034, so the filters bear little resemblance to each other. As you slide the Correlation slider from right to left, the two channels change from correlated to decorrelated to inversely correlated, and the filters are applied to the two audio channels by the means described in the paper. On an actual audio sample I tested, the correlation between the two channels after the filters were fully applied was approximately 0.3. This means that when you slide the Correlation slider to the middle (to get 0, fully decorrelated), the audio signals are, in practice, not completely decorrelated. However, the effect is definitely strong enough to produce all the characteristics mentioned above. Plus, the amount of correlation you truly get at 0 is likely to change depending on the audio sample.
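For intuition about what a correlation slider does, here is a Python sketch of one common way to dial in a target correlation: blend a signal with a second signal that is uncorrelated with it and has the same power. This is an assumption made for illustration, not necessarily how DecorreLab or the paper applies the filters, and the names `correlation` and `mix_for_correlation` are hypothetical. A sine and a cosine stand in for the two filtered copies.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den

def mix_for_correlation(a, b, target):
    """Blend a with an uncorrelated, equal-power signal b so that the
    result correlates with a by roughly `target` (between -1 and 1)."""
    if not -1.0 <= target <= 1.0:
        raise ValueError("target must be between -1 and 1")
    weight_b = math.sqrt(1.0 - target * target)
    return [target * x + weight_b * y for x, y in zip(a, b)]

# Stand-ins for the two channels: over a whole cycle, a sine and a cosine
# are uncorrelated and carry the same power.
a = [math.sin(2 * math.pi * n / 64) for n in range(64)]
b = [math.cos(2 * math.pi * n / 64) for n in range(64)]

for target in (1.0, 0.5, 0.0, -0.5, -1.0):    # slider moving from right to left
    right = mix_for_correlation(a, b, target)
    print(target, round(correlation(a, right), 3))
```

With real audio the second signal is never perfectly uncorrelated with the first, which is consistent with the measured value of about 0.3 instead of 0 at the slider's midpoint.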
There are a number of tradeoffs to be made when producing the convolution kernels. For one, there are not more than 256 kernel coefficients because longer convolution kernels tend to smudge the audio and produce undesirable echo effects. In order for the precedence effect to be altered to the maximum degree, an impulse (instantaneous burst of energy at all frequencies) convolved with the filter must last less than 20 milliseconds. However, the number of coefficients must be maximized in order to obtain the greatest decorrelation effect, and has to be constrained to a power of 2 because this greatly optimizes computation time.
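The length limit is easy to check with a little arithmetic. The Python sketch below converts a coefficient count into a duration, assuming a 44100 Hz sample rate (an assumption; other rates scale the result). The helper name `kernel_duration_ms` is made up for the example.

```python
def kernel_duration_ms(num_coefficients, sample_rate_hz=44100):
    """Length of a convolution kernel's impulse response in milliseconds."""
    return 1000.0 * num_coefficients / sample_rate_hz

LIMIT_MS = 20.0   # the threshold quoted above for the precedence effect
for n in (128, 256, 512, 1024):
    duration = kernel_duration_ms(n)
    status = "within" if duration < LIMIT_MS else "over"
    print(n, round(duration, 2), status)
```

At this rate, 256 coefficients last about 5.8 ms, comfortably inside the 20 ms limit, while 1024 coefficients would exceed it.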
There is also a tradeoff to be made between timbral neutrality and impact at low frequencies. Unfortunately, to have the maximum amount of decorrelation at low frequencies, the filters will have to be built in such a way that will introduce perceptible timbral differences from the original signal (i.e. the filters will make it sound noticeably different). Changing the sound's timbre may not be desirable.
Finally, when designing the filters, one could consider applying phase shifts linearly across frequency or grouping them into critical band spacing. (Critical bands dictate the way our inner ear and brain handle various frequencies, and are quite complex to describe.) Using a linear spacing of phase shifts eliminates interference better, but using a logarithmic spacing (which is how critical bands are spaced) does a better job at creating diffuse sound fields.
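The difference between the two spacings shows up clearly when you list the band edges. The Python sketch below is only an illustration: the function names and the 20 Hz to 20 kHz range are assumptions, and a pure logarithmic spacing is a simplification of how critical bands are actually laid out.

```python
def linear_edges(low_hz, high_hz, bands):
    """Band edges that are evenly spaced in Hz."""
    step = (high_hz - low_hz) / bands
    return [low_hz + i * step for i in range(bands + 1)]

def log_edges(low_hz, high_hz, bands):
    """Band edges with a constant frequency ratio between neighbors (low_hz must be > 0)."""
    if low_hz <= 0:
        raise ValueError("low_hz must be positive for logarithmic spacing")
    ratio = high_hz / low_hz
    return [low_hz * ratio ** (i / bands) for i in range(bands + 1)]

print([round(f) for f in linear_edges(20, 20000, 3)])
print([round(f) for f in log_edges(20, 20000, 3)])
```

Linear spacing spends most of its bands on high frequencies, while logarithmic spacing gives the low end as many bands as the high end.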
The MATLAB code snippet below can be used to create 100 different no-frills filters quickly and provide the lowest correlation coefficient found between any of them:
% The real part of the filter is always 1 in order to keep the sound as close as possible to the same timbre; the imaginary part applies the phase shift.
Hs = 1 + 1i*(2*pi*rand(256, 100) - pi); % real (1) + imaginary (1i is sqrt(-1)); one filter per column
lowestConv = Inf; % lowest correlation found so far
for filter1 = 1:100
    % Calculate 1/2 the matrix to reduce redundancy
    for filter2 = filter1 + 1:100
        % Find the cross-correlation vector between the two filters
        c_h = xcorr(ifft(Hs(:,filter1)), ifft(Hs(:,filter2)), 'coeff');
        % Find the middle value of the result vector, as this is the correlation value for when the two filters were directly overlapping each other
        c((filter1-1)*100 + filter2) = abs(c_h(256));
        % Find out if this is lower than what we've seen before
        if c((filter1-1)*100 + filter2) < lowestConv
            a = filter1; b = filter2; % a & b are the indices of the best filters
            lowestConv = c((filter1-1)*100 + filter2);
        end
    end
end
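If you don't have MATLAB handy, the same search can be sketched in plain Python. This is a variation and not a port: it gives every frequency bin a magnitude of exactly 1 with a random phase (instead of 1 plus an imaginary phase term), and it mirrors the spectrum so the kernels come out real. The function names, the seed, and the choice of 8 kernels are assumptions for the example.

```python
import cmath
import math
import random

def random_phase_kernel(n, rng):
    """Real n-tap kernel with a flat (all-ones) magnitude spectrum and random phase."""
    spectrum = [1 + 0j] * n                          # DC and Nyquist bins stay real
    for k in range(1, n // 2):
        spectrum[k] = cmath.exp(1j * rng.uniform(-math.pi, math.pi))
        spectrum[n - k] = spectrum[k].conjugate()    # mirror so the kernel is real
    twiddle = [cmath.exp(2j * math.pi * m / n) for m in range(n)]
    # Naive inverse DFT: O(n^2), which is fine for n = 256
    return [
        sum(spectrum[k] * twiddle[(k * t) % n] for k in range(n)).real / n
        for t in range(n)
    ]

def zero_lag_correlation(a, b):
    """Normalized cross-correlation of two real sequences at lag 0."""
    num = sum(x * y for x, y in zip(a, b))
    return num / math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))

rng = random.Random(1995)                            # fixed seed so runs are repeatable
kernels = [random_phase_kernel(256, rng) for _ in range(8)]
best = min(
    (abs(zero_lag_correlation(kernels[i], kernels[j])), i, j)
    for i in range(len(kernels))
    for j in range(i + 1, len(kernels))
)
print("lowest |correlation| %.4f between kernels %d and %d" % best)
```

Generating more candidates gives the search a better chance of finding a very weakly correlated pair, at the cost of a longer run time.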
Side Notes
1. Another key point, especially when thinking of asset correlation, is that correlation of assets changes over time. The 2007 credit crunch & financial crisis made bundled assets (e.g. loans packaged in big groups & sold to Fannie & Freddie) fall steeply in value. Originally, the loans inside them were supposed to be decorrelated and thus less risky because usually some people default and some don't. When the market turned, the loans became correlated because more people were unable to pay them on time. It is thus unsafe to assume most assets will keep their exact rate of correlation for long.
2. Correlation does not prove causation. Over time, research has linked smoking to cancer, but many people make preposterous claims simply because they see two things that are related and think one must cause the other.
Hi, I'm new to SuperCollider and coding in general, and I was wondering if you could help. I copy/pasted your 'DecorreLab' script into SuperCollider and the post window comes up with the following:
init_OSC
empty
compiling class library...
initPassOne started
NumPrimitives = 710
initPassOne done
compiling dir: '/Applications/SuperCollider/SuperCollider.app/Contents/Resources/SCClassLibrary'
pass 1 done
numentries = 812201 / 11241384 = 0.072
5082 method selectors, 2212 classes
method table size 12317376 bytes, big table size 89931072
Number of Symbols 11575
Byte Code Size 348685
compiled 311 files in 0.40 seconds
compile done
Class tree inited in 0.01 seconds
Cleaning up temp synthdefs...
*** Welcome to SuperCollider 3.7.0-beta2. *** For help press Cmd-D.
SCDoc: Indexing help-files...
SCDoc: Indexed 1334 documents in 1.05 seconds
booting 57110
Number of Devices: 2
0 : "Built-in Microph"
1 : "Built-in Output"
"Built-in Microph" Input Device
Streams: 1
0 channels 2
"Built-in Output" Output Device
Streams: 1
0 channels 2
SC_AudioDriver: sample rate = 44100.000000, driver's block size = 512
SuperCollider 3 server ready.
Receiving notification messages from server localhost
Shared memory server interface initialized
ERROR: Class not defined.
in file 'selected text'
line 630 char 18:
window = SCWindow("DecorreLab", Rect(100, 300, 500, 300));
window.view.decorator = FlowLayout(window.view.bounds);
-----------------------------------
ERROR: Class not defined.
in file 'selected text'
line 634 char 24:
filetext = SCStaticText(window,Rect(0,0,100,20));
filetext.string = "Path to audio file: ";
-----------------------------------
ERROR: Class not defined.
in file 'selected text'
line 636 char 19:
file = SCTextField(window, Rect(0,0,300,20));
loadBtn = SCButton(window, Rect(0,0,50,20));
-----------------------------------
ERROR: Class not defined.
in file 'selected text'
line 637 char 19:
loadBtn = SCButton(window, Rect(0,0,50,20));
loadBtn.states = [["Load", Color.black, Color.gray]];
-----------------------------------
ERROR: Class not defined.
in file 'selected text'
line 649 char 19:
corrBtn = SCButton(window, Rect(30,0,30,20));
corrBtn.states = [["OK", Color.black, Color.green]];
-----------------------------------
ERROR: Class not defined.
in file 'selected text'
line 686 char 19:
playBtn = SCButton(window, Rect(100,0,100,100));
playBtn.states = [["Play", Color.black, Color.green],
-----------------------------------
-> nil
I'm running OS X 10.10.5 and the latest version of SuperCollider.
Any help would be great!
Hi Jaryd,
I'll have to look into this more later, but as I recall, you have to use a version of SuperCollider somewhere between 3.0 & 3.5. I don't think the latest version of SC plays nicely with DecorreLab -- even ones that came out later in 2013. ;)
Let me know if that helps.
- Stephen
I got it working in an older version; however, when I load a .wav file I get the following:
a SCButton
/Users/jarydmiles/Desktop/MSc Applied Acoustics/Research Methods/A2/WAV Files/reference cd - pinknoise.wav
Boop
Beep
FAILURE /n_set Node not found
FAILURE /n_set Node not found
FAILURE /n_set Node not found
FAILURE /n_set Node not found
FAILURE /n_set Node not found
FAILURE /n_set Node not found
Thanks for the help