Reimplementing the StereoPannerNode

Last week Ruth John asked me if she could use my standardized-audio-context package for a set of Web Audio demos she is currently preparing. "Sure," I said, but when she showed me some early drafts I quickly realized that she was using AudioNodes which the standardized-audio-context package does not support yet. There is no particular reason for that. I just never used these AudioNodes in my own projects and no one actively asked for them.
I chose the StereoPannerNode as the first node to implement. It's a simple node with only one AudioParam called pan. If its value is -1 the signal will be panned to the left. If the value is 1 there will only be a signal on the right and if the value is zero nothing is going to happen. It's that easy. At least that's what I thought.
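In code, using the native node is as simple as it sounds. Here is a minimal sketch (assuming a browser with native support):

const audioContext = new AudioContext();
const stereoPannerNode = new StereoPannerNode(audioContext);

stereoPannerNode.pan.value = -1; // pans the signal all the way to the left
stereoPannerNode.pan.value = 1; // pans the signal all the way to the right
stereoPannerNode.pan.value = 0; // supposedly does nothing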

Browser Support for the StereoPannerNode

The browser support for the StereoPannerNode is actually pretty good. Only Safari, the web developer's best friend, has no native support. But the other browsers do closely follow the spec.

I was surprised to find out that the StereoPannerNode is meant to behave differently for mono and stereo input signals. And I was even more surprised that all browsers with a native implementation actually honor that. The panning algorithm which should be used to compute the output is even part of the spec to avoid any ambiguity.

Reproducing the Panning Effect with native AudioNodes

I'm not the first to try to replicate the StereoPannerNode. As with many other Web Audio problems, it's worth checking the GitHub account of mohayonao before reinventing the wheel. As expected, there is also an implementation of the StereoPannerNode. It uses a cleverly designed graph of GainNodes and WaveShaperNodes in combination with a ChannelSplitterNode and a ChannelMergerNode to rebuild a StereoPannerNode. However, it only supports the algorithm for mono signals. Nevertheless, it is a good starting point and worth taking a closer look at.

A short glossary

Since we will be using the terms GainNode, ChannelSplitterNode, ChannelMergerNode and WaveShaperNode from now on as if they were well-known things, I want to quickly explain what each of those nodes does. Feel free to skip ahead if you already know them.

The GainNode

The GainNode is probably the most basic AudioNode of all. It has only one AudioParam called gain. Whatever the input signal is, it will be multiplied with the gain value.

GainNode article on MDN
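A minimal sketch, assuming an AudioContext stored in a variable called audioContext:

const gainNode = new GainNode(audioContext, { gain: 0.5 });
// every sample that passes through gainNode gets multiplied by 0.5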

The ChannelSplitterNode

The ChannelSplitterNode has only one input but can have multiple outputs. It splits the channels of the input across all its outputs.

ChannelSplitterNode article on MDN
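Again a minimal sketch with the assumed audioContext:

const channelSplitterNode = new ChannelSplitterNode(audioContext, { numberOfOutputs: 2 });
// output 0 now carries the first channel of the input and output 1 the second one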

The ChannelMergerNode

The ChannelMergerNode is the counterpart of the ChannelSplitterNode. It can have multiple inputs but only one output. It maps all its inputs to the channels of the output.

ChannelMergerNode article on MDN
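The corresponding sketch:

const channelMergerNode = new ChannelMergerNode(audioContext, { numberOfInputs: 2 });
// whatever arrives at input 0 becomes the first channel of the output and input 1 the second one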

The WaveShaperNode

The WaveShaperNode maps each sample of the input signal to a given output value. This output value is defined by the curve of the WaveShaperNode. This curve is nothing more than a big lookup table.

WaveShaperNode article on MDN
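A sketch with a curve that maps every sample onto itself:

const waveShaperNode = new WaveShaperNode(audioContext, { curve: new Float32Array([ -1, 0, 1 ]) });
// inputs in between the curve's values get interpolated linearly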

Putting it all together

We are going to build our homegrown StereoPannerNode like a black box. There should be no way to tell from the outside what is happening inside. We just need to expose an input, an output and the pan AudioParam.

In this particular case the ChannelSplitterNode and the ChannelMergerNode are ideal candidates for the input and output nodes. The ChannelSplitterNode has only one input, the ChannelMergerNode has only one output, and together they allow us to work with the raw channels in between. This gives us a basic architecture like this:

[Diagram: ChannelSplitterNode → (left channel) / (right channel) → ChannelMergerNode]
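Expressed in code, the outer shell of our black box could look like this (all variable names from here on are just my own):

const input = new ChannelSplitterNode(audioContext, { numberOfOutputs: 2 });
const output = new ChannelMergerNode(audioContext, { numberOfInputs: 2 });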

We also need to expose an AudioParam to control the pan value. A ConstantSourceNode looks very promising for that job. Unfortunately it needs to be started somehow and there is no functionality of the StereoPannerNode which we could abuse to start it.

A better approach is to use a WaveShaperNode that is connected to our input. We give it a DC curve (which is actually a line) of [ 1, 1 ]. This will guarantee that the output of that WaveShaperNode is always 1 no matter what input it has. We chain that WaveShaperNode with a GainNode and expose its gain AudioParam as the pan AudioParam. We end up with a simulated AudioParam that feeds its current value into the internal graph.

[Diagram: WaveShaperNode → GainNode (gain exposed as pan), next to the ChannelSplitterNode → (left) / (right) → ChannelMergerNode graph]
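A rough sketch of that simulated AudioParam, building on the input node from above:

const dcWaveShaperNode = new WaveShaperNode(audioContext, { curve: new Float32Array([ 1, 1 ]) });
// the gain is initialized with 0 because that is the default pan value
const panGainNode = new GainNode(audioContext, { gain: 0 });
const pan = panGainNode.gain; // this is the AudioParam we expose as pan

input.connect(dcWaveShaperNode);
dcWaveShaperNode.connect(panGainNode);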

Next we need to apply the formula from the panning algorithm. The pan value needs to be transformed a bit in order to get the values which then have to be multiplied with the left and the right channel respectively.

// left channel
const leftSample = sample * Math.cos(((pan + 1) / 2) * Math.PI / 2);
// right channel
const rightSample = sample * Math.sin(((pan + 1) / 2) * Math.PI / 2);

The algorithm requires the pan value to be mapped to a value between 0 and 1. This is what the ((pan + 1) / 2) part of the formula stands for. Later on, that value gets fed into the cosine (or sine) function to produce the final value. This can be achieved by using a WaveShaperNode with a lot of precomputed values. A reduced version of the curve might look like this:

[
  // the values for inputs from -1 up to (but not including) 0 are never used
  1,
  1,
  1,
  1,
  // the values for inputs from 0 to 1 do the actual mapping
  Math.cos(0 * Math.PI / 2),
  Math.cos(0.25 * Math.PI / 2),
  Math.cos(0.5 * Math.PI / 2),
  Math.cos(0.75 * Math.PI / 2),
  Math.cos(1 * Math.PI / 2)
]

The values of a WaveShaperNode's curve always cover the input range from -1 to 1. As you can see, that would waste a lot of values in our curve: all the values for inputs from -1 up to (but not including) 0 are never used, because the range of possible input values starts at 0. A nice shortcut is therefore to not map the pan value from [ -1; 1 ] to [ 0; 1 ] in the first place and to instead spread the curve across the whole range.

[
  Math.cos(0 * Math.PI / 2),
  Math.cos(0.125 * Math.PI / 2),
  Math.cos(0.25 * Math.PI / 2),
  Math.cos(0.375 * Math.PI / 2),
  Math.cos(0.5 * Math.PI / 2),
  Math.cos(0.625 * Math.PI / 2),
  Math.cos(0.75 * Math.PI / 2),
  Math.cos(0.875 * Math.PI / 2),
  Math.cos(1 * Math.PI / 2)
]

This little trick allows us to use the full range of the WaveShaperNode's curve and also saves us from doing some unnecessary value mapping. However, we still need two of those WaveShaperNodes, as the computation for the right channel uses the sine instead of the cosine function. Those two WaveShaperNodes are abbreviated as WSN in the diagram further below.
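Generating two full-size curves for them is straightforward. Here is a sketch with an arbitrarily chosen size of 8192 values:

const CURVE_SIZE = 8192; // an arbitrary size, larger curves are more accurate
const cosCurve = new Float32Array(CURVE_SIZE);
const sinCurve = new Float32Array(CURVE_SIZE);

for (let i = 0; i < CURVE_SIZE; i += 1) {
  // the curve's input range from -1 to 1 gets spread across x values from 0 to 1
  const x = i / (CURVE_SIZE - 1);

  cosCurve[i] = Math.cos(x * Math.PI / 2);
  sinCurve[i] = Math.sin(x * Math.PI / 2);
}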

[Diagram: WaveShaperNode → GainNode (gain exposed as pan) → WSN / WSN → gain AudioParams of two GainNodes sitting between the ChannelSplitterNode and the ChannelMergerNode]

We also create a GainNode for each channel. The WaveShaperNodes are not connected to these GainNodes directly. Instead, they control their gain AudioParams. In other words, their output values get multiplied with the input signal.
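Putting all the pieces together, the mono version of our black box could be wired up like this. It is only a sketch which reuses the nodes and curves from the previous snippets:

const cosWaveShaperNode = new WaveShaperNode(audioContext, { curve: cosCurve });
const sinWaveShaperNode = new WaveShaperNode(audioContext, { curve: sinCurve });
// both gains are initialized with 0 because the outputs of the
// WaveShaperNodes get added to the values of the gain AudioParams
const leftGainNode = new GainNode(audioContext, { gain: 0 });
const rightGainNode = new GainNode(audioContext, { gain: 0 });

// the pan value gets transformed into one gain value per channel
panGainNode.connect(cosWaveShaperNode);
panGainNode.connect(sinWaveShaperNode);
cosWaveShaperNode.connect(leftGainNode.gain);
sinWaveShaperNode.connect(rightGainNode.gain);

// the mono input signal feeds both GainNodes which in turn feed the output
input.connect(leftGainNode, 0);
input.connect(rightGainNode, 0);
leftGainNode.connect(output, 0, 0);
rightGainNode.connect(output, 0, 1);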

And with that we have built a fully working clone of the StereoPannerNode. I think it's a nice example which shows what is already possible without using an AudioWorklet. But at the same time it also stresses the need for the AudioWorklet, as there is no way to know whether an input signal is mono or stereo without it. The only solution I can think of for now is to require the channelCountMode to be 'explicit'. This ensures that the channelCount is predictable and can't be changed dynamically by the input signal.

The algorithm for a stereo signal is a bit more complicated but can be built with the same technique. For the curious, the implementation of the stereo algorithm can be looked up in the source code.

It's worth noting that the accuracy of this approach depends on the size of the curves used for the WaveShaperNodes. The more precomputed values a curve contains, the more accurate the panning gets.

An interesting fun fact is that the algorithm for mono signals will actually modify the signal even if the pan value is zero. Both channels get multiplied by cos(0.5 * Math.PI / 2), which is roughly 0.7071. But that is absolutely intentional.