Thursday, April 16, 2009

Phase-Coherent Pitch Shifter for ReClock

This article is for Home Theater and AV enthusiasts who use a computer as the playback device for High Definition video.

A program called ReClock is often used with HTPCs to ensure perfectly smooth video playback and to correct the sound pitch of 25fps PAL movies, which have been sped up by about 4% from their original 24fps.

Some people, myself included, use ReClock to speed up 24fps movies to 25fps, so that they can be displayed without any 3:2 motion judder on HDTVs which support only 50Hz/60Hz refresh rates. In this case, ReClock can also perform pitch-correction of the audio.
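
To put numbers on the 24fps to 25fps speed-up, here is a small Python sketch of the arithmetic. It is only an illustration (the function name `speedup_stats` is made up for this example and has nothing to do with ReClock itself), and it assumes the usual convention of 12 semitones per octave.

```python
import math

def speedup_stats(source_fps, target_fps):
    """Return (tempo change in percent, pitch shift in semitones) for
    material made at source_fps and played back at target_fps with no
    pitch correction applied."""
    ratio = target_fps / source_fps
    percent = (ratio - 1.0) * 100.0
    semitones = 12.0 * math.log2(ratio)  # 12 semitones per octave
    return percent, semitones

percent, semitones = speedup_stats(24.0, 25.0)
print(f"{percent:.2f}% faster, pitch raised by {semitones:.2f} semitones")
# prints "4.17% faster, pitch raised by 0.71 semitones"
```

So without pitch correction everything sounds roughly 0.7 of a semitone sharp, which is the shift that the pitch-correction step has to undo.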

The Problem

Unfortunately, the default pitch-shifting library included in ReClock has some drawbacks when used to pitch-shift multichannel 5.1 or 7.1 PCM audio:

  • The phase is not coherent between the different channels. This means that the Left and Right channels might be slightly out of phase with each other, which ruins the stereo effect.
  • The parameters used in the pitch-shifting algorithm are not optimal for movie sound quality. The subwoofer channel and low frequencies may be weakened, and some pitch-shifting artefacts may be heard in the sound.
ReClock uses the open-source SoundTouch library for pitch-shifting.

The Fix

I have modified the pitch-shifting DLL to fix the problems for 5.1/7.1 PCM audio:

  • The phase will be coherent between the L-R channels. This preserves the stereo effect and enhances the overall sound.
  • C and LFE channels should be mostly in-phase with L-R channels.
  • The parameters used in the SoundTouch algorithm are optimized so that the subwoofer channel and low frequencies are preserved and the pitch-shifting artefacts are reduced.
The overall result is much improved sound quality when ReClock pitch-shifts 5.1/7.1 audio.

How to install it

Updated 25-Jun-09: Updated parameters
  • Download and install the latest ReClock from here.
  • Download the enhanced DLL here:
    (a) Standard version (best quality) or
    (b) Standard version with lower CPU usage
  • Alternatively, if you use the AC3 Encoder function of ReClock and experience sound dropouts or A/V sync problems, try this version instead:
    (a) AC3 Encoder version (best quality) or
    (b) AC3 Encoder version with lower CPU usage
  • Extract and copy the timestretch.dll file to the C:\Program Files\ReClock\ folder, replacing the original file. You may back up the original timestretch.dll file if you wish.
  • If you experience sound drop-outs, you can try this: open the "Configure ReClock" application, and increase the setting of "Sound pre-buffer size" to 1000ms (default is 500ms). The "Max latency for PCM audio" should be about 10% to 20% (default is 20%).
Have fun, and hope you enjoy the difference it makes!

Click here to download the source code.

Additional tips

There are a few more things you can do to reduce the clicking sounds that can sometimes be heard when timestretching is used.
  • Set your sound codec (ffdshow audio, etc.) to output at 32-bit INTEGER PCM only (and set ReClock to output at whatever bit-depth your soundcard supports)
  • Also, if using ffdshow, in the "Processing" menu, under "Allowed sample formats for sound processing", choose ONLY "32 bit integer"
The above are some of the optimum settings I've found after a lot of experimenting.

More technical jargon about what I actually did

ReClock uses the SoundTouch library to do time stretching. SoundTouch uses WSOLA-like time-stretching routines to do this. Basically, WSOLA splits the sound into chunks, then finds the optimal point at which to join the chunks back into a longer or shorter waveform, depending on how much you want to stretch or shrink the sound. SoundTouch takes several parameters which can be adjusted to optimize the WSOLA. For more information, check out http://www.surina.net/soundtouch/README.html. In my DLL, I have optimized these parameters to obtain better sound for movies.
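
For the curious, here is a minimal Python sketch of the chunk-and-rejoin idea. It is not SoundTouch's code and not what is in my DLL; it is a simplified WSOLA-style stretcher for a single mono channel. The parameter names `sequence`, `seek` and `overlap` are my own labels, loosely modelled on the sequence length, seek window and overlap length described in the SoundTouch README, and the values (given in samples, not milliseconds) are arbitrary illustrations rather than the tuned settings.

```python
import math

def best_offset(tail, src, start, seek, overlap):
    """Search offsets 0..seek for the place where src best matches the end
    of the output so far (normalized cross-correlation over `overlap` samples)."""
    best, best_score = 0, -math.inf
    for off in range(seek + 1):
        seg = src[start + off:start + off + overlap]
        corr = sum(a * b for a, b in zip(tail, seg))
        energy = sum(b * b for b in seg)
        score = corr / math.sqrt(energy + 1e-12)
        if score > best_score:
            best, best_score = off, score
    return best

def wsola_stretch(src, tempo, sequence=400, seek=150, overlap=80):
    """Return mono samples `src` played `tempo` times faster, without resampling."""
    out = list(src[:sequence])
    hop_in = (sequence - overlap) * tempo  # input advance per output chunk
    pos = 0.0
    while True:
        pos += hop_in
        start = int(pos)
        if start + seek + sequence > len(src):
            break  # not enough input left for another full chunk
        tail = out[-overlap:]
        off = best_offset(tail, src, start, seek, overlap)
        chunk = src[start + off:start + off + sequence]
        base = len(out) - overlap
        for i in range(overlap):
            w = i / overlap  # linear crossfade from old chunk to new chunk
            out[base + i] = tail[i] * (1.0 - w) + chunk[i] * w
        out.extend(chunk[overlap:])
    return out

# Usage: one second of a 440 Hz tone at 8 kHz, played 25/24 times faster.
tone = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]
shorter = wsola_stretch(tone, tempo=25 / 24)
print(len(tone), len(shorter))
```

The three parameters trade off against each other: longer chunks keep low frequencies intact but can smear transients, while a wider seek window finds better join points at the cost of more CPU time. That is the kind of tuning this section is talking about.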

The other thing I did was to fix the phase coherency. SoundTouch can only process sound samples in mono or stereo. In stereo mode, SoundTouch preserves the phase coherency between the two channels. For 5.1 channel audio, ReClock actually creates 6 mono channels for SoundTouch to process. As a result, SoundTouch cuts the sound up differently for every channel, so every channel ends up slightly out of sync (out of phase) with the others. In my DLL, I instead use multiple stereo channel processing in SoundTouch, matching up the Left-Right, etc. This pretty much ensures that the stereo imaging is still preserved.
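
The regrouping can be sketched in Python as below. This is an illustration only, not the code from the DLL. It assumes interleaved samples in the common 5.1 order (Front Left, Front Right, Center, LFE, Surround Left, Surround Right) and it assumes that neighbouring channels are paired as (L, R), (C, LFE), (SL, SR); the function names are made up, and `process_stereo` is a placeholder for whatever stereo time-stretcher handles each pair.

```python
def split_pairs(frames, channels):
    """Split interleaved multichannel samples into interleaved stereo pairs."""
    if channels % 2:
        raise ValueError("this sketch only handles an even channel count")
    frame_count = len(frames) // channels
    pairs = []
    for p in range(channels // 2):
        buf = []
        for f in range(frame_count):
            base = f * channels + 2 * p
            buf.extend(frames[base:base + 2])
        pairs.append(buf)
    return pairs

def merge_pairs(pairs):
    """Interleave stereo pairs back into one multichannel stream."""
    # Pairs may come back with slightly different lengths; keep the common part.
    frame_count = min(len(p) for p in pairs) // 2
    out = []
    for f in range(frame_count):
        for p in pairs:
            out.extend(p[2 * f:2 * f + 2])
    return out

def process_multichannel(frames, channels, process_stereo):
    """Run a stereo processor on each pair, then rebuild the stream."""
    return merge_pairs([process_stereo(p) for p in split_pairs(frames, channels)])

# Usage: two frames of 5.1 audio, with a do-nothing stand-in for the stretcher.
frames = list(range(12))
print(split_pairs(frames, 6))
print(process_multichannel(frames, 6, lambda pair: pair) == frames)
```

Because both channels of a pair go through the same stereo instance, they are cut and rejoined at the same points, which is what keeps them in phase with each other. The sketch does nothing to line up one pair against another.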