The core objective of this project was to create an algorithm that produces a colour RGB image from a set of three similar greyscale pictures, each representing one of the colour channels. To test this algorithm, a few series of pictures taken from the Prokudin-Gorskii glass plate collection were used. Each plate contains the three colour channels stacked one on top of the other. The plate is divided into three separate images, which are then aligned relative to each other and finally combined to form the resulting colour image.
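As a minimal sketch of the splitting and recombining steps described above, the plate can be cut into three equal-height strips and stacked along a third axis. The function names are hypothetical, and the blue, green, red order from top to bottom is an assumption about the scans rather than something stated in this report.

```python
import numpy as np

def split_plate(plate):
    """Cut a vertically stacked plate into three equal-height channels.

    Assumes the channels are stacked top to bottom; any leftover rows
    (when the height is not a multiple of 3) are dropped.
    """
    h = plate.shape[0] // 3
    return plate[:h], plate[h:2 * h], plate[2 * h:3 * h]

def combine(r, g, b):
    """Stack three aligned greyscale channels into an H x W x 3 RGB image."""
    return np.dstack([r, g, b])

# Usage on a tiny synthetic "plate" of 7 rows and 4 columns.
plate = np.arange(28).reshape(7, 4)
b, g, r = split_plate(plate)   # assumed order: blue, green, red
rgb = combine(r, g, b)
```

In practice the alignment step sits between split_plate and combine.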
First of all, the Prokudin-Gorskii pictures contain useless borders around the images: one white and one black. To improve the final result, these borders are trimmed as much as possible before anything else.
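The report does not say how the borders are detected, so the following is only one possible approach, given as a sketch: peel off outer rows and columns for as long as their mean intensity is nearly white or nearly black. The function name and the two thresholds are assumptions, and the image is assumed to be a float array scaled to the range 0 to 1.

```python
import numpy as np

def trim_borders(img, low=0.1, high=0.9):
    """Peel off edge rows/columns whose mean is almost black or almost white.

    `low` and `high` are hypothetical thresholds on a [0, 1] intensity scale.
    """
    def is_border(line):
        m = line.mean()
        return m < low or m > high

    top, bottom = 0, img.shape[0]
    left, right = 0, img.shape[1]
    changed = True
    # Remove at most one line per side per pass until nothing changes.
    while changed and bottom - top > 2 and right - left > 2:
        changed = False
        if is_border(img[top, left:right]):
            top += 1
            changed = True
        if is_border(img[bottom - 1, left:right]):
            bottom -= 1
            changed = True
        if is_border(img[top:bottom, left]):
            left += 1
            changed = True
        if is_border(img[top:bottom, right - 1]):
            right -= 1
            changed = True
    return img[top:bottom, left:right]

# Usage: a 6x6 grey picture inside a black ring inside a white ring.
img = np.ones((10, 10))
img[1:9, 1:9] = 0.0
img[2:8, 2:8] = 0.5
trimmed = trim_borders(img)
```

Real scans have uneven, noisy borders, so the thresholds would need tuning per collection.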
The main image is then separated into three, one for each of the colour channels. At this point, the channels need to be aligned in order to produce the corresponding RGB image. To obtain the displacement vector used to shift the second and third channels relative to the first, they are compared using the normalized cross-correlation (NCC) method. The highest value returned by this algorithm indicates the position at which the two images best overlap. The displacement vector is then simply the difference between this position and the original one. Another method, the sum of squared differences (SSD), could also have been used to compare the channels. However, since the three channels do not have the same intensity (they represent different colours, after all), the results that this method yields can be less than satisfactory. A Sobel filter is also applied to each channel beforehand in order to get better results.
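The comparison described above can be sketched as an exhaustive search over small shifts, scoring each candidate with the NCC of the Sobel edge maps. This is an illustration under assumptions, not the project's actual code: the function names and the max_shift parameter are hypothetical, and np.roll is used for shifting, which wraps pixels around the image, so a margin is cropped before scoring.

```python
import numpy as np

def sobel_magnitude(img):
    """Gradient magnitude from the 3x3 Sobel kernels (edge-padded input)."""
    p = np.pad(img.astype(float), 1, mode="edge")
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return np.hypot(gx, gy)

def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / denom if denom > 0 else 0.0

def align(channel, reference, max_shift=15):
    """Return the (dy, dx) shift of `channel` that best matches `reference`."""
    ref_e = sobel_magnitude(reference)
    ch_e = sobel_magnitude(channel)
    m = max_shift + 1  # margin hiding wrapped-around and padded pixels
    best, best_score = (0, 0), -np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(ch_e, (dy, dx), axis=(0, 1))
            score = ncc(shifted[m:-m, m:-m], ref_e[m:-m, m:-m])
            if score > best_score:
                best, best_score = (dy, dx), score
    return best

# Usage: a synthetic channel with a different intensity and a known offset.
rng = np.random.default_rng(0)
reference = rng.random((40, 40))
channel = 0.5 * np.roll(reference, (-3, 2), axis=(0, 1)) + 0.1
shift = align(channel, reference, max_shift=5)
```

Because the NCC subtracts the mean and divides by the norm, the change of intensity in the synthetic channel does not affect the recovered shift, which is the property that makes it preferable to the SSD here.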
This method works well for small (low-resolution) images, but for large pictures the comparison process can be time-consuming. To remedy this problem, the image is recursively resized by half (from 100% to 50% to 25%, etc.) down to an acceptable size, and then the NCC is applied. On each return from the recursion, the displacement vector is multiplied by the resize factor (e.g. v[]*2 from 50% resolution back to 100%) and the SSD is computed on a very small window of pixels in order to refine the vector.
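The recursion described above can be sketched as follows. This is a simplified illustration under assumptions: all names, the min_size and coarse_radius parameters, and the 2x2 averaging used for resizing are hypothetical, the Sobel filtering is left out for brevity, and shifting is done with np.roll, which wraps pixels around (a margin is cropped to hide them).

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation (higher is better)."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / denom if denom > 0 else 0.0

def neg_ssd(a, b):
    """Negated sum of squared differences, so that higher is better."""
    return -((a - b) ** 2).sum()

def search(ch, ref, center, radius, score):
    """Try every shift within `radius` of `center`; return the best (dy, dx)."""
    cy, cx = center
    m = max(abs(cy), abs(cx)) + radius  # margin hiding wrapped-around pixels
    best, best_score = center, -np.inf
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            shifted = np.roll(ch, (dy, dx), axis=(0, 1))
            s = score(shifted[m:-m, m:-m], ref[m:-m, m:-m])
            if s > best_score:
                best, best_score = (dy, dx), s
    return best

def downscale(img):
    """Halve the resolution by averaging 2x2 blocks (odd edges are dropped)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2]
            + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def pyramid_align(ch, ref, min_size=32, coarse_radius=8):
    """Coarse NCC search at the smallest scale, SSD refinement on the way up."""
    if min(ref.shape) <= min_size:
        return search(ch, ref, (0, 0), coarse_radius, ncc)
    dy, dx = pyramid_align(downscale(ch), downscale(ref), min_size, coarse_radius)
    # Scale the vector back up (v * 2), then refine within +/- 1 pixel.
    return search(ch, ref, (2 * dy, 2 * dx), 1, neg_ssd)

# Usage: recover a known offset on a synthetic 64x64 image.
rng = np.random.default_rng(1)
ref = rng.random((64, 64))
ch = np.roll(ref, (-4, 6), axis=(0, 1))
shift = pyramid_align(ch, ref)
```

The expensive wide search only runs on the smallest image, while each larger level checks just nine candidate shifts around the doubled vector.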