TP1: Colorizing the Russian Empire

Brief description

The goal of this TP is to generate color images from the Sergei Mikhailovich Prokudin-Gorskii collection. Each element of this collection contains three separate images for the R, G and B channels, so in order to generate a color image it is necessary to superimpose these three channels on each other. The challenge of this assignment is to find an algorithm that registers these images.
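As a concrete illustration, here is a minimal sketch of the channel extraction and stacking step. It assumes the digitized plate stacks the blue, green and red exposures top to bottom (as in the collection's scans); the file names are placeholders, and the registration step itself is covered in the next sections.

```python
import numpy as np
from skimage import io

# Load the scanned glass plate (grayscale) and cut it into three equal strips.
# The Prokudin-Gorskii scans stack the exposures B, G, R from top to bottom.
plate = io.imread('plate.jpg', as_gray=True)
h = plate.shape[0] // 3
b, g, r = plate[:h], plate[h:2 * h], plate[2 * h:3 * h]

# Once G and B have been registered to R (see the alignment sketches below),
# the color image is simply the three channels stacked along the last axis.
rgb = np.dstack([r, g, b])
io.imsave('result.png', (np.clip(rgb, 0, 1) * 255).astype(np.uint8))
```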

My approach

I implemented the registration with two algorithms: Sum of Squared Differences (SSD) and SURF. I also used image pyramids to speed up the single-scale SSD algorithm.

Single-scale SSD approach

In the single-scale SSD algorithm, we select one channel as the reference and then look for a translation that registers the other channels onto it. To find this translation, we iteratively shift the second image within a small window (for example [-15, 15] pixels in each direction) and compute the sum of squared differences between the shifted image and the reference. The shift with the minimum SSD is the best match. It is worth mentioning that there is no need to compute the SSD over the whole image; we can select an ROI and compute the SSD only for that region to make the algorithm more efficient.
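A minimal sketch of this exhaustive search in Python/NumPy (not my exact implementation; the window size and the central ROI are illustrative choices):

```python
import numpy as np

def align_ssd(channel, reference, window=15):
    """Exhaustively search translations in [-window, window] and return the
    (dy, dx) shift of `channel` that minimizes the SSD against `reference`."""
    # Compute the SSD only on a central ROI: it is cheaper and avoids the
    # damaged borders of the plates.
    h, w = reference.shape
    roi = (slice(h // 4, 3 * h // 4), slice(w // 4, 3 * w // 4))

    best_shift, best_ssd = (0, 0), np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(np.roll(channel, dy, axis=0), dx, axis=1)
            ssd = np.sum((shifted[roi] - reference[roi]) ** 2)
            if ssd < best_ssd:
                best_ssd, best_shift = ssd, (dy, dx)
    return best_shift
```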

Multiple-scale approach (image pyramid)

Since SSD relies on an exhaustive search, it is very slow on high-resolution images; using an image pyramid, we can speed up the algorithm significantly. An image pyramid is a collection of down-scaled copies of the image in question; usually each copy is scaled down by a further factor of 2, hence the name pyramid. In other words, this collection represents the original image at multiple scales.

Using the image pyramid, we can search small displacements on the reduced images to obtain a coarse estimate of the translation, and then refine this estimate as we move towards the larger images through the pyramid.
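A coarse-to-fine sketch of this idea, reusing the `align_ssd` function from the previous sketch (the number of levels, the rescaling routine and the small refinement window are illustrative choices; I used a factor-of-2 pyramid):

```python
import numpy as np
from skimage.transform import rescale

def align_pyramid(channel, reference, levels=4, window=15):
    """Estimate the translation coarse-to-fine on a factor-of-2 image pyramid."""
    if levels == 0 or min(reference.shape) < 2 * window:
        # Smallest level: run the plain exhaustive SSD search.
        return align_ssd(channel, reference, window)

    # Estimate the shift on half-resolution copies, then double it and
    # refine with a small search at the current resolution.
    coarse = align_pyramid(rescale(channel, 0.5, anti_aliasing=True),
                           rescale(reference, 0.5, anti_aliasing=True),
                           levels - 1, window)
    dy, dx = 2 * coarse[0], 2 * coarse[1]

    shifted = np.roll(np.roll(channel, dy, axis=0), dx, axis=1)
    fine = align_ssd(shifted, reference, window=2)
    return (dy + fine[0], dx + fine[1])
```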

The Results

Single-scale SSD approach

The following tables show the results obtained with the single-scale SSD approach, together with the calculated translations for the G and B channels (I use R as the reference channel).

Click on the images to see the full-size versions.

G Translation = [-6 , 2] , B Translation = [-10 , 1]

 

G Translation = [-3 , -2] , B Translation = [-5 , -5]

 

G Translation = [-7 , 0] , B Translation = [-11 , -1]

 

G Translation = [-3 , -1] , B Translation = [-4 , -3]

 

G Translation = [-3 , 1] , B Translation = [-6 , 0]

 

G Translation = [-11 , 0] , B Translation = [-11 , 2]

 

G Translation = [-3 , 0] , B Translation = [-3 , -11]

 

G Translation = [-6 , 0] , B Translation = [-11 , -1]

 

G Translation = [-8 , -2] , B Translation = [-11 , -2]

 

G Translation = [-8 , -1] , B Translation = [-11 , -4]

 

G Translation = [-5 , 2] , B Translation = [-10 , 2]

 

G Translation = [-7 , -1] , B Translation = [-11 , -3]

 

G Translation = [-4 , 0] , B Translation = [0 , -11]

 

G Translation = [-2 , -2] , B Translation = [-3 , -4]

 

Multi-scale SSD approach (high resolution images)

G Translation = [-47 , -14] , B Translation = [-71 , -34]

 

G Translation = [-26 , -1] , B Translation = [-42 , -6]

 

G Translation = [-68 , -8] , B Translation = [-124 , -33]

 

G Translation = [-26 , -9] , B Translation = [-13 , -20]

 

G Translation = [-17 , -13] , B Translation = [-52 , -37]

 

G Translation = [-43 , -26] , B Translation = [-86 , -32]

 

G Translation = [-34 , -9] , B Translation = [-48 , -13]

 

G Translation = [-52 , -19] , B Translation = [-88 , -39]

 

G Translation = [-59 , -17] , B Translation = [-110 , -66]

 

G Translation = [-97 , 0] , B Translation = [-155 , -13]

 

G Translation = [-48 , -1] , B Translation = [-84 , -14]

As can be seen, although the upper-left corner of the blue plate and the lower-right corner of the red plate are damaged, the result is acceptable.

 

G Translation = [-71 , -140] , B Translation = [-147 , -57]

 

G Translation = [-59 , -35] , B Translation = [-97 , -8]

 

G Translation = [-24 , -1] , B Translation = [-36 , -19]

 

I am Sergei Mikhailovich!

In this part I tried the multi-scale approach on three image sets that I took with my cellphone. The following tables show the results. Click for actual-size pictures!




G Translation = [2 , 13] , B Translation = [3 , 15]

As can be seen, it is snowing in this picture, so there are many snowflakes scattered across the R, G and B images. Since the positions of the snowflakes are not identical in the three exposures, they cause red, green and blue spots in the result.




G Translation = [8 , -8] , B Translation = [19 , -11]

 






G Translation = [-43 , -21] , B Translation = [-74 , -52]

In this part I tried to be creative, so I moved the camera randomly while taking the pictures. Since the R, G and B channels are so different, this method failed to find a proper translation and could not produce a good result.

Using SURF

To solve the above-mentioned problem I used the SURF (Speeded-Up Robust Features) algorithm. SURF is a faster variant of SIFT (Scale-Invariant Feature Transform), a feature point detection and description algorithm. SURF is often used in 3D reconstruction projects to match images taken from different viewpoints.

Here I use SURF to detect feature points in the red, green and blue channels, match them across channels, and then take the average displacement of the matched keypoints as the translation that superimposes the channels and creates the final RGB image.
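A minimal sketch of this idea with OpenCV. Since SURF itself only ships in the opencv-contrib `xfeatures2d` module, this sketch uses ORB as a freely available stand-in detector; the feature count, the cross-check matching and the cap of 50 matches are illustrative choices, not my exact settings.

```python
import cv2
import numpy as np

def align_features(channel, reference):
    """Estimate a translation as the average displacement of matched keypoints.
    Both inputs are expected to be 8-bit grayscale images."""
    detector = cv2.ORB_create(nfeatures=2000)   # stand-in for SURF
    kp_ref, des_ref = detector.detectAndCompute(reference, None)
    kp_ch, des_ch = detector.detectAndCompute(channel, None)

    # Brute-force matching with cross-check keeps only mutual best matches.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_ref, des_ch), key=lambda m: m.distance)[:50]

    # Average the (x, y) displacement between matched keypoint positions.
    shifts = [np.subtract(kp_ref[m.queryIdx].pt, kp_ch[m.trainIdx].pt) for m in matches]
    dx, dy = np.mean(shifts, axis=0)
    return int(round(dy)), int(round(dx))
```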

The following figures show the keypoints selected in the different color channels and the correspondences between them:

Matched feature points between R and G channels

 

Matched feature points between R and B channels

 

The final result is shown in the following table (click for actual-size pictures).




G Translation = [-411 , -467] , B Translation = [-595 , 63]

 

As shown, despite the large differences between the R, G and B channels, the quality of the result is significantly better than with the multi-scale SSD approach.

 

And last but not least (!!!), this is me, separated into three channels and put back together as a color image!




G Translation = [0 , -25] , B Translation = [-2 , -7]