The goal of this project is to reconstruct an RGB version of a scene from three monochromatic images taken separately from (hopefully) the same position.
To do so, one can search for the (x, y) translations of two channels over a third that minimize a given metric. In this work I applied the Sum of Squared
Differences between two channels, and the results are shown below.
Results: Single Scale Approach
For the alignment of small images I perform a simple Sum of Squared Differences (SSD) between two channels, where one is fixed and the other is translated
vertically and horizontally over the interval [-15, 15]. To obtain better results, the borders of the images are cropped before this comparison. The details of how
the crop is done are shown in the "Additional Credits" section.
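The exhaustive search described above can be sketched as follows. This is a minimal illustration, not the exact code used in the project; the function names are my own, and `np.roll` (which wraps pixels around) stands in for whatever translation the original used.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized channels."""
    return np.sum((a.astype(float) - b.astype(float)) ** 2)

def align_ssd(moving, fixed, max_shift=15):
    """Find the (dy, dx) shift of `moving` over `fixed` that minimizes
    the SSD, searching exhaustively in [-max_shift, max_shift]."""
    best_cost, best_shift = np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            cost = ssd(shifted, fixed)
            if cost < best_cost:
                best_cost, best_shift = cost, (dy, dx)
    return best_shift
```

The returned shift is the one that, applied to the moving channel, best overlays it on the fixed channel.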
Results: Multiple Scales Approach
To deal with the larger pictures I implemented the suggested idea of a pyramid of images: the image is reduced to 10%, 20% and 30% of
its original size. To compare the cost of translating images of different sizes, I scale the SSD by the square of the scale factor. The translation with the
smallest cost is then scaled back to the original image size in order to obtain the desired translation.
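The pyramid step can be sketched as below. This is a simplified assumption-laden version: downscaling is approximated by integer-stride subsampling (steps 10, 5 and 3, roughly the 10%/20%/30% levels of the report), and the scale normalization is expressed as a per-pixel SSD, which is equivalent to scaling the SSD by the square of the scale factor.

```python
import numpy as np

def align_ssd(moving, fixed, max_shift=15):
    """Exhaustive SSD search; returns the best (dy, dx) and its cost."""
    best_cost, best_shift = np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            cost = np.sum((shifted - fixed) ** 2)
            if cost < best_cost:
                best_cost, best_shift = cost, (dy, dx)
    return best_shift, best_cost

def align_pyramid(moving, fixed, steps=(10, 5, 3)):
    """Run the SSD search at each downscaled level, normalize the cost
    so levels are comparable, and rescale the winning shift back to
    the full-resolution image."""
    best_cost, result = np.inf, (0, 0)
    for step in steps:
        m = moving[::step, ::step].astype(float)
        f = fixed[::step, ::step].astype(float)
        (dy, dx), cost = align_ssd(m, f)
        cost /= m.size  # per-pixel cost, comparable across scales
        if cost < best_cost:
            best_cost, result = cost, (dy * step, dx * step)
    return result
```

Multiplying the per-level shift by the subsampling step is what "scales the translation back" to the original resolution.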
Results: Multiple Scales Approach (10 extra images from lcweb2.loc.gov)
Here are the results on more images from the provided dataset.
Results: Put yourself in the shoes of Prokudin-Gorskii!
Here are the results of applying the technique on images I took myself.
Additional Credits(1): Cropping borders
To get rid of the borders I implemented a discontinuity detector in the vertical and horizontal directions. For every image channel I compute the sum
of intensities of each column. I then check which of the columns in the first 10% and the last 10% of the image columns pass the following test:
where "column_i" is the column index and "all_columns" is the vector containing the sum of every column. I take the rightmost passing column on the left
border and the leftmost passing column on the right border, and crop the image at those locations. In the following image the blue curve shows the column
sum along the x axis; the red lines indicate where the image was cropped.
The following pictures are the result of applying the automatic crop algorithm.
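A sketch of this detector is shown below. Since the report's exact per-column test is given only in the formula above, the criterion here (a column-sum jump larger than the mean jump plus two standard deviations) is a hypothetical stand-in, as are the `frac` and `k` parameters; only the horizontal crop is shown, the vertical one being symmetric.

```python
import numpy as np

def crop_borders(channel, frac=0.10, k=2.0):
    """Crop left/right borders at column-sum discontinuities.

    A column in the first or last `frac` of the image is flagged when
    the jump between consecutive column sums exceeds the mean jump by
    more than `k` standard deviations (hypothetical test). The image
    is cropped at the rightmost flagged column on the left border and
    the leftmost flagged column on the right border."""
    sums = channel.sum(axis=0).astype(float)
    jumps = np.abs(np.diff(sums))          # discontinuity between columns i and i+1
    thresh = jumps.mean() + k * jumps.std()
    n = channel.shape[1]
    lo, hi = int(n * frac), int(n * (1 - frac))
    left_hits = [i + 1 for i in range(lo) if jumps[i] > thresh]
    right_hits = [i + 1 for i in range(hi, n - 1) if jumps[i] > thresh]
    left = max(left_hits) if left_hits else 0
    right = min(right_hits) if right_hits else n
    return channel[:, left:right]
```

On a channel with uniformly dark border strips, the two large jumps at the strip edges are the only columns flagged, so the slice keeps exactly the interior.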
Additional Credits(2): Channel alignment by measuring image sharpness
One can measure image sharpness by taking the average of the gradient magnitude. The idea is to measure the sharpness of an image composed of two superimposed channels.
Ideally, two perfectly aligned channels present high sharpness, since the edges are better defined than with two badly aligned channels. The results show that this
metric estimates a better alignment for most of the images.
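The sharpness criterion can be sketched as follows. Here "superimposing" two channels is assumed to mean averaging them, and the search loop mirrors the SSD one but maximizes mean gradient magnitude instead; both choices are my assumptions, not necessarily the report's exact implementation.

```python
import numpy as np

def sharpness(img):
    """Sharpness score: average gradient magnitude over the image."""
    gy, gx = np.gradient(img.astype(float))
    return np.mean(np.hypot(gx, gy))

def align_by_sharpness(moving, fixed, max_shift=15):
    """Pick the shift whose superimposed image (here: the average of
    the two channels) has the highest sharpness score."""
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            overlay = (np.roll(moving, (dy, dx), axis=(0, 1)) + fixed) / 2
            score = sharpness(overlay)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

Averaging two misaligned channels blurs the edges, which lowers the mean gradient magnitude, so the score peaks near the correct alignment.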
Additional Credits(3): Colorization with deep learning
Here I used a deep neural network to estimate colors from a greyscale version of the RGB output of the algorithm. The results can be seen below.
Source: Colorize-photos