HW1: Colorizing the Russian Empire

Overview

The purpose of this homework is to automatically produce a color image from the digitized Prokudin-Gorskii glass plate images with as few visual artifacts as possible. Each glass plate records three exposures of a scene, taken through a red, a green, and a blue filter. To generate a color image, we extract the three color channels from the glass plate, then place them one above the other and align them so that the combination forms a single RGB color image. We assume that a simple (x, y) translation is sufficient for alignment. Because the plate images are large, we also try to make the alignment fast and efficient.

Methods

Each glass plate image contains the three channel exposures stacked vertically; from top to bottom the filters are B, G, and R. First, we divide the original image into three images of equal size. Then we align the G and R channels to the B channel.
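The channel split can be sketched as follows. This is a minimal Python illustration rather than the MATLAB code used for the report; the function name split_plate and the choice to discard leftover rows at the bottom are assumptions.

```python
def split_plate(plate):
    """Split a stacked plate (a list of pixel rows) into B, G, R thirds.

    The order top to bottom is B, G, R, as described in the text.
    Rows left over when the height is not divisible by 3 are dropped
    (an assumption; the report does not say how they are handled).
    """
    height = len(plate) // 3
    b = plate[:height]
    g = plate[height:2 * height]
    r = plate[2 * height:3 * height]
    return b, g, r


# Tiny 7-row stand-in for a plate: row i is filled with the value i.
plate = [[i] * 4 for i in range(7)]
b, g, r = split_plate(plate)
```

With the 7-row example, each channel gets 2 rows and the last row is discarded, so all three channels have the same size.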

Our first attempt compares the whole images pixel by pixel. We measure the distance between two images by the sum of squared differences (SSD). The translation is found by exhaustive search over a window of [-range, range] pixels in each direction; the required range depends on the image size and on the displacement between the two images. For the alignment to succeed automatically, the range usually has to be large when the image is large, so this simple alignment is usually slow.
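A minimal sketch of this exhaustive search, written in plain Python rather than the report's MATLAB. The helper names (circshift, ssd, align_single_scale) are hypothetical, and the circular shift is assumed to follow the convention of MATLAB's circshift, with the first component moving rows and the second moving columns.

```python
def circshift(img, dy, dx):
    """Circularly shift a 2-D list down by dy rows and right by dx columns."""
    h, w = len(img), len(img[0])
    return [[img[(y - dy) % h][(x - dx) % w] for x in range(w)]
            for y in range(h)]


def ssd(a, b):
    """Sum of squared differences between two equally sized images."""
    return sum((pa - pb) ** 2
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))


def align_single_scale(moving, base, search_range):
    """Try every shift in [-search_range, search_range]^2, keep the best."""
    best_shift, best_score = (0, 0), float("inf")
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            score = ssd(circshift(moving, dy, dx), base)
            if score < best_score:
                best_shift, best_score = (dy, dx), score
    return best_shift


# Synthetic check: displace a ramp image, then recover the displacement.
base = [[y * 8 + x for x in range(8)] for y in range(8)]
moving = circshift(base, -2, 1)
shift = align_single_scale(moving, base, 3)
```

The two nested loops visit (2 * search_range + 1)^2 candidate shifts, and each one costs a full pass over the image, which is why this approach becomes slow for large plates.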

Next, we use an image pyramid to align the images at several scales. At each scale we find the translation that minimizes the difference between the base image and the aligned image. That translation is then carried over to the next finer scale, so the translation accumulates from the coarsest scale to the original scale.
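The coarse-to-fine idea can be sketched as below, again in plain Python rather than MATLAB. Several details are assumptions that the report does not state: each level halves the resolution by 2x2 averaging, the estimate from a coarser level is doubled before being refined, and the refinement searches only +/-2 pixels around it. All function names are hypothetical.

```python
def circshift(img, dy, dx):
    """Circularly shift a 2-D list down by dy rows and right by dx columns."""
    h, w = len(img), len(img[0])
    return [[img[(y - dy) % h][(x - dx) % w] for x in range(w)]
            for y in range(h)]


def ssd(a, b):
    """Sum of squared differences between two equally sized images."""
    return sum((pa - pb) ** 2
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))


def search(moving, base, center, radius):
    """Exhaustive SSD search in a small window around a starting shift."""
    cy, cx = center
    best_shift, best_score = center, float("inf")
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            score = ssd(circshift(moving, dy, dx), base)
            if score < best_score:
                best_shift, best_score = (dy, dx), score
    return best_shift


def downsample(img):
    """Halve the resolution by averaging each 2x2 block (an assumption)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4
             for x in range(w)] for y in range(h)]


def align_pyramid(moving, base, min_size=8, radius=2):
    """Estimate the shift at the coarsest level, then refine level by level."""
    if min(len(base), len(base[0])) <= min_size:
        return search(moving, base, (0, 0), radius)
    coarse_dy, coarse_dx = align_pyramid(
        downsample(moving), downsample(base), min_size, radius)
    # One coarse pixel spans two fine pixels, so double the estimate.
    return search(moving, base, (2 * coarse_dy, 2 * coarse_dx), radius)


# Synthetic check on a 32x32 ramp displaced by 4 rows and 8 columns.
base = [[y * 32 + x for x in range(32)] for y in range(32)]
moving = circshift(base, -4, 8)
shift = align_pyramid(moving, base)
```

In this example the displacement of 8 columns lies outside the +/-2 window at full resolution, but it shrinks to 2 columns at the coarsest 8x8 level, so every level only needs the small search.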

Result

Single-scale alignment was applied to the low-resolution images, and multiscale alignment to the high-resolution images. We first show results for the small JPEG images, for which the single-scale method is efficient. However, it is impractical for the large images, which require searching over a large window [-n,n]: the number of candidate translations, and hence the running time, grows as O(n^2). To reduce the effect of noise, we apply a Gaussian low-pass filter to each channel. The translations are listed below. To reproduce a result, apply the translation to the equally divided channels; for example, the aligned R channel can be obtained with R = circshift(r,shiftR).
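As an illustration of that last step, the sketch below applies two shifts and stacks the channels into RGB pixels. It is plain Python, not the report's MATLAB; circshift here is a hypothetical re-implementation meant to mimic MATLAB's circshift, and it assumes each listed translation is ordered [rows, columns].

```python
def circshift(img, dy, dx):
    """Circularly shift a 2-D list down by dy rows and right by dx columns."""
    h, w = len(img), len(img[0])
    return [[img[(y - dy) % h][(x - dx) % w] for x in range(w)]
            for y in range(h)]


def compose_rgb(r, g, b, shift_r, shift_g):
    """Shift R and G onto the B channel and stack into (R, G, B) tuples."""
    r_aligned = circshift(r, *shift_r)
    g_aligned = circshift(g, *shift_g)
    return [[(r_aligned[y][x], g_aligned[y][x], b[y][x])
             for x in range(len(b[0]))] for y in range(len(b))]


# Toy 3x3 channels: the same bright dot, displaced differently in R and G.
b = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
g = [[0, 0, 0], [0, 0, 0], [0, 0, 1]]
r = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
rgb = compose_rgb(r, g, b, shift_r=(-1, -1), shift_g=(1, 1))
```

After the shifts, the dot lands on the same pixel in all three channels, so the top-left pixel of rgb is white and every other pixel is black.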

The multiscale method builds an image pyramid automatically, which speeds up processing. The number of pyramid levels depends on the original image size. To produce the color image automatically, the coarsest image in our pyramid is determined by the largest scale factor, i.e. s = 2.^round(log(max(size(image)/30))/log(2)). In our observation, adding pyramid levels is unnecessary once the image has shrunk to about 30x30: images smaller than this usually do not help much in finding the translation and may lead to a bad alignment.
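To make the scale formula concrete, here is a small Python transcription of it. The function name coarsest_scale and the example image sizes are hypothetical. Note that Python's round and MATLAB's round differ only when the value is exactly halfway between two integers, which is unlikely for a base-2 logarithm.

```python
import math


def coarsest_scale(height, width, target=30):
    """Power-of-two factor that shrinks the longer side to roughly `target`."""
    return 2 ** round(math.log2(max(height, width) / target))


# Hypothetical channel sizes, chosen only for illustration.
s_large = coarsest_scale(3200, 3800)   # log2(3800 / 30) is about 6.99
s_small = coarsest_scale(350, 400)     # log2(400 / 30) is about 3.74
```

For the 3200x3800 example the factor is 128, so the coarsest level has a longer side of about 3800 / 128, or roughly 30 pixels; for the 350x400 example the factor is 16, giving a longer side of 25 pixels.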

JPEG format - low resolution:

Aligned                                                                 Original


00106v.jpg R [ 9,-1] G [ 4, 1]


00757v.jpg R [ 5, 5] G [ 2, 3]


00888v.jpg R [13, 0] G [ 5, 1]


00889v.jpg R [ 5, 3] G [ 2, 2]


00907v.jpg R [ 5, 0] G [ 1, 0]


00911v.jpg R [13,-1] G [ 1,-1]


01031v.jpg R [ 4, 1] G [ 1, 1]


01657v.jpg R [12, 1] G [ 5, 1]


01880v.jpg R [14, 4] G [ 6, 2]


TIFF format - high resolution:

Aligned                                                                 Original


00029u.jpg R [93,33] G [40,15]


00087u.jpg R [107,55] G [48,38]


00128u.jpg R [51,38] G [34,25]


00458u.jpg R [90,32] G [42, 5]


00737u.jpg R [50,14] G [15, 6]


00822u.jpg R [123,33] G [57,24]


00892u.jpg R [58, 5] G [15, 2]


01043u.jpg R [11,17] G [-16,10]


01047u.jpg R [71,33] G [24,20]


More TIFF Images:

Aligned                                                                 Original


00055u.jpg R [98,13] G [43,13]

Works well on a broken glass plate!


00308u.jpg R [-74,-10] G [-83, 3]


Test on another image:

Aligned                                                                 Original


R [-92,18] G [-127, 3]

Bells & Whistles

Instead of comparing the raw images, we find the edges by filtering each channel with the Sobel and Prewitt operators and compare the edge images. Some results are as good as the pixel-by-pixel comparison; some are worse. Spurious edges are a likely cause of the bad alignments. For images that have clear edges, the Sobel and Prewitt operators work well.
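For reference, a Sobel gradient-magnitude filter can be sketched in plain Python as follows. This is not the MATLAB edge detection used for the report, which may threshold or post-process differently; leaving the one-pixel border at zero is an assumption, and the function name is hypothetical.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(img):
    """Gradient magnitude of a 2-D list; the 1-pixel border is left at 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(SOBEL_X[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = math.hypot(gx, gy)
    return out


# A vertical step edge: the left three columns are 0, the right three are 10.
step = [[0, 0, 0, 10, 10, 10] for _ in range(5)]
edges = sobel_magnitude(step)
```

The response is non-zero only in the two columns next to the step and zero in the flat regions. Aligning such edge maps ignores overall brightness differences between channels, but any edge that appears in only one channel also contributes to the score.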

Aligned                                                                 Original                                                                 Sobel and Prewitt

Good alignment with clear edges!


00128u.jpg R [53,38] G [37,27]

Edges cause wrong alignment!


00087u.jpg R [103,294] G [ 0,-42]


Conclusions

This is an interesting project that explores image alignment, including single-scale and multiscale alignment. The multiscale method may also be useful in other applications that process large images.