By Carl Ji

Overview

In this project, I developed an image processing pipeline to align the RGB panels of photos from Prokudin-Gorskii’s collection. The main challenge is implementing a search method that remains efficient for panels with large dimensions. In the following sections, I detail the iterative development of the pyramid search method, describe the image preprocessing techniques used, and present the results of the panel alignment process for our dataset.

Approach

Starting Off with…

…Directly stacking the panels on top of each other without employing any matching techniques.

I wanted to get a sense of the range of displacements we are dealing with for these panels. So, after loading the image as a NumPy ndarray, I separated it into three panels of equal height and stacked the panel matrices directly on top of each other. Let’s take a look:

image.png

There’s noticeable misalignment in the RGB panels. We can also see some dark borders on the sides of the images, resulting from cropping the panels.
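The split-and-stack step described above can be sketched roughly as follows. This is a minimal illustration rather than the project’s actual code: the function name `split_and_stack` is hypothetical, and the blue, green, red top-to-bottom panel order is an assumption about how the scans are laid out.

```python
import numpy as np

def split_and_stack(plate: np.ndarray) -> np.ndarray:
    """Split a vertically stacked scan into three equal-height panels
    and stack them into a single (height, width, 3) RGB array."""
    # Integer division drops any leftover rows when the height is not divisible by 3.
    panel_height = plate.shape[0] // 3

    # Assumption: panels appear top to bottom as blue, green, red.
    blue = plate[:panel_height]
    green = plate[panel_height:2 * panel_height]
    red = plate[2 * panel_height:3 * panel_height]

    # No alignment yet: this corresponds to a displacement of (0, 0) for every panel.
    return np.dstack([red, green, blue])
```

Because nothing is shifted here, any offset between the original exposures shows up directly as color fringing in the stacked result.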

For this image of a cathedral, the dimensions of the panels are relatively small, so we start by trying to align them with a brute-force search over a small region in the center.

Naive Matching

In order to do any matching, we first need metrics to measure how well 2 panels match up at a particular displacement.

<aside> 💡

To clarify, by displacement I mean how many pixels to move the misaligned panel relative to the base panel. For example, a displacement of (2, 3) means moving the misaligned panel 2 pixels right and 3 pixels down. Directly stacking 2 panels means a displacement of (0, 0).

</aside>
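To make the displacement convention concrete, here is a small sketch of how a (right, down) displacement could be applied to a panel. The helper `shift_panel` is a hypothetical name, and using `np.roll` is an assumption for illustration. Note that `np.roll` wraps pixels around the edges instead of discarding them.

```python
import numpy as np

def shift_panel(panel: np.ndarray, displacement: tuple) -> np.ndarray:
    """Shift a 2D panel by (dx, dy): dx pixels right and dy pixels down."""
    dx, dy = displacement
    # Axis 0 indexes rows (vertical), axis 1 indexes columns (horizontal),
    # so the vertical shift dy comes first in the shift tuple.
    return np.roll(panel, shift=(dy, dx), axis=(0, 1))
```

With this convention, `shift_panel(panel, (2, 3))` moves the pixel at row 0, column 0 to row 3, column 2, and `shift_panel(panel, (0, 0))` leaves the panel unchanged.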

I decided to implement both Euclidean Distance and Normalized Cross-Correlation metrics.

Euclidean Distance

The Euclidean distance between 2 matrices measures how different the matrices are element-wise. A lower Euclidean distance means that the elements of the 2 matrices are more similar, which implies better alignment of the panels.

Here’s the formula I’ve used:

$$ \text{Euclidean Distance} = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} \left( \text{image1}_{ij} - \text{image2}_{ij} \right)^2} $$
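The formula translates fairly directly into NumPy. The sketch below is one possible implementation, not necessarily the one used in this project. The function name `euclidean_distance` and the cast to `float64` are my own assumptions; the cast is there because subtracting unsigned integer images directly would wrap around instead of producing negative differences.

```python
import numpy as np

def euclidean_distance(image1: np.ndarray, image2: np.ndarray) -> float:
    """Square root of the sum of squared element-wise differences."""
    # Cast to float first so that uint8 subtraction cannot wrap around.
    diff = image1.astype(np.float64) - image2.astype(np.float64)
    return float(np.sqrt(np.sum(diff ** 2)))
```

Two identical panels give a distance of 0, and the value grows as the panels differ more.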

Normalized Cross-Correlation

The NCC is a measure of similarity between 2 matrices, which are treated as vectors for this calculation. We first flatten the matrices to vectorize them, followed by normalization.
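A minimal sketch of the flatten-then-normalize steps is shown below. The final dot product is the standard way to complete an NCC computation and is assumed here, since it is not spelled out above. The function name `ncc` is hypothetical, and the sketch assumes neither panel is entirely zero (otherwise the normalization would divide by zero). Some NCC variants also subtract the mean before normalizing; this one does not.

```python
import numpy as np

def ncc(image1: np.ndarray, image2: np.ndarray) -> float:
    """Dot product of the two images after flattening and unit-normalizing."""
    # Flatten each matrix into a 1D vector of floats.
    v1 = image1.astype(np.float64).ravel()
    v2 = image2.astype(np.float64).ravel()

    # Normalize each vector to unit length (assumes non-zero norm).
    v1 = v1 / np.linalg.norm(v1)
    v2 = v2 / np.linalg.norm(v2)

    # For non-negative pixel values the result lies between 0 and 1.
    return float(np.dot(v1, v2))
```

Unlike the Euclidean distance, a higher value here indicates a better match, so a search using this metric would maximize it rather than minimize it.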