Project by Romi Elbaz

Depth-map Super Resolution from a Single Image

Based on: "Super-Resolution from a Single Image", Glasner & Irani, ICCV '09

Abstract

Inexpensive 3D cameras such as the Microsoft Kinect are becoming increasingly available for various low-cost applications. However, the images acquired by these cameras suffer from low spatial resolution as well as inaccurate depth measurements.
In the paper "Super-Resolution from a Single Image" [1], Glasner et al. propose a fast and effective super resolution method for natural images. Their method does not rely on an external database or prior examples; instead it exploits patch redundancy within the original low resolution image itself. In this project we implement this approach and extend it to depth images.

Implementation Details

The super resolution algorithm presented in the paper was implemented and is available on the website. The outline below summarises the proposed approach along with some key insights.

Single Image Super Resolution Algorithm

```
Task: Reconstruct a high resolution image H = I_n
Input:
  - Low resolution image L = I_0
  - Scale factor α and the number of resolution levels n
    (final magnification factor S_n = α^n)

For each resolution level I_l, l ∈ {1,...,n} do
  For each previously reconstructed level I_m, m ∈ {1,...,l-1} do
    1) Employ in-scale patch redundancy:
       For each pixel in I_m, find the k nearest sub-pixel aligned patches within I_m,
       yielding a set of linear equations on the pixel values of I_l
       (see equation (1) in the paper).
    2) Employ cross-scale patch redundancy:
       For each patch in I_m, find the approximate k nearest patches within the cascade
       of downscaled images I_d, d ∈ {2m-l,...,m-1}, and take their parent patches
       in I_m, yielding a set of linear equations on the pixel values of I_l
       with respect to the appropriate relative blur kernel.
    3) Solve the LS problem iteratively:
       Collect the obtained weighted linear equations into a least squares problem
       and solve it iteratively by a gradient method.
  End for
End for

Output: Final high resolution image H = I_n
```
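The coarse-to-fine structure of the algorithm can be sketched in Python as follows. This is a minimal illustration, not the full implementation: `upscale` is a nearest-neighbour stand-in for the paper's bicubic initialisation, and the refinement of steps 1-3 is left as a stub.

```python
import numpy as np

def upscale(img, s):
    """Nearest-neighbour stand-in for bicubic upscaling (illustration only)."""
    h, w = img.shape
    rows = np.minimum((np.arange(int(h * s)) / s).astype(int), h - 1)
    cols = np.minimum((np.arange(int(w * s)) / s).astype(int), w - 1)
    return img[rows][:, cols]

def single_image_sr(low_res, alpha=1.25, n=4):
    """Coarse-to-fine pyramid: each level I_l is initialised from I_{l-1}
    and would then be refined with the in-scale / cross-scale constraints
    of steps 1-3 (omitted here)."""
    levels = [low_res.astype(float)]            # I_0 = L
    for l in range(1, n + 1):
        level = upscale(levels[-1], alpha)      # initial guess for I_l
        # steps 1-3: gather patch correspondences against the lower levels,
        # build the weighted linear system, and solve for the refined I_l
        levels.append(level)
    return levels[-1]                           # H = I_n, magnification ~ alpha^n
```

The pyramid uses several small scale steps rather than one large one, so each level is reconstructed from images that are only slightly coarser than itself.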

In stage 1, sub-pixel alignment can be approximated by running ANN (approximate nearest neighbour) search on patches extracted from a grid S_{l-m} times finer (where S_{l-m} is the relative scale factor between I_l and I_m), computed by bicubic interpolation.
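A brute-force version of this patch search can be sketched as follows. The sketch makes two simplifying assumptions: a bilinear upsampler stands in for bicubic interpolation, and exhaustive matching stands in for an ANN library; the patch size `p` is also a hypothetical choice.

```python
import numpy as np

def bilinear_upsample(img, s):
    """Bilinear stand-in for the paper's bicubic interpolation onto a finer grid."""
    h, w = img.shape
    ys = np.linspace(0, h - 1, int(round(h * s)))
    xs = np.linspace(0, w - 1, int(round(w * s)))
    y0 = np.floor(ys).astype(int); x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1); x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None];        wx = (xs - x0)[None, :]
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bot = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

def k_nearest_patches(query, img, k=5, p=5):
    """Exhaustive k-NN patch search by SSD (an ANN library would be used
    in practice). Returns top-left corners and SSD scores, best first."""
    h, w = img.shape
    coords, dists = [], []
    for i in range(h - p + 1):
        for j in range(w - p + 1):
            d = float(np.sum((img[i:i + p, j:j + p] - query) ** 2))
            coords.append((i, j)); dists.append(d)
    order = np.argsort(dists)[:k]
    return [coords[t] for t in order], [dists[t] for t in order]
```

Running the search on `bilinear_upsample(I_m, s)` instead of `I_m` itself is what gives the sub-pixel alignment: matches that fall between pixels of the original grid land on integer positions of the finer grid.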

In stages 1 and 2, the constraints must be weighted according to the patch similarity score, so that the resulting linear system consolidates the matched patches.
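One way to realise this weighting is to scale each equation by a Gaussian function of its patch-similarity (SSD) score and solve the resulting least squares problem by gradient descent. The following sketch illustrates the idea; the exact weighting function, `sigma`, and step-size choice are assumptions, not the paper's precise formulation.

```python
import numpy as np

def solve_weighted_ls(A, b, ssd, sigma=1.0, iters=2000):
    """Minimise sum_i w_i * (A_i x - b_i)^2 with w_i = exp(-ssd_i / (2 sigma^2)),
    using plain gradient descent (the paper solves its LS system iteratively)."""
    w = np.sqrt(np.exp(-ssd / (2.0 * sigma ** 2)))  # sqrt: rows scale the residual
    As, bs = A * w[:, None], b * w
    G = As.T @ As
    lr = 1.0 / np.linalg.eigvalsh(G)[-1]            # step size 1/lambda_max ensures convergence
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x -= lr * (G @ x - As.T @ bs)               # gradient of 0.5 * ||As x - bs||^2
    return x
```

Each row of `A` expresses one matched-patch constraint on the unknown high resolution pixels; poorly matching patches (large SSD) receive exponentially small weights and barely influence the solution.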

Results

RGB super-resolution:

Figure 1: Comparison of RGB images upscaled by a factor of 4.

Extension to Depth maps

To verify the assumption that depth images contain repetitive visual data, the statistical examination presented in [1] was repeated for RGBD images.
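The patch-recurrence statistic can be estimated along the following lines. This is a simplified sketch: exhaustive search, mean-removed patches, nearest-neighbour downscaling, and an arbitrary SSD threshold stand in for the exact protocol of [1].

```python
import numpy as np

def downscale(img, s):
    """Nearest-neighbour downscaling (illustration only)."""
    h, w = img.shape
    rows = (np.arange(int(h * s)) / s).astype(int)
    cols = (np.arange(int(w * s)) / s).astype(int)
    return img[rows][:, cols]

def recurrence_rate(img, scale=0.75, p=5, thresh=1.0):
    """Fraction of p x p patches of `img` that have a match in the
    downscaled image within SSD `thresh` (mean-removed patches)."""
    small = downscale(img, scale)
    def patches(im):
        h, w = im.shape
        ps = np.array([im[i:i + p, j:j + p].ravel()
                       for i in range(h - p + 1) for j in range(w - p + 1)])
        return ps - ps.mean(axis=1, keepdims=True)   # remove DC component
    P, Q = patches(img), patches(small)
    hits = sum(1 for q in P if np.min(np.sum((Q - q) ** 2, axis=1)) <= thresh)
    return hits / len(P)
```

A high recurrence rate at coarser scales is what justifies the cross-scale constraints of stage 2; the examination in Figure 2 repeats this measurement for the depth channel.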

Figure 2: Comparison of the patch redundancy for RGB images and RGBD images at different scales.

Evaluation (after minor adjustments to parameter choice)

Quantitative evaluation:

| Method | 2X Cones | 2X Teddy | 2X Tsukuba | 2X Venus | 4X Cones | 4X Teddy | 4X Tsukuba | 4X Venus |
|---|---|---|---|---|---|---|---|---|
| Nearest neighbour | 1.094 | 0.815 | 0.612 | 0.268 | 1.531 | 1.129 | 0.833 | 0.368 |
| Mac Aodha [2] | 1.127 | 0.825 | 0.601 | 0.276 | 1.504 | 1.026 | 0.833 | 0.337 |
| Hornacek et al. [2] | 0.994 | 0.791 | 0.580 | 0.257 | 1.399 | 1.196 | 0.727 | 0.450 |
| My result | 1.114 | 0.817 | 0.585 | 0.257 | 1.539 | 1.167 | 0.855 | 0.346 |

Table 1: Root mean squared error (RMSE) scores for different RGBD images from the Middlebury database.
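For reference, the RMSE metric reported in Table 1 is computed as follows (a sketch; the mask handling is an assumption about how pixels without ground-truth depth are treated in the evaluation):

```python
import numpy as np

def rmse(pred, gt, valid=None):
    """Root mean squared error over valid depth pixels; `valid` masks out
    pixels with no ground-truth value (assumed convention)."""
    if valid is None:
        valid = np.ones(gt.shape, dtype=bool)
    diff = pred[valid].astype(float) - gt[valid].astype(float)
    return float(np.sqrt(np.mean(diff ** 2)))
```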

Qualitative evaluation:

Figure 3: 2X super resolution of images from the Middlebury database. Images were upscaled by bicubic interpolation and by the implemented patch-redundancy method. The insets show close-ups of the super-resolved depth maps; edges are sharper and clearer.