Scene completion using millions of photographs

 

Students: Or Cohen & Alona Zadneprovski      Project Supervisor: Gur Harary

 


 

 

Introduction

 

What can we do with a million photographs? In this project we present a new image completion algorithm that uses a huge database of one million photos. The algorithm fills in gaps or holes in a photo by finding a similar image region in a known, very large database. The project is an implementation of the paper "Scene Completion Using Millions of Photographs" by James Hays and Alexei A. Efros of Carnegie Mellon University.

 

Every once in a while we would like to erase something from our photographs: a garbage truck in a beautiful Italian piazza, an old ex-boyfriend in a family photo, or a construction site in the middle of a church. Missing data in a photo is another problem: we would like to complete hidden or damaged areas, such as crumbling yellowed edges, a finger accidentally caught in the frame, or a hole in a reconstructed 3D image. Our goal is to fill in the missing or damaged area with new visual information, without the viewer noticing the difference.

 

 

Up until now there have been two main approaches to image completion:

1.       Filling in the blanks with what should have been there. In other words, using concrete information about the scene beyond the original photograph. This can come from additional snapshots taken from other angles or from video footage of the scene.

2.      Filling the hole with what could have been there. The most popular technique in this approach uses similar areas within the photo itself to fill in the missing region. It works well because the photo itself offers the best chance of finding similar textures at the right scale and lighting.

 

The second approach produces convincing results, except when the area we want to complete is not surrounded by similar textures, or when the data in the original image is simply insufficient. In this project we follow the second approach, but complete the missing data from a million-photo database rather than from the image itself. This makes it possible to complete images that the within-image approach cannot handle. The end result presents the user with several options for completing the input photo and lets them pick the one they like best.

 

The Algorithm

 

Step 1: Removing the unwanted element

In this step the user marks the area of the photo to remove; the selection boundary is drawn through the user's control points using the natural cubic spline algorithm.
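As an illustration (the project itself was implemented in MATLAB), this step can be sketched in Python: clicked control points are closed into a smooth curve with a periodic cubic spline and rasterized into a binary mask. The function name and sampling density below are our own choices, not part of the original implementation.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from matplotlib.path import Path

def selection_mask(points, shape, samples=400):
    """Close user-clicked (x, y) control points with a periodic cubic
    spline and rasterize the enclosed region into a boolean mask."""
    pts = np.vstack([points, points[:1]])            # close the curve
    t = np.linspace(0.0, 1.0, len(pts))
    spline = CubicSpline(t, pts, bc_type='periodic')
    curve = spline(np.linspace(0.0, 1.0, samples))   # dense (x, y) samples
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    grid = np.column_stack([xs.ravel(), ys.ravel()])
    return Path(curve).contains_points(grid).reshape(h, w)
```

The returned mask marks the pixels to be removed and replaced in the following steps.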

 


Step 2: Scene & color descriptors

In step 2 the original image is summarized by scene and color descriptors. We then scan our database of a million photos for images with similar scenes. The 200 images with the most similar scene descriptors are passed on to the next step.
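The paper's scene descriptor is the GIST descriptor combined with color information; as a rough, hypothetical stand-in, the sketch below averages the image over a coarse grid and ranks the database by SSD between descriptors. Both function names and the grid size are illustrative choices.

```python
import numpy as np

def scene_descriptor(img, grid=4):
    """Toy stand-in for the GIST + color descriptor: average the image
    over a coarse grid so similar scene layouts land near each other."""
    h, w, c = img.shape
    desc = np.zeros((grid, grid, c))
    for i in range(grid):
        for j in range(grid):
            block = img[i * h // grid:(i + 1) * h // grid,
                        j * w // grid:(j + 1) * w // grid]
            desc[i, j] = block.mean(axis=(0, 1))
    return desc.ravel()

def top_matches(query, database, k=200):
    """Rank database images by SSD between descriptors; keep the best k."""
    d = scene_descriptor(query)
    dists = [np.sum((scene_descriptor(im) - d) ** 2) for im in database]
    return np.argsort(dists)[:k]
```

In the real system the descriptors are precomputed for the whole database, so only the k-nearest-neighbor search runs per query.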


 


Step 3: Locating the optimal strip

In step 3 we form a strip, 80 pixels wide, around the cut made in the original photo. We then find the optimal placement of this strip (in terms of SSD) within each of the 200 candidate images. Finally, the 20 best images are chosen based on a weighted combination of their strip SSD and scene-descriptor distance.
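A minimal sketch of the strip search, assuming no rescaling of the candidates: slide the source strip over every valid offset in the candidate image and keep the offset with the lowest SSD over strip pixels. This brute-force loop illustrates the idea; the actual implementation is vectorized.

```python
import numpy as np

def best_placement(src, cand, strip_mask):
    """Slide src's border strip over a (larger) candidate image and
    return the (dy, dx) offset with the smallest SSD over strip pixels."""
    h, w = strip_mask.shape
    H, W = cand.shape[:2]
    m = strip_mask[..., None]          # broadcast mask over color channels
    best, best_off = np.inf, (0, 0)
    for dy in range(H - h + 1):
        for dx in range(W - w + 1):
            win = cand[dy:dy + h, dx:dx + w]
            ssd = np.sum(((win - src) * m) ** 2)
            if ssd < best:
                best, best_off = ssd, (dy, dx)
    return best_off, best
```

Here `strip_mask` is True on the 80-pixel band around the hole and False elsewhere, so only strip pixels contribute to the score.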

 

Step 4: Optimal stitch

In this step we compute the optimal seam that joins the original photo with each of the 20 candidate completions, using the "Fast approximate energy minimization via graph cuts" algorithm. This step effectively "pastes" the two images together, with the seam between them still visible.
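The project uses a graph-cut minimization; as a much simpler illustration of the same idea, the sketch below finds a minimum-cost vertical seam through the overlap's per-pixel difference map with dynamic programming (an image-quilting-style seam, not the actual graph cut).

```python
import numpy as np

def min_cost_seam(cost):
    """Dynamic-programming stand-in for the graph cut: find the vertical
    path of least total cost through a non-negative difference map."""
    h, w = cost.shape
    acc = cost.astype(float).copy()
    for i in range(1, h):                  # accumulate cheapest path costs
        for j in range(w):
            lo, hi = max(j - 1, 0), min(j + 2, w)
            acc[i, j] += acc[i - 1, lo:hi].min()
    seam = [int(np.argmin(acc[-1]))]       # backtrack from cheapest endpoint
    for i in range(h - 2, -1, -1):
        j = seam[-1]
        lo, hi = max(j - 1, 0), min(j + 2, w)
        seam.append(lo + int(np.argmin(acc[i, lo:hi])))
    return seam[::-1]
```

Pixels left of the seam come from the original photo and pixels right of it from the candidate, so the cut runs where the two images already agree most.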

 

 

Step 5: Image blending

In this step the seam in each of the 20 completed images is blended, giving the result a more natural look that deceives the eye.
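Seams can be hidden in several ways (the paper itself uses Poisson blending); the sketch below shows a simpler feathered blend, softening the binary paste mask with a Gaussian so the transition is gradual. The function and its `sigma` parameter are illustrative choices, not the project's actual blending code.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def feather_blend(base, patch, mask, sigma=8.0):
    """Lightweight stand-in for Poisson blending: blur the binary paste
    mask so the patch fades into the base over ~sigma pixels."""
    alpha = gaussian_filter(mask.astype(float), sigma)
    alpha = np.clip(alpha, 0.0, 1.0)[..., None]   # per-pixel blend weight
    return alpha * patch + (1.0 - alpha) * base
```

A feathered blend can leave visible color shifts across the seam, which is why gradient-domain (Poisson) blending gives the more natural result reported in the paper.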


 


Step 6: 20 completions and final result

In this step we present the user with the 20 completions, letting them choose the one they like best.



Tools

 

1.       Python Flickr API – the software interface we used to build the million-photo database.

2.      MATLAB – all stages of the algorithm were implemented in MATLAB, which accounts for the long run-times.

 

Results

 


Original image

 

Unwanted part of the image

 

The Good results!

 

The Bad results!

 


Original image

 

Unwanted part of the image

 

The Good results!

 

The Bad results!

 

 

 

Conclusions

 

The average run time of the algorithm is about three hours. On average, the algorithm produces 7 good results (semantically compatible and acceptable to the viewer), 7 average ones (convincing at first glance but less so on closer inspection) and 7 bad ones (illogical and obviously fake to the viewer). To improve the results, the photo database (currently half a million photos) should be enlarged. To improve the run time, the number of compatible scenes can be reduced from 200 to 100. In addition, a stage could be added in which the user rules out incompatible scenes before they are pasted into the original photo. Despite all of the above, the quality of the results is very good and can definitely fool the human eye.

 

Links

 

Project Presentation

Project Book

Code