Scene completion using millions of
photographs
Students: Or Cohen & Alona Zadneprovski
Project Supervisor: Gur Harary
Introduction
What can we do with a million photographs? In this project we present
a new image completion algorithm that uses a huge database of one million photos.
The algorithm fills gaps or holes in a photo by finding a similar image region
in a known, very large database. The project is an implementation of the
paper "Scene Completion Using Millions of Photographs" by
James Hays and Alexei A. Efros of Carnegie Mellon
University.
Every once in a while we would like to
erase something from our photographs: a garbage truck in a beautiful
Italian piazza, an ex-boyfriend in a family photo, or a construction
site in the middle of a church. Missing data in a photo is another problem. We
would like to complete hidden or damaged areas, such as crumbling, yellowed edges,
a finger accidentally caught in the frame, or a hole in a reconstructed 3D
image. Our goal is to fill in the missing or damaged area with new visual
information so that the result looks seamless.
Until now, there have been two
different approaches to image completion:
1.
Filling in the hole with what should
have been there. In other words, using concrete information about the scene
from outside the original photograph. This can be done with additional snapshots
from other angles or with video footage of the scene.
2.
Filling in the hole with what could
have been there. The most popular technique within this approach uses similar
regions of the photo itself to fill the missing area. It works well
because the photo itself offers the highest probability of finding similar
textures at the right scale and lighting.
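The within-image idea in approach 2 can be sketched as a simple exemplar search: scan the known part of the image for the patch that best matches the pixels around the hole. This is a toy single-patch version under our own assumptions (all names are illustrative, and real inpainting methods fill the hole gradually, patch by patch, rather than in one shot):

```python
import numpy as np

def best_source_patch(img, hole_mask, patch=8):
    """Find the fully-known patch whose pixels best match (by sum of
    squared differences) the known pixels of the patch covering the hole.
    Toy single-patch version of within-image completion."""
    h, w = img.shape
    # Target: a patch-sized window anchored at the hole's first pixel.
    ty, tx = np.argwhere(hole_mask)[0]
    ty, tx = min(ty, h - patch), min(tx, w - patch)
    target = img[ty:ty + patch, tx:tx + patch]
    known = ~hole_mask[ty:ty + patch, tx:tx + patch]

    best, best_err = None, np.inf
    for y in range(h - patch + 1):
        for x in range(w - patch + 1):
            if hole_mask[y:y + patch, x:x + patch].any():
                continue  # source patch must lie entirely in known pixels
            cand = img[y:y + patch, x:x + patch]
            err = ((cand - target)[known] ** 2).sum()
            if err < best_err:
                best, best_err = (y, x), err
    return best
```

When the hole sits in a repetitive texture, some fully-known patch matches the surrounding context almost perfectly; when it does not, the search fails to find a convincing source, which is exactly the failure mode described next.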
The second approach produces pleasing
results, except when the area we want to complete is not surrounded
by similar textures, or when the data in the original image is simply
insufficient. In this project we follow the second approach, but
complete the missing data with the help of a million-photo database instead
of the input photo alone. This algorithm can complete images that could not
be completed from the photo's own content. The end result presents the user
with several candidate completions of the input photo and lets him pick the
one he likes best.
The
Algorithm
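The retrieval step of the Hays and Efros algorithm matches the input photo against the database with a global scene descriptor and keeps the nearest scenes. As a rough illustration, here is a minimal sketch in Python; the tiny average-pooled descriptor below is a crude stand-in for the GIST descriptor used in the paper, and all function names are our own:

```python
import numpy as np

def scene_descriptor(img, size=8):
    """Crude global scene descriptor: average-pool the grayscale image
    down to size x size and flatten.  A stand-in for the GIST descriptor
    used in the original paper."""
    h, w = img.shape[:2]
    ys = np.arange(size + 1) * h // size
    xs = np.arange(size + 1) * w // size
    pooled = np.array([[img[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].mean()
                        for j in range(size)] for i in range(size)])
    return pooled.ravel()

def nearest_scenes(query, database, k=200):
    """Return indices of the k database images whose descriptors are
    closest (L2 distance) to the query image's descriptor."""
    q = scene_descriptor(query)
    dists = [np.linalg.norm(scene_descriptor(img) - q) for img in database]
    return np.argsort(dists)[:k]
```

In the full algorithm the retrieved scenes are then locally aligned to the context around the hole and composited into the input photo with seam finding and blending; those later stages are beyond this sketch.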
Tools
1. Python +
Flickr API - a software interface we used to build the million-photo database.
2. Matlab - all of the stages of the algorithm
were implemented in Matlab, which accounts for the
long run-times.
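For the database-building step, a minimal sketch of how one might turn Flickr photo metadata into downloadable image URLs with the flickrapi Python package. The API key, tags, and helper name are placeholders of ours, not the project's actual script; the URL format follows Flickr's documented static-photo pattern:

```python
def photo_url(photo, size="m"):
    """Build a static Flickr image URL from a photo metadata dict,
    following Flickr's documented pattern (server id, photo id, secret,
    optional size suffix)."""
    return "https://live.staticflickr.com/{server}/{id}_{secret}_{size}.jpg".format(
        server=photo["server"], id=photo["id"],
        secret=photo["secret"], size=size)

# Fetching the metadata itself requires a real API key (sketch only):
# import flickrapi
# flickr = flickrapi.FlickrAPI(API_KEY, API_SECRET, format='parsed-json')
# resp = flickr.photos.search(tags='landscape', per_page=500, page=1)
# urls = [photo_url(p) for p in resp['photos']['photo']]
```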
Results
[Result images omitted. Two examples are shown, each with: the original
image, the unwanted part of the image, the good results, and the bad
results.]
Conclusions
The average
run time of the algorithm is about three hours. On average, the algorithm
produces 7 good results (semantically compatible and acceptable to the
viewer), 7 average ones (convincing at first glance but less so on closer
inspection), and 7 bad results (illogical and obviously fake to the viewer).
To improve the results, one must enlarge the photo database (about half a
million photos so far). To improve the run time, one can reduce the number
of compatible scenes retrieved from the database (currently 200).
Links