Tour into the Picture: Single View Modeling

Castro Cabrera, Ramses


Description of the project

The goal of this project is to create a simple, planar 3D scene from a single photograph. The project will follow the description in Tour into the Picture by Horry et al. in modeling the scene as a 3D axis-parallel box. First, we will let the user specify simple constraints on that box (the back wall plus the vanishing point). Then, it’s just a matter of extracting the coordinates of the box in 3D, and texture-mapping the faces of the box. The paper has a rather poor description of the process, so consult the lecture notes.

Approach

The main idea of this project is to create a 3D model from a 2D picture. The input is a single picture or photograph of a 3D scene, from which we want to make a computer animation. We then specify one “virtual” vanishing point for the scene. This vanishing point serves as the reference point for the 3D model and also helps locate the points needed to compute the homographies.
Rays from the vanishing point give us points at the limits of the image, so we end up with two groups of four points that we can use to build planes from the 2D picture. The next step is to generate those planes from the points with the help of the homographies, which gives us a 3D box that we can "explore".
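As a rough illustration of how the vanishing point yields points at the limits of the image, the sketch below extends the ray that starts at the vanishing point and passes through one corner of the drawn rectangle until it reaches the image border. The function name `ray_to_border`, the coordinate convention (origin at the top-left corner, image spanning [0, width] x [0, height]) and the sample numbers are assumptions made for this example, not the code used in the project.

```python
def ray_to_border(vp, corner, width, height):
    """Extend the ray vp -> corner until it hits the image border.

    Assumes the vanishing point vp lies inside the image rectangle
    [0, width] x [0, height] and that corner differs from vp.
    """
    vx, vy = vp
    dx, dy = corner[0] - vx, corner[1] - vy
    if dx == 0 and dy == 0:
        raise ValueError("corner coincides with the vanishing point")

    # Ray parameter t at which each relevant border line is reached.
    candidates = []
    if dx > 0:
        candidates.append((width - vx) / dx)   # right border
    elif dx < 0:
        candidates.append((0 - vx) / dx)       # left border
    if dy > 0:
        candidates.append((height - vy) / dy)  # bottom border
    elif dy < 0:
        candidates.append((0 - vy) / dy)       # top border

    t = min(candidates)  # the first border reached along the ray
    return (vx + t * dx, vy + t * dy)


# Hypothetical 100 x 100 image with the vanishing point at its centre.
bottom_right = ray_to_border((50, 50), (60, 60), 100, 100)  # (100.0, 100.0)
left_side = ray_to_border((50, 50), (40, 55), 100, 100)     # (0.0, 75.0)
```

Repeating this for the four corners of the rectangle gives the second group of four points. A ray can leave through a side border rather than the top or bottom one, which is one way to see why the image may need extra working space around it.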



The process

1.- Have an image: Not just any image; it must have a 3D perspective, such as a room, a hall or even a road in nature.
2.- Create a vanishing point: This is the point where all the vectors will meet. We have to make sure this point is symmetric, or as close to symmetric as possible, with respect to the whole image; otherwise the points will not correspond and the matrix computation will fail.
3.- After creating these vectors, we have to draw a rectangular surface in the middle of the image that will become the bottom face of our box (the back wall of the scene), and we have to make the vectors align with the corners of the rectangle.


4.- We use this information to generate a second set of vectors that will guide the planes of the 3D model. We have four points from the rectangle we drew and four points where the vectors cross the image limits. That gives us a total of eight points, enough to create a box.
5.- Since we can't "fold" the image at the points on the image limits (because the regions would overlap and the matrix calculation would not work), we have to expand the image to give it a working space.

6.- With these points we can now compute a homography for each group of four points. Each homography distorts one "wall" of the box, depending on the depth given to the box.
7.- We finish by making these planes meet each other according to the coordinates of their corners. We now have a box built from a 2D picture that we can "explore" with camera commands.
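The homography in step 6 can be sketched as follows. This is a minimal example, assuming the standard four-point formulation with the last entry of the matrix fixed to 1; the function names (`solve_linear`, `homography`, `apply_h`) and the sample coordinates are hypothetical and are not taken from the project's code. It maps a trapezoid, such as a floor region seen in perspective, onto a rectangle that can serve as the texture of one face.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """Return the 3x3 matrix H (with H[2][2] = 1) mapping 4 src points to 4 dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h1 x + h2 y + h3) / (h7 x + h8 y + 1), and likewise for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_h(H, p):
    """Apply H to the 2D point p, including the perspective division."""
    x, y = p
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


# Hypothetical floor region in the image (a trapezoid) and the rectangle
# it should fill once rectified; the rectangle's height plays the role of depth.
floor_in_image = [(0, 100), (100, 100), (70, 60), (30, 60)]
floor_texture = [(0, 0), (100, 0), (100, 200), (0, 200)]
H = homography(floor_in_image, floor_texture)
```

One such matrix would be computed per face of the box. If three of the four points are nearly collinear the linear system becomes singular, and the sketch raises an error instead of returning an unusable matrix.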


The results

A room in a toon




A simple supermarket




A drawing




ULAVAL tunnel




The Forest




Details and difficulties

The details:
If the drawing is not symmetric (or at least does not look roughly symmetric), the matrix transformation will fail because the points do not correspond.
Drawing the bottom rectangle too small produces a bottom face with a lot of zoom: this face has to be deformed until it reaches the coordinates of the other faces, which results in excessive zoom.
Sometimes the picture makes it impossible to avoid including on one face information that belongs to another, for example taking as floor something that should stay on a side face, such as boxes, chairs, beds, etc.

The difficulties:
First of all, the biggest problem I had was finding the exact combination of vanishing point and rectangle that gives a computable matrix. Sometimes a change of even a single pixel moved the vectors a lot, producing a non-symmetric set of vectors. The next problem was generating a homography for each face of the box: even though I had the points, switching my thinking from 2D to 3D made it difficult to orient myself, assign values and find an order for the pairs of pixels. The last problem was changing the camera view, that is, changing the camera position and zoom and making the target "rotate". Each model is different, so there were times when, during a rotation, the camera ended up outside the model and its view was covered by one of the faces.
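One possible way to reduce the last problem is to keep the camera position inside the box before each view update. The sketch below is only an illustration under assumptions: the box is taken to be axis-aligned and described by two opposite corners, and the helper `clamp_camera` and its `margin` parameter are hypothetical names, not part of the project or of any MATLAB camera function.

```python
def clamp_camera(position, box_min, box_max, margin=0.05):
    """Clamp a camera position so that it stays inside an axis-aligned box.

    margin is the fraction of each box dimension kept free between
    the camera and the faces, so the camera never sits on a face.
    """
    clamped = []
    for p, lo, hi in zip(position, box_min, box_max):
        pad = margin * (hi - lo)
        clamped.append(min(max(p, lo + pad), hi - pad))
    return tuple(clamped)


# Hypothetical box of size 1 x 1 x 4; the requested position lies behind
# the open front of the box and is pulled back inside along the z axis.
safe_position = clamp_camera((0.5, 0.5, -2.0), (0, 0, 0), (1, 1, 4))
```

Clamping only constrains the position; the camera target would still need to be chosen per model, as described above.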



References

Youichi Horry, Ken-ichi Anjyo, Kiyoshi Arai, Tour Into the Picture: Using a Spidery Mesh Interface to Make Animation from a Single Image, SIGGRAPH 1997, http://graphics.cs.cmu.edu/courses/15-463/2011_fall/Papers/TIP.pdf
Ryoichi Mizuno, Tour into the picture, http://www.mizuno.org/gl/tip/
MathWorks, camdolly documentation, http://www.mathworks.com/help/matlab/ref/camdolly.html?refresh=true
Drawings:
https://www.google.ca/search?q=dibujo+pasillo&espv=2&biw=1517&bih=741&tbm=isch&tbo=u&source=univ&sa=X&ved=0ahUKEwiw7v-cr7HMAhXGyIMKHRrXAM0QsAQIGg&dpr=0.9#imgrc=2KGFD_UTk0068M%3A
https://www.google.ca/search?q=dibujo+pasillo&espv=2&biw=1517&bih=741&tbm=isch&tbo=u&source=univ&sa=X&ved=0ahUKEwiw7v-cr7HMAhXGyIMKHRrXAM0QsAQIGg&dpr=0.9#tbm=isch&q=toon+room&imgrc=gxljj_DB6t77iM%3A