There are many other applications for Reality Flythrough, ranging from improving the quality of life of the disabled to letting people fly ahead of their vehicles to see which traffic jams can be avoided.
Reality Flythrough (also known as Tele-Reality) was first described in the academic literature by Szeliski (94) as the ideal for immersive real-time live flythroughs through reality. Much research has been done since in texturing virtual reality with photos, but little has been accomplished in using live video feeds for the texturing. The main reason for this is that live texturing is a very difficult problem.
I propose to tackle this problem by assuming a high density of cameras and avoiding texturing altogether. Given enough cameras, it is possible to use the cameras alone to achieve Reality Flythrough simply by choosing the camera whose image is used for each frame of the rendered video.
My purpose for taking CSE167 was to learn how computer graphics could help with providing smooth transitions between cameras. This project is my first crack at providing these smooth transitions. I have made a number of simplifications in this proof of concept:
Despite these simplifications, the results are very exciting and suggest that still photographs can be a substitute for live video in many cases and can help reduce the requirement of having high camera density.
Disclaimer: I'm a grad student and this project has been supporting my research all quarter, so I've spent far more than the 20-30 hours suggested.
| Photo A: Pos(4.63, 5.00, 10.00) 35.0 deg | Photo B: Pos(4.63, 5.00, 10.00) 75.0 deg | Transition from A to B |
| Photo A: Pos(4.63, 5.00, 10.00) 35.0 deg | Photo I: Pos(13.63, 5.00, 4.00) 295.0 deg | Transition from A to I |
The screenshots don't really do it justice. You have to actually see the motion to appreciate it.
First download and install the app:
There are several different interfaces that can be used for exploring my living room. The simplest to use is one of the permutations of the "camera to camera" transitions.
There are three different kinds of camera to camera transitions. The default does a straight one to one transition. You select the transition mode by entering a number from 1 to 3. The modes are described below:
Like I said, it takes a lot of practice, and I don't have enough images of my living room to fill in all of the detail. This mode also reveals all of the problems with my algorithm for selecting the best camera at any position.
By now you probably have a pretty good sense of how the program works. The underlying idea is very simple. I paint each image in an appropriately sized rectangle at a position in space that is determined by treating the camera as a projector: I look directly down the camera's lens and essentially project the image onto a virtual wall that is "focus depth" feet away. The "focus depth" is precalculated by estimating the distance from the camera to the most dominant object in the image. (This is going to be one of the most difficult hurdles to overcome when dealing with live cameras whose position has not been predetermined.) Once all of the images have been drawn in space, performing a transition is simply a matter of moving the view from looking down one camera lens to looking down another. OpenGL takes care of the rest.
Almost. I spent some time playing around with different blending methods to make the transition appear smooth and to obscure some of the artifacts that arise from using 2D images to represent space. The blending method that I settled on turns off depth testing and draws the "TO" image on top of the "FROM" image, with the opacity of the "TO" image increasing from 40% to 100% over the course of the transition.
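The projector idea and the opacity ramp can be sketched in a few lines of plain math. This is only an illustration under stated assumptions, not the application's actual code: the function names, the heading convention (0 degrees looks down -z, y is up), the field-of-view and aspect parameters, and the linear shape of the opacity ramp are all hypothetical.

```python
import math

def image_quad(cam_pos, heading_deg, focus_depth, hfov_deg, aspect):
    """Corners of the rectangle an image is painted on, treating the camera
    as a projector aimed at a virtual wall focus_depth units away.

    Assumed conventions (not from the original project): y is up, heading 0
    looks down -z, headings increase toward +x, and the camera is level.
    Returns corners as bottom-left, bottom-right, top-right, top-left.
    """
    x, y, z = cam_pos
    h = math.radians(heading_deg)
    forward = (math.sin(h), 0.0, -math.cos(h))
    right = (math.cos(h), 0.0, math.sin(h))

    # The wall fills the camera's field of view at the focus depth.
    half_w = focus_depth * math.tan(math.radians(hfov_deg) / 2.0)
    half_h = half_w / aspect

    # Center of the wall, straight down the lens.
    cx = x + forward[0] * focus_depth
    cy = y
    cz = z + forward[2] * focus_depth

    corners = []
    for sx, sy in ((-1, -1), (1, -1), (1, 1), (-1, 1)):
        corners.append((cx + sx * half_w * right[0],
                        cy + sy * half_h,
                        cz + sx * half_w * right[2]))
    return corners

def to_image_opacity(t, start=0.4, end=1.0):
    """Opacity of the "TO" image at transition progress t in [0, 1].

    A linear ramp from 40% to 100% is assumed here.
    """
    t = min(max(t, 0.0), 1.0)
    return start + (end - start) * t
```

In an OpenGL renderer the four corners would become the vertices of a textured quad, and the opacity would be the alpha of the "TO" quad drawn with blending on and depth testing off.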
To handle multi-camera transitions, I draw a virtual line segment from the starting point to the ending point and find all cameras that are within a delta of the line segment. I then compute a fitness value for each camera based on distance and camera angle. Ordering by fitness, I place the cameras at the appropriate frames, making sure to keep the number of frames between each camera above a specified minimum.
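A rough sketch of that selection step follows, working in the ground plane. The fitness formula, the data layout, and all names here are assumptions made for illustration; the text above only says that fitness depends on distance and camera angle.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to segment ab, plus the position t in [0, 1]
    of the closest point along the segment."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    t = 0.0 if length_sq == 0 else ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = min(max(t, 0.0), 1.0)
    cx, cy = ax + t * dx, ay + t * dy
    return math.hypot(px - cx, py - cy), t

def angle_diff(a_deg, b_deg):
    """Smallest absolute difference between two headings, in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)

def plan_transition(cameras, start, end, view_deg, delta, total_frames, min_gap):
    """Pick cameras near the path and assign each one a frame.

    cameras is a list of (name, (x, z), heading_deg) tuples.
    Returns a list of (frame, name) pairs sorted by frame.
    """
    candidates = []
    for name, pos, heading in cameras:
        dist, t = point_segment_distance(pos, start, end)
        if dist > delta:
            continue  # too far from the path to be useful
        # Hypothetical fitness: higher when the camera is close to the path
        # and points roughly in the viewing direction.
        fitness = (1.0 / (1.0 + dist)) * (1.0 - angle_diff(heading, view_deg) / 180.0)
        frame = round(t * (total_frames - 1))
        candidates.append((fitness, frame, name))

    # Best cameras claim their frames first; weaker ones are dropped if they
    # would land closer than min_gap frames to a camera already placed.
    candidates.sort(reverse=True)
    placed = []
    for fitness, frame, name in candidates:
        if all(abs(frame - f) >= min_gap for f, _ in placed):
            placed.append((frame, name))
    return sorted(placed)
```

Dropping a weaker camera outright is only one way to enforce the minimum spacing; shifting it to the nearest free frame would be another.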
The walk forward transition is composed of three of the multi-camera transitions: one that does the initial rotation along with as much of the translation as completes in that time; one that does the remaining translation; and one that does the final rotation.
The bulk of the work for this project was spent on making the application scalable. The goal is to support thousands of live video cameras, so I had to make special considerations for memory utilization and speed. One example of this is how I handle texture maps. I have a texture map resource allocator, and currently only two textures are created. Each time a camera gains focus, it acquires one of the texture maps and loads its current image into it (remember, the goal is to support live video, so the images will always be changing). Because speed is an issue when loading these images, I do not use gluBuild2DMipmaps(). I also do not use gluScaleImage() to get my image size to be a power of 2. Instead, I place my image in the upper left of a block of memory that is large enough to hold an image whose width and height are powers of 2, and then specify the right and bottom coordinates of my texture map as the ratio of the original image size to the new image size.
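The padding trick can be sketched as follows. This is a minimal illustration, assuming tightly packed RGB pixel data stored top row first; the function names are hypothetical and the actual upload call (for example glTexImage2D) is left out.

```python
def next_power_of_two(n):
    """Smallest power of two that is greater than or equal to n."""
    p = 1
    while p < n:
        p *= 2
    return p

def pad_to_power_of_two(pixels, width, height, bytes_per_pixel=3):
    """Copy an image into the upper left of a power-of-two sized buffer.

    pixels is tightly packed, row-major, top row first (an assumption).
    Returns (padded_bytes, tex_w, tex_h, u_max, v_max), where u_max and
    v_max are the texture coordinates of the image's right and bottom edges.
    """
    tex_w = next_power_of_two(width)
    tex_h = next_power_of_two(height)
    padded = bytearray(tex_w * tex_h * bytes_per_pixel)

    row_bytes = width * bytes_per_pixel
    for y in range(height):
        src = y * row_bytes
        dst = y * tex_w * bytes_per_pixel
        # Copy one source row; the rest of the destination row stays zero.
        padded[dst:dst + row_bytes] = pixels[src:src + row_bytes]

    return bytes(padded), tex_w, tex_h, width / tex_w, height / tex_h
```

For a 640x480 frame this gives a 1024x512 texture with the image's right and bottom edges at texture coordinates 0.625 and 0.9375, so no rescaling pass is needed on each new frame.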