Autostitching
and
Image Mosaics




Table of Contents

1. Perspective Warp
2. Harris Corner Detector and ANMS
3. Extracting and Matching Feature Descriptors
4. Image Mosaics
5. Cylindrical Mapping and 360° Panorama
6. Thank you!








1. Perspective Warp


The homography is a very versatile operation. Thanks to it, I saw what Picasso saw, and saw Aristotle seemingly say something important to Plato.
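At its core, recovering a homography from point correspondences is a small least-squares problem. The sketch below is illustrative rather than my exact code: it fixes h33 to 1 and solves the standard direct linear system for the remaining eight unknowns.

```python
import numpy as np

def compute_homography(src, dst):
    """Estimate the 3x3 homography H mapping src -> dst (N >= 4 point pairs).

    Each correspondence (x, y) -> (u, v) contributes two rows of the
    linear system A h = b, with h33 fixed to 1; the system is solved
    in the least-squares sense.
    """
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.extend([u, v])
    h, *_ = np.linalg.lstsq(np.array(A, float), np.array(b, float), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)
```

With four well-spread correspondences the system is exactly determined; with more, least squares averages out small localization errors.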






Nature morte au compotier, Picasso (1915)



The School of Athens, Raphael (1511)








2. Harris Corner Detector and ANMS


I find it somewhat funny that the first recursive function I wrote in a while was one implementing Adaptive Non-Maximal Suppression (ANMS). On reflection, this makes sense: after a semester of relying on NumPy, I developed somewhat of an aversion to loops of all sorts. That said, ANMS was also the slowest function I had written in a while. Having to search through all local maxima, with no NumPy magic trick to save the day, one thing was certain: the process was going to be slow. I therefore deviated slightly from Brown et al. (2005) by introducing an additional parameter to ANMS, namely "greediness": the higher the greediness, the fewer points are checked for neighbors within the suppression radius. In practice this worked well for testing, but for the final results I performed a full search (the generated panoramas looked a bit better with it).

Otherwise, I set the c_robust parameter to 0.9, as Brown et al. suggest, and begin searching from a radius of 0. The following figures demonstrate the results of the Harris corner detector, and how the points can be spread more evenly by using ANMS. Notice that when c_robust is very small, as in the last figure, the procedure may not always reach the desired number of points. This is because the distribution of keypoints may be such that many points within a very large radius (one that practically spans the entire image) lie above a certain intensity, and therefore never fail the c_robust test. The goal for the last figure was to reach 112 points, but with a c_robust of 0.3, the process halted at 204 points: those 204 points all had intensities above 0.3 times the intensity of the maximum, and therefore could not be eliminated. On the other hand, setting c_robust above 1 also leads to problems, but of a different type. In that case, points with greater intensities (and therefore more likely to be features) may be eliminated before points with lesser intensities (and therefore lower likelihood of being features) within a given radius. In practice, though, I observed that setting c_robust very high, for instance at 1.5, did not make much of a difference, because the algorithm begins by considering the maxima anyway. I would expect this problem to become apparent only if a large number of maxima were clustered in a small region.
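The core of ANMS can be sketched in a few lines. This is a plain O(n²) version without the greediness shortcut described above; the function and variable names are illustrative, not my exact code.

```python
import numpy as np

def anms(coords, strengths, n_keep, c_robust=0.9):
    """Adaptive Non-Maximal Suppression (after Brown et al., 2005).

    For each corner, the suppression radius is the distance to the
    nearest corner that "robustly" dominates it, i.e. whose strength,
    scaled by c_robust, still exceeds this corner's strength. Keeping
    the n_keep points with the largest radii spreads the corners
    evenly over the image.
    """
    coords = np.asarray(coords, float)
    strengths = np.asarray(strengths, float)
    radii = np.full(len(coords), np.inf)   # global maxima keep radius = inf
    for i in range(len(coords)):
        stronger = strengths[i] < c_robust * strengths
        if stronger.any():
            d = np.linalg.norm(coords[stronger] - coords[i], axis=1)
            radii[i] = d.min()
    keep = np.argsort(-radii)[:n_keep]
    return coords[keep]
```

The greediness parameter would simply subsample which points get the inner distance check, trading spread quality for speed.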





Harris Corner Detection, >2000 points


Harris Corner Detection, >2000 points


c_robust = 1.5, 255 points


c_robust = 0.9, 255 points


c_robust = 0.3, 255 points


c_robust = 1.5, 112 points


c_robust = 0.9, 112 points


c_robust = 0.3, 204 points








3. Extracting and Matching Feature Descriptors


Talk as much as you want about Aristotle and Plato, but for this project I owe my money to Pythagoras.





Randomly Sampled Feature Vector


Randomly Sampled Feature Vector


Randomly Sampled Feature Vector


Randomly Sampled Feature Vector


Randomly Sampled Feature Vector


Randomly Sampled Feature Vector


Randomly Sampled Feature Vector


Randomly Sampled Feature Vector


For the 1-NN/2-NN threshold (a.k.a. Lowe's ratio), I chose 0.4, as this appeared to be the inflection point beyond which comparatively more outliers than inliers would be selected.
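Concretely, the ratio test keeps a match only when the nearest descriptor is much closer than the second nearest. A brute-force sketch with illustrative names (and Pythagoras doing the heavy lifting in the distance computation):

```python
import numpy as np

def match_features(desc1, desc2, ratio=0.4):
    """Match flattened feature vectors with Lowe's 1-NN/2-NN ratio test.

    desc1: (N, D) array, desc2: (M, D) array, M >= 2.
    Returns (i, j) index pairs where the best match in desc2 is
    sufficiently better than the second best.
    """
    # squared Euclidean distances between all descriptor pairs
    d2 = ((desc1[:, None, :] - desc2[None, :, :]) ** 2).sum(-1)
    nn = np.argsort(d2, axis=1)[:, :2]              # two nearest neighbors
    rows = np.arange(len(desc1))
    best, second = d2[rows, nn[:, 0]], d2[rows, nn[:, 1]]
    keep = np.sqrt(best) < ratio * np.sqrt(second)  # Lowe's ratio test
    return [(i, nn[i, 0]) for i in np.flatnonzero(keep)]
```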





Feature matching with 8 points.


Feature matching with 313 points.







4. Image Mosaics


For this assignment, I took pictures of the Acropolis, and also used pictures I had from the woods near Lake Tahoe in the Sierra Nevada and from the coast of Lake Balaton in Hungary. To stitch images together, I used a two-level Laplacian stack, followed by simple weighted alpha blending for the overlapping parts of the images. For the most part, this removed the seams joining the images and blended their colors together, but not always: I noticed that pictures with highly varying exposure were very hard to blend, and almost all techniques failed on them. Other things I tried were white balancing under the grey-world assumption, and histogram equalization. The following figures illustrate some of the results.
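The two-band idea can be sketched as follows, assuming two images already warped into alignment and a weight mask alpha in [0, 1]. This uses a Gaussian blur to split the bands; it is a sketch of the technique, not my exact pipeline.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def two_band_blend(im1, im2, alpha, sigma=5):
    """Two-band blend of two pre-aligned images.

    Low frequencies are mixed smoothly with the alpha mask, while high
    frequencies come from whichever image dominates at each pixel:
    smooth colors transition gradually, but fine detail stays crisp
    instead of ghosting across the seam.
    """
    low1, low2 = gaussian_filter(im1, sigma), gaussian_filter(im2, sigma)
    high1, high2 = im1 - low1, im2 - low2
    low = alpha * low1 + (1 - alpha) * low2      # smooth low-frequency mix
    high = np.where(alpha > 0.5, high1, high2)   # hard switch for detail
    return low + high
```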





















↥ Photos Taken






Acropolis Panorama, Manual



Acropolis Panorama, Automatic



Tahoe Panorama, Manual



Tahoe Panorama, Automatic



Balaton Panorama, Manual



Balaton Panorama, Automatic

Notice how automatic stitching is slightly better, and can stitch more images.






Acropolis Panorama, Low Frequencies



Acropolis Panorama, High Frequencies



Acropolis Panorama, Two-Band Blend



Tahoe Panorama, Two-Band Blend








5. Cylindrical Mapping and 360° Panorama


As I wanted my panorama to look a bit better, I began looking into Bells & Whistles. Reading further, I realized that most pictures we take have imperfections, even if we tend to think of them as rectilinear projections. From the looks of it, my pictures suffered from pincushion distortion at different levels of severity. This was primarily evident in images where the Acropolis was away from the center. To that end, I looked into cylindrical mapping. After quite a lot of experimentation with different types of grids and mapping functions, I realized that I simply had to project each individual image onto a sphere, and then reproject it back to a plane rectilinearly.

This all sounded good on paper, until I was faced with an important problem: I didn't know the focal length at which my pictures were taken. Unfortunately, the EXIF data I was able to extract contained only the FocalLength (17/4), not the focal-plane resolutions. As such, I used trial and error to approximate the camera intrinsic matrix. Long story short, after a few trials it worked, and I was very excited! Notice how the cylindrical mapping addresses the "ghosting" effect around the Acropolis, and makes the terrain take on a much more natural form:
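For reference, the inverse map for a cylindrical warp can be sketched as below, assuming the principal point sits at the image center and a focal length f in pixels found by trial and error, as described above. Names are illustrative.

```python
import numpy as np

def cylindrical_warp_coords(h, w, f):
    """Inverse map for a cylindrical projection (sketch).

    For each pixel (xc, yc) of the cylindrical output image, compute
    the source pixel (x, y) in the original rectilinear image; these
    coordinates would then feed a standard interpolation step.
    """
    yc, xc = np.indices((h, w), dtype=float)
    theta = (xc - w / 2) / f           # angle around the cylinder axis
    hcyl = (yc - h / 2) / f            # height along the cylinder
    x = f * np.tan(theta) + w / 2      # reproject onto the image plane
    y = f * hcyl / np.cos(theta) + h / 2
    return x, y
```

Getting f wrong changes how strongly straight lines bow, which is exactly what the "low/medium focal length" figures below show.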





Acropolis, Cylindrical Mapping



Acropolis, Cylindrical Mapping + Two-Band Blend



Acropolis, Very Low Focal Length



Acropolis, Low Focal Length



Acropolis, Medium Focal Length








6. Thank you!


Overall, I really liked this project, because it made me read up on projective geometry, solidify my understanding of RANSAC, and appreciate the versatility of the homography. Most importantly, I now understand some of the research I do in Extended Reality and Robotics a little better. I also can't express how much I liked that this project was about synthesizing what we had already learned in projects 1-3. There is something amazing about reading a paper, implementing it from scratch, and then reflecting on how it fits into the broader picture.

Thank you for your effort running this class, the projects have been amazing!






I accidentally passed in the points of a set of planarly-mapped images
alongside cylindrically-mapped images to my stitching algorithm.

What followed had no precedent.

Chaos 





♡☮


CS 194-26: Computer Vision and Computational Photography (Fall 2020)