- The Point Correspondence Problem
- Real-World Applications of Point Correspondence
- Algorithmic Innovation = Breakthrough in Computer Vision
- Use Cases for this Breakthrough Algorithm
- Novel Methods for Using Point Correspondence in TechSee
- Vision is the Future
Ever wonder how the human brain can compute distance? How far away is the highway exit, when exactly to step off the escalator, the exact position of the lips you want to kiss? This essential capability is known as depth perception.
When both eyes focus on an object, each eye sees it from a slightly different angle. The difference between the two images, known as retinal disparity, is what allows the human brain to perceive depth.
While the human brain resolves the difference between the two images automatically, a computer vision system must explicitly find the points in one image that correspond to the same physical points in another image. This technique is known as point correspondence or point matching.
Note that there are multiple methods of achieving the same goal of depth estimation using special hardware, such as LiDAR and IR cameras. While these methods do not require point correspondence, they are much more costly than “normal” cameras and are often not feasible at scale.
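To make the link between disparity and depth concrete, here is a minimal sketch of the standard stereo relation Z = f · B / d. It assumes two rectified pinhole cameras with a focal length in pixels and a baseline in meters. The function name and all numbers are illustrative assumptions, not values from any particular system.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth in meters of a point seen by two rectified pinhole cameras."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px


# Illustrative values: the same scene point appears at x = 420 px in the
# left image and x = 400 px in the right image, so the disparity is 20 px.
x_left, x_right = 420.0, 400.0
depth = depth_from_disparity(focal_px=800.0, baseline_m=0.1,
                             disparity_px=x_left - x_right)
print(f"estimated depth: {depth:.2f} m")
```

The formula only works if the two x coordinates really belong to the same scene point, which is exactly what point correspondence has to establish.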
The Point Correspondence Problem
This is easier said than done.
Point correspondence, or point matching, is a very complex problem in computer vision. When images are captured from different viewpoints or at different angles, corresponding points may look different from one image to the next.
Aside from differences caused by geometric transformations and motion-related photometric changes, other differences can be caused by changes in color, contrast, time, or texture between the two images. The problem is further compounded when the scene contains moving objects. These differences produce false matches that degrade the point correspondence. In fact, well-known computer vision researcher Takeo Kanade once quipped that the three fundamental problems of computer vision are: “Correspondence, correspondence, and correspondence!”
Solving the correspondence problem is crucial for computer vision applications. Once image points are correctly matched, other techniques can be applied to the images to determine the exact position and movement of the corresponding points in the scene. For example, point matching is essential when creating a panoramic scene or performing image stitching, which joins two or more images that overlap only slightly. Sets of corresponding points in the overlapping region must be matched correctly for one image to be stitched to another properly.
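As a rough illustration of what a basic matcher does, the sketch below pairs feature descriptors from two images by nearest neighbor. It then filters the pairs with two common heuristics, a ratio test and a mutual-consistency check. This is a generic baseline, not the method discussed later in this article. The function name, the 0.8 threshold, and the toy 2-D descriptors are assumptions for the example.

```python
import numpy as np


def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Return (i, j) pairs where desc_a[i] is matched to desc_b[j]."""
    desc_a = np.asarray(desc_a, dtype=float)
    desc_b = np.asarray(desc_b, dtype=float)
    # Pairwise Euclidean distances, shape (len(desc_a), len(desc_b)).
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    best_b_for_a = dists.argmin(axis=1)
    best_a_for_b = dists.argmin(axis=0)
    matches = []
    for i, j in enumerate(best_b_for_a):
        row = np.sort(dists[i])
        # Ratio test: the best candidate must clearly beat the runner-up.
        distinctive = len(row) < 2 or row[0] < ratio * row[1]
        # Mutual check: descriptor j must also choose i as its nearest neighbor.
        if distinctive and best_a_for_b[j] == i:
            matches.append((i, int(j)))
    return matches


# Toy descriptors: the third entry of desc_a is equally close to two
# candidates in desc_b, so the ratio test rejects it as ambiguous.
desc_a = [[0, 0], [10, 10], [5, 5]]
desc_b = [[10.1, 9.9], [0.1, 0], [5, 0], [0, 5]]
matches = match_descriptors(desc_a, desc_b)
print(matches)
```

Heuristics like these discard ambiguous pairs, but they cannot catch every false match, which is why further geometric verification is usually needed.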
Real-World Applications of Point Correspondence
Accurate point matching is not just a key component of computer vision object detection algorithms. It has important implications for a wide range of real-world applications. Some examples:
In autonomous vehicle navigation, points from the vehicle’s laser scanners are matched to points on a grid-based reference map of the environment. The navigation system uses the matched points to calculate the vehicle’s position in the map. It also uses odometry to compute the vehicle’s change in position by measuring its movement. Inaccuracies in the vehicle’s point correspondence can lead to incorrect calculations with dire consequences.
3D object reconstruction is a key technology used in computer graphics-based video games and other virtual reality applications. The 3D model is built based on images taken from a wide range of angles and scales. In order to achieve seamless reconstruction results, point matching is used to estimate the spatial relationship between different images, thereby stitching the scenes together. Inaccuracies in determining point correspondence would skew the images and negatively affect the 3D reconstruction.
Robotic systems coupled with medical imaging techniques deliver many benefits in the operating theater. To align the image space with the physical space during medical image registration, point matching is used for localization. It finds the point pair correspondence between freely distributed fiducial markers in the image and the physical space. Inaccuracies in the point matching process would necessitate a time-consuming manual matching process, or worse.
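Once fiducial markers in the image have been paired with their physical counterparts, aligning the two spaces becomes a rigid registration problem. The sketch below uses the well-known SVD-based (Kabsch) least-squares solution. It is an assumed, generic approach rather than a description of any particular surgical system. The marker coordinates and function name are made up for illustration.

```python
import numpy as np


def rigid_register(image_pts, physical_pts):
    """Least-squares rotation R and translation t so that
    physical is approximately R @ image + t for each paired point."""
    P = np.asarray(image_pts, dtype=float)
    Q = np.asarray(physical_pts, dtype=float)
    p_mean, q_mean = P.mean(axis=0), Q.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (P - p_mean).T @ (Q - q_mean)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q_mean - R @ p_mean
    return R, t


# Hypothetical fiducial positions in image space (arbitrary units).
image_fiducials = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
# Simulated physical positions: rotate 90 degrees about z, then translate.
R_true = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], dtype=float)
t_true = np.array([10.0, -5.0, 2.0])
physical_fiducials = image_fiducials @ R_true.T + t_true

R, t = rigid_register(image_fiducials, physical_fiducials)
# Root-mean-square fiducial registration error; near zero for noise-free data.
residuals = image_fiducials @ R.T + t - physical_fiducials
fre = np.sqrt((residuals ** 2).sum(axis=1).mean())
print(f"registration error: {fre:.2e}")
```

The estimate is only as good as the pairing. If two markers are swapped in the correspondence, the recovered transform will be wrong even though the arithmetic is exact.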
In these examples, along with hundreds of others, any errors in point matching – known as outliers – can be a significant barrier to achieving the desired outcome.
Algorithmic Innovation = Breakthrough in Computer Vision
Recognizing the need to reduce outliers in computer vision point matching, I designed an algorithm that lowers the error rate in point correspondence. This research, conducted with Erez Farhan and Rami Hagege, was published as an academic paper in 2017.
The research notes that applying an affine transformation model (a geometric transformation that preserves lines and parallelism) to local regions is a particularly suitable approach for point matching. However, affine invariant methods have not been used extensively because they are computationally demanding and have limited accuracy.
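For readers unfamiliar with affine models, the following sketch fits a 2-D affine transformation to a handful of matched points using ordinary least squares. It is a textbook estimator shown for orientation only, not the estimation procedure from the paper. The function name and sample values are assumptions.

```python
import numpy as np


def estimate_affine(src, dst):
    """Least-squares 2-D affine map (A, b) with dst approximately A @ src + b.
    Needs at least three non-collinear point pairs."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # Each row is [x, y, 1]; solving X @ params = dst gives a (3, 2) matrix.
    X = np.hstack([src, np.ones((len(src), 1))])
    params, *_ = np.linalg.lstsq(X, dst, rcond=None)
    A = params[:2].T   # 2x2 linear part (rotation, scale, shear)
    b = params[2]      # translation
    return A, b


# Illustrative ground truth used to generate matched points.
A_true = np.array([[1.2, 0.3], [-0.1, 0.9]])
b_true = np.array([5.0, -2.0])
src = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)
dst = src @ A_true.T + b_true

A, b = estimate_affine(src, dst)

# Parallelism is preserved: two parallel edges of the unit square map to
# direction vectors whose 2-D cross product is zero.
edge_bottom = dst[1] - dst[0]
edge_top = dst[3] - dst[2]
cross = edge_bottom[0] * edge_top[1] - edge_bottom[1] * edge_top[0]
print(A, b, cross)
```

An affine map has six parameters, so three correct correspondences determine it exactly. Additional matches allow the fit to average out localization noise.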
The breakthrough algorithm represents a novel method for locating large numbers of local matches between images with highly accurate localization. The method is based on the accurate estimation of affine transformations, which are used to predict matching locations beyond the initially detected matches. The dramatically improved estimates of the affine transformations are then used to locally verify tentative matches efficiently. False matches are rejected, while correct matches that state-of-the-art methods would usually reject are retained with improved localization.
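The idea of using a local affine estimate to verify tentative matches and predict new ones can be illustrated with a deliberately simplified sketch. This is not the published algorithm. It assumes a few trusted seed matches in a local region, a plain least-squares affine fit, and a fixed pixel tolerance. All of these are assumptions chosen for the example.

```python
import numpy as np


def fit_affine(src, dst):
    """Least-squares affine (A, b) with dst approximately src @ A.T + b."""
    X = np.hstack([src, np.ones((len(src), 1))])
    params, *_ = np.linalg.lstsq(X, dst, rcond=None)
    return params[:2].T, params[2]


def verify_matches(A, b, src, dst, tol_px=2.0):
    """Boolean mask: True where dst lies within tol_px of the affine prediction."""
    predicted = src @ A.T + b
    return np.linalg.norm(predicted - dst, axis=1) <= tol_px


# Hypothetical trusted seed matches inside one local image region.
A_true = np.array([[1.1, 0.2], [-0.1, 0.95]])
b_true = np.array([30.0, 40.0])
seed_src = np.array([[0, 0], [10, 0], [0, 10], [10, 10]], dtype=float)
seed_dst = seed_src @ A_true.T + b_true
A, b = fit_affine(seed_src, seed_dst)

# Tentative matches in the same region; the second one is a simulated false match.
tentative_src = np.array([[5, 5], [2, 8], [7, 3]], dtype=float)
tentative_dst = tentative_src @ A_true.T + b_true
tentative_dst[1] += np.array([25.0, -15.0])
keep = verify_matches(A, b, tentative_src, tentative_dst)

# The same local model predicts where an unmatched point should appear.
new_point = np.array([4.0, 6.0])
predicted_location = A @ new_point + b
print(keep, predicted_location)
```

In this toy setup the verification step is a single matrix multiply per region, which hints at why local verification can be efficient once the affine estimates are accurate.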
Use Cases for this Breakthrough Algorithm
The algorithm’s main uses are in computer vision and augmented reality technologies, specifically visual object tracking and homography estimation.
Visual Object Tracking
Visual object tracking locates a specific object in every frame of a video, given only its location in the first frame and without access to the rest of the video in advance. Several factors can negatively affect tracking results, such as lighting changes, occlusions, changes in viewing angle, and the object leaving the field of view. A more accurate point correspondence technique improves the ability of computer vision algorithms to track objects accurately.
Homography Estimation
Homography estimation recovers the projective transformation that maps points on a planar surface in one image to the corresponding points in another image of the same surface. This can be used for navigation or to insert models of 3D objects into an image or video so that they are rendered with the correct perspective and appear to have been part of the original scene. A more accurate point correspondence technique improves the ability of computer vision algorithms to estimate homographies accurately.
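To show how matched points turn into a homography, the sketch below uses the classic direct linear transform (DLT) on four or more correspondences. It is a minimal, unnormalized version for illustration. It assumes the matches are already correct, and it omits the coordinate normalization and robust outlier handling that practical pipelines usually add. The function names and sample matrix are assumptions.

```python
import numpy as np


def estimate_homography(src, dst):
    """DLT estimate of the 3x3 matrix H with dst ~ H @ src in homogeneous
    coordinates. Needs at least four pairs, no three of them collinear."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear constraints on the nine entries of H per correspondence.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]  # fix the arbitrary scale (assumes H[2, 2] != 0)


def apply_homography(H, pts):
    """Map 2-D points through H, including the perspective division."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]


# Illustrative ground-truth homography and the corners of a 100x100 planar patch.
H_true = np.array([[1.0, 0.2, 5.0],
                   [0.1, 0.9, -3.0],
                   [0.001, 0.002, 1.0]])
corners = np.array([[0, 0], [100, 0], [100, 100], [0, 100]], dtype=float)
matched = apply_homography(H_true, corners)

H = estimate_homography(corners, matched)
# Any other point on the plane, such as the patch center, now maps correctly.
center = apply_homography(H, [[50, 50]])
print(H, center)
```

Because every row of the linear system comes directly from a matched pair, a single false match can distort the whole estimate, which is why accurate correspondence matters so much here.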
Novel Methods for Using Point Correspondence in TechSee
This research has been successfully applied to TechSee’s visual assistance technology.
Visual object tracking can be used to enhance remote support. For example, a technician who needs assistance with a complex cabling job can use remote visual assistance technology to enlist the aid of a remote expert. The remote expert can visually track the different cables and parts to help the technician resolve the issue. The technique can also be used for device recognition and job verification.
Homography is a necessary element of Augmented Reality (AR). The most advanced AR user manuals help auto-identify the problem and guide a customer to self-resolution. For example, specific vehicle apps can detect and analyze common vehicle issues. Car owners can point their smartphones at different parts of the car, at which point AR overlays will display more information, such as how to change the air filter, engine oil, and brake fluid.
Vision is the Future
My breakthrough algorithm both dramatically improves the results of local affine transformation estimations and locates a massive number of accurate point matches between images while eliminating outliers. This improvement has significant implications for a wide range of computer vision applications that rely on accurate object tracking and homography estimation. The future of customer service has just expanded.