KeySight: Mathematical Analysis to Increase Speed and Accuracy of Computer Stereo Vision Systems Using Multiple Cameras with Keypoint Feature Detection and KNN Matching
  • Satvik Mahendra
Plano West Senior High School

Corresponding Author:[email protected]


In this paper, a mathematical approach was used to improve stereo vision. Keypoints were used to speed up the matching of objects across multiple stereo images, and the use of multiple stereo images improved the accuracy of the system. A three-camera stereo vision system was built as a testbed to validate the approaches proposed in this paper. The research in this project offers a fast, lower-cost way to optimize stereo vision systems for uses such as household robotics and autonomous vehicles.
Stereo vision is the process of passively recovering object depth by comparing two images of the same scene taken by different cameras. The distance to an object is computed from the shift of that object between the two images. Objects that are closer in the scene have a larger shift between the images, while objects that are farther away have a smaller shift. This shift is known as the disparity. The larger the disparity, the closer the object.
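The inverse relationship between disparity and distance can be illustrated with the standard pinhole relation for a rectified camera pair, Z = f·B/d, where f is the focal length in pixels, B is the camera separation, and d is the disparity in pixels. The following is a minimal sketch only: the abstract does not state the exact formula or camera parameters used, so the function name and the focal length value are assumptions chosen for illustration.

```python
def depth_from_disparity(focal_px, baseline_in, disparity_px):
    """Distance (inches) from Z = f * B / d for a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_in / disparity_px


FOCAL_PX = 700.0    # assumed focal length in pixels (illustrative, not from the paper)
BASELINE_IN = 7.0   # camera separation in inches

# Larger disparity -> closer object; smaller disparity -> farther object.
for disparity in (70.0, 35.0, 10.0):
    depth = depth_from_disparity(FOCAL_PX, BASELINE_IN, disparity)
    print(f"disparity {disparity:5.1f} px -> depth {depth:6.1f} in")
```

Halving the disparity doubles the computed distance, which is why small disparities (distant objects) are the hardest to measure accurately.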
Sensitivity analysis was conducted on the stereo vision calculation formula to determine how the separation between two stereo cameras affects the system's ability to accurately compute the distance to objects. The results showed that as the separation between two stereo cameras increases, the accuracy and range of the distance calculation also increase. A three-camera system was built with camera pairs separated by 3, 4, and 7 inches.
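One way to see why a wider separation helps is to ask how much the computed distance changes when the measured disparity is off by a fixed fraction of a pixel. To first order, the standard relation Z = f·B/d gives a depth error of about Z²·Δd/(f·B), so the error shrinks as B grows. The sketch below is not the paper's analysis; the focal length, the half-pixel disparity error, and the function name are all assumed values for illustration.

```python
def depth_error(focal_px, baseline_in, true_depth_in, disparity_err_px):
    """Depth overestimate (inches) when disparity is measured too small by disparity_err_px."""
    true_disparity = focal_px * baseline_in / true_depth_in
    measured = true_disparity - disparity_err_px
    if measured <= 0:
        raise ValueError("disparity error is as large as the true disparity")
    return focal_px * baseline_in / measured - true_depth_in


FOCAL_PX = 700.0   # assumed focal length in pixels (illustrative)
ERR_PX = 0.5       # assumed disparity measurement error in pixels (illustrative)
DEPTH_IN = 100.0   # object distance in inches

# Same object, same pixel error, three camera separations.
errors = {b: depth_error(FOCAL_PX, b, DEPTH_IN, ERR_PX) for b in (3.0, 4.0, 7.0)}
for baseline, err in errors.items():
    print(f"baseline {baseline:.0f} in -> depth error {err:.2f} in at {DEPTH_IN:.0f} in")
```

Under these assumed numbers the 7-inch pair shows the smallest error of the three, consistent with the trend described above.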
Current approaches for calculating object distance compare the stereo images by matching a block of pixels in one image to a corresponding block in the other image to identify the same object in both images. This method, known as "block matching", is time consuming, which prevents its use on autonomous vehicles that rely on real-time information. SIFT keypoint feature detection and k-Nearest Neighbor (kNN) matching were studied and used to match objects between two images. Keypoints are points in an image with distinctive features, such as corners or edges, that can be identified in the corresponding stereo image. Each keypoint has its own descriptor, a numeric array that can be compared with the descriptors of keypoints in the corresponding stereo image in order to match them. This approach is faster because it only requires matching the important regions of the image. Using the three-camera testbed, data was collected in two environments: first in a noisy environment with many background objects, and second with the same objects against a plain background. The keypoint approach was faster and provided real-time data, and filtering the matches between keypoints led to more accurate results. Furthermore, the cameras with the largest separation (7 inches) detected the distances to objects farther away more accurately, while the cameras with 3 inches and 4 inches of separation could accurately find the distances only to nearby objects.
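The descriptor matching step can be sketched in plain Python. For each descriptor in the left image, the two nearest descriptors in the right image are found (k = 2), and a match is kept only if the nearest is clearly closer than the second nearest. This filter is the commonly used ratio test; the abstract does not say which filter was applied, so the ratio test, the 0.75 threshold, the function names, and the tiny four-element descriptors (real SIFT descriptors have 128 elements) are all assumptions for illustration.

```python
import math


def knn_match(desc_left, desc_right, k=2):
    """For each left descriptor, return its k nearest right descriptors
    as (distance, right_index) pairs sorted by distance."""
    matches = []
    for d in desc_left:
        dists = sorted((math.dist(d, r), j) for j, r in enumerate(desc_right))
        matches.append(dists[:k])
    return matches


def ratio_filter(matches, ratio=0.75):
    """Keep (left_index, right_index) pairs whose best match is clearly
    better than the second best (assumed ratio test)."""
    good = []
    for i, neighbors in enumerate(matches):
        if len(neighbors) < 2:
            continue
        (d1, j1), (d2, _) = neighbors[0], neighbors[1]
        if d1 < ratio * d2:
            good.append((i, j1))
    return good


# Made-up descriptors: left[1] is equally close to two right descriptors,
# so it is ambiguous and should be filtered out.
left = [(0, 0, 1, 0), (5, 5, 5, 5), (9, 0, 0, 1)]
right = [(0, 0, 1, 0.1), (9, 0.2, 0, 1), (5, 5, 5, 4), (5, 5, 4, 5)]

good_matches = ratio_filter(knn_match(left, right))
print(good_matches)
```

Only the unambiguous keypoints survive the filter, and the horizontal offset between each surviving pair would then serve as the disparity for the distance calculation.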
Results showed that in the noisier environment, the 3-inch and 4-inch camera separation systems exceeded an error of ±5 in when the object was 85 in away, while the 7-inch camera separation system exceeded an error of ±5 in when the object was 165 in away. In the less noisy environment, the 3-inch camera separation system exceeded an error of ±5 in when the object was 90 in away, the 4-inch camera separation system exceeded an error of ±5 in when the object was 100 in away, and the 7-inch camera separation system exceeded an error of ±5 in when the object was 200 in away.