B-Pose: Bayesian Deep Network for Accurate Camera 6-DoF Pose Estimation from RGB Images
Camera pose estimation has long relied on geometry-based approaches and sparse 2D-3D keypoint correspondences. With the advent of deep learning methods, the median error in estimating the camera pose parameters (i.e., the six parameters that describe position and rotation) has decreased from tens of meters to a few centimeters for indoor applications. For outdoor applications, errors can be quite large and depend heavily on variations in occlusion, contrast, brightness, repetitive structures, or blur introduced by camera motion. To address these limitations, we introduce B-Pose, a Bayesian convolutional deep network capable of not only automatically estimating the camera’s pose parameters from a single RGB image but also providing a measure of uncertainty in the parameter estimation. Reported experiments on outdoor and indoor datasets demonstrate that B-Pose outperforms state-of-the-art (SOTA) techniques and generalizes better to unseen RGB images. A strong correlation is shown between the prediction error and the model’s uncertainty, indicating that the prediction is almost always incorrect whenever the model’s uncertainty is high.
This research is supported by the Commonwealth of Australia as represented by the Defence Science and Technology Group of the Department of Defence.
Email Address of Submitting Author: aref.email@example.com
ORCID of Submitting Author: 0000-0001-9542-759X
Submitting Author's Institution: The University of Western Australia
Submitting Author's Country