Simultaneous Localization and Mapping (SLAM)

Simultaneous localization and mapping (SLAM) is the problem of building and updating a map of an unknown environment while simultaneously using that map to estimate the location of the robot within it. Current state-of-the-art SLAM algorithms build on computational geometry and robotic vision, while there are also interesting research studies on other sensing modalities. The term SLAM was coined in the paper [1] by a group of awesome researchers from the institute where I am currently pursuing my doctoral degree 🙂

Mathematical Problem Formulation

SLAM computes an estimate of the robot state xt and a map of the environment mt, given a series of controls ut and sensor observations ot over discrete time steps t. Since these quantities are probabilistic, the objective is to compute the posterior P(mt+1, xt+1 | o1:t+1, u1:t).
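To make that posterior concrete, here is a minimal sketch of my own (a toy, not anything from [1]): the world is a line of grid cells, the "map" is a single landmark cell, and one SLAM step is a motion prediction over the robot dimension followed by a measurement correction over the joint belief.

```python
# Toy 1D grid SLAM step: joint belief over (robot cell, landmark cell).
# All names and values here are hypothetical illustration, not a real system.
import numpy as np

N = 20
belief = np.zeros((N, N))
belief[0, :] = 1.0 / N                  # robot starts at cell 0, landmark unknown

def predict(belief, u, noise=0.1):
    """Motion update: shift the robot dimension by control u with some slip."""
    new = np.zeros_like(belief)
    for shift, w in [(u, 1 - noise), (u - 1, noise / 2), (u + 1, noise / 2)]:
        new += w * np.roll(belief, shift, axis=0)
    return new

def correct(belief, measured_offset, sigma=1.0):
    """Measurement update: the sensor reports (landmark cell - robot cell) plus noise."""
    robot, landmark = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    likelihood = np.exp(-0.5 * ((landmark - robot - measured_offset) / sigma) ** 2)
    belief = belief * likelihood
    return belief / belief.sum()

# One step: move one cell forward, then observe the landmark 5 cells ahead.
belief = correct(predict(belief, u=1), measured_offset=5)
print("most likely (robot, landmark):", np.unravel_index(belief.argmax(), belief.shape))
```

A single observation only pins down the relative offset; repeating the predict/correct cycle with more observations is what sharpens the joint belief over both the pose and the map.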

For example, let’s assume we are trying to explore an unknown environment. We would first look at what is around us; say there is a school. Then we move in some direction for a certain distance, see a different landmark like a park, and mentally build a map that tells us where we are and what is around us. So, in simple terms, we estimated our motion dynamics, calculated where we are from known landmarks, and updated our mental map with new observations. For a wheeled robot, we can get the motion details from wheel encoders or an inertial measurement unit, which lets us compute odometry. However, odometry alone accumulates error, so we need to keep updating the map while we are localising, simultaneously.
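The drift that motivates this is easy to reproduce. Below is a small dead-reckoning sketch (hypothetical numbers of my own) that integrates noisy odometry increments and compares the result to the true pose; the gap grows with distance travelled, which is exactly what loop closures and map updates are meant to correct.

```python
# Dead reckoning with noisy odometry: the estimate drifts away from the truth.
import numpy as np

rng = np.random.default_rng(0)
true_pose = np.array([0.0, 0.0, 0.0])   # x, y, heading
est_pose = true_pose.copy()
dt = 0.1

for _ in range(200):
    v, omega = 1.0, 0.05                # commanded forward speed and turn rate
    # ground-truth motion
    true_pose += dt * np.array([v * np.cos(true_pose[2]),
                                v * np.sin(true_pose[2]), omega])
    # odometry measures the same motion, but each increment carries noise
    v_meas = v + rng.normal(0, 0.05)
    w_meas = omega + rng.normal(0, 0.01)
    est_pose += dt * np.array([v_meas * np.cos(est_pose[2]),
                               v_meas * np.sin(est_pose[2]), w_meas])

print("accumulated position error:", np.linalg.norm(true_pose[:2] - est_pose[:2]))
```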

Loop closure is the core idea behind SLAM: when we recognise a previously visited location (coming back to the school after going to different places in the city, in the example above), it is important to update the belief accordingly, eliminating the errors that have accumulated due to noise during localisation and mapping. This can be performed by comparing two frames (image-image), comparing a frame to the map (image-map), or comparing locations in the map (map-map). While the fundamental research started in the robotics domain, SLAM has also shown promising results in augmented reality and embedded systems. Once a map has been created, path optimisation can be applied to find the shortest route to travel, saving power and time.
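To illustrate the image-image flavour of loop-closure detection, here is a toy sketch of my own (real systems typically use bag-of-words or learned place-recognition descriptors): each keyframe is summarised by a global descriptor, and a new frame is compared against older keyframes; a high similarity to a sufficiently old frame suggests we have returned to a previously visited place.

```python
# Toy image-to-image loop-closure check. The descriptor and thresholds are
# illustrative assumptions, not a production place-recognition pipeline.
import numpy as np

def global_descriptor(image):
    """Stand-in global descriptor: a normalised intensity histogram."""
    hist, _ = np.histogram(image, bins=32, range=(0, 256))
    return hist / (hist.sum() + 1e-9)

def detect_loop_closure(keyframe_descs, new_image, threshold=0.9, min_gap=10):
    """Return the index of a matching old keyframe, or None."""
    d_new = global_descriptor(new_image)
    candidates = keyframe_descs[:-min_gap] if len(keyframe_descs) > min_gap else []
    for idx, d_old in enumerate(candidates):
        sim = np.dot(d_new, d_old) / (np.linalg.norm(d_new) * np.linalg.norm(d_old))
        if sim > threshold:
            return idx
    return None

# Usage: appearance changes gradually along the path, then we revisit an early place.
frames = [np.full((64, 64), 8 * i, dtype=np.uint8) for i in range(30)]
keyframe_descs = [global_descriptor(f) for f in frames]
print(detect_loop_closure(keyframe_descs, frames[2]))   # -> 2, a loop closure
```

Once a closure candidate is accepted, the accumulated drift between the two matched poses is what gets distributed back through the trajectory and the map.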

In environments without enough distinctive salient features, spurious features and repetitive textures can reduce the robustness of the reconstruction. An alternative to feature-based matching is the family of direct methods, where the photometric error between images is minimised. While ORB-SLAM [2] is a well-known feature-based SLAM system, direct methods such as LSD-SLAM [3] and Direct Sparse Odometry [4] show promising results. The feature-based approach is less sensitive to variations in lighting and to large motions, whereas direct methods save the computation spent on feature extraction and matching and exploit richer photometric information from the whole image.
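The photometric error that direct methods minimise can be sketched in a few lines. This is my own toy version with a single unknown horizontal shift standing in for the camera motion; LSD-SLAM and DSO optimise over full poses (and depths) with gradient-based methods, and a feature-based system like ORB-SLAM would instead match keypoints and minimise geometric reprojection error.

```python
# Toy direct alignment: find the image shift that minimises photometric error.
import numpy as np

def photometric_error(ref, cur, shift):
    """Sum of squared intensity differences for a candidate integer shift along x."""
    ref = ref.astype(float)
    cur = cur.astype(float)
    if shift == 0:
        return np.sum((ref - cur) ** 2)
    return np.sum((ref[:, shift:] - cur[:, :-shift]) ** 2)

# Synthetic data: the "current" image is the reference shifted by 3 pixels.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, (48, 64))
cur = np.roll(ref, -3, axis=1)

# Brute-force search over candidate shifts stands in for the real optimisation.
errors = {s: photometric_error(ref, cur, s) for s in range(8)}
print("best shift:", min(errors, key=errors.get))   # recovers 3
```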

Bundle adjustment (BA) is one of the renowned techniques in state-of-the-art feature-based reconstruction: it re-estimates the camera poses and the positions of the features in the map as a maximum likelihood estimate, while filter methods represent these estimates as a probability density function. Extended Kalman Filters (EKF) are generally used with a multivariate Gaussian to model the state estimate and the observation noise, giving a Bayesian solution. Particle filters, on the other hand, do not assume the shape of the distribution. While it is common to use conventional monocular cameras in visual SLAM, other imaging systems such as stereo cameras, depth cameras and LIDAR can provide complementary measurements that improve the accuracy of the reconstruction. There are also other interesting research studies in SLAM, such as RatSLAM, inspired by computational models of the rodent hippocampus [6], and SeqSLAM, which matches images captured across different weather conditions and times of day to generalise the SLAM solution [5].
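As a minimal illustration of the filtering view, here is a 1D EKF-SLAM sketch of my own: the state is [robot position, landmark position], the motion model moves the robot by u, and the sensor measures the range to the landmark. In this 1D toy both models happen to be linear, so the "extended" part reduces to constant Jacobians; real EKF-SLAM linearises nonlinear motion and observation models at each step.

```python
# Toy 1D EKF-SLAM: jointly estimate robot and landmark positions.
import numpy as np

x = np.array([0.0, 0.0])                 # mean of [robot, landmark]
P = np.diag([0.01, 100.0])               # confident robot start, unknown landmark
Q = np.diag([0.1, 0.0])                  # motion noise (only the robot moves)
R = np.array([[0.05]])                   # measurement noise

F = np.eye(2)                            # motion Jacobian
H = np.array([[-1.0, 1.0]])              # Jacobian of z = landmark - robot

def ekf_step(x, P, u, z):
    # Prediction: propagate the Gaussian belief through the motion model.
    x = x + np.array([u, 0.0])
    P = F @ P @ F.T + Q
    # Correction: fuse the range measurement with the predicted belief.
    y = z - (x[1] - x[0])                # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    x = x + K.flatten() * y
    P = (np.eye(2) - K @ H) @ P
    return x, P

# The robot walks toward a landmark at 10 m, observing the range at each step.
rng = np.random.default_rng(0)
true_robot, true_landmark = 0.0, 10.0
for _ in range(20):
    true_robot += 1.0
    z = true_landmark - true_robot + rng.normal(0, 0.2)
    x, P = ekf_step(x, P, u=1.0, z=z)

print("estimated robot, landmark:", x)
```

A particle filter would replace the Gaussian (x, P) with a set of weighted samples, which is why it can represent multi-modal beliefs that the EKF cannot.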

Challenges

Robotics researchers have come a long way, defining SLAM and solving enormous navigation problems for autonomous vehicles in static scenes. However, the ultimate goal is for SLAM to handle day-to-day autonomous driving: scenes with fewer salient features, large-scale environments with dynamic objects, non-smooth camera motion, and imaging challenges including occlusion, transparency and reflection, all while remaining cheap in computation and memory. Though we expected the solutions to these problems to come easier, we now understand [7] that having reliable self-driving cars on the roads is not as easy as we thought, but it is definitely do-able.

References

  1. Durrant-Whyte, H.F., Rye, D.C. and Nebot, E.M., 1996. Localization of Autonomous Guided Vehicles.
  2. Mur-Artal, R., Montiel, J.M.M. and Tardos, J.D., 2015. ORB-SLAM: a versatile and accurate monocular SLAM system. IEEE transactions on robotics, 31(5), pp. 1147-1163.
  3. Engel, J., Schöps, T. and Cremers, D., 2014. LSD-SLAM: Large-scale direct monocular SLAM. In European conference on computer vision (pp. 834-849). Springer, Cham.
  4. Engel, J., Koltun, V. and Cremers, D., 2017. Direct sparse odometry. IEEE transactions on pattern analysis and machine intelligence, 40(3), pp. 611-625.
  5. Milford, M.J. and Wyeth, G.F., 2012. SeqSLAM: Visual route-based navigation for sunny summer days and stormy winter nights. International conference on robotics and automation, pp. 1643-1649.
  6. Milford, M.J., Wyeth, G.F. and Prasser, D., 2004. RatSLAM: a hippocampal model for simultaneous localization and mapping. International Conference on Robotics and Automation, 1, pp. 403-408.
  7. Li, R., Wang, S. and Gu, D., 2018. Ongoing evolution of visual SLAM from geometry to deep learning: Challenges and opportunities. Cognitive Computation, 10(6), pp. 875-889.