Fundamental Trade-offs in Imaging

Capturing images with a high signal-to-noise ratio (SNR) allows robots to perform accurate feature detection and thus precise localization and mapping. This is often possible in indoor environments. Outdoors, however, a robot faces numerous visual challenges: depending on the weather and the time of day, snow can act as an occluder, harsh sunlight can saturate the image, and low light can leave too little signal to capture a good-SNR image. The risks of deploying autonomous vehicles grow even higher in unstructured environments, underwater, and in space. While we use AI to make robots see better, we also know how important the input data is to a machine learning model. Robotic imaging explores how robots can capture better images, and better data in general, so that they can navigate precisely despite these visual challenges.

Monocular cameras are among the most prominent visual sensors for capturing images. While a monocular camera is a fair choice in terms of cost and availability, it is important to understand that other kinds of cameras can perform better in robotics because they capture more useful data: stereo cameras, RGB-D cameras, event cameras, and light field cameras. Regardless of which camera we use, there are fundamental trade-offs in imaging that we need to overcome to capture useful information. Depending on the application, there are also additional complications, such as back-scattering and low contrast, to weigh alongside these trade-offs.

Let’s assume we are in a dark room with controlled lighting and a camera mounted on a robotic arm, capturing a static scene. When there is enough light in the room, there is enough signal for the camera to capture the scene, so we can use a narrow aperture to increase the depth of field and keep every part of the scene in focus. Similarly, with enough light we can also use a shorter exposure time, since there will still be enough signal for a good-SNR image. This lets the robot move faster through the scene while keeping the image noise low.
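To make this concrete, here is a minimal sketch, under an assumed Poisson shot-noise model with illustrative numbers of my own (the flux and read-noise values are not from any particular camera): with N collected photons, the shot noise is √N, so SNR grows only as the square root of the light gathered.

```python
import numpy as np

# Shot-noise-limited imaging: photon arrivals are Poisson, so for N
# collected photons the signal is N and the shot noise is sqrt(N),
# giving SNR ~ sqrt(N). All numbers below are illustrative assumptions.
photon_flux = 1000.0  # photons per second reaching one pixel (assumed)
read_noise = 5.0      # RMS read noise in electrons (assumed)

for exposure_s in (0.001, 0.01, 0.1, 1.0):
    n_photons = photon_flux * exposure_s
    # Shot noise and read noise add in quadrature.
    noise = np.sqrt(n_photons + read_noise**2)
    print(f"exposure {exposure_s:6.3f} s -> SNR ~ {n_photons / noise:6.1f}")
```

Note the punishing scaling: quadrupling the exposure (or the aperture area) only doubles the SNR, which is why low light is so hard on a camera.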

The same reasoning holds when a robot is able to use a flash in low light. In night-time applications and in mining environments, a robot does not get enough signal without an additional light source. In such conditions, photographers and imaging enthusiasts apply one of the three techniques below, or find a sweet spot by tuning two or more of these parameters together, to capture good-SNR images (a toy simulation of all three knobs follows the list).

  • Capturing a longer-exposure image: if both the scene and the camera are static, a longer exposure records signal for a longer period of time, collecting enough photons for a good-SNR image. However, when the scene is dynamic or the camera is moving, a very common situation in robotics, the captured image is blurred and useful information is destroyed.
  • Widening the aperture: if the scene of interest is close to the camera, widening the aperture lets in more photons and thus yields higher-SNR images. But this reduces the depth of field: objects close to the lens stay in focus while the background goes out of focus. That can be an artistic effect in portrait and night street photography, but it is undesirable in robotics, where the background of an image provides context for localization.
  • Raising the ISO: when widening the aperture or lengthening the exposure is not sensible, photographers raise the ISO value, which amplifies the signal at the expense of amplified noise. While this can produce visually pleasing results, it introduces anomalies into a robotic system through spurious features and inaccurate pose estimation.
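The sketch below ties the three knobs together in a toy 1-D capture model. The `capture` function and all of its numbers are my own illustrative assumptions (Poisson shot noise plus Gaussian read noise), and the aperture term only models the extra light gathered; the depth-of-field cost is noted in a comment but not simulated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D "scene" with step edges a robot would want to detect.
# All numbers are illustrative assumptions, not a real camera model.
scene_flux = np.repeat([100.0, 400.0, 150.0, 600.0], 32)  # photons/s per pixel

def capture(exposure_s, aperture_area=1.0, iso_gain=1.0,
            motion_px=0, read_noise=3.0):
    """Toy capture: Poisson shot noise plus Gaussian read noise."""
    flux = scene_flux * aperture_area
    if motion_px > 0:  # camera motion smears the scene over the exposure
        flux = np.convolve(flux, np.ones(motion_px) / motion_px, mode="same")
    photons = rng.poisson(flux * exposure_s)  # shot noise
    return iso_gain * photons + rng.normal(0.0, read_noise, flux.shape)

captures = {
    "short exposure": capture(0.01),                     # dark and noisy
    "long, static":   capture(1.0),                      # clean if nothing moves
    "long, moving":   capture(1.0, motion_px=40),        # edges smeared by blur
    "wide aperture":  capture(0.01, aperture_area=8.0),  # more light; DoF cost not modeled
    "high ISO":       capture(0.01, iso_gain=20.0),      # brighter, but noise amplified too
}

for name, img in captures.items():
    # Rough metric: height of the last step edge relative to local noise.
    step = img[96:124].mean() - img[68:96].mean()
    print(f"{name:14s} edge/noise ~ {step / img[100:124].std():5.1f}")
```

Running it shows the dilemma in miniature: the long static exposure wins, the moving one trades noise for blur, and the high-ISO capture is brighter than the short exposure without actually containing more information.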

Computational imaging researchers have been working on breaking such trade-offs with coded apertures [1], the flutter shutter [2], flexible depth-of-field photography [3], and motion-invariant photography [4]. While these can serve as plug-in solutions in robotics, they come with hardware modifications and require imaging-domain expertise. Robotics relies heavily on computer vision techniques to improve such images after capture, through a variety of image processing and AI solutions, whereas robotic imaging brings distinct opportunities to capture better data in the first place, tailored to the robotic application. While this field is new and exciting in robotics, we can anticipate robots exploring the world of unknowns with more precision through the combined use of AI and other sensory information.
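To give a flavour of how the flutter shutter [2] breaks the exposure trade-off: motion blur is a convolution with the shutter's temporal pattern, and a conventional open shutter is a box filter whose frequency response has exact zeros, so those frequencies are unrecoverable by any deblurring. The sketch below uses a random binary pattern, not the optimized code the paper searches for, to show that fluttering keeps the response away from zero and makes deblurring well-posed.

```python
import numpy as np

rng = np.random.default_rng(1)
n_chops = 52  # shutter slices during one exposure (illustrative, not from [2])

box = np.ones(n_chops) / n_chops                  # conventional open shutter
code = rng.integers(0, 2, n_chops).astype(float)  # random binary flutter pattern
code /= code.sum()

# Motion blur is a convolution with the temporal shutter pattern, so
# deblurring is only well-posed where the pattern's frequency response
# (its MTF) stays away from zero.
mtf_box = np.abs(np.fft.rfft(box, 8 * n_chops))
mtf_code = np.abs(np.fft.rfft(code, 8 * n_chops))

print(f"min |MTF|, open shutter:   {mtf_box.min():.4f}")   # ~0: frequencies lost
print(f"min |MTF|, fluttered code: {mtf_code.min():.4f}")  # stays well above zero
```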

References

  1. Levin, A., Fergus, R., Durand, F. and Freeman, W.T., 2007. Image and depth from a conventional camera with a coded aperture. ACM Transactions on Graphics (TOG), 26(3), pp. 70-es.
  2. Raskar, R., Agrawal, A. and Tumblin, J., 2006. Coded exposure photography: motion deblurring using fluttered shutter. In ACM SIGGRAPH 2006 Papers (pp. 795-804).
  3. Nagahara, H., Kuthirummal, S., Zhou, C. and Nayar, S.K., 2008. Flexible depth of field photography. In European Conference on Computer Vision (pp. 60-73). Springer, Berlin, Heidelberg.
  4. Levin, A., Sand, P., Cho, T.S., Durand, F. and Freeman, W.T., 2008. Motion-invariant photography. ACM Transactions on Graphics (TOG), 27(3), pp. 1-9.