We perceive our environment through our senses, one of which provides visual cues. Light enters the eye and strikes the retina, a light-sensitive layer. Photoreceptors on the retina convert this light into electrical signals and transmit them to the brain via the optic nerve. For a robot to see, this process must be replicated artificially: a sensor converts light into signals that the embedded device, acting as the brain for control in robotics, can process.
Today we use a variety of visual sensors that perceive the environment and generate an output the processor interprets as an image, a point cloud or an event. Cameras are among the most commonly used visual sensors on robots due to their low cost and wide availability.
Smart DSLR cameras are robots in their own right, with a processing system, sensors and actuators. In robotics, however, we generally use machine vision cameras, which are similar but offer greater independence in hardware and software control for active vision.
We can capture an event as a video, i.e. a continuous stream of frames. Video is generally used for applications where missing a point of interest in space or time is expensive, such as object tracking. Conversely, for offline applications such as object classification, capturing a single frame reduces computational cost. Between these two extremes lies a sweet spot: the burst.
Recent mobile phones have introduced this concept as a capture mode for high-speed action photography, such as running, flying and especially sports. In burst mode, the camera captures frames in rapid succession, for example at 30 frames per second. Each frame in a burst cleanly captures a distinct instant of the action rather than accumulating motion blur. This method of capturing multiple consecutive frames within a short period of time, generally with exposure times of a few milliseconds each, is called burst imaging.
Recent research [1-3] has taken advantage of burst imaging to denoise low-light images. Google mobile phones capture a burst of underexposed frames, perform hierarchical tile-based alignment on the burst to account for motion, and merge the aligned frames into a single intermediate high-bit-depth image; colour and tone mapping then produce a single full-resolution output photograph within a matter of seconds.
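The align-and-merge idea can be illustrated with a minimal sketch. This is not Google's actual hierarchical tile-based pipeline, which estimates motion per tile at multiple scales; here we assume, for simplicity, a single global translation per frame, estimated by phase correlation, and all function names are illustrative.

```python
import numpy as np

def estimate_shift(ref, frame):
    """Estimate the integer (dy, dx) shift that aligns `frame` to `ref`
    via phase correlation (peak of the normalized cross-power spectrum)."""
    F1 = np.fft.fft2(ref)
    F2 = np.fft.fft2(frame)
    cross = F1 * np.conj(F2)
    cross /= np.abs(cross) + 1e-12          # keep phase only
    corr = np.fft.ifft2(cross).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Map wrap-around peak locations to signed shifts.
    if dy > ref.shape[0] // 2:
        dy -= ref.shape[0]
    if dx > ref.shape[1] // 2:
        dx -= ref.shape[1]
    return dy, dx

def align_and_merge(burst):
    """Align every frame to the first one and average to raise SNR."""
    ref = burst[0]
    aligned = [ref]
    for frame in burst[1:]:
        dy, dx = estimate_shift(ref, frame)
        aligned.append(np.roll(frame, (dy, dx), axis=(0, 1)))
    return np.mean(aligned, axis=0)
```

A real pipeline would use sub-pixel and per-tile motion, robust merging (e.g. Wiener-style weighting) instead of a plain mean, and would operate on raw sensor data before demosaicking.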
Robots thrive in extremely low-light conditions, in applications such as mining and drone delivery in rural environments, because they can tolerate low-quality imagery, meaning images of relatively low SNR compared to visually pleasing photographs. However, the lower the ambient light, the more spurious features the robot produces during reconstruction. These can reduce accuracy or, in the worst case, cause reconstruction to fail. While capturing a long-exposure image produces higher SNR, it is not always feasible in robotics, where most of the scene and/or the platform is moving. Thus, capturing a burst of frames, then aligning and merging them to improve the SNR of the output image, is a viable solution to visual challenges such as low light.
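The SNR benefit of merging is easy to verify numerically: for a static scene with independent zero-mean noise in each frame, averaging N frames reduces the noise standard deviation by a factor of roughly √N. The sketch below uses purely synthetic numbers (scene level, noise sigma and burst size are arbitrary assumptions).

```python
import numpy as np

rng = np.random.default_rng(42)
clean = np.full((128, 128), 0.2)   # dim, static synthetic scene (assumed)
sigma = 0.05                       # per-frame noise level (assumed)
N = 16                             # burst size (assumed)

# Simulate a burst of N noisy short-exposure frames.
burst = clean + rng.normal(0.0, sigma, size=(N,) + clean.shape)

# With a static scene and camera, merging reduces to averaging.
merged = burst.mean(axis=0)

noise_single = (burst[0] - clean).std()
noise_merged = (merged - clean).std()
gain = noise_single / noise_merged  # expected to be close to sqrt(N) = 4
```

On a moving platform the frames must be aligned before averaging, otherwise the merge trades noise for motion blur.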
Similarly, in the presence of fog, backscatter and/or blur, a burst of frames captures the variation among the frames, which can be exploited when generating an output frame. Burst imaging continues to demonstrate the value of capturing data with both spatial and temporal information in robotics, as well as in applications such as virtual reality (VR) and machine learning (ML).
[1] Hasinoff, S.W., Sharlet, D., Geiss, R., Adams, A., Barron, J.T., Kainz, F., Chen, J. and Levoy, M., 2016. Burst photography for high dynamic range and low-light imaging on mobile cameras. ACM Transactions on Graphics (TOG), 35(6), pp.1-12.
[2] Liba, O., Murthy, K., Tsai, Y.T., Brooks, T., Xue, T., Karnad, N., He, Q., Barron, J.T., Sharlet, D., Geiss, R. and Hasinoff, S.W., 2019. Handheld mobile photography in very low light. ACM Transactions on Graphics (TOG), 38(6), pp.1-16.
[3] Liu, Z., Yuan, L., Tang, X., Uyttendaele, M. and Sun, J., 2014. Fast burst images denoising. ACM Transactions on Graphics (TOG), 33(6), pp.1-9.
[4] Szeliski, R., 2010. Computer vision: algorithms and applications. Springer Science & Business Media.
[5] Ravendran, A., Bryson, M. and Dansereau, D.G., 2022. Burst imaging for light-constrained structure-from-motion. IEEE Robotics and Automation Letters (RA-L), 7(2), pp.1040-1047.