Image Processing in Python

Image processing techniques allow us to perform mathematical operations on a 2D array of pixels. These operations are used to enhance images, convert them between colour spaces, rescale them and extract useful information such as features for robotics. Python is a high-level, general-purpose programming language that is popular among computer vision researchers because it is freely available, has an active and supportive community, and produces readable code that can be shared across different embedded systems. Libraries such as Matplotlib, NumPy and OpenCV make Python an effective open-source alternative to the MATLAB environment for imaging. To set up the programming environment, we first import these libraries.

import matplotlib.pyplot as plt
import numpy as np
import cv2 

We read an image of a koala and convert it into different colour spaces using the following snippet of code. OpenCV uses the BGR channel order rather than RGB because BGR was popular among camera manufacturers when the library was first being developed. Matplotlib expects RGB, so to display the image with the colours a human viewer expects, we have to convert it to the RGB colour space. For many robotics tasks, a single channel of information is enough, so we use grayscale images to reduce computation cost during deployment.

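# Load the image from disk; OpenCV returns the channels in BGR order
image = cv2.imread('images/koala.png')

# Reorder the channels to RGB so that Matplotlib shows the expected colours
imgrgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Collapse the three colour channels into a single grayscale channel
imggray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
plt.imshow(imggray, cmap='gray', vmin=0, vmax=255)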
An image of a koala in the BGR, RGB and grayscale colour spaces.
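
To make the two conversions less opaque, here is a minimal NumPy-only sketch of what they amount to. It assumes an 8-bit, three-channel array in BGR order and uses the luma weights that OpenCV documents for its BGR-to-gray conversion (0.299 R + 0.587 G + 0.114 B). The helper names bgr_to_rgb and bgr_to_gray are hypothetical, and OpenCV's own rounding may differ by a grey level on some pixels.

```python
import numpy as np

def bgr_to_rgb(bgr):
    """Reverse the channel axis so that B, G, R becomes R, G, B."""
    return bgr[..., ::-1]

def bgr_to_gray(bgr):
    """Weighted sum of the channels, rounded back to 8-bit intensities."""
    b = bgr[..., 0].astype(np.float64)
    g = bgr[..., 1].astype(np.float64)
    r = bgr[..., 2].astype(np.float64)
    gray = 0.299 * r + 0.587 * g + 0.114 * b
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

# A single pure-blue pixel stored in BGR order (assumed test data)
blue_pixel = np.array([[[255, 0, 0]]], dtype=np.uint8)
rgb_pixel = bgr_to_rgb(blue_pixel)    # blue moves to the last channel
gray_pixel = bgr_to_gray(blue_pixel)  # blue contributes little to brightness
```

The small weight on blue is why a saturated blue region looks dark in the grayscale image, while green regions stay comparatively bright.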

Images can also be rescaled to different sizes. Multiple rescaled copies of an image are often used to build image pyramids, such as Gaussian and/or Laplacian pyramids, so that vision applications can work across a wide range of scales [1]. Similarly, we can change the aspect ratio of an image, crop it, rotate it, and split and combine its colour channels. The following code snippet shows how to perform such operations using Pillow [2].

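from PIL import Image

# Open the image file with Pillow
image = Image.open('images/koala.png')

# Resize to 27 x 34 pixels, given as (width, height)
scaled_image = image.resize((27, 34))

# Crop the region given as (left, upper, right, lower) pixel coordinates
box = (350, 300, 800, 800)
cropped_image = image.crop(box)

# Rotate the image by 180 degrees
image_rot = image.rotate(180)

# Mirror the image horizontally
image_flip = image.transpose(Image.FLIP_LEFT_RIGHT)

# Split into single-channel images (assumes a three-channel RGB image)
red, green, blue = image.split()
plt.imshow(red, cmap='gray', vmin=0, vmax=255)
plt.imshow(green, cmap='gray', vmin=0, vmax=255)
plt.imshow(blue, cmap='gray', vmin=0, vmax=255)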
An image of a koala at different image sizes.
An image of a koala (a) cropped, (b) rotated and (c) flipped.
An image of a koala split into red, green and blue channels.
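
The image pyramids mentioned above can be sketched with the same Pillow calls. The following is a rough illustration rather than a reference implementation: each level is blurred and then halved in size, which approximates a Gaussian pyramid. The helper name gaussian_pyramid, the blur radius of 1 and the synthetic 64 x 48 test image are all assumptions made for this example.

```python
from PIL import Image, ImageFilter

def gaussian_pyramid(image, levels=3):
    """Return a list of images, each roughly half the size of the previous one."""
    pyramid = [image]
    for _ in range(levels - 1):
        previous = pyramid[-1]
        width, height = previous.size
        if width < 2 or height < 2:
            break  # the image cannot be halved any further
        # Blur first so that downsampling does not introduce aliasing
        blurred = previous.filter(ImageFilter.GaussianBlur(radius=1))
        pyramid.append(blurred.resize((width // 2, height // 2)))
    return pyramid

# A plain synthetic image stands in for the koala photograph
demo = Image.new('RGB', (64, 48), color=(120, 90, 60))
sizes = [level.size for level in gaussian_pyramid(demo, levels=3)]
```

Each level keeps the aspect ratio of the one before it, so a detector run on every level sees the same scene at progressively coarser scales.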

These image processing techniques are used as pre-processing stages for classical computer vision techniques and to create robust datasets in machine learning. On top of these techniques, we can also perform image enhancement, which allows us to digitally improve contrast, colour, brightness, sharpness and other parameters. An image histogram, a plot of pixel intensity against frequency of occurrence, visualises the contrast in an image. Feature detectors such as SIFT rely on a contrast-dependent peak threshold, which is tuned so that the detected features are true positives that can be used for reconstruction. Similarly, improving the sharpness of an image allows more precise edge detection.
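
As a small illustration of how a histogram reflects contrast, the sketch below counts pixel intensities with NumPy and reports the range of grey levels that actually occur. It assumes 8-bit grayscale arrays; the helper names intensity_histogram and occupied_range and the two tiny test arrays are hypothetical, chosen only to keep the example short.

```python
import numpy as np

def intensity_histogram(gray):
    """Count how many pixels take each of the 256 possible 8-bit intensities."""
    return np.bincount(gray.ravel(), minlength=256)

def occupied_range(hist):
    """Return the lowest and highest intensity that occur at least once."""
    levels = np.nonzero(hist)[0]
    return int(levels[0]), int(levels[-1])

# Two 2 x 2 grayscale patches: one squeezed into mid-greys, one spread out
low_contrast = np.array([[100, 110], [120, 130]], dtype=np.uint8)
high_contrast = np.array([[0, 85], [170, 255]], dtype=np.uint8)

low_range = occupied_range(intensity_histogram(low_contrast))
high_range = occupied_range(intensity_histogram(high_contrast))
```

A narrow occupied range corresponds to a histogram bunched in one region, which is what a low-contrast image looks like; a wide range indicates that the image uses more of the available intensities.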

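from PIL import Image, ImageEnhance

# Open the image file with Pillow
image = Image.open('images/koala.png')

# Create one enhancer per property; calling .enhance(factor) on an enhancer
# returns the adjusted image, and a factor of 1.0 leaves the image unchanged
contrast = ImageEnhance.Contrast(image)

color = ImageEnhance.Color(image)

brightness = ImageEnhance.Brightness(image)

sharpness = ImageEnhance.Sharpness(image)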
An image of a koala with enhanced contrast, colour and sharpness.
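
An enhancer object only produces a new image once its enhance method is called with a factor. The sketch below applies a few enhancers to a small synthetic image so that it runs without the koala file. The factors are illustrative assumptions, not values from this post. According to the Pillow documentation, a factor of 1.0 returns a copy of the original, smaller factors reduce the property and larger factors increase it.

```python
from PIL import Image, ImageEnhance

# A plain synthetic image stands in for the koala photograph
demo = Image.new('RGB', (4, 4), color=(100, 150, 200))

# A factor of 1.0 returns a copy of the original image
unchanged = ImageEnhance.Brightness(demo).enhance(1.0)

# A brightness factor of 0.0 yields a black image
black = ImageEnhance.Brightness(demo).enhance(0.0)

# A colour factor of 0.0 removes the colour, leaving equal R, G and B values
grey = ImageEnhance.Color(demo).enhance(0.0)

# A sharpness factor above 1.0 sharpens; the result keeps the original size
sharper = ImageEnhance.Sharpness(demo).enhance(2.0)
```

In practice the factor is usually tuned by eye or against a downstream metric, for example the number of reliable features a detector finds after enhancement.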

Robots are interested only in features that are unique to the scene. They do not care about every pixel, only about the pixels that represent features such as contours, edges or blobs. Image enhancement is a well-established field that forms the initial stage of complex vision applications. Reducing visual challenges such as poor contrast and noise, improving brightness and rescaling images allow robots to perceive the environment better. These techniques are also the fundamental building blocks of popular image filters on social media platforms such as Instagram and Snapchat. Beyond building image editing applications, we can use these techniques for colour segmentation, scale-invariant search and to improve the generalisation of artificial intelligence models.


  1. Szeliski, R., 2010. Computer vision: algorithms and applications. Springer Science & Business Media.
  2. Pillow — Pillow (PIL Fork) 9.0.0 documentation. 2022.