Image Processing in Python
The need for researchers who can process and analyze image data has grown as computers have become faster and more powerful, and as cameras and other imaging devices have spread into many areas of life. Automating this processing and analysis with a computer program is often useful and sometimes necessary, because the data volumes can be large: high-resolution images that occupy a lot of memory or disk space, or collections of many photographs that must be processed at once.
Image Processing: What is it?
Image processing, as the name suggests, means applying one or more operations to an image in order to reach a specific goal.
The output is either another image or a set of features extracted from the image. Those results can then feed further analysis and decision-making.
What, though, is an image?
An image can be represented by a 2D function F(x, y), where x and y are spatial coordinates. The amplitude of F at a given pair (x, y) is the intensity of the image at that point. When x, y, and the amplitude values are all finite, we call it a digital image. Its pixels are arranged in rows and columns in an array.
Pixels are the elements of an image that carry its color and intensity information. For 3D images, the spatial coordinates are x, y, and z. In both cases the pixels are arranged in the form of a matrix.
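As a minimal sketch of this idea, a digital image can be held in a NumPy array where each element is one pixel. The array sizes and pixel values below are made up for illustration and do not come from a real photograph.

```python
import numpy as np

# A tiny 4x6 grayscale image: one intensity value (0-255) per pixel.
gray = np.zeros((4, 6), dtype=np.uint8)
gray[1, 2] = 255           # row 1 (y), column 2 (x) becomes white

# A color image adds a third axis holding the channel values of each pixel.
color = np.zeros((4, 6, 3), dtype=np.uint8)
color[1, 2] = (255, 0, 0)  # pure red, assuming RGB channel order

print(gray.shape)    # (rows, columns)
print(color.shape)   # (rows, columns, channels)
print(gray[1, 2])    # intensity F(x=2, y=1)
```

Note that arrays are indexed as [row, column], so the y coordinate comes first.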
Python libraries for image processing
Many libraries are available in Python for image processing, including:
1. Scikit-Image
It is an open-source image processing library, often used to prepare images for machine learning. With just a few built-in functions, it can carry out complex manipulations on images.
The library works directly with NumPy arrays and is straightforward even for people who are brand-new to Python. Among the operations that scikit-image can perform are:
- Use the try_all_threshold() function in the filters module to compare thresholding methods on an image. It applies seven global thresholding algorithms and shows the results side by side.
- Use the sobel() function in the filters module for edge detection. The function expects a 2D grayscale image, so a color image must be converted to grayscale first.
- Use the filters module’s gaussian() function to apply Gaussian smoothing.
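The three operations above might look roughly like the sketch below. It assumes scikit-image is installed and uses a synthetic image in place of a real photograph. It also calls threshold_otsu(), one of the methods that try_all_threshold() compares, because try_all_threshold() itself draws a matplotlib figure rather than returning a single result.

```python
import numpy as np
from skimage import color, filters

# Synthetic RGB test image (an assumption standing in for a real photo):
# a bright square on a dark background, with values in [0, 1].
rgb = np.zeros((64, 64, 3), dtype=float)
rgb[16:48, 16:48] = 0.9

# sobel() expects a 2D array, so convert to grayscale first.
gray = color.rgb2gray(rgb)

# Global thresholding with Otsu's method.
t = filters.threshold_otsu(gray)
binary = gray > t

# Edge detection and Gaussian smoothing.
edges = filters.sobel(gray)
smooth = filters.gaussian(gray, sigma=2)
```

The sobel() response is close to zero in flat regions and large along the border of the square, while gaussian() softens that border.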
2. OpenCV
OpenCV stands for Open Source Computer Vision Library. It contains more than 2000 optimized algorithms for machine learning and computer vision. OpenCV can be used for image processing in several ways, some of which are given below:
- Converting images between color spaces, such as between BGR and grayscale, BGR and HSV, etc.
- Applying thresholding techniques to images, such as basic and adaptive thresholding.
- Smoothing images, for example by blurring them or applying custom filters.
3. NumPy
You can also use this library for basic image operations such as flipping, feature extraction, and analysis.
Images can be represented as NumPy multidimensional arrays, whose type is ndarray. A color image, for example, is a three-dimensional NumPy array, and slicing that array separates its RGB channels.
The following operations can be applied to an image with NumPy (the image is loaded into a variable named test_img using imread):
- Use np.flipud(test_img) to flip the image vertically.
- Use np.fliplr(test_img) to flip the image horizontally.
- Use test_img[::-1] to reverse the order of the rows, which also flips the image vertically (this slicing works on any image stored as a NumPy array).
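The flips listed above can be tried on a small array. In this sketch test_img is a made-up 2x3 RGB array rather than a real image returned by imread.

```python
import numpy as np

# Hypothetical 2x3 RGB image standing in for the array returned by imread.
test_img = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)

flipped_v = np.flipud(test_img)   # reverse the rows (vertical flip)
flipped_h = np.fliplr(test_img)   # reverse the columns (horizontal flip)
reversed_rows = test_img[::-1]    # slicing equivalent of np.flipud

# Separate the three color channels by slicing the last axis.
red, green, blue = test_img[..., 0], test_img[..., 1], test_img[..., 2]
```

Slicing returns views of the original data, so none of these operations copies the pixels.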
4. PIL/pillow
Pillow is the friendly PIL fork created by Alex Clark and contributors. PIL stands for Python Imaging Library. It is a powerful library that supports many image formats, including PPM, JPEG, TIFF, GIF, PNG, and BMP.
It lets you perform many operations on images, including rotating, resizing, cropping, and converting to grayscale. Let’s examine a few of them.
This library has a module called Image that provides the manipulation operations.
- The open() method can be used to load an image.
- Use the show() function to display a picture.
- Utilize the format attribute to learn the file format.
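The sketch below exercises these calls without needing a file on disk. It builds a small image in memory, which is an assumption made for the example; a real script would pass a file path to open().

```python
import io
from PIL import Image

# Build a small image in memory and save it as PNG so that the example
# does not depend on any file (a real script would open a path instead).
buffer = io.BytesIO()
Image.new("RGB", (120, 80), color=(200, 30, 30)).save(buffer, format="PNG")
buffer.seek(0)

im = Image.open(buffer)
print(im.format, im.size, im.mode)    # file format, (width, height), mode

rotated = im.rotate(90, expand=True)  # expand=True grows the canvas to fit
resized = im.resize((60, 40))
cropped = im.crop((10, 10, 50, 40))   # box is (left, upper, right, lower)
gray = im.convert("L")
# im.show() would open the image in the system's default viewer.
```

Each of these methods returns a new Image object and leaves the original unchanged.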
5. Mahotas
This library has more than 100 functions for computer vision and image processing. Its algorithms are implemented in C++ for speed. Mahotas has few dependencies.
It operates on NumPy arrays, so NumPy must be installed, and building it from source requires a C++ compiler.
Let’s examine a few of the operations that Mahotas can be used for:
- To read an image use the imread() function.
- Use the mean() method to compute the image’s mean intensity.
- Use the eccentricity() function in the features module to compute the eccentricity of a shape in a binary image, a measure of how far the shape departs from a circle.
- Use the morph module’s erode() and dilate() functions to apply erosion and dilation to an image.
- Use the locmax() function to determine the image’s local maxima.
Install the necessary library in Python
Our first step is to install the libraries we intend to use for image processing, such as OpenCV or Pillow. Each of them can be installed with pip, for example:
$ pip install pillow
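Python open(), rotate() and show()
Images can be rotated by any angle, clockwise or counter-clockwise. In general, a rotation is described by a rotation matrix built from the rotation center, the rotation angle, and a scaling factor. Pillow’s rotate() method handles this internally and only needs the angle in degrees, measured counter-clockwise.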
#Import required library
from PIL import Image
#Open Image
im = Image.open("TechVidvan.jpg")
#Image rotate & show
im.rotate(45).show()
The variable im above is a Pillow Image object. We can retrieve some information about the opened image:
>>> im
<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1080x667 at 0x65AB990>
>>> im.size
(1080, 667)
>>> im.format
'JPEG'
Python Image Processing Algorithms
1. Morphological Image Processing
Binary regions produced by simple thresholding are often corrupted by noise, and morphological image processing aims to remove these imperfections from binary images. The opening and closing operations help as well: opening removes small specks, and closing fills small holes.
Morphological operations can also be extended to grayscale images. They are non-linear operations related to the shape of the features in an image, and they depend on the relative ordering of pixel values rather than on the numerical values themselves. The technique probes the image with a small template called a structuring element, which is placed at every possible position and compared with the corresponding neighborhood of pixels. A structuring element is a small matrix of 0s and 1s.
Let’s look at the two fundamental morphological operations, dilation and erosion:
- Dilation adds pixels to the boundaries of objects.
- Erosion removes pixels from the boundaries of objects.
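To make dilation and erosion concrete, here is a small NumPy-only sketch. It assumes a 3x3 structuring element filled with ones and treats pixels outside the image as background. The function names dilate and erode are illustrative helpers, not calls from any of the libraries above.

```python
import numpy as np

def dilate(binary):
    """Binary dilation with a 3x3 structuring element of ones."""
    padded = np.pad(binary, 1, mode="constant", constant_values=False)
    out = np.zeros_like(binary)
    rows, cols = binary.shape
    # A pixel is set if ANY pixel under the structuring element is set.
    for dy in range(3):
        for dx in range(3):
            out |= padded[dy:dy + rows, dx:dx + cols]
    return out

def erode(binary):
    """Binary erosion with a 3x3 structuring element of ones."""
    padded = np.pad(binary, 1, mode="constant", constant_values=False)
    out = np.ones_like(binary)
    rows, cols = binary.shape
    # A pixel survives only if ALL pixels under the structuring element are set.
    for dy in range(3):
        for dx in range(3):
            out &= padded[dy:dy + rows, dx:dx + cols]
    return out

# A 3x3 square of foreground pixels inside a 7x7 image.
img = np.zeros((7, 7), dtype=bool)
img[2:5, 2:5] = True

grown = dilate(img)           # the square grows to 5x5
shrunk = erode(img)           # the square shrinks to its center pixel
opened = dilate(erode(img))   # opening: erosion followed by dilation
```

For this clean square, opening returns the original shape. On a noisy image, isolated specks smaller than the structuring element would vanish in the erosion step and not come back.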
2. Gaussian Image Processing
Gaussian blur, also called Gaussian smoothing, is the result of blurring an image with a Gaussian function.
It is used to reduce image noise and detail. The visual effect resembles viewing the image through a translucent screen. It is used in computer vision for image enhancement at different scales and in deep learning as a data augmentation technique.
Because the Gaussian blur is separable, it is advisable to split the process into two passes. In the first pass, a one-dimensional kernel blurs the image in only the horizontal or the vertical direction. In the second pass, the same one-dimensional kernel blurs the image in the remaining direction. The result is identical to a single pass of convolution with a two-dimensional kernel.
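The equivalence of the two-pass and single-pass approaches can be checked numerically. The following NumPy sketch is illustrative only: the helper names, the kernel radius, the sigma value, and the zero padding at the borders are all assumptions made for the example.

```python
import numpy as np

def gaussian_kernel_1d(sigma, radius):
    """Normalized 1D Gaussian kernel with 2 * radius + 1 taps."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def blur_separable(image, kernel):
    """Two 1D passes: along each row, then along each column."""
    rows_done = np.apply_along_axis(np.convolve, 1, image, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows_done, kernel, mode="same")

def blur_2d(image, kernel):
    """Single pass with the equivalent 2D kernel (zero padding at borders)."""
    k2d = np.outer(kernel, kernel)
    r = len(kernel) // 2
    h, w = image.shape
    padded = np.pad(image, r, mode="constant")
    out = np.zeros_like(image, dtype=float)
    for dy in range(k2d.shape[0]):
        for dx in range(k2d.shape[1]):
            out += k2d[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

rng = np.random.default_rng(0)
image = rng.random((32, 32))
kernel = gaussian_kernel_1d(sigma=1.5, radius=4)

two_pass = blur_separable(image, kernel)
one_pass = blur_2d(image, kernel)
print(np.allclose(two_pass, one_pass))
```

With a 9-tap kernel, the two-pass version needs 18 multiplications per pixel instead of 81, which is why the split is worthwhile.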
3. Fourier Transform in image processing
The Fourier transform decomposes an image into its sine and cosine components.
Because we are working with digital images, we consider the discrete Fourier transform.
Consider a sinusoid, which has three components:
- Magnitude: related to contrast
- Spatial frequency: related to brightness
- Phase: related to color information
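A short NumPy sketch can show the discrete Fourier transform picking out the frequency of a sinusoidal pattern. The 64x64 image with 4 cycles across its width is synthetic and chosen only so that the result is easy to predict.

```python
import numpy as np

# Synthetic 64x64 image: a horizontal sinusoid with 4 cycles across the width.
size, cycles = 64, 4
x = np.arange(size)
row = 0.5 + 0.5 * np.sin(2 * np.pi * cycles * x / size)
image = np.tile(row, (size, 1))

spectrum = np.fft.fft2(image)   # 2D discrete Fourier transform
magnitude = np.abs(spectrum)    # strength of each frequency component
phase = np.angle(spectrum)      # phase offset of each component

# Ignore the DC term (the mean brightness) and find the dominant frequency.
magnitude_no_dc = magnitude.copy()
magnitude_no_dc[0, 0] = 0
peak = np.unravel_index(np.argmax(magnitude_no_dc), magnitude.shape)

# The inverse transform recovers the original image.
restored = np.fft.ifft2(spectrum).real
```

The peak lies in row 0 because the pattern does not vary vertically, and at column index 4 or its mirror image 60 because the pattern repeats 4 times horizontally.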
4. Edge Detection in image processing
Edge detection is an image processing technique that locates the boundaries of objects in images. It works by detecting discontinuities in brightness.
Because edges carry most of the shape information, detecting them is a useful way to extract information from an image.
Because these methods respond to any change in grey level, they also react strongly to noise. Edges are defined as the local maxima of the gradient.
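As a hedged illustration of gradient-based edge detection, the sketch below computes the Sobel gradient magnitude with plain NumPy slicing and thresholds it. The helper name sobel_magnitude, the test image, and the 50% threshold are choices made for this example, and border pixels are simply skipped.

```python
import numpy as np

def sobel_magnitude(gray):
    """Gradient magnitude from the 3x3 Sobel kernels (interior pixels only)."""
    g = gray.astype(float)
    # Horizontal derivative: right column minus left column,
    # with the center row weighted by 2.
    gx = (g[:-2, 2:] + 2 * g[1:-1, 2:] + g[2:, 2:]
          - g[:-2, :-2] - 2 * g[1:-1, :-2] - g[2:, :-2])
    # Vertical derivative: bottom row minus top row,
    # with the center column weighted by 2.
    gy = (g[2:, :-2] + 2 * g[2:, 1:-1] + g[2:, 2:]
          - g[:-2, :-2] - 2 * g[:-2, 1:-1] - g[:-2, 2:])
    return np.hypot(gx, gy)

# Dark left half, bright right half: one vertical edge between columns 3 and 4.
image = np.zeros((8, 8))
image[:, 4:] = 1.0

magnitude = sobel_magnitude(image)
edges = magnitude > 0.5 * magnitude.max()   # simple threshold on the gradient
```

The result is two pixels smaller in each dimension than the input, so output column 2 corresponds to input column 3. Only the two columns next to the brightness step are marked as edges.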
5. Wavelet Image Processing
The Fourier transform we just saw describes a signal only in terms of frequency. Wavelets take both time and frequency into account, which makes the wavelet transform a good fit for non-stationary signals.
Edges are among the most important features of an image, and conventional filters tend to blur them while removing noise. The wavelet transform is designed to give good frequency resolution for low-frequency components and good time resolution for high-frequency components, which helps preserve edges.
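One level of the simplest wavelet transform, the Haar transform, can be sketched in a few lines of NumPy. This is an illustration under stated assumptions rather than a full wavelet library: the function name haar2d_level and the sub-band names are made up here, and the image must have even dimensions.

```python
import numpy as np

def haar2d_level(image):
    """One level of an orthonormal 2D Haar transform on an even-sized image."""
    a = image[0::2, 0::2]   # top-left pixel of each 2x2 block
    b = image[0::2, 1::2]   # top-right
    c = image[1::2, 0::2]   # bottom-left
    d = image[1::2, 1::2]   # bottom-right
    approx = (a + b + c + d) / 2.0        # low-pass in both directions
    detail_cols = (a - b + c - d) / 2.0   # differences across columns
    detail_rows = (a + b - c - d) / 2.0   # differences across rows
    detail_diag = (a - b - c + d) / 2.0   # diagonal differences
    return approx, detail_cols, detail_rows, detail_diag

# Flat image with one vertical step edge between columns 2 and 3.
image = np.zeros((8, 8))
image[:, 3:] = 1.0

approx, d_cols, d_rows, d_diag = haar2d_level(image)
```

The detail coefficients are zero everywhere except in the blocks that straddle the step, so the transform records both that an edge exists and where it is. The approximation band is a half-resolution version of the image.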
Converting an image to grayscale
We can convert our original colored image into a grayscale version.
>>> TechVidvan_gray = Image.open('TechVidvan.jpg').convert('L')
>>> TechVidvan_gray.show()
The sample above uses the Pillow (PIL) library. For image processing, we can also use other libraries such as OpenCV, matplotlib, and NumPy. The programs below use OpenCV.
Displaying a grayscale image:
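# Import the required library
import cv2
# Read the image directly as grayscale; cv2.imread returns None on failure
im = cv2.imread('TechVidvan.jpg', cv2.IMREAD_GRAYSCALE)
if im is None:
    raise FileNotFoundError('TechVidvan.jpg could not be read')
# Show the image until a key is pressed, then close the window
cv2.imshow('image', im)
cv2.waitKey(0)
cv2.destroyAllWindows()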
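Under the hood, grayscale conversion is a weighted sum of the color channels. Pillow documents its "L" mode as using the ITU-R 601-2 luma transform, L = R * 299/1000 + G * 587/1000 + B * 114/1000. The NumPy sketch below applies the same weights to three made-up pixels. The function name to_grayscale is illustrative, and the rounding may differ slightly from a given library's internal arithmetic.

```python
import numpy as np

def to_grayscale(rgb):
    """Weighted channel sum using the ITU-R 601 luma weights."""
    weights = np.array([0.299, 0.587, 0.114])
    gray = rgb[..., :3].astype(float) @ weights
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

# Three hypothetical pixels: pure red, pure green, pure blue.
rgb = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255]]], dtype=np.uint8)
gray = to_grayscale(rgb)
print(gray)
```

Green contributes the most and blue the least, which reflects how bright each primary appears to the human eye.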
Translating an image in Python
Translating an image means shifting it within its frame of reference.
import cv2
import numpy as np
Pic_Name = 'TechVidvan.jpg'
# Create a translation matrix.
# For a shift of (tx, ty) the matrix is [[1, 0, tx], [0, 1, ty]].
M = np.float32([[1, 0, 200], [0, 1, 100]])
# Read the image from the disc. cv2.imread returns None if it fails.
img = cv2.imread(Pic_Name)
if img is None:
    print('file reading error')
else:
    (rows, cols) = img.shape[:2]
    # Apply the translation; the output size is given as (width, height).
    res = cv2.warpAffine(img, M, (cols, rows))
    # Next, put the image back on the disc.
    cv2.imwrite('result.jpg', res)
Edge detection in an image in Python
Edge detection means finding the sharp edges in an image. It is a crucial step in tasks such as object localization and object detection. Because edges are useful in so many applications, there are many algorithms for finding them. Canny edge detection is the one we’ll use.
import cv2
Pic_Name = 'TechVidvan.jpg'
# Read the image from the disc. cv2.imread returns None if it fails.
img = cv2.imread(Pic_Name)
if img is None:
    print('file reading error')
else:
    # Canny edge detection with lower and upper hysteresis thresholds.
    edges = cv2.Canny(img, 200, 400)
    # Next, put the edge map back on the disc.
    cv2.imwrite('TechVidvan_logo.jpg', edges)
Conclusion
Several image processing libraries, including OpenCV, Mahotas, Pillow, and scikit-image, can carry out these tasks. Advances in image processing, together with deep learning, are changing the world. Researchers keep developing new techniques for the field, so the learning doesn’t stop here. Keep moving forward with TechVidvan.
