Image processing and computer vision have attracted significant interest over the last two decades. Image analysis can be used to detect objects or people in images and videos. It is widely used in medicine to detect cancerous tissue and to improve the diagnosis of brain, lung and heart diseases. Automation has made it possible to analyze terabytes of image data, from which we improve our quality of life and gain insights for business decisions. In this post I present basic operations that can be applied to a simple image, all thanks to the imager package, which truly impressed me. I also present a quick entropy-based approach to image binarization, which transforms a grayscale image into a black-and-white output.
Warning: you will probably not be able to bear my face anymore after this post.
- Basic image operations
- Entropy based image binarization
imager contains a large array of functions for working with image data, with most of these functions coming from the CImg library by David Tschumperlé. CImg is a simple, modern C++ library for image processing, and imager is the glue between the R interface and this library, which works under the hood.
I must admit I have read a lot of R vignettes, but I had not seen such an accurate and comprehensive one before! Maybe the RSelenium vignettes (about which I wrote in my previous post - Controlling Expenses on Ali Express with RSelenium) are written with as much dedication as those from imager.
Basic image operations
The package provides basic functionality for loading an image into R and plotting it. Photos are stored as cimg objects, which are a special representation of a 4D array with dimensions called x, y, z, c, where x and y correspond to the spatial dimensions, z usually corresponds to depth or time, and c is the colour. The third dimension is typically used for videos. If you are working with a grayscale photo you do not need to pay attention to the last 2 dimensions.
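A minimal sketch of loading and inspecting an image, assuming the imager package is installed. It uses the boats picture bundled with imager as a stand-in for a personal photo; the file path in the comment is hypothetical.

```r
library(imager)

# `boats` is a sample colour image shipped with imager (used here as an
# assumption so the example runs without an external file).
im <- boats

# For your own photo you would instead call something like:
# im <- load.image("path/to/photo.jpg")   # hypothetical path

class(im)   # includes "cimg"; the object is stored as a numeric array
dim(im)     # width (x), height (y), depth (z), colour channels (c)

# A grayscale version keeps the 4D layout but has a single colour channel.
im_gray <- grayscale(im)
dim(im_gray)

plot(im)
plot(im_gray)
```

For a single grayscale photo both the z and c dimensions have length 1, which is why they can be ignored.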
The advantage of storing images as arrays is that we can simply extract the values corresponding to pixels and perform operations such as arithmetic transformations or plotting a histogram of pixel intensities.
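The rescale parameter of plot() controls whether pixel values are stretched to the full display range before drawing. Whether the image is stored in the [0,1] range or the 0-255 range, by default the plotted picture is always rescaled to the maximum range. So to visualize linear operations you need to set rescale = FALSE.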
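A short sketch of the effect, assuming the bundled boats image as a stand-in and division by 2 as the linear operation:

```r
library(imager)

im_gray <- grayscale(boats)   # bundled sample image, used as a stand-in

# With the default rescale = TRUE both plots look the same, because the
# halved values are stretched back to the full display range.
plot(im_gray)
plot(im_gray / 2)

# With rescale = FALSE the values are drawn as they are, so the halved
# image appears darker. Values must stay inside [0, 1] in this mode.
plot(im_gray, rescale = FALSE)
plot(im_gray / 2, rescale = FALSE)
```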
Note the y axis running downwards: the origin is at the top-left corner, which is the traditional coordinate system for images. imager uses this coordinate system consistently.
Image histogram and its equalisation
An image histogram is a type of histogram that acts as a graphical representation of the tonal distribution in a digital image. It plots the number of pixels for each tonal value. By looking at the histogram for a specific image a viewer will be able to judge the entire tonal distribution at a glance.
For a grayscale image it can be plotted directly from the pixel values.
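A short sketch of such a plot, assuming the bundled boats image and base R's hist(); the number of breaks is an arbitrary choice:

```r
library(imager)

im_gray <- grayscale(boats)   # bundled sample image as a stand-in

# A cimg object is a numeric array, so its pixel values can be flattened
# to a vector and passed to base R's hist().
h <- hist(as.vector(im_gray), breaks = 50,
          main = "Pixel intensities", xlab = "Intensity")
```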
Another approach is to turn the image into a data.frame, and use ggplot2 to view all colour channels at once:
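A minimal sketch of the data.frame route, assuming ggplot2 is installed and using the bundled boats image. The channel labels R, G, B are added here for readability.

```r
library(imager)
library(ggplot2)

# as.data.frame() on a colour cimg gives one row per pixel and channel,
# with columns x, y, cc (channel index) and value.
bdf <- as.data.frame(boats)
bdf$channel <- factor(bdf$cc, labels = c("R", "G", "B"))

# One histogram panel per colour channel.
p_hist <- ggplot(bdf, aes(x = value, colour = channel)) +
  geom_histogram(bins = 30) +
  facet_wrap(~ channel)
print(p_hist)
```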
Histogram equalisation makes histograms flat: each pixel’s value is replaced by its rank, which is equivalent to running the data through their empirical cdf. This mainly solves the problem of over-representation of some pixel intensities.
Check the comparison below of the original photo, the grayscale photo and the grayscale photo after histogram equalisation.
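The rank replacement can be sketched with base R's ecdf(), assuming the bundled boats image as input. The names f and im_eq are illustrative.

```r
library(imager)

im_gray <- grayscale(boats)   # bundled sample image as a stand-in

# Empirical cdf of the pixel intensities.
f <- ecdf(as.vector(im_gray))

# f() returns a plain numeric vector, so it is converted back to a cimg
# with the dimensions of the original image.
im_eq <- as.cimg(f(as.vector(im_gray)), dim = dim(im_gray))

plot(im_eq, main = "After histogram equalisation")
hist(as.vector(im_eq), breaks = 50, main = "Equalised intensities")
```

After this step the intensities are spread roughly uniformly over (0, 1], which is what makes the histogram flat.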
Plotting with ggplot2
After converting a cimg object to a data.frame with as.data.frame (the mdf object), we can use ggplot2 to plot the image.
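A minimal sketch of that approach with comments, assuming ggplot2 is installed and using the bundled boats image. The grayscale palette and the fixed aspect ratio are choices made for this example.

```r
library(imager)
library(ggplot2)

# mdf: one row per pixel of the grayscale image, columns x, y and value.
mdf <- as.data.frame(grayscale(boats))

p_img <- ggplot(mdf, aes(x = x, y = y)) +
  geom_raster(aes(fill = value)) +                      # one tile per pixel
  scale_fill_gradient(low = "black", high = "white") +  # grayscale palette
  scale_y_reverse() +   # image origin is the top-left corner
  coord_fixed()         # keep pixels square
print(p_img)
```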
Note: I am using as.cimg since the result of the f function is a regular vector, which needs to be converted to a cimg object for further use.
For such a transformed image (a grayscale photo after histogram equalisation) the next natural step in image analysis is binarization. This helps in detecting the image’s background and foreground. In this process we substitute pixel values with 0 and 1, so that in the end the output image is only black and white. This format is also convenient for edge detection or blob extraction.
With imager we can perform threshold binarization with the threshold() function, which accepts a threshold given as a quantile, as a numeric value, or chosen automatically. The threshold() example is only an introduction to the next chapter: entropy based image binarization.
thr - a threshold, either numeric, or “auto”, or a string for quantiles
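The three forms of thr can be sketched as follows, assuming the bundled boats image; the values 0.5 and "20%" are arbitrary illustrations.

```r
library(imager)

im_gray <- grayscale(boats)   # bundled sample image as a stand-in

bin_num  <- threshold(im_gray, thr = 0.5)     # fixed numeric threshold
bin_q    <- threshold(im_gray, thr = "20%")   # quantile given as a string
bin_auto <- threshold(im_gray, thr = "auto")  # threshold chosen automatically

# Each result has the same dimensions as the input and only two values.
plot(bin_auto)
```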
Entropy based image binarization
Last year Zygmunt Zawadzki (zzawadz) developed the FSelectorRcpp package (still only available on GitHub, as I am still testing it, mea culpa…), which is an Rcpp (free of Java/Weka) implementation of the FSelector entropy-based feature selection algorithms with sparse matrix support. It can calculate the conditional entropy of a variable given the values of another feature. This is a good measure of how well one variable explains another, and it can be used in feature selection to reduce the curse of dimensionality or to select the threshold value in image binarization (my contrived idea).
Entropy and information gain
For the binarization I choose the threshold that maximizes the information gain (based on entropy) of the binarized image with respect to the image before binarization (the grayscale photo after histogram equalisation).
The information gain is defined as

$$IG(X \mid Y) = H(X) - H(X \mid Y),$$

where $H(X)$ stands for Shannon’s entropy:

$$H(X) = -\sum_{i} P(x_i) \log_b P(x_i),$$

where $b$ is the base of the logarithm used (a common value of $b$ is 2), and $H(X \mid Y)$ is the conditional entropy of $X$ given $Y$.
See an example of computing the information gain for the image binarized with threshold = 0.1.
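A base R sketch of that computation, not the FSelectorRcpp implementation. It rests on several assumptions: uniform random values stand in for the equalised image, the grayscale intensities are discretised into 16 levels before the entropy is computed, and the helper names entropy and info_gain are hypothetical.

```r
# Shannon entropy (base 2) of a discrete variable given as a vector of labels.
entropy <- function(x) {
  p <- table(x) / length(x)
  p <- p[p > 0]            # drop empty levels so that log2(0) never occurs
  -sum(p * log2(p))
}

# Information gain H(X) - H(X | Y), using the identity
# H(X | Y) = H(X, Y) - H(Y).
info_gain <- function(x, y) {
  h_xy <- entropy(paste(x, y))   # joint entropy of the pair (x, y)
  entropy(x) - (h_xy - entropy(y))
}

set.seed(1)
# Stand-in for an equalised grayscale image: values roughly uniform on [0, 1].
pixels <- runif(10000)

# Assumption: continuous intensities are discretised into 16 levels.
levels16 <- cut(pixels, breaks = seq(0, 1, length.out = 17),
                include.lowest = TRUE)

binarized <- pixels > 0.1
ig <- info_gain(binarized, levels16)
ig
```

The gain can never exceed the entropy of the binarized image itself, and the coarser the discretisation of the original image, the more the two can differ.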
Final binarization - threshold selection
The final step is to check all the possible thresholds (let’s assume that those are the values 0-255 divided by 255). Since every threshold can be checked independently, I am using the mclapply function from the parallel library to spread the procedure across multiple cores.
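The scan can be sketched as below. This is an illustration under assumptions rather than the original code: uniform random values stand in for the equalised image, the intensities are discretised into 16 levels, the helpers entropy and info_gain are hypothetical base R functions, and two cores are requested.

```r
library(parallel)

# Shannon entropy (base 2) of a vector of labels.
entropy <- function(x) {
  p <- table(x) / length(x)
  p <- p[p > 0]
  -sum(p * log2(p))
}

# Information gain H(X) - H(X | Y), with H(X | Y) = H(X, Y) - H(Y).
info_gain <- function(x, y) {
  entropy(x) - (entropy(paste(x, y)) - entropy(y))
}

set.seed(1)
pixels <- runif(5000)   # stand-in for an equalised grayscale image
levels16 <- cut(pixels, breaks = seq(0, 1, length.out = 17),
                include.lowest = TRUE)

thresholds <- (0:255) / 255

# mclapply relies on forking, which is not available on Windows,
# so fall back to a single core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Every threshold is evaluated independently of the others.
gains <- unlist(mclapply(thresholds, function(thr) {
  info_gain(pixels > thr, levels16)
}, mc.cores = n_cores))

best_thr <- thresholds[which.max(gains)]

plot(thresholds, gains, type = "l",
     xlab = "Threshold", ylab = "Information gain")
abline(v = best_thr, lty = 2)   # mark the selected threshold
best_thr
```

With a real image, pixels would be replaced by the vector of equalised intensities, and the binarized picture would be obtained by comparing the image with best_thr.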
Then I present the plot of the calculated information gain against the threshold. It surprises me that it is almost symmetric around 0.5. Maybe this is due to the histogram equalisation. In my opinion this is a great tool for detecting the background and the foreground of an image.
To sum up, I present my photo in grayscale after histogram equalisation and after the binarization based on information gain optimization.