In the previous lesson, we saw how to classify images from a single snapshot. In this lesson, we will take it to the next level. Instead of working with a single snapshot, we will classify images from a live webcam feed. The camera keeps capturing frames, and whenever we hold an object in front of it, the program classifies that object.

Our Target:

To classify images from a live camera feed. The following figure illustrates the

The Code

Here is the code that meets our target. It accesses the webcam, loads GoogLeNet, and continuously classifies images from the live webcam feed.

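clear all                      % release any webcam object left over from a previous run
the_camera = webcam;           % connect to the webcam
the_network = googlenet;       % load the pre-trained GoogLeNet
required_input_size = the_network.Layers(1).InputSize(1:2)   % image height and width the network accepts

h = figure;                    % figure window that shows the live feed

while ishandle(h)              % keep looping while the figure window exists
    single_Image = snapshot(the_camera);                                  % capture one frame
    image(single_Image)                                                   % display the frame
    single_Image = imresize(single_Image, required_input_size);           % resize to the network input size
    [predicted_Item,probability] = classify(the_network, single_Image);   % predicted label and class probabilities
    title({char(predicted_Item), num2str(max(probability),2)});           % show the label and the highest probability
    drawnow                                                               % refresh the figure
end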

How it Works:

If you look carefully, the code is almost identical to the code for classifying an image from a single snapshot. The only difference is the presence of a while loop. We can divide the code into two phases. They are:

  1. Initialization and
  2. The Loop.

Initialization Phase:

This block of code is the initialization phase:

1.	clear all 
2.	the_camera = webcam;
3.	the_network = googlenet;
4.	required_input_size = the_network.Layers(1).InputSize(1:2)
5.	h = figure;

Here, on the first line, we use ‘clear all’ to make sure the webcam object is freed before it is allocated again. When we run this code, MATLAB takes hold of the webcam, so it becomes unavailable to other programs. If we do not use ‘clear all’ at the beginning, the program will run smoothly the first time. However, it will throw an error on the second attempt, because the webcam is still occupied by the object created in the first run.
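As a gentler alternative to ‘clear all’, the webcam can be released on its own by clearing only the variable that holds it. The following is a minimal sketch, assuming the webcam object is stored in ‘the_camera’ as in this lesson and that a camera is attached; the variable ‘frame’ is only for illustration.

```matlab
% Sketch: release only the webcam instead of clearing the whole workspace.
% Assumes a webcam is attached and its object is stored in the_camera.
the_camera = webcam;              % connect to the webcam
frame = snapshot(the_camera);     % use the camera (here: grab one frame)
clear the_camera                  % clearing the object releases the camera
```

Placing ‘clear the_camera’ at the end of the script releases the camera when the script finishes normally. If the script stops with an error before that line, the camera is still held, which is why the lesson clears everything at the start instead.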

On the second line, we create the webcam object. GoogLeNet is loaded on the third line. To use GoogLeNet, we must know the image size it accepts. On the fourth line, we read the input size of GoogLeNet and store it in the ‘required_input_size’ variable. On the fifth line, we create a figure graphics object and store it in a variable named ‘h’.
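The input size matters because every frame has to be brought to that size before it is classified. Here is a small sketch of that step on its own. The 480-by-640 black image is only a stand-in for a real webcam frame (an assumption), and 224-by-224 is the value that ‘the_network.Layers(1).InputSize(1:2)’ reports for GoogLeNet in MATLAB.

```matlab
% Stand-in for a webcam frame (assumption: a 480-by-640 RGB image).
frame = zeros(480, 640, 3, 'uint8');

% Height and width GoogLeNet accepts; in the lesson this value comes from
% the_network.Layers(1).InputSize(1:2).
required_input_size = [224 224];

% imresize changes the height and width and keeps the three colour channels.
resized = imresize(frame, required_input_size);

disp(size(resized))   % the resized image is 224-by-224-by-3
```

Note that ‘imresize’ with a fixed target size does not preserve the aspect ratio, so a wide frame is squeezed horizontally before it reaches the network.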

That is all we have done in the initialization phase.

The Loop

This is the block of code we used in ‘the loop’ phase:

1.	while ishandle(h)
2.	single_Image = snapshot(the_camera);
3.	image(single_Image)
4.	single_Image = imresize(single_Image, required_input_size);
5.	[predicted_Item,probability] = classify(the_network, single_Image);
6.	title({char(predicted_Item), num2str(max(probability),2)});
7.	drawnow
8.	end

The loop runs as long as the figure object is valid. That is, the ‘ishandle’ function returns true for as long as the figure window is open, so the loop keeps running until we close the window. On the second line, the ‘snapshot()’ function captures an image from the webcam object. On the third line, the ‘image()’ function displays the captured image. The image must be resized before it is passed to the network. This is done on the fourth line.
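The behaviour of ‘ishandle’ can be checked without a webcam. The short sketch below opens a figure, closes it from code, and records what ‘ishandle’ returns before and after. The variable names ‘open_before’ and ‘open_after’ are only for illustration.

```matlab
h = figure;                    % open a figure window
open_before = ishandle(h);     % true while the window exists
close(h)                       % same effect as clicking the window's close button
open_after = ishandle(h);      % false once the window is gone

fprintf('before close: %d, after close: %d\n', open_before, open_after);
```

This is why closing the figure window is the way to stop the live classification loop.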

Once the image is resized, we can pass it to GoogLeNet and classify it. This is done with the ‘classify()’ function on the fifth line. It takes the network and the resized image as input, and returns the label of the predicted item together with the probabilities the network assigns to the classes. On the sixth line, the label and the highest probability are converted to text and set as the title of the figure. Finally, the ‘drawnow’ function updates the figure, so the image is shown with the label and its probability.
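Since the second output of ‘classify()’ holds one probability per class, it can also be used to list the few most likely classes instead of only the best one. The sketch below shows the idea on made-up numbers: the four scores and class names are invented for illustration. With the real network, the scores would come from ‘classify()’, and the class names are assumed to be available from the ‘Classes’ property of the network's last layer.

```matlab
% Made-up scores and class names, standing in for the output of classify().
probability = [0.05 0.70 0.10 0.15];
class_names = categorical({'cup', 'banana', 'mouse', 'pen'});

% maxk returns the k largest values and their positions, largest first.
[top_scores, top_index] = maxk(probability, 3);

% Print each of the three most likely classes with its probability.
for k = 1:numel(top_index)
    fprintf('%s: %.2f\n', char(class_names(top_index(k))), top_scores(k));
end
```

With these numbers, the loop prints banana first, then pen, then mouse.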

When ‘ishandle(h)’ returns false, in other words when the figure window has been closed, the loop ends on the eighth line.

Designing a neural network and reaching acceptable accuracy can be a complicated task. However, classifying images using a pre-trained network is simple and straightforward. In the previous lesson, we learned the basics of classifying images from a webcam snapshot using GoogLeNet. In this lesson, we have taken our skill one step further by learning to classify images from a live webcam feed.

Written by:
Nuruzzaman Faruqui,
Lecturer, Department of Computer Science and Engineering, City University

What is the best way to learn machine learning? Machine learning is an applied subject, and thus the best way to learn machine learning is by applying it. And who is the best machine learning teacher? The best machine learning teacher is Nuruzzaman Faruqui. He is one of the few machine learning teachers who understand the best way to teach machine learning.