Friday, October 14, 2011

Basic Video Processing

In this activity we try to obtain relevant scientific data from a video. We use a video of a free-falling object to obtain the acceleration due to gravity g.



We obtain and crop the relevant frames from the video:


To make the processing easier, we binarize the images:


Then, using a center-of-mass algorithm, we reduce each object to a point and graph its location with respect to time^2:


Comparing the obtained equation to 0.5gt^2 + vt, we can see that 0.5g = 4.6624, or g = 9.3248 m/s^2.
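The fit itself can be sketched as follows. This is a minimal illustration, assuming a hypothetical frame rate and synthetic centroid positions rather than the actual tracked data:

```python
import numpy as np

# Synthetic centroid positions for a free-falling object (illustrative values):
# y = 0.5 * g * t^2 + v0 * t, sampled at an assumed 30 fps.
g_true, v0, fps = 9.8, 0.2, 30.0
t = np.arange(10) / fps                 # time stamp of each frame
y = 0.5 * g_true * t**2 + v0 * t        # centroid distance fallen (m)

# Fit a quadratic y(t): the leading coefficient is 0.5 * g.
coeffs = np.polyfit(t, y, 2)            # [0.5*g, v0, offset]
g_est = 2.0 * coeffs[0]
print(round(g_est, 3))
```

With real data the quadratic coefficient would come out to the 4.6624 quoted above instead of the exact 0.5 * 9.8 used here.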

This was an easy activity, so I give myself a 9/10.


Pattern Recognition 3: Neural networks

A neural network is a computational model of how neurons work. Like an actual brain, neural networks 'learn' the recognition rules used to perform an operation through examples. The larger the training set, the more accurate the operation. This technique is often preferred over linear discriminant analysis because it processes inputs faster once it has learned.

The basic mathematical construct of a neuron is shown below:
A neuron accepts weighted inputs and sums them up. This sum is then acted on by an activation function g, which outputs the new signal z.

By connecting many neurons together, we are able to create a neural network. A typical network consists of an input layer, a hidden layer, and an output layer:
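A minimal forward pass through such a network can be sketched as follows. The weights here are arbitrary illustrative values, and a sigmoid is assumed for the activation function g:

```python
import numpy as np

def sigmoid(s):
    # Activation function g acting on the weighted sum.
    return 1.0 / (1.0 + np.exp(-s))

# One hidden layer: input (2) -> hidden (3) -> output (1).
rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 2))    # input-to-hidden weights
W2 = rng.normal(size=(1, 3))    # hidden-to-output weights

x = np.array([0.5, -1.0])       # example feature vector
z_hidden = sigmoid(W1 @ x)      # each hidden neuron: weighted sum, then g
z_out = sigmoid(W2 @ z_hidden)  # output neuron
print(z_out)
```

Training (learning the weights from examples) is the part the given code handled; this sketch only shows how a trained network maps features to an output.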


By applying the neural network algorithm to my object classes, I was able to obtain a 37.5% recognition rate. The poor recognition rate may be due to the small training sample size used; neural networks need a large training sample size in order to accurately process incoming information.

Since the code for the neural network was already given to us, and because I hardly understood what it does, I give myself a 6/10 for this activity.

Pattern Recognition 2

In this activity we discuss the pattern recognition technique of linear discriminant analysis (LDA).

The purpose of Discriminant Analysis is to classify objects into one of two or more groups based on a set of features that describe the objects. In general, we assign an object to one of a number of predetermined groups based on observations made on the object.

To use LDA for pattern recognition we need to know the conditional probability P(i|x) that an object belongs to group i. However, this is often hard to obtain directly. What we can obtain is the probability of observing a certain feature given that the object is from group i, which is P(x|i). The relation between the two probabilities is given by:

This equation, however, is impractical to use directly, since we would need a large sample size to estimate P(x|i) for each group. A more practical way is to assume a distribution and compute the probability theoretically. This is where we get the LDA formula:
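The result is a linear score per class. A minimal sketch of such a classifier, assuming Gaussian classes with a shared (pooled) covariance matrix and using synthetic feature data:

```python
import numpy as np

def lda_classify(X_train, y_train, x):
    """Assign x to the class with the largest linear discriminant score,
    assuming Gaussian classes sharing a pooled covariance matrix."""
    classes = np.unique(y_train)
    n = len(y_train)
    mus, priors = [], []
    C = np.zeros((X_train.shape[1], X_train.shape[1]))
    for c in classes:
        Xc = X_train[y_train == c]
        mus.append(Xc.mean(axis=0))
        priors.append(len(Xc) / n)
        C += (len(Xc) - 1) * np.cov(Xc, rowvar=False)
    C /= n - len(classes)                       # pooled covariance
    Cinv = np.linalg.inv(C)
    scores = [mu @ Cinv @ x - 0.5 * mu @ Cinv @ mu + np.log(p)
              for mu, p in zip(mus, priors)]
    return classes[int(np.argmax(scores))]

# Two well-separated synthetic classes in a 2-D feature space:
X = np.array([[1.0, 1.1], [1.2, 0.9], [0.9, 1.0],
              [5.0, 5.2], [5.1, 4.9], [4.9, 5.0]])
y = np.array([0, 0, 0, 1, 1, 1])
pred = lda_classify(X, y, np.array([1.0, 1.0]))  # feature vector near class 0
print(pred)
```

In the actual activity the feature vectors would be the normalized RGB and area values of each object rather than these synthetic points.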

Again, we apply the above equation to the 1-peso, 5-peso, leaf, and card classes, and again the characteristics I used were the RGB information and the area of each object. From these characteristics, I was able to obtain a 100% recognition rate for all classes. This may be because the objects within each class look very much alike. I also solved the problem of the areas being of a different order of magnitude than the other features by normalizing them, which may have helped the algorithm. This was an easy activity since the steps were already given, so I give myself a 10/10.

Pattern recognition 1

Pattern recognition is an important aspect of quality control in today's society: without accurate machine vision, subpar products may be produced. In the next few activities, we will discuss different techniques for pattern recognition.

In this activity, we discuss the use of minimum distance classification for pattern recognition.

If we define the representative of class ωj to be its mean feature vector, then:

where xj is the set of ALL feature vectors in class ωj and Nj is the number of samples in class ωj. The 'closeness' of an object to the representative can then be defined by the Euclidean distance:

In order to determine which class an object belongs to, we compute its distance to each class representative:
The object then belongs to the class with the smallest distance.
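The whole classifier fits in a few lines. A sketch, where the mean feature vectors below are illustrative values rather than the actual class means:

```python
import numpy as np

def min_distance_class(means, x):
    """Return the index of the class whose mean feature vector is
    closest (smallest Euclidean distance) to feature vector x."""
    dists = [np.linalg.norm(x - m) for m in means]
    return int(np.argmin(dists))

# Illustrative mean feature vectors (e.g. mean R, G, B plus area):
means = [np.array([0.8, 0.6, 0.3, 0.2]),   # class 0 representative
         np.array([0.3, 0.7, 0.2, 0.9])]   # class 1 representative
x = np.array([0.75, 0.62, 0.28, 0.25])     # unknown object's features
cls = min_distance_class(means, x)
print(cls)
```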

We test this algorithm on 4 classes:
1 peso coin

Leaf

Card

5 peso coin


The patterns were recognized with an accuracy of 75%. Oddly, all the leaves were classified as 1-peso coins, and this was the source of my errors.

All in all this was an easy activity, so I give myself an 8/10 since my algorithm only achieved 75% accuracy.





Color Image Segmentation

Normally we use thresholding to separate a specific object from its background. However, when the object has the same gray-level value as the background, this can be a problem. Instead, we use the difference in color information to separate the background and foreground. Images of 3-D objects will have shading variations, so the segmentation needs to be done regardless of the brightness of the color. This can be done using a color space that separates brightness from color information, such as the normalized chromaticity coordinates:
 where:

In this color space, r + g + b = 1 by construction, so b = 1 - r - g. With this we have a 2-D color coordinate plus the intensity information, which makes segmentation easier.
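The conversion can be sketched as follows; `to_ncc` is a hypothetical helper and the test pixels are illustrative:

```python
import numpy as np

def to_ncc(img):
    """Convert an RGB image (H x W x 3) to normalized chromaticity
    coordinates: r = R/(R+G+B), g = G/(R+G+B); b = 1 - r - g is implied."""
    I = img.sum(axis=2, keepdims=True)
    I[I == 0] = 1                    # avoid division by zero on black pixels
    ncc = img / I
    return ncc[..., 0], ncc[..., 1]  # keep only r and g

# A 1x2 test image: a pure-red pixel and a gray pixel.
img = np.array([[[255, 0, 0], [100, 100, 100]]], dtype=float)
r, g = to_ncc(img)
print(r, g)
```

Note that the gray pixel maps to r = g = 1/3 regardless of its brightness, which is exactly why this space is useful under shading variations.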

In this activity, we will use two techniques in order to segment an object from the background.

First we obtain an image containing our region of interest:


where our region of interest is:

The first of the two techniques is parametric probability distribution estimation. First we obtain the mean and standard deviation of the r and g coordinates of the region of interest, and then plug them into the equation:
where r is either the red or green color information of the whole image. After thresholding, we obtain the image:
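The parametric step can be sketched as follows, with synthetic ROI chromaticities and image pixels standing in for the real data:

```python
import numpy as np

def gaussian_prob(x, mu, sigma):
    # Probability of chromaticity value x under the ROI's Gaussian estimate.
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

def parametric_segment(r, g, roi_r, roi_g, thresh=0.5):
    """Segment using the product of independent Gaussians fitted to the
    ROI's r and g chromaticities, then threshold the joint likelihood."""
    p = (gaussian_prob(r, roi_r.mean(), roi_r.std()) *
         gaussian_prob(g, roi_g.mean(), roi_g.std()))
    return p / p.max() > thresh     # normalize so the threshold is relative

# Synthetic example: the ROI is reddish; the "image" has one reddish
# and one greenish pixel.
roi_r = np.array([0.60, 0.62, 0.58]); roi_g = np.array([0.20, 0.21, 0.19])
r = np.array([[0.61, 0.10]]); g = np.array([[0.20, 0.70]])
mask = parametric_segment(r, g, roi_r, roi_g)
print(mask)
```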

The other technique is histogram backprojection. We create the 2-D histogram of the region of interest:
Then, using this histogram to eliminate the colors not present in it, we obtain:
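A sketch of histogram backprojection, assuming chromaticity arrays r and g and the same kind of synthetic ROI values as above:

```python
import numpy as np

def backproject(r, g, roi_r, roi_g, bins=32):
    """Histogram backprojection: build a 2-D r-g histogram of the ROI,
    then look up each image pixel's bin; pixels whose chromaticity never
    occurs in the ROI get zero."""
    hist, r_edges, g_edges = np.histogram2d(roi_r, roi_g, bins=bins,
                                            range=[[0, 1], [0, 1]])
    ri = np.clip(np.digitize(r, r_edges) - 1, 0, bins - 1)
    gi = np.clip(np.digitize(g, g_edges) - 1, 0, bins - 1)
    return hist[ri, gi] > 0

# Reddish ROI; image with one reddish and one greenish pixel.
roi_r = np.array([0.60, 0.62, 0.58]); roi_g = np.array([0.20, 0.21, 0.19])
r = np.array([[0.61, 0.10]]); g = np.array([[0.20, 0.70]])
mask = backproject(r, g, roi_r, roi_g)
print(mask)
```

Unlike the parametric method, this makes no assumption about the shape of the ROI's color distribution, which is why it tends to look rougher on small ROIs.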

Quality-wise, parametric probability distribution estimation gives better results here than histogram backprojection. It is also easier to use.

This activity was not that hard, and although I hate histogram manipulation, having the 1-D code from the previous activity made it a lot easier to extend to 2-D. I give myself a 9/10 for this activity.


Image Compression


In this activity we will use principal component analysis (PCA) in order to compress an image.

First, we convert the image to grayscale as shown below:

The image is then cut into 10x10 blocks, which are flattened and concatenated and then fed to the PCA algorithm.

From this we get the eigenvalues of the image as shown below:


From this we reconstruct the image using different numbers of principal components:


It can be observed that the fewer principal components we use, the more the image is degraded. However, by increasing the number of eigenimages used, we also increase the size of the compressed image.
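The block-wise PCA compression can be sketched as follows; the blocks here are synthetic random data standing in for a real image, and `pca_compress` is a hypothetical helper:

```python
import numpy as np

def pca_compress(blocks, n_components):
    """Reconstruct image blocks from their first n_components principal
    components. blocks: (N, 100) matrix of flattened 10x10 blocks."""
    mean = blocks.mean(axis=0)
    centered = blocks - mean
    # Principal axes via SVD of the centered data (rows of Vt).
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    P = Vt[:n_components]            # top principal axes (eigenimages)
    coeffs = centered @ P.T          # projection coefficients per block
    return coeffs @ P + mean         # reconstruction from few numbers

# Synthetic "image": 50 random 10x10 blocks, flattened.
rng = np.random.default_rng(1)
blocks = rng.normal(size=(50, 100))
recon_few = pca_compress(blocks, 5)
recon_all = pca_compress(blocks, 50)
err_few = np.linalg.norm(blocks - recon_few)
err_all = np.linalg.norm(blocks - recon_all)
print(err_few > err_all)
```

The compression comes from storing only the coefficients and the retained eigenimages instead of every pixel of every block; using fewer components shrinks the stored size but raises the reconstruction error, exactly the trade-off seen in the figures above.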


This activity was a bit confusing, but I was able to do it in the end, so I give myself a 9/10.

Pre-processing Text


In this activity, we will use what we have previously learned in order to extract handwritten text from an image full of lines like the one shown below.

To make it less complicated, we used this part for text extraction:

First, we transform the image into a binary image:

Then, to make it easier to modify, we invert the pixel values:

In order to remove the line, we use binary closing with a straight line as the structuring element:

It can be observed that the characters D and E are readable; however, the characters M and O are fragmented to the point that they may be mistaken for different characters. Still, a powerful pattern recognition algorithm might detect the letters correctly.
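Morphological line removal can be sketched with scipy.ndimage on a tiny synthetic inverted image. Here I use opening with a vertical structuring element, a variant of the closing approach described above:

```python
import numpy as np
from scipy.ndimage import binary_opening

# Synthetic inverted binary image: a vertical text stroke crossed by a
# 1-pixel-thick horizontal ruled line.
img = np.zeros((9, 9), dtype=bool)
img[:, 4] = True   # vertical stroke (part of a character)
img[4, :] = True   # horizontal line to be removed

# A thin horizontal line cannot contain the vertical structuring element,
# so the erosion step erases it; vertical strokes survive and are restored
# by the dilation step.
selem = np.ones((3, 1), dtype=bool)
cleaned = binary_opening(img, structure=selem)
print(cleaned.astype(int))
```

This also illustrates the fragmentation problem: any purely horizontal part of a character (like the middle of an M) is removed along with the line.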

Another goal of this activity is to recognize text patterns in the image. We try to find multiple instances of the word “description” throughout the whole image using a sample image of the word.
This is the sample I used, since imcorrcoef() requires a square template:

Using imcorrcoef() with the sample, we obtain the image below:

Then converting it to binary we obtain:
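imcorrcoef() computes a normalized correlation between the template and the image; an equivalent sketch in Python, using a synthetic image with the template planted at a known position:

```python
import numpy as np

def match_template(img, tmpl):
    """Slide the template over the image and compute the normalized
    correlation coefficient at each valid position."""
    th, tw = tmpl.shape
    t = tmpl - tmpl.mean()
    out = np.zeros((img.shape[0] - th + 1, img.shape[1] - tw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            w = img[i:i + th, j:j + tw]
            wc = w - w.mean()
            denom = np.sqrt((wc ** 2).sum() * (t ** 2).sum())
            out[i, j] = (wc * t).sum() / denom if denom > 0 else 0.0
    return out

# Synthetic image containing the template at position (2, 3):
rng = np.random.default_rng(2)
img = rng.random((10, 10))
tmpl = rng.random((3, 3))
img[2:5, 3:6] = tmpl
corr = match_template(img, tmpl)
peak = np.unravel_index(np.argmax(corr), corr.shape)
print(peak)
```

Thresholding `corr` (the "converting it to binary" step above) then leaves bright spots only where the word occurs.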

It can be observed that the algorithm was able to locate all the instances of the word “Description” in the image. This was a relatively easy activity except for the mogrify function, which I wasn’t able to use properly. Therefore I give myself an 8/10 for this activity.