Friday, October 14, 2011

Image Compression


In this activity we will use principal component analysis (PCA) in order to compress an image.

First, we convert the image to grayscale as shown below:

The image is then cut up into 10x10 blocks, which are concatenated and fed to the PCA algorithm.

From this we get the eigenvalues of the image as shown below:


From this we reconstruct the image using different numbers of principal components:


It can be observed that the fewer principal components we use, the more the image is degraded. However, by increasing the number of eigenimages used, we also increase the size of the compressed image.
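The block-wise PCA described above can be sketched roughly as follows. This is written in Python with NumPy rather than the Scilab used for the activity, and the function names, the eigen-decomposition of the block covariance matrix, and the cropping of edge pixels that do not fill a block are all assumptions made for illustration, not the original code.

```python
import numpy as np

def image_to_blocks(img, b=10):
    """Cut a grayscale image into non-overlapping b x b blocks, one flattened block per row."""
    h, w = img.shape
    h, w = h - h % b, w - w % b              # drop edge pixels that do not fill a block
    blocks = img[:h, :w].reshape(h // b, b, w // b, b).swapaxes(1, 2)
    return blocks.reshape(-1, b * b), (h, w)

def blocks_to_image(rows, shape, b=10):
    """Inverse of image_to_blocks: put the flattened blocks back in place."""
    h, w = shape
    return rows.reshape(h // b, w // b, b, b).swapaxes(1, 2).reshape(h, w)

def pca_compress(img, k, b=10):
    """Reconstruct img from its first k principal components (eigenimages)."""
    X, shape = image_to_blocks(img.astype(float), b)
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = Xc.T @ Xc / (len(Xc) - 1)          # covariance of the block pixels
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]        # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    W = eigvecs[:, :k]                       # keep only k eigenimages
    coeffs = Xc @ W                          # k numbers stored per block
    recon = coeffs @ W.T + mean
    return blocks_to_image(recon, shape, b), eigvals
```

Each block is stored as k coefficients instead of 100 pixel values, which is where the trade-off between quality and size comes from.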


This activity was a bit confusing, but I was able to do it in the end, so I give myself a 9/10.

Pre-processing Text


In this activity, we will use what we have previously learned in order to extract handwritten text from an image full of lines like the one shown below.

To make it less complicated, we used this part to test the extraction:

First we transformed the image into a binary image:

Then, to make it easier to modify, we invert the pixel values:

In order to remove the line, we apply binary closing with a straight line as the structuring element:

It can be observed that the characters D and E are readable; however, the characters M and O are fragmented to the point that they may be mistaken for different characters. Still, if used with a powerful pattern recognition algorithm, it may be possible to detect the letters correctly.

Another point of this activity is to try to recognize text patterns from the image. We try to find multiple instances of the word “description” throughout the whole image using a sample image of the word.
This is the sample I used, since imcorrcoef() requires a square image.

Using imcorrcoef() with the sample, we obtain the image below:

Then, converting it to binary, we obtain:

It can be observed that the algorithm was able to locate all the instances of the word “Description” in the image. This was a relatively easy activity, except for the mogrify function, which I wasn’t able to use properly. Therefore I give myself an 8/10 for this activity.

Monday, September 12, 2011

Binary operation

Morphological operations like erosion and dilation are very useful for isolating parts of an image. However, these treatments often disfigure the image so much that it becomes of little use for further processing. Binary operations based on these basic functions were developed to compensate for this flaw.

Opening is where one erodes the image then dilates it using the same structuring element. It removes all white regions into which the structuring element does not fit. If we consider the regions with a value of zero as holes, the operator would 'open' those holes, hence the name.

Closing, on the other hand, is a dilation followed by an erosion using the same structuring element. It removes black regions smaller than the structuring element. It is like closing the holes in an image.
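The two operators can be sketched in a few lines. The code below is Python with NumPy rather than Scilab, and the helper names, the odd-sized structuring element with its origin at the center, and the treatment of everything outside the image as background are assumptions for illustration.

```python
import numpy as np

def erode(img, se):
    """Binary erosion: keep a pixel only if the whole SE centered there lies on set pixels."""
    img, se = np.asarray(img, dtype=bool), np.asarray(se, dtype=bool)
    h, w = img.shape
    ph, pw = se.shape[0] // 2, se.shape[1] // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)), constant_values=False)
    out = np.ones((h, w), dtype=bool)
    for dy, dx in zip(*np.nonzero(se)):
        out &= padded[dy:dy + h, dx:dx + w]
    return out

def dilate(img, se):
    """Binary dilation: set a pixel if the reflected SE centered there touches a set pixel."""
    img, se = np.asarray(img, dtype=bool), np.asarray(se, dtype=bool)
    h, w = img.shape
    sh, sw = se.shape
    ph, pw = sh // 2, sw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)), constant_values=False)
    out = np.zeros((h, w), dtype=bool)
    for dy, dx in zip(*np.nonzero(se)):
        out |= padded[sh - 1 - dy:sh - 1 - dy + h, sw - 1 - dx:sw - 1 - dx + w]
    return out

def opening(img, se):
    return dilate(erode(img, se), se)    # erosion first, then dilation

def closing(img, se):
    # dilation first, then erosion; objects touching the border get eroded
    # because pixels outside the image count as background in this sketch
    return erode(dilate(img, se), se)
```

With a 3x3 square structuring element, opening deletes an isolated white pixel while closing fills a one-pixel black hole inside a white region.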

For this activity, we will use these operators to estimate the area of simulated cells. We will also try to isolate cancer cells inserted among the normal cells.

First we cut the image into twelve 256 x 256 pixel images. This is to reduce the load of the program on the computer. Then we convert them into binary images as shown below.


As we can see, the images formed are very noisy. This can be fixed by opening the image using a circle that is a bit smaller than the cells as a structuring element. The images produced are shown below.


We then recombine these images back into one image:



We can see that although the noise has been dealt with, many of the cells are clumped together, which may cause our average to have a large error. To obtain the area, we used bwlabel to index each structure and then wrote a program to count the pixels in each structure. The value we obtained was 710.28 ± 522.88 pixels. As expected, the standard deviation of the area was large.
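The labeling and counting step can be sketched as below, in Python rather than Scilab. The flood-fill labeling stands in for bwlabel, and the function names and the choice of 4-connectivity are assumptions; the original program may have differed.

```python
import numpy as np
from collections import deque

def label_regions(binary):
    """Give every 4-connected white region its own integer label (0 is background)."""
    binary = np.asarray(binary, dtype=bool)
    h, w = binary.shape
    labels = np.zeros((h, w), dtype=int)
    current = 0
    for y in range(h):
        for x in range(w):
            if binary[y, x] and labels[y, x] == 0:
                current += 1                      # start a new region
                labels[y, x] = current
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and binary[ny, nx] and labels[ny, nx] == 0:
                            labels[ny, nx] = current
                            queue.append((ny, nx))
    return labels, current

def region_areas(labels, n):
    """Pixel count of each labeled region, in label order."""
    return np.bincount(labels.ravel(), minlength=n + 1)[1:]
```

Calling `areas = region_areas(*label_regions(img))` and then `areas.mean()` and `areas.std()` gives an estimate of the form "mean ± standard deviation". Clumped cells are counted as one large region, which inflates the spread.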

Using the average we obtained, we will try to isolate the simulated cancer cells from the image below.


Using a structuring element that is a bit larger than the radius of the original cells we obtain the image below:



As you can see, I was able to isolate the circles larger than the original cells (the cancer cells). Since this was an easy activity, I give myself a 10/10.

Monday, September 5, 2011

Morphological Operations

Morphological operations are post-processing steps applied to binary images in order to extract information or remove unnecessary structures. In this activity, we discuss the different types of morphological operations that can be done in Scilab.


Dilation

This morphological operation increases the area of the regions that have a value of 1. The dilation of set A by B is defined as the set of all translations z such that the reflection of B, translated by z, has a non-empty intersection with A. That sounds pretty complicated; to my understanding, it means using B to increase the area of A, as seen below:



The red regions are the parts of A that overlap with B, while the yellow shaded region is the dilated part of the original structure. Note, however, that this is only my understanding of dilation. When I wrote code for dilation, this is what happened:

We can see that although my predictions were correct for the most part, there were some errors. The most obvious one appears when using either the cross or the diagonal structuring element. In my predictions, the corners of the dilated image would be chipped like a sawtooth. Instead, what I obtained is a corner with only 1 pixel removed.


Erosion

As the opposite of dilation, erosion is used to decrease the area of the regions with a value of 1. The erosion of A by B is defined as the set of all points z such that B translated by z is contained in A. Erosion is designed such that the reduction in the area of A is determined by the structuring element B.
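Written out in set notation, the two definitions quoted in this post read as follows. The symbols for the reflection and the translation of B are the usual textbook ones and are introduced here only for illustration.

```latex
\hat{B} = \{\, -b \mid b \in B \,\}, \qquad (B)_z = \{\, b + z \mid b \in B \,\}

A \oplus B = \{\, z \mid (\hat{B})_z \cap A \neq \emptyset \,\}

A \ominus B = \{\, z \mid (B)_z \subseteq A \,\}
```

The first line defines the reflection and translation of B, the second is dilation, and the third is erosion.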

Unlike for dilation, where I just covered the edge of the image with the structuring element, for erosion I chose an origin for the structuring element. This origin is then placed on each pixel of the image, and if the whole structuring element is not inside the image, that pixel is set to zero. We then obtain an eroded image as shown below:


The red regions are the eroded parts of the image. Writing the code for erosion, we obtain these images:

Again, most of my predictions were correct. However, for the annulus and the cross eroded with the diagonal structuring element, the program produced a different image. This may be due to my arbitrary assignment of the origin.

Thin/Skel

a) Original Image, b) Skel, c) Thin
Thin and Skel are other morphological transforms available in the SIP toolbox in Scilab. The skel function "skeletonizes" the image by creating a hypothetical frame for it. The thin function, as its name says, thins the image. This is done by eroding the borders of the image.

All in all, it was an easy activity. However, I found drawing my predictions by hand (the original plan) quite a hassle, so I used Paint for my predictions. For this activity I give myself an 8/10.

Thursday, September 1, 2011

Enhancement by Histogram manipulation

I HATE THIS ACTIVITY!! There are two reasons why I'm late in posting my activities for 186: one is the submission for SPP, and the other is this activity. It took me until now to get the code for this to work, so I was only able to post this now.


Manipulation of an image’s histogram is one of the ways in which we can improve the quality of an image by enhancing some features of the image which are not normally seen with the naked eye. This is done by back projection using the cumulative distribution function (CDF) of the image. This back projection is shown below:
By doing so, we can enhance the image in such a way that the under- and overexposed parts of the image are normalized, thus increasing the amount of detail in the picture.
Take the figure below for example. It is an underexposed picture of a seaside restaurant with its CDF shown:

Now we take an ideal, in this case linear, CDF and use it to enhance the image like so:
We can see that the CDF of the fixed image has the same form as the ideal CDF that we used. However, note that the image only brightened a little. This may be attributed to the fact that the original image was saved in .jpg format. We learned from previous activities that .jpg uses lossy compression, so there is little information left in the dark regions of the picture. Even with histogram manipulation, we cannot recover information that is no longer present.
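The back projection can be sketched as a lookup table built from the two CDFs. The code below is Python with NumPy, not the Scilab code written for the activity; the function name `match_cdf` and the use of linear interpolation to invert the desired CDF are assumptions.

```python
import numpy as np

def match_cdf(img, desired_cdf):
    """Remap an 8-bit grayscale image so that its CDF follows desired_cdf.

    desired_cdf is a length-256 increasing array running from 0 to 1.
    """
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = np.cumsum(hist) / img.size             # CDF of the original image
    levels = np.arange(256)
    # back projection: for each gray level, read off its CDF value, then find
    # the gray level at which the desired CDF reaches that same value
    lut = np.interp(cdf, desired_cdf, levels)
    return np.round(lut).astype(np.uint8)[img]

linear_cdf = np.linspace(0.0, 1.0, 256)          # the "ideal" linear CDF
```

Calling `match_cdf(img, linear_cdf)` stretches a dark image over the full 0 to 255 range. A non-linear target, such as a sigmoid-shaped CDF, can be passed in the same way as long as it is increasing.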
Also note that the human senses are generally non-linear, so I tried different CDFs, with the results shown below:
Finally, I think writing code for histogram manipulation is quite a hassle. Thankfully, some graphics manipulation software has already incorporated this technique. One of those programs is GIMP:


It is actually quite annoying to use histogram manipulation in GIMP. It was so easy that it made me think "what the hell was I writing all that code for…"
All in all, I think this activity was quite the hassle. Back projection might seem simple when looking at Fig. 1, but when actually applied to code, it was quite confusing. My pride in not asking others for help also didn’t help. That’s the reason I’m posting this only now. Still, I give myself a 6/10 for this activity; at least I finished it… >.<






Monday, July 18, 2011

Properties of the 2D Fourier Transform


For this activity, we investigated the properties of the 2D Fourier transform (FT) of an image. For the first part of the activity, we produced different images and obtained their FTs. As we can see from Figure 1, a straight edge (i.e. a square) produces a line in the FT perpendicular to the edge in the image. An annulus also produces a ring in its FT, and the FT of an annulus with a straight edge has a broken line perpendicular to the image’s edge. The FT of a double slit is a single straight slit along the horizontal. The FT of a double pinhole along the x-axis is a series of slits of different widths and spacing.

For the second part of the activity, we simulated a sinusoid and obtained its FT. What we obtained is an image with two pinholes on the y-axis a few pixels apart.


However, if we increase the frequency of the sinusoid, we can see in the FT that the pinholes move further apart. This is because the pinholes mark the spatial frequency of the sinusoid. They are on the y-axis because the sinusoid propagates along the y-axis. The reason they move apart is that the center of the FT corresponds to a DC signal, or a frequency of zero. As we move further from the center, the frequency increases, so higher-frequency structures are found further from the center.
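This can be checked numerically with a short sketch in Python with NumPy (not the Scilab code from the activity). The function name, the 128-pixel image size, and measuring frequency in cycles across the image are assumptions for illustration.

```python
import numpy as np

def sinusoid_peaks(freq, n=128):
    """Return the (row, column) offsets from the FT center of the two strongest peaks
    for a sinusoid with `freq` cycles across the image, varying along y."""
    y = np.arange(n) / n
    img = np.sin(2 * np.pi * freq * np.tile(y[:, None], (1, n)))
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(img)))   # zero frequency moved to the center
    strongest = np.argsort(spectrum.ravel())[-2:]
    rows, cols = np.unravel_index(strongest, spectrum.shape)
    center = n // 2
    return sorted((int(r) - center, int(c) - center) for r, c in zip(rows, cols))
```

For a sinusoid with 4 cycles the two peaks sit 4 pixels above and below the center, and for 10 cycles they sit 10 pixels away, so the distance from the center reads off the spatial frequency directly.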



Rotating the sinusoid also rotates its FT. As before, the structures in the FT align with the direction in which the sinusoid propagates.


If we take the FT of two superimposed sinusoids, we obtain something like four dots at the corners of a square. I wasn’t actually expecting this, since I thought that the FT would look like a cross at the center.


By superimposing another sinusoid, things get even weirder. My prediction was that upon adding another sinusoid, another pair of dots would appear in the FT. Instead, there were 8 dots shifted from the center.




All in all, it wasn’t a particularly hard activity, so I would give myself an 8/10.

Wednesday, July 13, 2011

Fourier Transform Model of Image Formation


For this activity we familiarized ourselves with the technique of using Fourier transforms in image manipulation. Fourier Transform is a powerful and useful technique because it has the ability to isolate the different details of an image by separating structures of varying spatial frequencies. This allows the manipulation of only certain details (like noise and speckles) without distorting other details.

The first part of the activity is familiarization with the Fast Fourier Transform (FFT) algorithm used to compute the discrete Fourier transform of an image. Figure 1 shows the images created for this part. We used the FFT algorithm on a circle to obtain its Fourier transform (top row, second column). The image may look empty, but that is because the signal sits at the four corners of the image. This is because the algorithm arranges the output so that the diagonal quadrants are switched with each other. This can be resolved by using fftshift(), which swaps the quadrants back (top row, third column). Applying the FFT algorithm a second time acts like an inverse Fourier transform, returning the image from the frequency domain to the space domain. However, as we can see in the bottom right corner of the figure, the image is flipped after it is subjected to another FFT.
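Both effects can be reproduced with a short sketch in Python with NumPy (not the Scilab used in the activity). The off-center circle and the variable names are assumptions chosen so that the flip is visible.

```python
import numpy as np

n = 64
yy, xx = np.mgrid[:n, :n]
# off-center circle, so that a flipped copy differs from the original
img = ((xx - 20) ** 2 + (yy - 28) ** 2 <= 8 ** 2).astype(float)

F = np.fft.fft2(img)

# Without fftshift the zero-frequency (brightest) term sits at index [0, 0], a corner
corner_peak = np.unravel_index(np.abs(F).argmax(), F.shape)
# fftshift swaps the diagonal quadrants and moves that term to the center
center_peak = np.unravel_index(np.abs(np.fft.fftshift(F)).argmax(), F.shape)

# Applying the forward FFT twice returns the image mirrored about the origin
# (scaled by the number of pixels), not the image itself
twice = np.real(np.fft.fft2(F)) / img.size
flipped = np.roll(img[::-1, ::-1], 1, axis=(0, 1))

# A true inverse transform recovers the original orientation
back = np.real(np.fft.ifft2(F))
```

Here `twice` matches `flipped` and `back` matches `img`, which is the numerical counterpart of the flipped image in the figure.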

Figure 1.

The second part of the activity is familiarization with one of the most useful applications of the FFT. Convolution is the combination of two functions, and it can be computed by multiplying their Fourier transforms. The result is a function that looks like both of the original images.
For this part of the activity, we convolve circular apertures of varying sizes with an image to see the effect on image quality. With a small aperture, we see a low-resolution image with something like diffraction patterns. As we increase the aperture size, the quality of the convolved image improves.
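A rough sketch of this experiment in Python with NumPy follows. It assumes a square image and that the circular aperture acts as a mask on the image's Fourier transform, which is consistent with larger apertures giving sharper images; the function names are hypothetical.

```python
import numpy as np

def circular_aperture(n, radius):
    """White disc of the given radius centered in an n x n black image."""
    yy, xx = np.mgrid[:n, :n]
    return ((xx - n // 2) ** 2 + (yy - n // 2) ** 2 <= radius ** 2).astype(float)

def image_through_aperture(img, radius):
    """Multiply the FT of a square image by the aperture and transform back."""
    n = img.shape[0]
    # move the aperture center to index [0, 0] to match the unshifted FFT layout
    aperture = np.fft.ifftshift(circular_aperture(n, radius))
    return np.real(np.fft.ifft2(np.fft.fft2(img) * aperture))
```

A small radius keeps only the low spatial frequencies, so edges blur and ripple; a radius large enough to cover the whole Fourier plane returns the original image.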

Figure 2.

For the third part of the activity, we show one of the applications of convolution, which is text recognition. By convolving a text image with the pattern you want to find, the operation effectively correlates the two and pinpoints the matching locations.
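One common way to do this with FFTs is to multiply the transform of the image by the complex conjugate of the transform of the pattern. The sketch below, in Python with NumPy, assumes that approach and uses a hypothetical function name; it is not necessarily how the activity's Scilab code was written.

```python
import numpy as np

def correlate_fft(image, template):
    """Circular cross-correlation of image with template, computed with FFTs.

    The value at (y, x) is large when the template matches the image with
    its top-left corner placed at (y, x).
    """
    padded = np.zeros(image.shape, dtype=float)
    th, tw = template.shape
    padded[:th, :tw] = template                  # pad the template to the image size
    product = np.fft.fft2(image) * np.conj(np.fft.fft2(padded))
    return np.real(np.fft.ifft2(product))
```

The brightest point of the result marks the best match, and thresholding the result leaves only the locations where the pattern occurs.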

Figure 3.
 
For the last part of the activity, we used convolution for edge detection. We created a 3x3 matrix whose elements sum to zero, like the one below.

Figure 4.
 
This pattern is then convolved with the text image to obtain the edges of the image. The accuracy of the edge obtained seems to depend on the orientation of the negative integers in the pattern. The first edge was obtained by convolving the pattern above with the image. The second used a vertical pattern, as opposed to the first one. And lastly, the third edge used a pattern where the whole edge of the matrix was negative. If I were to choose, I think that the third one has the best formed edge.
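The idea can be sketched in Python with NumPy as below. The three kernels are hypothetical examples of zero-sum 3x3 patterns (a horizontal one, a vertical one, and one whose border is entirely negative); the exact matrices used in the activity are not reproduced in this post.

```python
import numpy as np

# Hypothetical zero-sum patterns, chosen for illustration
horizontal = np.array([[-1, -1, -1], [2, 2, 2], [-1, -1, -1]], dtype=float)
vertical = horizontal.T
spot = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], dtype=float)

def convolve3x3(img, kernel):
    """Direct 2D convolution with a 3x3 kernel, repeating border pixels at the edges."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")
    flipped = kernel[::-1, ::-1]                 # convolution flips the kernel
    out = np.zeros((h, w))
    for i in range(3):
        for j in range(3):
            out += flipped[i, j] * padded[i:i + h, j:j + w]
    return out
```

Because each pattern sums to zero, flat regions give zero and only changes in brightness survive. The horizontal pattern ignores a purely vertical edge, the vertical pattern picks it up, and the pattern with the negative border responds to edges in every direction, which may be why the third result looked best.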



All in all, I think this was an easy activity, since it was basically a familiarization with something I normally use. However, because there was no internet connection in the room, I forgot to post this on time T^T. That is why I would give myself an 8/10 for this activity.