EE 520 Homework 3 Questions 1,3,4
Kevin Mack, Clarkson University 10/6/2017
Question 1: Classifying Images of Cats and Dogs (LDA)
First we must read in the images of cats and dogs and arrange them into data matrices. Cats and
dogs each get their own data matrix, in which each column is one image. The code below
implements this and displays a few images from each class. The images are displayed in
grayscale, which is also how they will be processed for classification.
Listing 1: MATLAB code to read in the image data and display a set of images

% EE520 HW3 Q1
load('catData.mat')   % loads the cat data matrix
load('dogData.mat')   % loads the dog data matrix

% look at a set of 9 cats
cat_images = zeros(9,64,64);
for i = 1:9
    img = reshape(cat(:,i), [64,64]);   % img avoids shadowing the built-in image()
    cat_images(i,:,:) = img;
end

figure
for i = 1:9
    subplot(3,3,i)
    imshow(mat2gray(reshape(cat_images(i,:,:), [64,64])))
end

% look at a set of 9 dogs
dog_images = zeros(9,64,64);
for i = 1:9
    img = reshape(dog(:,i), [64,64]);
    dog_images(i,:,:) = img;
end

figure
for i = 1:9
    subplot(3,3,i)
    imshow(mat2gray(reshape(dog_images(i,:,:), [64,64])))
end
Figure 1: A set of the original images used for training. The images are of low resolution (64x64)
but will still be sufficient for classification. This is important because, if large images were used,
the computational complexity would vastly increase.
Wavelet Decomposition
In order to classify the images, it is important to first project them onto a basis that can extract
key features from the samples. In this case a wavelet basis is appropriate, because wavelets
are excellent at edge detection. Each data matrix is sent to a function called dc_wavelet.m,
which decomposes the images into the wavelet basis. The output of this function is a 1024xN
matrix, where N is the number of images in the original data set: the wavelet function reduces
each image to a 32x32 representation (from the 64x64 original image size). The following code
implements the wavelet decomposition and displays key features of the output. The output
images are shown in Figure 2.
Listing 2: MATLAB code for the wavelet decomposition

function [ dcData ] = dc_wavelet( dcfile )
%DC_WAVELET Receive image data and return a Haar wavelet decomposition matrix
[~, n] = size(dcfile);
nw = 32*32;                        % each 64x64 image reduces to 32x32
nbcol = size(colormap(gray), 1);   % number of gray levels for rescaling
dcData = zeros(nw, n);
for i = 1:n
    X = double(reshape(dcfile(:,i), [64,64]));
    [cA, cH, cV, cD] = dwt2(X, 'haar');   % single-level 2-D Haar transform
    cod_cH1 = wcodemat(cH, nbcol);        % rescaled horizontal detail coefficients
    cod_cV1 = wcodemat(cV, nbcol);        % rescaled vertical detail coefficients
    cod_edge = cod_cH1 + cod_cV1;         % combined edge content
    dcData(:,i) = reshape(cod_edge, [nw, 1]);
end
end
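The driver script that calls this function is not reproduced here; a minimal sketch of the call,
assuming the cat and dog matrices loaded in Listing 1 (the output names cat_wave and
dog_wave are illustrative, not taken from the original script), is:

% project both raw data matrices onto the Haar wavelet basis
cat_wave = dc_wavelet(cat);   % 1024 x N matrix of cat wavelet features
dog_wave = dc_wavelet(dog);   % 1024 x N matrix of dog wavelet features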
Figure 2: The wavelet decomposition for the cat (on the left) and the dog (on the right) is
displayed. The Haar wavelet was effective in detecting the edges in each picture. This indicates
that even these low-resolution images will be useful in a classification algorithm.
Linear Discriminant Analysis
Following the wavelet decomposition, the data is ready for Linear Discriminant Analysis. The
LDA algorithm is described in more depth in the report on Question 2, but in summary the
algorithm finds a direction that maximizes the separation between the two class means while
minimizing the scatter within each class. In this analysis, we will make use of the SVD in order
to extract the dominant modes of each decomposed image. The following MATLAB code
comprises the catdog_trainer.m function, which performs the LDA and outputs the results.
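For reference, the quantity being optimized is the Fisher criterion: the discriminant direction
$w$ maximizes

\[
J(w) = \frac{w^{T} S_B w}{w^{T} S_W w}, \qquad
S_B = (m_1 - m_2)(m_1 - m_2)^{T}, \qquad
S_W = \sum_{k=1}^{2} \sum_{x \in C_k} (x - m_k)(x - m_k)^{T},
\]

where $m_1$ and $m_2$ are the class means. Maximizing $J(w)$ is equivalent to solving the
generalized eigenvalue problem $S_B w = \lambda S_W w$, which is exactly the eig(Sb, Sw)
call in the listing below.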
Listing 3: MATLAB training algorithm for cat and dog images

function [ sortcats, sortdogs, result, w, U, S, V, th ] = catdog_trainer( cats, dogs, feature, plots )
%CATDOG_TRAINER Linear Discriminant Analysis for training images of cats and dogs
n1 = length(cats(1,:)); n2 = length(dogs(1,:));

[U,S,V] = svd([cats, dogs], 0);   % reduced SVD
animals = S*V';                   % projection of the data onto the SVD modes
U = U(:,1:feature);
cats = animals(1:feature, 1:n1);
dogs = animals(1:feature, n1+1:n1+n2);

if(plots)
    figure
    subplot(2,1,1)
    plot(diag(S), 'ko', 'Linewidth', 2)
    set(gca, 'Fontsize', 14, 'Xlim', [0 80])
    subplot(2,1,2)
    semilogy(diag(S), 'ko', 'Linewidth', 2)
    set(gca, 'Fontsize', 14, 'Xlim', [0 80])
end

[~, vcol] = size(V);
if(plots)
    figure
    basis = 3;
    for j = 1:basis
        subplot(basis,2,2*j-1), plot(1:(vcol/2), abs(V(1:(vcol/2),j)), 'ko-')
        subplot(basis,2,2*j), plot((vcol/2+1):vcol, abs(V((vcol/2+1):vcol,j)), 'ko-')
    end
    subplot(3,2,1), title('Cats')
    subplot(3,2,2), title('Dogs')
end

m1 = mean(cats, 2);
m2 = mean(dogs, 2);

Sw = 0;                           % within-class scatter matrix
for i = 1:n1
    Sw = Sw + (cats(:,i)-m1)*(cats(:,i)-m1)';
end
for i = 1:n2
    Sw = Sw + (dogs(:,i)-m2)*(dogs(:,i)-m2)';
end

Sb = (m1-m2)*(m1-m2)';            % between-class scatter matrix

[V2, D] = eig(Sb, Sw);            % generalized eigenvalue problem
[lambda, ind] = max(abs(diag(D)));
w = V2(:,ind); w = w/norm(w,2);

vcats = w'*cats; vdogs = w'*dogs; % project onto the discriminant direction

result = [vcats, vdogs];

if mean(vcats) > mean(vdogs)      % fix the sign of w so that
    w = -w;                       % cats < threshold < dogs
    vcats = -vcats;
    vdogs = -vdogs;
end
sortcats = sort(vcats);
sortdogs = sort(vdogs);

t1 = length(sortcats);
t2 = 1;

while sortcats(t1) > sortdogs(t2)
    t1 = t1 - 1;
    t2 = t2 + 1;
end
th = (sortcats(t1) + sortdogs(t2))/2;
if(plots)
    bins = 30;
    figure
    subplot(1,2,1)
    histogram(sortcats, bins); hold on, plot([th th], [0,10], 'r')
    %set(gca,'Xlim',[-1.55 -1.35],'Ylim',[0 10],'Fontsize',14)
    title('Cats')
    subplot(1,2,2)
    histogram(sortdogs, bins); hold on, plot([th th], [0,10], 'r')
    %set(gca,'Xlim',[31.595 31.5955],'Ylim',[0 10],'Fontsize',14)
    title('Dogs')
end
end
The results from the LDA are presented in Figures 3 and 4, which show the energy in each
principal mode, the first three principal modes, and the overall results respectively. Figure 5
shows the classification error as the number of principal modes (features) used for classification
is increased from 3 to 80. As can be seen from Figures 4 and 5, the algorithm performs fairly
well, and its accuracy increases as more features are used in the classification. Even though up
to 80 features are used, the LDA step saves this analysis from what is commonly referred to
as "the curse of dimensionality": LDA reduces the dimension of the projected data in a way
that maximizes the separation of the classes.
Figure 3: The graphs depict the energy in the modes (left) as well as the first three modes
themselves (right). In the first three modes, it can be seen that there are differences in the
values between the cat and dog images. It can also be seen that the first mode has significantly
more energy than all of the other modes.
Figure 4: The histograms display the clustering of the data as well as the threshold for
classification. It can be seen that some samples from each class fall on the wrong side of the
threshold. To mitigate this, more advanced techniques could be applied, or more sample images
could be used to train the algorithm.
Figure 5: This graph depicts the accuracy of the classification algorithm as the number of
features used is increased. In this case only a small set of images was tested, so the accuracy
values fluctuate slightly. It can be seen that after roughly forty features there isn't much benefit
to the added computation. In practice, the trade-off between computational complexity and
accuracy would be used to determine the number of features.
Question 2: Classifying Types of Music
See attached report.
Question 3: Classifying Images of Cats and Dogs (k-Means)
In this question we are tasked with attempting a different classification algorithm than the LDA
from Questions 1 and 2. In this case the k-means clustering algorithm is applied to the cat and
dog data from Question 1. The details of reading in and preparing the data for classification are
provided in the first section and will be skipped here. In this section the SVD is used to reduce
the number of dimensions of the dataset: the first three principal orthogonal modes were used to
represent the data in space before running the k-means clustering algorithm; a sketch of this
procedure is given below. A sub-sample of the clusters and their respective means is depicted
in Figure 6. In this case the fraction of correctly classified images was 0.87 for dogs and 0.95
for cats.
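The code for this question is not reproduced in this report; a minimal sketch of the approach
described above, assuming the wavelet-decomposed matrices cat_wave and dog_wave from
Question 1, is:

% reduce the wavelet data to its first three SVD modes, then cluster
X = [cat_wave, dog_wave];
[U,S,V] = svd(X, 0);
Y = (S*V')';                 % each row is one image in mode space
Y = Y(:,1:3);                % keep the three dominant modes
[idx, C] = kmeans(Y, 2);     % two clusters; C holds the cluster means
% k-means is unlabeled, so a majority vote decides which cluster is "cats"
n1 = size(cat_wave, 2);
cat_cluster = mode(idx(1:n1));
cat_acc = mean(idx(1:n1) == cat_cluster);
dog_acc = mean(idx(n1+1:end) ~= cat_cluster);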
Question 4:
Part A:
Question: What if you needed to make a partition of k > 2 elements?
Figure 6: Three dimensional graph displaying the clusters of data. The circles represent the cat
image data, while the squares represent the dog image data. The means are represented by
triangles. It can be seen that using three dimensions works well in this case, and there is no
need to include higher dimensions (which may actually reduce the accuracy of classification).
If more classes of images are used for classification, then the algorithms used (LDA and k-
means) will need to be applied in a more general form. In the case of LDA, it may be of interest
to partition the data in order to classify between one class and all others. In addition to this "one
against the rest" classification, it is possible to do pairwise classification of the samples; for
example, if there are classes A, B, and C, then you would classify between each possible pair
of classes. Additionally, it is possible to classify among m classes directly with multi-class LDA.
This requires computing the between-class and within-class scatter matrices over all classes, as
well as the statistics for each class, in order to perform dimensionality reduction and classification.
In the context of this homework, if we added images of giraffe faces to the mix of data, we could
attempt to pick out the images that belong to a certain class (cats, for example). If we do this for
each class, then we can apply each individual linear discriminant to determine the class of a
given image; a sketch of this one-against-the-rest scheme is given below.
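As a concrete illustration, the existing two-class trainer from Question 1 could be reused by
treating each class in turn as the first class and pooling the rest. This is only a sketch: the
giraffe_wave matrix and the choice of 20 features are hypothetical.

% one-against-the-rest using the existing two-class trainer (illustrative sketch)
classes = {cat_wave, dog_wave, giraffe_wave};   % giraffe_wave is hypothetical
for c = 1:length(classes)
    rest = horzcat(classes{[1:c-1, c+1:end]});  % pool all remaining classes
    [~,~,~,w{c},U{c},~,~,th(c)] = catdog_trainer(classes{c}, rest, 20, false);
end
% a new image is assigned to the class whose discriminant places it most
% confidently on the "own class" side of its threshold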
Part B:
Question: What if you want the method itself to suggest a good number of partition elements?
In the case of LDA, the number of classes must be known before the algorithm is applied,
because LDA belongs to the supervised class of learning algorithms. Unsupervised learning
methods, by contrast, operate without labeled training data; a few examples include k-means,
mixture models, neural networks (such as self-organizing maps), and hierarchical clustering.
It should be noted that k-means still requires the number of clusters k as an input, but the data
can be made to suggest k by running the clustering over a range of candidate values and scoring
each partition, for instance with the within-cluster sum of squares (the "elbow" method) or the
silhouette criterion; hierarchical clustering instead produces a full tree of partitions that can be
cut at whatever level the data best supports. A sketch of selecting k automatically is given below.
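MATLAB's evalclusters function (Statistics and Machine Learning Toolbox) automates this
selection; a minimal sketch, assuming the three-mode data matrix Y from Question 3, is:

% let the data suggest the number of clusters via the silhouette criterion
eva = evalclusters(Y, 'kmeans', 'silhouette', 'KList', 2:6);
k_best = eva.OptimalK;   % suggested number of clusters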