MTH6101: Introduction to Machine Learning

Main Examination period 2020 – May – Semester B

Question 1 [10 marks].

(a) Describe the problem of dimensionality reduction in unsupervised learning. [6]

(b) List two techniques for this problem. [4]
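
As a hedged illustration (not part of the original paper): principal component analysis (PCA) is one standard technique for this problem, and a minimal R sketch of it on simulated data, using the built-in prcomp function, might look as follows.

set.seed(3)
X <- matrix(rnorm(50 * 4), nrow = 50)          # illustrative data: 50 observations, 4 variables
pca <- prcomp(X, center = TRUE, scale. = FALSE)
summary(pca)                                   # proportion of variance explained by each component
Z <- pca$x[, 1:2]                              # data projected onto the first two principal components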

Question 2 [29 marks]. As part of the Karhunen-Loève expansion of the covariance matrix of a centered data set X with n = 100 observations in p = 5 variables, the following matrix Λ was computed.

(a) Complete the following table and determine the number of components to retain using an 80% cumulative-variance threshold. [12]
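
A hedged aside, since the matrix Λ is not reproduced here: with illustrative eigenvalues (not the exam values), the 80% cumulative-variance rule can be applied in R as follows.

lambda <- c(4.1, 2.3, 1.0, 0.4, 0.2)           # illustrative eigenvalues only, not the exam values
prop <- lambda / sum(lambda)                   # proportion of variance explained by each component
cumsum(prop)                                   # cumulative proportion of variance explained
which(cumsum(prop) >= 0.80)[1]                 # smallest number of components reaching the 80% threshold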

(b) Using the matrix Λ above, determine whether the data were scaled before computing the covariance matrix, and briefly explain why. [4]

(c) Write (do not derive) the formula that links Λ with D. Recall that Λ is the matrix of eigenvalues from the Karhunen-Loève decomposition of the covariance matrix Σ, and that D is the diagonal matrix of singular values from the singular value decomposition of the data matrix X. [6]

(d) Use the formula you wrote to determine numerically the singular values dᵢ of the singular value decomposition of the data matrix X. [7]
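
A hedged note on this relationship, assuming the covariance matrix is computed with the usual 1/(n − 1) factor (the exam does not state the convention): if the centred data matrix has SVD X = UDVᵀ, then Σ = XᵀX/(n − 1) = VD²Vᵀ/(n − 1), so λᵢ = dᵢ²/(n − 1). The R sketch below checks this numerically on simulated data.

set.seed(1)
X <- scale(matrix(rnorm(100 * 5), nrow = 100), center = TRUE, scale = FALSE)  # centred 100 x 5 data
lambda <- eigen(cov(X))$values                 # eigenvalues of the covariance matrix (Lambda)
d <- svd(X)$d                                  # singular values of X (diagonal of D)
all.equal(lambda, d^2 / (nrow(X) - 1))         # TRUE up to numerical error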

Question 3 [20 marks].

(a) Explain what is meant by single linkage in agglomerative clustering. [3]

(b) Consider the following distance matrix

where rows and columns are indexed as usual by individuals.

(i) If agglomerative single linkage clustering were to be performed, which individuals would be merged first and why? [4]

(ii) Explain why in the first step the result is the same regardless of the linkage used. [3]

(iii) Assume you are at a step in agglomerative clustering in which individuals 1,2,3 belong to one cluster and individuals 4,5 belong to another cluster. Using single linkage, find the distance between these two clusters. [5]

(iv) Using average linkage, give the distance between the two clusters in part (b)(iii). [5]
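
A hedged illustration of parts (iii) and (iv), using a made-up 5 × 5 distance matrix rather than the one in the exam: single linkage takes the minimum, and average linkage the mean, of all between-cluster distances.

D <- matrix(c( 0,  2,  6, 10,  9,
               2,  0,  5,  9,  8,
               6,  5,  0,  4,  5,
              10,  9,  4,  0,  3,
               9,  8,  5,  3,  0), nrow = 5, byrow = TRUE)  # illustrative distances only
A <- 1:3                                       # cluster containing individuals 1, 2, 3
B <- 4:5                                       # cluster containing individuals 4, 5
min(D[A, B])                                   # single-linkage distance between the two clusters
mean(D[A, B])                                  # average-linkage distance between the two clusters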

Question 4 [23 marks]. The following data are the results of a classification analysis. The output includes the validation labels Ytrue and the classifications obtained with three trained classifiers, termed Y1, Y2 and Y3.

##       Ytrue Y1 Y2 Y3
##  [1,]     1  1  0  1
##  [2,]     0  0  1  0
##  [3,]     1  1  0  0
##  [4,]     0  0  1  1
##  [5,]     0  1  1  0
##  [6,]     0  0  1  1
##  [7,]     0  0  0  0
##  [8,]     0  0  1  0
##  [9,]     0  0  0  0
## [10,]     1  1  0  0
## [11,]     1  1  0  1
## [12,]     1  1  0  0

(a) Complete the following confusion matrices. [9]

(b) Compute the False Positive Rate (FPR) and True Positive Rate (TPR) for each confusion matrix, completing the table below. [6]

(c) Plot your results on the ROC graph below and briefly comment on the performance of the classifiers. Which is the best classifier? [8]
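
A hedged R sketch, not part of the exam paper, showing how the confusion matrices and the FPR/TPR values in parts (a) and (b) could be computed from the printed output; the helper function fpr_tpr is introduced here purely for illustration.

Ytrue <- c(1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1)
Y <- cbind(Y1 = c(1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1),
           Y2 = c(0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0),
           Y3 = c(1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0))
table(truth = Ytrue, predicted = Y[, "Y1"])    # confusion matrix for classifier Y1
fpr_tpr <- function(truth, pred) {
  TP <- sum(truth == 1 & pred == 1); FP <- sum(truth == 0 & pred == 1)
  TN <- sum(truth == 0 & pred == 0); FN <- sum(truth == 1 & pred == 0)
  c(FPR = FP / (FP + TN), TPR = TP / (TP + FN))
}
apply(Y, 2, function(pred) fpr_tpr(Ytrue, pred))  # FPR and TPR for Y1, Y2 and Y3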

Question 5 [18 marks].

(a) The lasso criterion is L = (1/2)||Y − Xβ||₂² + λ||β||₁. Explain what the components of the lasso criterion are. [3]

(b) Explain what the solutions to the lasso are as λ → 0 and as λ → ∞. [2]

(c) The following table contains output from a lasso fit to a model with d = 3 variables and n = 20 observations. For each row in the table, compute the proportion of shrinkage s, defined as s = s(λ) = ||β(λ)||₁ / maxλ ||β(λ)||₁, and write its value in the correct position to complete the table. [6]

(d) Using the completed table, add the lasso paths to the following plot. In your plot, label each path according to its corresponding variable. [7]
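
A hedged sketch, assuming the glmnet package and simulated data with d = 3 and n = 20 (none of the numbers below come from the exam table): the proportion of shrinkage s(λ) and the lasso paths can be obtained as follows.

library(glmnet)
set.seed(2)
x <- matrix(rnorm(20 * 3), nrow = 20)          # simulated predictors: n = 20, d = 3
y <- x %*% c(3, -2, 0) + rnorm(20)             # simulated response
fit <- glmnet(x, y)                            # lasso fit over a grid of lambda values
l1 <- colSums(abs(as.matrix(fit$beta)))        # ||beta(lambda)||_1 for each lambda
s <- l1 / max(l1)                              # proportion of shrinkage s(lambda)
plot(fit, xvar = "norm", label = TRUE)         # coefficient paths against the L1 norm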
