Chapter 5

Exercise Solution for 5.1

This exercise asks us to interpret and validate the consistency within our clusters of data. To do this, we will employ the silhouette index, which gives us a silhouette value measuring how similar an object is to its own cluster compared to other clusters. The silhouette index is as follows:

S(i) = \frac{B(i) - A(i)}{\max(A(i), B(i))}

The book explains the equation by first defining the average dissimilarity of a point $x_{i}$ to a cluster $C_{k}$ as the average of the distances from $x_{i}$ to all of the points in $C_{k}$.
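To make the definition concrete, here is a minimal Python sketch of the silhouette value for a single point. It assumes the usual definitions, which the text above does not spell out: A(i) is the average dissimilarity of the point to the other members of its own cluster, and B(i) is the smallest average dissimilarity of the point to any other cluster. Euclidean distance, the function names, and the toy data are illustrative assumptions, not taken from the book, and the sketch expects at least two clusters.

```python
from math import dist  # Euclidean distance, Python 3.8+


def mean_dissimilarity(point, cluster):
    """Average distance from `point` to every point in `cluster`."""
    return sum(dist(point, other) for other in cluster) / len(cluster)


def silhouette(i, points, labels):
    """Silhouette value S(i) for points[i], given one cluster label per point."""
    own = labels[i]
    # A(i): average dissimilarity to the *other* members of the point's own cluster.
    own_others = [p for j, p in enumerate(points) if labels[j] == own and j != i]
    if not own_others:
        return 0.0  # common convention for a cluster with a single member
    a = mean_dissimilarity(points[i], own_others)
    # B(i): smallest average dissimilarity to any cluster the point is not in.
    b = min(
        mean_dissimilarity(
            points[i], [p for j, p in enumerate(points) if labels[j] == k]
        )
        for k in set(labels)
        if k != own
    )
    return (b - a) / max(a, b)


# Hypothetical toy data: two well-separated clusters on a line.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
labels = [0, 0, 1, 1]
values = [silhouette(i, points, labels) for i in range(len(points))]
```

With this toy data every value comes out close to 1, which is what we expect when each point sits much nearer to its own cluster than to the other one. Values near 0 would suggest a point on the border between two clusters, and negative values a point that may be assigned to the wrong cluster.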

