Data Analysis - Week 3

1. Exploratory Graphs
  • boxplot() - Produces box-and-whisker plot(s) of the given (grouped) values. With a formula such as y ~ grp, y is a numeric vector of data values that is split into groups according to the grouping variable grp (usually a factor).
  • barplot(height, ...) - Creates a bar plot with vertical or horizontal bars.
  • hist(x, ...) - Computes a histogram of the given data values.

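  • plot() - Generic function for plotting R objects; given two numeric vectors it draws a scatterplot. For more details about the graphical parameter arguments, see ?par.
  • hexbin() from the hexbin package - Bins points into hexagonal cells and counts the points in each cell; useful for scatterplots with many overlapping points.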
  • qqplot() - Produces a QQ plot of two datasets.
  • matplot() - Plot the columns of one matrix against the columns of another.
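
A minimal sketch of the base-graphics calls listed above, using the built-in airquality and iris datasets. The datasets, variable choices, and axis labels are assumptions made for illustration only; hexbin() is left out because it needs a separately installed package.

```r
# Built-in datasets, chosen here only for illustration.
data(airquality)
data(iris)

# Box plot of ozone grouped by month (formula interface: y ~ grp).
bp <- boxplot(Ozone ~ Month, data = airquality,
              xlab = "Month", ylab = "Ozone (ppb)")

# Bar plot of the number of recorded days per month.
counts <- table(airquality$Month)
barplot(counts, xlab = "Month", ylab = "Days recorded")

# Histogram of temperature; hist() also returns the bin counts.
h <- hist(airquality$Temp, xlab = "Temperature (F)",
          main = "Daily temperature")

# Scatterplot of two numeric variables (rows with NA are skipped).
plot(airquality$Wind, airquality$Ozone,
     xlab = "Wind (mph)", ylab = "Ozone (ppb)")

# QQ plot comparing the distributions of two samples.
qqplot(iris$Sepal.Length, iris$Petal.Length,
       xlab = "Sepal length quantiles", ylab = "Petal length quantiles")

# matplot: one line per column of the matrix, against a shared x vector.
m <- as.matrix(iris[, 1:4])
matplot(seq_len(nrow(m)), m, type = "l",
        xlab = "Row index", ylab = "Measurement (cm)")
```

The values returned by boxplot() and hist() (here bp and h) hold the numbers behind the plots, which is handy for checking what was actually drawn.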

2. Expository Graphs

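Commonly used arguments and functions: xlab and ylab (axis labels), main (plot title), and legend() (adds a legend to the plot).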

  • par(mfrow=c(1,2)) - mfrow and mfcol take a vector of the form c(nr, nc). Subsequent figures are drawn in an nr-by-nc array on the device, filled by rows (mfrow) or by columns (mfcol).
  • mtext(text="(a)",side=3,line=1) - Writes text in one of the four margins of the current figure region or one of the outer margins of the device region. (side: 1=bottom, 2=left, 3=top, 4=right; line: which margin line to write on, starting at 0 and counting outwards.)
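
The following sketch puts these pieces together in a two-panel figure. The dataset (airquality), the 80-degree threshold, the colours, and the adj = 0 placement of the panel labels are assumptions for illustration, not part of the course material.

```r
data(airquality)

# Switch to a 1-by-2 layout and remember the previous settings.
old_par <- par(mfrow = c(1, 2))

# Panel (a): histogram with explicit axis labels and a title.
hist(airquality$Temp, xlab = "Temperature (F)", ylab = "Days",
     main = "Daily temperature")
# Panel label in the top margin (side = 3), left-aligned so it
# does not sit on top of the title.
mtext(text = "(a)", side = 3, line = 1, adj = 0)

# Panel (b): scatterplot coloured by an illustrative grouping.
hot <- airquality$Temp > 80
plot(airquality$Wind, airquality$Ozone,
     col = ifelse(hot, "red", "blue"), pch = 19,
     xlab = "Wind (mph)", ylab = "Ozone (ppb)", main = "Ozone vs. wind")
legend("topright", legend = c("Temp > 80 F", "Temp <= 80 F"),
       col = c("red", "blue"), pch = 19)
mtext(text = "(b)", side = 3, line = 1, adj = 0)

# Restore the previous layout.
par(old_par)
```

Saving the result of par() and passing it back at the end keeps the layout change from leaking into later plots.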

Plots are saved to files in R through graphics devices such as pdf() and png(). Run ?Devices to see the full list.
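
As a rough sketch, saving a plot means opening a file device, drawing, and then closing the device. The output file name and the figure size below are hypothetical choices; the file goes into R's temporary directory so that nothing is overwritten.

```r
data(airquality)

# Hypothetical output path inside the session's temporary directory.
out_file <- file.path(tempdir(), "temperature_hist.pdf")

# Open a PDF device; width and height are in inches.
pdf(file = out_file, width = 6, height = 4)

# Everything drawn now goes to the file instead of the screen.
hist(airquality$Temp, xlab = "Temperature (F)",
     main = "Daily temperature")

# Close the device; the file is only complete after this call.
dev.off()
```

Other file devices such as png() follow the same open, draw, dev.off() pattern.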

The top ten worst graphs: http://goo.gl/pN5fQ

3. Hierarchical Clustering

dist() - Computes and returns the distance matrix obtained by applying the specified distance measure to the rows of a data matrix.

[Figure: Cluster Dendrogram]
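
A minimal sketch of how a dendrogram like this can be produced. The simulated data, the seed, and the use of hclust() and cutree() are assumptions added for illustration; the notes above only mention dist().

```r
# Simulated data: three well-separated groups of four points each.
set.seed(1234)
x <- rnorm(12, mean = rep(c(1, 5, 9), each = 4), sd = 0.2)
y <- rnorm(12, mean = rep(c(1, 5, 1), each = 4), sd = 0.2)
df <- data.frame(x = x, y = y)

# Pairwise distances between rows (Euclidean by default).
d <- dist(df)

# Agglomerative clustering on the distances
# (complete linkage by default).
hc <- hclust(d)

# Draw the tree; the default title is "Cluster Dendrogram".
plot(hc)

# Cut the tree into three groups and inspect the group sizes.
groups <- cutree(hc, k = 3)
table(groups)
```

The choice of distance measure (the method argument of dist()) and of linkage (the method argument of hclust()) can change the shape of the tree, so it is worth trying more than one.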

4. K-means Clustering

kmeans() - Perform k-means clustering on a data matrix.

[Figure: kmeans]
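
A short sketch of a kmeans() call on simulated data. The data, the seed, the number of clusters, and nstart = 10 are assumptions chosen for illustration.

```r
# Simulated data: three well-separated groups of four points each.
set.seed(1234)
x <- rnorm(12, mean = rep(c(1, 5, 9), each = 4), sd = 0.2)
y <- rnorm(12, mean = rep(c(1, 5, 1), each = 4), sd = 0.2)
df <- data.frame(x = x, y = y)

# Fit k-means with 3 clusters; nstart repeats the random
# initialisation and keeps the best solution.
km <- kmeans(df, centers = 3, nstart = 10)

# Colour each point by its assigned cluster and mark the centers.
plot(df$x, df$y, col = km$cluster, pch = 19, xlab = "x", ylab = "y")
points(km$centers, col = 1:3, pch = 3, cex = 2, lwd = 2)

# Cluster sizes and fitted centers.
km$size
km$centers
```

Because the starting centers are random, results can differ between runs unless the seed is fixed; a larger nstart makes a poor local solution less likely.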

5. Dimension Reduction

This part seems harder to understand than any other part...
