Correlation matrices

6 08 2009

Correlation matrices are a great way to visualize covariation in large datasets. Thanks to Sarkar (2008) and Friendly (2002), we can borrow these examples. First we need to load a few packages:

library(lattice)
library(ellipse)

Next, we import our data:

MyData <- read.table("/FILE/LOCATION/FILE.txt", header=TRUE,sep='\t',quote='')

Now for some tedious work. We’ll have to manually type in the name of each variable we’re interested in, exactly as it is named in the dataset (unless, of course, you want to analyze the whole shebang, which often isn’t desirable).

var <- c("var1","var2","var3")
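If typing the names gets too tedious, one option is to let R pick out the numeric columns and then trim that list by hand. The following is only a sketch on a made-up data frame: `toy` and `num.vars` are hypothetical names, not part of the original example.

```r
# Hypothetical toy data frame standing in for MyData
toy <- data.frame(
  id    = c("a", "b", "c", "d"),
  var1  = c(1.2, 2.4, 3.1, 4.8),
  var2  = c(10, 8, 7, 3),
  group = factor(c("x", "y", "x", "y"))
)

# Keep only the names of columns that hold numeric data
num.vars <- names(toy)[vapply(toy, is.numeric, logical(1))]
print(num.vars)
```

The resulting character vector can be used in place of a hand-typed one, or as a starting point to delete names from.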

We then proceed to make a correlation matrix with the selected variables vector “var”:

cor.MyData <- cor(MyData[,var],use="pair")
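The use argument decides what happens to missing values. The pairwise option ("pairwise.complete.obs", which R also accepts abbreviated as "pair") computes each correlation from the rows where both variables are present. A small sketch with made-up numbers (`toy`, `r.all` and `r.pair` are hypothetical names):

```r
# Made-up data with one missing value in each of the first two columns
toy <- data.frame(
  a = c(1, 2, 3, 4, NA),
  b = c(2, 4, 6, NA, 10),
  c = c(5, 3, 4, 1, 2)
)

# Default (use = "everything"): any pair involving a missing value gives NA
r.all <- cor(toy)

# Pairwise: each pair uses only the rows where both values are present
r.pair <- cor(toy, use = "pairwise.complete.obs")

print(r.all)
print(r.pair)
```

Keep in mind that with pairwise deletion each cell of the matrix may be based on a different subset of rows.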

And here’s the money shot; setting the order of the variables in the matrix based on the result of a cluster analysis. This is for easier visual interpretation.

ord <- order.dendrogram(as.dendrogram(hclust(dist(cor.MyData))))
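To see what this ordering does, here is a sketch on a small made-up matrix (the values in `r` are invented for illustration). Rows of the matrix that look alike end up next to each other:

```r
# Invented 4x4 correlation-like matrix: v1 and v3 behave alike, as do v2 and v4
r <- matrix(c( 1.0, -0.6,  0.9, -0.5,
              -0.6,  1.0, -0.7,  0.8,
               0.9, -0.7,  1.0, -0.6,
              -0.5,  0.8, -0.6,  1.0),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("v", 1:4), paste0("v", 1:4)))

# Same recipe as above: distances between rows -> clustering -> leaf order
ord <- order.dendrogram(as.dendrogram(hclust(dist(r))))

# Variable names in plotting order; v1/v3 and v2/v4 sit side by side
print(rownames(r)[ord])
```

Note that dist() here measures how similar two variables' whole correlation profiles are, not the correlation between the two variables themselves.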

We can then make our corrgrams. First, an easy one:

print(levelplot(cor.MyData[ord,ord],xlab=NULL,ylab=NULL,
at=do.breaks(c(-1.01,1.01),101),scales=list(x=list(rot=90)),
colorkey=list(space="top"),
col.regions=colorRampPalette(c("red","white","blue"))))

Simple correlation matrix

Then on to the more elaborate ones. First, an informative one using the ellipse package. We’ll have to write our own panel function here (again, thanks to Sarkar 2008), but it is completely generic, so cut ‘n’ paste should suffice.

panel.corrgram<-function(x,y,z,subscripts,at,
                         level=0.9,label=FALSE,...)
{
  # ellipse() comes from the ellipse package
  require('ellipse',quietly=TRUE)
  x<-as.numeric(x)[subscripts]
  y<-as.numeric(y)[subscripts]
  z<-as.numeric(z)[subscripts]
  # map each correlation to a fill colour
  zcol<-level.colors(z,at=at,...)
  # draw one ellipse per cell, shaped by the correlation
  for(i in seq(along=z)){
    ell<-ellipse(z[i],level=level,npoints=50,
                 scale=c(.2,.2),centre=c(x[i],y[i]))
    panel.polygon(ell,col=zcol[i],border=zcol[i],...)
  }
  # optionally print 100 * correlation on top of each ellipse
  if (label)
    panel.text(x=x,y=y,lab=100*round(z,2),
               cex=0.8,col=ifelse(z<0,'white','black'))
}

To create a PDF of the output, run the following code (remember your working directory!):

pdf('corrgram_ellipse.pdf',10,10)
# print() makes sure the plot is drawn even when the script is source()d
print(levelplot(cor.MyData[ord,ord],
at=do.breaks(c(-1.01,1.01),20),xlab=NULL,
ylab=NULL,colorkey=list(space='top'),
scales=list(x=list(rot=90)),
panel=panel.corrgram,label=TRUE))
dev.off()

Correlation matrix - ellipse

However, if you’re more into the Pac-Man look, this next one might be a better choice. We’ll start by writing a new panel function:

panel.corrgram.2<-function(x,y,z,
subscripts,at=pretty(z),scale=0.8,...)
{
require('grid',quietly=TRUE)
x<-as.numeric(x)[subscripts]
y<-as.numeric(y)[subscripts]
z<-as.numeric(z)[subscripts]
zcol<-level.colors(z,at=at,...)
for(i in seq(along=z))
{
lims<-range(0,z[i])
tval<-2*base::pi*
seq(from=lims[1],to=lims[2],by=0.01)
grid.polygon(x=x[i]+.5*scale*c(0,sin(tval)),
y=y[i]+.5*scale*c(0,cos(tval)),
default.units='native',
gp=gpar(fill=zcol[i]))
grid.circle(x=x[i],y=y[i],r=.5*scale,
default.units='native')
}
}
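To get a feel for what the loop draws, the angle calculation can be pulled out on its own. In this sketch, `wedge.angles` is a hypothetical helper that mirrors the lims/tval lines above; it is not part of Sarkar's code.

```r
# Hypothetical helper mirroring the lims/tval computation in panel.corrgram.2
wedge.angles <- function(z, step = 0.01) {
  lims <- range(0, z)                       # from 0 to z, or from z to 0 if z < 0
  2 * pi * seq(from = lims[1], to = lims[2], by = step)
}

tv.pos <- wedge.angles(0.25)    # positive correlation
tv.neg <- wedge.angles(-0.5)    # negative correlation

# Fraction of the full circle covered by each wedge
print(diff(range(tv.pos)) / (2 * pi))
print(diff(range(tv.neg)) / (2 * pi))
```

Since the panel function uses sin() for x and cos() for y, angle 0 points straight up, so positive correlations sweep clockwise from twelve o'clock and negative ones counter-clockwise. The filled share of the circle equals the absolute correlation.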

Which we’ll export:

pdf('corrgram_pacman.pdf',10,10)
# print() makes sure the plot is drawn even when the script is source()d
print(levelplot(cor.MyData[ord,ord],
at=do.breaks(c(-1.01,1.01),101),xlab=NULL,
ylab=NULL,colorkey=list(space='top'),
scales=list(x=list(rot=90)),
panel=panel.corrgram.2,
col.regions=colorRampPalette(c('red','white','blue'))))
dev.off()

Correlation matrix - pacman

Of course, changing the output size (and type) is easy: swap pdf() for png() (or the device of your choosing). The size in these examples was 10 by 10 inches; note that png() measures width and height in pixels by default.
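As a rough sketch of the PNG variant, the snippet below writes the simple corrgram for a tiny made-up matrix to a temporary file. The matrix `m`, the file name and the resolution of 150 dpi are arbitrary choices for illustration; units="in" keeps the size in inches, as with pdf().

```r
library(lattice)

# Tiny made-up correlation matrix, just to have something to plot
m <- matrix(c(1, 0.5, 0.5, 1), nrow = 2,
            dimnames = list(c("a", "b"), c("a", "b")))

# Write to a temporary directory; replace with a path of your choosing
out.file <- file.path(tempdir(), "corrgram_simple.png")

# units = "in" needs a resolution (res, in dpi) to convert inches to pixels
png(out.file, width = 10, height = 10, units = "in", res = 150)
print(levelplot(m, xlab = NULL, ylab = NULL,
                at = do.breaks(c(-1.01, 1.01), 101),
                col.regions = colorRampPalette(c("red", "white", "blue"))))
dev.off()
```

Unlike pdf(), png() produces a raster image, so pick a resolution that suits where the figure will end up.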

2 responses

3 11 2010
Evert Mouw

Great post! I often use correlation matrices for a fast exploratory scan of my data. Last time, I exported from R to CSV, then imported to Excel and used conditional formatting… It’s easy, but not as sophisticated and beautiful as this method. I’ll save your script.

31 03 2011
nano réta

Hello,
my question is: how do I find the correlation between the variables?

Thanks
