R - Extract correlated elements of a correlation matrix
I have a correlation matrix in R and want to know how many groups of elements correlate with each other at more than 95% (and to put these groups into vectors).
    x <- matrix(0,3,5)
    x[,1] <- c(1,2,3)
    x[,2] <- c(1,2.2,3)*2
    x[,3] <- c(1,2,3.3)*3
    x[,4] <- c(6,5,1)
    x[,5] <- c(6.1,5,1.2)*4
    cor.matrix <- cor(x)
    # keep only the lower triangle so each pair is counted once
    cor.matrix <- cor.matrix*lower.tri(cor.matrix)
    cor.vector <- which(cor.matrix>0.95, arr.ind=TRUE)
cor.vector contains:
         row col
    [1,]   2   1
    [2,]   3   1
    [3,]   3   2
    [4,]   5   4
That means, as expected, that vectors 1, 2 and 3 correlate with each other, and so do 4 and 5.
What I need is the 2 vectors c(1,2,3) and c(4,5) as the final result.
This is a simple example; I am processing large matrices, though.
Here's an approach using the igraph package:
    require(igraph)
    # each row of cor.vector is an edge between two correlated columns
    g <- graph.data.frame(cor.vector, directed = FALSE)
    split(unique(as.vector(cor.vector)), clusters(g)$membership)
    # $`1`
    # [1] 2 3 1
    # $`2`
    # [1] 5 4
What it does is find the clusters in the graph g (the disconnected sets), as illustrated in the figure below. Since the vertices used to create the graph are in the order they were entered (from cor.vector), the cluster memberships come out in the same order. That is: vertices c(2,3,5,1,4) have clusters c(1,1,2,1,2), for a total of 2 clusters (cluster 1 and cluster 2). So we use split with the cluster membership as the grouping variable.
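A possible variant, not from the answer above: build the graph straight from the thresholded correlation matrix, so that vertex i is simply column i and the result does not depend on the order in which vertices were entered. This is only a sketch. It assumes a recent igraph in which graph_from_adjacency_matrix() and components() are available, and the names adj and groups are made up for illustration. Unlike the cor.vector approach, columns that correlate with nothing would show up as groups of size one.

```r
library(igraph)

# Same toy data as in the question
x <- matrix(0, 3, 5)
x[, 1] <- c(1, 2, 3)
x[, 2] <- c(1, 2.2, 3) * 2
x[, 3] <- c(1, 2, 3.3) * 3
x[, 4] <- c(6, 5, 1)
x[, 5] <- c(6.1, 5, 1.2) * 4

# 0/1 adjacency matrix (hypothetical name): 1 where two columns
# correlate above the 0.95 threshold
adj <- (cor(x) > 0.95) * 1

# Undirected graph with one vertex per column; diag = FALSE drops the
# self-correlations on the diagonal
g <- graph_from_adjacency_matrix(adj, mode = "undirected", diag = FALSE)

# membership[i] is the cluster of column i, so it lines up with the
# column indices directly
groups <- split(seq_len(ncol(x)), components(g)$membership)
groups
```

For large matrices the dense adjacency matrix costs as much memory as the correlation matrix itself, so the edge-list approach above may still be the better fit there.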