taiyun / corrplot
A visual exploratory tool on correlation matrix
Home Page: https://github.com/taiyun/corrplot
License: Other
Hi, I have started using corrplot and appreciate this nice package, but it would be better if corrplot() could plot a matrix with NA values.
Currently it throws an error like the one below:
> M
[,1] [,2]
[1,] 1 NA
[2,] NA 1
> corrplot(M)
Error in if (min(corr) < -1 - .Machine$double.eps^0.75 || max(corr) > :
missing value where TRUE/FALSE needed
A simple solution may be to plot nothing if a cell value is NA.
Thanks in advance.
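A minimal sketch of how the failing range check could tolerate NA cells, assuming only the non-missing entries need to be checked. The helper name in_corr_range is hypothetical and not part of corrplot; the tolerance is the one visible in the error message above.

```r
# Hypothetical helper (not part of corrplot): NA-tolerant version of the
# range check that fails in the report above.
in_corr_range <- function(corr, tol = .Machine$double.eps^0.75) {
  vals <- corr[!is.na(corr)]            # drop NA cells before comparing
  if (length(vals) == 0) return(TRUE)   # nothing to check in an all-NA matrix
  min(vals) >= -1 - tol && max(vals) <= 1 + tol
}

M <- matrix(c(1, NA, NA, 1), nrow = 2)
in_corr_range(M)                                  # the NA cells are ignored
in_corr_range(matrix(c(1, 2, 2, 1), nrow = 2))    # 2 is out of range
```

Dropping the NA cells first also avoids the warnings that min() and max() emit with na.rm = TRUE when every cell is missing.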
I am trying to have numeric diagonal labels on corrplot. I have the correlation matrix M and the labels ids. I also opened a thread on SO about this case (link below). I think the line colnames(p.mat) <- rownames(p.mat) <- colnames(mat) <- c(ages) in the thread's function cor.mtest should associate ids with the diagonal labels.
ids <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
cor.mtest <- function(mat, ...) {
mat <- as.matrix(mat)
n <- ncol(mat)
p.mat<- matrix(NA, n, n)
diag(p.mat) <- 0
for (i in 1:(n - 1)) {
for (j in (i + 1):n) {
tmp <- cor.test(mat[, i], mat[, j], ...)
p.mat[i, j] <- p.mat[j, i] <- tmp$p.value
}
}
colnames(p.mat) <- rownames(p.mat) <- colnames(mat) <- diag.labels
p.mat
}
M<-cor(mtcars, diag.labels=ids)
corrplot(M, type="upper", order="hclust", tl.pos=c("td"), method="circle", tl.cex = 0.5, tl.col = 'black', diag = FALSE,
p.mat = p.mat, sig.level = 0.05)
OS: Debian 8.5
R: 3.3.1
Related: http://stackoverflow.com/q/40494979/54964
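One possible approach, sketched under the assumption that corrplot takes its text labels from the dimnames of the matrix it is given: attach the ids to the correlation matrix itself rather than passing them through cor() or cor.mtest. The corrplot call is left as a comment so the snippet runs without the package.

```r
ids <- 1:11                         # one id per mtcars column
M <- cor(mtcars)
stopifnot(length(ids) == ncol(M))   # the labels must match the matrix size

# Row and column names become the ids (stored as character strings).
dimnames(M) <- list(as.character(ids), as.character(ids))

# With corrplot installed, the ids would then be used as the labels:
# corrplot::corrplot(M, type = "upper", tl.pos = "td", diag = FALSE)
```

Because the labels are attached to the matrix, they should stay with their variables if order = "hclust" rearranges the rows and columns.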
With type="lower", is it possible to put the title in the upper section instead of at the top? Sometimes the title is far from the figure itself.
line 241 in corrplot.R.
original:
symbols(Pos, add = TRUE, inches = FALSE, circles = rep(0.5, len.DAT) * 0.85)
nonvisual circle perimeter:
symbols(Pos, add = TRUE, inches = FALSE, circles = rep(0.5, len.DAT) * 0.85, fg = bg)
I'm not sure why, but lintr always reports the following warning when built on Travis.
R/corrplot.R:310:12: warning: no visible global function definition for ‘corrMatOrder’
ord <- corrMatOrder(corr, order = order, hclust.method = hclust.method)
^~~~~~~~~~~~
If I run lintr locally in RStudio, I don't see any lints:
> lintr::lint_package()
Fabian Roger asked per email:
Is there any way for the output of corrplot (with is.corr = F) to be square for a matrix that is not?
Hi, there.
Thank you very much for developing such a good package for plotting the correlation matrix!
Unfortunately, I found that this package is unavailable for R 3.3.2.
So I am wondering whether you will update the package soon.
If not, maybe I need to use a previous version of R.
Thanks again.
Please take a look at this error, which occurs when you try to plot a correlation matrix in which all p-values are lower than or equal to the significance level.
Thanks in advance
Juan
require(Hmisc, quietly = TRUE)
require(corrplot)
p=rcorr(as.matrix(attitude[,1:3]))$P
cor_=cor(attitude[,1:3])
corrplot.mixed(cor_, upper='number',lower='ellipse', p.mat=p,
insig='blank')
Error in symbols(pos.pNew[, 1][ind.p], pos.pNew[, 2][ind.p], inches = FALSE
ind.p has length zero because there is no p-value greater than the significance level.
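The situation can be reproduced in base R, and a guard around the drawing step is one way it might be handled. This is only a sketch: draw_insig is a hypothetical stand-in for the part of corrplot that marks insignificant cells, not the package's actual code.

```r
p <- matrix(c(NA, 0.001, 0.001, NA), nrow = 2)  # every p-value is significant
sig.level <- 0.05
ind.p <- which(p > sig.level)   # which() drops the NA comparisons
length(ind.p)                   # no insignificant cells remain

# Hypothetical guard: skip the drawing step when there is nothing to mark.
draw_insig <- function(ind) {
  if (length(ind) == 0) return(invisible(FALSE))
  # ... the symbols() call for the insignificant cells would go here ...
  invisible(TRUE)
}
draw_insig(ind.p)
```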
Currently corrplot.mixed and plotCI are not really compatible.
corrplot::corrplot.mixed(M,lower='number',upper='circle',low=L,upp=U,plotCI='rect')
results in the following error:
Error in corrplot(corr, add = TRUE, type = "lower", method = lower, diag = (diag == :
method should be circle or square if draw confidence interval!
And only draws the upper half with confidence plots.
It should also draw the lower triangle with numbers.
It is also currently not clear how to specify which (the upper or lower) triangle to be used by "rect"
For a package that creates a visual it makes a lot of sense to have at least an example of those visuals on the frontpage of your repo. If you add something to your readme that could help get people excited about your package.
Code
library("psych")
library("corrplot")
M <- mtcars
M.cor <- cor(M)
p.mat.all <- psych::corr.test(M.cor, adjust = "none", ci = F)
alpha <- 0.05
col <- colorRampPalette(c("#BB4444", "#EE9988", "#FFFFFF", "#77AADD", "#4477AA"))
lapply(
c("r","p","t"),
function(ID) { # http://stackoverflow.com/a/40531043/54964
x <- p.mat.all[[ID]]
corrplot( M.cor,
p.mat = x,
sig.level = alpha,
insig = "blank",
)
})
Output: spearman r
diagonal is white, please see the linked thread for the figure
R: 3.3.1
OS: Debian 8.5
Related: http://stackoverflow.com/q/40533069/54964
As pointed out by @taiyun we should distinguish between 0 and NA in the plot.
It is not a good idea to plot nothing if a cell value is NA:
M1 = M2 = cor(mtcars)
diag(M1) = 0
diag(M2) = NA
# should not be the same
corrplot(M1)
corrplot(M2)
How about using '?' instead?
Newer versions of R support a method for hclust called ward.D2, which implements Ward's minimum variance method. The original method, simply ward, is now deprecated (it is now called ward.D). Its implementation was flawed in that it did not square the dissimilarity scores.
When using hclust.method="ward" in corrplot, a warning is flagged:
The "ward" method has been renamed to "ward.D"; note new "ward.D2"
However, using hclust.method="ward.D2" (or ward.D) results in an error:
Error in match.arg(hclust.method) :
'arg' should be one of “complete”, “ward”, “single”, “average”, “mcquitty”, “median”, “centroid”
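Until the method names are accepted, a possible workaround is to compute the ordering outside corrplot and pass in the reordered matrix. The sketch below assumes 1 - r as the dissimilarity between variables, which may differ from what corrplot does internally.

```r
M <- cor(mtcars)

# Assumption: 1 - r serves as the dissimilarity between two variables.
hc <- hclust(as.dist(1 - M), method = "ward.D2")
ord <- hc$order            # permutation of the variables from the dendrogram
M.ord <- M[ord, ord]       # correlation matrix reordered accordingly

# With corrplot installed, plot it without any further reordering:
# corrplot::corrplot(M.ord, order = "original")
```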
If you corrplot a correlation matrix where variables in the data have very long names, then the plot cuts off the top labels.
For example, the following code produces a plot in which the top labels are cut off:
set.seed(123)
rmat <- matrix(runif(100), ncol = 10)
colnames(rmat) <- c("the quick brown fox jumps over the lazy dog", "and then went to get ice cream", "A", "B", "C", "D", "E", "F", "G", "H")
M <- cor(rmat)
corrplot(M, type = "upper", tl.pos = "td",
method = "circle", tl.cex = 0.5, tl.col = 'black',
order = "hclust", diag = FALSE)
Removing tl.cex does not solve the problem.
Note that both plots also have a fairly excessive amount of white space to the left of the plot, but that is not the issue here.
As a workaround, I found that if I comment out line 183 of corrplot.R then the problem is reduced or resolved, although the colorlegend last value (-1) gets cut off the bottom of the plot.
Should the range of the lim.segment parameter in colorlegend() be [0, 1] rather than [-1, 1]?
The Black Duck Open Hub (formerly Ohloh.net) is an online community and public directory of free and open source software (FOSS), offering analytics and search services for discovering, evaluating, tracking, and comparing open source code and projects.
Thanks for a very versatile package!
I'm trying to adjust the size of the displayed correlation coefficients. In issue #36 it is suggested to use number.cex=0.5.
I have v0.73 installed from the repos, and when I add this to a basic call, as suggested there:
corrplot(cor(mtcars), method='number', number.cex=0.5)
I get the following warnings:
Warning messages:
1: "number.cex" is not a graphical parameter
2: "number.cex" is not a graphical parameter
3: "number.cex" is not a graphical parameter
4: "number.cex" is not a graphical parameter
5: "number.cex" is not a graphical parameter
6: "number.cex" is not a graphical parameter
7: In text.default(pos.xlabel[, 1], pos.xlabel[, 2], newcolnames, srt = tl.srt, :
"number.cex" is not a graphical parameter
8: In text.default(pos.ylabel[, 1], pos.ylabel[, 2], newrownames, col = tl.col, :
"number.cex" is not a graphical parameter
9: In title(title, ...) : "number.cex" is not a graphical parameter
and there is no change to the displayed text size.
Also, if I remove the repo version and try to install the developer version:
devtools::install_github("taiyun/corrplot")
I get the following error message:
Downloading github repo taiyun/corrplot@master
Error in system(full, intern = quiet, ignore.stderr = quiet, ...) :
error in running command
Any suggestions as to what to do about these?
Thanks.
Code
library('corrplot')
# http://www.sthda.com/english/wiki/visualize-correlation-matrix-using-correlogram
cor.mtest <- function(mat, ...) {
mat <- as.matrix(mat)
n <- ncol(mat)
p.mat<- matrix(NA, n, n)
diag(p.mat) <- 0
for (i in 1:(n - 1)) {
for (j in (i + 1):n) {
tmp <- cor.test(mat[, i], mat[, j], ...)
p.mat[i, j] <- p.mat[j, i] <- tmp$p.value
}
}
colnames(p.mat) <- rownames(p.mat) <- colnames(mat)
p.mat
}
M <- cor(mtcars)
p.mat <- cor.mtest(M)
title <- "ECG p-value significance"
col <- colorRampPalette(c("#BB4444", "#EE9988", "#FFFFFF", "#77AADD", "#4477AA"))
corrplot(M, method="color", col=col(200),
diag=FALSE, # tl.pos="d",
type="upper", order="hclust",
title=title,
addCoef.col = "black", # Add coefficient of correlation
# Combine with significance
p.mat = p.mat, sig.level = 0.05, insig = "blank"
)
The output is shown in the related thread: the title is positioned partly outside the plotting window, so about half of it is cut off.
R: 3.3.1
OS: Debian 8.5
Related thread: http://stackoverflow.com/q/40509217/54964
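A commonly suggested workaround is to reserve some room at the top through corrplot's mar argument. The sketch below assumes the installed version accepts mar in the usual c(bottom, left, top, right) order; the value 2 for the top margin is only a starting point to adjust.

```r
M <- cor(mtcars)

# Only attempt the plot when corrplot is available.
if (requireNamespace("corrplot", quietly = TRUE)) {
  corrplot::corrplot(M, type = "upper", diag = FALSE,
                     title = "ECG p-value significance",
                     mar = c(0, 0, 2, 0))  # extra space above for the title
}
```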
Describe the process of deployment to CRAN somewhere, either in README.md or in a separate document.
When making a corrplot that only has one column, the color bar in the legend has width = 0.
See discussion on stack overflow:
I am trying to show the correlation r values with at least two decimals instead of rounded (e.g. 0.99 instead of 1), as corrplot always rounds the r values.
Is there any parameter in the corrplot package that allows this?
Hi, I would like to plot some other pairwise statistics (e.g. odds ratios) with corrplot() using the is.corr = F option.
I found that your code raises the following error when the matrix (corr) contains NA values.
# Error in if (max(corr) * min(corr) < 0) { :
# missing value where TRUE/FALSE needed
I tried the following modification, adding na.rm = T in your corrplot() code:
if (!is.corr) {
if (max(corr, na.rm = T) * min(corr, na.rm = T) < 0) {
intercept <- 0
zoom <- 1/max(abs(cl.lim))
}
if (min(corr, na.rm = T) >= 0) {
intercept <- -cl.lim[1]
zoom <- 1/(diff(cl.lim))
}
if (max(corr, na.rm = T) <= 0) {
intercept <- -cl.lim[2]
zoom <- 1/(diff(cl.lim))
}
However, after this modification I got a new error:
# # Error in (n + 1 - n2):(n + 1 - n1) : result would be too long a vector
# # In addition: Warning messages:
# # 1: In min(corr, na.rm = TRUE) :
# # no non-missing arguments to min; returning Inf
# # 2: In max(corr, na.rm = TRUE) :
# # no non-missing arguments to max; returning -Inf
# # 3: In max(Pos[, 2]) : no non-missing arguments to max; returning -Inf
# # 4: In min(Pos[, 2]) : no non-missing arguments to min; returning Inf
Please let me know your advice.
Thanks.
Is there a way of changing the variable names such that plotmath expressions could be used? I've tried adding a labels argument with a vector of new labels, thinking that it might be passed through to text, but that didn't seem to work.
Disadvantages:
The question is whether the speed is still a problem and what can we do about it.
Now, the Travis-CI badge and Codecov badge point to my forked repository vsimko/corrplot.
What needs to be done:
For Travis-CI:
taiyun/corrplot repository in https://travis-ci.org/profile/taiyun
For Codecov:
After that, every push to the taiyun/corrplot repository will be automatically built and a code coverage report will be generated.
According to this blog:
http://www.r-bloggers.com/travis-ci-to-github-pages/
It is possible to publish HTML vignettes after Travis-CI build to http://corrplot.github.io/
It seems that bug #7 hasn't been fixed for the case when is.corr = F.
x <- matrix(0, ncol = 5, nrow = 5)
x[1,1] <- NA
corrplot(x) #this works
corrplot(x, is.corr = F) #this does not
Support for NAs in corrplot has been discussed in issues #55, #49, #46 and #7.
Some code utilizing the NAs is already located in tests/testthat/test-corrplot.R.
However, we also need some demo code snippets and plots in examples and vignettes, i.e.
vignettes/example-corrplot.R
vignettes/corrplot-intro.Rmd
Is there any way to have the correlation coefficients colored in black instead of in red and blue in the corrplot.mixed plots?
I found this question on stackoverflow:
http://stackoverflow.com/questions/30207260/corrplot-label-printing
df <- data.frame(
x1 = rnorm(20), x2 = rnorm(20), x3 = rnorm(20), x4 = rnorm(20),
x5 = rnorm(20), x6 = rnorm(20), x7 = rnorm(20), x8 = rnorm(20),
x9 = rnorm(20), x10 = rnorm(20), x11 = rnorm(20), x12 = rnorm(20))
cormatx <- cor(df)
corrplot(cormatx, method="color")
Now I can alter the position of the labels by adding tl.pos=... which, according to the package manual, only takes "lt", "ld", "td", "d" or "n" as arguments. These are "left and top", "left and diagonal", "top and diagonal", "diagonal" and "NULL" respectively. (To my knowledge all the arguments involving the "diagonal" option won't even work with method="color").
Is there a way to print only the top labels? I tried tl.pos="t", without any luck. I think that argument just isn't supported so it returned "default".
I saw that the cor.mtest function is being used.
Maybe we should include it in corrplot as a function.
https://github.com/taiyun/corrplot/blob/master/inst/doc/Makefile
otherwise you only see Error: could not find function "corrplot" after the package is built
I think that we actually don't need the github wiki.
I suggest to turn it off in: Settings->Features->Wiki checkbox
Hi,
Thanks for the great library. I have a correlation plot with 46 variables. When plotting with method="number", the size of the number exceeds the size of the square. Is it possible to change the font size of the correlation coefficients? Also, is it possible to change the number of decimals?
If not, this would be two great features!
Best,
Kristoffer
mar is set to c(0,0,0,0) by default and affects the whole environment outside the corrplot function, which is usually not expected. It would be better to store the par settings and restore them after plotting.
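The usual base R idiom for this is to keep the value returned by par() and restore it with on.exit(). A minimal sketch follows; plot_with_margins is a hypothetical stand-in for the plotting function, not corrplot's actual code.

```r
# Hypothetical wrapper illustrating the save/restore idiom.
plot_with_margins <- function(mar = c(0, 0, 0, 0)) {
  op <- par(mar = mar)   # par() returns the previous values of what it sets
  on.exit(par(op))       # restore them on exit, even if plotting fails
  plot.new()             # stand-in for the real drawing code
  invisible(NULL)
}

before <- par("mar")
plot_with_margins()
isTRUE(all.equal(par("mar"), before))   # margins are back to their old values
```

Using on.exit() rather than a trailing par(op) call means the caller's settings are restored even when an error interrupts the drawing.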
There is a parameter na.label. However, it currently only allows for a single character (default "?").
It might be useful to support more characters with some reasonable upper limit.
Example:
corrplot(M2, na.label = "NA", number.cex = .7) # works from v0.76
@taiyun Before I implement this, let's discuss the following:
When computing a weighted correlation matrix with wt.cor(.., cor=TRUE)$cor, diagonal correlations can be slightly above 1 or slightly below -1, but corrplot raises an error, which is likely to confuse users since the printed values are either 1 or -1. The checks could be made a bit more tolerant by checking for x <= 1 + 2 * .Machine$double.eps and x >= -1 - 2 * .Machine$double.eps (which worked in my case, but one could be a bit more tolerant...).
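Independently of how tolerant the check becomes, a user-side workaround is to clamp the matrix into [-1, 1] before plotting. The sketch below simulates the overshoot on the diagonal instead of computing a weighted correlation, so the size of the overshoot is an assumption.

```r
# Simulate a correlation matrix whose diagonal overshoots 1 by rounding error.
M <- cor(mtcars)
diag(M) <- 1 + 4 * .Machine$double.eps
max(M) > 1                           # the overshoot is real, although it prints as 1

# Clamp into [-1, 1]; pmin()/pmax() keep the dim and dimnames of M.
M.clamped <- pmin(pmax(M, -1), 1)
max(M.clamped) <= 1
```

Clamping also hides values that are genuinely out of range, so it seems safer to confirm first that max(abs(M)) - 1 is of the order of machine precision.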
Thanks for the great package. I've installed the update, but I still get an error when running a corr matrix with NAs. When will this be available to the public? Also, why not just display "NA" rather than "?"? The value may not be in question ("?"), but may be genuinely not applicable ("NA"). Thanks.
The function corrRect.hclust is exported (see NAMESPACE) but doesn't have any documentation.
Some parameters, such as k or method, are still documented within corrRect.
Hi,
Not sure if I've missed this in the documentation, but it would be nice to be able to define plot ranges.
For example, if I'm not interested in extreme correlations +1/-1 and, for the purposes of clarity, I wanted to remove those from the plot and concentrate on, say, the 0.6/-0.6 range there is currently no mechanism in corrplot to do that (as far as I can tell).
This might be a bug, inconsistent API, missing feature or bug in the documentation.
In corrplot.mixed.R the parameter addgrid.col is documented as:
#' @param addgrid.col The color of grid, if \code{NULL}, don't add grid.
In corrplot.R the parameter addgrid.col is documented as:
#' @param addgrid.col The color of grid. The default value is depends on
#' \code{method}, if \code{method} is \code{color} or \code{shade}, the
#' default values is \code{"white"}, otherwise \code{"grey"}.
It looks like it should be possible to use addgrid.col to disable the rendering of the grid, but it does not work. See the following code snippet:
M <- matrix(runif(2500, 0.5, 1), nrow = 50)
corrplot(M, method = "color", cl.pos = "n", tl.pos = "n", addgrid.col = NULL)
I couldn't find any option to expand the margins for plot generated by corrplot.mixed. How can I do it? Thanks!
I'm using ver 0.79 2016-08-17 Github (ab158bb)
Currently, the default value for the lim.segment parameter in the colorlegend function is NULL.
I would like to propose a better, more intuitive value: "auto".
I found the following blog post:
Visualize correlation matrix using correlogram
It contains a very nice collection of plots that would be a nice vignette for our package.
Any opinion on this ?
Try to write some tests for the new number.digits parameter.
Here, we can collect links to stackoverflow that contain interesting bug reports, questions, feature requests, tips, hacks etc.
I saw that we have a non-mergeable pull request from @zhilongjia, who suggested adding separate parameters tl.x.col and tl.y.col instead of the current tl.col. This looks like a reasonable feature.
However, it would also break the current API.
Therefore, I would suggest keeping the tl.col parameter, which can contain either a single color name as a string, e.g. "red", or separate colors as a pair, e.g. c("red", "blue"): red for the X labels, blue for the Y labels.
Any suggestions?
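To make the proposal concrete, here is a rough sketch of how a single tl.col argument could be normalised internally. The helper split_tl_col is hypothetical and does not exist in corrplot.

```r
# Hypothetical helper: accept one colour for both axes or a pair (x, y).
split_tl_col <- function(tl.col) {
  stopifnot(length(tl.col) %in% 1:2)
  cols <- rep_len(tl.col, 2)      # a single colour is recycled to both axes
  list(x = cols[1], y = cols[2])
}

split_tl_col("red")               # same colour for X and Y labels
split_tl_col(c("red", "blue"))    # red for X labels, blue for Y labels
```

Existing calls that pass a single colour would keep working unchanged, which is the point of keeping one parameter.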
Return value from corrplot.mixed differs from corrplot.
The main corrplot function now returns invisible(corr), which is useful for testing.
However, the function corrplot.mixed does not return anything.
I suggest returning the same: invisible(corr)