mtorchiano / effsize

Effsize - a package for efficient effect size computation

License: GNU General Public License v2.0

R 100.00%

effsize's Issues

cohen.d in effsize package

I use effsize_0.7.6.
stdev = sd[2]
must be
stdev = s[2]

Problem with Cliff delta for double factors

When both the values and the groups are factors, cliff.delta (both the default and the formula interface) breaks. The values must first be converted to numeric values with as.numeric().

MWE:

# this is the data (stringsAsFactors = TRUE keeps both columns as factors on R >= 4.0)
d <- data.frame(v = c("A","B","A","C","B","C","B","B","C","B"),
                f = rep(c("G1","G2"),each=5),
                stringsAsFactors = TRUE)
# this breaks
cliff.delta(v ~ f, data=d)
# this breaks
cliff.delta(d$v , d$f)
# this is working
cliff.delta(as.numeric(d$v) , d$f)

feature request: Cohen's d and Hedge's g for Welch's tests

Recently a couple of papers have highlighted the importance of always defaulting to Welch's variants of t-test and ANOVA, which don't assume equal variance [1, 2].

In this spirit, I was wondering if it would be possible to introduce var.equal argument to effsize::cohen.d(). This way, when the user implements Welch's variants, they can set var.equal = FALSE and get appropriate d and g values and their confidence intervals.

Maybe you have a better idea about how to implement this, but this is one approach I found online
(https://stats.stackexchange.com/questions/210352/do-cohens-d-and-hedges-g-apply-to-the-welch-t-test).

References:

Delacre, M., Lakens, D., & Leys, C. (2017). Why Psychologists Should by Default Use Welch’s t-test Instead of Student’s t-test. International Review of Social Psychology, 30(1), 92. https://doi.org/10.5334/irsp.82
Delacre, M., Lakens, D., Mora, Y., & Leys, C. (2018). Why Psychologists Should Always Report the W-test Instead of the F-Test ANOVA. https://doi.org/10.17605/OSF.IO/WNEZG

feature request: Cohen's d and Hedge's g for one-sample t-test

As far as I can tell, this is currently not supported.

The syntax could look something like this, for example:

treatment = rnorm(100,mean = 10)
control = rnorm(100,mean = 12)
cohen.d(~d, mu = 0)

How can you specify a specific SD in Cohen's d calculation?

My understanding is that it's sometimes desirable to specify the SD of one of the two factors (e.g. the baseline SD) when calculating Cohen's d. This might already be possible, but it wasn't clear to me when reading the cohen.d() docs.

Minor mistake in documentation for cohen.d

Hi,

in the current docs for the "pooled" parameter of cohen.d on page 5 it says: "pooled: a logical indicating whether compute pooled standard deviation or the whole sample standard deviation. If pooled=FALSE (default) pooled sd is used[...]" I think the correct version should be "pooled=TRUE(default)".

Best regards,
Xethic

Wrong magnitude labels for `VD.A()`?

I think the labels for the magnitude in VD.A() are not correct. Look e.g. at

effsize::cohen.d(rnorm(10000), rnorm(10000, .51))
effsize::VD.A(rnorm(10000), rnorm(10000, .51))

A function to convert A to d can be qnorm(A)*sqrt(2).
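
The conversion mentioned above can be sketched as a pair of small helpers. This is only a sketch assuming two normal populations with equal variance, in which case A = pnorm(d / sqrt(2)); the names a_to_d and d_to_a are made up for illustration and are not part of effsize.

```r
# Hypothetical helpers, not part of effsize.
# Assumption: normal data with equal variances, so A = pnorm(d / sqrt(2)).
a_to_d <- function(A) qnorm(A) * sqrt(2)
d_to_a <- function(d) pnorm(d / sqrt(2))

a_to_d(0.5)                # A = 0.5 (no effect) maps to d = 0
d_to_a(c(0.2, 0.5, 0.8))   # A values corresponding to common d thresholds
```

Comparing d_to_a() at the d thresholds with the A thresholds used for the magnitude labels is one way to check whether the two scales agree.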

Write a complete vignette

The doc should include performance considerations and refer to other Cliff delta implementations.

Missing check on cliff delta input

If the input vectors contain just one element there is an unexpected error.

cliff.delta( 1, factor("a",c("a","b")) )

returns the following error message:

Error in if (d == 1) { : missing value where TRUE/FALSE needed

Compute Glass's Delta when `pooled=FALSE`

When pooled=FALSE, the cohen.d function currently computes the sd on the whole sample.
This kind of computation is seldom (if ever) useful.

What about letting the function compute Glass's delta, where

$$ \Delta = \frac{\overline{x}_1 - \overline{x}_2}{s_2} $$
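
As a rough illustration, Glass's delta is usually defined as the difference between the two group means divided by the standard deviation of the second (e.g. control) group. The helper name glass_delta and the data below are invented for this sketch; nothing here is taken from the effsize source.

```r
# Hypothetical helper, not part of effsize: the scale is the standard
# deviation of one group only (here x2, e.g. the control group).
glass_delta <- function(x1, x2) {
  (mean(x1) - mean(x2)) / sd(x2)
}

treatment <- c(12, 14, 11, 15, 13)   # made-up data
control   <- c(10, 11,  9, 12, 10)   # made-up data
glass_delta(treatment, control)
```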

Error in the CI formula?

I am not sure if I understand that correctly, but it seems to me that the line 265 in R/CohenD.R which is currently

Z = -qt((1-conf.level)/2,df)

should be

Z = -qt((1-conf.level/2),df)
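
One way to compare the two expressions is simply to evaluate both. By symmetry of the t distribution, -qt(p, df) equals qt(1 - p, df), so each can be checked against the usual upper quantile. The value of df below is arbitrary and chosen only for illustration.

```r
conf.level <- 0.95
df <- 18  # arbitrary degrees of freedom, for illustration only

current  <- -qt((1 - conf.level) / 2, df)   # expression currently in the code
proposed <- -qt((1 - conf.level / 2), df)   # expression proposed above

current
proposed
qt(1 - (1 - conf.level) / 2, df)  # upper 97.5% quantile, for comparison
```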

Typo in sample code

[From email]

In file CohenD.R, the sample code at the end

d <- rnorm(200)
f <- rep(c(1,2),400)
cohen.d(d ~ factor(f))

should be:

d <- rnorm(800)
f <- rep(c(1,2),400)
cohen.d(d ~ factor(f))

Paired cohen.d with missing data

Invoking

cohen.d(gr.1.measure, gr.2.t.measure, paired = T, na.rm = T)

fails with a message saying that a paired test is not doable with NAs, even though I am specifying na.rm = T. Excluding the pairs involving NAs manually and re-running cohen.d works fine.

The error is "Paired computation requires equal number of measures."

Maybe the code does not eliminate the pairs when only one item of a pair is missing
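
The manual workaround described above might look like the following sketch, which drops every pair in which at least one measurement is missing before calling cohen.d. The data are made up, and the final call is left commented out so the snippet runs without effsize installed.

```r
# Made-up paired measurements; NA marks a missing value.
before <- c(10, 12, NA, 15, 11, 13)
after  <- c(11, 14, 13, NA, 12, 15)

# Keep only the pairs in which both measurements are present.
ok <- complete.cases(before, after)
before_ok <- before[ok]
after_ok  <- after[ok]

length(before_ok) == length(after_ok)  # the two vectors stay aligned
# effsize::cohen.d(after_ok, before_ok, paired = TRUE)
```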

CRAN maintainer feedback on v0.5.3

Possibly mis-spelled words in DESCRIPTION:
Vargha (9:65)
Found the following (possibly) invalid URLs:
- URL: http://softeng.polito.it/software/effsize/effsize_0.5.3.pdf
  From: man/effsize-package.Rd
  Status: 404
  Message: Not Found
- URL: http://softeng.polito.it/software/effsize/effsize_0.5.3.tar.gz
  From: man/effsize-package.Rd
  Status: 404
  Message: Not Found

No package encoding and non-ASCII characters in the following R files:

R/CliffDelta.R

4: ##     A Comparative Study of Cohen<e2><80><99>s d and Cliff<e2><80><99>s Delta Under Non-normality and Heterogeneous Variances
5: ##     American Educational Research Association, San Diego, April 12 <e2><80><93> 16, 2004
10: ## Using SAS to Calculate Tests of Cliff<e2><80><99>s Delta
11: ## Proceedings of the Twenty-Fourth Annual SAS<c2><ae> Users Group <ef><bf<bc>International Conferenc, Miami Beach, Florida, 1999

R/VD_A.R

5: # Journal of Educational and Behavioral Statistics, 25(2):101<e2><80><93>132, 2000

The Title field should be in title case, current version then in title case:
- ‘Efficient effect size computation’
- ‘Efficient Effect Size Computation’
The Description field should not start with the package name, 'This package' or similar.
checking S3 generic/method consistency ... NOTE
Found the following apparent S3 methods exported but not registered:
VD.A.default VD.A.formula cliff.delta.default cliff.delta.formula
cohen.d.default cohen.d.formula print.effsize
See section ‘Registering S3 methods’ in the ‘Writing R Extensions’ manual.

Problem with paired `cohen.d`

From an email received today:

library(BSDA)
data(Fitness)

x <- Fitness$number[Fitness$test == "Before"]
y <- Fitness$number[Fitness$test == "After"]

d <- y-x

t.test(d, mu = 0)

t.test(number ~ test, data = Fitness, paired = TRUE)

effsize::cohen.d(number ~ test, data = Fitness, paired = TRUE)
effsize::cohen.d(d, f = NA)

I realized that the two ways to calculate Cohen's d give different results, and I don't understand why. Shouldn't these two versions be the same? In my opinion, the second version gives the correct result.

My other question is on the confidence intervals. We see above that we get a significant result in the paired t-test. Can we conclude then that the 95% confidence interval for Cohen's d should not contain 0? I was wondering about that when I saw that the confidence interval of the second version does contain 0.

Update tests using testthat package

The goal is to write extensive tests for all functions and use the existing package testthat.

Hedges's correction for one-sample Cohen's d uses a different formula

I tried to apply Hedges's correction to Cohen's d, but I realized that cohen.d() calculates the degrees of freedom as n-2 even in a one-sample design. According to the Borenstein (2009) book, the correction for the paired t-test situation is calculated using n-1.

```{r}
library(effsize)

# Vector
x = rnorm(10, 55, 1)
pop = 50
x = x - pop
n = length(x)

# Effect size
eff_size_uncor = cohen.d(x ~ 1, hedges.correction = FALSE) # uncorrected
eff_size_cor = cohen.d(x ~ 1, hedges.correction = TRUE) # corrected

heg_corr = 1 - (3/(4*(n-2)-1)) # Hedges's correction with n-2

eff_size_uncor$estimate * heg_corr
eff_size_cor$estimate

heg_corr = 1 - (3/(4*(n-1)-1)) # Hedges's correction with n-1

eff_size_uncor$estimate * heg_corr

# diff

(eff_size_uncor$estimate * heg_corr) - eff_size_cor$estimate
```

Is there a reason for this formula? Thanks!

Standard Deviation of cohen's d

You have
S_d = sqrt(((n1+n2)/(n1*n2) + .5*dd^2/df) * ((n1+n2)/df))

and reference The Handbook of Research Synthesis and Meta-Analysis (2009) p 238. However, on the copy of the book I viewed online (through Project Muse), p 238 doesn't have any information on this, but p 226 gives the variance of d as:

v_d = (n1 + n2)/(n1*n2) + d^2/(2*(n1+n2))

Where does the extra term of ((n1+n2)/df) come from?
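
To see how much the two expressions differ in practice, both can be evaluated for the same inputs. This is only a sketch: the sample sizes and d are arbitrary, and df = n1 + n2 - 2 is an assumption about what df means in the quoted package line.

```r
n1 <- 20; n2 <- 20; d <- 0.5     # arbitrary illustrative values
df <- n1 + n2 - 2                # assumption: df as in a two-sample t-test

# Expression quoted from the package in this issue
sd_pkg <- sqrt(((n1 + n2) / (n1 * n2) + 0.5 * d^2 / df) * ((n1 + n2) / df))

# Variance of d as quoted from p. 226 of the handbook
v_book <- (n1 + n2) / (n1 * n2) + d^2 / (2 * (n1 + n2))

c(package = sd_pkg, handbook = sqrt(v_book))
```

With these inputs the package expression gives the slightly larger standard deviation, and the gap shrinks as the sample sizes grow because (n1+n2)/df approaches 1.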

Effect size is wrong for paired data

rstatix::cohens_d and lsr::cohensD report different effect sizes than effsize::cohen.d for paired data. Not related to #48 (i.e. does not improve by changing sorting order).

Working example exhibiting the issue:

# Paired samples effect size
library(tidyr)  # provides gather() and %>%
df <- data.frame(
  id = 1:5,
  pre  = c(110, 122, 101, 120, 140),
  post = c(150, 160, 110, 140, 155)
)
df <- df %>% gather(key = "treatment", value = "value", -id)

set.seed(1234)
(df %>% rstatix::cohens_d(value ~ treatment, paired = T))$effsize
#1.75

lsr::cohensD(value ~ treatment,data = df,method = "paired")
#1.75

effsize::cohen.d(value ~ treatment, data=df, paired=T,hedges.correction=F)$estimate
#1.32
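
For reference, the 1.75 figure can be reproduced by hand as the mean of the paired differences divided by their standard deviation (often written d_z). The sketch below uses only base R and the pre/post values from the example; it makes no claim about how effsize arrives at 1.32.

```r
pre  <- c(110, 122, 101, 120, 140)
post <- c(150, 160, 110, 140, 155)

dif <- post - pre
d_z <- mean(dif) / sd(dif)   # mean difference over sd of the differences
round(d_z, 3)
```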

Confidence intervals on d rather than g

Hi,

I noticed that for the cohen.d function, the confidence intervals change when Hedges' correction (g) is requested. Cumming (cited as a reference), on page 305 of his book Understanding The New Statistics, advises that intervals should be calculated on d not g.

Would it be possible to specify an option to determine which is used to compute the CI separate from which is used for the point estimate?

Thanks!

Thresholds for qualitative assessment of Vargha and Delaney A measure VD.A

Hi!
It is definitely a minor issue, but just as a comment: the documentation for effsize does not describe how the thresholds of the VD.A measure are defined, and thus what the qualitative assessment is based on.

I've found in the R code that you actually base the assessment on Hess and Kromrey, 2004.
Thanks!

Confidence interval computation correctness

I try to calculate the CI for Cohen’s d.
I used two packages for this but the results don’t match.

Please consider this example:

data(sleep)
(tTestResult <- t.test(extra ~ group, sleep, paired = TRUE))

if (!require(psych)) install.packages("psych")
psych::cohen.d.ci(apa::cohens_d(tTestResult),
                  n1 = tTestResult$parameter + 1,
                  alpha = .01)

with psych I get the result

         lower    effect     upper
[1,] -2.655439 -1.284558 0.1132465

If I use your function I get

if (!require(effsize)) install.packages("effsize")
effsize::cohen.d(extra ~ group,
                 data = sleep,
                 paired = T,
                 conf.level = 0.99
)

d estimate: -1.284558 (large)
      inf        sup
-2.6983736  0.1292585

However, psych calculates the interval with a noncentral distribution. Therefore I used that option in your function as well

effsize::cohen.d(extra ~ group,
                data = sleep,
                paired = T,
                conf.level = 0.99,
                noncentral = T)

However, now the results are totally messed up. The CI does not contain the effect estimate.

d estimate: -1.284558 (large)
     inf       sup
0.1729418 2.5309425

Can you tell me: Do I use your function wrongly?

Improve reference for effect sizes Cliffs delta

The code references Hess and Kromrey, without page number. I'm unable to find the values in the paper. https://www.semanticscholar.org/paper/Robust-Confidence-Intervals-for-Effect-Sizes%3A-A-of-Hess-Kromrey/b042a70162663d0c1d9a335fb79c15bd1428321a

Effect size with factors having multiple levels but only two levels present in data

It should be possible to compute effect size when the factor has several (>2) levels but only two are present in the data.

Cohen.d gives wrong value when data is not arranged by f

When using cohen.d(d ~ f, data), if the data is not sorted by f, the function might return a wrong value of Cohen's d.

Problem with noncentral confidence interval estimation

From email on March 19, 2017

I have a question regarding cohen.d function.
I am trying to use this function to calculate confidence intervals for the effect size using the non-central t distribution for 500 samples. However, the function does not run inside a for loop if noncentral=TRUE, even for small loops such as a loop of 2.
Here is the code that I have used:

# Generate two independent normal random samples of size 10, with 500 repetitions
x1 <- matrix(0, 10, 500)
x2 <- matrix(0, 10, 500)

for (i in 1:500) {
  x1[, i] <- rnorm(10, 9, 2)
  x2[, i] <- rnorm(10, 8, 2)
}

# Then calculate confidence intervals for the effect size using the non-central t distribution
lb <- c(0)
ub <- c(0)
for (i in 1:500) {
  cohen.dres <- cohen.d(x1[, i], x2[, i], hedges.correction = FALSE, noncentral = TRUE)
  lb[i] <- cohen.dres$conf.int[1]  # saving the lower limit in a vector
  ub[i] <- cohen.dres$conf.int[2]  # saving the upper limit in a vector
}

However, as I mentioned, cohen.d does not work in the above for loop.

VD.A Magnitude not being computed on Windows

I found a problem when computing the Vargha and Delaney's A on Windows. It does not print the magnitude of the difference. Also, using the $magnitude variable the result is NULL.

I am using Windows 10, R 3.3.0 and the latest effsize package. Here is a simple call:

require(effsize)
VD.A(c(1,2,3), c(1,2,3))

It prints:

Vargha and Delaney A

A estimate: 0.5 ()

I have tried using other values and nothing changes. In Ubuntu with R 3.2.3 it works fine, but in Windows not. I also tried using R-devel and R 3.2.5 to no success. It also happens in other Windows machines I have here.

effsize Hedges g with paired

From email received by maintainer:

If I run something like cohen.d(myData$after,myData$before, na.rm=TRUE, pooled=TRUE, paired=TRUE, hedges.correction=TRUE)

I suspect the Hedges correction that is used is: 1 - 3/(4*(n1+n2-2)-1), or as on Wikipedia 1 - 3/(4*(n1+n2) - 9), where n1 is the number of cases in the first variable, and n2 in the second. This is the correction for independent samples, not for paired.

The paired version should be I think: 1 - 3/(4*(n - 1) - 1), where n is the number of pairs.

This can be seen for example on page 9 and 10 in chapter 4 of Introduction to Meta-Analysis. Michael Borenstein, L. V. Hedges, J. P. T. Higgins and H. R. Rothstein (chapter available here: https://www.meta-analysis.com/downloads/Meta-analysis%20Effect%20sizes%20based%20on%20means.pdf )

Also on page 275 of Gibbons, Hedeker, and Davis (1993) they write that "this is perhaps a greater problem in the paired sample case since v = n - 1 and not n1 + n2 - 2, as in the two independent sample case". Hedges (1981) original article on p 114 also shows the general formula to be 1 - 3/(4m-1), where m is the degrees of freedom.
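
The general formula quoted above can be written as a one-line helper, which makes the difference between the paired and the independent case easy to see. The name hedges_j and the value of n are made up for this sketch; the helper is not part of effsize.

```r
# Hypothetical helper, not part of effsize: approximate correction
# factor 1 - 3 / (4 * m - 1) for m degrees of freedom.
hedges_j <- function(m) 1 - 3 / (4 * m - 1)

n <- 10                  # number of pairs (illustrative)
hedges_j(n - 1)          # paired: m = n - 1
hedges_j(n + n - 2)      # two independent groups of size n: m = n1 + n2 - 2
```

The paired factor is further from 1, so using the independent-samples degrees of freedom on paired data applies a weaker correction than the paired formula would.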

References used:

Gibbons, R. D., Hedeker, D. R., & Davis, J. M. (1993). Estimation of Effect Size From a Series of Experiments Involving Paired Comparisons. Journal of Educational Statistics, 18(3), 271–279. doi:10.3102/10769986018003271
Hedges, L. V. (1981). Distribution Theory for Glass’s Estimator of Effect Size and Related Estimators. Journal of Educational Statistics, 6(2), 107–128. doi:10.2307/1164588

Inconsistent results from cohen.d

From an email reporting a strange behavior

With effsize 0.5.2 and R 3.2.0

When entering:

set.seed(12345)
d <- rnorm(5000)
f <- rep(c(1,2),each=2500)
effsize::cohen.d(d ~ f)

The output is

Cohen's d

d estimate: -1.914095 (large)
95 percent confidence interval:
     inf       sup 
-1.961439 -1.866751

While the two samples should be very similar and therefore the estimate should be close to 0.

cohen.d with noncentral=TRUE hangs

The function cohen.d when invoked with the noncentral argument set to TRUE hangs, possibly in an infinite loop.

Reproducible with:

  set.seed(52)
  x = rnorm(100,mean=10)
  y = rnorm(100,mean=12)
  d = (c(x,y))
  f = rep(c("A","B"),each=100)
  eff.d = cohen.d(d,f,noncentral=TRUE)

(This issue formalizes an email from Sven)

`cliff.delta` formula does not use same input as `cohen.d` or `VD.A`

The current implementation of cliff.delta does not allow one to separately specify the values and the factors, unlike in the functions for cohen.d and VD.A. From the documentation for the latter two:

d can be a numeric vector giving either the data values (if f is a factor) or the treatment group values (if f is a numeric vector)

This is not the case for cliff.delta; instead one must specify 2 separate numeric vectors.

Re-writing the cliff.delta function to match the syntax of the other 2 would be very helpful, especially when using standard evaluation to programmatically select the desired effect size.

cohen.d does not agree with other cohens_d measures

I'm an R newbie, so I'm probably missing something, but the effsize package gives a radically different value for Cohen's d from effectsize, which agrees with JAMOVI and a hand calculation in Excel:

cohen.d(guilty$sel[guilty$Type == "probe"],guilty$sel[guilty$Type == "irrelevant"], paired = TRUE)

Cohen's d

d estimate: 0.3609801 (small)
95 percent confidence interval:
lower upper
0.2174820 0.5044782

Whereas: from the effectsize package (and Jamovi)
cohens_d(guilty$sel[guilty$Type == "probe"],guilty$sel[guilty$Type == "irrelevant"], paired = TRUE)
Cohen's d |       95% CI
------------------------
1.02      | [0.54, 1.49]

Paired cohen-d CI error

From email:

Considering formula 4.28 from Michael Borenstein, L. V. Hedges, J. P. T. Higgins and H. R. Rothstein. Introduction to Meta-Analysis

with d=1.422, r=0.95 and 8 + 8 samples

results in a variance of 0.0251

square root of the variance= 0.1584

SE of 0.1584 * critical t value (2.365) = 0.374

1.422 + 0.374 = 1.796
1.422 - 0.374 = 1.048

ES (95% CI) would be 1.422 (1.048, 1.796)

Eff Size reports: 1.422 (0.22, 2.62)
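
The arithmetic in this email can be replayed in a few lines. The variance expression below, (1/n + d^2/(2n)) * 2 * (1 - r), is an assumed reading of formula 4.28; it reproduces the 0.0251 quoted above, but it should be checked against the book before relying on it.

```r
d <- 1.422; r <- 0.95; n <- 8   # values from the email (n = number of pairs)

# Assumed form of the variance of d for matched groups
v_d  <- (1 / n + d^2 / (2 * n)) * 2 * (1 - r)
se_d <- sqrt(v_d)
t_crit <- qt(0.975, df = n - 1)  # critical t value with n - 1 df

c(variance = v_d, se = se_d,
  lower = d - t_crit * se_d, upper = d + t_crit * se_d)
```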

Order of labels change the outcome of VD.A

I ran into this issue the other day. Try to run the following code:

require(effsize)

nsga2<-c(0.59, 0.6, 0.58, 0.59, 0.65, 0.6, 0.59, 0.59, 0.59, 0.6, 0.62, 0.59, 0.61, 0.58, 0.6, 0.61, 0.6, 0.59, 0.59, 0.59, 0.59, 0.6, 0.58, 0.62, 0.56, 0.64, 0.6, 0.62, 0.58, 0.6)
moead<-c(0.47, 0.48, 0.46, 0.48, 0.47, 0.47, 0.44, 0.45, 0.45, 0.47, 0.45, 0.44, 0.46, 0.46, 0.44, 0.41, 0.44, 0.45, 0.45, 0.46, 0.49, 0.44, 0.46, 0.42, 0.5, 0.5, 0.49, 0.48, 0.45, 0.45)

categs <- rep(c("NSGA-II", "MOEA/D"), each=30)
VD.A(c(nsga2,moead), categs)

categs <- rep(c("MOEA/D", "NSGA-II"), each=30)
VD.A(c(nsga2,moead), categs)

The first call of VD.A yields 0 and the second yields 1 by just inverting the labels (wrong answers by the way). Notice that the vectors are not inverted when calling VD.A, only the labels. If one runs VD.A(nsga2, moead) the correct answer 1 is given.
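
As an independent check, the A statistic can be computed by hand from its definition, A = P(X > Y) + 0.5 * P(X = Y), estimated over all pairs. The helper name vda_by_hand is invented for this sketch, and only the first five values of each vector are used to keep it short.

```r
# Hypothetical helper, not part of effsize.
vda_by_hand <- function(x, y) {
  gt <- outer(x, y, ">")    # TRUE where an x value exceeds a y value
  eq <- outer(x, y, "==")   # TRUE where the two values tie
  mean(gt + 0.5 * eq)
}

x <- c(0.59, 0.60, 0.58, 0.59, 0.65)  # first five nsga2 values
y <- c(0.47, 0.48, 0.46, 0.48, 0.47)  # first five moead values
vda_by_hand(x, y)   # every x exceeds every y
vda_by_hand(y, x)   # swapping the roles of the two groups
```

The result depends only on which vector is treated as the first group, which is the behaviour one would expect to be controlled by the data rather than by the label order.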

I am not entirely sure, but this does not look like the intended behaviour. I have tested it on Linux and Windows. My version of R is 4.0.1.

Thanks.

The confidence interval does not contain the point estimation value.

Version Found In 0.7.6 https://cran.r-project.org/web/packages/effsize/effsize.pdf

Date 2019-07-17

Steps to Reproduce
testData.txt

Read testData.txt into R. I am using R 3.6.1.
Run the following commands:
alpha <- 0.01
cohen.d(d = X, f = Y, pooled = TRUE, paired = TRUE, hedges.correction = FALSE, conf.level = 1-alpha, noncentral = TRUE)

Current Behavior
I got the following results based on my testing:
Cohen's d

d estimate: -1.237032 (large)
99 percent confidence interval:
lower upper
-1.0447818 -0.7636336

The confidence interval did not contain the point estimate.

Expected Behavior
If I did not misunderstand, at least -1.23 should be contained in the confidence limits.

Direction

When using cohen.d(), the factor level order no longer allows you to dictate which group set comes first. This causes the direction of cohen's d to be off depending on the situation.

Is there any way to input what level you want as group 1 vs 2 to ensure the direction of the effect size is correct, such as in time series data?

cohen.d.formula(..., paired=T) gives different results depending on data's sorting order

This is related to #27 but applies to paired data. In the paired=T case, effsize::cohen.d.formula expects the data to be arranged by group first. Although this is a standard convention in non-formula use-cases, it is usually not expected in the formula interface (i.e. t.test(..., paired=T) or other Cohens d implementation do not expect data to be ordered like this).

Code that made me discover this behaviour (see bottom of post for working example):

    dfA <- aggregate(value~subject+group, data=df, FUN=mean)
    d1 <- effsize::cohen.d(value~group, data=dfA, paired=T)
    #d estimate: -0.6141064 (medium)

    dfB <- aggregate(value~group+subject, data=df, FUN=mean)
    d2 <- effsize::cohen.d(value~group, data=dfB, paired=T)
    #d estimate: -0.1663497 (negligible)

Relevant code section in CohenD.R is here:

if( paired ){
    [...]
    s.dif = sd(diff(d,lag=n1))
    [...]
}
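
To illustrate why the ordering matters for an expression such as diff(d, lag=n1), the sketch below uses a tiny made-up data set. With values stacked group by group, the lagged differences are the within-subject differences; with values interleaved subject by subject, the same call pairs unrelated observations.

```r
before <- c(5, 7, 6)   # made-up data, three subjects
after  <- c(4, 5, 3)
n1 <- length(before)

# Values stacked group by group: lag-n1 differences pair each subject
# with itself.
stacked <- c(before, after)
diff(stacked, lag = n1)        # equals after - before

# Values interleaved subject by subject: the same call now mixes subjects.
interleaved <- c(rbind(before, after))
diff(interleaved, lag = n1)
```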

Working example showing the issue:

#Setup some dummy data
set.seed(1234)
numSubjects <- 20
subject <- 1:numSubjects
before <- runif(numSubjects)
after <- before - 0.35*runif(length(before))
df <- data.frame(
  subject=rep(subject,2),
  group=c(
    rep("before",numSubjects),
    rep("after",numSubjects)
  ),
  value=c(before, after)
)

##################################

valueBefore <- df[df$group == "before",]$value
valueAfter <- df[df$group == "after",]$value

# d1 == d2
d1 <- effsize::cohen.d(valueAfter, valueBefore, paired=T)
d2 <- effsize::cohen.d(value~group, data=df, paired=T)

##################################

# Sorted by subjectId
df <- df[order(df$subject),]
valueBefore <- df[df$group == "before",]$value
valueAfter <- df[df$group == "after",]$value

# d1==d2==d3 is the same: d estimate: -0.5307307 (medium)
# but d4 is not: -0.145407 (negligible)

d3 <- effsize::cohen.d(valueAfter, valueBefore, paired=T)
d4 <- effsize::cohen.d(value~group, data=df, paired=T)

##################################

# Sorted by group
df <- df[order(df$group),]
valueBefore <- df[df$group == "before",]$value
valueAfter <- df[df$group == "after",]$value

# d1 == d2 == d3 == d5 == d6 are the same (although still false due to #49)
d5 <- effsize::cohen.d(valueAfter, valueBefore, paired=T)
d6 <- effsize::cohen.d(value~group, data=df, paired=T)

##################################

[question] Computation of cohen.d(..., paired = TRUE)

I just used your cohen.d function with the argument paired = TRUE, since I want to compute the effect size of two means from a repeated measures design. In the help file of the function, it is not stated how the function calculates d when paired = TRUE. Are you using the method reported in Morris and DeShon (2002; see equation 8)?

New version of R

Message from CRAN

Please see the problems shown on
https://cran.r-project.org/web/checks/check_results_effsize.html.

Specifically, see the problems shown for the r-devel Debian checks.

These can be reproduced by checking with --as-cran using a very current
r-devel (r77865 or later), which makes data.frame() and read.table() use
a stringsAsFactors = FALSE default, which is planned to become the new
default for the upcoming R 4.0.0.

Please see
https://developer.r-project.org/Blog/public/2020/02/16/stringsasfactors/index.html
for more information about this change.

Can you please fix your package to work with both the old and new
default? In principle, this can easily be achieved by adding
stringsAsFactors = TRUE to the relevant calls to data.frame() or
read.table() [or other read.* function calling read.table()], but please
only do this if the sort order used in the string to factor conversion
really does not matter (see the blog post about the locale dependence of
the conversion). Otherwise, please change to create the factors with
explicitly given levels.

The new problems may be from code in a package you depend on: in this
case, please let me know, and get in touch with the maintainer of that
package.
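
A minimal sketch of the last option mentioned in the message, creating the factor with explicitly given levels, is shown below. The column names and values are made up; the point is only that the result does not depend on the stringsAsFactors default or on the locale's sort order.

```r
# Explicit levels: same result under the old and the new default.
d <- data.frame(
  value = c(1.2, 3.4, 2.2, 4.1),
  group = factor(c("before", "after", "before", "after"),
                 levels = c("before", "after"))
)

is.factor(d$group)
levels(d$group)
```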

cohen.d not recognizing numeric values

cohen.d(df$X1TXMSCR~df$poverty)
Error in cohen.d.default(df$X1TXMSCR, df$poverty) : First parameter must be a numeric type

is.numeric(df$X1TXMSCR)
[1] TRUE

is.factor(df$poverty)
[1] TRUE

Edit: The issue is that effsize doesn't recognize objects of "haven_labelled" type as "numeric".

class(df$X1TXMSCR)
[1] "haven_labelled"

When I changed the variable to numeric, it worked.

cohen.d() is giving different CIs depending if paired is used or the equivalent 1-sample t-test. Also different to JASP and SPSS

Using effsize 0.8.1 in R cohen.d gives a different CI for Cohen's d compared to JASP 0.16.4 and SPSS 28.0.1.1 (which agree).
Also cohen.d gives different CI for Cohen's d when using the paired option, or if using the differences i.e. for the equivalent one-sample t-test.

Data:
Patient Before After Diff
1 1 201 200 1
2 2 231 236 -5
3 3 221 216 5
4 4 228 219 9
5 5 284 260 24
6 6 267 255 12
7 7 232 221 11
8 8 215 203 12

cohen.d(chol$Before, chol$After, paired = TRUE, within = FALSE)
Cohen's d

d estimate: 0.9989066 (large)
95 percent confidence interval:
lower upper
0.07052508 1.92728810

cohen.d(chol$Diff, NA, mu = 0)
Cohen's d (single sample)

d estimate: 0.9989066 (large)
Reference mu: 0
95 percent confidence interval:
lower upper
-0.7743462 2.7721594

Cohen's d is the same but now has a wider CI. (?)
The CI is still symmetrical around d

Note: JASP 0.16.4 and SPSS 28.0.1.1 both give the 95% CI for Cohen's d as
[0.115, 1.839] which disagrees with both CIs given here.

SPSS 28.0.1.1 has 3 options for the denominator used in estimating the effect sizes

The SPSS CI above uses the sample standard deviation of the mean difference
The standard error of the mean difference is s/sqrt(n) = 8.63444/sqrt(8)= 3.05274

using the sample standard deviation of the mean difference adjusted by the
correlation between measures gives the 95% CI for Cohen's d as [-0.014, 0.577]

using the square root of the average variance of measures (Bonett) gives the
95% CI for Cohen's d as [-0.381, 1.048]

pooled=T causes error

When pooled=T is used, an error mentioning "sd[2]" is raised.

cliff.delta missing values issue

I'm sorry to add potentially more stress at what I'm sure is a difficult time due to the COVID-19 pandemic - we are all under shelter in place orders where I am, and I'm sure it is much worse where you are - but I ran into this problem. When NA values are inputted into Cliff's delta, it seems to yield a different result from when they are excluded. Here is an example:

cliff.delta(master_TD$vine_adaptive_ss,
             master_ASD$vine_adaptive_ss,use.normal=F)

cliff.delta(master_TD$vine_adaptive_ss[which(master_TD$vine_adaptive_ss>0)],
             master_ASD$vine_adaptive_ss[which(master_ASD$vine_adaptive_ss>0)],use.normal=F)

cliff.delta(master_TD$vine_adaptive_ss[which(!is.na(master_TD$vine_adaptive_ss))],
            master_ASD$vine_adaptive_ss[which(!is.na(master_ASD$vine_adaptive_ss))],use.normal=F)

The first call yields an effect size of .75, but the second and third give an effect size of .97.

There is no conflict among these instances when return.dm = TRUE is specified; all three of the following yield effect size .97:

cliff.delta(master_TD$vine_adaptive_ss,
            master_ASD$vine_adaptive_ss,use.normal=F,return.dm=TRUE)

cliff.delta(master_TD$vine_adaptive_ss[which(master_TD$vine_adaptive_ss>0)],
            master_ASD$vine_adaptive_ss[which(master_ASD$vine_adaptive_ss>0)],use.normal=F,return.dm=TRUE)

cliff.delta(master_TD$vine_adaptive_ss[which(!is.na(master_TD$vine_adaptive_ss))],
            master_ASD$vine_adaptive_ss[which(!is.na(master_ASD$vine_adaptive_ss))],use.normal=F,return.dm=TRUE)

I am using effsize 0.7.6.

Error with pooled cohen

From email received on Feb 8, 2020

table<-read.table(header=T,text="
Student Before After
1 45 49
2 52 50
3 63 70
4 68 71
5 57 53
6 55 61
7 60 62
8 59 67"
)

library(effsize)
attach(table)
### pooled ###
ef_pol<-cohen.d(After,Before)      # <- this works

#### based on treatment ####
ef_std<-cohen.d(After,Before, pooled=FALSE)     # <- this doesn't work, why?
Error in sd[2] : object of type 'closure' is not subsettable

Fix cohen d for within-subjects (paired) pretest-posttest measures

See:
Feingold, A. (2009). Effect Sizes for Growth-Modeling Analysis for Controlled Clinical. Psychological Methods 14, 43-53.

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2712654/

when paired==TRUE use the change sd instead of the pooled sd

Missing CI boundaries for Cliff

Running Cliff's delta with the following code

apeak=read.csv(file.choose())
str(apeak) 
summary(apeak)

library(effsize) 
cliff.delta(apeak$RFCMJCON,apeak$RFSTPCON) # the one that doesnt work
cliff.delta(apeak$RFCMJCON,apeak$RFBDCON) # the one that does work

Where the csv is:

RFCMJCON	RFBDCON	RFSTPCON
100	96.53060989	69.97976011
100	88.25384811	88.32498771
100	81.37560321	69.14113142
100	78.19838158	76.39975355
100	81.30983164	78.60408527
100	78.84196753	34.12150605
100	91.82523169	78.58812467
100	83.99838241	53.82589386
100	89.0228822	41.2351026
100	91.10459716	48.58016505
100	75.13616203	58.10870381
100	85.36038763	54.39906358
100	121.3087267	67.0017805
100	78.70148616	81.02330166

The following output is produced

Cliff's Delta

delta estimate: 0.9285714 (large)
95 percent confidence interval:
inf sup
NaN NaN
Warning messages:
1: In sqrt(S_d) : NaNs produced
2: In sqrt(S_d) : NaNs produced

Cliff's delta for repeated measures?

I report here an anonymized mail:

I have been calculating the Cliff's Delta effect size for independent variables (Mann Whitney U test), but I'm wondering if the same cliff.delta algorithms can be used to calculate the Cliff's Delta effect size for repeated measures (Wilcoxon test).

If not, I would appreciate it if you could tell me, if you know, how to compute the Delta for repeated measures.

Thank you so much for your attention,

Change the magnitude result into an ordered factor

The magnitude field of the effsize return structure of all effect size computation functions is currently a character; it should be transformed into an ordered factor so that the values can be compared.

Currently unfortunately the ordering of the values is alphabetic, and it does not reflect the semantics behind the different levels. That is:

alphabetic order is "large" < "medium" < "negligible" < "small"
semantic order should be: "negligible" < "small" < "medium" < "large"
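
A sketch of the proposed representation using base R's ordered factors is shown below. The example values are made up, and this only illustrates the idea; it is not the package's implementation.

```r
# Levels listed in semantic order, from weakest to strongest.
lvls <- c("negligible", "small", "medium", "large")
magnitude <- factor(c("large", "negligible", "small"),
                    levels = lvls, ordered = TRUE)

magnitude[2] < magnitude[3]   # negligible < small
magnitude >= "medium"         # comparison against a level name
"large" < "small"             # plain character comparison is alphabetic
```

With an ordered factor, comparisons and functions such as max() follow the semantic order instead of the alphabetic one.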
