many_analyses's Introduction

Many Speech Analyses

Welcome!

This project also lives on the OSF.
General project information can be found here.

Brief summary

We recruited collaborators interested in forming data analysis teams to examine the same dataset. The data analysis teams wrote a journal-ready methods and results section and then peer-reviewed each other's analyses. Next, we used meta-analytic techniques to explore the data analysis teams' findings.

Overview

Recent efforts to replicate published findings have uncovered surprisingly low success rates across disciplines. Moreover, several studies have highlighted the large degree of analytic flexibility in data analysis, which can lead to substantially different conclusions based on the same data set. Researchers have thus expressed concern that these researcher degrees of freedom might facilitate bias and lead to claims that do not stand the test of time. Even greater flexibility is to be expected in fields in which the primary data lend themselves to a variety of possible operationalizations. The multidimensional, temporally extended nature of speech constitutes an ideal testing ground for assessing the variability in analytic approaches that derives not only from aspects of statistical modelling but also from decisions regarding the quantification of the measured behavior. In the present study, we gave the same speech production data set to 30 teams of researchers and asked them to answer the same research question. Using Bayesian meta-analytic tools, we observed substantial variability between teams and submitted analyses, with analytic and researcher-related predictors having little or no effect on the reported effects.

More...

The first MSA project paper is available as a pre-print on PsyArXiv. You can read it here: https://psyarxiv.com/q8t2k/.

Want to play around with the data? We also have a shiny app available here.

many_analyses's People

Contributors: jvcasillas, timo-b-roettger

Stargazers: Ladislas Nalborczyk

Watchers: Stefano Coretta

many_analyses's Issues

RR does not compile after including ForkingPaths.png

I am trying to fix the compilation issue introduced by including the image ForkingPaths.png.

There were a few typos in the code, but now that those are fixed it still doesn't compile, and the culprit is the papaja format. Do we need to use this format, or can we switch to a cleaner/simpler format? If we can, I am happy to change it (it's a quick thing) so we have better control over the settings and output.

(What was the journal we were thinking of submitting to again? Maybe they have a LaTeX template and we can just use that as the TeX template for knitr.)

List of "many analyses" papers

  • Bastiaansen et al., 2019 Time to get personal? The impact of researchers' choices on the selection of treatment targets using the experience sampling methodology
  • Boehm et al., 2018 Estimating across-trial variability parameters of the Diffusion Decision Model: Expert advice and recommendations
  • Botvinik-Nezer et al., 2020 Variability in the analysis of a single neuroimaging dataset by many teams
  • Breznau et al., 2020 The Crowdsourced Replication Initiative: Investigating Immigration and Social Policy Preferences
  • Dutilh et al., 2019 The Quality of Response Time Data Inference: A Blinded, Collaborative Assessment of the Validity of Cognitive Models
  • Functional Imaging Analysis Contest 2006
  • Fillard et al., 2011 Quantitative evaluation of 10 tractography algorithms on a realistic diffusion MR phantom
  • Maier-Hein et al., 2017 The challenge of mapping the human connectome based on diffusion tractography
  • Salganik et al., 2020 Measuring the predictability of life outcomes with a scientific mass collaboration
  • Schweinsberg et al., 2020 Radical dispersion in estimates when independent scientists operationalize and test the same hypothesis with the same data
  • Silberzahn et al., 2018 Many Analysts, One Data Set: Making Transparent How Variations in Analytic Choices Affect Results
  • Starns et al., 2019 Assessing theoretical conclusions with blinded inference to investigate a potential inference crisis
  • van Dongen et al., 2019 Multiple Perspectives on Inference for Two Simple Statistical Scenarios

Other stuff:

Minutes 2021-02-10

Action items

Discuss

  • Meta-analytical model.
    • Prior for the group-level standard deviation: HalfCauchy(0, 1) might be too wide (a narrower alternative is sketched after this list).
  • Consider making a many_analyses.github.io website.
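For concreteness, a minimal brms sketch of the random-effects meta-analytic model with a narrower prior on the group-level standard deviation. This is only an illustration: the data frame d and the columns effect, effect_se, and team are hypothetical names, and half-normal(0, 0.5) is just one narrower alternative to HalfCauchy(0, 1).

```r
library(brms)

# Bayesian random-effects meta-analysis of the teams' reported effect
# sizes, weighting each estimate by its standard error.
m_meta <- brm(
  effect | se(effect_se) ~ 1 + (1 | team),
  data  = d,
  # Narrower half-normal prior on the between-team SD; brms truncates
  # priors on 'sd' parameters at zero automatically.
  prior = prior(normal(0, 0.5), class = sd)
)
```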

Merge changes from master

I have pushed changes from my and Timo's branches into master. Now it's a good time to merge from master into trdev.

Notes on final analysis

Some notes for us to discuss later

Dependent variable

  • (1) confirmed hypothesis vs. null
  • (2) standardized effect magnitude (will be difficult with GAMs etc.)
  • (3) uncertainty? (some standardized measure of certainty?)

Predictors (don't have to use them all later on, but things that would be interesting)

  • career stage (grad student, postdoc, faculty)
  • self assessed expertise in inferential stats / acoustic analysis
  • familiarity with phenomenon
  • estimated quality of analysis by external reviewers
  • preregged? (binary)
  • did analyst go back to acoustic analysis after starting statistical analysis? (binary)
  • Number of DVs, corrected for multiple testing?
  • Random effect specifications? (I think the data allow for random slopes for subjects only, but items are a relevant random intercept variable)

--> many predictors will make collinearity an issue (need to check Tomaschek et al. 2018)

Sample

  • Sample size justification (sample = # of analysis teams)? (might not be relevant if we avoid testing hypotheses explicitly; we could easily take some of the variability observed in the other papers and run a power simulation, though; see the rough sketch below)
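A rough, purely illustrative power-simulation sketch. All numbers are assumptions (a hypothetical mean effect of 0.2 and a between-team SD of 0.4, loosely in the spirit of the variability reported in earlier many-analysts papers), and the t-test is just a stand-in criterion.

```r
# Simulate per-team effects and check how often a given number of teams
# yields a 'significant' non-zero mean effect across simulations.
set.seed(1)
n_sims  <- 500
n_teams <- 12

power <- mean(replicate(n_sims, {
  effects <- rnorm(n_teams, mean = 0.2, sd = 0.4)  # assumed effect distribution
  t.test(effects)$p.value < 0.05
}))
power
```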

Model

  • Bayesian multilevel analysis (logistic link for (1), Gaussian link for (2))
  • DV ~ covariates + (1 | team)
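A minimal brms sketch of this model for outcome (1). Everything here is illustrative: the data frame d (one row per submitted analysis) and the covariate names are hypothetical placeholders for whichever predictors we settle on.

```r
library(brms)

# Outcome (1): did the analysis confirm the hypothesis? Logistic
# multilevel model with a random intercept per analysis team.
# For outcome (2), swap in the standardised effect size and gaussian().
m <- brm(
  confirmed ~ career_stage + expertise + prereg + (1 | team),
  data   = d,
  family = bernoulli()
)
```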

Minutes 2021-03-24

Action items

  • Upload PDFs of surveys to OSF > Questionnaires > pdfs [@jvcasillas].

Discuss

  • Experimental procedure timeline.

    • Maybe split the timeline in two so that we can interweave speaker and confederate into a single line.
    • Or find other alternatives to make the timeline clearer.
    • Keep and make clear.
  • Sample size justification.

    • We might leave it as is for now, or hedge it like "we will aim for as many teams as possible, although we estimate a necessary minimum of 12" or the like. Or we could take it out entirely and just state that we will make sure to maximise the number of teams.
  • Ask about semantic atypicality.

  • Make new component on OSF.

  • Create consent form.

Minutes - 2021/01/13 and previous ones

  • Timo will send us the presentation file of the video, and we can have a look at the script to see if/what we can simplify.
  • Read through docs/ExperimentalDetails to see if we can make it simpler, and check the experimental procedure against what is in the presentation video.
    • Maybe write a shorter Experimental Details doc and link to extra info on OSF or the like?
  • Research Question: move focus from "referring expression" (i.e. the green banana) to "utterance"? We don't want to bias the analysts to focus on the noun only.
  • Check Joseph's google forms and docs (links here #1).

Registered report

Possible phases of completing the RR. Feel free to add, remove, edit, etc.

  • create RR template in Rmd
  • outline sections
  • determine writing responsibilities
  • complete draft
  • Abstract.
  • Cover letter.

Exploration laundry list

Let's collect our ideas for exploration here:

Interesting covariates:

  • quality (from reviewers)
  • posthoc changes to acoustic analysis = either binary or count
  • self-reported RDF exploitation = count
  • acoustic dimension (f0, int, dur, ...) = categorical
  • temporal window (segment, syllable, word, phrase) = categorical
  • Uniqueness of model specification or acoustic analysis via the Sørensen index (see the sketch after this list) = continuous
  • Conservativeness of model (number of random effect parameters) = count/continuous
  • exclusion of data = binary
  • demographics of analysis teams
    • experience (maybe time from PhD? can be negative) = continuous
    • initial belief in effect = continuous
  • frequentist vs Bayesian?
    ...
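A minimal sketch of the Sørensen (Dice) similarity idea, assuming each team's specification can be reduced to a set of labels (model terms, acoustic measures, etc.); the example vectors are made up. A team's uniqueness could then be one minus its mean similarity to all other teams.

```r
# Sørensen–Dice similarity between two sets: 2|A ∩ B| / (|A| + |B|).
dice <- function(a, b) {
  2 * length(intersect(a, b)) / (length(a) + length(b))
}

team_a <- c("f0", "typicality", "(1|subject)")
team_b <- c("duration", "typicality", "(1|subject)", "(1|item)")
dice(team_a, team_b)  # 0.57: moderate overlap in specifications
```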

Possible exploration ways

  • logistic regression on dichotomous inferential conclusion (there is an effect vs. nope)
  • random forest classification yielding a relative ranking of predictor
  • visual map, e.g. uniqueness of acoustic measurement against model conservativeness against effect size
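A hypothetical sketch of the first two routes, assuming a data frame teams with a binary effect_found column and covariates named as in the laundry list above (all placeholder names):

```r
library(randomForest)

# (1) Logistic regression on the dichotomous inferential conclusion.
m_logit <- glm(effect_found ~ quality + data_excluded + n_ranef,
               data = teams, family = binomial())

# (2) Random forest classification; variable importance yields a
#     relative ranking of the predictors.
teams$effect_found <- factor(teams$effect_found)
m_rf <- randomForest(effect_found ~ ., data = teams, importance = TRUE)
importance(m_rf)
```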

Minutes 2021-03-03

Action items

  • RR [@stefanocoretta]
    • Add data description.
    • Update peer-review section.
    • Update links.
    • Edit simulation.
  • Update OSF [@troettge @jvcasillas]
    • READMEs.
  • Update peer-review Google form: use a 0–100 scale in all Qs [@jvcasillas]

Discuss

  • Peer review (#44).
  • Questionnaires.
  • Clean-up Gdrive.

Minutes 2021-02-24

Action items

  • RR
    • Edit section on reviewers' ratings.
    • Simulate ratings and update appendix.
    • Upload data to OSF and Wiki.
    • F.A.Q.?
    • Detailed workflow (Timo)

Discuss

  • Peer-review survey.
    • Rating for acoustic and statistical analysis: separate or together?
    • For the other criteria text or numeric scale?
  • Divergent transitions in simulation of factors model.
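A standard first remedy for the divergent transitions, in case it helps (a brms/Stan sketch; the model formula and data are placeholders):

```r
library(brms)

# Increasing adapt_delta (and max_treedepth) makes Stan's sampler take
# smaller steps, which often eliminates divergent transitions.
m_fac <- brm(
  rating ~ factor_1 + (1 | team),   # placeholder factors model
  data    = d,
  control = list(adapt_delta = 0.99, max_treedepth = 12)
)
```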

Minutes 2021-03-10

🌌 - Project status overview.

Action items

  • RR [@stefanocoretta]
    • Update peer-review section.
    • Update links.
    • Add comments on parts that definitely need editing.
    • Edit simulations.Rmd.
  • Update peer-review survey (see #44 comments).
  • Update OSF [@troettge @jvcasillas]
    • READMEs. Also see #6.
  • Clean unused files on GDrive [@jvcasillas]

Discuss

  • Comments in RR draft.
  • Original "Blue bananas" study details to be included in the RR (as in related links/publications/other).

Terms

  • analysis team
  • reported effect sizes: effects reported by the analysis teams.
  • standardised (refitted) model: Bayesian refit of an analysis team's model.
  • standardised effect size: effects as returned by the standardised model.
  • Bayesian random-effects meta-analysis and meta-analytic(al) model
  • analytic(al) and demographic predictors (formerly factors).

landing page - alpha

First attempt, but I'm struggling with the web design. I'll ask Liz later. Have a look and tell me what you think. There is still a lot of text; I'd love to reduce it by half or so. The buttons are not yet clickable. I envision them leading to the agreement survey and to an email address for questions.

@jvcasillas @stefanocoretta

Agenda for 20th Jan

Agenda for the meeting on 20th Jan:

[ ] Discuss landing page (Timo)
[ ] Discuss analyst journey and experience (Timo)

Merge from master

I am done for now. I won't make any more changes until after our meeting tomorrow. 🙂

Storing dataset

Qs regarding dataset:

All materials are right now on a private OSF project. The relevant files are

  • the sound files (duh). Pretty big file folder (30 files à 100+ MB each)
  • relevant textgrids that we generated according to our prereg (see Q below)
  • experimental files for two experiments: 1. a norming study to norm the typicality of modifier plus noun (e.g. blue banana), and 2. the production experiment. For 1., there is a GitHub repository from a student of Michael's who programmed that one.
    For 2., we have images and trial lists in the OSF repo, and I'm sure Simon has the code for the browser-based experiment somewhere.

I think we should make all of this available to the analysts. It would potentially be helpful to make a screen recording of both the norming study and the actual experiment so that analysis teams can watch them and get a feel for the experiment.

Okay my Qs:

  • How should we go about storing the dataset? OSF strikes me as the better option in the end.
  • How should we go about the dataset per se? Maybe we should offer the analysis teams a TextGrid that at least labels the relevant utterances and their experimental condition? What do you think?

Summary of neuroscience, cognitive modeling, clinical, predictive models in RR

List of similar papers (the same as in the "many analyses" papers issue above), plus this working document:

  • https://docs.google.com/document/d/12XlCX0UWKLH1RJ9NsSciWxXn4DD0oh4slqd_TmKjp0A/edit

Other stuff:

Need input

@jvcasillas @troettge I need your input on these items (we can discuss on Wed).

  • 1.2 Crowdsourcing alternative analyses
    • "Sum-up of Neuroscience Cognitive modeling Clinical Predictive models"
    • I am not sure what I should add here.
  • 2.4.1 Descriptive statistics
    • "We anticipate that the majority of statistical analyses will be expressible as a (generalized) linear regression model." ADD FORMULA (see the suggestion after this list).
    • Is this just the general GLM formula?
  • 2.4.2 Meta-analytical estimation
    • Which predictors are we going to include in the meta-analytical model?
  • Is a CC-BY license for the data ok?
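If the general GLM formula is indeed what is meant, a minimal candidate (my suggestion, to be checked) is

$$
g\big(\mathbb{E}[y_i]\big) = \beta_0 + \beta_1 x_{1i} + \dots + \beta_k x_{ki},
$$

where $g$ is the link function (identity for linear regression, logit for logistic regression, etc.).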

(linking the meeting minutes: #9)

Jvc retry

Ok, this one should work. I made some light edits to the RR and the experimental design .Rmd files. I also put each sentence on its own line so that it is easier to track changes moving forward. The RR also includes a second bib file for references (separate from the one papaja generates automatically for R packages). I've only added two refs so far, but I wanted to get it started.

Minutes 2021-02-17

Action items

  • Continue with RR updates.
  • Create org, website (@jvcasillas).
  • Import issues from old repo.

To discuss

  • many_analyses ported to many-speech-analyses org (issues not imported)
    • Imported issues with the GitHub REST API.
  • (B4SS workshop: learnB4SS/learnB4SS#16).
  • RR
    • Reviewers' ratings: mean + SD.
    • Move formulas to appendix.

Licensing

We need to decide licensing for the data, our code, the teams' code/data, etc.

Licensing for the production study data depends on the consent and on what you guys (of that project) prefer.

@jvcasillas @troettge

Bib

bib updates (about 90% done)

master branch restored

Just a ping that I restored the master branch with work that was removed. If you have changes in your local repo, just to be on the safe side, stash them, and then do git pull --force. Then you can pop the stash and continue working.

@troettge @jvcasillas

Minutes 2021-03-17

Action items

  • RR [@stefanocoretta]
    • Edit based on last meeting's discussion.
    • Update links.
    • Edit simulations.Rmd.
    • References: Parker et al. (eco), Summerfield, Charles et al. [@troettge @jvcasillas].
    • Info on consent form [@troettge].
  • ForkingPaths.png [@troettge]
    • Connect panels in A.
  • Upload .svg versions of images. [@troettge]
  • Update peer-review survey (see #44 comments).
  • Update OSF [@troettge] #6
    • Upload wav file to Data > production > audio.
  • Clean unused files on GDrive [@jvcasillas]
  • Prep list of terms to be discussed [@stefanocoretta]. #51

Discuss

  • Audio data. #6
    • eudat.eu
    • We will upload the audio data when the time comes to share them with the teams. If reviewers request to listen to them, we will share the files directly.
  • Terms for models, demographic, analytic/al. See #51
    • What we have is fine, but we need to think of something better for "demographic" (this includes research expertise and prior belief). I will also make sure that "analytic(al)" is used consistently.
  • Sample size justification.
    • We might leave it as is for now, or hedge it like "we will aim for as many teams as possible, although we estimate a necessary minimum of 12" or the like. Or we could take it out entirely and just state that we will make sure to maximise the number of teams.
  • Ask teams if they know about papers.
    • Decide on when and where to ask.
    • At the end of the analyses and peer review.
  • PDFs of surveys instead of links to them?
    • We will upload pdf versions of the surveys to OSF and link to those in the RR.
  • Original "Blue bananas" study details to be included in the RR (as in related links/publications/other).
    • @troettge I'd like to be more explicit about the fact that the data come from a previous study and were not designed for this one specifically. Is there any info that would help with this?


Minutes - 2021/01/27

Action items

  • Write summary of prior meta studies (@troettge #31).
  • Check missing references in RR (@troettge).
  • Check https://journals.sagepub.com/home/amp for tex template (@jvcasillas).
  • Landing page, copy text to gdoc (@troettge ).
  • Analysts journey (all).
    • Think about how to divide in steps (conceptually and visually).
    • Think about possible icons.
  • Continue RR draft (@stefanocoretta).
    • Edited the factor analysis section (still WIP).
  • Continue OSF prepping.
    • Prepared TextGrids.

Discuss

  • Timo will work on the landing page.
  • Structure of Reviewers questionnaire.
  • Check transcription of objects and colours.

Peer review

As it currently stands, there is a mismatch between the text in the RR and the survey. But a few general questions:

  • Does the rating of the overall analysis include both the measurement and the statistical parts?
  • Are all ratings supposed to be a number between 0 and 100? Or are all ratings except the one for the overall analysis supposed to be text?
  • How are we going to use the text responses?

@troettge @jvcasillas

OSF Data component

  • Data dictionary.
  • Update Data component Wiki. @jvcasillas @troettge
  • Create an updated version of the trial lists to include:
    • typicality of AN combo (categorical)
    • mean typicality rating of AN combo
    • sd of typicality rating of AN combo
    • trial number
    • sentence
    • english glosses
  • Make an infographic of the data sources. @troettge are we doing this?
  • Update OSF with osf/manage-osf.Rmd. @jvcasillas @troettge
    • That uses osfr. You need to create a PAT and add it to a file called .secrets in your home folder with the syntax
      Sys.setenv(OSF_PAT = "<your-pat-here>")
      (see the sketch after this list).
    • Create the missing Readme.Rmd file.
    • Upload Readmes and data to OSF by running/rendering manage-osf.Rmd.
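A minimal sketch of that workflow; the node ID and file path are placeholders, and only the standard osfr calls are used:

```r
library(osfr)

# .secrets contains: Sys.setenv(OSF_PAT = "<your-pat-here>")
source(file.path(Sys.getenv("HOME"), ".secrets"))
osf_auth(token = Sys.getenv("OSF_PAT"))

project <- osf_retrieve_node("abcde")            # placeholder OSF node ID
osf_upload(project, path = "data/dataset.csv")   # hypothetical file path
```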

Minutes - 2021/01/20

Action items

  • Port questionnaires to Google Forms (@jvcasillas, #8).
  • Continue work on RR draft (@stefanocoretta).
    • Made several edits, mostly in Step 4 section (still WIP).
  • Edit ExperimentalDetails.
    • Restructuring of sections.
    • Reformulation and extension.
  • Edit forking paths figure (@troettge).
  • Prepare .bib file for RR (@jvcasillas).
  • Prepare graphics for analysis workflow (all) to be given to teams as aid (#26).
  • Update OSF Data component (@stefanocoretta).
    • Started prepping of TextGrids.

Discuss

  • RR
    • 💡 Idea: it would be nice to use a measurement-error model in the analysis-factors analysis too, so that we don't just use a point estimate of the deviation from the meta-analytical mean (a sketch follows this list).
      • GO.
    • How will we enter the demographic factors in the factors analysis, since we have a score per individual rather than per team?
      • Averaging scores.
      • Scale 1-100 belief.
    • Help with summarising neuroscience, cognitive modeling, clinical, predictive models in RR.
      • Timo will do it @troettge.
    • Discuss missing/ambiguous refs in RR
    • Use of papaja in RR draft (#28).
  • Discuss landing page (Timo)
    • A few things should be highlighted: co-authorship, improving standards.
  • Discuss analyst journey and experience (Timo)
    • In RR and landing page.
  • Materials
    • Info on German determiner and adjective ending for prepping TextGrids.
  • Project management
    • GH project Meetings.
    • GH Actions (assign label meetings and project Meetings automatically if issue title contains "Minutes").
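A minimal brms sketch of the measurement-error idea flagged above, assuming per-team deviations dev with posterior SDs dev_se and placeholder predictor names:

```r
library(brms)

# Response measurement error: each team's deviation from the
# meta-analytic mean enters with its own known standard error,
# rather than as a point estimate; sigma = TRUE also estimates
# residual variation beyond the known SEs.
m_dev <- brm(
  dev | se(dev_se, sigma = TRUE) ~ quality + prereg,
  data = d
)
```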

Typicality

@troettge Timo, could you let me know how the typicality categories were created? I just need to know whether the mean or the median (or something else) was calculated for each Adj/N pair, so I can include that in the trial lists (which now have mean, SD and median).

Minutes 2021-02-03

Action items

  • Landing page @troettge: #35.
  • Reviewers questionnaire @jvcasillas: draft in gdrive.
  • Check transcription of objects and colours.
  • Continue RR drafting @stefanocoretta.
  • OSF prepping @stefanocoretta: #16.
    • Corrected TextGrids IPA transcription (@troettge would be good if you could check here and here).
    • Added extended trial lists (with additional columns).
    • Added typicality ratings from the norming study, including a summary of it. [NOTE: there were encoding issues in the original file, for Moehre and Waescheklammer, which I fixed]

Discuss

Analytical approach collection questionnaire

Points to discuss on the questionnaire for the collection of the analytical approach:

  1. S et al. have included a box for pasting a script (if any). Do we want that, or do we want to collect scripts differently?
  2. S et al. have a question about their research question: they ask how likely the teams think it is that soccer referees tend to give more red cards to dark-skinned players (with a Likert scale from very unlikely to very likely). How could our question about prosodic variation be formulated?

Materials

See here for a list (not exhaustive) of the materials we'll need.

List of necessary materials


STEP 1: recruitment


STEP 2: primary data analyses

  • Dataset:
    • Curate dataset (@stefanocoretta).
    • Data dictionary (variables description) (@stefanocoretta).
    • Detailed description of experiment (plus norming study): here.
    • Screen Recording of experiment (plus norming study)
  • Analytical approach collection email (SS p. 53). (Stefano)
  • Analytical approach collection questionnaire (SS p. 54). (Stefano, #8)
  • Familiarity with analytical approach questionnaire (SS p. 72). (Joseph draft)

STEP 3: peer reviews

  • Peer review survey of final analytical choices (SS p. 74): draft.

STEP 4: evaluate variation
