langtonhugh / asreview_irr

Code to automatically produce a report from ASReview on inter-rater reliability.

License: Apache License 2.0

HTML 66.20% R 4.12% CSS 1.37% JavaScript 8.19% C 0.31% TeX 0.09% C++ 16.11% Makefile 0.12% Shell 0.01% AppleScript 0.01% Raku 0.01% SCSS 3.31% Less 0.01% Batchfile 0.01% Lua 0.07% Awk 0.08% Perl 0.01%

asreview rmarkdown

asreview_irr's Issues

Testing

Currently, there is no robust automated testing. We need to implement tests that run each time a change is made and check whether the change breaks anything. The test files should come in different sizes (both small and large) so that all of the components, such as the waffle graphs, render well.

Minimum and maximum file size

Currently, the report will take in a file of any size. We need to add limits on the file size so the software doesn't break.
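
A size check of this kind could look roughly like the sketch below. It is written in Python only to illustrate the logic (the repository itself is R Markdown), and the function name `check_file_size` and both limits are hypothetical; the issue does not state what the real limits should be.

```python
import os

# Hypothetical limits: the issue does not specify actual values.
MIN_BYTES = 1
MAX_BYTES = 50 * 1024 * 1024  # 50 MiB


def check_file_size(path, min_bytes=MIN_BYTES, max_bytes=MAX_BYTES):
    """Return the file size in bytes, or raise ValueError if it is out of range."""
    size = os.path.getsize(path)
    if size < min_bytes:
        raise ValueError(f"{path} is too small ({size} bytes, minimum {min_bytes})")
    if size > max_bytes:
        raise ValueError(f"{path} is too large ({size} bytes, maximum {max_bytes})")
    return size
```

Failing early with a clear message is likely easier for users to act on than an error raised halfway through rendering the report.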

Integrate using "renv"

Hey Sam! I'll be the new colleague Rens hired to look at ways to calculate the IRR, and to see if I could extend your package / do stuff with it.

I got some dependency problems, nothing I couldn't solve myself, but it still could be made a bit more user friendly by using an R environment. This encapsulates those packages in an isolated folder. That way, people can just download and run the code without having any trouble with dependencies and what not.

Update and error fix

The report has now been updated and expanded to generate new information, including written explanations of the descriptive statistics. One major change has been to remedy an error in the calculation of n relevant vs. unreviewed. If you ran the report prior to December 2023, please re-run it to check the impact.

Waffle for large data

For data containing > 5,000 abstracts (roughly), the waffle output is rendered rather useless. Could be fixed with ifelse statements, plotting differently for different sample sizes. For example:

less than 5000 abstracts, 1 square = 1 paper
more than 5000 abstracts, 1 square = 10 papers
etc.
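
The scaling rule above can be generalised so that no chain of ifelse statements is needed. The following is a minimal Python sketch of the arithmetic only (the report itself is R); the helper names and the `max_squares` default of 5,000 are assumptions taken from the rough threshold mentioned in this issue.

```python
import math


def papers_per_square(n_abstracts, max_squares=5000):
    """Smallest power of ten that keeps the waffle at or below `max_squares` squares."""
    scale = 1
    while math.ceil(n_abstracts / scale) > max_squares:
        scale *= 10
    return scale


def n_squares(n_abstracts, max_squares=5000):
    """Number of squares to draw once the scale has been chosen."""
    scale = papers_per_square(n_abstracts, max_squares)
    return math.ceil(n_abstracts / scale)
```

With these defaults, a data set at or below the threshold keeps 1 square = 1 paper, and anything larger moves to 10, 100, and so on papers per square. The chosen scale would need to be printed in the plot caption so readers can interpret the squares.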

license

Can you add a license to your work? As it currently is, no one can use your scripts, but I don't think this is intentional :-)

Implement different ways to calculate the inter-rater reliability

Currently, only the kappa statistic is calculated for the IRR. Implement other ways of calculating the IRR, for example ones that are robust to missing data. Give options for comparing the different methods, along with statistics for comparing them.
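
As a reference point for any alternative, here is a small Python sketch of Cohen's kappa, computed as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. It is not the code the report uses. Missing labels are represented as `None` and handled by simply dropping any record that either rater left unlabelled, which is only one simple option and an assumption of this sketch.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length label lists; None marks a missing label."""
    # Keep only records that both raters labelled (pairwise deletion).
    pairs = [(a, b) for a, b in zip(rater_a, rater_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n == 0:
        raise ValueError("no records were labelled by both raters")

    # Observed agreement: share of records with identical labels.
    p_o = sum(a == b for a, b in pairs) / n

    # Chance agreement: product of the raters' marginal proportions, summed over labels.
    labels = {label for pair in pairs for label in pair}
    p_e = sum(
        (sum(a == label for a, _ in pairs) / n)
        * (sum(b == label for _, b in pairs) / n)
        for label in labels
    )

    if p_e == 1:
        return float("nan")  # kappa is undefined when both raters use a single label
    return (p_o - p_e) / (1 - p_e)
```

Dropping incomplete pairs changes the sample the statistic is based on, so the number of retained records should be reported next to the estimate.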

Add file validation for the input files

Add validation code to make sure that the two ASReview input files match. If the two files do not match, it should return an error (for now; later it can be extended so that small incongruities can still work).
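
A strict version of this check might compare the sets of record identifiers in the two exports. The Python sketch below illustrates the idea only (the repository is R); the function names are hypothetical, and it assumes both files are CSV exports with a `record_id` column.

```python
import csv


def read_ids(handle, id_column="record_id"):
    """Read the identifier column from an open CSV file handle."""
    reader = csv.DictReader(handle)
    if reader.fieldnames is None or id_column not in reader.fieldnames:
        raise ValueError(f"column '{id_column}' not found")
    return [row[id_column] for row in reader]


def validate_same_records(handle_a, handle_b, id_column="record_id"):
    """Raise ValueError unless both files contain exactly the same record ids."""
    ids_a = set(read_ids(handle_a, id_column))
    ids_b = set(read_ids(handle_b, id_column))
    only_a = ids_a - ids_b
    only_b = ids_b - ids_a
    if only_a or only_b:
        raise ValueError(
            f"files do not match: {len(only_a)} ids only in the first file, "
            f"{len(only_b)} ids only in the second"
        )
    return len(ids_a)
```

Comparing sets ignores row order, so two exports that contain the same records in a different order still pass.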

record_ids don't match: how to solve

Hi there,

I tried out your script, it works very well.
However, it turns out that the record_ids differ between me and my colleague.
Therefore, I cannot use the results.
Is there any automated way to update the record_ids so they match between the .csv files?
(e.g. by matching on title?)
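
One possible way to match on title, sketched in Python rather than in the script's R: normalise each title, then map the record_ids from the second file onto those of the first. The function names are hypothetical, and the sketch assumes each record is available as a dict with `record_id` and `title` keys. Titles that occur more than once in the first file are treated as ambiguous and left unmatched.

```python
import re


def normalise_title(title):
    """Lowercase the title and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def map_ids_by_title(records_a, records_b):
    """Return ({id_in_b: id_in_a}, [unmatched ids from b]) using normalised titles."""
    by_title = {}
    ambiguous = set()
    for rec in records_a:
        key = normalise_title(rec["title"])
        if key in by_title:
            ambiguous.add(key)  # duplicate title: cannot be matched safely
        by_title[key] = rec["record_id"]

    mapping, unmatched = {}, []
    for rec in records_b:
        key = normalise_title(rec["title"])
        if key in by_title and key not in ambiguous:
            mapping[rec["record_id"]] = by_title[key]
        else:
            unmatched.append(rec["record_id"])
    return mapping, unmatched
```

Any records that end up in the unmatched list would still need to be reconciled by hand before the two files are compared.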

Different column names with ASReview 1.2

Hi,

First of all, thank you for the amazing tool! I think the names of the columns have changed in ASReview since this tool was last updated. I personally use ASReview 1.2 on a Mac and tried to use your tool with datasets from simulation projects.

To make it work I had to change some lines of code in report_example.Rmd:

I had to replace all the occurrences of “$included” with “$label_included”
I commented out the line: # select(record_id, first_authors, publication_year, primary_title, notes_abstract). Then I uncommented the line above it and removed ‘year’, as I did not see year exported in the ASReview dataset: select(record_id, authors, title, abstract)
I performed the same actions for the following lines
I ran the code again without issues

report_example.Rmd.zip

Make report generation independent of column name or breaking changes

Currently, if the name of the column changes then the reports break. We need to implement the handling of the files in such a way that this issue doesn't happen. I will do that by checking the codebase of ASReview and discussing with Rens how to change that so that it functions without issues.
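
One way to decouple the report from specific column names is to resolve each field through a list of known aliases. The Python sketch below shows the lookup logic only (the report is R Markdown). The canonical names and the `resolve_columns` helper are hypothetical; the alias pairs are the old and new column names reported in the ASReview 1.2 issue.

```python
# Canonical field -> column names seen in different ASReview exports.
# The canonical names on the left are assumptions made for this sketch.
COLUMN_ALIASES = {
    "label": ["label_included", "included"],
    "title": ["title", "primary_title"],
    "authors": ["authors", "first_authors"],
    "abstract": ["abstract", "notes_abstract"],
}


def resolve_columns(header, aliases=COLUMN_ALIASES):
    """Map each canonical field to the first alias present in `header`."""
    resolved = {}
    for canonical, candidates in aliases.items():
        match = next((name for name in candidates if name in header), None)
        if match is None:
            raise KeyError(f"none of {candidates} found for field '{canonical}'")
        resolved[canonical] = match
    return resolved
```

With this approach, a future rename only requires adding one alias to the table instead of editing every place in the report that refers to the column.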

Format of statistics output

Currently, the report just prints the console output from the inter-rater reliability tests (e.g., kappa). It would be clearer and easier for users if this output were formatted into a basic table (similar to how broom works for regression outputs).
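
To illustrate the kind of output meant here, the Python sketch below turns one row per statistic into an aligned plain-text table. It is not broom and not the report's code; the function name and the three columns (statistic, estimate, n) are assumptions chosen for the example.

```python
def format_stat_table(rows):
    """Format (statistic, estimate, n) tuples as an aligned plain-text table."""
    header = ("statistic", "estimate", "n")
    # Round estimates to three decimals and convert every cell to text.
    body = [(name, f"{estimate:.3f}", str(n)) for name, estimate, n in rows]
    table = [header] + body
    # Each column is as wide as its longest cell.
    widths = [max(len(row[i]) for row in table) for i in range(len(header))]
    lines = [
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in table
    ]
    return "\n".join(lines)
```

Keeping one row per statistic would also make it straightforward to show several reliability measures side by side if more are added later.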

File validation

Currently, the files that are given as input for the report are not validated to be identical. We need to implement a robust way to handle file validation to check if the files are the same. There's one R script that we can work from in one of the folders.