A collection of R scripts to calculate the magnitude of the intervention effect.
mrc-cso-sphsu / effect_estimates
License: GNU General Public License v3.0
Statistics are needed, by time and run, for the whole population and for the groups important to the study of inequalities (education, household type, gender, age group, income quintile).
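Outcomes:

- out_ghq (`dhm`)
- out_ghqcase (`dhm <= 24`)
- out_emp (`les_c4 == "EmployedOrSelfEmployed"`)
- out_emphrs (`laboursupplyweekly` converted to numerical)
- out_income (`equivaliseddisposableincomeyearl`)
- out_poverty (`atriskofpoverty == 1 || atriskofpoverty == null`)

Groups:

- grp_male (`dgn == "Male"`)
- grp_female (`dgn == "Female"`)
- grp_age25 (`dag >= 25 && dag < 45`)
- grp_age45 (`dag >= 45 && dag < 65`)
- grp_hchild (`n_children_1-17 != 0`)
- grp_nchild (`n_children_1-17 == 0` or missing)
- grp_emp (`les_c4 == "EmployedOrSelfEmployed"`)
- grp_unemp (`les_c4 == "NotEmployed"`)
- grp_povin (`grp_emp == 1 && out_poverty == 1`)
- grp_povout (`grp_emp == 0 && out_poverty == 1`)
- grp_edlow (`deh_c3 == "Low"`)
- grp_edmed (`deh_c3 == "Medium"`)
- grp_edhi (`deh_c3 == "High"`)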
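As a rough illustration of how conditions like these could be turned into 0/1 flags in R, here is a minimal sketch on a made-up data frame. It assumes that the conditions listed above correspond, in order, to the `out_*` and `grp_*` names used in the output table, and that `null` in the source conditions maps to `NA` in R. The toy values are invented for illustration only.

```r
# Toy individual-level data (hypothetical values, not LABsim output)
people <- data.frame(
  dgn = c("Male", "Female", "Female"),
  dag = c(30, 50, 70),
  les_c4 = c("EmployedOrSelfEmployed", "NotEmployed", "EmployedOrSelfEmployed"),
  atriskofpoverty = c(1, NA, 0),
  stringsAsFactors = FALSE
)

# Simple group flags: 1 if the condition holds, 0 otherwise
people$grp_male  <- as.integer(people$dgn == "Male")
people$grp_age25 <- as.integer(people$dag >= 25 & people$dag < 45)
people$grp_age45 <- as.integer(people$dag >= 45 & people$dag < 65)
people$grp_emp   <- as.integer(people$les_c4 == "EmployedOrSelfEmployed")

# atriskofpoverty == 1 || atriskofpoverty == null:
# a missing value counts as "in poverty" under the condition as listed
people$out_poverty <- as.integer(
  is.na(people$atriskofpoverty) | people$atriskofpoverty == 1
)

# In-work and out-of-work poverty combine the employment flag with the poverty outcome
people$grp_povin  <- as.integer(people$grp_emp == 1 & people$out_poverty == 1)
people$grp_povout <- as.integer(people$grp_emp == 0 & people$out_poverty == 1)
```

Note that a name such as `n_children_1-17` is not a syntactic R name, so it would need backticks or renaming on import.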
Output columns, in order:

Column | Description |
---|---|
scenario | scenario number or description |
run | the run number, missing for overall results combining all runs |
time | the year |
grp_all | dummy variable (1 if the results relate to the whole population, zero otherwise) |
grp_male | dummy variable (1 if the results relate to the male population, zero otherwise) |
grp_female | dummy variable (1 if the results relate to the female population, zero otherwise) |
grp_age25 | dummy variable (1 if the results relate to the age 25-44 population, zero otherwise) |
grp_age45 | dummy variable (1 if the results relate to the age 45-64 population, zero otherwise) |
grp_hchild | dummy variable (1 if the results relate to households with children in population, zero otherwise) |
grp_nchild | dummy variable (1 if the results relate to households without children in population, zero otherwise) |
grp_emp | dummy variable (1 if the results relate to the employed population, zero otherwise) |
grp_unemp | dummy variable (1 if the results relate to the unemployed population, zero otherwise) |
grp_povin | dummy variable (1 if the results relate to the in-work poverty population, zero otherwise) |
grp_povout | dummy variable (1 if the results relate to the out-of-work poverty population, zero otherwise) |
grp_edlow | dummy variable (1 if the results relate to the low education population, zero otherwise) |
grp_edmed | dummy variable (1 if the results relate to the medium education population, zero otherwise) |
grp_edhi | dummy variable (1 if the results relate to the high education population, zero otherwise) |
out_ghq_base | mean of continuous GHQ12 score for baseline |
out_ghq_ref | mean of continuous GHQ12 score for the reform |
eff_ghq | effect of reform on continuous GHQ12 score for this run and year (out_ghq_base minus out_ghq_ref) |
rank_ghq_base | rank of out_ghq_base for group and year |
rank_ghq_ref | rank of out_ghq_ref for group and year |
rank_eff_ghq | rank of eff_ghq for group and year |
out_ghqcase_base | mean of dummy GHQ12 caseness for baseline |
out_ghqcase_ref | mean of dummy GHQ12 caseness for the reform |
eff_ghqcase | effect of reform on dummy GHQ12 caseness for this run and year (out_ghqcase_base minus out_ghqcase_ref) |
rank_ghqcase_base | rank of out_ghqcase_base for group and year |
rank_ghqcase_ref | rank of out_ghqcase_ref for group and year |
rank_eff_ghqcase | rank of eff_ghqcase for group and year |
out_emp_base | mean of employment dummy for baseline |
out_emp_ref | mean of employment dummy for the reform |
eff_emp | effect of reform on employment dummy for this run and year (out_emp_base minus out_emp_ref) |
rank_emp_base | rank of out_emp_base for group and year |
rank_emp_ref | rank of out_emp_ref for group and year |
rank_eff_emp | rank of eff_emp for group and year |
out_emphrs_base | mean of employment hours for baseline |
out_emphrs_ref | mean of employment hours for the reform |
eff_emphrs | effect of reform on employment hours for this run and year (out_emphrs_base minus out_emphrs_ref) |
rank_emphrs_base | rank of out_emphrs_base for group and year |
rank_emphrs_ref | rank of out_emphrs_ref for group and year |
rank_eff_emphrs | rank of eff_emphrs for group and year |
out_income_base | mean of income for baseline |
out_income_ref | mean of income for the reform |
eff_income | effect of reform on income for this run and year (out_income_base minus out_income_ref) |
rank_income_base | rank of out_income_base for group and year |
rank_income_ref | rank of out_income_ref for group and year |
rank_eff_income | rank of eff_income for group and year |
out_poverty_base | mean of poverty dummy for baseline |
out_poverty_ref | mean of poverty dummy for the reform |
eff_poverty | effect of reform on poverty dummy for this run and year (out_poverty_base minus out_poverty_ref) |
rank_poverty_base | rank of out_poverty_base for group and year |
rank_poverty_ref | rank of out_poverty_ref for group and year |
rank_eff_poverty | rank of eff_poverty for group and year |
This should make any code reviews much easier.
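To make the effect and rank columns concrete, here is a minimal base R sketch on invented per-run means for a single group and year. It assumes the ranks are taken across runs within each year, ascending, with tied values sharing the lowest rank; the specification does not state the ranking direction or tie handling, so those choices are illustrative only.

```r
# Toy per-run means for one scenario, group and year (hypothetical values)
runs <- data.frame(
  run = 1:4,
  time = 2020,
  out_ghq_base = c(11.2, 11.5, 10.9, 11.1),
  out_ghq_ref  = c(11.0, 11.6, 10.4, 11.1)
)

# Effect as described in the column list: baseline minus reform
runs$eff_ghq <- runs$out_ghq_base - runs$out_ghq_ref

# Helper (hypothetical): rank values within each year across runs.
# ties.method = "min" is an assumption, not taken from the specification.
rank_within_year <- function(x, year) {
  ave(x, year, FUN = function(v) rank(v, ties.method = "min"))
}

runs$rank_ghq_base <- rank_within_year(runs$out_ghq_base, runs$time)
runs$rank_ghq_ref  <- rank_within_year(runs$out_ghq_ref, runs$time)
runs$rank_eff_ghq  <- rank_within_year(runs$eff_ghq, runs$time)
```

The same pattern would repeat for the other outcomes (ghqcase, emp, emphrs, income, poverty).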
As a quick initial run of creating graphs, here's the running of the code currently in `R/outputting graphs.R`:
```r
library(readr)
library(tidyverse)
library(SPHSUgraphs)

out_data <-
  read_csv("C:/Programming/covid19_effect_estimates/data/new_data.csv",
           show_col_types = FALSE)

# tidying dataset ---------------------------------------------------------

# Keep whole-population, per-run rows, then reshape so that each row is one
# scenario/run/year/outcome/policy combination with one column per metric.
compare_results <- out_data |>
  filter(grp_all == TRUE, !is.na(run)) |>
  select(-contains("eff"), -starts_with("grp")) |>
  pivot_longer(
    -c(scenario, run, time),
    names_to = c("metric", "outcome", "policy"),
    values_to = "val",
    names_pattern = "(.*)_(.*)_(baseline|reform)"
  ) |>
  pivot_wider(
    id_cols = c(scenario, run, time, outcome, policy),
    names_from = metric,
    values_from = val
  )

# faceted graph -----------------------------------------------------------

# Mean and standard error across runs, by year and policy, one panel per outcome
compare_results |>
  ggplot(aes(time, out, colour = policy, fill = policy)) +
  geom_vline(xintercept = 2019, colour = "red") +
  stat_summary(
    fun.data = mean_se,
    geom = "ribbon",
    alpha = 0.5,
    colour = NA
  ) +
  stat_summary(fun.data = mean_se, geom = "line") +
  stat_summary(fun.data = mean_se, geom = "point") +
  facet_wrap(~ outcome, scales = "free_y") +
  scale_fill_manual(
    "Policy",
    aesthetics = c("fill", "colour"),
    labels = c("Baseline", "Covid policy"),
    values = sphsu_cols("University Blue", "Thistle", names = FALSE)
  ) +
  theme(legend.position = "bottom")
```
As a small initial point: these outputs currently have a very small range (accidentally computed as just the standard error of the means across the 50 runs in the file). Should the intervals combine the SDs of the means of each run, rather than taking the variance between the mean outputs?
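One possible way to frame the question above is to compare a between-run-only standard error with a pooled one that also carries each run's own uncertainty. The sketch below borrows Rubin's rules from multiple imputation (total variance = mean within-run variance + (1 + 1/m) × between-run variance). Whether that rule is appropriate for these simulation runs is an open question, and the per-run standard errors are assumed to be available, which the current output file may not provide. All numbers are invented.

```r
# Hypothetical per-run summaries for one outcome, group and year
run_means <- c(10.2, 10.5, 9.9, 10.1, 10.3)
run_ses   <- c(0.30, 0.28, 0.31, 0.29, 0.30)  # assumed: SE of the mean within each run

m <- length(run_means)
pooled_mean <- mean(run_means)

# Between-run only: standard error of the mean of the run means
# (this is what mean_se computes when given the run means)
se_between <- sd(run_means) / sqrt(m)

# Rubin's rules: combine within-run and between-run variance
within_var  <- mean(run_ses^2)
between_var <- var(run_means)
total_var   <- within_var + (1 + 1 / m) * between_var
se_total    <- sqrt(total_var)

c(pooled_mean = pooled_mean, se_between = se_between, se_total = se_total)
```

Under this rule the pooled standard error is always larger than the between-run-only one, which would widen the ribbons in the graph.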
Suggested edits so far:
Created on 2022-07-15 by the reprex package (v2.0.1)
This code needs some data validation.
@dkopasker I'd like you to describe here what you expect from every variable in the raw data files, including its range, possible `NA` or `NaN` values, etc. In addition, we need to clearly state how the code processes such values. Common options include dropping such entries, asking the aggregate functions to ignore them, or replacing them with imputed values (a mean of some sort, or the median).
This approach should make the data analysis much more reproducible.
We should also consider LABsim output as potentially corrupted, as the code itself is not tested properly. Constant changes in the code do not help here either. That means this script must notify every user whenever any input value is outside its expected range.
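A minimal sketch of what such a check could look like in base R follows. The function name `validate_ranges`, the variable list and the numeric limits are all hypothetical placeholders; the real expected ranges would need to come from whoever documents the raw data.

```r
# Hypothetical expected ranges; placeholders, not documented LABsim limits
expected_ranges <- list(
  dhm = c(0, 36),
  dag = c(0, 120)
)

# Warn about missing columns, NA/NaN values and out-of-range values.
# Returns the list of problems invisibly so callers can also stop() on it.
validate_ranges <- function(data, ranges) {
  problems <- character(0)
  for (var in names(ranges)) {
    if (!var %in% names(data)) {
      problems <- c(problems, sprintf("%s: column is missing", var))
      next
    }
    x  <- data[[var]]
    lo <- ranges[[var]][1]
    hi <- ranges[[var]][2]
    n_missing <- sum(is.na(x))  # is.na() is TRUE for both NA and NaN
    n_outside <- sum(x < lo | x > hi, na.rm = TRUE)
    if (n_missing > 0) {
      problems <- c(problems, sprintf("%s: %d NA/NaN value(s)", var, n_missing))
    }
    if (n_outside > 0) {
      problems <- c(problems,
                    sprintf("%s: %d value(s) outside [%g, %g]", var, n_outside, lo, hi))
    }
  }
  if (length(problems) > 0) {
    warning(paste(problems, collapse = "\n"), call. = FALSE)
  }
  invisible(problems)
}

# Toy input with one out-of-range value, one NA and one NaN in dhm
toy <- data.frame(dhm = c(12, 40, NA, NaN), dag = c(30, 50, 64, 25))
problems <- suppressWarnings(validate_ranges(toy, expected_ranges))
```

Running the check before any aggregation would make the handling of missing values explicit: for example, a later `mean(x, na.rm = TRUE)` would then be a documented choice rather than a silent one.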
Hi @vkhodygo,
The R code used to aggregate the results from 1,000 runs of the simulation produces an unusual number of ties up to the eighth decimal place. For example, two observations for `out_ghqcase_baseline` in `grp_age25` in 2020 have a value of 0.54180604. This happens multiple times across various outcomes, groups, and years. Could you please review the code to ensure there is not an error?
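To quantify how often this happens, a small helper along the following lines could count tied values within one outcome, group and year. The function name `count_ties` and the data are hypothetical; only the value 0.54180604 is taken from the report above.

```r
# Hypothetical per-run results for one group and year
results <- data.frame(
  run = 1:6,
  time = 2020,
  out_ghqcase_baseline = c(0.54180604, 0.54180604, 0.53511706,
                           0.55183946, 0.53511706, 0.54849498)
)

# Count values that occur more than once after rounding to `digits` decimals
count_ties <- function(values, digits = 8) {
  rounded <- round(values, digits)
  counts <- table(rounded)
  counts[counts > 1]
}

ties <- count_ties(results$out_ghqcase_baseline)
ties
```

One thing worth ruling out before suspecting a bug: the mean of a 0/1 indicator over n individuals can only take n + 1 distinct values, so if the group size is the same or similar across runs, some exact ties between runs would be expected.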