philsf / philsf.workflow
philsf's data analysis "mis en place".
Home Page: https://philsf.github.io/philsf.workflow/
License: GNU General Public License v2.0
We could probably use sed to systematically update the version number in all relevant files.
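As a minimal sketch of that idea, the helper below replaces one version string with another in a list of files. The function name `bump_version` and its argument order are hypothetical, and it assumes version strings contain only digits and dots. It writes through a temporary file instead of using `sed -i`, because the in-place option takes different arguments in GNU and BSD sed.

```shell
# Hypothetical helper: replace version OLD with NEW in every file given.
# Usage: bump_version OLD NEW FILE...
bump_version() {
    old=$1
    new=$2
    shift 2
    # Escape the dots so sed matches them literally, not as "any character".
    old_re=$(printf '%s\n' "$old" | sed 's/\./\\./g')
    for file in "$@"; do
        tmp=$(mktemp) || return 1
        # Edit into a temp file, then copy the result back over the original;
        # copying (instead of moving) keeps the original file's permissions.
        if sed "s/$old_re/$new/g" "$file" > "$tmp"; then
            cat "$tmp" > "$file"
        fi
        rm -f "$tmp"
    done
}
```

Which files count as "relevant" is left to the caller; the quoted `"$@"` keeps file names with spaces intact.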
Don't auto-download the PDFs, show them in the browser instead.
We forgot to include the new script templates in SAR-init after creating them.
Improve the readability and structure of the repo README.
Include the new best practices after the good experiences tested in SAR-2021-003 and SAR-2021-004.
knitr and here
input in SAP, just as results is sourced in SAR
analytical_mockup structure by default in documents
Most links are broken.
Now that there is more than one document type, change the header and footer to acknowledge
Header
Footer
Note: consider creating a first-page header/footer for better reading flow.
Some tasks are not executed when a directory or path contains special characters, including spaces.
We need to quote or escape every variable that contains a path when passing it to a command like git.
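A minimal sketch of the quoting fix, assuming a POSIX shell: the hypothetical wrapper `run_in_dir` changes into a directory and runs a command there. Both the directory and the command's arguments stay double-quoted, so a path with spaces or parentheses reaches the command as a single word.

```shell
# Hypothetical helper: run a command inside the given directory.
# Usage: run_in_dir DIR COMMAND [ARGS...]
run_in_dir() {
    dir=$1
    shift
    # "$dir" is quoted so spaces and other special characters are not split
    # into separate words; "--" stops a leading dash being read as an option.
    # The subshell keeps the caller's working directory unchanged.
    ( cd -- "$dir" && "$@" )
}
```

For example, `run_in_dir "$SAR_DIR" git status --short` would work even when `SAR_DIR` holds a path such as `my project (v2)`; the variable name here is only illustrative.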
Create a .Rbuildignore file to pass the devtools check.
Include
Options:
R CMD from the shell (future-proof)
GitHub doesn't render pander tables well in the simple format; we must use the rmarkdown format instead.
Set table.style in panderOptions in all templates.
After running basefiles, include a standard line + comment to ignore private datasets.
Don't commit this change, so that users can opt out of this option in the workflow.
Use SAR (Statistical Analysis Report) instead of "data analysis" in pt)
yyyy-SAR-NN-v01
yyyy-SAP-NN-v01
SAR_pt
SAR_en
SAP_pt
SAP_en
scripts
README
Consider batch renaming all previous reports for easier identification of documents;
Consider including semantic information in the document code
The SAR-sync script could accept an argument to sync only one remote, instead of all of them. Ideally, filter the list of available remotes using the given arguments.
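One way the filtering could look, as a sketch and not the actual SAR-sync code: the hypothetical function `select_remotes` takes the list of available remotes (for instance the output of `git remote`) plus the names requested on the command line, and prints only the requested names that exist. With no names given it prints every remote, which keeps the current sync-everything behavior as the default.

```shell
# Hypothetical helper: print each requested remote that is actually available.
# $1 is a newline-separated list of available remotes; the remaining
# arguments are the requested names. With no names, print the whole list.
select_remotes() {
    available=$1
    shift
    if [ "$#" -eq 0 ]; then
        printf '%s\n' "$available"
        return 0
    fi
    for wanted in "$@"; do
        # -x matches whole lines and -F treats the name as a fixed string,
        # so "origin" does not match "origin2" and dots are not wildcards.
        if printf '%s\n' "$available" | grep -qxF -- "$wanted"; then
            printf '%s\n' "$wanted"
        else
            printf 'warning: unknown remote: %s\n' "$wanted" >&2
        fi
    done
}
```

A sync loop could then iterate over `select_remotes "$(git remote)" "$@"` and fetch each name it prints.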
It would be nice to use best practices and declare workflow versions in relevant files.
We've been ignoring table sizes for far too long. It's time to pick a size.
Tables with too many columns (e.g. analytical data) can be divided into smaller tables, but we don't want too many subdivisions.
Tested in SAR-2021-003:
philsf-biostat/org#35
Now that there is more than one document template, the filename styles.docx clashes when more than one template is used in the same project. Create individual filenames for each style to avoid this.
Create mockup examples for objects required by the document templates, or disable the objects in the templates. results.R must run for it to be enabled by default.
analytical_mockup for SAP templates
results in SAR templates
tab_desc?
tab_inf?
install checks for !Linux to configure the environment for Windows. This is wrong.
We should check for MINGW64/MINGW32 instead to set that environment.
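A sketch of the proposed check, assuming the kernel name comes from `uname -s` (under Git Bash on Windows it starts with `MINGW64` or `MINGW32`, followed by a version suffix). The function name `is_mingw` is hypothetical; it takes the kernel name as an argument so the logic can be tested on any platform.

```shell
# Hypothetical helper: succeed only when the given kernel name, as printed
# by `uname -s`, belongs to a MinGW shell (e.g. Git Bash on Windows).
is_mingw() {
    case $1 in
        # The trailing * accepts suffixes such as "_NT-10.0-19045".
        MINGW64*|MINGW32*) return 0 ;;
        *) return 1 ;;
    esac
}
```

The install script could then guard its Windows-specific setup with `if is_mingw "$(uname -s)"; then ... fi`, so macOS and other non-Linux systems are no longer treated as Windows.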
pander() is throwing a warning that it can't handle the output of as_kable().
Check if it has ever worked (and how). We are passing the "caption" argument to pander(), but kable() also accepts it.
Maybe it is time to configure knitr::kable() to the options we're using with pander() and switch.
SAR-sync et al. issue warnings when the current branch is not available in the remote to be fetched.
We can do better by tweaking the ignored strings.
So we shall.
Create a new Rmarkdown template to create Protocols for future studies and projects.
Use the same formatting choices as the report template.
For the templates to appear in the templates list we need an R package structure.
We might need to change the repo name
The github_document format from rmarkdown automatically creates a .md file that is compatible with GFM. This file is suitable for online visualization, complete with figures and links.
This might be a useful output format for reports, although it may disrupt current practices.
One could also replicate this behavior by setting keep_md
table.style to rmarkdown by default
results in SAR docs
Include options for the most frequent content?
Most scripts detect a usage mistake only when the user provides no arguments at all.
A major overhaul of argument processing in all scripts is needed.
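As one possible starting point for that overhaul, the sketch below validates the number of arguments against a range, not just against zero. The function name `check_args` and the exit status 2 are assumptions for illustration, not the scripts' current behavior.

```shell
# Hypothetical helper: validate an argument count against a range.
# Usage: check_args USAGE_TEXT MIN MAX COUNT
# Prints the usage text to stderr and returns 2 when COUNT is out of range.
check_args() {
    usage=$1
    min=$2
    max=$3
    count=$4
    if [ "$count" -lt "$min" ] || [ "$count" -gt "$max" ]; then
        printf 'usage: %s\n' "$usage" >&2
        return 2
    fi
}
```

A script would call it first thing, for example `check_args "SAR-sync [remote]" 0 1 "$#" || exit $?`, so that too many arguments are rejected as well as too few.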
Create a template with a standard agreement of how I approach the consulting:
dataset
)
Use a similar template from the SAR/SAP documents, and send over e-mail in PDF format.
Parse the remote-all-set argument list to check if there is an optional list of SAR_DIRs to remote-set
install to bin/
philsf.workflow-*
BM-setup anymore
sync.repo -> SAR-sync, *.all)
Include the new best practices after the good experiences tested in SAR-2021-003 and SAR-2021-004.
input: default dataset names
data.raw from read_excel
analytical and analytical_mockup
labelling
describe: tables setup (before inference)
gtsummary, gt
gtsummary default theme + optional pt translation
effectsize
inference:
infer
add_p
add_difference()
modeling:
broom, broom-mixed
plots:
ggplot2 theme
plots-save:
as_rtf() + writeLines() (commented out)
results:
modeling to before inference
Adapt the README in English from a recent analysis repo
Using here::here() produces a full path, instead of a relative path.
We need to use error = FALSE to pass a relative path that works.
Even so, we lose the ability to preview in the RStudio editor panel.
At least we can check the viz of the document in HTML.
Most consulting jobs are performed remotely, so printing and signing a document is unfeasible.