b-rodrigues / rap4all
Home Page: https://raps-with-r.dev/
License: Other
Hi Bruno,
In 6.3.4 Data frames, the code:
nested_unemp %>% mutate(nrows = map(data, \(x)filter(x, year == 2015)))
Why do we need to use an anonymous function? Would it not work fine as:
nested_unemp %>% mutate(nrows = map(data, function(x){filter(x, year == 2015)}))
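A minimal sketch that may help with this question: both forms are anonymous functions, and `\(x)` is just the shorthand for `function(x)` available from R 4.1.0 onwards. The toy list below is a hypothetical stand-in for the nested `data` column, and base `lapply()` is used in place of `map()` so the example runs without extra packages.

```r
# Toy stand-in for the nested data column (hypothetical data, not the book's).
data_list <- list(
  data.frame(year = c(2014, 2015), rate = c(5.1, 4.8)),
  data.frame(year = c(2015, 2016), rate = c(6.0, 5.7))
)

# Shorthand lambda syntax, available from R 4.1.0 onwards.
with_shorthand <- lapply(data_list, \(x) x[x$year == 2015, ])

# The same anonymous function written with the `function` keyword.
with_keyword <- lapply(data_list, function(x) { x[x$year == 2015, ] })

# Both calls keep only the 2015 rows and return the same result.
identical(with_shorthand, with_keyword)
```

Under these assumptions, the two `map()` calls in the question should behave the same way; the choice between them is a matter of style and of the minimum R version to support.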
Just a few comments reading the page about functional programming, not an issue per se:
maybe monad: I would have been interested to see how you manage this in the frame of a pipe, how you can substitute a default value to Nothing(), etc.
In cmd.exe, ls is not there. It is only available in PowerShell, at least on Windows 10, no idea on Windows 11 though.
On the page about git, you tell people to use git add . then git commit -am. But git commit -am already stages changes itself: it is close to git add -u followed by git commit -m. -a = all (stage all modified and deleted tracked files; new files still need git add) and -m = message.
There is also a confusion between git am, which applies a series of patches coming from a mailbox, and git commit -am.
For more info you can use git help am and git help commit.
Hi, I enjoyed reading your blog (such as about running R code using older versions of R), so I couldn't miss your book.
After reading the first "real" chapter, I feel it is a bit rambly: it mixes several concepts in a single paragraph, and several ideas are spread across multiple paragraphs. I think that using more subheadings would improve the clarity and the organisation of the text.
I see some points that could be highlighted and their current position in the text:
The current text could be reorganised to follow said highlights, which would make it easier to read.
Hi Bruno,
I have cloned your repo - targets-minimal, activated renv using renv::activate() and then ran renv::restore(). The packages download but do not install. I get the following error message from trying to install the MASS package:
Installing MASS ... FAILED
Error: Error installing package 'MASS':
================================
* installing *source* package ‘MASS’ ...
** package ‘MASS’ successfully unpacked and MD5 sums checked
** using staged installation
** libs
using C compiler: ‘Apple clang version 14.0.0 (clang-1400.0.29.202)’
using SDK: ‘MacOSX13.1.sdk’
clang -arch x86_64 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I/opt/R/x86_64/include -fPIC -falign-functions=64 -Wall -g -O2 -c MASS.c -o MASS.o
MASS.c:37:23: error: unknown type name 'Sint'; did you mean 'int'?
With a load more error output. I am wondering whether this is because I am using a newer version of R (R 4.3.1) and renv (renv 1.0.2). Or should this still work? (I am using a Mac.)
Thanks
Jake
Would pak help simplify the installation of system-level dependencies in the Docker chapter?
PS Love the book.
Hello Bruno, I took up your challenge from Mastodon to read a bit of your draft. One thing that stands out is how you use the . separator rather than , separator for 000s. e.g. 400.000 rather than 400,000. As a native English reader, I usually expect to see the , and the . is used as a decimal point. If I do a PR with typos, do you want me to mark these for change also?
Do you prefer a single PR for every single change, or one PR for all changes I make to a single file?
In the preface, you use some acronyms like ONS, RAP, PI, etc. The text would be easier to read if each were accompanied by its definition, at least on first appearance, e.g. Office for National Statistics (ONS).
Some of them are not defined, e.g. Principal Investigator (PI). I don't know if this one is well known outside of R&D circles.
Line 353 in 4a4f7cd
Thank you for the book! I'm loving the message so far. I'm always trying to explain this concept to newer R-programmers reluctant to learn git or document code, and now I have a much better resource :)
I know you are still working on it, but I figured I'd point out anything I see to save you some time. I don't mean to nitpick, just trying to help.
Anyway, the line above is missing a "be".
So what does this all mean? This means that reproducibility is on a continuum,
and depending on the constraints you face your project can be not very reproducible
to totally reproducible.
Purpose of the book: teach practitioners (in research or industry, doesn’t matter), how to make workflows reproducible. Do we agree on that?
As for an outline:
I think that we could skip any intro to R, and state that readers need to be familiar with R already. I would say at least comfortable with writing functions already?
What do you think?
Hi Bruno,
Have you thought about updating your scripts which are prepared for inflation by fusen so that they are consistent with the syntax used to deal with indirection and tidyselect? This lets you refer to the variables explicitly within the pipe, i.e., to filter on the column locality, you need to call it using .data$locality. An example from your code is:
make_commune_level_data <- function(flat_data){
  flat_data |>
    filter(!grepl("nationale|offres", .data$locality),
           !is.na(.data$locality))
}
Without this, when inflating, there are many warnings which appear telling you either:
make_country_level_data: no visible binding for global variable ‘locality’
Undefined global functions or variables:
  locality
Or, in a tidyselect context, you cannot use .data$ and should instead enclose the variable name in quotes (""); otherwise you get the warning:
Use of .data in tidyselect expressions was deprecated in tidyselect 1.2.0. Please use
I am slightly unsure whether this is best practice, and it doesn't seem to be too clear online. But I thought I would share it as food for thought, since doing so removes all the warnings. If you are interested, I am happy to share my code to save you time.
Here are some links:
https://dplyr.tidyverse.org/articles/programming.html#indirection
https://community.rstudio.com/t/use-of-data-in-tidyselect-expressions-is-now-deprecated/150092
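The two cases described above can be sketched together as follows. This is only an illustration under stated assumptions: the toy data and the `price` column are hypothetical, and `all_of()` is one way of passing column names as strings to a tidyselect verb (plain quoted names, as suggested above, are another). In a package, the `.data` pronoun itself usually also has to be imported from rlang for the check to stay quiet.

```r
library(dplyr)

# Hypothetical toy data; only the column name `locality` comes from the thread.
flat_data <- tibble(
  locality = c("Luxembourg", "Moyenne nationale", NA, "Esch-sur-Alzette"),
  price = c(100, 200, 300, 400)
)

make_commune_level_data <- function(flat_data) {
  flat_data |>
    # Data-masking verbs (filter, mutate, ...): refer to columns via .data$.
    filter(
      !grepl("nationale|offres", .data$locality),
      !is.na(.data$locality)
    ) |>
    # Tidyselect verbs (select, rename, ...): pass column names as strings.
    select(all_of(c("locality", "price")))
}

commune_data <- make_commune_level_data(flat_data)
```

With this toy input, the national average row and the missing locality are dropped, and no bare column name is left for the package check to flag as a global variable.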
A space is missing in the command cat.Rprofile. I suppose it should be cat .Rprofile.
On the page about git, I would recommend using apt instead of apt-get. Using apt-get is not wrong per se, but why not use the more modern and potentially more user-friendly version for beginners?
More info there: https://itsfoss.com/apt-vs-apt-get-difference/
sudo apt update
sudo apt install git
just some things I must not forget
title
Thanks for making this OS. I have read some parts of the book, attended one of the workshops held online, and found everything in your talk helpful! One minor thing I noticed is that, as far as I am concerned, RStudio projects should be mentioned to improve reproducibility. At least for those using the RStudio IDE (maybe the majority, nowadays), this is very handy.
Reference:
Formally, "data" is considered a plural noun.
For example, see here: https://www.britannica.com/dictionary/eb/qa/Is-Data-Singular-or-Plural-
Nevertheless, it is common for people to use it as a singular noun. I think this is a decision that needs to be taken by you and implemented consistently throughout the book, so I won't open PRs on this matter.
Regardless of which chapter I am viewing of the book at: https://raps-with-r.dev/, the 'Edit this page' link returns a '404 - page not found' error.
minor typo in chapter 13 right above section 13.3 (I think)
The pipeline is nothing but a list (told you lists where were a very important object) of targets.
Sorry, not really an R developer, this is probably a stupid question.
On this page, could you please explain quickly what those packages are doing?
library(dplyr)
library(purrr)
library(readxl)
library(stringr)
library(janitor)
I was expecting some explanation, but it did not come ;) Or is it even useful to mention loading packages here?