Note that there is a specific semantic for adding reviewers to the Author
field of the DESCRIPTION file. We've just added it to our recommendations
in the packaging guide:
https://github.com/ropensci/packaging_guide/blob/master/README.md#authorship
from software-review.
Based on a single repo of mine, it doesn't look like Appveyor puts info into `.git/config` with which to confuse itself later, should you rename things. Maybe this is unique to Travis.
from software-review.
@masalmon I think we're ready to approve and have you move over your repo. I've marked it as approved. I'll close this issue, and move it over to the ropenscilabs org - let me know if you need help with that.
from software-review.
"Moving repository to ropenscilabs/ropenaq. This may take a few minutes."
from software-review.
Cool I can see the redirection, so this is my last message here. 😸
from software-review.
@masalmon Thanks for your submission. We are looking for reviewers now.
from software-review.
Ok, thanks for the update!
from software-review.
Review
General Comments
- This is a really useful package providing access to an equally useful service that I can see filling a large need. I know quite a few air quality scientists that use R, and I'm sure this will be a big help to them. I'm actually surprised that this is the first package providing this functionality - nice work @masalmon!
- The functions all perform as expected, and from what I can see provide access to the full suite of services that the API offers. Overall, the documentation is clear and concise.
- I had a look over at the github page for the OpenAQ API, and it appears that the API is not yet fully mature. This must make developing an R package for it a bit challenging (but also likely helps raise important questions about its functionality and expose hidden bugs so is also probably quite useful in that regard).
- All examples in the README and function documentation run without errors.
- Running `devtools::check()` turned up five NOTEs, one WARNING, and one ERROR. The error is related to failing tests (see note below), and I think you should be able to address the rest relatively easily from the messages. Many of them are due to files/folders that should be listed in `.Rbuildignore` (`.lintr`, `.travis.yml`, `README.Rmd`, etc.).
- The vignette doesn't build due to a Gateway Timeout (as noted in ropensci-archive/ropenaq#5)
- `develop` folder: It's probably a good idea to move this to a development branch and merge features into master as they are developed.
rOpenSci Packaging Guidelines
- Title - To conform to rOpenSci guidelines, you could make the name all lowercase (`ropenaq`) or even just `openaq`?
- You've chosen nice, simple, descriptive function names that mirror the openaq API endpoints.
- Consider adding a code of conduct to let people know that this is a friendly, open, space to work in :)
- The README is great - nice examples and descriptions. The only thing missing is the title of the package at the top.
- It's generally considered best practice to use `Imports:` instead of `Depends:` for dependencies (see https://github.com/ropensci/packaging_guide#-package-dependencies & http://r-pkgs.had.co.nz/namespace.html#imports)
DESCRIPTION file
- `Title:` I think 'An R package for ...' is frowned upon by CRAN - perhaps something like "Access Air Quality Data From OpenAQ"?
- In the `Maintainer:` field, put your name (rather than "Who to complain to") :)
- The `Description:` field should contain one or more complete sentences describing what the package does.
- `License:` I think the correct specification is `GPL (>= 2)`. @sckott: does rOpenSci have policies/suggestions on licensing?
Dependencies
- As noted above, I don't think any of your dependencies need to be in the `Depends:` section. `httr`, `dplyr`, and `lubridate` can all be in `Imports:`, and from what I can see, `XML` is not needed at all?
- `testthat` should be in `Suggests:` as it is not necessary to use the package, only to run the tests.
- I feel like `OpenStreetMaps` (in `Suggests:`), which depends on `rJava`, is a pretty heavy dependency just to build the vignette. Can you use `ggmap` for the mapping in the vignette instead of `OpenStreetMaps`?
Function Usage and Documentation
- All functions:
  - `Value:` should specify that a `data.frame` (`"tbl_df"`) is returned instead of `data.table`.
  - I think it would be helpful to specify in a bit more detail what the columns in the returned data frame contain. E.g., `locations` and `count`.
- `cities`:
  - In what format(s) can `country` be specified? Only the two-letter ISO code as in the example, or will a full name do as well?
- `locations`:
  - It would be useful to have a bit of an explanation about what combinations of `city`, `country`, `location` are allowed. For example, can `location` be specified without `country` and `city`? What happens if you specify a city that's not in the country specified? Perhaps also reorder to `country`, `city`, `location` so they are in decreasing order of resolution (and the same order as in `measurements`).
  - It doesn't look like `city`, `country`, `location`, and `parameter` can take more than one value, though that might be a nice feature (e.g. `locations(country = c("IN", "CN"), parameter = c("pm25", "o3"))`). I took a quick peek at the API docs and it doesn't look like the API accepts more than one... but could you query the API once for each country and combine them? This could be something to think about for a future version though!
  - `date_from` and `date_to`: Perhaps mention what class of inputs these arguments accept (e.g., `Date`, `POSIXct`, `POSIXlt`, `character` in the form YYYY-MM-DD?)
- `measurements`: Similar comments as for `locations` in terms of documentation.
- `latest`: `measurements` has a `limit` param with a default value of `100`. `latest` seems like it should have the same for consistency (unless of course the API doesn't have it).
Source Code
General
- I like how thorough you are in your checking of inputs. This will cause the functions to fail informatively when the user inputs something incorrectly.
- Code modularization:
  - Try to write code that is modular to reduce repetition (the 'DRY' principle: Don't Repeat Yourself). Try to write many more smaller functions that encapsulate discrete operations and that can be reused. Look across the functions and find code that is repeated, and wrap it up as functions to be called within each larger function. For example, you can create a function that returns the base url for all queries:

base_url <- function() {
  "https://api.openaq.org/v1/"
}

## Then you can use that function in each exported function:
countries <- function() {
  url <- paste0(base_url(), "countries")
  httr::GET(url)
  # etc...
}

That way if the url changes etc., you only have to change it in one place.
  - It's common practice to put these functions that get used in a lot of other functions in a file called `utils.R` or `zzz.R`.
- If you're using the `pkg::function()` syntax, you don't need to use `@import`/`@importFrom` to add these to the `NAMESPACE` file. As long as the packages are in the `Imports:` section of the `DESCRIPTION`, you can use the functions directly using `::` without including them in the package namespace. See Hadley's recommendations for this here: http://r-pkgs.had.co.nz/namespace.html#imports. Of course there are many differing opinions on this!
Getting and parsing content
- Use `httr::stop_for_status` to throw an error when the request fails, and/or use `httr::status_code` if you want to check for Gateway Timeout (error code 504) specifically. That way you don't have to `grepl` for the specific text "Gateway time-out".
- `httr::GET` can take a named list of arguments to build the query. This can make constructing the call a bit cleaner, so that you don't have to `paste0` them all together with `&`s. You can check the arguments near the top of the function, then construct a list of all arguments to pass to the `query` argument of `GET`. For example:

url <- "https://api.openaq.org/v1/locations"
country <- "IN"
city <- "Mumbai"
location <- NULL
arg_list <- list(country = country, city = city, location = location)
httr::GET(url, query = arg_list)
# Any NULL elements of the list supplied to the query parameter are automatically dropped
- In most (all?) cases you should be able to get the results from `page <- httr::GET(query)` more easily and directly by using `httr::content(page, as = "text")`, and then parsing from json to a data frame using `jsonlite::fromJSON()`, rather than parsing the columns individually with `lapply`. This should result in more concise code, and hopefully be quite a bit faster.
- You could write a function to use everywhere for parsing content, such as:

get_content <- function(x) {
  content_type <- httr::headers(x)[["content-type"]]
  if (!grepl("application/json", content_type)) {
    stop("Unexpected content type")
  }
  cont <- httr::content(x, as = "text")
  parsed_content <- jsonlite::fromJSON(cont, simplifyDataFrame = TRUE, flatten = TRUE)
  parsed_content$results
}

cont <- get_content(page)  ## cont will be a nice data frame
- If you use the code above in `latest()`, the `measurements` column will be a list-column, with each row in the column containing a data frame with columns: `parameter`, `value`, `lastUpdated`, `unit`. To then expand the data frame returned from `get_content()` you can use `tidyr::unnest_(cont, "measurements")`, which will expand the `measurements` column and fill in the other columns appropriately.
- For `dplyr` functions, it is probably best to use the standard evaluation (SE) versions of the functions (i.e., `mutate_()` instead of `mutate()`). See the dplyr vignette on non-standard evaluation as a good resource to get started.
- `measurements`: I'm not sure you need to specify `page = 1` or set `limit` to 100 if it's `NULL`, as these are the default values for the API.
Tests
- Generally, they look great! There's not a lot you can do to test actual values, but you could also add some tests ensuring the dimensions of the returned tables are what you expect - number of columns, and especially number of rows when a limit value is supplied.
- Good job having `skip_on_cran` for the functions that query the API.
- Best practice is to have one `context()` call at the top of each file, so you could split the tests into multiple files named `test-cities.R`, `test-countries.R`, etc.
- All tests passed for me, except for `latest`: those tests almost always fail with a Gateway Timeout, but as noted this is likely an API issue. You could skip them for now until the API issue is sorted?
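A sketch of what such a skip could look like with testthat (the test body and file name here are hypothetical; `latest()` is the function under review):

```r
# in tests/testthat/test-latest.R (hypothetical file name)
test_that("latest returns a data frame", {
  skip_on_cran()
  # temporary: skip until the API's Gateway Timeout issue is resolved
  skip("latest endpoint currently returns 504 Gateway Timeout")
  expect_is(latest(), "data.frame")
})
```

`skip()` records the test as skipped (with the reason shown in the test summary) rather than silently deleting it, which keeps the intent visible until the API is fixed.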
This is my first code review, so thanks for letting me take part! Let me know if you have questions or comments; I'm happy to discuss further.
from software-review.
Thanks a lot @ateucher for your thorough and very useful review! I'll start working on this as soon as possible. It will make the package much better, thank you!
I guess there is no devtools function for easily changing the name of the package? I bitterly regret naming it too fast.
from software-review.
@masalmon it was my pleasure - I really enjoyed doing it :) If there's anything you run into that you need help with, ping me in an issue and/or on twitter. Also, I am definitely not an expert on `httr` and parsing json content, so if any other rOpenSci people have advice that is better than mine please chime in (@sckott?)
from software-review.
@masalmon, I should have linked to this in my review: Here's the httr vignette on writing API packages: https://cran.r-project.org/web/packages/httr/vignettes/api-packages.html
There's a lot of stuff on authentication that shouldn't apply to this package, but the other tips might be useful.
from software-review.
@ateucher cool, and again, thank you! I hope you'll like the final result. ☺ I'll post a point-by-point answer hopefully next week. I have started improving things, but for some others I need to spend time reading (lazy evaluation, for instance) -- I am very grateful for the links you put in the review. Regarding httr and jsonlite, your advice has already made my code much simpler! My old solution was quite clumsy. Have a good weekend!
from software-review.
nice work on the review @ateucher
@masalmon happy to answer any `httr` questions - one that comes to mind right now: `httr` v1.1.0 has some changes (https://cran.rstudio.com/web/packages/httr/). `content()` now warns if you don't set encoding, so use the `encoding` parameter when calling `content()`.
from software-review.
review of Ropenaq
I really like this package! It's an elegant means of accessing a very interesting (and large!) online database. I have a few pieces of advice that I think will make the code more modular and easy to develop as it grows. And grow it will, because it is quite useful indeed!
general
I had quite a challenge trying to run this package! I am using Linux (Ubuntu 14.04), and for the longest time I could not get R to talk to the API. This was extra frustrating, because I was able to use the API from `curl` at the command line! I finally found the answer by posting this StackOverflow question. h/t @richfitz for his assistance.
To correct the problem I had to run:
apt-get install libcurl4-openssl-dev
I also uninstalled then reinstalled `curl`:
install.packages("curl")
You can apparently find out if your user will have this problem by running `curl::curl_version()$ssl_version`; if it answers `GnuTLS`, then perhaps there should be a warning or a message directing them to this advice.
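For what it's worth, such a message could be emitted at load time along these lines (a sketch only, assuming an `.onAttach` hook is acceptable; the exact wording is made up):

```r
.onAttach <- function(libname, pkgname) {
  ssl <- curl::curl_version()$ssl_version
  if (grepl("GnuTLS", ssl)) {
    # point affected Linux users at the workaround described above
    packageStartupMessage(
      "Your curl build uses GnuTLS; if API calls fail, you may need to ",
      "install libcurl4-openssl-dev and reinstall the 'curl' package."
    )
  }
}
```

Using `packageStartupMessage()` (rather than `warning()`) keeps it suppressible via `suppressPackageStartupMessages()`.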
Function reviews
general thoughts on API functions
I really like how you've got five functions, each mapping to one of the five different API queries that are available. I was thinking that the names of the functions might be a bit similar to functions in other packages that also use geography. What do you think of prefixing every function with a package-specific string? Perhaps something like `aq_` before all the user-facing functions (i.e. `countries()` becomes `aq_countries()`). This is similar to another rOpenSci package, `geonames`, which uses `GN` (as in `GNcities()`). This has the added benefit of playing very nicely with RStudio, which will show all the package functions as completions when users type `aq_` and hit tab.
I'd suggest using `httr` convenience functions more. This will help when you are assembling queries and checking responses:
- assembling queries would be easier if you use the `query` argument accepted by `httr::GET()`. For example, in `cities.R`, line 24, you could use this approach. It will build the API call for you, and handle the encoding too:

URL <- "https://api.openaq.org/v1/cities"
indonesia <- httr::GET(url = URL, query = list(country = "ID"))
- Status codes are also easily parsed with `httr::status_code()`. For example, instead of `grepl("Gateway time-out", toString(contentPageText))`, you could use `httr::status_code(page)` to find out if it is 200 (everything is OK), 504 ("Gateway time-out"), etc. Then you could write some logic to stop the functions with warnings, or messages, etc. as appropriate.
For parsing responses, I'd suggest you try a two-step approach: first use `grepl` to identify whether the content of the response is json, then use `jsonlite::fromJSON()` to parse it. That is probably simpler and more robust than parsing by hand (e.g. as you do in `cities.R` L30 to L34):

## confirm that the response is what you expect:
grepl("json", httr::headers(page)$`content-type`) # is there a better way?
## parse the content
contentPageText <- httr::content(page, as = "text")
jsonlite::fromJSON(contentPageText)
countries()
This function looks great to me; besides the general tips above I have nothing to add. The help file might mention that the codes for countries follow the "ISO 3166-1 alpha-2" specification (er, that is, assuming that they actually do). In a future version of the package, perhaps you could include the package `countrycode` to allow more flexible naming of countries.
cities()
I really like how you are checking for available countries! I was noticing that you check to see if a country is available in all datasets, and I suggest you write a little country-checking function for the purpose. Perhaps something like:

check_country <- function(x) {
  ctrys <- as.character(countries()$code)
  is_country <- x %in% ctrys
  if (is_country) {
    x
  } else {
    msg <- paste0("x must be one of ", paste0(ctrys, collapse = ", "))
    stop(msg)
  }
}

check_country("CA")
or perhaps something via assertthat
:
assertthat::not_empty(country)
ctrys <- as.character(countries()$code)
assertthat::validate_that(country %in% ctrys)
If you rely on `httr` to handle proper URL encoding, perhaps you could do away with the `cityURL` column in this output? I think it could be distracting for users to see the country information appearing twice.
Ropenaq::locations
If `Ropenaq` is not loaded, the example for this function doesn't work:

Ropenaq::locations(city = 'Houston', parameter = 'co')
Error in Ropenaq::locations(city = "Houston", parameter = "co") :
  could not find function "%>%"

you could fix this by adding:
@importFrom dplyr %>%
to the function's roxygen documentation.
You have a large chunk of if-else statements devoted to confirming that the inputs are actually available. You could accomplish a very similar result using `match.arg`:

match.arg("pm25", c("pm25", "pm10", "so2", "no2", "o3", "co", "bc"))

In fact, a lot of this function (lines 34 to 154) is about checking the inputs. You could make it more succinct by doing something like this:
1. make a list that grabs all the optional arguments (`arglist <- list(parameter = parameter, has_geo = has_geo, value_from = value_from, value_to = value_to`, etc.)
2. get rid of any that are NULL, e.g. by doing `Filter(arglist, f = Negate(is.null))`
3. check the remaining ones (using `match.arg` or `assertthat`)
4. if they are OK, pass them to `httr::GET`
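Put together, the four steps above might look roughly like this (a sketch only, not the package's actual implementation; the argument names and endpoint URL are taken from elsewhere in this thread):

```r
locations_sketch <- function(country = NULL, city = NULL,
                             location = NULL, parameter = NULL) {
  # check the arguments that benefit from it (step 3, done up front here)
  if (!is.null(parameter)) {
    parameter <- match.arg(parameter,
                           c("pm25", "pm10", "so2", "no2", "o3", "co", "bc"))
  }
  # step 1: gather all the optional arguments into one list
  arglist <- list(country = country, city = city,
                  location = location, parameter = parameter)
  # step 2: drop the NULL ones
  arglist <- Filter(arglist, f = Negate(is.null))
  # step 4: pass the rest to httr::GET, which builds the query string
  httr::GET("https://api.openaq.org/v1/locations", query = arglist)
}
```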
encoding
Wow, you're getting lots of different encodings in your results from the API! I ran the following:
brz <- Ropenaq::locations(country = "BR")
brz$location %>% stringi::stri_enc_mark(.)
And I found that you've got a mix of UTF-8 and ASCII encoding! I'm afraid I'm no expert in this stuff. I was looking at this SO question to try to figure out how it works. There should be a way to change the whole vector to the same encoding. Perhaps you could write the file out to a `tempfile()`, then read it back in?
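I'm not sure either, but one simple thing to try first (base R's `enc2utf8()` converts a character vector to UTF-8; whether that is sufficient here is an assumption):

```r
brz <- Ropenaq::locations(country = "BR")
# convert the whole vector to UTF-8; ASCII-only strings are unaffected
brz$location <- enc2utf8(brz$location)
stringi::stri_enc_mark(brz$location)
```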
Also, I know that this is probably the fault of the API and not your package, but the places named in `brz$location` are not places in the city of Sao Paulo, but cities in the state of Sao Paulo. Yet the resulting data frame has "Sao Paulo" listed for all rows, in the column "city".
Vignettes
maps.Rmd
I wasn't able to get the vignette to build until I went in and removed "NL".
When running the `for` loop at L31 to L33, you're using `c()` to grow a vector. That can have the unfortunate effect of slowing down your computation. I think it could be faster (and simpler) to grab all the location info like this:
country_codes <- Ropenaq::countries()$code
## create a list to hold output
locations_country <- vector(mode = "list", length = length(country_codes))
names(locations_country) <- as.character(country_codes)
for (country in country_codes) {
print(country)
locations_country[[country]] <- Ropenaq::locations(country = country)
}
dplyr::bind_rows(locations_country)
Rather than plucking out the lat and long columns, this downloads the location info and simply combines all the outputs with `dplyr::bind_rows()`.
I only get this "Gateway time-out, but try again in a few minutes." error when I look for `country = "NL"`. Perhaps you could add `tryCatch()` to this for loop, so it doesn't automatically fail when only one country does not work? `tryCatch()` could also go inside `Ropenaq::locations()` itself, to prevent it from failing in situations like these.
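For instance, inside the loop above, the call could be wrapped so that a single failing country is recorded as `NULL` instead of stopping everything (a sketch):

```r
locations_country[[country]] <- tryCatch(
  Ropenaq::locations(country = country),
  error = function(e) {
    # report and move on; a 504 for one country won't abort the loop
    message("Skipping ", country, ": ", conditionMessage(e))
    NULL  # dplyr::bind_rows() simply ignores NULL list elements
  }
)
```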
That said, I wonder if it is worth actually downloading this data every time the vignette is built. Could you instead download a version of the data and place it in `data/`, or perhaps `R/sysdata.rda`? The downloading code could still serve as a useful example (though not run; you could set `eval=FALSE`), and you'd still get your (very nice!) map.
- ggplot2 code: no need to add the theme settings twice.
The resulting map is exquisite and really highlights just how much data is available from this API! I can see why you wanted to make it 😄
graphics.Rmd
This vignette is quality: succinct, useful and attractive! Perhaps a more descriptive name for this (and other) vignettes? I think users are likely to have other packages with vignettes called `graphics` (or, come to think of it, `maps`).
ropenaq-vignette.Rmd
Great intro. I'd like it to be a little longer, including some general background on what air quality is, how it is measured, and who these people over at the openaq project are. A link to their website (in addition to the API specification) would be helpful.
DESCRIPTION
A few small changes to this file would be helpful (and avoid those nasty notes on `R CMD check`):
- replace "Maintainer" and "Author" with `Authors@R: person("Maelle", "Salmon", email = "[email protected]", role = c("aut", "cre"))`
- `GPL (version 2 or later)` -> `GPL (>= 2)`
- I got the note "Description should contain at least one complete sentence."
- I couldn't find where you use the `XML` package. So you could remove it from here?
- `Imports` is usually where you put the packages you need, rather than `Depends`
MISC
- `develop/` folder. This looks like some code that you're thinking of including either in the package or in some vignettes in the future (for example, I couldn't find `findMissing2015()` used anywhere outside of `develop/finding_proportion_missing.R`). I'd suggest working on these things on a development branch, just to keep the package a bit simpler.
- I'd remove all the references to `lintr`, since that just tidies your code and doesn't affect the package at all.
from software-review.
I really enjoyed reviewing this package! Thank you so much @sckott for the invitation, and thank you @masalmon for contributing such an interesting package. I learned a lot while reading your code.
I'm also very sorry this is late, and I hope it is still useful 😊 😄
from software-review.
Thank you so much @aammd, this will be very helpful! I'm looking forward to implementing your advice! It's interesting to see how you and @ateucher have spotted different things; both reviews complement each other very well! 👍
I'm just wondering what you mean by references to lintr? I like having the lintr-bot tell me where my syntax is wrong each time I push something to GitHub. And if I leave it here and get contributions from other people later, I can also check their code for "syntax compatibility" with mine.
Regarding encoding: ropensci-archive/ropenaq#4 Imagine a user wants to see which cities are available for Poland and then make a query with measurements for one of these cities. I provide cityURL because otherwise you couldn't just copy and paste in a query: with e.g. a French locale, the accents of city names in Poland do not appear, so what you see in a result of cities() wouldn't be the real name of the city (missing accents), and therefore the httr translation wouldn't be right either. So now I expect the user to provide the encoded version as argument, and I help him/her by providing the cityURL and locationURL columns in all tables that have city and location. I had no better idea. 😕
Anyway, I'll start working on the package again next week and I already know it will get better thanks to both reviews, thanks again @ateucher and @aammd !
And thanks @sckott for the httr remark !
from software-review.
Ropenaq -- point-by-point answer to the reviewers
Thanks again for your very helpful and thorough reviews, @ateucher and @aammd! When I'm a(n R) grown-up I want to be an rOpenSci package reviewer too and make people happy and improve their code 😃
I'll answer both of you in the same reply to avoid repeating myself. I was not sure how to structure my answer, so I hope it'll be easy to read -- I thought that having each remark in italics followed by an answer would be too cumbersome. Both reviews were well structured and easy to read! Please tell me if I missed something.
General comments
- I was surprised too that there was no OpenAQ R package yet, but the Open AQ platform is fairly new. I asked them if they had any plans to add an R package before starting to write one, because duplicating an existing package = 😬
- Regarding the API, @jflasher and @RocketD0g are very patient and helpful so it's been a pleasure to work with them, although the API is not mature.
- If you know air quality scientists who have suggestions for new data sources for the platform, you can give them this link: https://github.com/openaq/openaq-fetch/issues where they can open an issue for suggesting the new data source.
- I think I have fixed my DESCRIPTION file and package structure for passing R CMD check (except a warning for the vignettes that do not have the corresponding html -> is this an issue?). I only had two files in develop/, and now have a develop branch; I will see if I use it. One of the two files has already become a vignette presenting how to use the R package openair with Open AQ data.
- I have added @aammd's installation troubleshooting advice to the README. It must have been quite frustrating to have to deal with it! Thanks for spotting this! Is the README the best place for such advice?
rOpenSci Packaging Guidelines
- I'm keeping the change of name (to ropenaq) for the end. Is there any clever way to change the name of the package in all files? Could it be problematic for the github repo (although it is only a change of case)?
- The names of the functions now all start with "aq_".
- I now use Imports instead of Depends.
DESCRIPTION file
I have made all changes.
Dependencies
- I have made all changes, including deleting XML. It was nice to learn about the difference between Imports and Depends. I think I've seen a very thorough article about it somewhere but always thought I was going to read it later and later never came.
- The vignette now uses ggmap. The result is not as nice as with OpenStreetMaps, but I agree it is a heavy dependency, and I mention the possibility of using OpenStreetMaps.
- I now import `%>%` from dplyr so that `Ropenaq::aq_locations(city = 'Houston', parameter = 'co')` works even if the package is not loaded.
Function Usage and Documentation
- I have changed all return descriptions so that the data.frames are presented as such.
- In all help files there is now a list of the columns with their meaning and class.
- In the help file of country I have written the ISO specification of the country code and that only the code, not the name, can be used. Is this clear enough or should I write this in all help files? In all help files the argument country is described as "Limit results by a certain country – a two-letter code, see countries() for finding the code based on the name.".
- I have tried to better explain country/city/location in the details section of the `aq_locations` function. But I'm not sure it's clear.
- I think that the possibility to query several parameters, or even several countries, should be a feature in a future version. It might be very useful indeed! I also need to find a good way to deal with very big `aq_measurements` queries. I've added the `page` argument that I had somehow neglected, and a warning for when the number of found measurements is higher than `limit`. The warning returns the number of pages to be queried (this warning is printed no matter the value of `page`). I am not sure I should do the loop inside the function, because if the user e.g. writes `aq_measurements()` it would last forever, so I prefer to let the user write their own loops for now. Any suggestions? Is this something for a future version or should it be dealt with now? Should I add a verbose argument for turning warnings off?
- I am now more precise in the description of the `date_from` and `date_to` arguments.
- The latest endpoint of the API does not have any limit argument (I had to ask).
- Regarding encoding / having the URL-encoded columns for city and location: as I said in a previous comment, it was the easiest solution I found for avoiding locale problems and helping the user make queries for e.g. Poland, but I'll be happy to hear any suggestions. I've noticed that if I paste a city name with Polish accents into my R session, whether at home (French locale) or at work (Spanish locale), some accents disappear, which is why I created the columns.
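On the paging question above, a user-side loop could be documented roughly like this (a sketch only; it assumes `aq_measurements()` returns a data frame and uses a hypothetical stopping rule based on the page size):

```r
results <- list()
page <- 1
repeat {
  res <- aq_measurements(country = "IN", limit = 100, page = page)
  if (nrow(res) == 0) break   # no more results
  results[[page]] <- res
  if (nrow(res) < 100) break  # last (partial) page
  page <- page + 1
}
all_measurements <- dplyr::bind_rows(results)
```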
Source Code
- I check inputs very thoroughly because I've seen a remark about it in an old rOpenSci package review, I think. 😉
- I have added a utils.R file and all function files have become much smaller. In utils.R there are: `base_url`; `buildQuery`, which returns a list of arguments for `httr::GET` after checking them; `getResults`, which returns a table of results that still needs to be transformed; the three functions `functionURL`, `addCityURL` and `addLocationURL` for adding the columns with the URL-encoded names; then `addGeo`, `functionGeo` and `functionNotGeo` for adding the latitude and longitude columns; `functionTime` and `functionTime2` for transforming the columns with times; and `functionParameters` for transforming the column with the parameter names for `aq_location`. I was first happy with all the functions, and then I started using standard evaluation instead of non-standard evaluation... Now I have the impression that I have written very ugly code, so I'd be grateful for suggestions.
- I actually like both `importFrom`, for having the "list" of functions in the NAMESPACE, and `pkg::`, which makes my code easier to read (at least for me). Is it ok if I keep both?
- I now make better use of `httr` and `jsonlite`:
# does the query and then parses it
getResults <- function(urlAQ, argsList){
page <- httr::GET(url = urlAQ,
query = argsList)
# convert the http error to a R error
httr::stop_for_status(page)
contentPage <- httr::content(page, as = "text")
# parse the data
resTable <- jsonlite::fromJSON(contentPage)$results
resTable <- dplyr::tbl_df(resTable)
return(resTable)
}
The suggestions you both made turned this into a very easy change! I just wonder if having a simple `stop_for_status` is user-friendly enough?
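If a plain `stop_for_status` turns out to be too terse, one alternative (a sketch, assuming `page` is the `httr` response object as in `getResults` above) is to build a message from `httr::http_status()`:

```r
# stop with a message that names the API and the HTTP status
if (httr::status_code(page) >= 400) {
  stop("The OpenAQ API request failed [", httr::status_code(page), "]: ",
       httr::http_status(page)$message,
       call. = FALSE)
}
```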
- I have added a check for the size of the table resulting from the query in 4 functions (all except countries). If the table has zero rows, the function returns an empty table and a warning. This was not mentioned in the reviews but I realize I had no good message when it happened.
- @aammd, are you happy with the `buildQueries` function, or do you think it would be easier to read/better if I used `assertthat`?
- I now use `tidyr` in `aq_latest`.
- I read the part about NSE/SE in Hadley Wickham's Advanced R book and the vignette you pointed me to, as well as this article http://www.numbertheory.nl/2015/09/23/using-mutate-from-dplyr-inside-a-function-getting-around-non-standard-evaluation/ but I still feel my code is really clumsy. It works (well, maybe it's not even clever SE!) and it has `_` for all `dplyr` functions, but it seems too complicated. I'll be happy to get any help.
Tests
- There is now a test file for each context.
- I commented out the failing `aq_latest` test. I promise I don't usually do this when I get a `testthat` error!
- I don't know how to test my check of the results table being empty: I don't know how to systematically write a query that will yield an empty table. And now code coverage is only 94%. 😆
lintr
- Why remove the references to lintr?
Vignettes
- There are now four of them: the general one, the one with the graphics for one day in several places, the one with the map, and one showing how to use the great `openair` package on the retrieved data.
- I have changed their filenames and titles, are they ok now?
- The vignettes now all build. Do you get any problems with them?
- The vignette creating the map now uses your code suggestion, @aammd, thanks a lot! And I saved the data in a data/ folder inside the vignettes folder, is that good practice or should the data be elsewhere?
- As I mentioned earlier, this vignette now uses `ggmap` instead of `OpenStreetMaps`.
- I was not sure what information about air quality to include in the general vignette. I have added a sentence about Open AQ that I copied from their website, and a link to said website. What else would you like to see here, @aammd? Maybe I could ask @jflasher and @RocketD0g where I could find a concise description.
I hope I haven't missed anything. This review process is really very interesting. I can see that the package is getting better, and I can apply nearly everything I learnt to other R code/packages I have to write, so again, many thanks!
PS: How can I acknowledge the reviewers' input in the DESCRIPTION file, @sckott ?
from software-review.
Just giving some news about the package:
- The API recently made a few changes: all endpoints now have `limit` and `page` arguments, so I need to change a few lines of code in the parts where I check that e.g. a country exists when the user passes it as an argument. My package still works because I added the arguments to the functions, but this check is still missing.
- I have started to add memoization with `memoise`.
- @aammd and @ateucher I'd like to add you as package authors because of all the useful feedback, is it ok for you?
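For context, memoisation (what the `memoise` package provides) amounts to wrapping a function with a cache keyed on its arguments, so repeated calls return the stored result. A dependency-free sketch of the idea for a single string argument; `memoise::memoise()` handles arbitrary arguments and cache expiry far more robustly:

```r
# Minimal memoisation sketch; memoise::memoise() does this properly.
memoise_simple <- function(f) {
  cache <- new.env(parent = emptyenv())
  function(key) {
    if (!exists(key, envir = cache, inherits = FALSE)) {
      assign(key, f(key), envir = cache)
    }
    get(key, envir = cache, inherits = FALSE)
  }
}
```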
I'll come write a comment again when I'm done with the modifications because of the API change and with memoise (a few weeks at most). I hope that the editor @sckott is ok with this "strategy"!
from software-review.
@masalmon yes, sounds good
from software-review.
I have now prepared a new version of the package. Important things to note:
- I have not changed the package name... yet? Is it compulsory? I feel bad having the capital R, but I kind of wish there were a magical solution for changing the name everywhere. But hey if I have to I will!
- All endpoints have now page and limit arguments.
- The output of all functions is now a list of three data frames: one for the results, one with metadata, and one with timestamps (I have updated the docs, README and vignettes accordingly). The metadata now includes the number of results found on the API, which can exceed the `limit` argument; in that case the user has to compare the number of results found with `limit`, and then loop over several pages. As I said previously, in a future version I'd like to add better tools for getting a lot of data, but on the other hand, for really big extracts OpenAQ has CSV dumps. So maybe the "future version" thing is really better documenting this limitation.
- I have now decided against caching because the results change very often.
- I think I still have both `::` and `importFrom` in some cases... But I quite like `::`. Does it make the code much slower? Is it really bad style?
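The "loop over several pages" idea mentioned above can be sketched as below; `fetch_page` stands in for any `aq_*` results call, and the names and stopping rule are illustrative, not the package's API:

```r
# Keep requesting pages until one comes back with fewer than `limit`
# rows, then bind all pages together.
fetch_all <- function(fetch_page, limit = 100) {
  page <- 1
  pages <- list()
  repeat {
    res <- fetch_page(page = page, limit = limit)
    pages[[page]] <- res
    if (nrow(res) < limit) break
    page <- page + 1
  }
  do.call(rbind, pages)
}
```

A real version would also use the "found" count from the metadata to stop after exactly `ceiling(found / limit)` requests.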
Now I'll wait for feedback in order to finally make the package ready for transfer... and when the API becomes stable I'd like to submit it to CRAN which makes me a bit nervous.
And again, thanks a lot for the previous feedback!
from software-review.
@masalmon I would be delighted to be listed as a reviewer on this package! It's a really cool package and I enjoyed studying it.
I would love to comment on the changes you've made -- I apologize for saying little in response to your awesome & extensive revisions. I confess it felt a little overwhelming! @sckott , what is the best way to go forward? Should I write another comment here?
from software-review.
> I have not changed the package name... yet? Is it compulsory?
We don't make anyone change a pkg name, but we do find it easier for users if they don't have to worry about case. If you change the name, GitHub will redirect users automatically to the new repo name: `masalmon/Ropenaq` -> `masalmon/ropenaq`
> what is the best way to go forward? Should I write another comment here?
@aammd if you have time, yes please! We realize your time is valuable, so comments don't have to be extensive
from software-review.
Ok so changing the name will be my next step! 😇 I was smarter with my next package.
from software-review.
Let me pre-emptively warn you about a real headscratcher re: travis:
TL;DR: manually change the travis slug in `.git/config` and save yourself hours of frustration.
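For reference, the stanza the travis gem leaves behind in `.git/config` looks roughly like this (the exact key layout is from memory, so verify against your own file); after renaming the repository, the `slug` value is what needs updating by hand:

```ini
[travis]
	slug = masalmon/Ropenaq
```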
from software-review.
Thanks @jennybc, that's thoughtful! I'm quite lucky because this API does not require authentication; however, I'll be careful if something bad happens on Appveyor (for which I encrypted my GitHub path and put the secret in the config file), since I guess it might have similar issues.
from software-review.
It seems it was much easier than I had envisioned... So now this whole issue deals with a package called ropenaq.
@aammd you're listed as a reviewer now in the DESCRIPTION.
from software-review.
Hi @masalmon, sorry for being so silent lately! I would be thrilled to be added as a reviewer, thank you! I've been lurking and watching some of your work, and I think it looks awesome! Unfortunately I don't think I'll have time to do much more review in the short term, but if I get a chance I will peek in, as I am very interested and I know I can learn a lot from your work.
from software-review.
@sckott, how should I proceed?
I re-read my answers. My main doubts are now:
- a few documentation things (whether I explain city/country/location well, and whether the few words about air quality in the readme/general vignette are enough)
- whether I should get rid of all `::` calls when I use `importFrom` in the same file (I like having `::` to see what I used, but I know it makes computation slower... though not that much slower?)
For a future version of ropenaq my goals are:
- implement a few air quality index functions;
- make it easier to get lots of data, probably by documenting how to do this with the package versus going to the website for the CSV dumps.
When the API is stable I want to make the package ready for CRAN submission.
from software-review.
@sckott awesome, thank you!
The "Transfer ownership" button is in the danger zone of settings which makes me a bit nervous, ah, but it seems pretty easy.
Since I'm a member of `ropenscilabs`, do I already have the right to transfer a repo there? When would the repo be transferred to `ropensci`? I'm a bit lost with all these names 😁
from software-review.
I think you should be able to; try it, and if it doesn't work I'll fix some things.
We started the second organization, ropenscilabs, for earlier-stage packages/projects; once more mature, they'll move to ropensci. "More mature" is not a very specific definition, so I can't give an answer yet on what that means, but at some point we'll move it to ropensci.
from software-review.
@sckott ok thanks for explaining, now I'll transfer it! 😁
from software-review.