cornelllabofornithology / ebirdst

Access and Analyze eBird Status and Trends Data

Home Page: https://ebird.github.io/ebirdst/

License: GNU General Public License v3.0

R 99.73% TeX 0.27%

ebirdst's Issues

Function error: calc_full_extent() not working

I just started using ebirdst for some exploratory data analysis. Not sure if it's a problem on my end or not, but I can't find anything elsewhere on the internet about this so I figured I'd ask here. Everything else so far is working great, but when I try to use calc_full_extent() I get the following error:

Error in calc_full_extent(occ_proj) : could not find function "calc_full_extent"

And when I run ?ebirdst::calc_full_extent I get this:

No documentation for ‘calc_full_extent’ in specified packages and libraries:
you could try ‘??calc_full_extent’

Seems the function does not exist. Has it been replaced with something different? Thanks!

Question about PD confidence intervals

Hello! Great package. I was wondering if somebody could explain the procedure you follow to construct the bootstrapped confidence intervals for the partial dependence plots. Given that PDPs are constructed from grouped data, I didn't think there was a way to construct traditional confidence bands. Do you know of any literature that supports the method used in this package?

The file 'config.json' does not exist/is not being downloaded

Hello,

I am using path_bw <- ebirdst_download("Blackpoll Warbler", tifs_only = FALSE), and I get 73 files downloaded to the bkpwar folder.

Downloading file 1 of 73: bkpwar_range_raw_lr_2021.gpkg
Downloading file 2 of 73: bkpwar_range_raw_mr_2021.gpkg
Downloading file 3 of 73: bkpwar_range_smooth_lr_2021.gpkg
Downloading file 4 of 73: bkpwar_range_smooth_mr_2021.gpkg
Downloading file 5 of 73: band-seasons.csv
Downloading file 6 of 73: bkpwar_abundance_full-year_max_hr_2021.tif
Downloading file 7 of 73: bkpwar_abundance_full-year_max_lr_2021.tif
Downloading file 8 of 73: bkpwar_abundance_full-year_max_mr_2021.tif
Downloading file 9 of 73: bkpwar_abundance_full-year_mean_hr_2021.tif
Downloading file 10 of 73: bkpwar_abundance_full-year_mean_lr_2021.tif
Downloading file 11 of 73: bkpwar_abundance_full-year_mean_mr_2021.tif
Downloading file 12 of 73: bkpwar_abundance_seasonal_max_hr_2021.tif.....

When I run abunds_bw <- load_raster("abundance", path = path_bw) I get the following error
The file 'config.json' does not exist in: C:\Users\jcast\OneDrive\Documents\UsersjcastOneDriveDocumentsR-GISebird_data\2021\bkpwar

Do you have any idea why this is happening? EBIRDST_KEY is already set and I am using the latest version of the package.

Thank you in advance.

Jessica

Use setZ to store times

Just discovered raster has a formal way of storing dates associated with RasterStacks. setZ() and getZ() set and get dates for layers. Could be a better way of handling dates than using the layer names, which requires parsing.
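A minimal sketch of the idea, assuming the raster package is installed. The two tiny layers and their dates are made up for illustration and are not Status and Trends data:

```r
# Sketch: attach dates to layers with setZ()/getZ() instead of parsing names.
# The block is skipped when the raster package is not installed.
if (requireNamespace("raster", quietly = TRUE)) {
  # two tiny layers standing in for weekly rasters (made-up values)
  r <- raster::stack(raster::raster(matrix(1:4, 2)),
                     raster::raster(matrix(5:8, 2)))
  week_dates <- as.Date(c("2018-01-04", "2018-01-11"))  # hypothetical dates
  # store one date per layer in the z slot
  r <- raster::setZ(r, week_dates, name = "date")
  # the dates come back as Date objects, with no string parsing
  layer_dates <- raster::getZ(r)
}
```

The dates travel with the stack object, so code that needs them would not depend on a layer naming convention.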

calc_bins() fails for Chimney Swift and Purple Martin

From Lotem at Audubon

The second issue (which I mentioned on the call) is that the calc_bins function sometimes throws an error for me:

Error in optim(start, llik, hessian = TRUE, method = method, ...) :
L-BFGS-B needs finite values of 'fn'

This has happened with both Chimney Swift and Purple Martin, and I’m not sure how to fix it. Any ideas?
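For reference, one way this message can arise in base R is when the objective is infinite at the starting values. The sketch below reproduces that with a made-up sample and a log-normal likelihood. It is not a diagnosis of the Chimney Swift or Purple Martin case, and nll() and fit_lnorm() are hypothetical helpers, not package functions:

```r
# Made-up sample containing a zero, which has zero density under a log-normal
x <- c(0, 1.2, 0.7, 2.5)

# negative log-likelihood of a log-normal; p = c(meanlog, sdlog)
nll <- function(p, data) {
  -sum(dlnorm(data, meanlog = p[1], sdlog = p[2], log = TRUE))
}

# hypothetical helper: fit by L-BFGS-B, keeping sdlog strictly positive
fit_lnorm <- function(data) {
  optim(c(0, 1), nll, data = data, method = "L-BFGS-B",
        lower = c(-Inf, 1e-6))
}

# with the zero included, nll() is Inf at the start values and optim() stops
err <- tryCatch(fit_lnorm(x), error = function(e) e)

# dropping non-positive values keeps the objective finite
fit <- fit_lnorm(x[x > 0])
```

If something similar is happening here, checking the input values for zeros, NA, or Inf before the fit would be a reasonable first step.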

Clarification on product percent-population

Are the values of this product between 0-1 (proportion) or 0-100 (percent)? The description reads: "percent-population: the percent of the total relative abundance within each cell. This is a derived product calculated by dividing each cell value in the relative abundance raster with the total abundance summed across all cells". However, this procedure would give a proportion rather than a percentage.
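The arithmetic behind the question, as a small sketch with made-up cell values (not real Status and Trends data):

```r
abd <- c(0, 2, 3, 5)     # made-up relative abundance values for four cells

# the calculation as described: each cell divided by the total
prop <- abd / sum(abd)   # values fall in 0-1 and sum to 1

# a percentage would need the extra factor of 100
pct <- 100 * prop        # values fall in 0-100 and sum to 100
```

Checking whether a layer's cell values sum to roughly 1 or roughly 100 would show which convention the product uses.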

Fail gracefully if no internet connection or AWS S3 not responding

Currently, cases where data can't be accessed aren't handled cleanly. curl::has_internet() may be useful here.
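A rough sketch of what failing gracefully could look like. fetch_or_null() and its fetch argument are hypothetical and not part of ebirdst; the only external call is curl::has_internet(), and it is only used when curl is installed:

```r
# Hypothetical wrapper: returns NULL with a message instead of raising an error.
# `fetch` stands in for whatever function actually retrieves the data.
fetch_or_null <- function(url, fetch = readLines) {
  # consult curl::has_internet() only when the curl package is available
  if (requireNamespace("curl", quietly = TRUE) && !curl::has_internet()) {
    message("No internet connection; skipping ", url)
    return(invisible(NULL))
  }
  tryCatch(
    fetch(url),
    error = function(e) {
      # covers the case where the connection is up but the service is not
      message("Could not access ", url, ": ", conditionMessage(e))
      invisible(NULL)
    }
  )
}

# usage with a stand-in fetcher that always fails
res <- fetch_or_null("https://example.invalid",
                     fetch = function(u) stop("service unavailable"))
```

Callers would then test for NULL rather than wrapping every call in their own tryCatch().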

Switch to fractional ES starting with 2021 S&T

Installation Failed: Long Filepaths

Running the devtools install
devtools::install_github("CornellLabofOrnithology/ebirdst")

Produces the following output:

Downloading GitHub repo CornellLabofOrnithology/ebirdst@HEAD
✔ checking for file 'C:...\AppData\Local\Temp\Rtmpo9tUyl\remotes25104b9c149d\CornellLabofOrnithology-ebirdst-78c51c0/DESCRIPTION' ...
Warning in file.copy(pkgname, Tdir, recursive = TRUE, copy.date = TRUE) :
over-long path
Warning in file.copy(pkgname, Tdir, recursive = TRUE, copy.date = TRUE) :
over-long path
Warning in file.copy(pkgname, Tdir, recursive = TRUE, copy.date = TRUE) :
over-long path
Warning in file.copy(pkgname, Tdir, recursive = TRUE, copy.date = TRUE) :
over-long path
Warning in file.copy(pkgname, Tdir, recursive = TRUE, copy.date = TRUE) :
over-long path
Warning in file.copy(pkgname, Tdir, recursive = TRUE, copy.date = TRUE) :
over-long path
Warning in file.copy(pkgname, Tdir, recursive = TRUE, copy.date = TRUE) :
over-long path
Warning in file.copy(pkgname, Tdir, recursive = TRUE, copy.date = TRUE) :
over-long path
Warning in file.copy(pkgname, Tdir, recursive = TRUE, copy.date = TRUE) :
over-long path
Warning in file.copy(pkgname, Tdir, recursive = TRUE, copy.date = TRUE) :
over-long path
Warning in file.copy(pkgname, Tdir, recursive = TRUE, copy.date = TRUE) :
over-long path
Warning in file.copy(pkgname, Tdir, recursive = TRUE, copy.date = TRUE) :
over-long path
ERROR
copying to build directory failed
Error: Failed to install 'ebirdst' from GitHub:
! System command 'Rcmd.exe' failed

Swapped my home filepath info for ... in the quote above.

Here's my sessionInfo()

R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8
[3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] processx_3.6.1 compiler_4.2.0 R6_2.5.1 rprojroot_2.0.3
[5] cli_3.3.0 prettyunits_1.1.1 tools_4.2.0 w>

I previously tried changing the Windows "Enable Long File Paths" setting in the registry, but R does not seem to recognize it. I've also tried changing the directory where packages are installed, both through .libPaths() and with the help of with_libpaths(), but no luck so far. Originally I was getting a message specifically mentioning that long file paths are an issue in Windows, but I wanted to see if there was a way to resolve this.

Make trajectory plotting agnostic of start and end weeks

test data for ppms should not be case control sampled

Error when loading new data packages on dev_erd2020

Happens with both woothr_plugin and sprpip_hurdle

Error: near "AS": syntax error
12. stop(structure(list(message = "near \"AS\": syntax error", call = NULL,
    cppstack = NULL), class = c("Rcpp::exception", "C++Error",
    "error", "condition")))
11. result_create(conn@ptr, statement)
10. initialize(value, ...)
9. initialize(value, ...)
8. new("SQLiteResult", sql = statement, ptr = result_create(conn@ptr,
    statement), conn = conn, bigint = conn@bigint)
7. .local(conn, statement, ...)
6. dbSendQuery(conn, statement, ...)
5. dbSendQuery(conn, statement, ...)
4. .local(conn, statement, ...)
3. DBI::dbGetQuery(db, sql)
2. DBI::dbGetQuery(db, sql) at ebirdst-loading.R#402
1. load_pis(sp_path)

date_to_st_week() throws an error when date is on boundary of two weeks

date_to_st_week(as.Date("2013-10-02"))

Error in vapply(days, check_d, FUN.VALUE = integer(length = 1)) :
values must be length 1,
but FUN(X[[1]]) result is length 2

The function wants to return two weeks, both 39 and 40, but should only be giving one value.

Suggest the following change in the function. Replace

which(x >= dv[-length(dv)] & x <= dv[-1])

with

which(x >= dv[-length(dv)] & x < dv[-1])
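The effect of the suggested change can be seen in isolation with made-up bin edges. The dv and x values below are illustrative and are not the ones the package uses:

```r
dv <- c(0, 10, 20, 30)   # made-up bin edges defining three bins
x <- 10                  # sits exactly on the edge between bins 1 and 2

# both ends closed: the edge value matches two bins
closed <- which(x >= dv[-length(dv)] & x <= dv[-1])

# right end open: the edge value matches only the later bin
half_open <- which(x >= dv[-length(dv)] & x < dv[-1])

# with the open right end, a value equal to the final edge matches no bin
at_last_edge <- which(30 >= dv[-length(dv)] & 30 < dv[-1])
```

Here closed has length 2, which is what trips the vapply() length check, while half_open has length 1. The last line shows the one case to watch with half-open intervals: a value exactly equal to the final edge would need separate handling if it can occur.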

Downloading data using common name returns too many results

When I download S&T data using common name, for example Least Flycatcher, I get files for Leaden Flycatcher as well. I expect to only get data for Least Flycatcher. Other examples where this occurs are Olive-sided Flycatcher, Grasshopper Sparrow, Great Crested Flycatcher, and Yellow-throated Warbler. It seems to be 6 letter codes that are used for multiple species and only differ by a number at the end.

ebirdst::ebirdst_download(
  species = "Least Flycatcher",
  tifs_only = FALSE,
  pattern = "_smooth_mr_",
  dry_run = TRUE
)
#> Downloading data package for leafly to:
#>   /Users/juliehart/Library/Application Support/org.R-project.R/R/ebirdst
#> File list:
#>   2021/leafly/config.json
#>   2021/leafly/ranges/leafly_range_smooth_mr_2021.gpkg
#>   2021/leafly2/config.json
#>   2021/leafly2/ranges/leafly2_range_smooth_mr_2021.gpkg

Created on 2022-12-22 with reprex v2.0.2
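If the extra files come from matching the species code as a prefix (an assumption on my part, since I have not checked how the file list is built), the difference between prefix and exact matching looks like this on the file list above:

```r
files <- c("2021/leafly/config.json",
           "2021/leafly/ranges/leafly_range_smooth_mr_2021.gpkg",
           "2021/leafly2/config.json",
           "2021/leafly2/ranges/leafly2_range_smooth_mr_2021.gpkg")
code <- "leafly"

# prefix match: "2021/leafly" is also a prefix of "2021/leafly2/..."
prefix_hits <- files[startsWith(files, file.path("2021", code))]

# exact match on the species directory (second path component)
species_dir <- vapply(strsplit(files, "/", fixed = TRUE), `[`, character(1), 2)
exact_hits <- files[species_dir == code]
```

The prefix match returns all four files, while the exact match on the directory name returns only the two Least Flycatcher files.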

Installation failed: cannot open file

Running the devtools install
devtools::install_github("CornellLabofOrnithology/ebirdst")

Produces the following output:

Downloading GitHub repo CornellLabofOrnithology/ebirdst@master
from URL https://api.github.com/repos/CornellLabofOrnithology/ebirdst/zipball/master
Installation failed: cannot open file 'C:/Users/richard/AppData/Local/Temp/RtmpEHCGi2/devtools3f8c4e0d59d6/CornellLabofOrnithology-ebirdst-d09b5fc/data-raw/clo-is-da-example-data/yebsap-ERD2016-EBIRD_SCIENCE-20180729-7c8cec83/data/ebird.abund_yebsap-ERD2016-EBIRD_SCIENCE-20180729-7c8cec83_erd.test.data.csv': No such file or directory

Here's my sessionInfo()

R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_Canada.1252 LC_CTYPE=English_Canada.1252 LC_MONETARY=English_Canada.1252
[4] LC_NUMERIC=C LC_TIME=English_Canada.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] httr_1.3.1 compiler_3.5.1 R6_2.2.2 tools_3.5.1 withr_2.1.2 curl_3.2
[7] yaml_2.2.0 memoise_1.1.0 git2r_0.23.0 digest_0.6.17 devtools_1.13.6

Any idea what might be causing this?

map_centroids() balks when you don't give a pi argument

Add an argument check at the start of the function.

Folder Structure Issue

I have recently been trying to work through the introductory vignette for working with the STEM data in 'stemhelper'. I have been having an issue with the 'stack_stem' function resulting in the following error message:

Error in load_config(path) :
_config.RData file does not exist in the /data directory.


I then restructured my data into:

/norcar-ERD2016-PROD-20171009-a098806d/data/
/norcar-ERD2016-PROD-20171009-a098806d/results/

Which did not fix the error.

After restructuring the data and adjusting my code to the following, the issue was resolved.

root_path <- "C:/Users/mta4-a/Downloads/norcar-ERD2016-PROD-20171009-a098806d/"
species <- "norcar-ERD2016-PROD-20171009-a098806d"
sp_path <- paste(root_path, species, sep = "")

# load a stack of rasters with the helper function stack_stem()
abund <- stack_stem(sp_path, variable = "abundance_umean")

Error when trying to download species models with ebirdst_download()

Hello. When I run ebirdst_download(species="Sharp-tailed Grouse") I get an error

"Error in ebirdst_download(species = "Sharp-tailed Grouse") :
Cannot access Status and Trends data URL. Ensure that you have a working internet connection and a valid API key for the Status and Trends data."

If I run the code below, which I took from the source code, I can download a species' models. Am I messing something up or is this a bug? I suspect the former, but thought I'd ask. Thanks!

path <- rappdirs::user_data_dir("ebirdst")
# in the package source `species` is the function argument; set it here
species <- get_species("Sharp-tailed Grouse")
which_run <- which(ebirdst::ebirdst_runs$species_code == species)
run <- ebirdst::ebirdst_runs$run_name[which_run]
key <- Sys.getenv("EBIRDST_KEY")
api_url <- "https://st-download.ebird.org/v1/"
list_obj_url <- stringr::str_glue("{api_url}list-obj/{species}?key={key}")
files <- jsonlite::read_json(list_obj_url, simplifyVector = TRUE)
files <- data.frame(file = files)
files <- files[!stringr::str_detect(files$file, "\\.db$"), , drop = FALSE]
files$src_path <- stringr::str_glue("{api_url}fetch?objKey={files$file}",
                                    "&key={key}")
files$dest_path <- file.path(path, files$file)
files$exists <- file.exists(files$dest_path)
dirs <- unique(dirname(files$dest_path))
for (d in dirs) {
  dir.create(d, showWarnings = FALSE, recursive = TRUE)
}

old_timeout <- getOption("timeout")
options(timeout = max(3000, old_timeout))

for (i in seq_len(nrow(files))) {
  dl_response <- utils::download.file(files$src_path[i],
                                      files$dest_path[i],
                                      mode = "wb")
  if (dl_response != 0) {
    stop("Error downloading file: ", files$file[i])
  }
}

PPMs dpm

Here:

ebirdst/R/ebirdst-ppms.R

Line 103 in e615b72

for (i in seq_along(es_cutoff)) {

Using seq_along(es_cutoff) with a run that has a temporal subset (e.g., "woothr-ERD2019-WEATHER_TEST-20210316-7bd55aa1") fails because it tries to work on all 52 weeks and the indices get out of order, leaving no predictions at the end.
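A rough sketch of one way around this: index by week number rather than by position. The structure of es_cutoff here (a vector named by week) and the run_weeks vector are assumptions made for illustration and may not match the objects in ebirdst-ppms.R:

```r
# Hypothetical stand-ins: cutoffs exist for all 52 weeks,
# but the temporally subset run only covers three of them
es_cutoff <- setNames(seq_len(52) / 100, seq_len(52))  # names are week numbers
run_weeks <- c(20L, 21L, 22L)

# positional loop index: covers all 52 positions regardless of the run
positional <- seq_along(es_cutoff)

# week-keyed index: only weeks present in both the cutoffs and the run
weeks_to_use <- intersect(as.integer(names(es_cutoff)), run_weeks)
cutoffs_used <- es_cutoff[as.character(weeks_to_use)]
```

Looping over weeks_to_use keeps each cutoff paired with the week it belongs to, whichever weeks the run happens to contain.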

Replace "Occupancy" with "Occurrence" in plot titles for PPMs

Low priority.

all_ppms() has problems with new fotfly

Particularly in the W_Amazon_spring region. Look at the report generation code and debug.

Clarification of definitions for count and abundance

Hi all, thanks for maintaining this wonderful package. It's a great resource for learning more about the birds I see in my region.

While reading through the documentation and vignettes, I was confused by the definition of count vs. abundance. I understand that they are both the output of statistical models. abundance is "the product of the probability of occurrence and the count conditional on occurrence". So if occurrence is 50%, and count is 10, abundance would be 5. If occurrence is 0, abundance is 0. Unfortunately I do not understand what count is telling me vs. what abundance is telling me.
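The quoted definition can be worked through with made-up numbers (these are not model output):

```r
occurrence <- c(0.5, 0, 0.9)  # made-up probabilities of occurrence
count <- c(10, 10, 2)         # made-up expected counts given the species occurs

# abundance as defined above: occurrence times count conditional on occurrence
abundance <- occurrence * count
```

The first two locations have the same count but abundances of 5 and 0, because the species is only expected to be present half the time at the first and never at the second. Read this way, and going only from the definition quoted above, count describes how many birds to expect when the species is present, while abundance is the expected number once absences are averaged in, so abundance can never exceed count.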

Is there a resource that provides a more colloquial definition of count vs. abundance?

I have tried to work out some of this on my blog. Quick EDA shows that abundance is lower than count, as I expected based on the definition.

Habitat association and avoidance from PI/PD

After taking a look at the package and the non-raster data vignette, I'm wondering how you would recommend combining PI for the land cover variables with PD to measure habitat association or avoidance - I'm thinking specifically of the estimates of positive or negative relationships with different land cover types at regional or larger scales that was previously available on the eBird Status and Trends pages for certain species. What is the relation between those data products and this one? Any thoughts on how to extract that information or something similar? Thanks!

error in load_fac_parameters

Hi,
First thank you for this work. I have started to use your dataset (version 2020) and recently it switched to a new version (2021). I tried to adapt my code and ended up with the following error :

  path <- ebirdst_download(species = "batgod") #use this if running the script for the first time, otherwise use next line instead
  path <- paste0(ebirdst_data_dir(),"/2021/batgod")
  
  # load relative abundance raster stack with 52 layers, one for each week
  abd_all <- ebirdst::load_raster(path,
                                  product = "abundance",
                                  period= "weekly",
                                  resolution = "lr")
  
  # load species specific mapping parameters
  pars <- ebirdst::load_fac_map_parameters(path)

Error in validityMethod(object) : invalid extent: ymin >= ymax

The traceback is the following :

8: stop("invalid extent: ymin >= ymax")
7: validityMethod(object)
6: isTRUE(x)
5: anyStrings(validityMethod(object))
4: validObject(e)
3: raster::extent(unlist(p$bbox_sinu))
2: raster::extent(unlist(p$bbox_sinu))
1: ebirdst::load_fac_map_parameters(path)

Did I misunderstand something? Or is there an error in the new version?
Thanks,
Ronan

Please remove dependencies on rgdal, rgeos, and/or maptools

This package depends on (depends, imports or suggests) raster and one or more of the retiring packages rgdal, rgeos or maptools (https://r-spatial.org/r/2022/04/12/evolution.html, https://r-spatial.org/r/2022/12/14/evolution2.html). Since raster 3.6.3, all use of external FOSS library functionality has been transferred to terra, making the retiring packages very likely redundant. It would help greatly if you could remove dependencies on the retiring packages as soon as possible.

Move lat/lng checks for st_extent_subset() inside of rectangle if

Custom projection changed for batgod between 2020 release and 2021 release

Hello Matt,
First of all, thank you for this work and the help you have given me.

Actually, this issue is not really an issue. It is more a question of understanding to be sure that I did not misuse your package.
I need to extract batgod data for year 2021 and to do this I used the master commit (bf9abb7)

Compared to the 2020 dataset version, the optimized CRS is different:
for 2020 I had: "+proj=eck4 +lon_0=145.782 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs"
and for 2021 I had: "+proj=eck4 +lon_0=0 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs"

The longitude was centered on Australia in 2020 (which was great for me) and is now centered on Greenwich.
I switched to the old version (centered on Australia) and it seems ok.

Am I doing something wrong? Why has this projection changed? Is there any documentation on these choices?
Thanks again
Best
Ronan

plot_pds() throws error when too few data points

Error in smooth.construct.ds.smooth.spec(object, dk$data, dk$knots) :
A term has fewer unique covariate combinations than specified maximum degrees of freedom

This occurs with woothr-ERD2016-AWS_TEST_QUADTREES-20180119-62a56fe5, which has only 10 folds.