abbiodiversity / wildrtrax

wildrtrax is an R package for environmental sensor data management and analytics

Home Page: https://abbiodiversity.github.io/wildRtrax/

License: Other

R 100.00%

bioacoustics biodiversity-data r-package

wildrtrax's People

Contributors

cassstevenson agmacpha ecknight vlucet abmi-kevin see24 mabecker89 dhope

wildrtrax's Issues

CRAN Submission

I was at the webinar yesterday and wanted to reiterate that this project is of great value to the field!

I also wanted to ask whether CRAN submission was a goal for this project and whether you would welcome contributions that would support that goal. This is helpful to know to guide contributions to the package at this stage of development.

wt_tidy_species usage is confusing, have to download wt_get_species twice

The data wrangling vignette recommends first doing an inner join between your data and the species table downloaded with wt_get_species, and then running wt_tidy_species to remove species that are not of interest. In fact this is required, because if the data passed to wt_tidy_species has no species_class column you get an error.

Error in `dplyr::select()`:
! Can't subset columns that don't exist.
✖ Column `species_class` doesn't exist.

But then when you run wt_tidy_species it runs wt_get_species internally and re-downloads the same table. This isn't a big deal because it doesn't take long, but seems unnecessary.

It looks like the error above occurs only because, when zerofill = TRUE, there is a line that removes the species_class column, and it fails if that column is not present. If that select statement were changed to select the desired columns, or to use select(-any_of(c("species_code", "species_class", ...))), there would be no need to download the species table and join it before running wt_tidy_species, which is perhaps the intended process. If that's the case, you could update the vignette, since the join to the species table is not really necessary.
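The any_of() suggestion can be illustrated with a minimal sketch. It assumes a toy data frame rather than a real WildTrax report, and the `drop_cols` vector is hypothetical, not the package's actual list of columns.

```r
library(dplyr)

# Toy report without a species_class column (hypothetical data)
report <- tibble(
  location = c("A", "B"),
  species_code = c("OVEN", "TEWA"),
  individual_count = c(1, 2)
)

# Hypothetical columns to drop; species_class is absent from `report`
drop_cols <- c("species_class", "species_code")

# select(-species_class) would fail here; any_of() ignores missing names
tidy <- report %>% select(-any_of(drop_cols))
names(tidy)
```

With this form the same call works whether or not the species table has been joined beforehand.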

wt_get_download_summary() does not return organization

Information about the type of sensor is found in the summary table (wt_get_download_summary()). A user who wants to recover the sensor type for a specific project needs to merge the project report with its respective summary. However, the only column those two tables have in common is project, because organization is missing from wt_get_download_summary(). Unfortunately, the project field isn't unique in the summary table, which prevents the merge from working properly. The summary table should return both organization and project in order to successfully assign the sensor type to the right project report.

my_projects <- wt_get_download_summary(sensor_id = 'PC')
my_projects[duplicated(my_projects$project),]
output:
      project project_id sensorId tasks             status
139 Big Grids         31      ARU  7609 Published - Public

my_projects[my_projects$project == "Big Grids",]
output:
      project project_id sensorId tasks             status
138 Big Grids        381      ARU  2236             Active
139 Big Grids         31      ARU  7609 Published - Public
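A small sketch of why the duplicated project name matters for the merge. The data frames below are toy stand-ins, and the organization values are invented for illustration, since the real summary table does not return that column.

```r
# Toy summary table mirroring the duplicated project name shown above;
# the organization values are invented for illustration
summary_tbl <- data.frame(
  organization = c("ORG-A", "ORG-B"),
  project = c("Big Grids", "Big Grids"),
  project_id = c(381, 31),
  sensorId = c("ARU", "ARU")
)

# Toy project report with one row per detection (hypothetical)
report <- data.frame(
  organization = "ORG-B",
  project = "Big Grids",
  species_code = c("OVEN", "TEWA")
)

# Joining on project alone matches both summary rows and doubles the report
by_project <- merge(
  report,
  summary_tbl[, c("project", "project_id", "sensorId")],
  by = "project"
)

# Joining on organization and project keeps one row per detection
by_both <- merge(report, summary_tbl, by = c("organization", "project"))

nrow(by_project)
nrow(by_both)
```

The second merge is only possible if the summary table carries organization, which is what this issue asks for.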

Improve file reading for wt_audio_scanner

@mabecker89 Thoughts on changing the file reading of wt_audio_scanner from dir_ls to dir_info? See documentation here.

df <- fs::dir_info(path = path,
                   recurse = TRUE,
                   regexp = file_type_reg) %>>%
  "Scanning audio files in path..." %>>%
  tibble::enframe() %>%
  dplyr::select(path, size) # Offer flexibility with other fields available here?

If we keep using furrr::future_map_dbl(., .f = ~fs::file_size(.), .progress = TRUE), we should also set seed = TRUE to avoid the error messages.
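As a rough, self-contained illustration of the dir_info idea (not the package's code), the sketch below scans a throwaway folder. The folder layout and the `"\\.wav$"` pattern are assumptions made for the example; in wt_audio_scanner the pattern would come from file_type_reg.

```r
library(fs)

# Build a throwaway folder with two audio-like files and one unrelated file
tmp <- tempfile("audio_")
dir.create(file.path(tmp, "site1"), recursive = TRUE)
writeLines("x", file.path(tmp, "site1", "rec1.wav"))
writeLines("xx", file.path(tmp, "rec2.wav"))
writeLines("notes", file.path(tmp, "readme.txt"))

# dir_info() returns one row per file, including its size, so no separate
# file_size() pass over the paths is needed
info <- dir_info(tmp, recurse = TRUE, regexp = "\\.wav$")

# Keep only the fields of interest; other metadata columns are available too
files <- info[, c("path", "size", "modification_time")]
files
```

Because the size arrives with the listing, the furrr step and its seed option would no longer be needed for this part.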

Add DOI and citation.cff?

Hi, I'd like to potentially cite this software in an article discussing annotations for audio. I understand from @ecknight that you may add functions soon for downloading annotations from WildTrax.

This is not quite a feature request or a bug. I just thought I'd suggest adding a DOI, e.g. with GitHub's Zenodo integration, and a citation.cff file, to make the software more easily citable. AFAICT wildRtrax does not have either now, although I'm happy to learn I'm wrong.

Feature Request: Path to store downloaded files

It would be nice if there was an argument in wt_download_report for a file path to store the downloaded files. This would prevent having to re-download the files every time a script is re-run which would help to encourage a reproducible workflow.

An example of a similar function is in bbsAssistant: the grab_bbs_data function has a directory argument and an overwrite argument to force a fresh download.
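Until such an argument exists, the requested behaviour can be approximated with a small wrapper. This is only a sketch: `download_cached` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for a real call such as wt_download_report().

```r
# Hypothetical helper: `fetch` is any zero-argument function that returns data
download_cached <- function(path, fetch, overwrite = FALSE) {
  if (file.exists(path) && !overwrite) {
    return(readRDS(path))  # reuse the copy saved by an earlier run
  }
  data <- fetch()          # only contact the server when needed
  dir.create(dirname(path), recursive = TRUE, showWarnings = FALSE)
  saveRDS(data, path)
  data
}

# Stand-in for a slow download; counts how often it is called
n_calls <- 0
fake_fetch <- function() {
  n_calls <<- n_calls + 1
  data.frame(project_id = 605, species_code = "OVEN")
}

cache_file <- file.path(tempfile("wt_cache_"), "project_605_main.rds")
first  <- download_cached(cache_file, fake_fetch)                   # downloads
second <- download_cached(cache_file, fake_fetch)                   # read from disk
third  <- download_cached(cache_file, fake_fetch, overwrite = TRUE) # forced refresh
n_calls
```

Re-running a script then only triggers a download when the file is missing or when overwrite = TRUE is passed.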

Some projects download data but return an empty list

This seems to be a problem at the unzipping stage or something similar. The list of projects with this error includes project_id = c(58, 84, 134, 805, 809, 814, 827, 848, 884, 1007, 1084, 1094, 1107, 1116, 1249, 1281). (Lemme know if you actually need a reprex...)

Feature request: JWT token support

Hey folks,

I noticed our profiles on the WildTrax website now include a JWT token, which is great to avoid having to use our login passwords. I noticed the current version of wildRtrax doesn't appear to include JWT support, so I wrote up my own code here: https://github.com/dhope/evenWildRTrax

It is super basic and only really downloads the csv files, but I think could be rolled into wildRtrax fairly simply. I also used httr2, which seems to have a lot of streamlining over the superseded httr.

I unfortunately don't have time to do a full PR on this, but if this seems useful and any of the code can help implement it, please make use of it.

Feature Request: Ability to filter download by location

As a user who is not affiliated with any of the projects or organizations directly, the project names and organizations are not very meaningful to me. What I want is all the data available for a given location. So far the only way I have found to figure out which projects that includes is to use the Data Discover webpage to search all organizations and projects for the sensor type I want, zoom in on the area, and write down the organizations with data in that area. Then I download the data for those organizations.

Ideally there would be a way to filter the Data Discover webpage by location or more importantly to provide an extent or polygon to a download function that would find all the data in WildTrax for that area. For example the bcdata package can do this and it is really useful!

I imagine that would involve fairly deep changes to the way the data is stored on WildTrax so another interim solution would be to add some location information to the project metadata returned by wt_get_download_summary such as a list of jurisdictions, BCRs or ecozones where the project has data, or even just a min and max latitude and longitude for each project. This would be very helpful to avoid having to download more data before filtering.

Thanks for the package!

Feature Request: Faster method for downloading multiple reports

Sorry for all the requests! Just getting into this package so I have a few ideas, hopefully they are helpful!

Based on the APIs vignette I have been using purrr::map to run wt_download_report to download each report one at a time and add the data to a tibble.

# Download all of the published ARU data from CWS-ONT
aru_projects <- wt_get_download_summary(sensor_id = "ARU") %>%
  as_tibble()

system.time(
  cwsON_arus <- aru_projects %>%
                        filter(status == "Published - Public",
                               grepl('^CWS-Ontario', project)) %>%
                        dplyr::mutate(data = purrr::map(
                          .x = project_id,
                          .f = ~wt_download_report(project_id = .x, sensor_id = "ARU",
                                                   weather_cols = T, reports = "main"))) %>%
                        select(-project_id, -organization) %>%
                        unnest(data)
)
 ## user  system elapsed 
 ## 111.97   25.72 2676.70

However, this takes a long time (~45 mins) and seems inefficient since wt_download_report has to download and unzip each file one by one. On the other hand it seems like when you select multiple projects and download them by hand on the WildTrax website the datasets are combined into one zip file. I assume that would be faster than downloading and unzipping each one individually. In addition, is there a reason to download all the report csv files when only the main report is requested? If it was possible to only download the requested report that might help.

Also, I noticed in the source code for wt_download_report that all the reports are read with read.csv into a list and then filtered down to just the requested reports. If you filtered the filenames first, so that only the requested csv files are read, that would speed things up. You might also want to consider using readr::read_csv or data.table::fread, since they can be much faster.
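The filename-first idea could look roughly like the sketch below. The file names and the `_<type>_report.csv` naming pattern are assumptions made for the example (loosely based on the recording_report.csv name seen in other issues), not a description of what wt_download_report actually does.

```r
# Create a folder that mimics an unzipped download (file names are hypothetical)
td <- tempfile("wt_report_")
dir.create(td)
write.csv(data.frame(species_code = "OVEN"),
          file.path(td, "Demo_main_report.csv"), row.names = FALSE)
write.csv(data.frame(task_id = 1),
          file.path(td, "Demo_task_report.csv"), row.names = FALSE)
write.csv(data.frame(tag_id = 1),
          file.path(td, "Demo_tag_report.csv"), row.names = FALSE)

requested <- c("main")

# Filter the file names first, then read only the matches
all_files <- list.files(td, pattern = "\\.csv$", full.names = TRUE)
pattern <- paste0("_(", paste(requested, collapse = "|"), ")_report\\.csv$")
wanted <- all_files[grepl(pattern, basename(all_files))]

reports <- lapply(wanted, read.csv)
names(reports) <- basename(wanted)
names(reports)
```

Swapping read.csv for readr::read_csv or data.table::fread in the lapply() call would be a one-line change.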

duration is cut short by `.make_x`

The .make_x function that is wrapped by wt_qpad_offsets has a step where it cuts off the last character in the task_duration. I assume there used to be a string with an unneeded character there but now it is a number and this step is dropping the last digit.

library(wildRtrax)
library(dplyr)


wt_auth()
#> Authentication into WildTrax successful.

projects <- wt_get_download_summary("PC")

dat_aru <- projects %>%
  # pick a project in boreal and with few tasks
  filter(project == "CWS-Ontario Atlas Digital Point Counts Boreal FMUs 2022") %>%
  pull(project_id) %>%
  wt_download_report(sensor_id = "ARU", reports = "main", weather_cols = FALSE)

dat_aru_clean <- dat_aru %>% wt_tidy_species(sensor = "ARU", zerofill = TRUE) %>%
  wt_replace_tmtt() %>%
  wt_make_wide()
#> Successfully downloaded the species table!

ex_row <- dat_aru_clean %>% slice(2)
ex_row$task_duration
#> [1] 300

ex_x <- wildRtrax:::.make_x(ex_row)
#> Downloading geospatial assets. This may take a moment.
ex_x$MAXDUR
#> [1] 30

Created on 2024-02-27 with reprex v2.0.2
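The effect in the reprex can be reproduced in isolation. The `trim_last` function below is an assumed form of the trimming step, written for illustration; the actual code inside .make_x may differ. `parse_duration` is a hypothetical, more defensive alternative.

```r
# Assumed form of the trimming step: drop the last character, then convert
trim_last <- function(x) as.numeric(substr(x, 1, nchar(x) - 1))

a <- trim_last("300s")  # a string with a trailing unit is handled as intended
b <- trim_last(300)     # a plain number loses its last digit

# One defensive alternative: strip only the non-numeric characters
parse_duration <- function(x) as.numeric(gsub("[^0-9.]", "", as.character(x)))

c1 <- parse_duration("300s")
c2 <- parse_duration(300)
c(a, b, c1, c2)
```

Here `b` comes out as 30 rather than 300, matching the MAXDUR value in the reprex, while both `parse_duration` calls return 300.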

`wt_download_report` fails to output columns expected by `wt_ind_detect`

@Eric-Jolin and I wanted to test the package detection function, wt_ind_detect. This function expects specific columns to exist in the dataframe that is given to it:

wildRtrax/R/wt_ind_det.R

Line 38 in 377f44d

 req_cols <- c("project", "location", "field_of_view", "scientific_name", "common_name", "number_individuals") 

Yet, when we use wt_download_report to access the tag report for our project...

TDN_tag_report <- wt_download_report(
  project_id = 712,
  sensor_id = "CAM",
  report = "tag",
  weather_cols=TRUE)

... the column field_of_view happens to be missing:

> names(TDN_tag_report)
 [1] "project"                          "organization"                    
 [3] "location"                         "latitude"                        
 [5] "longitude"                        "date_detected"                   
 [7] "image_sequence"                   "scientific_name"                 
 [9] "common_name"                      "age_class"                       
[11] "sex"                              "number_individuals"              
[13] "id_by"                            "needs_review"                    
[15] "auto_tagged"                      "tag_comments"                    
[17] "daily_weather_station_nm"         "daily_weather_station_elevation" 
[19] "daily_weather_station_distance"   "daily_min_temp"                  
[21] "daily_max_temp"                   "daily_mean_temp"                 
[23] "daily_total_rain_mm"              "daily_total_snow_cm"             
[25] "daily_precipitation_mm"           "daily_snow_on_ground_cm"         
[27] "hourly_weather_station_nm"        "hourly_weather_station_elevation"
[29] "hourly_weather_station_distance"

The image report is also missing columns: scientific_name, common_name, number_individuals.

Is the user expected to provide the field of view column to the tag report and the species info to the image report for the detection function to work?
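For anyone hitting the same problem, a quick way to see which required columns a report lacks before calling the detection function is sketched below. The `req_cols` vector is copied from the line of wt_ind_det.R quoted above; `tag_report` is a toy stand-in for a downloaded report.

```r
# Required columns as listed in wt_ind_det.R (quoted above)
req_cols <- c("project", "location", "field_of_view",
              "scientific_name", "common_name", "number_individuals")

# Toy stand-in for a tag report that lacks field_of_view
tag_report <- data.frame(
  project = "Demo",
  location = "Loc1",
  scientific_name = "Alces alces",
  common_name = "Moose",
  number_individuals = 1
)

# Report which required columns are absent
missing_cols <- setdiff(req_cols, names(tag_report))
if (length(missing_cols) > 0) {
  message("Missing columns: ", paste(missing_cols, collapse = ", "))
}
```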

'year' error when loading data

When I call either wt_get_download_summary() or wt_download_report() I get the following "Column year doesn't exist." error. This seems odd, since the same script worked earlier this month. I tried restarting R and reloading wildRtrax (I already had the latest version). Can anyone confirm whether they get this error, or is it some odd package conflict on my end or something?

> wildRtrax::wt_get_download_summary(sensor_id = 'CAM')
Error in `stop_subscript()`:
! Can't subset columns that don't exist.
x Column `year` doesn't exist.
Backtrace:
  1. wildRtrax::wt_get_download_summary(sensor_id = "CAM")
  4. dplyr:::select.data.frame(...)
  5. tidyselect::eval_select(expr(c(...)), .data)
  6. tidyselect:::eval_select_impl(...)
 14. tidyselect:::vars_select_eval(...)
     ...
 21. tidyselect:::chr_as_locations(x, vars)
 22. vctrs::vec_as_location(x, n = length(vars), names = vars)
 23. vctrs `<fn>`()
 24. vctrs:::stop_subscript_oob(...)
 25. vctrs:::stop_subscript(...)

Project report download as list

Brief description of the problem

I was downloading the data for the project "Eastmain 1A Powerhouse and Rupert Diversion Project EIA 2002" (project_id = 837), and it returned a list with 4 elements.

wt_download_report(project_id = 837,
                   sensor = "PC",
                   weather_cols = F,
                   report = "main")

Warnings about file.rename when downloading reports

I installed the development branch and am now getting a bunch of warnings when I download files with wt_download_report. It looks like the commit to remove special characters from project names is causing issues for my Windows file paths which contain ~ and :

Warning:
1: In file.rename(.x, gsub("[:()?!;]", "", .x)) :
cannot rename file 'C:\Users\ENDICO1\AppData\Local\Temp\RtmpygiPXo/BU_Big_Grid_Pilot_Program_2014_recording_report.csv' to 'C\Users\ENDICO1\AppData\Local\Temp\RtmpygiPXo/BU_Big_Grid_Pilot_Program_2014_recording_report.csv', reason 'The system cannot find the path specified'

If you change this line
to:

list.files(td, pattern = "*.csv") %>% 
  purrr::map(~file.rename(file.path(td, .x), file.path(td, gsub("[:()?!~;]", "", .x))))

It will only change the parts after the temp dir
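The difference between cleaning the whole path and cleaning only the file name can be shown without touching the file system. The path below is a hypothetical example (forward slashes, invented project name), and `clean_name` is an illustrative helper, not package code.

```r
# Hypothetical Windows-style path with special characters in the file name
p <- "C:/Users/me/AppData/Local/Temp/Rtmp123/Demo_(Pilot):_2014_report.csv"

# Cleaning the whole path also strips the drive colon, breaking the path
whole <- gsub("[:()?!~;]", "", p)

# Cleaning only the file name leaves the directory part untouched
clean_name <- function(path) {
  file.path(dirname(path), gsub("[:()?!~;]", "", basename(path)))
}
fixed <- clean_name(p)

whole
fixed
```

In `whole` the path starts with `C/Users`, which is the same failure as in the warning above, whereas `fixed` keeps `C:/Users` and only the file name changes.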

wt_download_report() is not working

Brief description of the problem

The following code chunk is taking forever (2 hours 19 minutes so far) to download a single file from a folder that takes less time to download manually from the website. I have tried reinstalling the wildRtrax package.

remotes::install_github("ABbiodiversity/wildRtrax")
#Attach package
library(wildRtrax)
Sys.setenv(WT_USERNAME = 'myUSERID', WT_PASSWORD = 'myPassword')#works
wt_auth()#works
my_projects <- wt_get_download_summary(sensor_id = 'ARU')
#Download the project report. Use ?wt_download_report to get options
my_report <- wt_download_report(project_id = 1321, sensor_id = 'ARU', reports="main", weather_cols = FALSE)#cols_def = F, 
#project 1321 is for Ruffed Grouse - BU Lab Project
#report options include "summary", "birdnet", "task", "tag", "definitions"

Authentication failure while properly authenticated within package

Hello,

For some reason, I am getting HTTP 500 errors where I used to have no issues. I authenticate properly with wt_auth and then attempt a download, but get an error.

> wt_auth(force = T)
Authentication into WildTrax successful.

Using demo project as an example:

> wt_download_report(project_id = 220, sensor_id = 'CAM',
+                    reports = "project", weather_cols = F)
Downloading: 120 B     Error: Authentication failed [500]

Only the "main" report seems to work.

> wt_download_report(project_id = 220, sensor_id = 'CAM',
+                    reports = "main", weather_cols = F)
# A tibble: 28,664 × 35

My session info:

> sessionInfo()
R version 4.4.0 (2024-04-24)
Platform: x86_64-pc-linux-gnu
Running under: Pop!_OS 22.04 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0

locale:
 [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C               LC_TIME=en_CA.UTF-8       
 [4] LC_COLLATE=en_CA.UTF-8     LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8   
 [7] LC_PAPER=en_CA.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C       

time zone: America/Toronto
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
[1] wildRtrax_1.2.0   tidyr_1.3.1       dplyr_1.1.4       tibble_3.2.1      tarchetypes_0.9.0
[6] targets_1.7.0    

loaded via a namespace (and not attached):
 [1] gtable_0.3.5       xfun_0.44          ggplot2_3.5.1      processx_3.8.4    
 [5] lattice_0.22-6     callr_3.7.6        tzdb_0.4.0         vctrs_0.6.5       
 [9] tools_4.4.0        ps_1.7.6           generics_0.1.3     curl_5.2.1        
[13] base64url_1.4      parallel_4.4.0     proxy_0.4-27       fansi_1.0.6       
[17] pkgconfig_2.0.3    KernSmooth_2.23-24 data.table_1.15.4  secretbase_0.5.0  
[21] tuneR_1.4.7        lifecycle_1.0.4    stringr_1.5.1      compiler_4.4.0    
[25] munsell_0.5.1      terra_1.7-78       codetools_0.2-20   class_7.3-22      
[29] yaml_2.3.8         intrval_0.1-3      pillar_1.9.0       furrr_0.3.1       
[33] MASS_7.3-61        classInt_0.4-10    magick_2.8.3       parallelly_1.37.1 
[37] QPAD_0.0-3         tidyselect_1.2.1   digest_0.6.35      stringi_1.8.4     
[41] future_1.33.2      sf_1.0-16          purrr_1.0.2        listenv_0.9.1     
[45] grid_4.4.0         colorspace_2.1-0   cli_3.6.2          magrittr_2.0.3    
[49] utf8_1.2.4         e1071_1.7-14       readr_2.1.5        withr_3.0.0       
[53] scales_1.3.0       backports_1.5.0    unmarked_1.4.1     lubridate_1.9.3   
[57] timechange_0.3.0   httr_1.4.7         globals_0.16.3     signal_1.8-0      
[61] igraph_2.0.3       progressr_0.14.0   hms_1.1.3          knitr_1.47        
[65] markdown_1.13      rlang_1.1.4        Rcpp_1.0.12        DBI_1.2.3         
[69] glue_1.7.0         seewave_2.2.3      pipeR_0.6.1.3      renv_1.0.7        
[73] jsonlite_1.8.8     R6_2.5.1           units_0.8-5        fs_1.6.4

Multi year data in `wt_summarise_cam()`

Hi 👋🏼 ! Here is a bug @Eric-Jolin and I have found in wt_summarise_cam(). I've also opened a PR (#40) that fixes it and attempts to improve the code in general (see the PR for those details).

When a user provides data from more than one year, the data is aggregated across years, so the number of days of effort can exceed 7 for a given week or 31 for a given month. This also artificially inflates the effort.

times_start <- c("2021-08-01 17:04:40", "2021-09-25 18:37:46", "2021-10-02 16:12:38", "2021-11-02 14:41:04",
                 "2022-04-06 10:12:58", "2022-04-07 12:34:04", "2022-04-22 09:30:52", "2022-04-26 09:54:46",
                 "2022-04-26 15:06:42", "2022-04-27 08:36:27", "2022-04-30 09:30:29", "2022-05-13 10:07:33",
                 "2022-08-10 10:40:17")
times_end <- c("2021-08-01 17:04:41 UTC", "2021-09-25 18:37:46 UTC", "2021-10-02 16:12:38 UTC",
               "2021-11-02 14:41:04 UTC", "2022-04-06 10:13:00 UTC", "2022-04-07 12:57:56 UTC",
               "2022-04-22 09:30:53 UTC", "2022-04-26 09:54:47 UTC", "2022-04-26 15:06:43 UTC",
               "2022-04-27 08:36:28 UTC", "2022-04-30 09:30:31 UTC", "2022-05-13 10:07:34 UTC",
               "2022-08-10 10:40:18 UTC")

raw_dat <-  data.frame(detection = 1:13,
                       project_id = "P1",
                       location = "Loc1",
                       species_common_name = "Sp1",
                       image_date_time = times_start,
                       max_animals = 1)

ind_dat <- data.frame(detection = 1:13,
                      project_id = "P1",
                      location = "Loc1",
                      species_common_name = "Sp1",
                      start_time = times_start,
                      end_time = times_end,
                      max_animals = 1)

wt_summarise_cam(detect_data = ind_dat, raw_data = raw_dat,
                 time_interval = "month",
                 variable = "detections",
                 output_format = "long")

Which gives:

Joining with `by = join_by(project_id, location, month, species_common_name)`
# A tibble: 12 × 7
   project_id location month     n_days_effort species_common_name variable   value
   <chr>      <chr>    <ord>             <int> <chr>               <chr>      <int>
 1 P1         Loc1     January              31 Sp1                 detections     0
 2 P1         Loc1     February             28 Sp1                 detections     0
 3 P1         Loc1     March                31 Sp1                 detections     0
 4 P1         Loc1     April                30 Sp1                 detections     7
 5 P1         Loc1     May                  31 Sp1                 detections     1
 6 P1         Loc1     June                 30 Sp1                 detections     0
 7 P1         Loc1     July                 31 Sp1                 detections     0
 8 P1         Loc1     August               41 Sp1                 detections     2
 9 P1         Loc1     September            30 Sp1                 detections     1
10 P1         Loc1     October              31 Sp1                 detections     1
11 P1         Loc1     November             30 Sp1                 detections     1
12 P1         Loc1     December             31 Sp1                 detections     0

Clearly this function should be grouping and aggregating data by year as well, regardless of the time frame requested. The PR brings that feature:

Joining with `by = join_by(project_id, year, month, species_common_name)`
# A tibble: 13 × 8
   project_id location  year month     n_days_effort species_common_name variable   value
   <chr>      <chr>    <dbl> <ord>             <int> <chr>               <chr>      <int>
 1 P1         Loc1      2021 August               31 Sp1                 detections     1
 2 P1         Loc1      2021 September            30 Sp1                 detections     1
 3 P1         Loc1      2021 October              31 Sp1                 detections     1
 4 P1         Loc1      2021 November             30 Sp1                 detections     1
 5 P1         Loc1      2021 December             31 Sp1                 detections     0
 6 P1         Loc1      2022 January              31 Sp1                 detections     0
 7 P1         Loc1      2022 February             28 Sp1                 detections     0
 8 P1         Loc1      2022 March                31 Sp1                 detections     0
 9 P1         Loc1      2022 April                30 Sp1                 detections     7
10 P1         Loc1      2022 May                  31 Sp1                 detections     1
11 P1         Loc1      2022 June                 30 Sp1                 detections     0
12 P1         Loc1      2022 July                 31 Sp1                 detections     0
13 P1         Loc1      2022 August               10 Sp1                 detections     1

(Also, see the PR for general code cleanliness/coherence improvement I thought could be useful to bring. Feel free to ignore those if they seem superfluous)
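The inflated August value can be reproduced with a few lines of base R. This is a simplified sketch, not the code from the PR: it assumes the camera was active every day between the first and last image in the example (2021-08-01 to 2022-08-10), which is consistent with the effort values shown in the outputs above.

```r
# One row per day of effort between the first and last image (assumption)
effort <- data.frame(
  day = seq(as.Date("2021-08-01"), as.Date("2022-08-10"), by = "day")
)
effort$year  <- as.integer(format(effort$day, "%Y"))
effort$month <- as.integer(format(effort$day, "%m"))
effort$n_days_effort <- 1L

# Grouping by month only pools August 2021 and August 2022
by_month <- aggregate(n_days_effort ~ month, data = effort, FUN = sum)

# Grouping by year and month keeps them separate
by_year_month <- aggregate(n_days_effort ~ year + month, data = effort, FUN = sum)

by_month[by_month$month == 8, ]
by_year_month[by_year_month$month == 8, ]
```

Grouping by month alone gives 41 days for August (31 from 2021 plus 10 from 2022), while grouping by year and month never exceeds the length of a calendar month.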

`wt_qpad_offsets` `together` argument works opposite to what I expected

When together = TRUE I get just the offsets, and when together = FALSE I get a data frame with the offsets attached, which is the opposite of what I expected from the argument name and docs.

library(wildRtrax)
library(dplyr)

# Start by getting everything you need
Sys.setenv(WT_USERNAME = 'guest', WT_PASSWORD = 'Apple123')
wt_auth()
#> Authentication into WildTrax successful.
my_report <- wt_download_report(project_id = 605, sensor_id = 'ARU', reports = "main", weather_cols = F) %>%
  tibble::as_tibble()

my_tidy_data <- wt_tidy_species(my_report, remove = "mammal", zerofill=F)

my_tmtt_data <- wt_replace_tmtt(data = my_tidy_data, calc = "round")

my_wide_data <- wt_make_wide(data = my_tmtt_data, sound = "all")
#> Warning: There was 1 warning in `mutate()`.
#> ℹ In argument: `individual_count = case_when(grepl("^C", individual_count) ~
#>   NA_real_, TRUE ~ as.numeric(individual_count))`.
#> Caused by warning:
#> ! NAs introduced by coercion

# When together is True we get just offsets returned
my_offset_data <- wt_qpad_offsets(data = my_wide_data, species = "all", 
                                  version = 3, together = TRUE)


colnames(my_offset_data)
#>  [1] "ALFL" "AMCR" "AMRE" "AMRO" "ATSP" "ATTW" "BAWW" "BBMA" "BBWA" "BBWO"
#> [11] "BCCH" "BHCO" "BHVI" "BLJA" "BOCH" "BOWA" "BRCR" "CCSP" "CEDW" "CHSP"
#> [21] "CMWA" "COGR" "CONW" "CORA" "COYE" "CSWA" "DEJU" "DOWO" "EVGR" "GCKI"
#> [31] "GRCA" "GRYE" "HAWO" "HETH" "HOWR" "LCSP" "LEFL" "LEYE" "LISP" "MAWA"
#> [41] "MAWR" "MODO" "MOWA" "NOFL" "NOWA" "OCWA" "OSFL" "OVEN" "PAWA" "PHVI"
#> [51] "PIGR" "PISI" "PIWO" "PUFI" "RBGR" "RBNU" "RCKI" "RECR" "REVI" "RUGR"
#> [61] "RWBL" "SAVS" "SEWR" "SOSA" "SOSP" "SPGR" "SPSA" "SWSP" "SWTH" "TEWA"
#> [71] "VATH" "WAVI" "WETA" "WEWP" "WISN" "WIWA" "WIWR" "WTSP" "WWCR" "YBFL"
#> [81] "YBSA" "YEWA" "YHBL" "YRWA"

# And when it is False get offsets attached to data
my_offset_data <- wt_qpad_offsets(data = my_wide_data, species = "all", 
                                  version = 3, together = FALSE)

colnames(my_offset_data)
#>   [1] "organization"        "project_id"          "location"           
#>   [4] "location_id"         "location_buffer_m"   "longitude"          
#>   [7] "latitude"            "equipment_make"      "equipment_model"    
#>  [10] "recording_id"        "recording_date_time" "task_id"            
#>  [13] "aru_task_status"     "task_duration"       "task_method"        
#>  [16] "ALFL"                "AMBI"                "AMCO"               
#>  [19] "AMCR"                "AMRE"                "AMRO"               
#>  [22] "AMWI"                "ATSP"                "ATTW"               
#>  [25] "BADO"                "BAWW"                "BBMA"               
#>  [28] "BBWA"                "BBWO"                "BCCH"               
#>  [31] "BHCO"                "BHVI"                "BLJA"               
#>  [34] "BLTE"                "BOCH"                "BOGU"               
#>  [37] "BOOW"                "BOWA"                "BRCR"               
#>  [40] "BWTE"                "CACI"                "CAJA"               
#>  [43] "CANG"                "CCSP"                "CEDW"               
#>  [46] "CHIK"                "CHSP"                "CMWA"               
#>  [49] "COGO"                "COGR"                "COLO"               
#>  [52] "COME"                "CONI"                "CONW"               
#>  [55] "CORA"                "CORE"                "COTE"               
#>  [58] "COYE"                "CSWA"                "DEJU"               
#>  [61] "DOWO"                "DUGR"                "EVGR"               
#>  [64] "FRGU"                "GADW"                "GCKI"               
#>  [67] "GHOW"                "GRCA"                "GRYE"               
#>  [70] "GWFG"                "GWTE"                "HAWO"               
#>  [73] "HETH"                "HEWI"                "HOWR"               
#>  [76] "LCSP"                "LEFL"                "LEYE"               
#>  [79] "LIAI"                "LIBA"                "LINO"               
#>  [82] "LIRA"                "LISP"                "LITF"               
#>  [85] "LIWI"                "MALL"                "MAWA"               
#>  [88] "MAWR"                "MOAI"                "MOBA"               
#>  [91] "MODO"                "MONO"                "MORA"               
#>  [94] "MOTF"                "MOWA"                "MOWI"               
#>  [97] "NHOW"                "NOFL"                "NOPO"               
#> [100] "NOWA"                "NSHO"                "NSWO"               
#> [103] "OCWA"                "OSFL"                "OVEN"               
#> [106] "PALO"                "PAWA"                "PAWR"               
#> [109] "PBGR"                "PHVI"                "PIGR"               
#> [112] "PISI"                "PIWO"                "PSFL"               
#> [115] "PUFI"                "RBGR"                "RBNU"               
#> [118] "RCKI"                "RECR"                "REVI"               
#> [121] "RNGR"                "RUGR"                "RWBL"               
#> [124] "SACR"                "SAVS"                "SEWR"               
#> [127] "SORA"                "SOSA"                "SOSP"               
#> [130] "species"             "SPGR"                "SPSA"               
#> [133] "STGR"                "SWSP"                "SWTH"               
#> [136] "TEWA"                "TRUS"                "UNAM"               
#> [139] "UNBI"                "UNBL"                "UNBT"               
#> [142] "UNDU"                "UNFL"                "UNGO"               
#> [145] "UNKN"                "UNOW"                "UNPA"               
#> [148] "UNSH"                "UNSP"                "UNTH"               
#> [151] "UNTR"                "UNWA"                "UNWO"               
#> [154] "UNYE"                "VATH"                "WAVI"               
#> [157] "WETA"                "WEWP"                "WISN"               
#> [160] "WIWA"                "WIWR"                "WTSP"               
#> [163] "WWCR"                "YBFL"                "YBSA"               
#> [166] "YERA"                "YEWA"                "YHBL"               
#> [169] "YRWA"                "ALFL.off"            "AMCR.off"           
#> [172] "AMRE.off"            "AMRO.off"            "ATSP.off"           
#> [175] "ATTW.off"            "BAWW.off"            "BBMA.off"           
#> [178] "BBWA.off"            "BBWO.off"            "BCCH.off"           
#> [181] "BHCO.off"            "BHVI.off"            "BLJA.off"           
#> [184] "BOCH.off"            "BOWA.off"            "BRCR.off"           
#> [187] "CCSP.off"            "CEDW.off"            "CHSP.off"           
#> [190] "CMWA.off"            "COGR.off"            "CONW.off"           
#> [193] "CORA.off"            "COYE.off"            "CSWA.off"           
#> [196] "DEJU.off"            "DOWO.off"            "EVGR.off"           
#> [199] "GCKI.off"            "GRCA.off"            "GRYE.off"           
#> [202] "HAWO.off"            "HETH.off"            "HOWR.off"           
#> [205] "LCSP.off"            "LEFL.off"            "LEYE.off"           
#> [208] "LISP.off"            "MAWA.off"            "MAWR.off"           
#> [211] "MODO.off"            "MOWA.off"            "NOFL.off"           
#> [214] "NOWA.off"            "OCWA.off"            "OSFL.off"           
#> [217] "OVEN.off"            "PAWA.off"            "PHVI.off"           
#> [220] "PIGR.off"            "PISI.off"            "PIWO.off"           
#> [223] "PUFI.off"            "RBGR.off"            "RBNU.off"           
#> [226] "RCKI.off"            "RECR.off"            "REVI.off"           
#> [229] "RUGR.off"            "RWBL.off"            "SAVS.off"           
#> [232] "SEWR.off"            "SOSA.off"            "SOSP.off"           
#> [235] "SPGR.off"            "SPSA.off"            "SWSP.off"           
#> [238] "SWTH.off"            "TEWA.off"            "VATH.off"           
#> [241] "WAVI.off"            "WETA.off"            "WEWP.off"           
#> [244] "WISN.off"            "WIWA.off"            "WIWR.off"           
#> [247] "WTSP.off"            "WWCR.off"            "YBFL.off"           
#> [250] "YBSA.off"            "YEWA.off"            "YHBL.off"           
#> [253] "YRWA.off"

Created on 2024-01-04 with reprex v2.0.2

Recurse `wt_chop()`

Support recursive chopping in wt_chop(), so that it works through folders of files that meet the segment_length criteria.

Image_fire column class causes wt_download_report() to fail for multiple projects

This isn't strictly a wildRtrax issue, but when I try to pull multiple projects I get the following error:

proj.ids <- c(998, 1401, 1971)
image <- map_df(
  .x = proj.ids,
  .f = ~ wildRtrax::wt_download_report(
    project_id = .x,
    sensor_id = "CAM",
    report = "image_report",
    weather_cols = FALSE
  )
)

Error in `dplyr::bind_rows()`:
! Can't combine `..1$image_fire` <logical> and `..3$image_fire` <character>.
Run `rlang::last_trace()` to see where the error occurred.

Judging by the error, image_fire is read as character in projects that have data in that column and as logical in projects where it is empty, so bind_rows() cannot combine them. Adding %>% select(-image_fire) inside the mapped function solves the issue:

proj.ids <- c(998, 1401, 1971)
image <- map_df(
  .x = proj.ids,
  .f = ~ wildRtrax::wt_download_report(
    project_id = .x,
    sensor_id = "CAM",
    report = "image_report",
    weather_cols = FALSE
  )%>%select(-image_fire)
)
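If the image_fire values are needed, an alternative to dropping the column is to coerce it to a single type before the rows are bound. The sketch below uses two toy tibbles as stand-ins for the per-project reports (the values are invented for illustration and are not real WildTrax output); with real data the same mutate() call would go inside the function passed to map_df().

```r
library(dplyr)
library(purrr)

# Toy stand-ins for two per-project image reports (hypothetical values)
reports <- list(
  tibble(project_id = 998,  image_fire = NA),      # empty column, read as logical
  tibble(project_id = 1971, image_fire = "true")   # populated column, read as character
)

# Coerce the conflicting column to character in every report, then bind
image <- reports %>%
  map(~ mutate(.x, image_fire = as.character(image_fire))) %>%
  bind_rows()

class(image$image_fire)
```

This keeps the column at the cost of leaving it as character, so any logical interpretation has to happen after binding.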

wt_download_report not working for sensor_id='CAM'

When I use wt_download_report() for any camera data, the download always fails with "Failure when receiving data from the peer". I've tried increasing the timeout period, but that doesn't help. wt_download_report() works fine for ARU downloads, and other download functions also seem to work (e.g., wt_get_download_summary() downloads and returns data).

library(wildRtrax)

#Sets login info for WildTrax
Sys.setenv(WT_USERNAME = 'xxxxx', WT_PASSWORD = 'xxxx')
wt_auth()
#> Authentication into WildTrax successful.

#Download report of urban wildlife cameras
taginfo <- wt_download_report(project_id = 1105, sensor_id = 'CAM', report="tag")

Downloading: 18 MB
#> Error in curl::curl_fetch_disk(url, x$path, handle = handle): Failure when receiving data from the peer

Created on 2023-08-11 with reprex v2.0.2

BOAT should have species_class "ABIOTIC"

Right now in the species list, BOAT has species_class NA, which might cause trouble with filtering. It should probably be "ABIOTIC", the same as VEHICLE.

wildRtrax::wt_get_species() %>% filter(species_code == "BOAT")
##species_id species_code species_common_name species_class species_order species_scientific_name
##       <dbl> <chr>        <chr>               <chr>         <chr>         <chr>                  
##1       4852 BOAT         Boat                NA            NA            " " 

wildRtrax::wt_get_species() %>% filter(species_code == "VEHICLE")
##species_id species_code species_common_name species_class species_order species_scientific_name
##       <dbl> <chr>        <chr>               <chr>         <chr>         <chr>                  
##1       2697 VEHICLE      Vehicle             ABIOTIC       NA            " "
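To make the filtering concern concrete, here is a small sketch on a toy species table. The BOAT and VEHICLE rows mirror the output above; the OVEN row and its "AVES" class are assumptions added only for illustration. Depending on how the filter is written, the NA row is either kept or dropped, and in neither case is it treated as abiotic on purpose.

```r
library(dplyr)

species <- tibble(
  species_code  = c("BOAT", "VEHICLE", "OVEN"),
  species_class = c(NA, "ABIOTIC", "AVES")   # "AVES" is an assumed value
)

# %in% never matches NA, so BOAT survives an attempt to remove abiotic codes
kept_in <- species %>% filter(!species_class %in% "ABIOTIC")

# != returns NA for BOAT, and filter() drops rows where the condition is NA
kept_ne <- species %>% filter(species_class != "ABIOTIC")

kept_in$species_code
kept_ne$species_code
```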

zero-cross file support

Error returning Riff headers

An unexpected error occurs for some, but not all, recordings when running wt_audio_scanner() on a batch of files. The error has been isolated to recordings from specific Songmeter SM2+ units:
i data = furrr::future_map(...).
x This seems not to be a valid RIFF file of type WAVE.
Run rlang::last_error() to see where the error occurred.
In addition: There were 14 warnings (use warnings() to see them)
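One way to find the offending files before scanning a whole batch is to check the first 12 bytes of each file: a WAVE file starts with the ASCII bytes "RIFF", followed by a 4-byte size field, followed by "WAVE". The helper below is a hypothetical sketch in base R and is not part of wildRtrax; the two temporary files only demonstrate the check.

```r
# Hypothetical helper (not a wildRtrax function):
# TRUE if the file begins with a RIFF header of type WAVE
is_riff_wave <- function(path) {
  header <- readBin(path, what = "raw", n = 12)
  length(header) == 12 &&
    identical(header[1:4], charToRaw("RIFF")) &&
    identical(header[9:12], charToRaw("WAVE"))
}

# Demonstration on two temporary files: a minimal valid header and a bogus file
good <- tempfile(fileext = ".wav")
bad  <- tempfile(fileext = ".wav")
writeBin(c(charToRaw("RIFF"), as.raw(c(36, 0, 0, 0)), charToRaw("WAVE")), good)
writeBin(charToRaw("not a wave file"), bad)

files   <- c(good, bad)
invalid <- files[!vapply(files, is_riff_wave, logical(1))]
```

Files listed in `invalid` could then be set aside and inspected separately rather than stopping the whole batch.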

`wt_make_wide` gives warning about NAs introduced by coercion

The warning happens in the vignette:

library(wildRtrax)
library(dplyr)

# Start by getting everything you need
Sys.setenv(WT_USERNAME = 'guest', WT_PASSWORD = 'Apple123')
wt_auth()
#> Authentication into WildTrax successful.
my_report <- wt_download_report(project_id = 605, sensor_id = 'ARU', reports = "main", weather_cols = F) %>%
  tibble::as_tibble()

my_tidy_data <- wt_tidy_species(my_report, remove = "mammal", zerofill=F)

my_tmtt_data <- wt_replace_tmtt(data = my_tidy_data, calc = "round")

my_wide_data <- wt_make_wide(data = my_tmtt_data, sound = "all")
#> Warning: There was 1 warning in `mutate()`.
#> ℹ In argument: `individual_count = case_when(grepl("^C", individual_count) ~
#>   NA_real_, TRUE ~ as.numeric(individual_count))`.
#> Caused by warning:
#> ! NAs introduced by coercion

Created on 2024-01-04 with reprex v2.0.2

I am pretty sure this is because case_when() evaluates every right-hand side on the whole vector, even for elements where the left-hand side is not TRUE. So as.numeric() still runs on the values starting with "C", and you get the warning even though the earlier case already handles them.

So if you change to:

mutate(individual_count = case_when(grepl("^C",  individual_count) ~ NA_character_,
                                        TRUE ~ individual_count) %>% as.numeric())

That removes the warning message.
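A self-contained version of the same fix on toy data is sketched below. The column values are invented for illustration ("CI 1" simply stands in for any value beginning with "C"). The non-numeric strings are replaced with NA while the column is still character, and the conversion to numeric happens once at the end, so as.numeric() never sees a string it cannot parse.

```r
library(dplyr)

# Toy data; "CI 1" is an illustrative non-numeric value starting with "C"
counts <- tibble(individual_count = c("1", "3", "CI 1"))

fixed <- counts %>%
  mutate(individual_count = case_when(
    grepl("^C", individual_count) ~ NA_character_,  # blank out non-numeric codes first
    TRUE ~ individual_count
  ) %>% as.numeric())                               # convert once, after the replacement

fixed$individual_count
```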

Project does not download (eternal loop)

Some of the projects never start downloading when I try to download them; the process in R just keeps running in what looks like an eternal loop. I stopped the process after one hour and I got this message

Of the projects I have access to, this is currently happening with project IDs 1384 and 801.

eh14_raw <- wt_download_report(
  project_id = 1384,
  sensor_id = 1384,
  report = "main",
  weather_cols = FALSE
)

locations with 0 birds being dropped in `wt_replace_tmtt` and `wt_make_wide`

wt_replace_tmtt and wt_make_wide are removing locations with 0 birds even though wt_tidy_species was called with zerofill = TRUE.

library(wildRtrax)

library(testthat)
library(dplyr)

# Download report to use for testing
Sys.setenv(WT_USERNAME = "guest", WT_PASSWORD = "Apple123")
wt_auth(force = TRUE)
#> Authentication into WildTrax successful.
ecosys21 <- wt_download_report(685, 'ARU', 'main', FALSE)

# add a location with 0 birds observed
ecosys21_mod_row <- slice(ecosys21, 1)
ecosys21_mod_row$species_code <- "MOBA"
ecosys21_mod_row$species_scientific_name <- NA
ecosys21_mod_row$vocalization <- "Non-vocal"
ecosys21_mod_row$location_id <- ecosys21_mod_row$location_id+max(ecosys21$location_id)
ecosys21_mod <- bind_rows(ecosys21, ecosys21_mod_row)

# all locations from input should be present in outputs unless not transcribed
ecosys21_locs <- ecosys21_mod %>% filter(aru_task_status == "Transcribed") %>%
  distinct(location_id)

ecosys21_tidy <- wt_tidy_species(ecosys21_mod, remove = c("mammals", "abiotic", "amphibians"), zerofill = T, "ARU")
#> Successfully downloaded the species table!
ecosys21_locs %>% anti_join(ecosys21_tidy, by = "location_id") %>%
  pull(location_id) %>%
  expect_length(0)

ecosys21_tmtt <- wt_replace_tmtt(ecosys21_tidy, calc = "round")
ecosys21_locs %>% anti_join(ecosys21_tmtt, by = "location_id") %>%
  pull(location_id) %>%
  expect_length(0)
#> Error: `.` has length 1, not length 0.

ecosys21_wide <- wt_make_wide(ecosys21_tmtt, sound = "all", sensor = 'ARU')
ecosys21_locs %>% anti_join(ecosys21_wide, by = "location_id") %>%
  pull(location_id) %>%
  expect_length(0)
#> Error: `.` has length 1, not length 0.

# total abundance should be 0
filter(ecosys21_wide, location_id == ecosys21_mod_row$location_id) %>%
  rowwise() %>%
  mutate(tot_birds = sum(c_across(matches("^....$")))) %>%
  pull(tot_birds) %>%
  expect_equal(0)
#> Error: `.` not equal to 0.
#> Lengths differ: 0 is not 1

Created on 2024-01-31 with reprex v2.0.2

Add to Docs: sensor_id can be "PC"

I am able to run wt_get_download_summary(sensor_id = "PC") and get results, but according to the docs only "ARU" and "CAM" are options. It would also be nice if there were a check that sensor_id is one of the possible options. If you try "PNTCNT", for example, you just get a dplyr error that is not very informative.

wildRtrax/R/api.R

Line 33 in 78f277f

#' @param sensor_id Can either be "ARU" or "CAM"
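A minimal sketch of the kind of check requested here is shown below. The function name check_sensor_id() and its placement are hypothetical and are not the package's actual code; the set of allowed values is assumed from this issue ("ARU", "CAM", and "PC").

```r
# Hypothetical validator (not part of wildRtrax); allowed values assumed from this issue
check_sensor_id <- function(sensor_id, choices = c("ARU", "CAM", "PC")) {
  ok <- is.character(sensor_id) && length(sensor_id) == 1 && sensor_id %in% choices
  if (!ok) {
    stop("`sensor_id` must be one of: ", paste(choices, collapse = ", "), call. = FALSE)
  }
  invisible(sensor_id)
}

check_sensor_id("PC")                                  # passes silently
res <- try(check_sensor_id("PNTCNT"), silent = TRUE)   # fails with a readable message
```

Called at the top of a function that takes sensor_id, a check like this would fail early with a message naming the valid options instead of surfacing a downstream dplyr error.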

bar-lt file name support

wt_make_wide() does not work

wt_make_wide() fails because of incompatible column types in its inner_join() call:

eh14_raw <- wt_download_report(
  project_id = 93,
  sensor_id = "ARU",
  report = "main",
  weather_cols = FALSE
)

use.aru <- eh14_raw %>% 
  wt_tidy_species() %>% 
  wt_replace_tmtt() %>%
  wt_make_wide()