Git Product home page Git Product logo

fgeo.tool's People

Contributors

jimhester avatar maurolepore avatar overstreeth avatar

Stargazers

 avatar  avatar

Watchers

 avatar  avatar

fgeo.tool's Issues

Fix byyr_abundance() to count only one individual per Tag not per StemID

Here the output should be 1 -- ont 2 (@).

library(fgeo)
#> -- Attaching packages -------------------------------------------- fgeo 0.0.0.9000 --
#> v bciex           0.0.0.9000     v fgeo.demography 0.0.0.9000
#> v fgeo.abundance  0.0.0.9004     v fgeo.habitat    0.0.0.9006
#> v fgeo.base       0.0.0.9001     v fgeo.map        0.0.0.9204
#> v fgeo.data       0.0.0.9002     v fgeo.tool       0.0.0.9003
#> 

vft <- data.frame(
  StemID = c("1", "2"),
  Tag = c("0001", "0001"),
  PlotName = "p",
  Status = c("alive", "alive"),
  DBH = c(10, 100),
  ExactDate = c("2000-01-01", "2000-01-01"),
  PlotCensusNumber = c(1, 1),
  CensusID = c(1, 1),
  Genus = c("A", "A"),
  SpeciesName = c("a", "a"),
  Family = "f",
  stringsAsFactors = FALSE
)
vft
#>   StemID  Tag PlotName Status DBH  ExactDate PlotCensusNumber CensusID
#> 1      1 0001        p  alive  10 2000-01-01                1        1
#> 2      2 0001        p  alive 100 2000-01-01                1        1
#>   Genus SpeciesName Family
#> 1     A           a      f
#> 2     A           a      f

# First pick the data you want
pick1 <- pick_plotname(vft, "p")
#> Using: p.

pick2 <- drop_dead_trees_by_cns(pick1)
#> Calculating tree-status (from stem `Status`) by `PlotCensusNumber`.
#> Warning: No observation has .status = dead
#>   * Detected values: alive
#> Dropping rows where `Status = dead`.
#> Warning: No observation has .status = dead
#>   * Detected values: alive
pick3 <- pick_dbh_min(pick2, 10)
pick3
#> # A tibble: 2 x 12
#>   StemID Tag   PlotName Status   DBH ExactDate  PlotCensusNumber CensusID
#>   <chr>  <chr> <chr>    <chr>  <dbl> <chr>                 <dbl>    <dbl>
#> 1 1      0001  p        alive     10 2000-01-01                1        1
#> 2 2      0001  p        alive    100 2000-01-01                1        1
#> # ... with 4 more variables: Genus <chr>, SpeciesName <chr>, Family <chr>,
#> #   status_tree <chr>

byyr_abundance(pick3)
#> # A tibble: 1 x 3
#>   species Family `2000`
#>   <chr>   <chr>   <dbl>
#> 1 A a     f           2

Created on 2018-06-20 by the reprex package (v0.2.0).

thank @rick_pack2

replace_all_na <- function(x, filler = 0) {
  replace(x, http://is.na (x), filler)
}
replace_all_na(df, 0)
#>   x y
#> 1 0 2
#> 2 1 0
replace_all_na(df, "missing")
#>         x       y
#> 1 missing       2
#> 2       1 missing

FastField Forms spreadsheet export - missing required sheets

Please briefly describe your problem and what output you expect. If you have a question, please don't use this form. Instead, ask on https://stackoverflow.com/, https://community.rstudio.com/, https://github.com/forestgeo/forum/ or email Mauro Lepore at [email protected].

Please include a minimal reproducible example (AKA a reprex). If you've never heard of a reprex before, start by reading https://www.tidyverse.org/help/#reprex.


Brief description of the problem
When using FastField Forms for a re-census we've discovered or had submissions missing one or multiple of the required sheets: 'new_secondary_stems', 'recruits', and 'original_stems'. If no recruits are recorded in a quadrat, no sheet is exported. Likewise, post-submission if a recruit is found, then added to a 'new' file by itself, the 'new_secondary_stems', and 'original_stems' sheets are missing. This of course breaks the code and the files cannot be compiled. Any solution to this, or should I reach out to FastField Forms/add the sheets ourselves?

> ######## Compile the separate worksheets from the quadrat workbooks into new excel files

> files.directory <- "C:/Users/shuej/Dropbox (Smithsonian)/Field_Form_Data_Entry/HF/uploads"

> files.export <- "C:/Users/shuej/Dropbox (Smithsonian)/Field_Form_Data_Entry/HF/exports/test"
> xl_sheets_to_xl(files.directory, files.export, first_census = FALSE)
Error: Data should contain these sheets:
original_stems, new_secondary_stems, recruits, root
* Missing sheets: recruits
In addition: Warning messages:
1: `new_secondary_stems` has cero rows. 
2: Filling every cero-row dataframe with NAs (new_secondary_stems). 
3: `new_secondary_stems` has cero rows. 
4: Filling every cero-row dataframe with NAs (new_secondary_stems). 

Write helpers to categorize data

From @maurolepore on August 31, 2017 16:19

Gabriel proposed to develop a friendly way to categorize (cut) numeric variables. Important ones include:

  • from dbh to size category.
  • from gx and gy to quadrat: see add_quad().
suppressPackageStartupMessages(library(fgeo))
x <- tibble::tibble(gx = c(0, 50, 999.9, 1000), gy = gx/2)
add_quad(x)
#> Gessing: plotdim = c(1000, 500)
#>   * If guess is wrong, provide the correct argument `plotdim`
#> # A tibble: 4 x 3
#>      gx    gy quad 
#>   <dbl> <dbl> <chr>
#> 1    0     0  0101 
#> 2   50    25  0302 
#> 3 1000.  500. 5025 
#> 4 1000   500  NANA
add_quad(x, start = 0)
#> Gessing: plotdim = c(1000, 500)
#>   * If guess is wrong, provide the correct argument `plotdim`
#> # A tibble: 4 x 3
#>      gx    gy quad 
#>   <dbl> <dbl> <chr>
#> 1    0     0  0000 
#> 2   50    25  0201 
#> 3 1000.  500. 4924 
#> 4 1000   500  NANA

Created on 2018-07-02 by the reprex package (v0.2.0).

Copied from original issue: forestgeo/fgeo.abundance#46

Consider that mpala must be padded with 5 digits -- not 4

mpala is the only dataset that has so many quadrats that it needs to be padded with 5 instead of 4 "0".

Write an example showing that if the data is subsetted before it is passed to plot_tag_status_by_subquadrat() the order of maps may be different than expected. So show how to arrange the data before subsetting it.

Clean flag_multiple() and multiple_var().

DRY. Some funcitons seem duplicated with fgeo.tool.

flat_multiple() is being used but flag_if() and multiple_var() are not. See if and which is a better alternative to flag_multiple() and similar functions in fgeo.tool. Then clean the rest.

Create tree table from stem table

Pull the code out of rtbl?

Rewrite? E.g.

pick_stem_max_dbh <- function(.data) {
  # Allow input as ViewFullTable, stem and tree table
  old <- names(.data)
  names(.data) <- tolower(old)
  fgeo.base::check_crucial_names(.data, c("treeid", "stemid", "dbh"))
  
  # TODO: Check for unique censusid or group by census id 
  
  # TODO:
  # if (tree_already) {return(stats::setNames(.data, old))}
  
  .data <- tibble::rowid_to_column(.data)
  .data <- dplyr::group_by(.data, .data$treeid)
  .data <- dplyr::arrange(.data, dplyr::desc(.data$dbh))
  .data <- dplyr::ungroup(.data)
  
  out <- dplyr::distinct(.data, .data$treeid, .keep_all = TRUE)
  # Recover original order
  out <- dplyr::select(dplyr::arrange(out, .data$rowid), -.data$rowid)
  
  stats::setNames(out, old)
}

df <- tibble::tibble(
  treeID = c(1, 1, 1, 2, 2, 2),
  stemID = letters[c(1, 2, 3, 1, 2, 3)],
  dbh = c(1, 2, NA, 4, 5, NA)
)
df
#> # A tibble: 6 x 3
#>   treeID stemID   dbh
#>    <dbl> <chr>  <dbl>
#> 1      1 a          1
#> 2      1 b          2
#> 3      1 c         NA
#> 4      2 a          4
#> 5      2 b          5
#> 6      2 c         NA

pick_stem_max_dbh(df)
#> # A tibble: 2 x 3
#>   treeID stemID   dbh
#>    <dbl> <chr>  <dbl>
#> 1      1 b          2
#> 2      2 b          5

Created on 2018-06-28 by the reprex package (v0.2.0).

reexport the main dplyr verbs

Consider reexporting the main verbs and common tools.

pro:

  • everything is available with (fgeo)
    con:

  • conflicts will show up when loading tidyverse

  • makes it harder to discover the tidyverse

main verbs

filter
select
arrange
summarize
mutate

also

group_by
tibble
tribble
as_tibble
count
add_count

select2()

select2 <- function(.data, ...) {
  dots <- rlang::list2(...)
  select(.data, !!!text_exprs(dots))
}

text_exprs <- function(...) {
  rlang::parse_exprs(semicolon(...))
}

semicolon <- function(...) {
  paste0(..., collapse = "; ")
}

library(tidyverse)

mtcars <- as.tibble(mtcars)

text_vars <- c("mpg", "am")
select2(mtcars, text_vars)
#> # A tibble: 32 x 2
#>      mpg    am
#>  * <dbl> <dbl>
#>  1  21       1
#>  2  21       1
#>  3  22.8     1
#>  4  21.4     0
#>  5  18.7     0
#>  6  18.1     0
#>  7  14.3     0
#>  8  24.4     0
#>  9  22.8     0
#> 10  19.2     0
#> # ... with 22 more rows

# If multiple strings, separate vars with comma
text_vars <- c("mpg", "am", "carb")
select2(mtcars, text_vars)
#> # A tibble: 32 x 3
#>      mpg    am  carb
#>  * <dbl> <dbl> <dbl>
#>  1  21       1     4
#>  2  21       1     4
#>  3  22.8     1     1
#>  4  21.4     0     1
#>  5  18.7     0     2
#>  6  18.1     0     1
#>  7  14.3     0     4
#>  8  24.4     0     2
#>  9  22.8     0     2
#> 10  19.2     0     4
#> # ... with 22 more rows

# If single string, separate with semicolon
text_vars <- c("mpg; am; carb")
select2(mtcars, text_vars)
#> # A tibble: 32 x 3
#>      mpg    am  carb
#>  * <dbl> <dbl> <dbl>
#>  1  21       1     4
#>  2  21       1     4
#>  3  22.8     1     1
#>  4  21.4     0     1
#>  5  18.7     0     2
#>  6  18.1     0     1
#>  7  14.3     0     4
#>  8  24.4     0     2
#>  9  22.8     0     2
#> 10  19.2     0     4
#> # ... with 22 more rows

# With a single string you can do all you can do with bare
text <- 'mpg:cyl; matches("df"); everything()'
select2(mtcars, text)
#> # A tibble: 32 x 11
#>      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>  * <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1  21       6  160    110  3.9   2.62  16.5     0     1     4     4
#>  2  21       6  160    110  3.9   2.88  17.0     0     1     4     4
#>  3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1
#>  4  21.4     6  258    110  3.08  3.22  19.4     1     0     3     1
#>  5  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2
#>  6  18.1     6  225    105  2.76  3.46  20.2     1     0     3     1
#>  7  14.3     8  360    245  3.21  3.57  15.8     0     0     3     4
#>  8  24.4     4  147.    62  3.69  3.19  20       1     0     4     2
#>  9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2
#> 10  19.2     6  168.   123  3.92  3.44  18.3     1     0     4     4
#> # ... with 22 more rows
# Same
library(rlang)
#> 
#> Attaching package: 'rlang'
#> The following objects are masked from 'package:purrr':
#> 
#>     %@%, %||%, as_function, flatten, flatten_chr, flatten_dbl,
#>     flatten_int, flatten_lgl, invoke, list_along, modify, prepend,
#>     rep_along, splice
bare <- exprs(mpg:cyl, matches("df"), everything())
select(mtcars, !!!bare)
#> # A tibble: 32 x 11
#>      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>  * <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1  21       6  160    110  3.9   2.62  16.5     0     1     4     4
#>  2  21       6  160    110  3.9   2.88  17.0     0     1     4     4
#>  3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1
#>  4  21.4     6  258    110  3.08  3.22  19.4     1     0     3     1
#>  5  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2
#>  6  18.1     6  225    105  2.76  3.46  20.2     1     0     3     1
#>  7  14.3     8  360    245  3.21  3.57  15.8     0     0     3     4
#>  8  24.4     4  147.    62  3.69  3.19  20       1     0     4     2
#>  9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2
#> 10  19.2     6  168.   123  3.92  3.44  18.3     1     0     4     4
#> # ... with 22 more rows

Created on 2018-04-29 by the reprex package (v0.2.0).

Improve by_group() (or groupwise())

# I want to make any funciton work with grouped data. I propose 
# by_group(.data, .f, ...), similar to map(), by applies .f() not to each column
# of .data but to each group -- and to all of it if .data is ungrouped.
# alias: grouply(), groupwise().

# History of this feature including upcomming dplyr::nest_by()
# https://community.rstudio.com/t/is-nest-mutate-map-unnest-really-the-best-alternative-to-dplyr-do/11009/7?u=mauro_lepore

library(tidyverse)

# E.g.
# nest_by_groups(mtcars)
# nest_by_groups(group_by(mtcars, cyl))
nest_by_groups <- function(.data) {
  g <- group_vars(.data)
  .data %>% 
    tibble::as.tibble() %>% 
    tibble::add_column(.nest_id = dplyr::group_indices(.)) %>% 
    dplyr::ungroup() %>%
    tidyr::nest(-.nest_id)
}

by_group <- function(.data, .f, ...) {
  .data %>% 
    nest_by_groups() %>% 
    mutate(data = map(.data$data, .f, ...)) %>% 
    tidyr::unnest() %>% 
    dplyr::select(-.nest_id)
}

# E.g. 
first_row <- function(.x, to_chr = FALSE) {
  first <- .x[1, ]
  if (to_chr) {
    first[] <- lapply(first, as.character)
  }
  
  tibble::as.tibble(first)
}
mtcars %>% 
  by_group(first_row, to_chr = TRUE)
#> # A tibble: 1 x 11
#>   mpg   cyl   disp  hp    drat  wt    qsec  vs    am    gear  carb 
#>   <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 21    6     160   110   3.9   2.62  16.46 0     1     4     4

mtcars %>% 
  group_by(cyl) %>% 
  by_group(first_row, to_chr = T)
#> # A tibble: 3 x 11
#>   mpg   cyl   disp  hp    drat  wt    qsec  vs    am    gear  carb 
#>   <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 21    6     160   110   3.9   2.62  16.46 0     1     4     4    
#> 2 22.8  4     108   93    3.85  2.32  18.61 1     1     4     1    
#> 3 18.7  8     360   175   3.15  3.44  17.02 0     0     3     2

Created on 2018-07-16 by the reprex package (v0.2.0.9000).

Wrap vegan

Diversity

E.g. from qo00 in maurolepore/hrv

diversity_by_sample <- wrangle() %>%
  dplyr::group_by(sample_name) %>%
  dplyr::mutate(
    shannon = vegan::diversity(abun_pcnt, "shannon", MARGIN = 2),
    invsimpson = vegan::diversity(abun_pcnt, "invsimpson", MARGIN = 2),
    simpson = vegan::diversity(abun_pcnt, "simpson", MARGIN = 2),
    richness = length(sample_name)
  ) %>%
  dplyr::ungroup() %>%
  dplyr::select(
    time_interval, collected_from, sample_name, species, shannon:richness
  ) %>%
  # Keep only one row per sample -- not one per sample per species
  dplyr::select(-species) %>%
  unique() %>%
  # Transform to long format to allow facetting by metric(key)
  tidyr::gather(key = metric, value = value, shannon:richness)

fgeo classes and methods to classify tables

The job is to homogenize the different datasets so that they can be used with the greatest number of functions.

Use data from Yosemite for example. Suzanne said it's up to date

  • Identify each table by its names. Check the difference between names to understand what identifies each table uniquely

  • Check what names intersect

  • What variables are the same, regardless of their names. See definition in CFTS pakge

  • What variables are not the same

  • What variables matter -- the crucial names of all functions I have developed (search for crucial_names())

...
Fgeo_clasify x
If (names x match vft) as.vft
If ... as.stem

vft <- f (x, ...) structure(x, class = c(vft, tibble?, data.frame))

as.fgeo
as_fgeo?

homo_nms.vft
homo_nms.stem
...

error with xlff_to_dfs - repeating sheet name

After running the FastField Forms files through the xlff_to_xl function we then attempted to run the new xlff_to_dfs function. We were unsuccessful and received the following error:

> first_week_hf <- xlff_to_dfs("C:/Users/shuej/Dropbox (Smithsonian)/Field_Form_Data_Entry/HF/exports/test")
Error: Column `sheet` must have a unique name
In addition: Warning message:
Adding missing sheets: original_stems, new_secondary_stems, recruits, [root.]

The name of the sheet for each stem repeats multiple times depending upon which sheet the stem originated from.
excel file

fill_na()

Rename replace _all_na() to fill_na() with is more specific. Replace is a general term on the implementation domain -- because it is inspired in replace() -- while fill_na() is on the problem domain.

Consider adding a method for lists.

fill_na <- function(x, filler) {
  x[is.na(x)] <- filler
  x
}

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.