rekyt / funrar

:package: R package to compute functional rarity indices (functional distinctiveness, uniqueness, restrictedness, scarcity)

Home Page: https://rekyt.github.io/funrar

R 100.00%

ecology rarity ecological-models r-package r traits

funrar's Issues

Functional rarity dimensionality

As suggested by a reviewer, we could add a function to compute functional rarity indices on different combinations of traits from a single trait data frame.
This new function would take a trait data frame and a site-species matrix as input, and output lists of functional rarity indices based on different combinations of traits, for example trait by trait or all traits together.
Computing all possible combinations of traits may be too computationally intensive, but we could include it as an option.
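The idea can be sketched in base R as follows. This is only an illustration: `trait_combination_dists()` is a hypothetical name, not a funrar function, and it uses euclidean distance on scaled traits as a stand-in for whatever distance the package would compute. By default it covers the two cases named above (trait by trait, and all traits together); passing `sizes = seq_len(ncol(traits))` would enumerate every combination.

```r
# Hypothetical helper (not part of funrar): one distance matrix per trait subset
trait_combination_dists = function(traits, sizes = c(1, ncol(traits))) {
  # List every combination of trait names for each requested subset size
  combos = unlist(
    lapply(unique(sizes),
           function(k) combn(colnames(traits), k, simplify = FALSE)),
    recursive = FALSE
  )
  names(combos) = vapply(combos, paste, character(1), collapse = "+")

  lapply(combos, function(cols) {
    # Assumption: scaled euclidean distance; any distance could be used here
    as.matrix(dist(scale(traits[, cols, drop = FALSE])))
  })
}

traits = data.frame(height = c(1, 2, 4),
                    sla    = c(10, 30, 20),
                    n_mass = c(3, 1, 2),
                    row.names = c("a", "b", "c"))

dists = trait_combination_dists(traits)
names(dists)
```

Each element of `dists` could then be passed, together with the site-species matrix, to the index functions to obtain one set of indices per trait combination.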

Add warning when using distinctiveness() & scarcity() directly on site-species matrix

From N. Mouquet's email. We should add a warning to keep users from passing a raw site-species matrix directly.

We could test, for example, whether rows sum to 1, or whether all values are between 0 and 1, and suggest using make_relative() to the user.
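A minimal sketch of such a test, assuming sites are in rows and species in columns. `looks_relative()` is a hypothetical name and not the check funrar implements; the `strict` argument switches between the two tests mentioned above.

```r
# Hypothetical check (not the funrar implementation)
looks_relative = function(site_sp, strict = FALSE, tol = 1e-8) {
  # Loose test: every value lies between 0 and 1
  # (also true for presence-absence matrices)
  in_range = all(site_sp >= 0 & site_sp <= 1, na.rm = TRUE)
  if (!strict) return(in_range)
  # Strict test: additionally, every row (site) sums to 1
  in_range && all(abs(rowSums(site_sp, na.rm = TRUE) - 1) < tol)
}

abund = matrix(c(2, 0, 6, 4, 5, 5), nrow = 2,
               dimnames = list(c("s1", "s2"), c("a", "b", "c")))
rel = abund / rowSums(abund)   # relative abundances per site

if (!looks_relative(abund)) {
  message("Object may not contain relative abundances, see make_relative()")
}
looks_relative(rel, strict = TRUE)
```

The loose test accepts presence-absence data, while the strict test only accepts rows that sum to 1, so the choice depends on which inputs should trigger the warning.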

No error message when tra_table doesn't have any rownames.

From @pierredenelle on April 18, 2016 15:29

Then distinctiveness and scarcity cannot work and there's no clear error message.

Copied from original issue: Rekyt/project_divgrass#1

Create `_tidy()` aliases for `_stack()` functions

Because _stack() can be difficult to understand, we should create function aliases with a _tidy() suffix.

pres_distinctiveness fires warning when nothing changes

Needs a reproducible example.
Sometimes the warning message "Generating NaN" appears even though the result contains no NaN.

Add indentation in compute_dist_matrix() message

The message suggesting euclidean distance runs straight into the next message:

Only numeric traits provided, consider using euclidean distance.binary variable(s) 5 treated as interval scaled

Should add a "\n" in the cat() call.

Compatibility with `bigmatrix`

For edge cases with large matrices, implement the functions with the bigmatrix package.

Better introductory vignettes

Better explain the difference between the aravo list and the individual data.frames.

Also better explain the difference between the stacked and non-stacked data formats (show pictures).

pres_distinctiveness does not work with a single community

Reproducible example:

library(outlieR)
com = matrix(rep(1, 5))
rownames(com) = letters[1:5]
tr = data.frame(tr1 = rnorm(5))
rownames(tr) = letters[1:5]
sp_dist = compute_dist_matrix(tr)
pres_distinctiveness(com, sp_dist)

Error in index_matrix/denom_matrix : non-conformable arrays

Transform to sparse matrices in stack_to_matrix()

Idea from @pierredenelle: leverage tidytext::cast_sparse() to convert a data frame to a sparse matrix directly.

This could be exposed as an option: stack_to_matrix(my_df, sparse = TRUE).
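As an illustration of the conversion itself, here is a sketch that builds the sparse matrix with `Matrix::sparseMatrix()` rather than tidytext. `stack_to_sparse()` is a hypothetical helper, not funrar's `stack_to_matrix()`, and its argument names are assumptions.

```r
library(Matrix)

# Hypothetical helper (not funrar's stack_to_matrix())
stack_to_sparse = function(stack, row_col, col_col, value_col) {
  # Factors give each site/species a stable integer index
  rows = factor(stack[[row_col]])
  cols = factor(stack[[col_col]])
  sparseMatrix(i = as.integer(rows), j = as.integer(cols),
               x = stack[[value_col]],
               dims = c(nlevels(rows), nlevels(cols)),
               dimnames = list(levels(rows), levels(cols)))
}

stack = data.frame(site    = c("s1", "s1", "s2"),
                   species = c("a", "b", "b"),
                   abund   = c(3, 1, 2))

sp_mat = stack_to_sparse(stack, "site", "species", "abund")
sp_mat
```

Only the site-species pairs present in the data frame are stored; absent pairs are implicit zeros.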

Can't transform dense matrices into stack

Reproducible example:

library(Matrix)
library(funrar)
m = matrix(rnorm(10), nrow = 2)
matrix_to_stack(as(m, "dgeMatrix"))

R CMD check fail on CRAN

see https://cran.r-project.org/web/checks/check_results_funrar.html

These can be reproduced by checking with --as-cran using current
r-devel, which for now sets

R_CLASS_MATRIX_ARRAY=true

in the check environment, to the effect that

R> class(matrix(1 : 4, 2, 2))
[1] "matrix" "array"

(and no longer just "matrix" as before).

According to the R NEWS file,

For now only active when environment variable R_CLASS_MATRIX_ARRAY
is set to non-empty, but planned to be the new unconditional behavior
when R 4.0.0 is released:

matrix objects now also inherit from class "array", namely, e.g.,
class(diag(1)) is c("matrix", "array") which invalidates code
assuming that length(class(obj)) == 1, an incorrect assumption that
is less frequently fulfilled now.

S3 methods for "array", i.e., <someFun>.array(), are now also
dispatched for matrix objects.

Apparently your package no longer works correctly when
class(matrix(...)) gives a vector of length two: please fix as
necessary.
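A short sketch of the kind of change this usually calls for. It is an assumption that funrar compared `class()` to a string somewhere; the fix shown is the generic one and works both before and after R 4.0.0.

```r
m = matrix(1:4, 2, 2)

# Fragile: with R >= 4.0.0, class(m) is c("matrix", "array"), so
# class(m) == "matrix" has length two. Used as an if() condition this
# gives a warning or an error, depending on the R version.
# if (class(m) == "matrix") { ... }

# Robust alternatives that do not depend on length(class(m))
is_mat_1 = is.matrix(m)
is_mat_2 = inherits(m, "matrix")

is_mat_1
is_mat_2
```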

vignette rarity

Can the Violle TREE paper be added as a reference in this document as well?

Restrictedness computation

For the moment, the package does not include Restrictedness computation, for the sake of simplicity. Should we draft a function applicable to different datasets, so that the package covers the whole Ecology of outliers framework?

Better standardization of Restrictedness Ri

As suggested by F. Munoz:
For the moment, a species that is absent from the site-species matrix is the only one that can reach Ri = 1, which makes little sense in a real context.
We could try to find other ways of standardizing restrictedness.
For the moment, we only consider the "pixel"/"site" version of restrictedness.

For the moment
[1] Ri = 1 - Ki/Ktot
we could standardize compared to the restrictedness of a species present in a single cell Rone:
[2] Ri_new = Ri/Rone = (1 - Ki/Ktot) / (1 - 1/Ktot) = (Ktot - Ki)/(Ktot - 1)

Or, as a more general formula that accounts for how restrictedness is measured spatially, consider the minimum occupancy value Kmin_area:
[3] Ri_new_area = Ri_area / Rmin_area = (1 - Ki_area / Ktot_area) / (1 - Kmin_area / Ktot_area) = (Ktot_area - Ki_area) / (Ktot_area - Kmin_area)
Then in [2] and [3], Ri_new = 1 when Ki = 1 and Ri_new_area = 1 when Ki_area = Kmin_area, respectively.

And we should say that Ri for an absent species does not make sense.
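Formula [2] can be sketched in a few lines of base R. `restrictedness_std()` is a hypothetical name, not a funrar function; it assumes sites in rows and species in columns, and returns NA for absent species, following the remark that Ri makes no sense for them.

```r
# Hypothetical helper (not part of funrar): site-based formula [2]
restrictedness_std = function(site_sp) {
  k_tot = nrow(site_sp)          # Ktot: total number of sites
  k_i   = colSums(site_sp > 0)   # Ki: number of sites occupied by each species

  if (any(k_i == 0)) {
    warning("Some species are absent from all sites; returning NA for them")
    k_i[k_i == 0] = NA
  }
  # Ri_new = (Ktot - Ki) / (Ktot - 1); undefined when there is a single site
  (k_tot - k_i) / (k_tot - 1)
}

site_sp = matrix(c(1, 0, 0, 0,    # species a: one site
                   1, 1, 0, 0,    # species b: two sites
                   1, 1, 1, 1),   # species c: all four sites
                 nrow = 4,
                 dimnames = list(paste0("s", 1:4), c("a", "b", "c")))

ri = restrictedness_std(site_sp)
ri
```

With four sites, a species present in a single site gets Ri_new = 1 and a species present everywhere gets Ri_new = 0.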

No warning when using `scarcity_stack()` without relative abundances

data("aravo", package = "ade4")

mat = as.matrix(aravo$spe)
tra = aravo$traits[, c("Height", "SLA", "N_mass")]
dist_mat = funrar::compute_dist_matrix(tra)
#> Warning in funrar::compute_dist_matrix(tra): Only numeric traits provided,
#> consider using euclidean distance.

# Warning --------------------------------------------------------------------------------
g = funrar::scarcity(mat)
#> Warning in funrar::scarcity(mat): Provided object may not contain relative abundances nor presence-absence
#> Have a look at the make_relative() function if it is the case
h = funrar::distinctiveness(mat, dist_mat)
#> Warning in funrar::distinctiveness(mat, dist_mat): Provided object may not contain relative abundances nor presence-absence
#> Have a look at the make_relative() function if it is the case
i = funrar::distinctiveness_stack(funrar::matrix_to_stack(mat, "abund", "site",
                                                          "species"),
                                  "species", "site", "abund", dist_mat)
#> Warning in funrar::distinctiveness_stack(funrar::matrix_to_stack(mat, "abund", : Provided object may not contain relative abundances nor presence-absence
#> Have a look at the make_relative() function if it is the case

# No warning --------------------------------------------------------------------------------
j = funrar::scarcity_stack(funrar::matrix_to_stack(mat, "abund", "site",
                                                   "species"),
                           "species", "site", "abund")

Should also make a _stack() version of make_relative()!

Alternative distinctiveness definition

Following CESAB group discussion. Suggestion from A. Osling to compute distinctiveness in local communities based on maximum observed functional distance at the local level.

Using a threshold at the site level -> T, T = max(dij) per site for the focal species?

vignette

Pierre - I don't really feel like showing all of the calculations again for the sparse matrix/ table version is necessary in rarity_indices.Rmd. Would it be possible to introduce them, exactly as you do, but then just list the function names for the corresponding functions?

Compute Di SAD

Compute distribution of species or individual number per class of Di values in an assemblage (plus regional).
Compare to null model (sampling of regional pool).
Look at deviations and correlation deviations to Di values.

Distance matrix

Should we include functional distance matrix computation in the package?
We could let users choose their own metric, because our wrapper compute_dist_matrix() is quite limited and is not explicit about the choices it makes.

Add Standardization from Ricotta et al. 2016

Ricotta et al. 2016 [1] suggested various measures of functional rarity. Some are similar to functional distinctiveness and can be standardized by their maximum value.

We could include standardization into funrar.

[1] Ricotta, C., de Bello, F., Moretti, M., Caccianiga, M., Cerabolini, B. E.L. and Pavoine, S. (2016), Measuring the functional redundancy of biological communities: a quantitative guide. Methods Ecol Evol, 7: 1386–1395. doi:10.1111/2041-210X.12604

Usage of dplyr not exported 'left_join_impl' function

The package uses dplyr:::left_join_impl() to join tables, which CRAN advises against. So we should do something about it.

Warning when producing `NaN` with `distinctiveness_stack()`

When distinctiveness_stack() produces NaN values because a species is alone in a given site, there are no warnings.
We should add one, and test also on all the other functions.
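One possible way to detect the situation before computing the index, sketched on a stacked data frame. `sites_with_single_species()` and the column names are hypothetical; this is not how funrar does it.

```r
# Hypothetical helper (not part of funrar): find sites holding a single species
sites_with_single_species = function(stack, site_col, abund_col) {
  # Keep only rows where the species is actually present
  present = stack[stack[[abund_col]] > 0, , drop = FALSE]
  n_sp = table(present[[site_col]])   # number of species per site
  names(n_sp)[n_sp == 1]
}

stack = data.frame(site    = c("s1", "s1", "s2"),
                   species = c("a", "b", "a"),
                   abund   = c(1, 2, 3))

alone = sites_with_single_species(stack, "site", "abund")
if (length(alone) > 0) {
  warning("Species alone in site(s) ", paste(alone, collapse = ", "),
          ": distinctiveness will be NaN there")
}
```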

Matrices-Dataframes

For the functions, is it possible to add either a check for the incorrect data type that returns a warning (e.g. pres_matrix is a dataframe, must be a matrix); or a check that automatically converts df -> matrix and continues the calculations?
Currently no informative error shows up if you input a data.frame, just something like "Error in pres_matrix %*% dist_matrix : requires numeric/complex matrix/vector arguments"
I imagine this would be a common issue.
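The second option (convert automatically, fail clearly otherwise) could look like the sketch below. `as_numeric_matrix()` is a hypothetical helper name; note that this simple version would also reject sparse Matrix objects, which a real implementation would need to let through.

```r
# Hypothetical helper (not part of funrar): coerce or fail with a clear message
as_numeric_matrix = function(x, arg_name = "pres_matrix") {
  if (is.data.frame(x)) {
    if (!all(vapply(x, is.numeric, logical(1)))) {
      stop(arg_name, " is a data frame with non-numeric columns", call. = FALSE)
    }
    x = as.matrix(x)   # keeps row and column names
  }
  if (!is.matrix(x) || !is.numeric(x)) {
    stop(arg_name, " must be a numeric matrix or data frame", call. = FALSE)
  }
  x
}

pres_df = data.frame(a = c(1, 0), b = c(0, 1), row.names = c("s1", "s2"))
pres_mat = as_numeric_matrix(pres_df)
pres_mat
```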

Warning message when only continuous traits provided in compute_dist_matrix

Gower distance is the default parameter with compute_dist_matrix.
Possibility to warn the user when only continuous traits are provided.

Ex:
if (all(vapply(traits_table, is.numeric, logical(1)))) {
  warning("Only continuous traits provided, you could use Euclidean instead of Gower distance.")
}

distinctiveness_global() does not throw an error when some species have only NA values

Adding an argument to disable matrix subsetting

Sometimes it can be useful to skip the subsetting step that matches distance matrices with presence-absence matrices.

Lack of error message in pres_distinctiveness.

From @pierredenelle on April 18, 2016 16:25

When:
length(rownames(site_sp_matrix)) > length(rownames(tra_table))
the error message is not clear.

Display something like: "more species in site_species matrix than in provided distance matrix."

Copied from original issue: Rekyt/project_divgrass#2
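A sketch of a check that names the offending species rather than only comparing lengths. `check_species()` is hypothetical, and it assumes species are the columns of the site-species matrix and the row names of the distance matrix.

```r
# Hypothetical check (not part of funrar)
check_species = function(site_sp, dist_matrix) {
  # Species present in the site-species matrix but missing from the distances
  missing_sp = setdiff(colnames(site_sp), rownames(dist_matrix))
  if (length(missing_sp) > 0) {
    stop("More species in site-species matrix than in provided distance ",
         "matrix. Missing: ", paste(missing_sp, collapse = ", "),
         call. = FALSE)
  }
  invisible(TRUE)
}

site_sp = matrix(1, nrow = 1, ncol = 3,
                 dimnames = list("s1", c("a", "b", "c")))
dist_matrix = as.matrix(dist(c(a = 1, b = 3)))

res = try(check_species(site_sp, dist_matrix), silent = TRUE)
cat(res)
```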

`matrix_to_stack()` does not work on sparse matrices

Need to fix that

Remove reference to relative abundances

distinctiveness() warns the user when using relative abundance while it doesn't change the results of the computation. The warnings should go.

`stack_to_matrix()` and `matrix_to_stack()` names

Just create aliases with names tidy_to_matrix() and matrix_to_tidy(), as well as aliases for all index_stack()

Tidy table or sites x species matrix?

For the moment, we run our computations on tidy tables with one column for sites and one for species. However, most datasets are provided as a site-species matrix with abundances in the cells.

We developed pres_distinctiveness() to compute distinctiveness on such a matrix, but it uses a tidy table in the background.
We could try to find or write a function to convert between the two data formats.
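Base R already allows a round trip between the two formats, sketched below. The column names `site`, `species` and `abund` are made up for the example, and this is not necessarily the approach the package should take.

```r
tidy = data.frame(site    = c("s1", "s1", "s2"),
                  species = c("a", "b", "b"),
                  abund   = c(3, 1, 2))

# Tidy table -> site-species matrix (absent pairs come out as NA, set to 0)
mat = tapply(tidy$abund, tidy[c("site", "species")], sum)
mat[is.na(mat)] = 0
mat

# Site-species matrix -> tidy table (one row per site-species pair)
back = as.data.frame(as.table(mat), responseName = "abund",
                     stringsAsFactors = FALSE)
back[back$abund > 0, ]
```

The matrix-to-tidy direction returns every site-species pair, so filtering on positive abundance recovers the original rows.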

Add function to compute global distinctiveness using directly distance matrix

A very simple function that computes distinctiveness directly from an entire distance/dissimilarity matrix, treating every species as equally present:

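library(dplyr)  # provides %>% and select()

get_di_from_dist = function(dist_matrix, di_name = "global_di") {
  # Single "global" site in which every species of the distance object is present
  funrar::distinctiveness(
    matrix(1, nrow = 1, ncol = length(labels(dist_matrix)),
           dimnames = list(site = "global",
                           species = labels(dist_matrix))),
    as.matrix(dist_matrix)) %>%
    # Convert to a stacked data frame, then drop the constant site column
    funrar::matrix_to_stack(di_name, "global",
                            "species") %>%
    select(-global)
}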

Correct vignette name typo

funrar/vignettes/sparse_matrices.Rmd

Line 7 in f1b04d0

%\VignetteIndexEntry{Sparse Marices with 'funrar'}

Simplify compatibility with sparse Matrices

By depending on the Matrix package, we can simplify the code, as done in fundiversity.

abundance data

How are different types of abundance data dealt with currently in outlier? Does it matter if the data is count data, relative abundance data, % cover, etc?

Resolve CRAN compatibility with R 4.0.0 (stringsAsFactors = FALSE)

See: https://cran.r-project.org/web/checks/check_results_funrar.html

The development version of R fails because of a change introduced in R 4.0.0 (https://developer.r-project.org/Blog/public/2020/02/16/stringsasfactors/index.html)

Can you please fix your package to work with both the old and new
default? In principle, this can easily be achieved by adding
stringsAsFactors = TRUE to the relevant calls to data.frame() or
read.table() [or other read.* function calling read.table()], but please
only do this if the sort order used in the string to factor conversion
really does not matter (see the blog post about the locale dependence of
the conversion). Otherwise, please change to create the factors with
explicitly given levels.

Please correct before 2020-03-20
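The last suggestion in the message, creating factors with explicitly given levels, can be illustrated as follows. The data are made up; the point is that the result no longer depends on the `stringsAsFactors` default or on locale-dependent sorting.

```r
species = c("b", "a", "c")

# Explicit levels: identical result under the old and new defaults
df = data.frame(
  species = factor(species, levels = c("a", "b", "c")),
  abund   = c(1, 2, 3),
  stringsAsFactors = FALSE   # any other character column stays character
)

levels(df$species)
```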

Doc scale 0 and 1

One user contacted me by email because he did not know that distances need to be scaled between 0 and 1 before computing functional distinctiveness. This matters especially for euclidean distances, which are not bounded by 1.
We need to add this to the documentation.
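For the documentation, one simple option is to divide the distance matrix by its maximum, as sketched below with made-up trait values. Dividing by the maximum is only one possible scaling and is an assumption here, not a documented funrar recommendation.

```r
traits = data.frame(height = c(1, 2, 4),
                    sla    = c(10, 30, 20),
                    row.names = c("a", "b", "c"))

# Euclidean distance on scaled traits can exceed 1
dist_mat = as.matrix(dist(scale(traits)))

# Rescale so that all distances lie between 0 and 1
scaled_dist = dist_mat / max(dist_mat)
range(scaled_dist)
```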