rekyt / funrar
R package to compute functional rarity indices (functional distinctiveness, uniqueness, restrictedness, scarcity)
Home Page: https://rekyt.github.io/funrar
As suggested by a reviewer, we could add a function to compute functional rarity indices on different combinations of traits from a single trait data frame.
This new function would take a trait data frame and a site-species matrix as input, and output lists of functional rarity indices based on different combinations of traits, for example trait by trait or all traits together.
Computing every possible combination of traits may be too computationally intensive, but we could include it as an option.
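A minimal sketch of how the trait combinations could be enumerated, assuming numeric traits only. The helper name `trait_combinations_dist()`, its `sizes` argument, and the choice of scaled Euclidean distance are all hypothetical and not part of funrar; the resulting list of distance matrices could then be passed one by one to the rarity index functions.

```r
# Hypothetical helper (not part of funrar): one distance matrix per trait combination.
# Assumes all traits are numeric; scaled Euclidean distance is an illustrative choice.
trait_combinations_dist <- function(traits, sizes = c(1, ncol(traits))) {
  stopifnot(is.data.frame(traits), all(vapply(traits, is.numeric, logical(1))))
  out <- list()
  for (k in unique(sizes)) {
    # All combinations of k trait names
    combos <- utils::combn(names(traits), k, simplify = FALSE)
    for (combo in combos) {
      d <- as.matrix(stats::dist(scale(traits[, combo, drop = FALSE])))
      out[[paste(combo, collapse = "+")]] <- d / max(d)  # rescale to [0, 1]
    }
  }
  out
}

set.seed(1)
traits <- data.frame(height = rnorm(4), sla = rnorm(4), n_mass = rnorm(4),
                     row.names = paste0("sp", 1:4))
# Default: trait by trait, plus all traits together
dist_list <- trait_combinations_dist(traits)
names(dist_list)
```

Passing `sizes = seq_len(ncol(traits))` would cover every combination, at a cost that grows as 2^n with the number of traits, which is why it is kept optional here.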
From N. Mouquet's email: we should add a warning to keep users from directly inputting a raw site-species matrix.
We could test, for example, whether rows sum to 1 or whether all values are between 0 and 1, and suggest using make_relative() to the user.
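The check described above could look roughly like the following. This is a sketch, not funrar's actual implementation: `check_relative()` is a hypothetical name, and accepting either rows summing to 1 or pure 0/1 values is an assumption about what should count as valid input.

```r
# Hypothetical check (not funrar's actual code): warn when a site-species matrix
# looks like raw abundances rather than relative abundances or presence-absence.
check_relative <- function(site_sp) {
  vals <- site_sp[!is.na(site_sp)]
  in_unit_range <- all(vals >= 0 & vals <= 1)
  rows_sum_to_one <- isTRUE(all.equal(unname(rowSums(site_sp, na.rm = TRUE)),
                                      rep(1, nrow(site_sp))))
  ok <- in_unit_range && (rows_sum_to_one || all(vals %in% c(0, 1)))
  if (!ok) {
    warning("Provided object may not contain relative abundances nor ",
            "presence-absence.\nHave a look at make_relative().", call. = FALSE)
  }
  invisible(ok)
}

abund <- matrix(c(10, 0, 5, 5), nrow = 2,
                dimnames = list(c("s1", "s2"), c("a", "b")))
rel <- abund / rowSums(abund)  # each row divided by its own total

check_relative(rel)    # silent
check_relative(abund)  # emits the warning
```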
From @pierredenelle on April 18, 2016 15:29
Then distinctiveness and scarcity cannot work and there's no clear error message.
Copied from original issue: Rekyt/project_divgrass#1
Because _stack() can be difficult to understand, we create function aliases with _tidy() suffixes.
Need reproducible example.
But sometimes get warning message "Generating NaN" where there is no NaN.
Ugly message with suggestion euclidean distance:
Only numeric traits provided, consider using euclidean distance.binary variable(s) 5 treated as interval scaled
Should add a "\n" in the cat() call.
For edge cases with large matrices implement function with package bigmatrix.
Better explain the difference between the aravo list and the different data.frames.
Also better explain the different data formats (show pictures) between stack and non-stack.
Reproducible example:
library(outlieR)
com = matrix(rep(1, 5))
rownames(com) = letters[1:5]
tr = data.frame(tr1 = rnorm(5))
rownames(tr) = letters[1:5]
sp_dist = compute_dist_matrix(tr)
pres_distinctiveness(com, sp_dist)
Error in index_matrix/denom_matrix : non-conformable arrays
Idea from @pierredenelle: leverage tidytext::cast_sparse() to be able to convert a data frame to a sparse matrix directly.
This could be an option: stack_to_matrix(my_df, sparse = TRUE).
Reproducible example:
library(Matrix)
library(funrar)
m = matrix(rnorm(10), nrow = 2)
matrix_to_stack(as(m, "dgeMatrix"))
see https://cran.r-project.org/web/checks/check_results_funrar.html
These can be reproduced by checking with --as-cran using current
r-devel, which for now sets R_CLASS_MATRIX_ARRAY=true
in the check environment, to the effect that
R> class(matrix(1 : 4, 2, 2))
[1] "matrix" "array"
(and no longer just "matrix" as before).
According to the R NEWS file,
For now only active when environment variable R_CLASS_MATRIX_ARRAY
is set to non-empty, but planned to be the new unconditional behavior
when R 4.0.0 is released:
matrix objects now also inherit from class "array", namely, e.g.,
class(diag(1)) is c("matrix", "array") which invalidates code
assuming that length(class(obj)) == 1, an incorrect assumption that
is less frequently fulfilled now.
S3 methods for "array", i.e., .array(), are now also
dispatched for matrix objects.
Apparently your package no longer works correctly when
class(matrix(...)) gives a vector of length two: please fix as
necessary.
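A short illustration of the kind of change this request usually calls for. Which checks funrar actually contained is not stated here, so the commented-out line is only an assumed example of the fragile pattern.

```r
m <- matrix(1:4, 2, 2)

# Fragile (assumed example): with class(m) equal to c("matrix", "array"),
# the comparison below has length two and cannot be used as an `if` condition.
# if (class(m) == "matrix") { ... }

# Robust alternatives that do not depend on length(class(m)):
is.matrix(m)
inherits(m, "matrix")
```

`is.matrix()` only tests for a two-dimensional `dim` attribute, so it behaves the same before and after the change.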
Can the Violle TREE paper be added as a reference in this document as well?
For the moment the package does not include Restrictedness computation, for the sake of simplicity. Should we draft a function applicable to different datasets, so that the package covers all of the Ecology of outliers framework?
As suggested by F. Munoz:
For the moment, a species that is absent from the site-species matrix is the only one to achieve Ri = 1, which makes little sense in a real context.
We could try to find other ways of standardizing restrictedness.
For the moment we only consider the "pixel"/"site" version of restrictedness.
Currently:
[1] Ri = 1 - Ki/Ktot
We could standardize by the restrictedness of a species present in a single cell, Rone:
[2] Ri_new = Ri/Rone = (1 - Ki/Ktot) / (1 - 1/Ktot) = (Ktot - Ki)/(Ktot - 1)
Or, as a more general formula that accounts for the spatial determination of restrictedness, consider the minimum occupancy value Kmin:
[3] Ri_new_area = Ri_area / Rmin_area = (1 - Ki_area / Ktot_area) / (1 - Kmin_area / Ktot_area) = (Ktot_area - Ki_area) / (Ktot_area - Kmin_area)
Then in [2] and [3], Ri_new = 1 when Ki = 1 and Ri_new_area = 1 when Ki_area = Kmin_area, respectively.
And we should say that Ri for an absent species does not make sense.
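As a worked example of formulas [1] and [2], here is a small base R sketch on toy presence-absence data. The matrix and variable names are made up for illustration; this is not the package's restrictedness function.

```r
# Toy presence-absence matrix: sites in rows, species in columns.
# matrix() fills column by column, so each line of values below is one species.
pres <- matrix(c(1, 1, 1, 1,   # sp_a: present in all 4 sites
                 1, 0, 0, 0,   # sp_b: present in 1 site
                 1, 1, 0, 0),  # sp_c: present in 2 sites
               nrow = 4,
               dimnames = list(paste0("site", 1:4), c("sp_a", "sp_b", "sp_c")))

k_i   <- colSums(pres > 0)  # Ki: number of sites occupied by each species
k_tot <- nrow(pres)         # Ktot: total number of sites

r_i     <- 1 - k_i / k_tot              # formula [1]
r_i_new <- (k_tot - k_i) / (k_tot - 1)  # formula [2]

r_i      # sp_a = 0, sp_b = 0.75, sp_c = 0.5
r_i_new  # sp_a = 0, sp_b = 1,    sp_c = 2/3
```

With [2], the species found in a single site (sp_b) reaches the maximum value of 1, whereas with [1] only a species absent from every site could.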
data("aravo", package = "ade4")
mat = as.matrix(aravo$spe)
tra = aravo$traits[, c("Height", "SLA", "N_mass")]
dist_mat = funrar::compute_dist_matrix(tra)
#> Warning in funrar::compute_dist_matrix(tra): Only numeric traits provided,
#> consider using euclidean distance.
# Warning --------------------------------------------------------------------------------
g = funrar::scarcity(mat)
#> Warning in funrar::scarcity(mat): Provided object may not contain relative abundances nor presence-absence
#> Have a look at the make_relative() function if it is the case
h = funrar::distinctiveness(mat, dist_mat)
#> Warning in funrar::distinctiveness(mat, dist_mat): Provided object may not contain relative abundances nor presence-absence
#> Have a look at the make_relative() function if it is the case
i = funrar::distinctiveness_stack(
  funrar::matrix_to_stack(mat, "abund", "site", "species"),
  "species", "site", "abund", dist_mat)
#> Warning in funrar::distinctiveness_stack(funrar::matrix_to_stack(mat, "abund", : Provided object may not contain relative abundances nor presence-absence
#> Have a look at the make_relative() function if it is the case
# No warning --------------------------------------------------------------------------------
j = funrar::scarcity_stack(
  funrar::matrix_to_stack(mat, "abund", "site", "species"),
  "species", "site", "abund")
Should also make a _stack() version of make_relative()!
Following CESAB group discussion. Suggestion from A. Osling to compute distinctiveness in local communities based on maximum observed functional distance at the local level.
Using a threshold at the site level -> T, T = max(dij) per site for the focal species?
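One possible reading of this suggestion, sketched in base R for presence-absence data: compute each species' mean distance to the other species present in the site, then divide by the largest distance observed among the species of that site. Taking T as a single per-site maximum is an assumption (the question above leaves open whether it should be per focal species), and `local_scaled_distinctiveness()` is a hypothetical name.

```r
# Hypothetical sketch: presence-absence distinctiveness per site, divided by the
# largest distance observed among the species present in that site.
# Assumes dist_mat has a zero diagonal and the same species order as pres.
local_scaled_distinctiveness <- function(pres, dist_mat) {
  stopifnot(identical(colnames(pres), colnames(dist_mat)))
  out <- matrix(NA_real_, nrow(pres), ncol(pres), dimnames = dimnames(pres))
  for (s in seq_len(nrow(pres))) {
    present <- which(pres[s, ] > 0)
    if (length(present) < 2) next  # undefined for a species alone in a site
    d_local <- dist_mat[present, present, drop = FALSE]
    di <- rowSums(d_local) / (length(present) - 1)  # mean distance to the others
    out[s, present] <- di / max(d_local)            # T = max(dij) within the site
  }
  out
}

sp <- c("a", "b", "c")
dist_mat <- matrix(c(0,   0.2, 1,
                     0.2, 0,   0.8,
                     1,   0.8, 0), nrow = 3, dimnames = list(sp, sp))
# Filled column by column: site1 holds a and b, site2 holds a, b and c
pres <- matrix(c(1, 1,
                 1, 1,
                 0, 1), nrow = 2, dimnames = list(c("site1", "site2"), sp))

res <- local_scaled_distinctiveness(pres, dist_mat)
res
```

In this sketch a site whose species are all functionally identical (maximum distance of 0) would yield NaN, so a real implementation would need to handle that case.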
Pierre - I don't really feel like showing all of the calculations again for the sparse matrix/ table version is necessary in rarity_indices.Rmd. Would it be possible to introduce them, exactly as you do, but then just list the function names for the corresponding functions?
Compute distribution of species or individual number per class of Di values in an assemblage (plus regional).
Compare to null model (sampling of regional pool).
Look at deviations and correlation deviations to Di values.
Should we include functional distance matrix computation in the package?
We could let users choose their own metric, because our wrapper compute_dist_matrix() seems quite poor and is not explicit about its choices.
Ricotta et al. 2016 [1] suggested various measures of functional rarity. Some similar to Functional Distinctiveness that can be standardized by maximum value.
We could include standardization into funrar.
[1] Ricotta, C., de Bello, F., Moretti, M., Caccianiga, M., Cerabolini, B. E. L. and Pavoine, S. (2016), Measuring the functional redundancy of biological communities: a quantitative guide. Methods Ecol Evol, 7: 1386–1395. doi:10.1111/2041-210X.12604
The package uses dplyr:::left_join_impl() to join tables, which CRAN advises against. So we should do something about it.
When distinctiveness_stack() produces NaN values because a species is alone in a given site, there are no warnings.
We should add one, and also test all the other functions.
For the functions, is it possible to add either a check for the incorrect data type that returns a warning (e.g. pres_matrix is a dataframe, must be a matrix); or a check that automatically converts df -> matrix and continues the calculations?
Currently no informative error shows up if you input a data.frame, just something like "Error in pres_matrix %*% dist_matrix : requires numeric/complex matrix/vector arguments"
I imagine this would be a common issue.
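Both options mentioned above (an informative error, or an automatic conversion with a warning) can be combined in one small input check. The sketch below is an assumption about how it might be done; `as_site_sp_matrix()` is a hypothetical helper, and it deliberately does not handle sparse Matrix objects.

```r
# Hypothetical input check (not funrar's actual code): convert numeric
# data.frames with a warning, and fail early with a clear message otherwise.
as_site_sp_matrix <- function(x, arg = "pres_matrix") {
  if (is.data.frame(x)) {
    if (!all(vapply(x, is.numeric, logical(1)))) {
      stop("'", arg, "' is a data.frame with non-numeric columns; ",
           "please provide a numeric matrix.", call. = FALSE)
    }
    warning("'", arg, "' is a data.frame; converting it with as.matrix().",
            call. = FALSE)
    x <- as.matrix(x)
  }
  if (!is.matrix(x) || !is.numeric(x)) {
    stop("'", arg, "' must be a numeric matrix.", call. = FALSE)
  }
  x
}

df <- data.frame(a = c(1, 0), b = c(0, 1), row.names = c("s1", "s2"))
m <- suppressWarnings(as_site_sp_matrix(df))  # the warning is silenced here only
is.matrix(m)
```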
Gower distance is the default parameter with compute_dist_matrix.
Possibility to warn the user when only continuous traits are provided.
Ex:
if (all(vapply(traits_table, is.numeric, TRUE))) {
  warning("Only continuous traits provided, you could use Euclidean instead of Gower distance.")
}
Sometimes it can be interesting to avoid subsetting matrices between distances matrices and presence-absence matrices.
From @pierredenelle on April 18, 2016 16:25
When:
length(rownames(site_sp_matrix)) > length(rownames(tra_table))
the error message is not clear.
Display something like: "more species in site_species matrix than in provided distance matrix."
Copied from original issue: Rekyt/project_divgrass#2
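A sketch of the clearer error message, assuming species are stored as column names of the site-species matrix and as row names of the distance matrix (the original project may have used a different layout). `check_species_match()` is a hypothetical name.

```r
# Hypothetical check: fail early, and name the species that have no distance.
check_species_match <- function(site_sp, dist_mat) {
  missing_sp <- setdiff(colnames(site_sp), rownames(dist_mat))
  if (length(missing_sp) > 0) {
    stop("More species in site-species matrix than in provided distance matrix. ",
         "Missing from distance matrix: ", paste(missing_sp, collapse = ", "),
         call. = FALSE)
  }
  invisible(TRUE)
}

site_sp  <- matrix(1, 1, 3, dimnames = list("s1", c("a", "b", "c")))
dist_mat <- matrix(0, 2, 2, dimnames = list(c("a", "b"), c("a", "b")))

# Capture the message instead of stopping the script
msg <- tryCatch(check_species_match(site_sp, dist_mat), error = conditionMessage)
msg
```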
Need to fix that: distinctiveness() warns the user when using relative abundance while it doesn't change the results of the computation. The warnings should go.
Just create aliases with names tidy_to_matrix() and matrix_to_tidy(), as well as aliases for all index_stack()
For the moment, we made our computations on tidy tables with columns for sites and species. However, most of the time, datasets are provided as site-species matrices with abundances in cells.
We developed pres_distinctiveness() to compute distinctiveness over such a matrix, but it uses a tidy table in the background.
We could try to find or write a function to convert between the two types of data presentation.
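For illustration, the conversion between the two layouts can be sketched in base R as follows. This is only one possible approach with made-up toy data; the stack_to_matrix() and matrix_to_stack() functions mentioned elsewhere in these issues serve the same purpose in the package.

```r
tidy <- data.frame(site    = c("s1", "s1", "s2"),
                   species = c("a", "b", "a"),
                   abund   = c(3, 1, 2),
                   stringsAsFactors = FALSE)

# Tidy table -> site-species matrix (absent combinations become 0)
mat <- unclass(xtabs(abund ~ site + species, data = tidy))
attr(mat, "call") <- NULL  # drop the attribute left over from xtabs()

# Site-species matrix -> tidy table, dropping absences
back <- as.data.frame(as.table(mat), responseName = "abund",
                      stringsAsFactors = FALSE)
back <- back[back$abund > 0, ]

mat
back
```

Dropping the zero rows on the way back assumes that a zero abundance always means absence.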
A very simple function that would create directly the way to compute distinctiveness with each species equally present using an entire distance/dissimilarity matrix:
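library(dplyr)  # provides %>% and select()

# Assumes dist_matrix is a "dist" object carrying species labels
get_di_from_dist = function(dist_matrix, di_name = "global_di") {
  funrar::distinctiveness(
    # Single pseudo-site "global" in which every species is present
    matrix(1, nrow = 1, ncol = length(labels(dist_matrix)),
           dimnames = list(site = "global",
                           species = labels(dist_matrix))),
    as.matrix(dist_matrix)) %>%
    funrar::matrix_to_stack(di_name, "global", "species") %>%
    select(-global)
}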
funrar/vignettes/sparse_matrices.Rmd
Line 7 in f1b04d0
By depending on the Matrix package we can simplify the code as in fundiversity
How are different types of abundance data dealt with currently in outlier? Does it matter if the data is count data, relative abundance data, % cover, etc?
See: https://cran.r-project.org/web/checks/check_results_funrar.html
devel version of R fails because of updates to R 4.0.0 (https://developer.r-project.org/Blog/public/2020/02/16/stringsasfactors/index.html)
Can you please fix your package to work with both the old and new
default? In principle, this can easily be achieved by adding
stringsAsFactors = TRUE to the relevant calls to data.frame() or
read.table() [or other read.* function calling read.table()], but please
only do this if the sort order used in the string to factor conversion
really does not matter (see the blog post about the locale dependence of
the conversion). Otherwise, please change to create the factors with
explicitly given levels.
Please correct before 2020-03-20
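A minimal sketch of the second option in the request, creating factors with explicitly given levels so that the result depends neither on the `stringsAsFactors` default nor on the locale. The toy species names are made up, and where funrar builds such data frames is not shown here.

```r
species <- c("sp_b", "sp_a", "sp_c")

# Explicit levels: method = "radix" sorts character vectors in the C locale,
# so the level order is the same on every machine.
lv <- sort(unique(species), method = "radix")

df <- data.frame(species = factor(species, levels = lv),
                 stringsAsFactors = FALSE)
levels(df$species)
```

Keeping species as plain character vectors throughout and never converting them would be the other way to get the same behaviour under both defaults.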
One user contacted me by email because he did not know that the distance metric needed to be standardized between 0 and 1 prior to computing functional distinctiveness. This is especially true for Euclidean distances.
We need to add this in the documentation.
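For the documentation, a short example along these lines might help. Dividing by the maximum observed distance is one simple way to bring distances into the 0 to 1 range; it is shown here as an illustration with toy data, not as the only valid standardization.

```r
tr <- data.frame(height = c(10, 20, 40), row.names = c("a", "b", "c"))

d <- as.matrix(dist(tr))  # Euclidean distances, unbounded above
d_scaled <- d / max(d)    # now between 0 and 1

range(d)         # 0 30
range(d_scaled)  # 0  1
```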
Line 8-18 the text isn't clear in English.
To describe new functionalities of distinctiveness_global(), distinctiveness_range() and others!
We consider species as characters, but most of the time default R behavior transforms them into factors. We should take that into consideration and comply with it.
One reviewer suggested adding trait standardization/scaling when using compute_dist_matrix().
In future versions it should be possible to include scaling of continuous traits by range or variance, or to suggest other functions to use.
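The two scaling options could be sketched as below. `scale_traits()` and its `method` argument are hypothetical and do not exist in funrar; "variance" is interpreted here as centering and dividing by the standard deviation.

```r
# Hypothetical helper: scale numeric traits, leave other columns untouched.
scale_traits <- function(traits, method = c("range", "variance")) {
  method <- match.arg(method)
  num <- vapply(traits, is.numeric, logical(1))
  traits[num] <- lapply(traits[num], function(x) {
    if (method == "range") {
      # Rescale to [0, 1]
      (x - min(x, na.rm = TRUE)) / diff(range(x, na.rm = TRUE))
    } else {
      # Center to mean 0 and scale to unit standard deviation
      (x - mean(x, na.rm = TRUE)) / stats::sd(x, na.rm = TRUE)
    }
  })
  traits
}

tr <- data.frame(height = c(1, 2, 3), growth_form = c("x", "y", "z"))
by_range <- scale_traits(tr)
by_var   <- scale_traits(tr, "variance")
by_range$height  # 0.0 0.5 1.0
by_var$height    # -1 0 1
```

A constant trait would produce a division by zero in either branch, so a real implementation would need to catch that case.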
Go through text in files and ensure that a consistent term is used and it does not include the word "table".
I like site-species matrix (especially if that is defined somewhere).
Using either dataframe or community table can be confusing.