Git Product home page Git Product logo

Comments (5)

maurolepore avatar maurolepore commented on June 21, 2024

All that follow is by @gabrielareto (via @).

we need, in general, code to go from quantitative variables to qualitative variables, to allow flexible splitting of the data. This is code to go from dbh to size class.

from_dbh_to_dbh_class <- function(dbh, k, breaks = NULL, ...)
{
  if(is.null(breaks)) breaks <- exp(c(seq(from = log(10), to = 7, length.out = k)))
  cut(dbh, breaks = breaks, ...)
} 

it assumes minimum dbh of 10 (mm) and that everything above 1 m (~ exp(7) mm) is the largest size class. It assumes size classes evenly distributed on the log scale. I think it makes sense for trees, in our plots. In some plots with small trees, maybe exp(6) is a more appropriate lower bound for the largest size class. An extra line could merge two size classes if the largest size class is "too empty".

we need something similar to calculate the quadrat for every individual, at any grid size. This code calculates a boolean matrix of individuals by quadrats. It's vectorized, but maybe that matrix is too big for a full dataset and slow because of memory issues. Think about it.

gridsize = 20
g <- expand.grid(seq(0, 1000 - gridsize, by = gridsize), seq(0, 500 - gridsize, by = gridsize))

boo1 <- outer(gx, g[,1], '>=')
boo2 <- outer(gx, g[,1] + gridsize, '<')
boo3 <- outer(gy, g[,2], '>=')
boo4 <- outer(gy, g[,2] + gridsize, '<')
boo <- boo1 & boo2 & boo3 & boo4

the final function should return quadrat codes with consistent names. I suggest x0_y0 format, using coordinates in meters of the bottomleft corner of each quadrat, using 4 digits always because some plots have >1km but all have <10km length. A utility function refill_zeros <- function(x, nchar = 4) should be useful.

it may return also the lx, ly coordinates, lx = gx %% gridsize.

these quantity-to-category functions could either return the categories, or append new columns to the full dataset, returning a larger full table. In any case, all of them should behave similarly.

best regards,

Gabriel

from fgeo.tool.

maurolepore avatar maurolepore commented on June 21, 2024

From @gabrielareto on August 31, 2017 21:5

vectorized function to add zeros to each position in character vectors

add_zeroes <- add_zeros <- function(x, min.n.char = 4)
{
n.0s.missing <- pmax(0, min.n.char - nchar(x))
paste0(strrep("0", n.0s.missing), x)
}

note it doesn't work with a matrix, sanity checks required


This is already implemented in stringr::str_pad()

add_zeroes <- add_zeros <- function(x, min.n.char = 4) {
  n.0s.missing <- pmax(0, min.n.char - nchar(x))
  paste0(base::strrep("0", n.0s.missing), x)
}
add_zeroes(1)
#> [1] "0001"

stringr::str_pad(1, width = 4, pad = 0)
#> [1] "0001"

stringr::str_pad(1, width = 4, pad = "k")
#> [1] "kkk1"

Created on 2018-12-30 by the reprex package (v0.2.1)

from fgeo.tool.

maurolepore avatar maurolepore commented on June 21, 2024

From @gabrielareto on August 31, 2017 21:46

individuals_into_quadrats <- function(x, gridsize = 20, xmax, ymax)
{
  # Some grid sizes are not common multiples of plot dimensions
  bad <- which(!(xmax %% gridsize == 0 & ymax %% gridsize == 0))
  if(length(bad) > 0) {
    warning(paste0("These grid sizes won't be used:", paste(gridsize[bad], collapse = ",")))
    gridsize <- gridsize[-bad]
  }
  
  # Make many grids and calculate quadrat codes in left_bott format
  q <- sapply(gridsize, function(gs){
    g <- expand.grid(seq(0, xmax - gs, by = gs), seq(0, ymax - gs, by = gs))
    boo1 <- outer(x$gx, g[,1], '>=')
    boo2 <- outer(x$gx, g[,1] + gs, '<')
    boo3 <- outer(x$gy, g[,2], '>=')
    boo4 <- outer(x$gy, g[,2] + gs, '<')
    boo <- boo1 & boo2 & boo3 & boo4
    w <- apply(boo, 1, which)
    Q1 <- add_zeros(g[w, 1], min.n.char = 4)
    Q2 <- add_zeros(g[w, 2], min.n.char = 4)
    paste0(Q1, "_", Q2)
  })
  colnames(q) <- paste0("Q", gridsize)
  
  # Calculate local coordinates for each grid --- (yes, this could/should be a different function)
  lx <- outer(x$gx, gridsize, '%%')
  ly <- outer(x$gy, gridsize, '%%')
  colnames(lx) <- paste0(colnames(q), "_lx")
  colnames(ly) <- paste0(colnames(q), "_ly")
  local.coor <- round(cbind(lx, ly), 1)
  # (sort columns in x,y alternance)
  local.coor <- local.coor[,c(seq(1, ncol(local.coor), by = 2), seq(2, ncol(local.coor), by = 2))]
  
  # Join and return
  data.frame(q, local.coor)
}

qq <- individuals_into_quadrats(bci12t3mini, gridsize = c(5, 10, 20, 100, 3.3832893274), xmax = 1000, ymax = 500)
Warning message:
In individuals_into_quadrats(bci12t3mini, gridsize = c(5, 10, 20, :
These grid sizes won't be used:3.3832893274
head(qq)
Q5 Q10 Q20 Q100 Q5_lx Q20_lx Q5_ly Q20_ly Q10_lx Q100_lx Q10_ly Q100_ly
1 0895_0040 0890_0040 0880_0040 0800_0000 3.7 18.7 2.0 2.0 8.7 98.7 2.0 42.0
2 0870_0280 0870_0280 0860_0280 0800_0200 3.2 13.2 4.2 4.2 3.2 73.2 4.2 84.2
3 0865_0160 0860_0160 0860_0160 0800_0100 1.6 6.6 3.4 3.4 6.6 66.6 3.4 63.4
4 0740_0300 0740_0300 0740_0300 0700_0300 4.0 4.0 4.9 4.9 4.0 44.0 4.9 4.9
5 0720_0445 0720_0440 0720_0440 0700_0400 3.8 3.8 1.9 6.9 3.8 23.8 6.9 46.9
6 0615_0435 0610_0430 0600_0420 0600_0400 1.6 16.6 0.8 15.8 6.6 16.6 5.8 35.8

from fgeo.tool.

maurolepore avatar maurolepore commented on June 21, 2024

From @gabrielareto on August 31, 2017 22:0

we need to decide if we want functions that take any input and return any output, or functions that take the table as input and an expanded table as output. The first is more general (anyone could use for their datasets), the second includes a step that we think necessary in our workflow.

maybe arguments should be passed as

f(census.data = NULL, gx = NULL, gy = NULL, etc)

if the census.data is the input, the output could be the expanded census.data. If not, just return what is new.

Example with the dbh to size class function:

from_dbh_to_dbh_class <- function(census.data = NULL, dbh = NULL, k, breaks = NULL, ...)
{
  if(!is.null(census.data)) dbh <- census.data$dbh
  if(is.null(breaks)) breaks <- exp(c(seq(from = log(10), to = 7, length.out = k)))
  out <- cut(dbh, breaks = breaks, ...)
  if(!is.null(census.data)) out <- data.frame(census.data, out)
  out
}

That's an important trade off. Whenever possible, I build general functions. But often times the interface is simplest if the input is ForestGEO-like data. The two kind of functions are tagged with a #' @family. That way it's relatively easy to find general or specific functions.

from fgeo.tool.

maurolepore avatar maurolepore commented on June 21, 2024

From @gabrielareto on September 12, 2017 21:46

This function reads a vector of quadrat ID's in the x0_y0 format (explained above) and returns the "traditional" Q20 code in 'column row' format. This should be useful only to explore data in other formats, from some habitat maps, etc, but the forestr should not rely on this naming convention.

It allows any separation character, and declare what is the first column or row (0 or 1).

from_Q20_x0y0_format_to_colrow_format <- function(x, first = 0, sep = "")
{
	first = as.numeric(first)
	if(length(first) != 1) stop("'first' must be of length one")
	if(!first %in% c(0, 1)) warning("It is very weird that the first column or row is different than 00 or 01...")
	
	col.row.codes <- sapply(strsplit(as.character(x), split = "_"), function(y) {
		out = NA
		y <- as.numeric(y)
		if(all(y %% 20 == 0))
		{
			y <- y/20
			if(first == 1) y <- y + 1
			y <- add_zeros(y, min.n.char = 2)
			out <- paste(y, collapse = sep)
		}
		return(out)
	})
	if(all(is.na(col.row.codes))) stop("The bottom-left corner coordinates are not multiple of 20 m")
	return(col.row.codes)
}

Something similar is already implemented

# Similar
library(fgeo)
#> -- Attaching packages ------------------------------------------------------------------- fgeo 0.0.0.9002 --
#> v fgeo.x       0.0.0.9000     v fgeo.analyze 0.0.0.9003
#> v fgeo.plot    0.0.0.9402     v fgeo.tool    0.0.0.9005
#> -- Conflicts --------------------------------------------------------------------------- fgeo_conflicts() --
#> x fgeo.tool::filter() masks stats::filter()

census <- tibble(
  gx = c(0, 20, 40), 
  gy = c(0, 20, 40)
)

add_col_row(census, gridsize = 20, plotdim = c(100, 100))
#> # A tibble: 3 x 4
#>      gx    gy col   row  
#>   <dbl> <dbl> <chr> <chr>
#> 1     0     0 01    01   
#> 2    20    20 02    02   
#> 3    40    40 03    03

Created on 2018-12-30 by the reprex package (v0.2.1)

from fgeo.tool.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.