
sok-marine-biodiversity's Introduction

The state of knowledge for marine invertebrate biodiversity in the continental US

Work in progress. This project aims to synthesize the publicly available data on marine invertebrate biodiversity within the continental US.

Build Docker container

docker build -t sok:latest .

Run container

sudo docker run --rm -it -v /home/francois/sok-marine-biodiversity/:/sok -v /home/francois/sok-marine-biodiversity/db/data:/var/lib/postgresql/data sok:latest /bin/bash

sok-marine-biodiversity's People

Contributors

fmichonneau

Forkers

amdevine

sok-marine-biodiversity's Issues

write taxonomic validation tests

on the idigbio_records and related, it would be good to check, using the WoRMS classification (and not the taxonomy from iDigBio), that all species are actually inverts (it seems that a few fish may still have made their way through the data clean-up process).
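A minimal sketch of what such a test could look like, in Python for illustration only. It assumes each record already carries `phylum` and `subphylum` values taken from the WoRMS classification; the field names, the helper names, and the rule "a chordate is an invertebrate only if it is a tunicate or a cephalochordate" are assumptions, not part of the existing code base.

```python
# Hypothetical record layout: dicts with "phylum" and "subphylum" keys
# filled from the WoRMS classification (not from the iDigBio taxonomy).
INVERT_CHORDATE_SUBPHYLA = {"tunicata", "urochordata", "cephalochordata"}


def is_invertebrate(record):
    """Return True only when the classification confirms an invertebrate."""
    phylum = (record.get("phylum") or "").strip().lower()
    subphylum = (record.get("subphylum") or "").strip().lower()
    if not phylum:
        # No phylum: cannot confirm, so flag the record for review.
        return False
    if phylum != "chordata":
        return True
    # Within Chordata, keep only the invertebrate subphyla.
    return subphylum in INVERT_CHORDATE_SUBPHYLA


def find_non_invertebrates(records):
    """Return the records that could not be confirmed as invertebrates."""
    return [r for r in records if not is_invertebrate(r)]
```

A validation test would then assert that `find_non_invertebrates(records)` is empty, and print the offending records otherwise.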

control for diversity in `records_rare_phyla`

the numbers for the rare phyla are possibly skewed by low diversity in some phyla (e.g. only 10 spp of phoronids occur in the US). Need to control for diversity; could use phylum estimates from Appeltans et al. 2012.
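One simple way to control for diversity would be to report records per known species rather than raw record counts. The sketch below (Python, for illustration) assumes two dictionaries keyed by phylum; the function name and the inputs are hypothetical, and the richness values would have to come from a real source such as the phylum estimates mentioned above.

```python
def records_per_species(record_counts, species_richness):
    """Divide the number of records in each phylum by its species richness.

    Both arguments are dicts keyed by phylum name. A phylum without a
    positive richness estimate raises an error rather than being
    silently dropped.
    """
    normalized = {}
    for phylum, n_records in record_counts.items():
        richness = species_richness.get(phylum)
        if not richness or richness <= 0:
            raise ValueError(f"no usable richness estimate for {phylum}")
        normalized[phylum] = n_records / richness
    return normalized
```

With this normalization, a phylum with few records but very few species is no longer automatically ranked as poorly sampled.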

talk about rate of biodiversity loss

from JAM:

I would probably start also with something on the rate of biodiversity loss, or lack of knowledge in the oceans or something... highlighting the importance of this work. If you can, put some sort of number there and then say this is likely a huge underestimate due to the fact that we don't really know what is there.

rename add_bold to something

the add_ prefix suggests we are adding info to the original dataset, while (at least in its current form) this function gets the stats for each species in the original dataset

in `stat_bold` remove singletons

in the estimation of the proportion of species that include cryptics, remove singletons, as we can't use them if there is a single record for that species
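A sketch of the intended filtering, in Python for illustration. It assumes per-species summaries with a record count and a flag saying whether cryptic lineages were detected; the key names `n_records` and `has_cryptic` and the function name are hypothetical, and how the flag is computed is outside this sketch.

```python
def proportion_with_cryptics(species_stats, min_records=2):
    """Proportion of species flagged as containing cryptics, singletons excluded.

    species_stats is a list of dicts with "n_records" and "has_cryptic".
    Species with fewer than min_records records are dropped first, since
    a single record cannot reveal more than one lineage.
    """
    usable = [s for s in species_stats if s["n_records"] >= min_records]
    if not usable:
        return None  # nothing left to estimate from
    with_cryptics = sum(1 for s in usable if s["has_cryptic"])
    return with_cryptics / len(usable)
```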

get idigbio records for species that are "not in the database"

for iDigBio, a quick manual search on the 750+ species of molluscs and arthropods listed as "not in database" shows that these species can actually be in iDigBio but lack GPS coordinates (they only have generic locality info, e.g., Cuba, Florida, Texas), so nothing places them in the Gulf list when searching on geographic coordinates.

Maybe search by species names to estimate how many that could represent.

rename project

maybe status-invertebrates-us or us-marine-biodiversity

discuss trade off with older collections

in discussion: important to digitize older records/improve their geographic info; however, potentially can't be used for DNA (low success rate, and most specimens fixed in formalin).

mention absence of FWRI collections

mention (after double checking) that the FWRI collections are probably the most comprehensive for the Gulf of Mexico but they haven't been digitized yet.

Ideas to explore diversity

splitting the map into regions would allow me to compare richness along the coast. Maybe a simple similarity index + tree could do the job
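As one possible similarity index, the Jaccard index on species lists per region could be sketched as follows (Python, for illustration). The region names and the function names are hypothetical, and Jaccard is only one candidate among many indices; treating two empty regions as identical is also an assumption.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity: shared species over all species in either region."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # assumption: two empty regions count as identical
    return len(a & b) / len(union)


def pairwise_similarity(species_by_region):
    """Return {(region1, region2): similarity} for every pair of regions."""
    regions = sorted(species_by_region)
    return {
        (r1, r2): jaccard(species_by_region[r1], species_by_region[r2])
        for r1, r2 in combinations(regions, 2)
    }
```

Converting the similarities to distances (1 - similarity) would give a matrix that a hierarchical clustering routine could turn into the tree.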

create table that shows data quality

a minimal dashboard that shows:

  • for species name:
    • percent that have a match in WoRMS
    • percent identified at species level
    • misspellings
  • for dates:
    • outside valid range
    • empty
  • for geographic coordinates
    • empty
    • wrong?
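A rough sketch of how part of this table could be computed, in Python for illustration. The record keys (`date`, `lat`, `lon`, `worms_match`) and the function names are hypothetical. "Wrong" coordinates are reduced here to values outside the valid latitude/longitude range, and the species-level and misspelling checks are left out.

```python
from datetime import date


def percent(part, whole):
    """Percentage of part in whole, 0.0 when whole is empty."""
    return 100.0 * part / whole if whole else 0.0


def quality_summary(records, min_date, max_date):
    """Summarize basic data-quality problems as percentages of all records."""
    n = len(records)
    has_coords = lambda r: r.get("lat") is not None and r.get("lon") is not None
    in_range = lambda r: -90 <= r["lat"] <= 90 and -180 <= r["lon"] <= 180
    return {
        "worms_match": percent(sum(1 for r in records if r.get("worms_match")), n),
        "date_empty": percent(sum(1 for r in records if r.get("date") is None), n),
        "date_out_of_range": percent(
            sum(1 for r in records
                if r.get("date") is not None
                and not (min_date <= r["date"] <= max_date)), n),
        "coords_empty": percent(sum(1 for r in records if not has_coords(r)), n),
        "coords_out_of_range": percent(
            sum(1 for r in records if has_coords(r) and not in_range(r)), n),
    }
```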

insist more about the need for DNA in intro

From JAM:

I would also talk about the issue of identifying species and needing DNA. I think that most people won't get this. I would indicate that this is one of the only methods for finding cryptic species and all sorts of diversity. Give some numbers on 'perceived species' due to morphological analysis and how that changed with DNA identification in some areas ... if you can. That would help to highlight some of the major hurdles in this group.

is the chordata data complete?

I should double check that the queries for the chordata are general enough that I'm including records from institutions that may have recorded them as urochordata, etc...
