shawnlaffan / biodiverse

A tool for the spatial analysis of diversity

Home Page: http://shawnlaffan.github.io/biodiverse/

License: GNU General Public License v3.0

Perl 99.63% R 0.12% Shell 0.06% Batchfile 0.04% Prolog 0.10% Raku 0.06%

spatial-analysis phylogeography phylogenetic-trees phylogenetic-diversity phylodiversity endemism beta-diversity species-turnover randomisations biodiverse

Biodiverse

Biodiverse is a tool for the spatial analysis of diversity using indices based on taxonomic, phylogenetic, trait and matrix-based (e.g. genetic distance) relationships, as well as related environmental and temporal variations.

DOWNLOAD: Biodiverse can be downloaded from https://github.com/shawnlaffan/biodiverse/wiki/Downloads

Biodiverse supports the following processes:

Linked visualisation of data distributions in geographic, taxonomic, phylogenetic and matrix spaces;
Spatial moving window analyses including richness, endemism, phylogenetic diversity and beta diversity;
Spatially constrained agglomerative cluster analyses;
Spatially constrained region grower analyses;
Interactive visualisation of turnover patterns (for example beta-diversity); and
Randomisations for hypothesis testing.

Biodiverse is open-source and supports user-developed extensions. It can be used both through a graphical user interface (GUI) and through user-written scripts.

More than 300 indices are supported. See the Indices page.

Screen shots can be found on the ScreenShots page.

Example applications can be seen at the publications page.

Help can be located via the help pages (these are also accessible via the wiki link on the right of this page).

A discussion group is at http://groups.google.com.au/group/biodiverse-users and a blog at http://biodiverse-analysis-software.blogspot.com.au/

To cite Biodiverse or acknowledge its use, use the following details, substituting the version of the application that you used for "Version 1.0".

Laffan, S.W., Lubarsky, E. & Rosauer, D.F. (2010) Biodiverse, a tool for the spatial analysis of biological and related diversity. Ecography. Vol 33, 643-647 (Version 1.0).

An overview of the system is also provided in Dan Rosauer's talk at TDWG2008:

Rosauer, D.F. & Laffan, S.W. (2008) Linking phylogenetic trees, taxonomy & geography to map phylogeography using Biodiverse. Taxonomic Data Working Group 2008, Perth, Australia. PPT SWF with audio.

For a list of publications using Biodiverse, see the PublicationsList page.

Installation

Installation instructions can be accessed through the Installation page.

News

See http://shawnlaffan.github.io/biodiverse/#news

Acknowledgements

This research has been supported by Australian Research Council Linkage Grant LP0562070 (Laffan and West) and UNSW FRGP funding to Laffan.

Much of the original GUI coding was by Eugene Lubarsky. Substantial contributions to the project have also been made by Michael Zhou and Anthony Knittel. Unfortunately the code history does not show their contributions as authorship details were lost as we transitioned from Google Code to GitHub.

Persistent URL

http://www.purl.org/biodiverse

Keywords

Biodiversity analysis tool, spatial analysis, phylogeography, endemism, phylogenetic diversity, beta diversity, species turnover

biodiverse's Issues

Disable rename button on outputs tab when a specific output type (not the output object) is selected

Rename works when a basedata or output object is selected.

When a specific output item (eg richness or PD) is selected, the Delete and 
Export buttons are disabled because these actions can't be done for a single 
item.

Rename also can't be done in this case, so the button should be disabled.

Original issue reported on code.google.com by danielrosauer on 20 Oct 2009 at 5:40

ReadNexus skips first item in the translate table

This has been corrected for version 0.12

Original issue reported on code.google.com by shawnlaffan on 7 Nov 2009 at 1:14

make scree plot smaller when first plotting dendrogram

The scree plot below the dendrogram (cluster and view labels tabs) occupies
most of the vertical space in that section of the window.  It needs to be
smaller to begin with.

Original issue reported on code.google.com by shawnlaffan on 19 Nov 2009 at 4:31

Next selected basedata is undefined after deleting the currently selected one

What steps will reproduce the problem?
1.  Delete the selected basedata object in the GUI
2.  The selection list defaults to a blank entry


What is the expected output? What do you see instead?
It should choose the next on the list, or 'none' if there are no more.  

Clicking on another basedata object corrects it, but choosing another from
the drop down list causes issues with any subsequent analyses.
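The expected behaviour described above can be sketched as follows. This is a minimal illustration in Python rather than the GUI's own Perl code, and the function name `select_after_delete` is hypothetical.

```python
def select_after_delete(names, deleted):
    """Return (remaining names, new selection) after deleting one entry."""
    index = names.index(deleted)
    remaining = names[:index] + names[index + 1:]
    if not remaining:
        return remaining, None  # nothing left, so report 'none'
    # The entry that followed the deleted one now sits at `index`;
    # fall back to the new last entry if the deleted one was last.
    return remaining, remaining[min(index, len(remaining) - 1)]


remaining, selected = select_after_delete(["bd1", "bd2", "bd3"], "bd2")
```

The point of the sketch is that the selection is always either a real entry or an explicit "none", never a blank placeholder.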

Original issue reported on code.google.com by shawnlaffan on 15 Nov 2009 at 8:08

Add chooser to export interface to make available formats obvious

The export system currently uses the file extension to determine which
format to use.  This needs to be converted to a chooser/combobox to make
the choices more obvious, with associated greying out or hiding of
irrelevant options.

Original issue reported on code.google.com by shawnlaffan on 4 Nov 2009 at 7:03

randomisation export fails

The randomisation export methods fail in the GUI, leaving a window with the
name 'label120'.

Original issue reported on code.google.com by shawnlaffan on 29 Oct 2009 at 9:13

sp_match_text does not work when used in a definition query.

sp_match_text does not work when used in a definition query.

Does not match the axes.

Original issue reported on code.google.com by shawnlaffan on 19 Oct 2009 at 9:39

Move subroutine metadata into separate subs

Metadata for subroutines is used to build much of the GUI, determining what
options are available at which time.  

This is currently stored as an if-block within each subroutine.  This is a
fragile approach as calls made to subs without metadata can have
undesirable results.  In many cases the metadata is also longer than the
analysis code.

Moving the metadata into separate subs will simplify things, make the code
less cluttered, and should speed up a number of the processes where subs
are called repeatedly.

All main code has been ported to this system for version 0.12.
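A rough sketch of the separation described above, written in Python for illustration only (the project code is Perl). The names `calc_richness`, `get_metadata_calc_richness` and the `METADATA` registry are hypothetical and are not taken from the Biodiverse source.

```python
def calc_richness(labels):
    """Analysis code only: count the distinct labels in a neighbour set."""
    return {"RICHNESS": len(set(labels))}


def get_metadata_calc_richness():
    """Metadata only: what a GUI would need in order to list this calculation."""
    return {
        "name": "Richness",
        "description": "Count of distinct labels",
        "indices": {"RICHNESS": "Number of distinct labels"},
    }


# Hypothetical registry pairing each calculation with its metadata getter.
METADATA = {"calc_richness": get_metadata_calc_richness}


def get_metadata(calc_name):
    """Look up metadata without ever calling the calculation itself."""
    getter = METADATA.get(calc_name)
    if getter is None:
        raise KeyError(f"no metadata sub defined for {calc_name}")
    return getter()


meta = get_metadata("calc_richness")
result = calc_richness(["a", "a", "b"])
```

With this layout a missing metadata sub fails loudly at lookup time, and the analysis sub no longer has to test on every call whether it is being asked for metadata.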

Original issue reported on code.google.com by shawnlaffan on 8 Nov 2009 at 9:59

Cluster analysis export window not correctly sized WRT args

The cluster analysis export window is initially not properly sized, causing
some text to be cut off at the bottom. Also, a dividing line near the
bottom crosses over some of the text.  The window may have been originally
sized properly, but new options were added causing the window to be too short.

Could be "fixed" by adding a scroll bar to the window but separating it
into more than one window would be more effective, as then arguments
relevant to the file type can be specified.

Original issue reported on code.google.com by shawnlaffan on 25 Sep 2009 at 6:25

Separate the endemism scalar and list indices into separate subs

The endemism analyses (calc_endemism_central and calc_endemism_whole) both
return additional lists to allow exploration of species ranges and weights
used in the calculations.  

These are very useful, but can use large amounts of memory for large data
sets.  

They should be optional rather than default.

Original issue reported on code.google.com by shawnlaffan on 9 Dec 2009 at 5:30

Tree pane - incorrect plot method highlighted in menu

When displaying a tree object, it is initially shown plotted by node
length, but under the options button the "depth" mode is shown as being
selected. This corrects itself when reselecting the display mode.

Original issue reported on code.google.com by shawnlaffan on 25 Sep 2009 at 6:27

Launch help files in default browser

The help menu options currently open a window with a hyperlink that needs
to be copied across to a browser.  

The browser should instead be launched automatically.

Original issue reported on code.google.com by shawnlaffan on 19 Dec 2009 at 3:06

GUI does not remove overwritten outputs from output tab listing

What steps will reproduce the problem?
1. Re-run an analysis (possibly one that was cancelled), overwriting the
existing output.  

What is the expected output? What do you see instead?
The outputs tab lists the output twice.  The upper one points to the old
version, the lower one to the new version.  This will continue as analyses
are overwritten.

Original issue reported on code.google.com by shawnlaffan on 12 Oct 2009 at 12:46

Warn user if they try to add an output and the basedata has existing randomisations

Doing this will mean the new outputs are out of synch for any additional
iterations of the existing randomisations.

Warn the user and let them decide.

Original issue reported on code.google.com by shawnlaffan on 12 Nov 2009 at 10:43

Memory leak in the randomisation analyses

The system consumes all the available memory over a series of iterations
(how fast depends on the size of the data set).  

Eventually it gets to the point that there is not enough memory to save the
data set or run any further analyses.  This occurs on WinXP, but is likely
to occur on all systems.

A workaround is to run the analysis for some number of iterations, save,
close and restart the application (this clears the memory), and then resume
the analysis.  Repeat solution as needed. 

The load_and_randomise_wrapper.pl script in the Biodiverse/bin folder has
been developed as a means of automating the close and restart problem but
will likely need modification for some problems.

Original issue reported on code.google.com by shawnlaffan on 25 Sep 2009 at 6:43

Nexus files with a ; instead of a , after the last taxon name in the translate block are not loaded correctly

Nexus files with a ; instead of a , after the last taxon name in the 
translate block are not loaded correctly.

Instead, the terminal node corresponding to the last name is named as an 
internal node.

See attached file, line 126

This seems to be the format which Mesquite produces.  I'm not sure what 
other programs also do.

Solution: read the last name of a translate block, even if a ; is reached 
instead of the comma which follows other lines.

Workaround: in the meantime, open the tree in a text editor, and add a , 
after the last name in the translate block.
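The proposed solution can be sketched as below. This is an illustrative Python parser, not the ReadNexus code, and the function name `parse_translate_block` is an assumption. It deliberately ignores commas inside quoted names, which a full parser would need to handle.

```python
import re


def parse_translate_block(block):
    """Parse a NEXUS 'translate' command into {token: taxon name}.

    The final entry is kept whether it is followed by ',' or by the ';'
    that terminates the command.
    """
    # Drop the leading keyword and the terminating semicolon.
    body = re.sub(r"^\s*translate\b", "", block.strip(), flags=re.IGNORECASE)
    body = body.rstrip().rstrip(";")

    mapping = {}
    for entry in body.split(","):
        entry = entry.strip()
        if not entry:  # tolerate a trailing comma before the ';'
            continue
        token, name = entry.split(None, 1)
        mapping[token] = name.strip().strip("'\"")
    return mapping


example = """translate
    1 Genus_alpha,
    2 Genus_beta,
    3 'Genus gamma';"""
names = parse_translate_block(example)
```

Stripping the quotes at this stage also keeps the names comparable with unquoted labels elsewhere.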

Original issue reported on code.google.com by danielrosauer on 21 Oct 2009 at 9:48

double counting of elements when neighbour sets are identical, sp_self_only()

What steps will reproduce the problem?
1. Set both neighbour sets to be self_only() in a spatial analysis
2. Run the analysis.


What is the expected output? What do you see instead?

Warnings are raised 
[INDICES] DOUBLE COUNTING OF ELEMENTS IN calc_abc, 1 + 1 > 1


This is likely due to an optimisation in Spatial.pm not setting the
exclusions properly.

Original issue reported on code.google.com by shawnlaffan on 14 Oct 2009 at 10:43

Gray scale coloring would be nice (or symbols)

What steps will reproduce the problem?
1. My collaborator is color-blind. For instance, the red and blue chosen in
Biodiverse are almost indistinguishable for him.
2. I realize that this is going to be very low priority, but being able to
choose either grayscale or symbols instead of colors would be useful.

Original issue reported on code.google.com by [email protected] on 3 Nov 2009 at 3:42

Add list indices to rarity calculations

The endemism calculations allow the extraction of lists containing the
labels and their ranges and weights (see wiki page
Indices#Endemism_and_Rarity ).  These need to also be available for the
rarity indices, as they are calculated using the same algorithm but with
sample counts instead of ranges.

The process of implementing this will also require the calc_rarity subs to
be modified in line with the calc_endemism subs (as per issue #46).

Original issue reported on code.google.com by shawnlaffan on 26 Dec 2009 at 10:15

Colours do not refresh on change from hue to sat or vice versa

In the spatial analysis and cluster map panes, changing the colour scheme
from “hue” to “sat” or vice versa results in the colours disappearing.  

The view is “fixed” by switching to another index and then back (or
adjusting the slider on the cluster plot), indicating that it's likely some
sort of display refresh problem.

Original issue reported on code.google.com by shawnlaffan on 25 Sep 2009 at 6:19

Basedata import fails for undefined/null group field

Import a data file with one or more records containing undefined values
for the group fields.  

The import fails when it reaches the first such record.  


line 102 of file C:\filepath\blah.csv
*** unhandled exception in callback:
***    at /<C:\biodiverse\bin\BiodiverseGUI_win.exe>Biodiverse/GUI/BasedataImport.pm line 265
***  ignoring at BiodiverseGUI.pl line 111.
Can't use an undefined value as an ARRAY reference at /<C:\biodiverse\bin\BiodiverseGUI_win.exe>Biodiverse/GUI/Grid.pm line 973.
 at /<C:\\biodiverse\bin\BiodiverseGUI_win.exe>Biodiverse/GUI/Tabs/Outputs.pm line 174
User cancelled grid initialisation, closing

Original issue reported on code.google.com by shawnlaffan on 23 Oct 2009 at 5:49

Enable deletion of randomisation outputs

The GUI currently does not support the deletion of randomisation outputs.  

It will involve extra searching of other outputs to remove the lists that
were added to them.

Original issue reported on code.google.com by shawnlaffan on 18 Oct 2009 at 12:14

For imports: recognise R style matrices which have one less column in the first row, with the top left cell omitted.

Rather than making this an option, the ideal would be to recognise and handle 
this file format as part of the default behaviour.

A sample file in this format is attached.
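One way to recognise this layout automatically is to compare the width of the header row with the width of the first data row. The sketch below is in Python for illustration and is not the Biodiverse import code. The function name and the tab delimiter are assumptions.

```python
import csv
import io


def read_r_style_matrix(text, delimiter="\t"):
    """Read a delimited matrix, padding a header that is one cell short."""
    rows = [row for row in csv.reader(io.StringIO(text), delimiter=delimiter) if row]
    header, body = rows[0], rows[1:]
    # R-style output omits the top-left cell, so the header is one column
    # narrower than the data rows. Restore the empty corner cell.
    if body and len(header) == len(body[0]) - 1:
        header = [""] + header
    return header, body


r_style = "sp1\tsp2\nsite1\t1\t0\nsite2\t3\t4\n"
header, body = read_r_style_matrix(r_style)
```

Files that already have a full-width header pass through unchanged, so the check can run by default rather than as a user option.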

Original issue reported on code.google.com by danielrosauer on 19 Oct 2009 at 9:40

Attachments:

Spiders_28x12_spe.txt

display doesn't refresh after running a spatial analysis that overwrites another

If the map of a spatial analysis is open, and a new analysis is specified
and overwrites the previous analysis, the display will not change (even if
“display results” is selected for the new analysis), and the tab must be
closed and then reopened for it to display the new analysis properly.

The display code needs cleaning up so events are not triggered by certain.

Original issue reported on code.google.com by shawnlaffan on 25 Sep 2009 at 6:59

Provide more informative feedback when analysis fails, or when user cancels

Currently the system will return quietly if many of the analyses fail (as
far as the GUI is concerned).  It does, however, print warnings to the log
window.

This information should be sent to the GUI to make it obvious.

Original issue reported on code.google.com by shawnlaffan on 24 Oct 2009 at 8:14

Matrix to tree conversion fails

The method used to build a tree from a matrix passed as an argument crashes
in many cases.  

What steps will reproduce the problem?
1. In the GUI, import a matrix using the normal methods.
2. Use the Matrix -> Convert Matrix to Tree menu option.
3. The conversion will fail, complaining of too many root nodes. 

This is an issue with the underlying conversion code which has not been
updated to conform to the current BaseData cluster analysis methods in
library Biodiverse::Cluster.

Original issue reported on code.google.com by shawnlaffan on 25 Sep 2009 at 6:23

plot more than one shapefile

The GUI will currently plot only one shapefile overlay at a time.  This
should be extended to allow multiple files.

Original issue reported on code.google.com by shawnlaffan on 15 Dec 2009 at 3:11

Add menu option to show full list of available spatial conditions

Add an option to the GUI to give the list of available spatial conditions
and their syntax.  

This will also require an example option in the function metadata to save
user guesswork.

Original issue reported on code.google.com by shawnlaffan on 18 Oct 2009 at 4:45

randomisation lists not displayable in spatial tab

What steps will reproduce the problem?
1.  Run a spatial analysis and display it.
2.  Now run a randomisation.
3.  Redisplay the spatial analysis.  

What is the expected output? What do you see instead?
The randomisation list results should be available in the combo box at the
bottom left.  It is not.  The list can still be accessed from the popup
windows.

This is due to caching of the lists to help with larger data sets when
definition queries are used.  Running the randomisation adds new lists and
these are not updated in the cached set.

Original issue reported on code.google.com by shawnlaffan on 9 Dec 2009 at 5:23

Matrix Rao QE calcs should return undef if there are no labels from the matrix

calc_mx_rao_qe currently returns a zero value if there are no values in the
neighbour sets that are on the matrix.  It should return undef instead so
there is no ambiguity.
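The ambiguity can be illustrated with a small Python sketch of a Rao's quadratic entropy style calculation (the sum of d_ij * p_i * p_j over label pairs). It is not the calc_mx_rao_qe code. The data layout, the function name and the choice to treat missing pairs as zero distance are all assumptions made for the example.

```python
def rao_quadratic_entropy(counts, matrix):
    """Return sum of d_ij * p_i * p_j, or None if no labels are in the matrix.

    counts: {label: sample count}
    matrix: {frozenset((label_i, label_j)): distance}
    """
    matrix_labels = set().union(*matrix)
    used = {lab: n for lab, n in counts.items() if lab in matrix_labels and n > 0}
    total = sum(used.values())
    if total == 0:
        return None  # undefined: nothing to calculate from

    qe = 0.0
    for i, n_i in used.items():
        for j, n_j in used.items():
            if i != j:
                d_ij = matrix.get(frozenset((i, j)), 0.0)  # assumption: missing pair = 0
                qe += d_ij * (n_i / total) * (n_j / total)
    return qe


matrix = {frozenset(("a", "b")): 0.5}
two_labels = rao_quadratic_entropy({"a": 1, "b": 1}, matrix)
one_label = rao_quadratic_entropy({"a": 2}, matrix)
no_labels = rao_quadratic_entropy({"x": 3}, matrix)
```

Here `one_label` is a genuine zero (a single label has no distance to itself), while `no_labels` is `None`, so the two cases can be told apart in the output.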

Original issue reported on code.google.com by shawnlaffan on 17 Nov 2009 at 5:33

Implement generic export methods for trees and matrices, as per basestructs

BaseStruct objects now have generic export methods instead of being based
on file extensions.  See issue #33.

The same needs to be done for the trees and matrices.

Original issue reported on code.google.com by shawnlaffan on 5 Nov 2009 at 9:57

Implement post_calc options to complement the pre_calcs

Analyses in Biodiverse specify a set of global and local precalculations
required for each subroutine.  This saves processing time overall, as
several analyses might depend on a single subroutine.  Using this method,
the dependency is run once and then its results are passed as arguments to
the subs that need them.

Having a post_calc option would also be useful, for example to run some
cleanup where subs use local caching and some memory could be saved by
clearing it after each window.
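The dependency handling described above, together with a possible post_calc hook, can be sketched as follows. This is a Python illustration under stated assumptions, not the Biodiverse implementation: the registry layout, `run_window` and the three example calculations are hypothetical.

```python
def run_window(requested, registry, args):
    """Run the requested calculations for one window.

    Each dependency listed under 'pre_calc' is run once, and its results
    are passed as arguments to every calculation that needs them.
    """
    results = {}

    def run(name):
        if name not in results:
            spec = registry[name]
            inputs = dict(args)
            for dep in spec.get("pre_calc", ()):
                inputs.update(run(dep))
            results[name] = spec["func"](**inputs)
        return results[name]

    for name in requested:
        run(name)

    # Hypothetical post_calc step: clean up once everything has run.
    for name in results:
        cleanup = registry[name].get("post_calc")
        if cleanup is not None:
            cleanup()

    return {name: results[name] for name in requested}


calls = {"shared_counts": 0}
local_cache = {}


def shared_counts(set1, set2, **_):
    calls["shared_counts"] += 1
    local_cache["union"] = set1 | set2  # stands in for per-window caching
    return {"A": len(set1 & set2), "B": len(set1 - set2), "C": len(set2 - set1)}


def richness(A, B, C, **_):
    return {"RICHNESS": A + B + C}


def sorensen(A, B, C, **_):
    return {"SORENSEN": 1 - 2 * A / (2 * A + B + C)}


registry = {
    "shared_counts": {"func": shared_counts, "post_calc": local_cache.clear},
    "richness": {"func": richness, "pre_calc": ["shared_counts"]},
    "sorensen": {"func": sorensen, "pre_calc": ["shared_counts"]},
}

out = run_window(["richness", "sorensen"], registry,
                 {"set1": {"a", "b", "c"}, "set2": {"b", "c", "d"}})
```

Both requested calculations depend on `shared_counts`, which runs only once, and the cleanup hook empties the local cache before the next window is processed.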

Original issue reported on code.google.com by shawnlaffan on 11 Dec 2009 at 5:36

spatial tab - analysis fails if no spatial neighbourhoods are defined

What steps will reproduce the problem?
1. Set both the neighbourhood definitions to be empty strings
2. Run the analysis
3. The system fails, complaining about undefined values. The spatial tab
closes, or does not close and cannot be closed.

Original issue reported on code.google.com by shawnlaffan on 15 Oct 2009 at 2:56

Rename the view labels tab when basedata is renamed

Renaming a basedata object while it is also open in the view labels tab
does not rename the tab.

Original issue reported on code.google.com by shawnlaffan on 19 Oct 2009 at 8:58

Allow renaming of basedata and outputs without opening the tab

Add capability to the GUI to rename an output without needing to open the
tab, and therefore avoiding the need to display large data sets in order to
rename them.  Also add capability to rename a basedata object, for which
there is currently no option in the GUI.

Original issue reported on code.google.com by shawnlaffan on 12 Oct 2009 at 12:50

Implement import method for observation matrix data, e.g. site/species matrices

Often species observation data are presented in a matrix form, where the
columns are species and the locations or their IDs are on the rows. 
Biodiverse exports to this format, but does not import them back in.
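Importing such a matrix amounts to unpivoting it into one record per non-zero site and species pair. The sketch below shows the idea in Python. It is not the Biodiverse import method, and the function name, the comma delimiter and the rule of skipping zero or empty cells are assumptions.

```python
import csv
import io


def matrix_to_records(text, delimiter=","):
    """Convert a site-by-species matrix into (site, species, count) records."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    species = header[1:]  # first column holds the site IDs

    records = []
    for row in reader:
        if not row:
            continue
        site, values = row[0], row[1:]
        for label, value in zip(species, values):
            count = float(value) if value.strip() else 0.0
            if count > 0:  # assumption: zero or empty cells are absences
                records.append((site, label, count))
    return records


sample = "site,sp_a,sp_b,sp_c\ns1,2,0,1\ns2,,5,0\n"
records = matrix_to_records(sample)
```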

Original issue reported on code.google.com by shawnlaffan on 4 Oct 2009 at 6:01

Add an annulus spatial condition

An annulus will allow calculations in concentric rings and is used as the
standard neighbourhood function in many spatial analyses, eg semivariograms.
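An annulus condition reduces to a distance test against an inner and an outer radius. The following is a minimal Python sketch, not a Biodiverse spatial condition. The function name, the parameter names and the inclusive boundaries are assumptions.

```python
import math


def in_annulus(centre, point, inner_radius, outer_radius):
    """True if point lies between the two radii (inclusive) around centre."""
    distance = math.dist(centre, point)
    return inner_radius <= distance <= outer_radius


centre = (0.0, 0.0)
candidates = [(1.0, 1.0), (3.0, 4.0), (10.0, 0.0)]
ring = [p for p in candidates if in_annulus(centre, p, 4.0, 6.0)]
```

Stepping the inner and outer radii outwards in equal increments gives the concentric rings used for distance-lagged analyses such as semivariograms.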

Original issue reported on code.google.com by shawnlaffan on 20 Oct 2009 at 12:08

ReadNexus will not properly read back in an exported cluster output

What steps will reproduce the problem?
1. Run a cluster analysis and export it to nexus
2. Import it as a tree
3. It does not remove the quotes properly from the names so will not link
correctly to a transposed basedata object.

Original issue reported on code.google.com by shawnlaffan on 7 Nov 2009 at 1:12

Set element properties for matrices on import

There is currently no option to remap matrix labels on import.

Original issue reported on code.google.com by shawnlaffan on 19 Oct 2009 at 10:20

Tabs cannot be closed if an analysis fails.

If an analysis fails for some reason then the associated tab cannot be closed.

Such tabs do not come back if the application is closed and restarted.  

Need to implement better error trapping.

Original issue reported on code.google.com by shawnlaffan on 23 Oct 2009 at 6:13

Spatial tab: Some results lists not visible after using definition query in an analysis

What steps will reproduce the problem?
1. Run an analysis with a definition query to reduce the number of groups
containing results.
2. The lists available for display are sometimes limited to the
SPATIAL_RESULTS, none of the lists generated by indices like
calc_endemism_central are visible.


This is due to only the first group being checked to see what lists are
available.  Need to either search the first "n" groups to get a good
sample, or keep track of lists generated when creating them.
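The first option can be sketched as below, in Python for illustration only. The data layout (a mapping of group name to the list names it holds) and the list names other than SPATIAL_RESULTS are hypothetical.

```python
def available_lists(groups, sample_size=None):
    """Collect list names from the first sample_size groups (all if None)."""
    names = set()
    for position, lists in enumerate(groups.values()):
        if sample_size is not None and position >= sample_size:
            break
        names.update(lists)
    return sorted(names)


# The first group failed the definition query, so it holds only the default list.
groups = {
    "g1": ["SPATIAL_RESULTS"],
    "g2": ["SPATIAL_RESULTS", "EXAMPLE_RANGE_LIST"],
    "g3": ["SPATIAL_RESULTS", "EXAMPLE_RANGE_LIST", "EXAMPLE_WEIGHT_LIST"],
}
first_only = available_lists(groups, sample_size=1)
everything = available_lists(groups)
```

Checking only the first group reproduces the reported behaviour. A larger sample reduces the risk but cannot remove it, which is the argument for the second option of recording list names as they are created.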

Original issue reported on code.google.com by shawnlaffan on 15 Oct 2009 at 11:38

does not run, exception in gtk+

What steps will reproduce the problem?
1. try to import data
2. about 12 columns, 3 groups, 2 labels
3. 3114 lines of data

What is the expected output? What do you see instead?
What I saw is apparently a crash. The GUI never finishes loading the file,
but just continues to indicate busy. The console seems to indicate that the
system has crashed.

In addition, some buttons (like cancel) are not working, and only default
import worked. Trying to tell the GUI that space was the separator and "
the quote character gave incorrect parsing.

What version of the product are you using? On what operating system?
64 bit vista
version 0.10
Gtk+ as per the link in web page

Please provide any additional information below.

output in console window:
[Project] Basedata Model empty
[GUI] Created new Biodiverse project
[Outputs tab] Loaded tab - Outputs
Gtk-WARNING **: Attempting to read the recently used resources file at `C:\Users\lca\.recently-used.xbel', but the parser failed: Unexpected attribute 'modified' for element 'application'. at /<C:\Biodiverse\biodiverse_0.10\bin\BiodiverseGUI_win32.exe>Biodiverse/GUI/BasedataImport.pm line 505, <DATA> line 165.
GLib-CRITICAL **: g_bookmark_file_get_size: assertion `bookmark != NULL' failed at /<C:\Biodiverse\biodiverse_0.10\bin\BiodiverseGUI_win32.exe>Biodiverse/GUI/BasedataImport.pm line 505, <DATA> line 165.
[GUI] Discovering columns from C:\Projects\BestRotation\Debug\biodiversitydata.txt
[COMMON] Guessed field separator as ' '
[COMMON] Guessed quote char as "
[GUI] Generating make columns dialog for 12 columns
[BASEDATA] Loading from files C:\Projects\BestRotation\Debug\biodiversitydata.txt
[BASEDATA] INPUT FILE: C:\Projects\BestRotation\Debug\biodiversitydata.txt
[COMMON] Guessed quote char as "
[COMMON] Guessed field separator as ' '
[BASEDATA] Line number: 1
[BASEDATA]  Chunk size 3114 lines
Use of uninitialized value in concatenation (.) or string at /<C:\Biodiverse\biodiverse_0.10\bin\BiodiverseGUI_win32.exe>Biodiverse/BaseData.pm line 639, <GEN0> line 3115.
[BASEDATA] Non-numeric cell size field 7, check your data or cellsize arguments.

line 1 of file C:\Projects\BestRotation\Debug\biodiversitydata.txt
*** unhandled exception in callback:
***    at /<C:\Biodiverse\biodiverse_0.10\bin\BiodiverseGUI_win32.exe>Biodiverse/GUI/BasedataImport.pm line 265
***  ignoring at BiodiverseGUI.pl line 111.

Original issue reported on code.google.com by [email protected] on 1 Nov 2009 at 4:14

Add element properties (remapping) to matrix import

The system currently does not allow this.  It should.

Original issue reported on code.google.com by shawnlaffan on 20 Oct 2009 at 8:05

Merged into: #16

Add definition query to allow calculations on a subset of groups

The addition of a definition query to the arguments will allow the
calculations to be run on only a subset of groups.  This should speed
things up, and also allow the user to work on spatial subsets.

Original issue reported on code.google.com by shawnlaffan on 9 Oct 2009 at 5:51

Enable loading of BaseData, matrix and tree files at startup

The system currently allows the user to specify a Biodiverse project file
as a command line argument on startup.  

Support is needed for native format BaseData, matrix and tree files as well,
to save having to then open them manually.

Original issue reported on code.google.com by shawnlaffan on 12 Nov 2009 at 3:07

neighbourhood recycling does not account for definition queries

What steps will reproduce the problem?
1.  Run an analysis using a recyclable spatial condition, eg sp_block(size
=> 300000)
2.  Set a definition query, eg "$x <= 3650000"


What is the expected output? What do you see instead?
The results are recycled into those groups that failed the definition query
when they should not be.

Original issue reported on code.google.com by shawnlaffan on 19 Nov 2009 at 10:22

spatial outputs not considering definition query when reusing nbr set from another spatial output

The processing for spatial outputs uses an optimisation such that it reuses
neighbour sets from another spatial output with the same conditions (if one
exists in that basedata).  

These currently do not consider the definition queries, resulting in groups
excluded by the definition query still having results.
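One way to avoid this kind of reuse is to make the definition query part of the key under which neighbour sets are stored. The sketch below is a simplified Python illustration, not the Spatial.pm logic. The `NeighbourCache` class is hypothetical, and the cached value is reduced to a plain set of group names.

```python
class NeighbourCache:
    """Cache keyed on both the spatial conditions and the definition query."""

    def __init__(self):
        self._sets = {}
        self.builds = 0

    def get(self, conditions, definition_query, build):
        # Same conditions with a different definition query give a different
        # key, so results are never reused across queries.
        key = (tuple(conditions), definition_query or "")
        if key not in self._sets:
            self._sets[key] = build()
            self.builds += 1
        return self._sets[key]


cache = NeighbourCache()
conditions = ["sp_block(size => 300000)"]
all_groups = cache.get(conditions, None, lambda: {"g1", "g2", "g3"})
subset = cache.get(conditions, "$x <= 3650000", lambda: {"g1"})
reused = cache.get(conditions, "$x <= 3650000", lambda: {"never built"})
```

The third call reuses the second result because both the conditions and the definition query match, while the first two calls are kept apart.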

Original issue reported on code.google.com by shawnlaffan on 19 Nov 2009 at 10:23

Block the user from running exclusions on a basedata with outputs

The system does not guard against exclusions being run when a basedata
contains outputs.  This will cause problems with existing cluster and
spatial outputs as they will refer to, or be a function of, groups that no
longer exist.

Original issue reported on code.google.com by shawnlaffan on 19 Oct 2009 at 12:53