NSNSDAcoustics

This repository provides a place for National Park Service Natural Sounds and Night Skies Division (NSNSD) staff to develop and modernize several bioacoustics workflows.

All documentation and code is actively under development. There is currently no official release of this package and code may change. If you encounter a problem, please submit it to Issues. If you have a question or need that isn't covered by submitting an issue, please reach out to Cathleen Balantic (cathleen_balantic at nps.gov).

Installing NSNSDAcoustics

First, ensure that you have installed the latest versions of R and RStudio.

Next, you can install NSNSDAcoustics using one of two options:

(1) Option 1: Use install_github():

You may need to first install the latest version of devtools, and the R console may also prompt you to install Rtools (follow directions given in console message). If you are having trouble installing devtools, make sure to disconnect from VPN. Once you have Rtools and devtools, you can install the latest version of NSNSDAcoustics:

install.packages('devtools')
library(devtools)
devtools::install_github('nationalparkservice/NSNSDAcoustics')

(2) Option 2: Download and install manually

Note: this option will not be implemented until we have a first stable release of NSNSDAcoustics.

Eventually, you will be able to download the zip or tar.gz file directly using one of the following links.

  • Windows users can download the zip file from NSNSDAcoustics-master.zip (link not yet available)
  • Mac or Linux users can download the tar.gz file from NSNSDAcoustics-master.tar.gz (link not yet available)

After downloading, open RStudio, click on the Install button on the Packages tab, select Install From Package Archive File, and navigate to the downloaded file.

Once NSNSDAcoustics is installed, you can call in the package and look at all helpfiles:

library(NSNSDAcoustics)
help(package = 'NSNSDAcoustics')

A note on data.table syntax

NSNSDAcoustics depends on the R package data.table, which enables fast querying and manipulation of large data.frames. If you are an R user but have never used data.table syntax before, some of the example code may look unfamiliar. Don't fret -- data.table object types are also data.frames. If you get frustrated trying to work with them, you can always convert to a regular data.frame to deal with a more familiar object type.
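
As a minimal sketch of what that looks like in practice, the snippet below filters a small table once with data.table syntax and once after converting to a regular data.frame. The toy table and its values are invented for illustration and are not package data.

```r
library(data.table)

# Toy detections table; the values are made up for illustration
dt <- data.table(
  common_name = c("Swainson's Thrush", 'Varied Thrush', "Swainson's Thrush"),
  confidence = c(0.90, 0.40, 0.15)
)

# A data.table is also a data.frame
is.data.frame(dt)

# data.table syntax: column names can be used directly inside the brackets
thrush.dt <- dt[common_name == "Swainson's Thrush" & confidence >= 0.25]

# Convert to a regular data.frame and use base R syntax instead
df <- as.data.frame(dt)
thrush.df <- df[df$common_name == "Swainson's Thrush" & df$confidence >= 0.25, ]
```

Both approaches select the same rows; the choice is a matter of which syntax you are more comfortable reading.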

Running BirdNET from RStudio

This workflow was developed for Windows 10 and BirdNET-Analyzer V2.4. It has not been tested on other systems.

BirdNET is a bird sound recognition program developed by the Cornell Center for Conservation Bioacoustics. The BirdNET-Analyzer GitHub repository provides a promising free tool for processing large volumes of audio data relatively quickly and understanding something about which avian species are present.

Download the link shown in this screenshot and follow the directions:

Image showing which BirdNET link to click and download.

If you are working on a machine where you do not have admin privileges, you may not be able to unblock the software and will need to pursue a workaround. NPS staff, please reach out to Cathleen Balantic (cathleen_balantic at nps.gov) with questions.

These command line arguments are the building blocks needed to construct a statement that tells BirdNET where to find your audio (--i), where to place result files (--o), what detection sensitivity to use (--sensitivity), where to find a species list if you are using one (--slist), how many CPU threads to use (--threads), what type of result to produce (--rtype) and much more. To use the functions in this package, you will need to specify --rtype 'r'. Once you understand the command line arguments, you are ready to try using BirdNET from the Windows command line.

(3) Step 3. Test your BirdNET Installation.

Test that BirdNET is functional by opening up a Windows Command Prompt (see below image). Look to the lower lefthand side of your screen and locate the Windows search bar. Type Command Prompt and click the app.

Image showing how to locate and open the Windows command prompt.

Next, you can construct a statement for the command prompt. Your statement might look something like the following example, or it might include additional command line arguments:

"C:/path/to/BirdNET-Analyzer/BirdNET-Analyzer.exe" --i "D:/AUDIO" --o "D:/RESULTS" --lat -1 --lon -1 --week -1 --slist "D:/species_list.txt" --rtype "r" --min_conf 0.1 --sensitivity 1.0 --threads 4

Please edit this example to reflect file paths and folder names on your machine, and then modify, omit, or include command line arguments as desired, and give it a try.

You might see the message 'WARNING:tensorflow:AutoGraph is not available in this environment' but this is not a cause for concern. If BirdNET-Analyzer is working and writing results to your results folder, you were successful in getting everything installed.

(4) Step 4. Run BirdNET from RStudio.

If you are processing many terabytes, years, and/or locations of data through BirdNET, you might have dozens or hundreds of audio folders. The prospect of constructing command line statements by hand for each folder may sound daunting and tedious. You may find yourself wishing that you could loop through all of your folders and access BirdNET-Analyzer directly from R. Fortunately, this is possible via R's system() function.

For example, let's say you have several audio folders: D:/AUDIO_1, D:/AUDIO_2, and D:/AUDIO_3. You might have created corresponding results folders: D:/RESULTS_1, D:/RESULTS_2, and D:/RESULTS_3. You can use R to automate the construction of command line statements for each folder, and use a loop to wrap those statements in system(). This allows you to run BirdNET-Analyzer directly from RStudio. The following pseudocode is an example that can be edited for your own purposes.

# Initialize important variables such as your BirdNET analyzer path, folders, and other command line arguments:
birdnet.path <- 'C:/path/to/BirdNET-Analyzer/BirdNET-Analyzer.exe'
audio.folders <- paste0('D:/', 'AUDIO_', 1:3)
result.folders <- paste0('D:/', 'RESULTS_', 1:3)
species.list.path <- 'D:/species_list.txt'
num.threads <- 7

# Generate one command per audio folder (paste0 is vectorized over the folder vectors):
## NOTE: be mindful of your quotations when editing!
## The species list path is wrapped in double quotes so that paths containing spaces still work.
all.commands <- paste0(
  '"', birdnet.path,
  '" --i "', audio.folders,
  '" --o "', result.folders,
  '" --lat -1 --lon -1 --week -1 --slist "',
  species.list.path, '" --rtype "r" --threads ',
  num.threads, ' --min_conf 0.1 --sensitivity 1.0')

# Test that one command runs
# system(all.commands[1])

# Loop through all commands (i.e., all audio folders) and send them to BirdNET-Analyzer
for (i in seq_along(all.commands)) {
  cat('\n \n This is folder', i, 'of', length(all.commands), '\n \n')
  system(all.commands[i])
}
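
Before sending commands to BirdNET-Analyzer, it can help to confirm that each audio folder exists and that a matching results folder is in place. The following is a minimal sketch of such a check; it uses stand-in folders under tempdir() so that it can run anywhere, and the folder names are hypothetical. Substitute your own audio and results paths.

```r
# Stand-in folders under tempdir(); substitute your own audio and results paths
base.dir <- file.path(tempdir(), 'birdnet-sketch')
audio.folders <- file.path(base.dir, paste0('AUDIO_', 1:3))
result.folders <- file.path(base.dir, paste0('RESULTS_', 1:3))

# For this sketch only, pretend the first two audio folders exist
for (f in audio.folders[1:2]) dir.create(f, recursive = TRUE, showWarnings = FALSE)

# Flag any audio folders that are missing before building commands
ok <- dir.exists(audio.folders)
missing.audio <- audio.folders[!ok]
if (length(missing.audio) > 0) {
  message('Missing audio folders: ', paste(missing.audio, collapse = ', '))
}

# Create a results folder for each audio folder that does exist
for (f in result.folders[ok]) dir.create(f, recursive = TRUE, showWarnings = FALSE)
created <- dir.exists(result.folders)

# Clean up the stand-in folders
unlink(base.dir, recursive = TRUE)
```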

Additional Tips:

  • Cornell's underlying BirdNET-Analyzer software gives several options for file output types, but the only one implemented in this package is --rtype 'r'. Please specify --rtype 'r' in your command line statements if you want to use the rest of the functions in this package.
  • The --threads argument specifies how many files BirdNET will work on at once, and it depends on the number of logical processors your machine has. On a Windows machine, to figure out how many cores your processor has, press CTRL + SHIFT + ESC to open Task Manager. Select the Performance tab to see how many cores and logical processors your PC has. A rule of thumb is to never set "threads" to more than the number of logical processors minus 1. For example, your machine might have 8 logical processors, in which case you would set "threads" to no higher than 7.
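
The rule of thumb for threads can also be applied from R rather than Task Manager. This is a small sketch using parallel::detectCores() from base R's parallel package; the fallback to a single thread when the count cannot be determined is our own assumption, not a BirdNET requirement.

```r
# Count logical processors; detectCores() can return NA on some systems
n.logical <- parallel::detectCores(logical = TRUE)

# Leave one logical processor free, falling back to a single thread
num.threads <- if (is.na(n.logical)) 1 else max(1, n.logical - 1)
num.threads
```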

Assessing BirdNET Results

If you have a large number of audio files, and plan to monitor for a long time across many locations, you may very quickly find yourself managing thousands of BirdNET output files. It's likely that you'll want a systematic way to track and check on these results, and verify whether BirdNET detections are truly from a target species. The birdnet_format() --> birdnet_verify() workflow offers one way to keep track of your verifications. An alternative way would be to set up a SQLite database (e.g., as used in the AMMonitor package). Although a database solution may ultimately be the most robust way to track results through time in a long-term project, it can come with substantial start-up effort and might not be easily extensible to your project needs. Instead, the workflow below provides a simple way to reformat and work with BirdNET output files directly, allowing you to store your verifications there. Lastly, birdnet_plot() and birdnet_barchart() provide plotting options to visualize detection data.

Reformat raw BirdNET results

birdnet_format() is a simple function that reformats the raw BirdNET txt or csv results with a "recordingID" column for easier data manipulation, a "verify" column to support manual verification of detection results, and a "timezone" column to clarify the timezone setting used by the audio recorder.

Below, we'll walk through the documentation and example helpfiles for birdnet_format(). Always start by pulling up the function helpfile. Everything covered below is located in the "Examples" section of this helpfile.

?birdnet_format

To run this example, we first create an example "results" directory, and then we write some raw BirdNET csv results to this directory. This setup illustrates the file types and folder structure birdnet_format() expects to encounter.

# Create a BirdNET results directory for this example
dir.create('example-results-directory')

# Write examples of raw BirdNET outputs to example results directory
data(exampleBirdNET1)
write.table(x = exampleBirdNET1,
            file = 'example-results-directory/Rivendell_20210623_113602.BirdNET.results.csv',
            row.names = FALSE, quote = FALSE, sep = ',')
data(exampleBirdNET2)
write.table(x = exampleBirdNET2,
            file = 'example-results-directory/Rivendell_20210623_114602.BirdNET.results.csv',
            row.names = FALSE, quote = FALSE, sep = ',')

Now, we're set up to run the function. birdnet_format() takes two arguments. First, the results.directory should point to the folder where you have stored your raw BirdNET outputs. Second, the timezone argument allows you to specify the timezone setting used in the audio recorder (i.e., the timezone reflected in the wave filename). It's important to pay attention to this!

Recall that the functions described here expect wave files that follow a SITEID_YYYYMMDD_HHMMSS naming convention. In the package sample audio data, we have a wave named Rivendell_20210623_113602.wav. This means the site ID is Rivendell, and the recording was taken on June 23rd, 2021 at 11:36:02. The timezone argument allows us to clarify what 11:36:02 actually means. Was the recording taken in local time at your site, or was it taken in UTC? This point might seem trivial if you're just getting started with collecting data at a few sites for a single season, but if you're collecting data across many sites, over many years, with varying audio recorder equipment and varying recording settings through time, different field technicians, and potentially across timezones (as we often do at NPS NSNSD), you will want to keep meticulous track of your timezones so that your analyses will be accurate across time and space.

If recordings were taken in local time at your study site, specify an Olson-names-formatted character timezone for the location (e.g., "America/Los_Angeles"). If recordings were taken in UTC, you can put either 'GMT' or 'UTC' (both are acceptable in R for downstream date-time formatting).
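
To see why the timezone matters, the sketch below parses the sample filename by hand and interprets the same clock time under two different timezones. This is only an illustration in base R, not the package's internal parsing, and it assumes the site ID itself contains no underscores.

```r
# Illustration only; this is not how the package parses filenames internally
recording.id <- 'Rivendell_20210623_113602.wav'

# Split SITEID_YYYYMMDD_HHMMSS into its parts (assumes no underscore in the site ID)
parts <- strsplit(tools::file_path_sans_ext(recording.id), '_')[[1]]
site.id <- parts[1]
stamp <- paste(parts[2], parts[3])

# Interpret the same clock time as UTC and as local time in Los Angeles
as.utc <- as.POSIXct(stamp, format = '%Y%m%d %H%M%S', tz = 'UTC')
as.local <- as.POSIXct(stamp, format = '%Y%m%d %H%M%S', tz = 'America/Los_Angeles')

# The two interpretations are different instants, 7 hours apart on this date
difftime(as.local, as.utc, units = 'hours')
```

A recording labeled 11:36:02 could therefore refer to two moments seven hours apart, which is why the timezone setting needs to be recorded explicitly.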

Below, we point to our example results directory and specify 'GMT' (i.e., 'UTC') as the timezone, since the recordings were not taken in local time at this recorder.

# Run birdnet_format:
birdnet_format(results.directory = 'example-results-directory',
               timezone = 'GMT')

After running the function, look in your example results directory folder and check on the results. This function produces new formatted files whose filenames contain "BirdNET_formatted_" (note that it does NOT overwrite your raw BirdNET results). The columns in this formatted file are described in the helpfile.
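
If you prefer to check from R rather than a file browser, raw and formatted files can be told apart by filename. The sketch below is a hedged illustration that builds its own stand-in directory with empty files under tempdir(); the filenames mirror those used in the examples in this document, and the pattern matching is our own approach rather than a package function.

```r
# Temporary stand-in for a results directory, populated with empty files
results.dir <- file.path(tempdir(), 'example-results-sketch')
dir.create(results.dir, showWarnings = FALSE)
file.create(file.path(results.dir, c(
  'Rivendell_20210623_113602.BirdNET.results.csv',
  'Rivendell_20210623_113602.BirdNET_formatted_results.csv'
)))

# Separate formatted files from raw files by filename
all.files <- list.files(results.dir)
formatted.files <- grep('BirdNET_formatted_', all.files, value = TRUE, fixed = TRUE)
raw.files <- setdiff(all.files, formatted.files)

# Clean up the stand-in directory
unlink(results.dir, recursive = TRUE)
```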

Finally, clean up by deleting temporary files that were set up for the example.

# Delete all temporary example files when finished
unlink(x = 'example-results-directory', recursive = TRUE)

The point of all this reformatting is to make it easier to keep track of our downstream analyses of BirdNET detections. The "verify" column is what allows us to track whether a detected event actually came from a target species.

Gather BirdNET results

birdnet_gather() is a simple convenience function that gathers all BirdNET results from a desired folder into one user-friendly data.table / data.frame. View the helpfile for more information and code examples (?birdnet_gather).

The function allows you to gather either unformatted (raw) or formatted data. For this example, we set up an example results directory and write both formatted and unformatted .csv results to it. This illustrates the type of folder and file structure birdnet_gather() expects to encounter.

# Create a BirdNET results directory for this example
dir.create('example-results-directory')

# Write examples of formatted BirdNET outputs to example results directory
data(exampleFormatted1)
write.table(x = exampleFormatted1,
            file = 'example-results-directory/Rivendell_20210623_113602.BirdNET_formatted_results.csv',
            row.names = FALSE, quote = FALSE, sep = ',')

data(exampleFormatted2)
write.table(x = exampleFormatted2,
            file = 'example-results-directory/Rivendell_20210623_114602.BirdNET_formatted_results.csv',
            row.names = FALSE, quote = FALSE, sep = ',')

# Write examples of raw BirdNET outputs to example results directory
data(exampleBirdNET1)
write.table(x = exampleBirdNET1,
            file = 'example-results-directory/Rivendell_20210623_113602.BirdNET.results.csv',
            row.names = FALSE, quote = FALSE, sep = ',')
data(exampleBirdNET2)
write.table(x = exampleBirdNET2,
            file = 'example-results-directory/Rivendell_20210623_114602.BirdNET.results.csv',
            row.names = FALSE, quote = FALSE, sep = ',')

birdnet_gather() takes two arguments: results.directory (the file path to the directory where BirdNET results are stored) and formatted, a logical indicating whether formatted results should be gathered. If TRUE, formatted results are gathered. If FALSE, unformatted (raw) BirdNET results are gathered. Both options are demonstrated below.

# Gather formatted BirdNET results
formatted.results <- birdnet_gather(
                             results.directory = 'example-results-directory',
                             formatted = TRUE
                             )

# Gather unformatted (raw) BirdNET results
raw.results <- birdnet_gather(
                       results.directory = 'example-results-directory',
                       formatted = FALSE
                       )

Finally, we delete all example files when finished.

# Delete all temporary example files when finished
unlink(x = 'example-results-directory', recursive = TRUE)

Verify BirdNET results

birdnet_verify() allows the user to manually verify a selected subset of detections based on a user-input library of classification options.

Below, we'll walk through the documentation and example helpfiles for birdnet_verify(). As always, we start by pulling up the function helpfile. Everything covered below is located in the "Examples" section of this helpfile.

?birdnet_verify

To run this example, we first create an example audio directory, to which we will write the sample audio that comes with the package. We'll also set up an example results directory to which we will write example formatted .csv data. This is meant to illustrate the file types and folder structure birdnet_verify() expects to encounter.

# Create an audio directory for this example
dir.create('example-audio-directory')

# Read in example wave files
data(exampleAudio1)
data(exampleAudio2)

# Write example waves to example audio directory
tuneR::writeWave(object = exampleAudio1,
                 filename = 'example-audio-directory/Rivendell_20210623_113602.wav')
tuneR::writeWave(object = exampleAudio2,
                 filename = 'example-audio-directory/Rivendell_20210623_114602.wav')

# Create a BirdNET results directory for this example
dir.create('example-results-directory')

# Write examples of formatted BirdNET outputs to example results directory
data(exampleFormatted1)
write.table(x = exampleFormatted1,
            file = 'example-results-directory/Rivendell_20210623_113602.BirdNET_formatted_results.csv',
            row.names = FALSE, quote = FALSE, sep = ',')
data(exampleFormatted2)
write.table(x = exampleFormatted2,
            file = 'example-results-directory/Rivendell_20210623_114602.BirdNET_formatted_results.csv',
            row.names = FALSE, quote = FALSE, sep = ',')

Next, we can use birdnet_gather() to grab all the formatted results from the example folder. From here, you can manipulate your results however you want to create a subset of detections that you wish to verify. The key is to subset only to a single species. In case you accidentally include multiple species in your data subset, birdnet_verify() will remind you that it only accepts one species at a time for verifications. In this example, we'll focus on verifying detections for the Swainson's Thrush (Catharus ustulatus).

You can create your verification sample however you like. A few options are to take a simple random sample, or take a stratified sample based on detections from different locations or times of day. Depending on your question, you might even want to verify every detection for your target species. In the below example, we set a seed for reproducibility and take a simple random sample of three Swainson's Thrush detections.

# Gather formatted BirdNET results
dat <- birdnet_gather(results.directory = 'example-results-directory',
                      formatted = TRUE)

# Create a random sample of three detections to verify
set.seed(4)
to.verify <- dat[common_name == "Swainson's Thrush"][sample(.N, 3)]
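
For comparison, a stratified sample can be drawn with data.table's by argument. The sketch below samples up to two detections per recording from a toy table; the table contents, the third recording name, and the choice of two per stratum are hypothetical and only meant to show the pattern.

```r
library(data.table)

# Toy detections (hypothetical values) from three recordings
toy <- data.table(
  recordingID = rep(c('Rivendell_20210623_113602.wav',
                      'Rivendell_20210623_114602.wav',
                      'Rivendell_20210623_115602.wav'),
                    times = c(5, 3, 1)),
  common_name = "Swainson's Thrush",
  confidence = seq(0.1, 0.9, by = 0.1)
)

# Sample up to two detections per recording; min() guards against
# recordings that have fewer than two detections
set.seed(4)
stratified <- toy[common_name == "Swainson's Thrush",
                  .SD[sample(.N, min(.N, 2))],
                  by = recordingID]
```

The same pattern works with other strata, such as location or hour of day, by changing the column named in by.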

The next step is to create a "verification library" for this species; essentially, a character vector of acceptable options for your verification labels. Verifying BirdNET detections may be tricky depending on your research question, because BirdNET does not distinguish between the different types of vocalizations a bird may produce. This means the burden is on you, the verifier, to label the detection in a way that will best support you in answering your motivating research question.

The Swainson's Thrush provides a good example of what makes this challenging, because in addition to its recognizable flutelike song, it has a variety of different call types, including a "peep" note, a "whit", a high-pitched "whine", a "bink", and a "peeer" call.

Thus, the verification library you set up will depend on the level of detail you need to answer your research question. Here are two examples of questions you might have as you verify:

  • Is this a Swainson's Thrush or not? For this question, you'll think to yourself, "Yes, this is my target species!" or, "No, this definitely isn't my target species", or, "I'm not sure". You might choose simple verification labels like c('y', 'n', 'unsure').
  • What type of Swainson's Thrush vocalization is this? For this question, you might choose more descriptive verification labels like c('song', 'call', 'unsure', 'false'), or something even more detailed like c('song', 'peep', 'whit', 'whine', 'bink', 'peeer', 'false', 'unsure').

This part is left to your discretion. It's one of the challenging aspects of assessing automated detection results, and you may find that you need to iterate through a few options before settling on the verification library that works best for your circumstances.

Below, we'll use a simple verification library where 'y' means yes, it's a Swainson's Thrush, 'n' means it's not, and 'unsure' means we aren't certain.

# Create a verification library for this species
ver.lib <- c('y', 'n', 'unsure')

Now we're ready to use birdnet_verify(). This interactive function displays a spectrogram of the detected event and prompts the user to label it with one of the input options defined in the verification library. The function also optionally writes a temporary wave clip to your working directory. (Although there are options for playing a sound clip automatically from R, the behavior of these options varies across platforms/operating systems; instead of using R to play the clip, we decided it would be simpler to write a temporary wave clip file for the user). If you expect to be listening to the clips, you'll want easy access to your working directory so that you can open the wave clips up manually.

birdnet_verify() has several arguments. data takes the data.frame or data.table of detections you want to verify. verification.library takes a character vector of the labeling options you will be using. audio.directory points to the directory where your audio files live, and results.directory points to the directory where your formatted BirdNET files are stored. overwrite allows you to decide whether or not any previously existing verifications should be overwritten. When FALSE, users will not be prompted to verify any detected events that already have a verification, but when TRUE, you may be overwriting previous labels. The play argument specifies whether or not a temporary wave file should be written to the working directory for each detection. The remaining arguments allow the user to customize how detected events should be displayed during the interactive session (see ?birdnet_verify for details).

# Verify detections
birdnet_verify(data = to.verify,
               verification.library = ver.lib,
               audio.directory = 'example-audio-directory',
               results.directory = 'example-results-directory',
               overwrite = FALSE, 
               play = TRUE,
               frq.lim = c(0, 12),
               buffer = 1,
               box.col = 'blue',
               spec.col = monitoR::gray.3())

Running this example will produce an interactive output like the below image. The RStudio console will prompt you to provide a label for the detection. The plot pane will display a spectrogram of the detection. You'll use this spectrogram -- optionally along with the temporary wave file -- to decide which label to choose. In this example, we've used the buffer argument to place a 1 second buffer around the detection to provide additional visual and acoustic context. The detection itself is contained within the blue box (all BirdNET detections occur in three-second chunks). About 23.5 seconds in, a Swainson's Thrush begins singing. The spectrogram title gives us information about where we can find this detection in the txt or csv file, and informs us that BirdNET has a confidence level of 0.9 for the detection. We can label this as 'y' because the blue detection window does contain a Swainson's Thrush vocalization.

Illustration of outputs when using the function birdnet_verify. Left side of image shows interactive RStudio interface used for verification, right side depicts spectrogram of detected event to be verified, with a blue border showing the time boundaries of the detection.

Once you've added labels for the remaining detections (in fact, they all contain Swainson's Thrush vocalizations!), birdnet_verify() will update the underlying formatted txt or csv files with your verifications. Below, we gather up the results again and check that our three verifications have been added.

# Check that underlying files have been updated with user verifications
dat <- birdnet_gather(results.directory = 'example-results-directory',
                      formatted = TRUE)
dat[!is.na(verify)]
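
Once verifications accumulate, a quick tally of labels can be useful. This is a minimal base R sketch on a made-up vector of labels, using the 'y', 'n', 'unsure' library from this example; the values are hypothetical and do not come from the package data.

```r
# Toy verification labels (hypothetical); NA marks detections not yet verified
verify <- c('y', 'y', 'n', 'unsure', NA, 'y', NA)
verified <- verify[!is.na(verify)]

# Count each label among verified detections only
label.counts <- table(verified)

# Share of verified detections labeled 'y'
prop.yes <- mean(verified == 'y')
```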

Finally, we clean up by deleting temporary files that we set up for the example.

# Delete all temporary example files when finished
unlink(x = 'example-audio-directory', recursive = TRUE)
unlink(x = 'example-results-directory', recursive = TRUE)

Visualize BirdNET spectrograms

birdnet_plot() allows the user to visualize spectrograms of BirdNET detections (whether or not the data have been verified).

Start by pulling up the function helpfile. Everything covered below is located in the "Examples" section of this helpfile.

?birdnet_plot

We start by creating an example audio directory and writing example audio data to this directory.

# Create an audio directory for this example
dir.create('example-audio-directory')

# Read in example wave files
data(exampleAudio1)
data(exampleAudio2)

# Write example waves to example audio directory
tuneR::writeWave(object = exampleAudio1,
                 filename = 'example-audio-directory/Rivendell_20210623_113602.wav')
tuneR::writeWave(object = exampleAudio2,
                 filename = 'example-audio-directory/Rivendell_20210623_114602.wav')

Next, we read in example data.table/data.frame for plotting. Checking the structure of this example data reveals that there are 284 rows of data in the example.

# Read in example data.table/data.frame for plotting
data(examplePlotData)

# Check the structure of this example data
str(examplePlotData)

birdnet_plot() expects a data.table/data.frame that has been formatted with the columns produced by birdnet_format(). Beyond that, this data.frame can contain just about anything. A user might choose to plot data by species, song type, verification label, confidence levels, and more. The audio.directory argument should point to the folder where your audio files are stored. The remaining arguments allow some aesthetic control over plotting, with the option to provide a title, control the frequency limits, and choose spectrogram and box colors (see helpfile for details).

Below, we subset the examplePlotData object to plot detections for Swainson's Thrush that contain the label "song" in the verify column. We give the plot a descriptive title, use frequency limits ranging from 0.5 to 12 kHz, specify a gray color scheme for the spectrogram, and draw gray boxes around each detection.

# Plot only detections of Swainson's Thrush verified as "song",
# with frequency limits ranging from 0.5 to 12 kHz, gray spectrogram colors,
# a custom title, and a gray box around each detection
plot.songs <- examplePlotData[common_name == "Swainson's Thrush" & verify == "song"]
birdnet_plot(data = plot.songs,
             audio.directory = 'example-audio-directory',
             title = "Swainson's Thrush Songs",
             frq.lim = c(0.5, 12),
             new.window = TRUE,
             spec.col = gray.3(),
             box = TRUE,
             box.lwd = 1,
             box.col = 'gray')

Illustration of spectrogram output from birdnet_plot for Swainson's Thrush.

In the next example, we plot detections for Swainson's Thrush that contain the label "call" in the verify column. We give the plot a descriptive title, use frequency limits ranging from 0.5 to 6 kHz and choose not to draw any boxes around detections. Below, we demonstrate that the spec.col argument allows for adjustable spectrogram colors, and that users can create their own gradients or use existing ones. A few spectrogram color options are provided with the package (e.g., gray.3()). In the example below, we input a color gradient from the viridis R package.

# install.packages('viridis') # install the package first if you do not have it
library(viridis) 

# Plot only detections of Swainson's Thrush verified as "call"
# with frequency limits ranging from 0.5 to 6 kHz, a custom title, no boxes,
# and colors sampled from the viridis color package
plot.calls <- examplePlotData[common_name == "Swainson's Thrush" & verify == "call"]
birdnet_plot(data = plot.calls,
             audio.directory = 'example-audio-directory',
             title = "Swainson's Thrush Calls",
             frq.lim = c(0.5, 6),
             new.window = TRUE,
             spec.col = viridis::viridis(30),
             box = FALSE)

Alternative illustration of spectrogram output from birdnet_plot for Swainson's Thrush, showing the use of different color schemes and plotting parameters.
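
Users who want their own gradient rather than one from an existing package could build it with grDevices::colorRampPalette() from base R. The sketch below assumes spec.col accepts any character vector of colors, as it does for viridis::viridis(30) above; the specific colors are an arbitrary choice for illustration.

```r
# Build a 30-step gradient from white through steel blue to navy
# (these colors are an arbitrary choice for illustration)
custom.gradient <- grDevices::colorRampPalette(c('white', 'steelblue', 'navy'))(30)

# Preview the first few hex color codes
head(custom.gradient)
```

The resulting vector could then be passed as spec.col = custom.gradient in a call to birdnet_plot().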

In the final example, we demonstrate that birdnet_plot() can also be used to visualize unverified data. Below, we loop through to plot all detections for two selected species -- Varied Thrush and Pacific-slope Flycatcher -- where the confidence of detection is greater than or equal to 0.25.

# Loop through to plot detections for selected unverified species
# where confidence of detection >= 0.25
# with frequency limits ranging from 0.5 to 12 kHz, custom titles, gray boxes,
# and gray spectrogram colors
sp <- c('Varied Thrush', 'Pacific-slope Flycatcher')
for (i in seq_along(sp)) {
 plot.sp <- examplePlotData[confidence >= 0.25 & common_name == sp[i]]
 birdnet_plot(data = plot.sp,
              audio.directory = 'example-audio-directory',
              title = paste0(sp[i], ' Detections >= 0.25'),
              frq.lim = c(0.5, 12),
              new.window = TRUE,
              spec.col = gray.3(),
              box = TRUE,
              box.lwd = 0.5,
              box.col = 'gray',
              title.size = 1.5)
}

Illustration of the spectrogram outputs from birdnet_plot from looping through two focal species.

Finally, delete all temporary files when finished.

# Delete all temporary example files when finished
unlink(x = 'example-audio-directory', recursive = TRUE)

Create barcharts of BirdNET detections

birdnet_barchart() allows the user to visualize stacked barcharts of user-selected BirdNET results by date. Start by pulling up the function helpfile. Everything covered below is located in the "Examples" section of this helpfile.

?birdnet_barchart

First, we read in example data to be used as an input to birdnet_barchart. This would typically be a data.table generated by a call to birdnet_gather, and typically the data have been formatted using birdnet_format. Generally, add_time_cols is called on this data object beforehand. Regardless, the data object input to birdnet_barchart should contain BirdNET detection data that comes from a single site, and the object must contain columns named "locationID" (character), "recordingID" (character), and "dateTimeLocal" (POSIXct).

# Read in exampleBarchartData
data(exampleBarchartData)

# Generally, add_time_cols() may be called as part of preprocessing
# (if not, please ensure data object has columns that include locationID (character),
# recordingID (character), and dateTimeLocal (POSIXct))
dat <- add_time_cols(dt = exampleBarchartData,
                     tz.recorder = 'America/Los_Angeles',
                     tz.local = 'America/Los_Angeles')
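
If you skip add_time_cols() or assemble the data object yourself, a quick check of the required columns can catch problems before plotting. The following is a sketch of such a check in base R; dat.check is a hypothetical one-row stand-in, and the checks are our own, not something birdnet_barchart is documented to run.

```r
# Hypothetical stand-in for a gathered and preprocessed results object
dat.check <- data.frame(
  locationID = 'Rivendell',
  recordingID = 'Rivendell_20210623_113602.wav',
  dateTimeLocal = as.POSIXct('2021-06-23 11:36:02', tz = 'America/Los_Angeles'),
  stringsAsFactors = FALSE
)

# Confirm that the required columns are present
required.cols <- c('locationID', 'recordingID', 'dateTimeLocal')
missing.cols <- setdiff(required.cols, names(dat.check))

# Confirm column types and that the data come from a single site
stopifnot(
  length(missing.cols) == 0,
  is.character(dat.check$locationID),
  is.character(dat.check$recordingID),
  inherits(dat.check$dateTimeLocal, 'POSIXct'),
  length(unique(dat.check$locationID)) == 1
)
```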

In addition to the data object, birdnet_barchart has an argument called interactive. When set to TRUE, birdnet_barchart will display a plotly-based interactive plot that allows the user to hover over data to see which species detections are being visualized.

# Produce an interactive plotly barchart with interactive = TRUE
birdnet_barchart(data = dat, interactive = TRUE)

The figure is not interactive and merely serves as an illustration.

Screenshot of an interactive plotly barchart that shows stacked bars of detected species through time at a monitoring location. The x axis shows date, and the y axis shows the total number of detections for that date. Hovering over individual bars shows an instance of 1420 detections of Pacific Wren on julian date 132.

Meanwhile, when interactive is set to FALSE, birdnet_barchart produces a ggplot-based static plot. The user also has the option to highlight certain species with the focal.species argument, which takes a character vector of common names of species to display. The focal.colors argument allows the user to specify which colors to use for which focal species. If the data object contains other species aside from the focals, all non-focal species will be plotted in black as "Other".

# Produce a static ggplot barchart with interactive = FALSE;
# add focal.species with custom colors (any species in the data object
# that are not in focal.species will be plotted in black as "Other").
birdnet_barchart(data = dat,
                 interactive = FALSE,
                 focal.species = c("Pacific Wren", "Swainson's Thrush", "Varied Thrush"),
                 focal.colors = c('#00BE67', '#C77CFF', '#c51b8a'))


Static ggplot barchart highlighting the detections of three focal species through time at a monitoring location. The x axis shows date, and the y axis shows the total number of detections for that date.

Generally, interactive = FALSE should be used in conjunction with the focal.species argument. If the focal.species argument is being used, a legend will also be plotted. This option provides a static plot output and the opportunity to highlight a small number of focal species and their detection activity through time.

Meanwhile, use of interactive = TRUE is meant strictly for exploratory purposes. If the focal.species argument is not used, no legend will be plotted. A typical use case is to omit focal.species when setting interactive = TRUE since the pointer can hover over the data interactively to display species. A legend is not plotted in this case because typically interactive mode is only being used when there are dozens of species to display, and the number of colors in the legend will make species indistinguishable.

Create heat maps of BirdNET detections by date

birdnet_heatmap() allows the user to visualize heat maps of user-selected BirdNET results by date. Start by pulling up the function helpfile. Everything covered below is located in the "Examples" section of this helpfile.

?birdnet_heatmap

First, we read in example data to be used as an input to birdnet_heatmap. This would typically be a data.table generated by a call to birdnet_gather, and the data may or may not have been formatted using birdnet_format. Generally, this data object may be preceded by a call to add_time_cols. Regardless, the data object input to birdnet_heatmap should contain BirdNET detection data that comes from a single site, and the object must contain columns named "locationID" (character), "recordingID" (character), and "dateTimeLocal" (POSIXct). Below, we are also reading in exampleDatesSampled, which is a dates object.

# Read in example data
data(exampleHeatmapData)
data(exampleDatesSampled)

# Ensure your data has an appropriate recordingID column and time columns
dat <- exampleHeatmapData
dat[ ,recordingID := basename(filepath)]
dat <- add_time_cols(
           dt = dat,
           tz.recorder = 'America/Los_Angeles',
           tz.local = 'America/Los_Angeles'
)

In the data argument, birdnet_heatmap expects a data.frame or data.table of BirdNET results. In locationID and common.name, specify a valid character locationID and common.name for your location and species of interest. In conf.threshold, specify a numeric input for a BirdNET confidence threshold below which detections will be discarded. The optional argument julian.breaks allows you to input a numeric vector of julian date plotting breaks to use on the x-axis of the heat map (if omitted, these breaks will be computed automatically by the function). (Here's a useful chart for choosing julian breaks). Setting comparable.color.breaks = TRUE allows you to generate heatmap color breaks based on every species in the input data to enable easier inter-species comparisons. Setting comparable.color.breaks = FALSE means the function will simply generate heatmap color breaks based only on the species of interest you specified in common.name. Finally, the dates.sampled argument requires either a Date vector or character vector of dates that were sampled and should be visualized on the heat map. This information is required because your data input may only contain detection data, and not non-detection data (i.e., zeroes). For example, you might have recorded audio on 2021-03-14, but have no BirdNET detections in your "data" object. This will result in an inaccurate visualization. Since your results may not automatically contain non-detection data, it is incumbent on the user to input which dates were sampled.
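
As a rough illustration of the dates.sampled and julian.breaks inputs described above, the sketch below builds both in base R. The dates are made up for illustration and do not come from the package's example data.

```r
# Build a Date vector covering every date on which audio was recorded,
# whether or not BirdNET produced detections (these dates are made up)
dates.sampled <- seq(from = as.Date('2021-03-14'),
                     to = as.Date('2021-03-20'),
                     by = 'day')

# Convert calendar dates to julian dates (day of year) for use as plotting breaks
break.dates <- as.Date(c('2021-03-11', '2021-06-09', '2021-09-07'))
julian.breaks <- as.numeric(format(break.dates, '%j'))
julian.breaks
```

Working from calendar dates in this way can be easier than looking up day-of-year values by hand, particularly across leap years.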

# Generate a heatmap at Rivendell for Pacific Wren
# Set comparable.color.breaks = FALSE to maximize contrast in a single species map

# Add user-input julian.breaks
birdnet_heatmap(
     data = dat,
     locationID = 'Rivendell',
     common.name = 'Pacific Wren',
     conf.threshold = 0.2,
     dates.sampled = exampleDatesSampled,
     julian.breaks = seq(from = 70, to = 250, by = 30),
     comparable.color.breaks = FALSE
)


Heatmap of detections of Pacific Wren at a monitoring location. The x axis shows date, the y axis shows year, and daily detection value is visualized from low (purple) to high (yellow).

In the second example, we loop through multiple species and set comparable.color.breaks = TRUE to plot according to a consistent color ramp. This option enables easier comparison between species.

# Generate heatmaps for several species with comparable.color.breaks == TRUE
# so that heatmap color scale is conserved for ease of interspecies comparison
sp <- c("Pacific Wren",
        "Pacific-slope Flycatcher",
        "Swainson's Thrush",
        "Wilson's Warbler")

for (i in seq_along(sp)) {
 print(paste0('Working on ', sp[i]))
 g <- birdnet_heatmap(
     data = dat,
     locationID = 'Rivendell',
     common.name = sp[i],
     conf.threshold = 0.2,
     dates.sampled = exampleDatesSampled,
     julian.breaks = seq(from = 70, to = 250, by = 30),
     comparable.color.breaks = TRUE
 )

 print(g)

}


Heatmaps of focal species detections by date at a monitoring location. The x axis shows date, the y axis shows year, and daily detection value is visualized from low (purple) to high (yellow).

Create heat maps of BirdNET detections by date and time

birdnet_heatmap_time() allows the user to visualize heat maps of user-selected BirdNET results by date and time. Start by pulling up the function helpfile. Everything covered below is located in the "Examples" section of this helpfile.

?birdnet_heatmap_time

First, we read in example data to be used as an input to birdnet_heatmap_time. This would typically be a data.table generated by a call to birdnet_gather, and the data may or may not have been formatted using birdnet_format. Generally, this data object may be preceded by a call to add_time_cols. Regardless, the data object input to birdnet_heatmap_time should contain BirdNET detection data that comes from a single site, and the object must contain columns named "locationID" (character), "recordingID" (character), and "dateTimeLocal" (POSIXct). Below, we are also reading in exampleDatesSampled, which is a dates object.

# Read in example data
data(exampleHeatmapData)
data(exampleDatesSampled)

# Ensure your data has an appropriate recordingID column and time columns
dat <- exampleHeatmapData
dat[ ,recordingID := basename(filepath)]
dat <- add_time_cols(
           dt = dat,
           tz.recorder = 'America/Los_Angeles',
           tz.local = 'America/Los_Angeles'
)

The arguments to birdnet_heatmap_time are similar to those of birdnet_heatmap, with a few additions. The hours.sampled argument allows the user to clearly display which hours were actually acoustically monitored; the argument takes either an integer vector declaring which hours were sampled across the monitoring period (e.g., c(6:8, 18:20)), or a list declaring sun-based monitoring based on how many hours before and after sunrise and sunset were recorded. For example, list(sunrise = c(1.5, 1.5), sunset = c(1, 1)) means that the schedule recorded from 1.5 hours before sunrise until 1.5 hours after, and from 1 hour before sunset until 1 hour after. If hours.sampled is missing, the function assumes continuous sampling and will display the plot as such; beware that if you did not sample during all hours, the plot will make it appear as if you did. The y.axis.limits argument lets the user control how much of the 24-hour day to display (for example, y.axis.limits = c(2, 12) would display data from 2am to 12pm). The minute.timestep argument allows the user to specify how finely to bin the data; any divisor of 60 is allowed, i.e., one of c(1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30, 60). The sun.lines and sun.linetypes arguments allow the user to graph and customize lines indicating sunrise, sunset, dawn, dusk, and several other options detailed in the helpfile. If using these arguments, the latitude and longitude arguments are also required.
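
The two forms of hours.sampled and the allowed minute.timestep values can be written out as follows. This is only a sketch in base R: the variable names are hypothetical, and the bin arithmetic assumes that each bin simply spans minute.timestep minutes, which is an illustration and not a description of the function's internals.

```r
# Option 1: fixed clock hours sampled each day (hours 6, 7, 8, 18, 19, and 20)
hours.fixed <- c(6:8, 18:20)

# Option 2: sun-based schedule, given as c(hours before, hours after) each event
hours.sun <- list(sunrise = c(1.5, 1.5), sunset = c(1, 1))

# Allowed minute.timestep values are the divisors of 60 listed above
allowed.timesteps <- c(1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30, 60)
minute.timestep <- 5
stopifnot(minute.timestep %in% allowed.timesteps)

# Number of bins per hour and per 24-hour day, assuming each bin
# spans minute.timestep minutes
bins.per.hour <- 60 / minute.timestep
bins.per.day <- 24 * bins.per.hour
```

Smaller timesteps give finer temporal detail but spread the same detections over more bins, so a coarser timestep may read more clearly when detections are sparse.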

In the example below, we graph Pacific-slope Flycatcher detections at Rivendell above a confidence threshold of 0.25, in 5 minute increments, under a sampling regime where acoustic monitoring was conducted from 1.5 hours before sunrise to 1.5 hours after, with 0 sampling hours around sunset. We also graph lines for dusk, dawn, sunrise, and sunset.

# Generate a heatmap at Rivendell for Pacific-slope Flycatcher
# Set comparable.color.breaks = FALSE to maximize contrast in a single species map
birdnet_heatmap_time(
     data = dat,
     common.name = 'Pacific-slope Flycatcher',
     locationID = 'Rivendell',
     conf.threshold = 0.25,
     dates.sampled = exampleDatesSampled,
     hours.sampled = list(sunrise = c(1.5, 1.5), sunset = c(0, 0)),
     y.axis.limits = c(0, 23),
     julian.breaks = c(30, 60, 90, 120, 150, 180, 210, 240, 270),
     minute.timestep = 5,
     comparable.color.breaks = FALSE,
     tz.local = 'America/Los_Angeles',
     latitude = 46.1646,
     longitude = -123.77955,
     sun.lines = c('dusk', 'dawn', 'sunrise', 'sunset'),
     sun.linetypes = c('dotdash', 'longdash', 'dotted', 'solid')
)


Heatmap of detections of Pacific-slope Flycatcher at a monitoring location. The x axis shows date, the y axis shows time of day, and daily detection value is visualized from low (purple) to high (yellow).

In the second example, we loop through multiple species and set comparable.color.breaks = TRUE to plot according to a consistent color ramp. This option enables easier comparison between species.

# Generate heatmaps for several species with comparable.color.breaks == TRUE
# so that heatmap color scale is conserved for ease of interspecies comparison
sp <- c("Pacific Wren",
        "Pacific-slope Flycatcher",
        "Swainson's Thrush",
        "Wilson's Warbler")

for (i in seq_along(sp)) {

 print(paste0('Working on ', sp[i]))

 g <- birdnet_heatmap_time(
     data = dat,
     common.name = sp[i],
     locationID = 'Rivendell',
     conf.threshold = 0.1,
     dates.sampled = exampleDatesSampled,
     hours.sampled = list(sunrise = c(1.5, 1.5), sunset = c(0, 0)),
     y.axis.limits = c(3, 10),
     julian.breaks = c(30, 60, 90, 120, 150, 180, 210, 240, 270),
     minute.timestep = 1,
     plot.title = sp[i],
     comparable.color.breaks = TRUE,
     tz.local = 'America/Los_Angeles',
     latitude = 46.1646,
     longitude = -123.77955,
     sun.lines = c('dawn', 'sunrise'),
     sun.linetypes = c('longdash', 'solid')
 )

 print(g)

}


Heatmaps of detections of focal species at a monitoring location. The x axis shows date, the y axis shows time of day, and daily detection value is visualized from low (purple) to high (yellow).

Converting wave audio files to NVSPL tables with wave_to_nvspl

wave_to_nvspl() uses PAMGuide code to convert wave files into an NVSPL formatted table. NVSPL stands for NPS-Volpe Sound Pressure Level, and is the standard format used in NSNSD analyses. These are hourly files composed of 1/3 octave data in 1-sec LEQ increments. PAMGuide was developed by Nathan D. Merchant et al. (2015).

Start by pulling up the function helpfile. Everything covered below is located in the "Examples" section of this helpfile.

?wave_to_nvspl

We start by creating an example audio input directory and writing example audio to this directory. This is meant to illustrate the file types and folder structure wave_to_nvspl() expects to encounter.

# Create an input directory for this example
dir.create('example-input-directory')

# Read in example wave files
data(exampleAudio1)
data(exampleAudio2)

# Write example waves to example input directory
tuneR::writeWave(object = exampleAudio1,
                 filename = 'example-input-directory/Rivendell_20210623_113602.wav')
tuneR::writeWave(object = exampleAudio2,
                 filename = 'example-input-directory/Rivendell_20210623_114602.wav')

wave_to_nvspl() takes several arguments. input.directory indicates the top-level input directory path. data.directory is a logical flag for whether audio files are housed in 'Data' subdirectories (common when using Songmeter SM4). The next argument, test.file, is a logical flag for whether to run the function in testing mode or in batch processing mode. The project argument allows the user to input a project name. The project name will be used to create a "params" file that will save parameter inputs in a file for posterity. The timezone argument forces the user to specify the timezone for the time reflected in the audio file name. Additional arguments are described in the helpfile; note that there are several default values in this function customized for NSNSD default settings when using a Songmeter SM4 audio recorder.
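
To illustrate why the timezone argument matters, the sketch below parses the timestamp embedded in one of the example file names and shows how the same instant reads in another timezone. It assumes a SITE_YYYYMMDD_HHMMSS.wav naming pattern with no underscores in the site name, and it is not a description of how wave_to_nvspl() parses file names internally.

```r
# Example file name from above; assumed pattern is SITE_YYYYMMDD_HHMMSS.wav
fname <- 'Rivendell_20210623_113602.wav'

# Split the name (without its extension) into site, date, and time parts
parts <- strsplit(tools::file_path_sans_ext(fname), '_')[[1]]
site <- parts[1]

# Interpret the date and time in the timezone the recorder clock was set to
stamp <- as.POSIXct(paste(parts[2], parts[3]),
                    format = '%Y%m%d %H%M%S',
                    tz = 'GMT')

# The same instant expressed in a different timezone
format(stamp, '%Y-%m-%d %H:%M:%S', tz = 'America/Los_Angeles')
```

If the timezone supplied does not match the recorder's clock, every timestamp derived from the file names will be shifted by the offset between the two.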

The suggested workflow for this function is to first set test.file = TRUE to verify that your workflow has been accurately parameterized. When test.file = TRUE, wave_to_nvspl() will assess one file and encourage the user to check all outputs. Users should ensure there isn't an NA in the "Time stamp start time" output (if so, something is wrong). Lastly, the test.file = TRUE argument will create a plot allowing the user to verify that time is continuous. If there are breaks in the plotted line, there is an issue with your parameterization. For additional context and details, NSNSD staff and collaborators should view this video tutorial.

# Perform wave_to_nvspl in test mode (test.file = TRUE)
wave_to_nvspl(
 input.directory = 'example-input-directory',
 data.directory = FALSE,
 test.file = TRUE,
 project = 'testproject',
 timezone = 'GMT')

Once you feel confident that you have parameterized accurately, run the function in batch mode by setting test.file = FALSE. The example below provides progress feedback and takes a few moments to run. Once complete, we can view the NVSPL table outputs. Column names are described in the helpfile.

# Perform wave_to_nvspl in batch mode (test.file = FALSE)
wave_to_nvspl(
 input.directory = 'example-input-directory',
 data.directory = FALSE,
 test.file = FALSE,
 project = 'testproject',
 timezone = 'GMT')

# Verify that NVSPL outputs have been created
nvspls <- list.files('example-input-directory/NVSPL', full.names = TRUE)

# View one of the NVSPL outputs
one.nvspl <- read.delim(file = nvspls[1], sep = ',')

Finally, we clean up by deleting the example input directory.

# Delete all temporary example files when finished
unlink(x = 'example-input-directory', recursive = TRUE)

Converting NVSPL files to acoustic indices with nvspl_to_ai

nvspl_to_ai() takes NVSPL table data created by wave_to_nvspl() and converts it into a broad range of acoustic index values, including acoustic activity, acoustic complexity index, acoustic diversity index, acoustic richness, spectral persistence, and roughness.

Start by pulling up the function helpfile. Everything covered below is located in the "Examples" section of this helpfile.

?nvspl_to_ai

First, we create a few example directories: an input directory containing the sample NVSPL files that come with the package, and an output directory to collect the resulting CSV of acoustic index values. This is meant to illustrate the file types and folder structure nvspl_to_ai() expects to encounter.

# Create an input and output directory for this example
dir.create('example-input-directory')
dir.create('example-output-directory')

# Read in example NVSPL data
data(exampleNVSPL)

# Write example NVSPL data to example input directory
for (i in seq_along(exampleNVSPL)) {
  write.table(x = exampleNVSPL[[i]],
              file = paste0('example-input-directory/', names(exampleNVSPL)[i]),
              sep = ',',
              quote = FALSE)
}

Now we are positioned to run nvspl_to_ai(). input.directory indicates the top-level input directory path, output.directory specifies where csv results should be stored, and project allows the user to input a project name. The project name will be used to create a "params" file that will save parameter inputs in a file for posterity. Additional arguments are described in the helpfile; note that there are several default values in this function customized for NSNSD default settings.

# Run nvspl_to_ai to generate acoustic indices csv for example NVSPL files
nvspl_to_ai(input.directory = 'example-input-directory',
            output.directory = 'example-output-directory',
            project = 'example-project')

# View Results
(ai.results <- read.csv(list.files(path = 'example-output-directory',
                                   pattern = '\\.csv$', full.names = TRUE)))

Finally, we clean up by deleting all example files.

# Delete all temporary example files when finished
unlink(x = 'example-input-directory', recursive = TRUE)
unlink(x = 'example-output-directory', recursive = TRUE)

nsnsdacoustics's People

Contributors

cbalantic

Forkers

limitlessgreen

nsnsdacoustics's Issues

TFLite issues

I downloaded the source code and extracted it to my local computer. I followed all the instructions and got this error. Please let me know if you need any more information.

Thanks

Error: Cannot analyze audio file
C:/Users/xxxx/Documents/example-audio-directory/Rivendell_20210623_113602.wav.
Traceback (most recent call last):
File "", line 268, in analyzeFile
File "", line 217, in predict
File "C:\Program Files\BirdNET-Analyzer-main\model.py", line 121, in predict
loadModel()
File "C:\Program Files\BirdNET-Analyzer-main\model.py", line 38, in loadModel
INTERPRETER = tflite.Interpreter(model_path=cfg.MODEL_PATH, num_threads=cfg.TFLITE_THREADS)
File "C:\PROGRA~3\ANACON~1\envs\PYBIRD~1\lib\site-packages\tensorflow\lite\python\interpreter.py", line 197, in __init__
_interpreter_wrapper.CreateWrapperFromFile(
ValueError: Op builtin_code out of range: 127. Are you using old TFLite binary with newer model?Registration failed


birdnet_format issue resulting in altered recordingID and downstream errors in birdnet_barchart

This issue only seems to occur if a BirdNET text file doesn't include any species detections. When birdnet_format is run, it adds "BirdNET_" to the beginning of the recordingID.

When these formatted files are set up using the documented process and input into birdnet_barchart, the locationID is misread as "BirdNET" and the date and time fields are filled with "NA" values. I think the function is missing the date and time information because the recordingID field is no longer in the expected format. birdnet_barchart then produces an error stating that julian.range must be a finite number, which I believe is due to NAs appearing in the date and time fields.

Temporary wav files not deleted if you exit birdnet_verify

If you use escape to exit birdnet_verify in the middle of a set of verifications and play = TRUE, the temporary wav file written to the working directory is not deleted. The function does not produce any error message, and temporary files are deleted properly if you do input a verification.

birdnet_verify not updating with verifications

It returns the following after inputting a user verification:

Updating with new verifications.
Error in if (file == "") file <- stdout() else if (is.character(file)) { :
argument is of length zero

It seems from line 210 of birdnet_verify that the file name is meant to be inserted here:

message("Updating ", basename(finame), " with new verifications.")

  • so maybe linked with case sensitivity in ".WAV" vs. ".wav"?
  • or some issue with the finame object, which appears to be missing somehow?

R and RStudio are running on their latest versions

Debug code is committed

In the file R/birdnet_analyzer.R, debug code has been committed without being commented out.

# # Set up for profiling

audio.directory = 'C:/Users/cbalantic/OneDrive - DOI/Code-NPS/NSNSDAcoustics/example-audio-directory'
results.directory = 'C:/Users/cbalantic/OneDrive - DOI/Code-NPS/NSNSDAcoustics/example-results-directory'
birdnet.directory = 'C:/Users/cbalantic/OneDrive - DOI/BirdNET-Analyzer-main/'
lat = 46.09924
lon = -123.8765
start = 1
ovlp = 0.0
sens = 1.0
min.conf = 0.1
threads = 4
batchsize = 1
locale = 'en'
sf.thresh = 0.03

Issues with birdnet_format, birdnet_verify, and birdnet_barchart.

Hi Cathleen,

Wondering if you have any insights on the following errors:

For birdnet_format() I get:
Error in birdnet_format(results.directory = "mydirectory", : Multiple file extension types found in folder. Please make sure results are all txt or all csv. Do not mix file types.

This happens even though the files are all csv.

For birdnet_verify() I get:
Error in setkeyv(x, cols, verbose = verbose, physical = physical) : some columns are not in the data.table: recordingID

For birdnet_barchart() I get:
Error in birdnet_barchart(data = raw.results, interactive = TRUE) : could not find function "birdnet_barchart"

Thanks!

Best,
Vlad

Error when trying to install NSNSDAcoustics

Hello,

I get the following error when I attempt to install the package:

Error in .install_package_code_files(".", instdir) : 
files in 'Collate' field missing from 'C:/Users/vladk/AppData/Local/Temp/RtmpagvOJn/R.INSTALL2e98702075ac/NSNSDAcoustics/R':
  bcp-functions.R
  nvspl_to_ai.R
ERROR: unable to collate and parse R files for package 'NSNSDAcoustics'
* removing 'C:/Users/vladk/Documents/R/win-library/4.1/NSNSDAcoustics'

Thanks in advance!

-Vlad

parallel not found

We got the 'parallel' not found error when running the command below in RStudio. R and RStudio are the latest versions. Python is 3.8.0 or 3.8.10.

birdnet_analyzer(audio.directory = 'C:/Users/xxxxxx/Documents/example-audio-directory',
results.directory = 'C:/Users/xxxxxx/Documents/example-results-directory',
birdnet.directory = 'C:/Program Files/BirdNET-Analyzer-main',
lat = 46.09924,
lon = -123.8765)


Error in birdnet_analyzer(audio.directory = "C:/Users/xxxxxx/Documents/example-audio-directory", :
object 'parallel' not found

There is no need for the whole Anaconda package

I highly recommend using Miniconda instead. Even better, consider using the C++ implementation of it (Mamba), which has a much faster solver. This is especially important because the current installation instructions have long waiting times.
