browndwarf / gollum Goto Github PK

View Code? Open in Web Editor NEW

20.0 20.0 5.0 14.49 MB

A microservice for programmatic access to precomputed synthetic spectral model grids in astronomy

Home Page: https://gollum-astro.readthedocs.io/

License: MIT License

Python 69.25% Jupyter Notebook 23.67% TeX 7.07%

astronomy astrophysics physics python science spectroscopy stars

gollum's People

Contributors

Stargazers

Watchers

Forkers

astrocaroline cdxker vandalt danielandreasen

gollum's Issues

A function for finding the closest grid point in a lattice (3 dimensional space)

The sonora.py code currently takes in effective temperature, surface gravity, and metallicity as grid points in a three dimensional space. However, the points are in a jagged array (#31 ), so we need to create a function to find the nearest grid point whenever a grid point that doesn't exist is found.

Add GitHub actions continuous integration

Our sibling package muler has introduced continuous integration with GitHub Actions. Let's do the same! It may be trickier to add support for testing model grids since they are so voluminous. Hmm, what to do?

Support for a reading in just a single static file?

One of the chief problems we have is that the grids are voluminous. We have contemplated many ways to attack that problem with mini-grids (#27), or a binary storage format (#26).

Here we propose another stategy--- the ability to read in just a single standalone static file.

Imagine I simply hand you the (URL or local) path to a single PHOENIX or Sonora model. For Sonora, these have all the information you need to make a spectrum object. For Phoenix, you also need the wavelength file (we could handle that, though).

Add a downsample or decimation method

For some plotting purposes it is useful to decimate the data after smoothing. We should add such a method.

Increase Test Coverage

Unit tests sometimes let some errors slip through. Improved coverage would give it a more accurate output of if the code is working.

Dashboard extension idea: dynamic side plot of Teff vs logg (vs. [Fe/H]?)

@SujayShankarUT and I were talking about an extension to the dashboard, in which the instantaneous position of Teff and logg are shown on a side plot. This layout would resemble the lightkurve TPF interact view

We could either show a 2D view or a 3D view, like this:

But with the current (Teff, logg) coordinated highlighted with a bigger, colorized circle for emphasis. So as you move the sliders around, the current highlighted grid point would jump around in the plot.

What's more is that we could enable a custom python callback, so that the user could click with the cursor on a particular grid point coordinate, and the spectrum shown would update to that (Teff, logg) pair. A technical implentation for this python callback design is available in the lightkurve interact source code

For a 2D view, some dimensions would have to be excluded by default. So yet another extension could be to use bokeh's dropdown or radio buttons to allow the user to select which pairs of physical properties they would like to view.

This is all really complicated and would likely be a ton of work, but could be a fun UI/UX visualization project, with both front-end and back-end development.

Finite differences idea for Sonora family of models: abundance Jacobians

I've been talking with @astrocaroline about an idea for augmenting the Sonora models with some cheap-and-useful ancillary information. Here is a sketch of the idea we have:

An opportunity: finite difference Jacobians for free cheap

Caroline explains that the computation of a single Sonora spectrum gets bottlenecked at the stage that computes physical self-consistency: assembling the T-P profile amidst all the opacitiy sources, etc. However, once that righteous $T-P$ profile has been obtained, perturbations to the underlying assumptions can be computed very cheaply. Examples of such perturbations include tweaking the abundances, for example.

These perturbations can be made small enough that the underlying assumptions that arrived at the $T-P$ profile are close to correct. Under such limiting cases the perturbed output spectra represent a Taylor Series expansion on the grid point. We can then obtain the Jacobian by taking finite differences, say between the original bona-fide grid point, and this new one:

$\lim_{h \to 0} \frac{F_\nu([H_2O]+h) - F_\nu([H_2O])}{h} \approx \frac{\partial F_\nu}{\partial \mathrm{[H_2O]}} $

where $h = \delta [H_2O]$ represents a small perturbation to the water abundance. So this equation represents the change in the emergent spectrum from perturbing the water abundance, at the native $R \sim 1,000,000$ spectral resolution of the precomputed synthetic spectrum!

The problem: how to deal with them?

Such Jacobian information is not currently packaged with the model outputs, and it would be tricky for a consumer to obtain an estimate without reverse-engineering aspects of the water line list that were already in-hand at the time of model convergence. So why don't the model owners compute these Jacobians, package them with the models, and publish them?

At least one reason is because it's not clear how folks would use them: they are a non-standard model output, and there apparently has not been any demand for them.

A `gollum`-based strategy for handling Jacobians

gollum provides a potentially new way to liberate these outputs, if they existed in the first place. For example, you could imagine an extension to the dashboard for water abundance that allows the user to move a slider for water abundance, restricted to in a small range of validity near the grid point. While imperfect, this visualization guide would let the user know whether certain features are attributable to water or not. Some heuristics for such a visualization exist, but mostly for low resolution spectra: essentially "water is this big bandhead". But as we move towards high resolution spectra, the heritage of any particular line or group of lines becomes much more difficult to interrogate: lines overlap and shift and ebb and flow. So I envision this tool as primarily unlocking new use cases at high spectral resolution.

There is another benefit: These Jacobians could be used to compute "Information Content" analyses, by taking the dot-product of Jacobians from different physical perturbations (each doctored with the same resolution kernels and sampling). That's a formalized way to answer questions like "to what extent is H2O abundance degenerate with FeH in my wavelength range of interest?". For some use cases this information content analysis could lessen the demand for the much-more expensive MCMC "retrieval" analysis that currently achieves similar aims. It could make it easier to write JWST proposals that assess the tradeoffs among instrument modes, for example, achieving better proposals and better overall resource allocation.

Hypothetically there may be a way to obtain finite-difference Hessians, by perturbing pairs of parameters, but I've though much less about that. I suppose that's just to say that there are even more spin-off technologies that could arise by building a workable prototype around these ideas.

How to actually generate the products, and who should do it?

One key idea is that it would involve making new model outputs. To date gollum has taken the models as given, precomputed text files stored on the web. I think it is beyond the purview of gollum to generate new model products. So likely the generating code would live in a separate repo, and then the products would get consumed with new gollum code. So there is code development in both of those places.

Who wants to work on this? Thoughts?

API documentation not rendering

Probably because of missing requirements.txt dependencies...

Experimental Telluric Transmission support --- discussion and fate

I introduced a new lightweight experimental telluric transmission class TelluricSpectrum in commit e3a2cee

The class supports ASCII output from the ESO SkyCalc online calculator.

This spectrum is technically a "Precomputed Spectrum", so it might have a place in gollum. Instrumental broadening, normalizing, and plotting all apply to this spectrum. On the other hand, methods like rotational broadening do not apply. On balance, I think having this lightweight class around is helpful, and it means we can easily underplot telluric transmission on the interactive dashboard. This dashboard plotting was the main impetus for adding the feature.

The main problem with the class is that SkyCalc does not have a grid to read in. It is computed for user-specified input, and then exists only as a temporary browser IFrame that cannot be easily referenced. So actually connecting to the data is awkward since people have to copy-and-paste the output of SkyCalc, which lends itself to generation loss as users may input new lines, or add headers, or other non-standardized changes to the content.

So: What should we do about tellurics? Should we

Make them yet-another-separate-microservice
Leave them in gollum
Something else?

Extensions to the dashboard

The Sonora dashboard is working great! A few conceivable extensions I talked about with @astrocaroline ---

Add a save button for the model state
Sometimes you may want to reuse the model state that you obtained from the dashboard. A simple thing to do would be to simply save a csv file of the wavelength and flux of the model. This static approach does not support round-tripping-- the csv file can't be recognized as a Sonora spectrum later. Alternatively you could save the model parameters as a csv file and then recompute the model from those values (supports round-tripping).
Instrumental resolution handling
What should we do about instrumental resolution when a data of known resolution is provided into the dashboard? We could add it as a slider or put a toggle button. Currently I think we default to native resolution (or R=100,000), and use the resolution of the spectrum if it is available. But I have disabled it at various times.

Make sure we are using deep copies instead of self?

We learned a few lessons from muler that we should incorporate into gollum. We had to make deep copies instead of self in order to avoid in-place operations. We should do the same here. It has not appeared to cause a problem yet, but it might eventually.

How to deal with jagged grid in Sonora Bobcat 2021

In Sonora Bobcat 2021, it states that

Some combinations of model grid parameters include additional values of the gravity.

This grid sampling strategy results in jagged arrays. We need a system to know which combinations of grid points are allowed. That will mean we need to make a utility or attribute called "allowed_grid_points", essentially a look up table for which grid points are allowed.

_truncate method is buggy with data input

When using the data keyword for _truncate, the grid is not properly truncated.

Lower the barrier to entry for setting up model grids: supplying a mini-grid or "tutorial grid"

In addition to lowering the barrier to downloading and setting up the whole grid (#26), we also want to make the downloading extremely fast for quick-look purposes. It may therefore be useful to provide a decimated grid, truncated in wavelength extents, reduced resolution, reduced parameter ranges, but with all of the available dimensions to facilitate the dashboard.

Luckily, this grid may also have significant scientific potential: It could represent low res spectra across the IJHK bands, for instance, and therefore be useful outside the context of fitting high resolution spectra.

So making this grid is a win-win from an engineering and science standpoint, and may therefore be worth doing.

Expand the dashboard to include panels for T-P profile and composition gradients

The Sonora dashboard currently only plots the spectrum, but we could---and should---show other ancillary views into the physics of the model grid. The principal view-of-interest is the famous Temperature-Pressure (a.k.a, $T-P$, or sometimes $P-T$) profile, which changes with $T\mathrm{eff}$, $\log{g}$, and to some extent metallicity and C-to-O-ratio.

So in this hypothetical new version of the dashboard, the spectrum would update while simultaneously updating the appearance of a yet-to-be-added $P-T$ profile panel plot. In this way, the user would get instantaneous visual feedback about how the $P-T$ profile changes with $T\mathrm{eff}$, and how the Spectrum changes with $T\mathrm{eff}$.

The metadata for the $P-T$ profile is housed within the “structure.tar” files provided with Sonora-Bobcat. ATMO also provides this structure data.

In a conversation with Mark Phillips at CoolStars21 in Toulouse, Mark encouraged the addition of a third panel that is related but distinct from the $T-P$ profile plot. This third panel would show the composition of chemical species as a function of Pressure. One could conceivably overplot these on top of the P-T pofile, since they share the same $y-$axis, Pressure. But Mark recommended against cluttering the view too much. I tend to agree, since the movement of the curves will make the plot look visually busy. It's common to also plot the condensation curves for various species; I'm not sure if we would incorporate that visual element, but it's of course possible.

I think it’s doable to add these. I could imagine a Hack Day in the Fall where some of us work together on a prototype.

By the way, I think this dashboard would be an excellent learning tool. The grids are complicated and intrinsically "high" dimensional (about 4 or 5 tunable dimensions, plus many covariates). So a visualization tool like this would be useful to both newcomers and seasoned practitioners alike.

One drawback is that currently the dashboard has to be customized to each model grid, since they have different dimensions. That's OK, but means we have lots of duplicate code. I don't think that's a major problem, just a maintenance burden since the various dashboard get out of sync with each other.

Miscellaneous notes on phoenix.py (take with a grain of salt)

General: Change double quotes to single quotes, should make things seem less busy.
Use f-strings instead of .format()
Use # for one-line comments.
Add type hints to function and class arguments.
Tuple unpacking can be done in one line.

L65-69: PATH AND METALLICITY DEFAULTS
Merge default values into init arguments.

L84: MASK VARIABLE DEFINITION
Use chain comparison instead of two separate comparisons.

L121-143: TEFF, LOGG, AND METALLICITY PROPERTIES
Use conditional expressions to create one-liner functions.
Call to .keys() is unnecessary, 'in' operator automatically searches keys.

L171-175: CHECK FOR FLUX, SPECTRAL_AXIS, AND META
Use set().issubset(dict) instead of multiple 'in' statements.

L188-200: RANGE CHECKING
Use chain comparison instead of repeated comparisons.
Eliminate the 'subset' middleman variable, and directly put the comparison as the index.

L202-203: EMPTY LIST CREATION
Move all empty list creations to one line.

L214-225: GRID POINT VARIABLE DEFINITION
Eliminate the 'grid_point' middleman variable, and directly append the tuple to grid_points.

L230-231: LOOKUP_DICT VARIABLE DEFINITION
Use enumeration to eliminate finding the length and the indexing operation.

L323-328: WAVELENGTH_RANGE VARIABLE DEFINITION
Directly assign wavelength values to the shortest and longest instead of creating a tuple and unpacking it.

L332-334: MASK VARIABLE DEFINITION
Use chain comparison instead of repeated single comparisons.

L340-341: ASSERTIONS FOR FLUXES AND WAVELENGTHS
Combine assertions.

L349-355: FINDING NEAREST TEFF AND METALLICITY
Eliminate middleman variable 'idx'.

L529-530: ELSE PASS
Unnecessary.

Support for Sonora Cholla

The new disequilibrium models are out, named Sonora Cholla. Let's add support for these.

show_dashboard duplicates boilerplate code in sonora and phoenix

SonoraGrid and PHOENIXGrid both have a .show_dashboard() method. The two methods share almost the same code duplicated nearly verbatim, possibly as much as 90% identical, with only about 10% differences. This Write Everything Twice (WET) approach appears to violate Don't Repeat Yourself (DRY). show_dashboard constitudes 44% of the lines of code of phoenix.py and 40% of the lines of code for sonora.py, so reducing the duplicate code would stands to make a big impact.

The problem is that the 10% differences are difficult to abstract. In particular, we would need a way to generalize dimensions of grids. That amounts to:

"Add a slider that changes the fourth grid dimension"

...means that we need a pre-requisite goal:

"Abstract the grid dimensions with aliases so that unlike grids can share underlying mechanics."

So for example, we would have something like

grid.first_dimension = grid.teff_points
grid.second_dimension = grid.logg_points
grid.third_dimension = grid.metallicity_points
grid.fourth_dimension = grid.alpha_abundance_points # (Not Implemented yet)

That's easy enough to do, but in practice the corresponding manipulations can be tricky. For example, Sonora and PHOENIX both have ragged/jagged/sparse filling of those dimensions. So we would need some generic mechanism to handle both the abstraction and the "raggedness". That might be straightforward, but for some reason it sounds error-prone too.

@SujayShankarUT and I agree that the path forward is to do some exploratory work in this direction:

Add the fourth dimensions to both grids (i.e. C / O for Sonora #40 and $[\alpha/Fe]$ for Phoenix). The idiosyncracies of these dimensions will reveal themselves and show the righteous path towards abstraction layers.
Add some experimentation in to-what-extent we can abstract the dimensions. So for example, we could implement a show_dashboard method in precomputed_spectrum that the users would never see because it gets overridden instantly by each subclass, but we developers could be experimenting with it behind-the scenes. Then, once it's working, we can simply delete the subclass methods and the parent class method will gracefully take over. This dev strategy will give us a long runaway for experimentation...

Add a normalization slider to the PHOENIX dashboard

The Sonora Dashboard got a normalization slider but somehow the PHOENIX dashboard didn't get one. Let's add one, it should pretty much be a copy-and-paste from Sonora to Phoenix.

Add support for ATMO2020

I had a great conversation with Mark Phillips at CoolStars21 in Toulouse, France. We discussed the prospect of adding support for the ATMO2020 model grid ¹ into gollum. ATMO2020 would be the third model grid we support, with existing support for PHOENIX and Sonora-Bobcat. This support could include both Equilibrium and Disequilibrium chemistry options.

It would be straightforward to add in ATMO2020, since it would follow the same overall structure as PHOENIX and Sonora-Bobcat. In particular, it would inherit from the PrecomputedSpectrum class, which provides a majority of the standardized operations. We'd make a new module and class:

from gollum.atmo import Atmo2020Spectrum, Atmo2020Grid

native_spectrum = Atmo2020Spectrum(teff=700, logg=5.0)

native_spectrum.normalize().plot()

Philips et al. 2020 ↩

Instrumental Broadening bug

The current .instrumental_broadening() method probably has a bug (converting FWHM and sigma in the wrong direction). Pretty sure I just fixed that. But while doing so I was reminded of another issue. We simply compute the angstroms per pixel with the median. For spectra with discontinuous jumps in pixel sampling, this step is erroneous. Instead we should consider using something like the Nadaraya-Watson kernel smoother, which I think will handle the discontinuous jumps just fine.

Make an animated gif showing how the dashboard works

Currently the dashboard only works locally, meaning that it cannot easily be uploaded to an html website and used in the browser in real-time. Instead, let's make an animated gif from a short screen recording. We can then upload that animated gif to social media for advertising purposes and we can display it instead the tutorial.

Low resolution Brown Dwarf grid tutorial and animated gif

We should make a tutorial showing how to apply the interactive dashboards to low resolution spectra. We had previously focused on high-resolution applications, but many practitioners think at low resolution, especially for ultracool dwarfs, where high resolution had historically been difficult to obtain.

Here is a demo of the new Sonora-Bobcat ultracool dwarf dashboard illuminating some satisfying morphology as you move sliders for temperature, surface gravity, and metallicity.

This dashboard may be of-interest to folks at BDNYC (and elsewhere) @kelle @jfaherty17 👀

sonora_bobcat_JHK_dashboard_demo.mov

A custom dashboard with sliders for e.g. starspot physical properties, and generalization discussion

Starspots emit light and---if their surface coverage is large enough---that light can be significant enough to be perceived as a weak constituent in the stellar-disk-averaged spectrum. The extent of this starspot emission can be quantified with two parameters:

$f_\mathrm{spot}$ the coverage fraction of the projected stellar disk exhibiting starspots (e.g. 7%)
$T_{\mathrm{spot}}$ the characteristic temperature of the spot, typically less than the ambient photosphere Temperature (e.g. "2800 K)

Idea: We could add these two parameters as new sliders in the dashboard.

The problem with this idea is that it's fairly specialized: most folks don't need such parameterization, since it affects special categories of stars (mostly pre-main sequence, sub-subgiants, and M dwarfs). Only the most profoundly spotted stars ($f_{\mathrm{spot}}\gtrsim20$%) exhibit enough collective emission to be visually detected in a typical echelle spectrum. So most practitioners safely ignore this effect, or turn to precision techniques. In general, caution should be exercised when deciding to build a tool that is narrowly tailored to a tiny application area.

One strategy could be to generalize the dashboard code: make it more modular and therefore easier to stick pieces together on-the-fly. That redesign would suit both this application and eclipsing binaries, veiling, and other ideas. As currently written, the widgets are hard-coded, preventing this ease of extensibility. I can't easily think of a way to generalize the code, but I haven't thought about it too much. I suppose a dialog about it with some architecture-minded folks could reveal some new strategy.

Another strategy would be to build it, but keep it out of gollum altogether: to allow custom plug-ins or verbatim-mimicked/adapted dashboards that tolerate hardcode, and they simply live elsewhere, so as not to muddle gollum too much.

I think for now I favor the latter approach: build a bespoke dashboard offline. We can take the lessons learned from it and apply them back into gollum later on, if needed.

Split PHOENIX and Sonora tests

Many users, especially students, are likely to only be interested in loading either PHOENIX or Sonora models, not both. Recommend providing something like py_phoenix.test and py_sonora.test so student users aren't bogged down with failures related to grids they are not currently interested in.

PHOENIX and Sonora getitem method is returning a spectrum with incomplete metadata

getitem for PHOENIXGrid is indexing in an odd fashion which causes the returned spectrum to miss teff, logg, and metallicity attributes.

Merits of type hinting in arguments

I was thinking that some of the init arguments could use type hints, but only ones that could take one type of input. Ex. if we had a scalar value input, a type hint for "float" should cover it.
This raises the question of what tradeoffs arise when using type hinting

Correctly handle convolution when wavelength sampling changes

We currently use np.convolve for rotational broadening. This procedure assumes fixed separation between samples, and is therefore erroneous when the sampling changes (as it does in PHOENIX near 1 micron for example). We need to fix this, probably by convolving at both spacings, and then splicing the convolved fluxes at that breakpoint.

Interpolating between points on a model grid

This is partly a feature request but also partly an example of how this might be done.

You can load a grid of models, for example PHOENIX models like so

grid = PHOENIXGrid(wl_lo=wave_range[0], wl_hi=wave_range[1], path=path_to_pheonix_models, teff_range=[8000, 11000], logg_range=[3.5, 5.0], metallicity_range=[-0.5,0.5])

which you can then fit with the dashboard (for example see this Tutorial https://gollum-astro.readthedocs.io/en/latest/tutorials/gollum_demo_Sonora_and_BDSS.html).

grid.show_dashboard(data=std_spec)

While (as far as I know) there is no current implementation of interpolating between models on a grid, I found I could load two models and combine them, given a chosen fraction of how much to use from each. In the following example, I combine two PHOENIX model synthetic spectra which are identical, but one has a log g=4.5 and the other has log g =5.0, into one and then proceed to process and normalize it:

synth_spec_uno = PHOENIXSpectrum(path=path_to_pheonix_models, wl_lo=wave_range[0], wl_hi=wave_range[1],
    teff=10000, logg=4.5, metallicity=-0.5)
synth_spec_dos = PHOENIXSpectrum(path=path_to_pheonix_models, wl_lo=wave_range[0], wl_hi=wave_range[1],
    teff=10000, logg=5.0, metallicity=-0.5)

uno_fraction = 0.55
dos_fraction = 1.0 - uno_fraction
synth_spec = synth_spec_uno * uno_fraction + synth_spec_dos * dos_fraction

synth_spec = synth_spec.normalize(percentile=90.0)
synth_spec = synth_spec.rotationally_broaden(15.0)
synth_spec = synth_spec.rv_shift(25.0)

Something like this could form the basis for interpolating between points on a model grid to generate synthetic spectra with stellar parameters with values between grid points. In the meantime, the above example I have seems to work, for anyone who needs this functionality.

Add a feature to resample to Doppler-ready spectral axis

There exists a special wavelength sampling that is uniform in Doppler velocity coordinates, and exponential in wavelength space. Adopting this special sampling makes it effortless to apply instrumental and rotational broadening convolutions, since the np.convolve operators or equivalent can be applied.

This issue matters mostly for high-bandwidth spectra, approaching say an octave of bandwidth, though as usual it depends on the science case, too.

We should consider adding a method to resample the spectrum to this special sampling. Doing so would solve #3 and #9.

JOSS paper discussion

Let's make a gollum paper!

Destination and bundling

The muler paper is destined to be submitted relatively soon, likely to JOSS (and possibly part of an affiliated AAS journals paper). See the muler issue for that discussion: OttoStruve/muler#76

My inclination is to submit a standalone JOSS gollum paper. I've already started a paper.md file here: 6e950e8

The alternative paths would be either A) combine it with muler, or B) associate it with an affiliated AAS Journals paper. Option A seems a bit messy, and option B doesn't have a specific AAS journal paper in mind, since I see gollum as a general purpose tool/microservice. So option B would delay the publication, essentially indefinitely.

Timeline and things-to-do before submitting

In principle we could submit now-- the framework is working and is pretty good albeit imperfect.

The main thing I want to add before submitting is support for the full Sonora-Bobcat 2021 model grid (Marley et al. 2021) with metallicity support (Issue #6). My recommendation is to make adding this feature a priority.

We'd probably also want to add a tutorial or video screencast on how to use the dashboard. Getting up-and-running with the dashboard is tricky since it only works locally (does not work on Google Colab, for example), and requires the voluminous model grids in a certain directory structure. So showing the awesomeness of the dashboard without having to install is a key important step.

So that puts us around April/May 2022 for submission time. I'm okay with that, and it allows some new contributors a chance to join in before then.

Add an interactive dashboard to SonoraGrid

The intuition project shows an interactive bokeh dashboard in a Jupyter notebook. We want to port that code to gollum so that one can simply:

grid.show_dashboard()

What to name the method? show_dashboard, interact, or just dashboard?

Update the "Downloading Model Grids" page with instructions for Sonora Bobcat 2021 model grids.

We used to only support the zero metallicity version. We now only support the 2021 model grid, with all the text files already unzipped. We should update the website to reflect that change.

Add a trim edges feature

The convolution process currently leaves edge effects. We should explore if we can reduce edge effects with some clever kwargs in np.convolve. But we should also just have a .trim_edges feature.

Truncate kwargs have a unit inconsistency

Truncate's kwargs are not being used correctly; there is a bug in the units. Needs some TLC.

Feedback from Cool Stars 21 focus group

I'm here at Cool Stars 21 in Toulouse France with @kelle and @Will-Cooper and we are examining gollum.

1. What to call the interactive component

We currently call the ..show_dashoard a dashboard, but @kelle points out that this is misleading. At the currently time it is mostly a widget, due to the predominence of sliders. Eventually we may add more dashboard-like interfaces, so do we want to futureproof it or rename it? Kelle notes that whatever bokeh calls it could also guide the decision.

2. How/whether to connect to `simple` web app

Kelle likes the dashboard/widget! Yay! She argues that it should go on to the simple web app. We are now discussing the tradeoffs and design possibilities.

More flexible units, less hardcoding

We currently assume Angstroms everywhere in gollum. While that is convenient, some communities are used to microns or nanometers, or maybe even wavenumbers. We should allow the spectra to reside in whatever units are input. We'll have to sanitize any locations that have Angstroms hardcoded, that's doable. Wavenumbers are harder to deal with because in some places we may assume spectral axes are sorted in certain ways.

The flux units are also tricky. We should consider allowing either $F_\nu$ or $F_\lambda$ values.

These are all straightforward to implement with the astropy equivalencies protocol, it may just require some boiler plate code whenever we do some operation that may depend on the unit (e.g. plotting, black body normalization).

Normalization not permanent in PHOENIX

Every time we change grid points while having some level of normalization, the spectrum loses any normalization it had, and needs the normalization slider to re-update for it to go back into effect.

A "visualize_grid()" or "summarize_grid" method

Today @Jiayi-Cao, @astrocaroline, and I discussed the challenge of visualizing the sparse, semi-regular 6+ dimensional point clouds that we practitioners broadly refer to as "The Grid". We seek a way to visualize or summarize the grid at-a-glance, useful to both newcomers and experienced practitioners alike.

One idea is a 2D or 3D plot, such as in this slide of a presentation I gave in Seoul, South Korea in 2015.

The problem with the plot is that it can only show at most 3 Dimensions at a time, and we have 6+ dimensions.

Another strategy would be a corner plot. The problem here, though, is that the hidden dimensions in any pair of axes will be exactly overlapping, masking any high dimensional sparsity. That conflict may sound hypothetical, but such a situation can arise in [M/H] vs C-to-O ratio in the new Sonora Bobcat 2021 models. Even still, a corner plot may be interesting.

Alternatively, summarize_grid( ) would be a verbal representation of the grid. Something like:

This grid has:
12 points in Teff
8 points in logg
3 points in [M/H]
3 points in C / O (for a restricted subset of metallicity points)

It spans - wavelength at a spectral resolving power of R=

... and even more info.

It's important to emphasize that these summary properties will not be static, since SonoraGrid can have sub-grids, and can truncate extents, or decimate data. So these functions have to conduct a metaphorical MRI on the grid, so they may be non-trivial depending on how much we want to say.

Carbon-to-Oxygen ratio support for Sonora Bobcat models

We added support for Metallicity [Fe/H] to Sonora Bobcat, and now it has a working slider for that dimension.

℅ ratio is available for a subset of the model grid, and we'd like to add support for that.

Add ability to truncate the grid to a given wavelength range

Currently we truncate the grid upon reading-in the spectra. But sometimes we want to start with a wide range and then cut the grid to a smaller grid. That's currently not easy to do. Let's make it easy.

Add color shift gradient to RV slider in Dashboard

Add a red to blue gradient depending on if the recessional velocity of the object is positive or negative.

Handle Jagged or "Ragged" 3D/4D grid arrays in PHOENIX

I've just completed download of the Phoenix Grid files. 162G in 1d 13h!

I tried to run through the tutorial for simulating a spectra with Phoenix instead of Sonora:
https://gollum-astro.readthedocs.io/en/latest/tutorials/gollum_demo_Sonora_and_BDSS.html

When trying to create the grid with:
grid = PHOENIXGrid(wl_lo=wl_lo, wl_hi=wl_hi)
it got to about 73% and then stopped with the following traceback:
"AssertionError: Double check that the file D:\PHOENIX\phoenix.astro.physik.uni-goettingen.de\HiResFITS\PHOENIX-ACES-AGSS-COND-2011/Z-4.0/lte07800-3.00-4.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits exists"

I checked my files and it is indeed missing. ~/Z-4.0/lte07800-2.50* and ~/Z-4.0/lte07800-3.50* are both there, but not 3.00

Looking back in my terminal log, it appears as though the download didn't even look for ~/Z-4.0/lte07800-3.00*
it just went straight from 2.5 to 3.5.

Is the file missing? Is there a way to get just that file to complete the grid? Is there a way to turn ignore that missing grid point?
I'm not sure what I should do. Please advise.

Gollum + Muler environment file and dual installation instructions.

Currently we make two environments, one for Muler and another for Gollum. We then must switch environments to switch tools.

A single environment file for both packages would make the two more compatible.

Installation instructions for the merged environment should be provided in the documents.

It seems like gollum is an extension to muler, and so making the muler documentation include the the gollum setup, that would be helpful.

Metallicity support for Sonora Bobcat models

Eventually the Sonora-Bobcat models will have metallicity, C/O ratio, and possibly other dimensions like f_sed. We want to be able to read in those future metallicity grids, while maintaining support for the extant solar-metallicity grid.

Future grids may have irregular sampling in the grid parameters. Irregular sampling is a smart strategy for grappling with high dimensional spaces, but it does add some complexity to how we read in the data and how to make the dashboard sliders identify and "snap-to" the closest gridpoint.

@astrocaroline has provided some pre-release grids with metallicity for development and evaluation purposes. To read in this grid we may want to move towards a glob and filename comprehension approach to identify eligible grid points rather than pre-specify the grid points. I am starting an experimental feature branch in anticipation that these changes may be backwards breaking...

Documentation or utility function for fetching the raw model grids

We should add documentation, scripts, or a utility function for automatically fetching the model grids. The automatic fetching is tricky because we can't assume everyone has, say, wget or whatever. We might want to use beautifulsoup (or more likely requests) to make the requests: that's another dependency to add in---which is OK but adds some complexity.

The other hiccup is that the grids are huge, so we wouldn't want to just cram them into a hidden folder, and we'd want to make sure the user knows how big they will be in advance.

Should we allow automatic grid downloading with the `download=` approach?

Currently we allow the downloading of a single PHOENIX model with the download= kwarg. Hypothetically we could extend this into the PHOENIXGrid. The problem is that it would take a long time to download everything if the grid is huge (15G).

This approach does do caching though, so maybe it's OK?

Thoughts?

Lower the barrier to entry for setting up model grids: downloading the full grid

Downloading, unzipping, and moving the model grids into the right location is time consuming, platform-dependent, and error-prone. We want a solution that provides access to the model grids quickly and reliably.
There are two ideas:

Provide a script to download and extract the native tar.gz files
Pros: Does not introduce a new standard that we have to maintain, respects the original location.
Cons: Difficult to design a platform-independent script capable of unzipping.
Store the raw grid in a binary format (e.g. HDF5, parquet, arrow, numpy arrays, pickle files, feather, etc.)
Pros: Possibly very fast and efficient, only one file to download (not 6+ individual tar files),
Cons: We have to maintain the files now (where do they get long-term stored, what README's go with it, might even be against the preferences of the original authors), the grids may be so big that storing them as a single massive array will exceed some machine memory limits.

In a discussion with @astrocaroline and @Jiayi-Cao today we agreed that option 2's benefits win the day. We'll want a format like HDF5 that allows granular access to sub-portions of the array without reading the entire dataset into memory.

add an option to save and load the processed grid

should add a save button to store, and a load button to load grids without repeatedly processing them.

Interactive notebook_url

Tried using the notebook_url keyword when plotting the Sonora grid models interactively, but even when redirecting to the proper localhost:num I end up with the following error:

ERROR:bokeh.server.views.ws:Refusing websocket connection from Origin 'http://localhost:8891'; use --allow-websocket-origin=localhost:8891 or set BOKEH_ALLOW_WS_ORIGIN=localhost:8891 to permit this; currently we allow origins {'localhost:8888'} WARNING:tornado.access:403 GET /ws?kernel_name=python3 (::1) 0.75ms WARNING:tornado.access:403 GET /ws?kernel_name=python3 (::1) 0.75ms