dleebrown / anna

Automatically parameterize stellar spectra using a convolutional neural network

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%

astronomy convolutional-neural-networks machine-learning spectroscopy

anna's Issues

check if loading a frozen graph clears devices

Edge case, so not critical: testing and inference can be run on the CPU for small data sets anyway.

Duplicate tensor op name issue

Problem in ANNA_test: ANNA_train is imported so that its preprocessing helper functions can be reused, which avoids bloating the list of things that need updating when a function is changed. But the TensorFlow graph is also built when ANNA_train is imported, so when the saved metagraph is imported, ANNA_test fails because the variable names already exist.

Possible solutions:

1. Try running in a different session in ANNA_train, then the ANNA_test graph will be imported but unused.
2. Move helper functions to a separate .py file and import that.
   Both of the above will need to deal with the fact that the queue op is buried in a python function.
3. Move copies of helper functions to ANNA_test. This is not ideal since it will generate redundant copies of the functions.
4. Make a separate architecture file. This is basically the inverse of solution 1, and has the same problems.

Look into stability of weight initialization.

Under more extensive testing, it looks like the weight initialization in v0.1.0, which follows the He+ (2015) weight initialization scheme, is unstable under certain training conditions, namely low absolute variance in the input data (e.g. training on a <1000 K temperature range).

Is this a bug? Is my weight initialization incorrect vs He+?
If not a bug, redo the weight initialization to make the training process more stable.

For now, rolling back to the previous weight initialization scheme, which is just based on the incoming dimensionality to each layer.
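
For reference when checking the initialization against He+ (2015), here is a minimal sketch of what that scheme prescribes for ReLU layers: zero-mean Gaussian weights with standard deviation sqrt(2 / fan_in). It is not ANNA's code. The function names, the layer sizes, and the use of plain Python lists instead of TensorFlow tensors are assumptions made for illustration.

```python
import math
import random


def he_std(fan_in):
    """Standard deviation prescribed by He+ (2015) for a ReLU layer."""
    return math.sqrt(2.0 / fan_in)


def he_normal(fan_in, fan_out, seed=None):
    """Draw a fan_in x fan_out weight matrix from N(0, he_std(fan_in)**2)."""
    rng = random.Random(seed)
    std = he_std(fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_out)] for _ in range(fan_in)]


# Hypothetical layer sizes. For a 1-D convolutional layer, fan_in would be
# kernel_width * input_channels rather than the number of input units.
weights = he_normal(fan_in=64, fan_out=32, seed=0)
```

Comparing the sample standard deviation of a layer's initial weights against `he_std(fan_in)` is a quick way to tell whether the instability comes from a bug or from the scheme itself.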

Arbitrary number of stages during training

Should be an easy feature to implement - just put the second through n stages inside a loop, where the program first checks to see if "DO_TRAIN(n)" exists in the parameter file and if so fetches LEARN_RATE(n), etc, and runs training, otherwise returns.
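
The loop described above could look roughly like the sketch below. It assumes the parameter file has already been parsed into a dictionary and that stage keys are written as `DO_TRAIN2`, `LEARN_RATE2`, and so on. That key spelling, the `run_extra_stages` name, and the `train_stage` callback are all hypothetical.

```python
def run_extra_stages(params, train_stage, first_extra=2):
    """Run stages first_extra..n for as long as a DO_TRAIN<n> entry exists.

    params      -- dict of parameter-file entries (assumed already parsed)
    train_stage -- callable taking (stage_number, learn_rate)
    Returns the list of stage numbers that were run.
    """
    completed = []
    n = first_extra
    while f"DO_TRAIN{n}" in params:
        # Fetch the per-stage settings; only the learning rate is shown here.
        learn_rate = float(params[f"LEARN_RATE{n}"])
        train_stage(n, learn_rate)
        completed.append(n)
        n += 1
    return completed
```

The loop stops at the first missing `DO_TRAIN<n>` entry, so stages must be numbered without gaps.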

coordinator stopped while threads still running

Never seen this before. It happens in the xval subloop.

cross validation broken when tboard output off

Need to add a conditional so that the xval loop only writes tboard summaries when the option is turned on.

Odd issue with dimensions of fully connected layer

For some reason they have to be identical. I might have a variable misnamed somewhere.

Expand fits IO capability

Probably want to make this as flexible as possible - think about adding in fields in the parameter file to:

specify which header keywords correspond to what
only select certain rows in a multispec file
specify keyword for actual IDs, rather than fits row numbers

update manual language to reflect the inference output naming conventions

Inference output is now written to a file named after the input file that was inferred, with '_infer.out' appended.

Basic testing capability

Need to write a function to test ANNA - similar to the older version of the code, it would be good to read and parse a binary file, run it through ANNA, and then print some sort of summary statistics.

Remove timeline capability

It doesn't seem useful to an end-user and I don't need it anymore.

Better error catching

Add some exceptions to catch missing parameter errors, etc, and kill the python process and all threads. Having an issue with zombie threads if the program crashes during run.
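
A minimal sketch of both halves of this, under stated assumptions: a dedicated exception for missing parameters, and worker threads marked as daemon threads so that they are stopped when the main program exits instead of lingering. The class and function names are hypothetical, and ANNA's TensorFlow queue threads may need the coordinator to be stopped explicitly as well.

```python
import threading


class MissingParameterError(Exception):
    """Raised when the parameter file lacks one or more required entries."""


def require_parameters(params, required):
    """Fail early, before any threads start, if a required entry is absent."""
    missing = [name for name in required if name not in params]
    if missing:
        raise MissingParameterError("missing parameter(s): " + ", ".join(missing))


def start_daemon_worker(target):
    """Start a preprocessing worker that will not outlive the main thread."""
    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    return worker
```

Validating the parameters before spawning any threads means the most common failure never leaves threads behind in the first place.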

readmultispec not python3 friendly

The print commands use Python 2 syntax and are broken under Python 3.

Make test function output inferred results

It's a trivial calculation to do from the current test results, but there's no reason not to just include it in the output in the first place.

need single fits image reader

Readmultispec doesn't work on single fits images. This may not be worth working on since the format of the fits images isn't going to be standardized.

wavelength-dependent continuum error

Might be nice to implement some sort of wavelength-dependent continuum error scheme. Could provide a template of similar form to the SN relative error template: the user provides a 2-column text file (wavelength, relative continuum error), then uses the existing parameter to set the continuum error that corresponds to relative error = 1.0.

As with SN, the code can read in the template and interpolate the template linearly onto the training wavelength grid.
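
A rough sketch of that interpolation step, assuming the template has already been read into a list of (wavelength, relative error) pairs sorted by wavelength. The function names are hypothetical, and holding the end values constant outside the template range is a choice made here, not something the issue specifies.

```python
from bisect import bisect_right


def interp_linear(x, xp, fp):
    """Linearly interpolate the points (xp, fp) at x, clamping at both ends."""
    if x <= xp[0]:
        return fp[0]
    if x >= xp[-1]:
        return fp[-1]
    i = bisect_right(xp, x)  # xp[i - 1] <= x < xp[i]
    frac = (x - xp[i - 1]) / (xp[i] - xp[i - 1])
    return fp[i - 1] + frac * (fp[i] - fp[i - 1])


def continuum_error_on_grid(template, grid, base_error):
    """Scale base_error (the error at relative error = 1.0) onto the grid.

    template   -- list of (wavelength, relative_error), sorted by wavelength
    grid       -- training wavelength grid
    base_error -- value of the existing continuum error parameter
    """
    wavelengths = [w for w, _ in template]
    relative = [r for _, r in template]
    return [base_error * interp_linear(w, wavelengths, relative) for w in grid]
```

For example, a template running from 1.0 at 5000 to 2.0 at 6000 with a base error of 0.02 gives an error of about 0.03 at 5500.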

First release, future plans

After testing, I think ANNA is ready for an initial v0.1 release.

In the pipeline, I think it's worth exploring switching to an FCN architecture, as the accuracy should be about the same but the training/inference speed should be much faster due to many fewer free parameters.

Weight initialization bug

The weights are initialized incorrectly compared with the scheme in He+ (2015). This doesn't impact the network too much since it's pretty shallow, but it needs to be fixed.

Make number of trained parameters more flexible

Instead of getting rid of the option to select number of output parameters, actually make it meaningful (instead of virtually locked to be 5).

Make it changeable, and add a field in the parameter file that contains the names of each of these parameters in order.

Edit the manual to reflect that the normalization numbers must be in the order specified by the text field added and must correspond to the number of output parameters selected when building the architecture.
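
The consistency requirement above could be enforced with a check along these lines. This is a sketch only: the comma-separated format for the names field, the example parameter names, and the function names are assumptions, not taken from ANNA.

```python
def parse_output_names(field):
    """Split a comma-separated names field into an ordered list of names."""
    return [name.strip() for name in field.split(",") if name.strip()]


def check_output_parameters(names, normalization, num_outputs):
    """Verify that names and normalization numbers match the output count."""
    if len(names) != num_outputs:
        raise ValueError(
            f"{len(names)} parameter names given, but the architecture "
            f"has {num_outputs} outputs"
        )
    if len(normalization) != num_outputs:
        raise ValueError(
            f"{len(normalization)} normalization numbers given, but the "
            f"architecture has {num_outputs} outputs"
        )
    # Pair each name with its normalization number, preserving the order.
    return list(zip(names, normalization))
```

Returning the paired list keeps the order explicit, which is what the output files would then be written in.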

Modify how the output files are written.

Change the radv field to be a dummy parameter in general (it already is) and modify the manual to note this and instruct users to set it to anything they want if they aren't going to use it.

Timeline output broken

Not sure why. It just outputs an empty json file now. Probably first thing to do is test it on Sputnik with the code as-is and see if it's different there. Could be a no-chrome thing.

Trim convnet function

Don't need all the wrappers for things ANNA doesn't use.

Additional parameters to link to parameter file

Preprocessed example queue depth
Learning rate
Binary training/test datasets
Option to output a timeline for debugging
Saved state location
Number of preprocessing threads

Tensorboard during training

Add ability to automatically open a tensorboard session and dynamically print training progress - error, etc.

Add ability to turn off cross-validation preprocessing

This would be useful if, for example, one wanted to cross-validate on real data.

Streamline train_neural_network

Functional, but getting a bit verbose. Need to clean up and refactor some, and add some helpful comments as well.

multiple text file readin

Maybe just add capability to read in multiple text files for inference at once - separate the names with commas?
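
A minimal sketch of the comma-separated approach, paired with the '_infer.out' naming convention used for inference output. The function names and the example file names are hypothetical.

```python
def split_infer_files(field):
    """Split a comma-separated parameter field into individual file names."""
    return [name.strip() for name in field.split(",") if name.strip()]


def infer_output_name(filename):
    """Name of the output file for one inferred input file."""
    return filename + "_infer.out"


# Hypothetical parameter-file value listing two spectra.
outputs = [infer_output_name(f) for f in split_infer_files("star1.txt, star2.txt")]
```

One limitation of this format is that file names containing commas cannot be expressed.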

Multi stage training

Finish implementing the ability to do training in several stages with different learning parameters. Possibly integrate with the early stopping feature to automatically start second stage of training once early stopping kicks in for the first stage.

Add saving and restoring functionality

Need to add in the ability to save/load models using specified locations in the parameter file

Early stopping and cross-validation

Modify the training function to do more testing diagnostics during the training process. Specify a separate binary file to use as watchlist and define some sort of early stopping parameter that will stop training once error stops improving for some number of iterations.
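
One common way to express such a stopping rule is a patience counter, sketched below. The class name and the strict "must improve on the best error" criterion are assumptions. The watchlist error would come from evaluating the separate binary file during training.

```python
class EarlyStopper:
    """Signal a stop once the error has not improved for `patience` checks."""

    def __init__(self, patience):
        self.patience = patience
        self.best = float("inf")
        self.stale = 0  # consecutive checks without improvement

    def update(self, error):
        """Record a new watchlist error; return True if training should stop."""
        if error < self.best:
            self.best = error
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience
```

Inside the training loop, `update` would be called after each cross-validation pass and the loop exited when it returns True.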

Add ability to graph outputs of test

Generate plots of dtemp vs true temp, etc. and add a flag to the parameter file turning this on or off.

Fits image IO

Need to add the ability to read in fits images and pass them in the form needed by the feedforward function. Probably safe to assume that for now, only multispec fits images will be needed. For now just put in the basic functionality: read in an image, use all lines in the image as separate inputs, and return the line of each spectrum along with its inferred parameters.

Feedforward capability

Need to build a function to load a previously trained model and run data through it, writing the results to a file. It would be best to write this to take input of a standard form, so that I can test it on binary inputs using existing functions, and later on write some sort of function that will do fits IO.

Preprocessing stalling GPU

Right now the code is set to do preprocessing on a single thread and then pass batches to the GPU. It looks like either the GIL is not being released quickly enough or the preprocessing is taking too long relative to the GPU compute time. Spawning more preprocessing threads doesn't solve the issue, so it's not an IO bottleneck. This is a nontrivial issue that might require looking into multiprocess preprocessing, rather than threading.