bayesian-pipelines-pixels's People

Contributors

ismael-mendoza, mdschneider

Forkers

aboucaud, b-biswas

bayesian-pipelines-pixels's Issues

Project Roadmap

Useful links:

Generative Model

Stage 0: Setting things up

Stage 1: Multiband / redshifts / variable shear

  • single image per band
  • multiband
  • redshift + SEDs
  • variable shear
  • galaxy locations drawn from a Poisson process over a 2D density field
  • feed an angular power spectrum to GalSim/CoLoRe -> 2D Gaussian density map
  • bin galaxies by true redshift
  • lensing magnifications

Stage 2: Blending / Impact of PSF

  • blending effects
  • noisy PSF estimate
  • use correlation functions in PSF

Inference

Stage 0

Goals:

  • connect BLISS with JIF/BFD and recover constant shear
  • start defining what outputs look like: probabilistic magnitudes + shear

Tasks:

Stage 1

  • BFD is already multi-band (can potentially do photo-z)
  • check that we recover power spectrum (?)
  • assume we know true redshift
  • connect with RAIL (?)
    • catalogs: some aspect of photometry + position
    • how to connect posterior samples of photometry
    • tables_io

Stage 2

  • recovering binning (tomo challenge)
  • run on DC2

Implement hierarchical importance sampling for constant shear in Jax

Our Bayesian inference codes for analyzing galaxy images, such as BLISS and JIF, produce Monte Carlo samples from the approximate posterior distribution of galaxy image model parameters given the pixel data. To infer shear, we need to marginalize over the galaxy model parameters while inferring a shear model common to all the galaxies. This script implements importance sampling to perform this marginalization and shear inference. See the papers here and here.
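A minimal, hypothetical sketch of this importance-sampling step (a one-component toy model with a flat interim prior per galaxy; the data model, names, and parameter values are assumptions, not the BLISS/JIF interface):

```python
import jax
import jax.numpy as jnp
from jax.scipy.special import logsumexp

# --- toy data (assumed model): constant shear, Gaussian intrinsic
# ellipticity, Gaussian measurement noise ---
key = jax.random.PRNGKey(0)
g_true, s_int, s_n, n_gal, n_samp = 0.05, 0.3, 0.1, 2000, 32
k1, k2, k3 = jax.random.split(key, 3)
e_obs = (g_true + s_int * jax.random.normal(k1, (n_gal,))
         + s_n * jax.random.normal(k2, (n_gal,)))
# per-galaxy "posterior" samples drawn under a flat interim prior
samples = e_obs[:, None] + s_n * jax.random.normal(k3, (n_gal, n_samp))

def shear_log_posterior(g):
    """Importance-sampling estimate of log p(data | g): reweight each
    galaxy's interim samples by the population prior N(g, s_int^2)."""
    log_w = -0.5 * (samples - g) ** 2 / s_int**2
    return jnp.sum(logsumexp(log_w, axis=1) - jnp.log(n_samp))

grid = jnp.linspace(-0.1, 0.1, 201)
log_post = jax.vmap(shear_log_posterior)(grid)
g_hat = grid[jnp.argmax(log_post)]
```

Each galaxy's samples are reweighted by the population prior N(g, s_int^2) and averaged, which marginalizes over the per-galaxy parameters; the grid maximum `g_hat` should land near `g_true`. The real pipeline would substitute actual BLISS/JIF posterior samples for the toy data.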

Stage 0: Setting up shear inference pipeline

Following our discussion last week and some reflection, I have attempted to identify the minimum set of tasks to get our stage 0 shear inference pipeline up and running:

Data Format:

  • Decide on an input/output format for parameters and posterior samples

Train / Inference:

  • #37
  • allow for pip installation of latest version of BLISS
  • #43
  • #46
  • add notebook showing how to run JIF on forward model saved images
  • add notebook showing how to use importance sampling to recover shear from shape samples
  • add notebook putting together BLISS/JIF/MBI-Shear notebooks to produce shear posterior estimates
  • add script/functions that run the shear inference pipeline from beginning to end
  • Discuss what training sample to use (more complex than Gaussians)

Validation:

  • Check that the flux scale/background in descwl-shear-sims images is consistent with the forward model
  • #47
  • merge #33
  • #44
  • #45
  • add script to run pipeline in a parallelizable way on descwl-shear-sims images.
  • run pipeline on descwl-shear-sims images to extract shear posterior, multiplicative bias

create simple unit tests that exercise code

No need for this to be complicated; just check that the code runs. This has the added benefit of showing how to run the code.

  • include a dummy catalog with three rows or something like that
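A smoke test in this spirit might look like the following (pytest-style; the catalog helper and its fields are hypothetical):

```python
import numpy as np

def make_dummy_catalog(n=3):
    """Hypothetical helper: tiny catalog with three rows for smoke tests."""
    return {
        "ra": np.linspace(0.0, 1.0, n),
        "dec": np.linspace(0.0, 1.0, n),
        "flux": np.full(n, 1000.0),
    }

def test_dummy_catalog_roundtrip():
    # just exercise the code path and check basic invariants
    cat = make_dummy_catalog()
    assert len(cat["ra"]) == 3
    assert np.all(cat["flux"] > 0)
```

Dropping such tests next to the pipeline code doubles as executable documentation of how to call it.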

First version of pipeline

Outline

With @aguinot we have converged on what a first version of the pipeline will look like. We have decided to split the effort into two stages. In the first stage we will use simulations from descwl-shear-sims with constant shear per coadd and shape-noise cancellation to evaluate the extent to which we incur biases. For the second stage, the pipeline will be designed to take in simulations with a GLASS prior on clustering + shear + intrinsic ellipticities and to output shear maps.

First Stage

We will start by targeting the simplest possible simulations where we can measure biases. The point is to set up the pipeline (detection, splitting into groups, joint measurement with HMC, shear posterior with MagicBeans) and have sufficient statistics to distinguish a shear bias (with the minimum number of coadds possible).

  • Simulations: Gaussian galaxies, known constant Moffat PSF, no image artifacts, one band, one redshift bin, LSST-like, simple representative random distributions for size, flux, and ellipticity.
  • Procedure: SEP for detection and deblending; for each detection, look at detections within 2'' and form groups to jointly fit (in combination with the SEP flag); 50x50 cutouts; when computing the likelihood in HMC, mask objects that are not part of the target galaxies; use MagicBeans to combine ellipticity samples.
  • Output: Shear posterior samples (for each applied shear bin?)
  • Dataset: TBD (how large?)
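The grouping step in the procedure above could be sketched in pure NumPy (single-linkage grouping within a fixed radius; the function name and interface are assumptions, with SEP detections supplying the x/y arrays):

```python
import numpy as np

def group_detections(x, y, radius_pix):
    """Group detections whose centers lie within radius_pix of each other
    (single-linkage via union-find), for joint fitting of close pairs."""
    n = len(x)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # union any pair closer than radius_pix
    for i in range(n):
        for j in range(i + 1, n):
            if np.hypot(x[i] - x[j], y[i] - y[j]) < radius_pix:
                parent[find(i)] = find(j)

    # collect members by root
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```

With an LSST-like 0.2''/pixel scale, the 2'' criterion corresponds to `radius_pix = 10`.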

The LensMC Euclid paper contains useful guidance.

Steps

  • #55
  • #57
  • Set up functions to run chains with a convergence criterion
  • Set up code following the MagicBeans procedure to construct the shear posterior from ellipticity samples
  • Implement differentiable metacal to calibrate ellipticity samples
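For the convergence criterion, one common choice is the Gelman-Rubin R-hat statistic; a minimal sketch (assuming chains are stored as a `(n_chains, n_steps)` array for one scalar parameter):

```python
import numpy as np

def gelman_rubin(chains):
    """Gelman-Rubin R-hat for chains of shape (n_chains, n_steps).
    Values near 1 indicate the chains have mixed."""
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()   # within-chain variance
    B = n * chain_means.var(ddof=1)         # between-chain variance
    var_hat = (n - 1) / n * W + B / n       # pooled variance estimate
    return np.sqrt(var_hat / W)
```

A typical stopping rule would be R-hat < 1.01 for every parameter, applied per-galaxy-group to the HMC chains.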

create footprints with single uncentered galaxies

Note: Recall there is no blending in stage 0. DM cutouts will always have the brightest object in the center.

  • large enough extent (fixed) to avoid shape bias (diagonal > sides in a square)
  • fixed size of cutouts
  • random shift from center

validation with descwl-shear-sims

  • just a basic check that fluxes are correct
  • true positions, true fluxes -> make sure we can get them from descwl-shear-sims
  • get a wrapper around the code and make sure that we can get these things easily
  • decide what simulation we want to make and what modifications we need

Notes:

  • BasicSurvey does not use WLD
  • for training, it can be BLISS adapted -> hdf5
  • for simulation, it has to be adapted to what real data will look like to make sure we can ingest it.
    • do we really need to store saved images for validation?

brainstorm how to make code easily usable by other frameworks

For example, passing around images rather than galsim objects. Create a container for image information like pixel_scale that can be passed around.

@EiffL brought this up in today's meeting. It might be good to think about this earlier rather than later so that we don't have to do a lot of work later in the inference stage to make our code compatible with tensorflow, pytorch, jax, etc.
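One possible shape for such a container, sketched as a plain dataclass holding NumPy arrays (the field names are assumptions):

```python
from dataclasses import dataclass
from typing import Optional
import numpy as np

@dataclass
class ImageData:
    """Framework-agnostic container sketch: plain arrays plus metadata,
    instead of passing galsim objects between pipeline stages."""
    pixels: np.ndarray                    # (ny, nx) image array
    pixel_scale: float                    # arcsec / pixel
    psf: Optional[np.ndarray] = None      # PSF image at the same scale
    noise_var: Optional[float] = None     # per-pixel noise variance
```

Because everything is a NumPy array or scalar, any downstream framework (TensorFlow, PyTorch, JAX) can convert the fields with zero-copy or cheap casts.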

add Spergel galaxy profile

Comparable to the Sersic profile, and it doesn't have the same issues with varying the Sersic index continuously in GalSim.

implement simple Sersic / Gaussian galaxy model

As @aguinot mentioned, starting with a Bulge+Disk model might be overkill. We can already test shear recovery in the simple Gaussian setting, and this model has a well-defined, simple intrinsic ellipticity (no need for second-moment calculations, etc.).

One note is that Bulge galaxies do not have a well-defined size either (why?)

Later we can revisit the Bulge+Disk model (we need it for DC2)

can sample the Sersic index n randomly instead of using Bulge+Disk
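The "well-defined intrinsic ellipticity" above means shear acts on the complex ellipticity by the standard Moebius transform (epsilon convention), with no moment measurement needed; a small sketch:

```python
import numpy as np

def apply_shear(e_int, g):
    """Observed complex ellipticity of a galaxy with intrinsic ellipticity
    e_int under reduced shear g (epsilon convention, |e| = (1-q)/(1+q))."""
    return (e_int + g) / (1.0 + np.conj(g) * e_int)
```

Applying the transform with `-g` inverts it exactly, which is convenient for testing shear-recovery code on Gaussian profiles.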

turn `descwl-shear-sims` notebook into script(s)

Add high-level functions that enable us to use descwl-shear-sims images for evaluation.

  • return truth catalog of positions, shapes for each object
  • make into cutouts for ingestion to JIF?
  • add decentering ? (maybe unnecessary if we use BLISS for location and centering of cutouts)

add contribution guidelines

We should decide on a set of guidelines on how to contribute and put them in a CONTRIBUTING.md

I should also mention that I made some choices on formatters and linting and added CI tests in #30. These are all up for discussion, of course.

rewrite forward model in JAX

Some thoughts:

  • The difficult part is writing the shear function in JAX, which requires higher-order interpolation to get enough precision.
  • To get started (for a toy model) we could use linear interpolation to implement the shear function in JAX.
  • Francois and Benjamin R. have been working on writing this in the TensorFlow CUDA backend
  • If we wrote it in XLA, anyone (JAX, PyTorch, TensorFlow) could use it!
  • This might also help speed up JIF, which uses MCMC for shear inference
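A toy version of the linear-interpolation idea, using JAX's built-in `jax.scipy.ndimage.map_coordinates` with `order=1`; the first-order inverse shear matrix is an assumption valid only in the weak-shear regime:

```python
import jax.numpy as jnp
from jax.scipy.ndimage import map_coordinates

def shear_image(image, g1, g2):
    """Toy differentiable shear: resample the image with linear
    interpolation using the first-order inverse coordinate transform."""
    ny, nx = image.shape
    y, x = jnp.mgrid[:ny, :nx].astype(jnp.float32)
    cy, cx = (ny - 1) / 2.0, (nx - 1) / 2.0
    u, v = x - cx, y - cy
    # inverse of the shear matrix [[1+g1, g2], [g2, 1-g1]] to first order
    x_src = cx + (1 - g1) * u - g2 * v
    y_src = cy - g2 * u + (1 + g1) * v
    return map_coordinates(image, [y_src, x_src], order=1, mode='constant')
```

Because everything is jnp, gradients with respect to (g1, g2) come for free via `jax.grad`; a production version would need higher-order interpolation for the precision mentioned above.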
