drphilmarshall / pangloss Goto Github PK

Line of sight mass reconstruction in the Universe

License: GNU General Public License v2.0

Python 97.77% Shell 2.23%

pangloss's Introduction

Pangloss

Pangloss is software module for reconstructing all the mass within a light cone through the Universe. Understanding complex mass distributions like this is important for accurate time delay lens cosmography, and also for accurate lens magnification estimation. It aspires to use all available data in an attempt to make the best of all mass maps. However, this problem is sufficiently difficult that we are mostly spending our time cultivating a garden of toy models and baby steps.

If you are interested in the ongoing investigation of this interesting problem, you should:

Read our first paper, "Reconstructing the lensing mass in the universe from photometric catalogue data;"
Browse the wiki to see what our current plans are;
Fork the code and play around with it;
Contact us! Phil and Tom are keen to hear from other researchers in the field who would like to coordinate and/or collaborate.

License

Pangloss is distributed under the Gnu Public License (GPL) v2, a copy of which is delivered with this code. This basically means you can do anything you like with it, as long as you keep propagating the license information.

However, when reporting on your use of Pangloss, we do ask that you include:

A citation to: Collett et al. (2013), MNRAS, 432, 679, arxiv/1303.6564
An acknowledgment including a link to this repo, such as: "This work made use of the Pangloss code, written by Tom Collett and Phil Marshall, which is freely available at https://github.com/drphilmarshall/Pangloss."

Installation

After cloning or forking the repository, add the following lines (or similar) to your .login file:

setenv PANGLOSS_DIR ${WORK_DIR}/Pangloss
setenv PYTHONPATH ${PANGLOSS_DIR}/Pangloss:${PYTHONPATH}
setenv PATH ${PATH}:${PANGLOSS_DIR}

Then "import pangloss" from python should just work, and the command line scripts should too. However, you will need to have the following packages installed as well:

import atpy,asciitable,pyfits
import numpy,scipy,getopt,matplotlib

Most (if not all) are available via pip and/or homebrew.

Example use

Right now, there are three main scripts that carry out Pangloss's functions:

Drill.py: from an input catalog of galaxies (either real or simulated), drill out a narrow lightcone (or set of lightcones) centred on some sky position of interest, and save the resulting smaller catalog (or catalogs).
Reconstruct.py: read in a lightcone catalog (or set of catalogs), and assign dark matter mass to the galaxies within, probabilistically. Each lightcone yields its own Pr(kappah|D), where kappah is the total gravitational lensing convergence at the centre of the lightcone - this quantity is of great relevance to the strong lens time delay distance in particular.
Calibrate.py: kappah is (at the moment) a fairly crude approximation to the "true" convergence kappa - but we have shown that one can recover a better approximation by considering Pr(kappah,kappa|C), where C is large set of lightcone catalogs drawn from a cosmological simulation. The resulting PDF Pr(kappah|D,C) contains the assumption that this simulation is an accurate representation of our real Universe - but we're working towards relaxing this.

They all take, as their sole input, the same configuration file, an example of which is given here.

In the calib directory we include the means to obtain the halo catalog for a 1x1 square degree patch of Millennium Simulation sky, and its associated ray traced convergence map, for making the calibration lightcones. A small mock observed galaxy catalog is included in the example directory, for testing.

To get the test data, please do the following:

cd calib
Fetch.csh

The Fetch script uses wget. Additional galaxy catalogs and ray-traced convergence maps from the Millenium Simulation are available from Stefan Hilbert on request.

You should then be able to execute the following example analysis:

cd example
Drill.py example.config
Reconstruct.py example.config
Calibrate.py example.config

Analysing from start to finish will take some time. Be patient! (Drill ~2 mins, Reconstruct ~10 mins)

For more details of what the Pangloss scripts are doing, start reading the code here.

Also, check out Tom's flowchart that describes the process of data simulation and testing carried out in Collett et al (2013):

PDFs based on overdensities

The original Pangloss has been modified so that it can be used to find convergence and magnification pdfs for any given group of fields based on comparing overdensities of the observed fields with simulation lightcones.

pangloss's People

Contributors

Stargazers

Watchers

Forkers

kcwong tcollett albertfxwang charlottenosam cdfassnacht frelikh eduardrusu enoordeh mclaughlin6464 davidthomas5412

pangloss's Issues

Create ForegroundCatalog class

We need to create a subclass of Catalog that will contain the foreground galaxies contained in each map and their respective properties such as their magnitude, redshift, and mass. ForegroundCatalog's should be able to plot themselves as scatter plots in world coordinates with the size of the point optionally dependent on either magnitude or mass. The class should also be able to count the number of galaxies in the entire catalog, as well as the number in different redshift, mass, and magnitude ranges. Much of this code is likely already in Lightcone, so we might want to move those defs to a separate file and import into each instead.

Background catalogs should be able to plot themselves as sticks

BackgroundCatalogs can currently plot themselves either as scatter plots or as ellipses. It may be easier to see the weak lensing if we plot just the background galaxies' intrinsic and lensed ellipticities as sticks, similar to shear maps. However, quiver() likely won't work anymore as the galaxies aren't in a uniform grid. We'll look into drawing the sticks manually, and converting all of the different plotting techniques into a def.

PDFs need to be able to plot themselves

How's this going, Tom?

Literature review

There's a lot of relevant literature out there! We could try and write an introduction as we go, reviewing the literature around large scale density field modeling.

Single demo with Kappamap and Shearmap overlay

We should combine all map demos into a single demo that explains how to use and plot Kappamaps and Shearmaps, and then overlay one with the other to visually check if the shear and convergence maps align correctly.

kappah values too low in calibration lines of sight

Possible causes:

mis-reading of catalog file and/or mis-assignment of variables (config.py, io.py, lightcone.py, example.config) -> check Mhalo, Mstar values
mis-use of variables in kappah calculation (lightcone.py) -> check propagation of variables

Are all halos' kappah contributions too low, or just some? Smae question for Mstar, Mhalo etc.

Phil needs to give Spencer the full data set

I only have the (0,0) maps and the (0,0) foreground galaxy catalog data. Moving forward, I'll need @drphilmarshall to send me the rest of the data so I can make sure the physical to wcs transformations are working and for calculating the summary statistics over larger data sets.

Brief BORG wiki documentation

Hi Charlotte! Let's put up some minimal wiki documentation about the BORG project, so that people have another paper to read and cite. At the very least the home page needs updating with your project, and you might feel moved to make an additional wiki page besides.

(Linked from the #18 super-issue)

README should contain a link to Stefan's website

If only I could assign this to Stefan himself!

wiki/organisation updates: Stanford Data Science Initiative project, BORG project

Hi all,

I'm working on getting the Pangloss project documentation in order, in preparation for some new work on fitting the mass model to weak lensing data. This project has some funding from Stanford, but it would be great to continue collaborating with you all.

There's a few things that need doing!

@rhw could you check over the new wiki pages I made, please? The opening paragraph summary is something we could send to the SDSI people as they keep track of the projects they are supporting. #22
@tcollett would you mind also reading through the new pages, and add comments and questions? This is probably best done on a separate issue thread, here: #19.
@charlottenosam It would be great to provide some simple notes on your BORG work on the Pangloss wiki, so that people at least know what to read (and then cite). I've given you push access to this base repo so you can add to the wiki. Here's the issue thread for discussion: #20
Another issue thread: merging back in the code used in the BORG project? Discussion at #21
If anyone has future plans for the Pangloss code, please do add them to the wiki home page, so we can try and keep track of things! We'll use the wiki to log progress on the weak lensing stuff.

Thanks!

Phil

WLMaps need to be able to plot themselves in world coordinates

This means we need a method to parse Stefan's filenames to extract the world coordinates (Ra, Dec) of the map field center. Since all his maps have the same size and pixel scale, we can carry on hard coding these in the read_binary_data() method. When we start working with other sims we'll add more io methods anyway.

Alignment of foreground galaxies and maps

There appears to be a small alignment issue with ForegroundCatalogs and the WLMaps. This could be because the foreground galaxy positions are actually weakly lensed themselves - we (i.e. @drphilmarshall ) need to ask Stefan Hilbert whether this is the case or not.

Example analysis needs designing and implementing

We need a sensible, plausible mock catalog, containig only observable quantities. Then, we need the example.config to represent a meaningful test analysis, with suitable numbers of calibration lightcones, PDF samples etc.

example/example_catalog.txt is missing?

@tcollett, is the repo missing a file? I am running the example, which runs fine (in the wl branch, using astropy tables!) until line 239 of Drill.py

Drill: All 1000 calibration lightcones made.
Drill: Reading in observed catalog: /Users/pjm/work/stronglensing/H0/lightcones/Pangloss/example/example_catalog.txt
Traceback (most recent call last):
File "/Users/pjm/work/stronglensing/H0/lightcones/Pangloss/Drill.py", line 239, in
Drill(sys.argv[1:])
File "/Users/pjm/work/stronglensing/H0/lightcones/Pangloss/Drill.py", line 201, in Drill
table = pangloss.readCatalog(obscat,experiment)
File "/Users/pjm/work/stronglensing/H0/lightcones/Pangloss/pangloss/io.py", line 59, in readCatalog
raise IOError("Cannot open %s\n" % filename)
IOError: Cannot open /Users/pjm/work/stronglensing/H0/lightcones/Pangloss/example/example_catalog.txt

Joint PDF for <kappah> and kappa should be a PDF object

This will make plotting it easier: all the PDF plotting code should live in the PDF class!

Observed catalogs will have positions in J2000 Ra, Dec

We need to be able to cope with normal astronomy sky positions, and convert them to lightcone x,y coordinates in radians. I guess future simulated catalogs may also have object positions in mock RA and Dec too.

Switch from atpy, pyfits to astropy

Decide on a summary statistic for lensed galaxies and calculate it

We need to decide whether we will first summarize the lensing of a BackgroundCatalog using an ellipticity correlation function or a galaxy-mass correlation function. We'll then need to decide whether we write our own method to calculate it, or use someone else's method. We also need to decide how to handle galaxies on the edge of the map.

Write up HB theory in a latex document

Because latex math is easier to read than ipad notes.

Alignment of foreground galaxy/halo catalogs and convergence/shear maps

Let's plot a piece of the foreground catalog on top of the convergence map. The clusters of galaxies had better lie on top of the kappa map density peaks! Take a look at how the Lightcone objects plot themselves - some of those functions might be able to be extracted into separate defs and re-used from the ForegroundCatalog class (as well as from the Lightcone class).

Weak lensing hierarchical inference

Hey @tcollett, do you have comments/queries on the SDSI-sponsored weak lensing / hierarchical inference project described on the wiki here? If you have time to work on this with us in the coming year it would be great to carry on collaborating :-) At the very least, we have the H0LiCOW application to consider.

(Linked from the super-issue at #18)

Catalogs need to be able to convert physical positions to world coordinates

This is to enable meaningful, general overlays of catalog objects on WLMaps.
The positions in Stefan's catalogs are in physical coordinates relative to the corresponding weak lensing maps - so we'll need to use the right map to do the catalog coordinate transformation.

For reference: each field (map) is 4 by 4 degrees, and each catalog is a 1x1 deg patch in that field. In the catalog filename GGL_los_8_7_7_3_3_N_4096_ang_4_*.txt, the field is (7,7) and the patch is (3,3). Let's choose the WCS center to be the center of field (0,0). In total we have 8x8 = 64 fields (1024 square degrees!), each containing 16 patches/catalogs.

Are Shearmaps being plotted correctly?

The overlay plots of the convergence and shear maps look to be aligned, but the shear sticks don't appear to be plotting in the correct orientation. Here is an example image:

One possible fix could be that the .gamma_1 and .gamma_2 files are in the opposite order of how I am using them (i.e. I'm using $\gamma=\gamma_1+i\gamma2$ when instead it is $\gamma=\gamma_2+i\gamma_1$). When I made this change I got the following image:

Neither images are convincing. We should develop a test that doesn't rely on eyeballing the images to make sure we are plotting the shear sticks correctly.

Merge back in BORG code?

The base repo and @charlottenosam 's fork are about to diverge, as we get started on the weak lensing project. Ideally we would merge the code used in the BORG project back into Pangloss, so that any improvements you made relating to magnification etc will remain useable by others forking from the base. @tcollett, @charlottenosam can you comment on this, please? The minimal solution to this issue would to merge into a BORG branch, which could then accept changes from the WL project while retaining its own structure - but much better would be to submit a pull request to the base repo's master branch, so that we are all on the same page.

(Linked from the super-issue at #18)

No Pangloss4!

Pangloss4 has reappeared, containing Pangloss/distances.py and nothing else. The Pangloss4 directory needs removing. If its distances.py is the correct one to use, it needs to be moved to pangloss. Below is a suggested seequence of commands!

mv Pangloss4/Pangloss/distances.py pangloss/
\rm -rf Pangloss4
git status
git commit -am "Moved to distances.py correct place, removed unwanted alternative Pangloss directory"
git push

Summer 2015 project planning

We need an outline of a summer project for @sweverett . Let's make a wiki page, and some related issues.

Fix BackgroundCatalog subplots

There appears to be a minor bug with reading in the current plot information when trying to overlay BackgroundCatalogs over maps and foreground objects when the subplot isn't square. This should be fixed to work in the same way as overlaying ForegroundCatalogs.

We don't have an arxiv number to refer people to!

Let's fix this ASAP :-)

Kappamaps should be able to plot themselves

Add a method to the class, then demo it

Lightcone needs a plot method

Let's give lightcone objects the option to plot themselves - preferably as the 4-panel composite seen in Fig 1, athough theyll need to have finished furnishing themselves with Mhalos and kappah contributions first. Any given realisation should be plottable. I guess we might also consider having them optionally make other plots too - like distributions of kappah, cumulative distributions of kappah with radius etc etc. All these can be implemented gradually, after astroph submission.

Shearmap sampling

There is a strange size of subplots where Shearmaps are not plotting correctly. When the number of pixels on an axis are ~40-200, it looks like the bin sampling is calculated incorrectly and wrapping around to the other side of the image. We need to look back in Shearmap to find a better way to sample shear stick locations.

Add shape and measurement noise to background catalogs

Every background galaxy in a BakcgroundCatalog should have its image distorted with random measurement noise and a systematic shape noise. The contribution of each distortion should be an input in the add_noise() method of BackgroundCatalog.

WL/HB project paragraph for SDSI records

Hi @rhw,

I'm thinking we could simply send our SDSI sponsors this link once the opening paragraph looks good enough. Feel free to edit, and then send - once it's gone you can close this issue :-)

Cheers

Phil

PS. This is a spin-off from #18

Simplest possible toy problem?

@tcollett suggested in #23 that we formulate a toy problem that we can use to explore the computational feasibility further. This thread is for discussing that idea.

Code docstrings could be interpreted as wiki pages!

How hard can this be? I like the human-readable docstrings we have (as opposed to the sort of thing sphinx uses) - we just need a tool that strips them out and makes guthub wiki pages... This should probably be a standalone github project, however.

Two slide update for WL group meeting

This is also the Elevator Pitch rehearsal! You'll need to remind the group what our project is, before giving them an update.

ipython notebooks do not open: "Unreadable Notebook global name 'NBFormatError' is not defined"

I think this is my problem: ipython --version returns 2.2, and from some googling it looks like ipython made a breaking change at version 3.0 that causes this error. I am updating my anaconda ipython by doing:

conda update conda
conda update ipython ipython-notebook ipython-qtconsole

Flowchart ought to be published somehow

Can you scan in and email me that flowchart you drew, please Tom? It'd be great if it could be turned into a proper figure. Only so many hours in the day though.

SHMR class is incomplete

Functionality required: makeCDF, loadCDF, drawMstar, drawMhalo, others? Probably a plot function - at lower urgency.

Calculations in databases?

While we discuss the details of the inference in #23 , I think it's clear that to predict weak lensing data and evaluate our log likelihood we will need to be computing the lensing effects (gamma, kappa) of ~100s of halos on each source, for 10^5 sources per sq deg in under a second (to keep our lives sane). I started wondering about fast methods to do this computation in parallel like Map/Reduce - my limited understanding from half an hour's reading is that some databases may enable this automatically/by design.

So, I'm thinking that we are likely to want to do this project with some sort of actual database, preferably one sitting on top of thousands of CPUs!. Since the importance sampling could, I think, be a very general use case of an LSST object catalog that contains interim posterior samples of things like galaxy shape and galaxy photo-z, I'm thinking about boiling this problem down to its simplest possible pseudocode and plain python/everything in memory demo, and then getting in touch with Jacek and KT at SLAC to see if they want to collaborate.

Thoughts?

calib/Fetch.csh cannot find necessary files

Can you please fix your website so that the correct files are available, with names in accordance with calib/Fetch.csh? Thanks!

WL hierarchical inference theory

We have a simple PGM for this problem, but now we need to:

Expand that into maths
Discuss!
Translate that in to a latex document, that describes how each PDF might be handled (samples, weighting, etc)
Agree on this
Make some simple feasibility calculations to see how computationally intensive this inference will be, and how it could be made tractable (eg which parts can be parallelised and how, whether there are any speed-ups that can be made, etc).

@beckermr and @tcollett, let's do this together! Tom suggests we attempt to work this through independently and then compare, what do you think?

README/wiki needs finishing off

In particular we need:

Bib details for the paper for the README and wiki
Library function descriptions on the wiki

Create BackgroundCatalog class

We need to create another subclass of Catalog that contains galaxy catalog information for the background galaxy sources that are weakly lensed by the foreground galaxies. Each BackgroundCatalog object should be able to generate N galaxies that are uniformly distributed in an inputted domain, along with uniformly distributed ellipticities and intrinsic shapes. Each source should be at a redshift of z=1.3857 as this is where Stefan's ray tracing was done from. BackgroundCatalog should be able to plot itself (along with the optional subplots, units, etc from previous plotting methods), both on its own and overlaid over ForegroundCatalog's and WLMap's. We should also determine how to represent the orientation of a source galaxy, perhaps with a colored stick with size proportional to magnitude similar to ShearMap objects. Catalog's generate() method can either be modified or absorbed into BackgroundCatalog to accomplish much of this.

Shearmaps

Let's sub-class the Kappamap and enable commands like:

MSgamma = pangloss.Shearmap('gamma1.fits','gamma2.fits')
gamma_hilbert = MSgamma.at(x,y)

Tricky things will be: having the superclass understand what to do with two files, then two maps.
And I guess Shearmaps should be able to plot themselves, with red sticks plotted at the vertices of some downsampled grid. (Also sub-plotting, we'll need that :-))

README should be short and snappy

Probably we just want to introduce the project and ourselves, mention the licence and then outline the workflow implied by the three main scripts (and their config file).

Catalog class

We need a bit more flexibility with our catalogs. We're going to want to start generating catalogs of weakly lensed background objects from scratch, and then compare them with catalogs of background objects weakly lensed by catalogs of foreground objects - and then at some point we'll need to cope with Matt's Dark Sky data rather than Stefan's Millennium Simulation data. So, let's put all this in a Catalog class, and give it some useful general methods before subclassing it to handle foreground and background objects.

The readCatalog function in io.py can be the initial read() method of a Catalog class, I think - we can add a kwarg for source=Millennium later. I guess each Catalog object will need to contain an astropy table called data so that we can pass just this as input to Lightcone (for example). Then, the Catalog base class will also need to be able to write() itself out (using some standard astropy call), and also generate() itself - just drawing world coordinates on the sky (RA and Dec, in standard units (rad?)).

Then, we can make BackgroundCatalog inherit from Catalog, and extend the generate() method to include adding galaxy shapes as well as sky positions. To check that the catalogs make sense, we'll need to give them plot() methods showing sky positions and then distributions of generated (or read-in) quantities.

Let's not upgrade the Lightcone class for now, so that we don't break the top level scripts that @tcollett and @charlottenosam wrote - but I'm mentioning them here so that they can follow along with this upgrade in the wl branch, and give feedback if they spot any potential pitfalls!