
eazy-py's Introduction

eazy-py: Pythonic photometric redshift tools based on EAZY

Under heavy construction....

Documentation will be here: https://eazy-py.readthedocs.io/, though it's essentially just the module API for now.

Templates and filter files still here: https://github.com/gbrammer/eazy-photoz/.

Note

Please submit any questions/comments/problems you have through the Issues interface.

Installation instructions

pip install eazy

# Forked dependencies that are not yet released on PyPI
pip install git+https://github.com/gbrammer/dust_attenuation.git

Demo

Citation

Please cite both this repository and Brammer et al. (2008). A BibTeX entry for this repository can be generated via the Cite this repository link in the upper left corner of the GitHub page.
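For convenience, a BibTeX entry for the method paper (fields transcribed from memory of the ADS record; please verify against ADS or the generated citation before use):

```bibtex
@ARTICLE{Brammer2008,
   author = {{Brammer}, G.~B. and {van Dokkum}, P.~G. and {Coppi}, P.},
    title = "{EAZY: A Fast, Public Photometric Redshift Code}",
  journal = {The Astrophysical Journal},
     year = 2008,
   volume = 686,
    pages = {1503-1513}
}
```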

eazy-py's People

Contributors

astroweaver, cosmoslukas, gbrammer, hbahk, ivastar, jacobkosowski, larrybradley, samecutler, theskyentist

eazy-py's Issues

Additional information on newer templates.

Would it be possible to get further information on how some of the newer template sets, particularly tweak_fsps, were constructed?

It would be useful to know the assumed metallicity, dust attenuation law, etc, so they can be used consistently.

Additionally, it would be great if the nature of the values in the SFH file could be clarified. When plotting what must be the SFR against time, converting the given units (kg/yr) to M_sun/Gyr, the values are identical to those in the accompanying plot labelled sSFR/Gyr. I'm unable to make sense of this.

[attached plots: EAZY_fsps_SFH, fsps_QSF_12_v3 sfh]

Scipy error on running eazy HDFN-demo notebook

Hi,
I get this error while running the HDFN-demo notebook after installing eazy-py.
ImportError: cannot import name 'splmake' from 'scipy.interpolate'
It seems splmake is no longer included in recent versions of SciPy.
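For reference, `splmake`/`spleval` were removed from SciPy; `scipy.interpolate.make_interp_spline` builds an equivalent B-spline interpolant. A sketch of the substitution (illustrative arrays, not the eazy-py internals):

```python
import numpy as np
from scipy.interpolate import make_interp_spline

x = np.linspace(0.0, 10.0, 50)
y = np.sin(x)

# Old, removed API:  spl = splmake(x, y, order=3); yi = spleval(spl, xi)
spl = make_interp_spline(x, y, k=3)   # cubic B-spline through (x, y)
xi = np.linspace(0.0, 10.0, 200)
yi = spl(xi)                          # evaluate like a plain function
```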

Two requests about eazypy architecture for cloud computing and scalability

I am a principal scientist at a Korean astronomy institute, particularly interested in applying big-data techniques to astronomical problems.

I have found two issues when trying to run eazypy on my Spark cluster.

[1] Local file access for filters and parameters
When running programs on the cloud, we do not have a local file system; instead we have a "bucket", i.e. cloud storage.
Hence, all filters and SED parameters need to be "in-memory" objects or "cloud-storable" objects.

Your approach of using symbolic links is not friendly to running eazypy on a cloud or big-data platform.

[2] The hard-wired single-node, multi-thread optimization
Unfortunately, I have found that many astronomical tools are hard-optimized for "single node" + "multi-thread" execution.

This kind of optimization is not good for writing scalable code.

A simple single-thread, one-object-at-a-time SED-fitting architecture, rather than loading thousands of objects and running them across multiple threads, would be enough to massively parallelize the code over thousands or millions of simultaneous threads on a multi-node cluster of hundreds of machines using a big-data platform.

===
I do not know whether this can be applied or not, but the single-node + multi-thread optimization is suboptimal both for a simple single-thread run and for a massive multi-node run.

Update setup scripts to be more CI and pip friendly

See the update-setup branch.

TBD:

  • how to include the pieces from the eazy-photoz submodule that are currently symlinked in eazy/data
  • how to include the dependencies of my github clones of the dust_extinction and dust_attenuation packages

Pivot wavelengths are incorrectly calculated?

In line 178 of filters.py, the pivot wavelengths are calculated:

    def pivot(self):
        """
        Pivot wavelength
        
        http://pysynphot.readthedocs.io/en/latest/properties.html
        """
        integrator = np.trapz
        
        num = integrator(self.wave, self.wave*self.throughput)
        den = integrator(self.wave, self.throughput/self.wave)
        pivot = np.sqrt(num/den)
        return pivot

The issue is that np.trapz accepts the y value first, and the x value second, such that it should read:

        num = integrator(self.wave*self.throughput, self.wave)
        den = integrator(self.throughput/self.wave, self.wave)

This has the effect of producing incorrect pivot wavelengths in zout.
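A standalone sanity check (not the library code): with the (y, x) argument order, a narrow top-hat filter yields a pivot wavelength essentially at its center, as expected:

```python
import numpy as np

# Illustrative top-hat filter centered at 5000 Angstroms
wave = np.linspace(4000.0, 6000.0, 2001)
throughput = ((wave > 4900.0) & (wave < 5100.0)).astype(float)

# np.trapz(y, x): integrand first, abscissa second
num = np.trapz(wave * throughput, wave)
den = np.trapz(throughput / wave, wave)
pivot = np.sqrt(num / den)   # close to 5000, as expected for a symmetric filter
```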

Inflated error bars in eazy-py output

@gbrammer There seems to be some artificial inflation of the error bars at the IRAC ch1 and ch2 points for a z>3 object in my catalog. I calculated the expected errors using this formula from the manual: 
[screenshot: error formula from the eazy manual]
and the error expected from SYS_ERR alone (e_fnu^2 = e_fnu_cat^2 + (SYS_ERR*fnu_cat)^2), with SYS_ERR = 0.01.
From the table below, it seems that an additional error is being added to the IRAC ch1 and ch2 points somehow.
[screenshot: table comparing catalog and output errors]
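For context, the quoted error budget can be written out directly; note that eazy also adds a template error function (TEF) term in quadrature, scaled by the flux, which is one plausible source of extra error at the IRAC bands. A sketch with illustrative numbers (not values from this catalog; the TEF value is an assumption):

```python
import numpy as np

fnu_cat = 10.0     # catalog flux density (arbitrary units, illustrative)
e_fnu_cat = 0.5    # catalog uncertainty
sys_err = 0.01     # SYS_ERR parameter from the issue
tef = 0.2          # assumed template error function value at this wavelength

# SYS_ERR-only expectation, as in the formula quoted above
e_sys_only = np.sqrt(e_fnu_cat**2 + (sys_err * fnu_cat)**2)

# With an additional template-error term added in quadrature
e_with_tef = np.sqrt(e_fnu_cat**2 + (sys_err * fnu_cat)**2 + (tef * fnu_cat)**2)
```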

Doubts on the use of eazy-py

  1. How does eazy-py account for non-detections (upper limits)?
    If a source is out of the field for a specific observed band, I have to put a value of Fn < -90 (NOT_OBS_THRESHOLD); but what if the source is undetected because the observation is not deep enough to reach the emission flux?
    I have in mind how HYPERZ works. In that case I have to tell the code the limiting magnitudes associated with each observation in each filter; then, in case of non-detection in a certain band (filter), there are different possible ways the code can account for these upper limits. In which way does eazy-py account for them in the z_phot computation?

  2. Is the prior magnitude file applicable only if my photometric bands include exactly the prior's filter, or is it enough to have roughly the same band?
    That is: if I have K-band photometry from WIRCam, can I use the generic K prior contained in the provided file 'prior_K_extend.dat'?

  3. Is there a way not to combine the templates during the analysis?
    Putting 1 in TEMPLATE_COMBOS in the file "zphot.param.default" doesn't work; moreover, whatever value I put (2, 45, True, etc.), the code runs and gives the same results, always linearly combining all the templates (I see different components in the final plot). I didn't find any reference to the parameter TEMPLATE_COMBOS in the photoz.py file either.

  4. What is the origin of the default templates "tweak_fsps_QSF_.." and "tweak_fsps_temp_kc13_..", and for which kinds of objects are they best suited?

Thank you
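Regarding item 1 above, a common convention for chi-square photo-z codes (an assumption here, not confirmed from the eazy-py documentation) is to keep the measured flux for a non-detection, even if it is near zero or negative, with the 1-sigma depth as its uncertainty; the chi-square then naturally penalizes templates that exceed the limit. NOT_OBS_THRESHOLD (< -90) is reserved for bands that were not observed at all:

```python
import numpy as np

NOT_OBS_THRESHOLD = -90.0

# Illustrative catalog row: detected band, undetected band, unobserved band
fnu   = np.array([12.3,  0.2, -99.0])   # measured flux densities
e_fnu = np.array([ 0.8,  1.5, -99.0])   # 1-sigma depth for the undetected band

observed = fnu > NOT_OBS_THRESHOLD      # unobserved bands drop out of the fit
```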

Problems with installation of Photo-z

In the terminal, some errors occur after running the make command, as you can see:

gcc -msse2 -c -O3 -Wall main.c
In file included from main.c:1:
defs.h:35:1: error: unknown type name 'int32_t'
   35 | int32_t nusefilt32, NTEMP32, nobj32, NZ32, NK32, *izsave32, NTEMPL32;
defs.h:185:8: error: unknown type name 'int32_t'
  185 | extern int32_t *klim_idx;
main.c:126:1: error: unknown type name 'int32_t'
  126 | int32_t *klim_idx;
main.c: In function 'main':
main.c:245:34: error: 'int32_t' undeclared (first use in this function)
  245 | klim_idx = malloc(sizeof(int32_t)*nobj);
main.c:245:34: note: each undeclared identifier is reported only once for each function it appears in
main.c:446:31: error: expected ';' before 'nusefilt'
  446 | nusefilt32 = (int32_t) nusefilt;
main.c:447:28: error: expected ';' before 'NTEMP'
  447 | NTEMP32 = (int32_t) NTEMP;
main.c:448:25: error: expected ';' before 'NZ'
  448 | NZ32 = (int32_t) NZ;
main.c:449:27: error: expected ';' before 'nobj'
  449 | nobj32 = (int32_t) nobj;
main.c:611:43: error: expected ';' before 'Kcolumn'
  611 | klim_idx[iobj] = (int32_t) Kcolumn;
main.c:775:29: error: expected ';' before 'NK_prior'
  775 | NK32 = (int32_t) NK_prior;
main.c:178:16: warning: unused variable 'zm2' [-Wunused-variable]
  178 | double zm1,zm2,zpeak_best,zpeak_prob,pztot;
main.c:178:12: warning: unused variable 'zm1' [-Wunused-variable]
  178 | double zm1,zm2,zpeak_best,zpeak_prob,pztot;
main.c:177:33: warning: unused variable 'iz_zm2' [-Wunused-variable]
  177 | long i,j,iobj,izbest,iz_zm1,iz_zm2, *izsave,ipeakmax;
main.c:177:26: warning: unused variable 'iz_zm1' [-Wunused-variable]
  177 | long i,j,iobj,izbest,iz_zm1,iz_zm2, *izsave,ipeakmax;

make: *** [makefile:3: main.o] Error 1

HDFN-demo.ipynb issue and questions

I am trying to use my own catalog, with fluxes I calculated from magnitudes measured in the B, V, g', I', r SDSS-filtered pictures I took, and I have been having issues getting it set up. I was using HDFN-demo.ipynb to learn how; I got an error but had help figuring it out. The problem was with

self = eazy.photoz.PhotoZ(param_file=None, translate_file=translate_file, zeropoint_file=None, params=params, load_prior=True, load_products=False)

see picture below of the error

[screenshot of the error traceback]

The issue was with the templates directory; I had to run

!rm -R templates
os.symlink(os.path.join('easy-photos','templates'), 'templates')

to fix it.

Now I am using this demo to figure out how to load my CSV file of magnitudes and fluxes so I can estimate the redshifts, but what I'm doing is not working. Do you have any recommendations on how to do so?

Binder Demo: Building wheel for grizli (setup.py) ... error

@gbrammer I am just starting to work with eazy-py and in the binder demo I keep getting errors on the importing/installing of grizli in the second cell:
[screenshot of the build error]
followed by a slew of other red error messages.

I have tried on a few different machines to see if that was the issue and am still getting the same result. (As a disclaimer, this is my first time using binder so it could well be that.)

Unable to enforce TEMPLATE_COMBOS

I've been trying to set TEMPLATE_COMBOS = -2, per the eazy manual and the source code for eazypy. However, including this parameter in my params dictionary that I pass to the PhotoZ object fails to actually enforce this template restriction. I have even tried it with TEMPLATE_COMBOS = 2 and TEMPLATE_COMBOS = 1, and the PhotoZ.showfit() output still reveals that eazypy is using as many templates as it wants.

I have even gone so far as to change zphot.param.default under /eazy/data and eazypy still does not enforce TEMPLATE_COMBOS.

Please let me know what I might be missing

Thx

Error in example HDFN

Hello, I'm having trouble finishing the HDFN example. Basically, running cell [9]:

zout, hdu = self.standard_output(rf_pad_width=0.5, rf_max_err=2, prior=True, beta_prior=True)

I get the following error:

    238         if isinstance(item, str):
--> 239             return OrderedDict.__getitem__(self, item)
    240         elif isinstance(item, (int, np.integer)):
    241             return list(self.values())[item]

KeyError: 'energy_abs'

Any help?
Cheers

Consistency between EAZY and eazypy

Hi @gbrammer

I'm trying to understand why there are differences between the outputs of the legacy way of running EAZY and eazypy. Since both run the same C code, I'm a bit puzzled by the difference in outputs I get, as seen in the image below.

[comparison plot of legacy EAZY vs. eazypy redshifts]

The photo-z parameters differ between the two: in the legacy run I use z_peak (which I prefer as the photo-z) rather than z_ml. Legacy z_mc compares even worse. Both param files are the same.

Would you know why there is a discrepancy here?

[jaguar_mock_glass.param.txt](https://github.com/gbrammer/eazy-py/files/8897166/jaguar_mock_glass.param.txt)
jaguar_mock_glass.cat.txt
jaguar_mock_glass.translate.txt

Uncertainties on photometric redshifts

Hi,

I am trying to use eazy-py on a test sample of 1000 sources, and I am trying to understand why I am not getting uncertainties on the photo-zs. For example, when looking at zout['z025'] or zout['z160'] (which I guess are percentiles of the p(z) distribution), all the values are 0.005.

I am able to plot SEDs and P(Z). They look fine to me. To give you more context, I am using this tutorial, I am just using my own catalog with my own templates.
When I try running the tutorial, I get "right" values for zout['z025'] or zout['z160'].

Other than catalog and templates, this is what I have set up differently:

  • I am using magnitudes instead of flux densities, and I have set up params['MAGNITUDES'] = True
  • I do not turn off the iterative corrections, so I commented out self.set_sys_err(positive=True)
  • when doing zout, hdu = self.standard_output(simple=False, rf_pad_width=0.5, rf_max_err=2, prior=True, beta_prior=True, absmag_filters=[], extra_rf_filters=[])

I get this message: Couldn't find template parameters file templates/ananna17_seds/ananna_list.param.fits for population synthesis calculations.
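As a point of reference, percentile columns like z025 / z160 are typically derived by inverting the cumulative distribution of p(z); the uniform value 0.005 plausibly corresponds to the lower edge of the redshift grid, where quantiles can collapse if the fit never produced a usable p(z). A toy sketch of the quantile calculation (assumed variable names, not the eazy-py internals):

```python
import numpy as np

zgrid = np.linspace(0.005, 3.0, 600)
pz = np.exp(-0.5 * ((zgrid - 1.2) / 0.1) ** 2)   # toy p(z): Gaussian at z=1.2
pz /= np.trapz(pz, zgrid)                         # normalize to unit integral

# Cumulative distribution on the grid, then invert at the target quantiles
cdf = np.concatenate([[0.0], np.cumsum(0.5 * (pz[1:] + pz[:-1]) * np.diff(zgrid))])
z025 = np.interp(0.025, cdf, zgrid)   # 2.5th percentile, ~ 1.2 - 1.96*0.1
z160 = np.interp(0.160, cdf, zgrid)   # 16th percentile,  ~ 1.2 - 1.0*0.1

# A degenerate p(z) (e.g. from a failed fit) can collapse every
# quantile to a grid edge rather than producing a real interval.
```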

Error in extinction

Hi,

I am running eazy-py for the first time, but I got a fundamental error, I guess, as follows (even though the mentioned libraries are already installed):

ImportError: Couldn't import extinction module from dust_extinction, extinction or specutils

Thanks in advance for help.

Issue: eazy dividing all flux densities by a factor of ~2.756??

Hi, I was trying to calculate the UV magnitude from the eazy best-fit templates in a catalog of simulated data for which I have both the photometry and the spectroscopy. My results were overestimated by ~1 mag. While trying to understand why (the photometric redshifts are well estimated, and the same holds for the beta factors), I noticed that when eazy fits the catalog, each flux is for some reason divided by a factor of ~2.756. I attach an example plot comparing the SED from the eazy best fit, the spectrum from the catalog, the photometry of the catalog, and the photometry and best fit of eazy.
The parameters I am using are the following (if not listed, I used the default ones):
params['Z_STEP'] = 0.01
params['Z_MIN'] = 4
params['Z_MAX'] = 15.
params['TEMPLATES_FILE'] = 'templates/LarsonSEDTemplates/tweak_fsps_QSF_12_v3_newtemplates_34.param' (some new templates added to the default ones)
params['FIX_ZSPEC'] = False
params['IGM_SCALE_TAU'] = 1.0
params['SYS_ERR'] = 0.03

Any idea of why this is happening?

[comparison plot: example_problems_eazy]

Best-fit template not agreeing with the best-fit photometry

I am fitting some simulated HST+JWST fluxes, comparing the original EAZY to eazy-py; I've fit the same data set with the same templates (and template error function) in both, without using a prior. The minimum chi-square values (and the chi-square surfaces) agree quite well between EAZY and eazy-py, and the resulting redshifts (those corresponding to the minimum chi-square) are similar.

However, when using self.show_fit(), I've discovered that many of the targets have fits that do not seem to agree with the best-fit templates. For instance:

[plot: eazy-py show_fit output]

The best-fit F150W and F200W template fluxes are significantly higher than the best-fit template curve itself. I've pulled this template out by examining how self.show_fit() calculates it from the coefficients, and passed it through the F150W and F200W filters by hand:

[plot: template photometry computed by hand]

When I fit this object with the original EAZY (and the same parameters, including the same templates), I get this fit (in green):

[plot: original EAZY fit shown in green]

Notice that the best-fit photometry is almost identical between this fit and the eazy-py fit, but the template combination here leads to stronger line emission accounting for the high F150W and F200W fluxes. I don't know enough about how the coefficients are calculated within eazy-py to understand why this discrepancy occurs. The templates that I am using are the standard set of EAZY templates along with some generated from fsps:

# Template definition file
#
# No blank lines allowed (for now).
#
# Column definitions:
#   1. Template number
#   2. Template file name
#   3. Lambda_conv (multiplicative factor to correct wavelength units)
#   4. Age of template model in Gyr (0 means template is always used)
#   5. Template error amplitude (for INDIVIDUAL template fits)
#   6. Comma/space separated list of template numbers to be combined
#      with current template if combined fits are enabled.
#
# Sample entry:
# 1 [path_to_file]/template1.sed 1.0 14.7 0.2 2,3,5
1   templates/eazy_v1.1_sed7.sed    1.0 0 1.0   
2   templates/eazy_v1.1_sed6.sed    1.0 0 1.0   
3   templates/eazy_v1.1_sed5.sed    1.0 0 1.0   
4   templates/eazy_v1.1_sed4.sed    1.0 0 1.0   
5   templates/eazy_v1.1_sed3.sed    1.0 0 1.0   
6   templates/eazy_v1.1_sed2.sed    1.0 0 1.0   
7   templates/eazy_v1.1_sed1.sed    1.0 0 1.0   
8   templates/ssp_25Myr_z008_withem.sed    1.0 0 1.0   
9   templates/ssp_5Myr_z008_withem.sed    1.0 0 1.0   
10  templates/c09_del_8.6_z_0.019_chab_age09.40_av2.0.sed  1.0 0 1.0   
11  templates/erb2010_highEW.sed    1.0 0 1.0   
12  templates/tau_0.01_age_0.1_dust2_0.0_fsps_model.sed    1.0 0 1.0 
13  templates/tau_0.0398_age_0.3162_dust2_0.0_fsps_model.sed    1.0 0 1.0 
14  templates/tau_1.0_age_11.749_dust2_0.0_fsps_model.sed    1.0 0 1.0 
15  templates/tau_10.0_age_0.01_dust2_0.0_fsps_model.sed    1.0 0 1.0 
16  templates/tau_10.0_age_0.01_dust2_0.6_fsps_model.sed    1.0 0 1.0 

And the simulated data that I'm using is:

# id z_spec f_HST_F435W e_HST_F435W f_HST_F606W e_HST_F606W f_HST_F775W e_HST_F775W f_HST_F814W e_HST_F814W f_HST_F850LP e_HST_F850LP f_NRC_F090W e_NRC_F090W f_NRC_F115W e_NRC_F115W f_NRC_F150W e_NRC_F150W f_NRC_F200W e_NRC_F200W f_NRC_F277W e_NRC_F277W f_NRC_F335M e_NRC_F335M f_NRC_F356W e_NRC_F356W f_NRC_F410M e_NRC_F410M f_NRC_F444W e_NRC_F444W  
# id z_spec F233 E233 F236 E236 F238 E238 F239 E239 F240 E240 F363 E363 F364 E364 F365 E365 F366 E366 F375 E375 F381 E381 F376 E376 F383 E383 F377 E377
4116 -9999.0 2.8299999237060547 7.302999973297119 3.0799999237060547 8.663000106811523 3.2699999809265137 6.857999801635742 3.3399999141693115 4.50600004196167 3.569999933242798 1.1139999628067017 5.039000034332275 1.0789999961853027 6.1539998054504395 0.7509999871253967 7.366000175476074 0.8029999732971191 7.269999980926514 0.7829999923706055 4.236999988555908 0.5210000276565552 3.7190001010894775 0.8180000185966492 4.986999988555908 0.5659999847412109 4.456999778747559 0.8560000061988831 4.303999900817871 0.7170000076293945  
4152 -9999.0 19.8799991607666 11.027000427246094 16.579999923706055 5.099999904632568 15.0600004196167 1.437000036239624 14.9399995803833 7.045000076293945 14.529999732971191 1.8279999494552612 17.166000366210938 1.0880000591278076 18.32900047302246 0.7570000290870667 26.575000762939453 0.8180000185966492 28.052000045776367 0.8059999942779541 28.492000579833984 0.6110000014305115 24.31399917602539 0.9589999914169312 24.016000747680664 0.652999997138977 24.506000518798828 0.9789999723434448 22.983999252319336 0.8100000023841858  
4157 -9999.0 4.400000095367432 9.006999969482422 3.9000000953674316 3.565000057220459 4.070000171661377 1.6019999980926514 4.360000133514404 9.281999588012695 5.329999923706055 2.114000082015991 3.8940000534057617 1.0770000219345093 7.982999801635742 0.7509999871253967 7.0980000495910645 0.796999990940094 7.150000095367432 0.7799999713897705 4.627999782562256 0.5460000038146973 3.75600004196167 0.8529999852180481 4.074999809265137 0.5860000252723694 4.761000156402588 0.8930000066757202 4.1579999923706055 0.7450000047683716  
4212 -9999.0 13.289999961853027 13.012999534606934 7.869999885559082 7.86299991607666 10.039999961853027 9.72700023651123 2.759999990463257 2.447999954223633 0.07999999821186066 8.678000450134277 1.6820000410079956 1.055999994277954 4.639999866485596 0.734000027179718 4.000999927520752 0.7799999713897705 3.6710000038146973 0.7599999904632568 2.308000087738037 0.5400000214576721 2.8239998817443848 0.8550000190734863 2.609999895095825 0.5870000123977661 2.877000093460083 0.8949999809265137 3.8559999465942383 0.7590000033378601  

with fluxes in nJy. I can provide other files, including the param file, the extra templates file, and/or the h5 file, if necessary.

Question about coefficients being output for fits at all redshifts

Hello! Going through the python scripts written in the past to help analyze EAZY results, I've seen how EAZY can be set to output a binary file containing the coefficients corresponding to one of the "best fits" to the observations (I think this is z_a in what I'm doing, but I could be wrong). Looking at the chi-square distribution, though, often reveals two solutions with similarly low chi-square values. I'm wondering whether the coefficients are saved for the lowest-chi-square solution in each redshift bin, so that I can assess multiple minima in the chi-square distribution and plot these alternate fits at different redshifts. Are these saved in that binary file? If so, how can they be accessed?

Multiprocessing raises a RuntimeError when using Python >=3.8 on MacOS

On Mac systems using Python 3.8 and later, the process start method defaults to "spawn", which raises a RuntimeError if mp.Pool is not called within an if __name__ == '__main__': block. I temporarily fixed this by calling mp.set_start_method('fork') when multiprocessing is imported at the start of TemplateGrid; however, it may be better to import multiprocessing at the start of photoz.py and set the start method to 'fork' there.
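A sketch of the workaround described above (assumed structure, not the actual photoz.py code): requesting a 'fork' context explicitly lets mp.Pool be created outside an `if __name__ == '__main__':` guard. Using get_context avoids mutating the global start method, which set_start_method can only change once per process:

```python
import multiprocessing as mp

def square(x):
    # trivial stand-in for the per-object work done in TemplateGrid
    return x * x

# Request 'fork' locally instead of changing the global default
ctx = mp.get_context('fork')
with ctx.Pool(processes=2) as pool:
    results = pool.map(square, [1, 2, 3])
```

Note that 'fork' can itself be fragile with some macOS system libraries, which is why newer Pythons default to 'spawn' there; the local context at least confines the choice to this one pool.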
