akleroy / phangs_imaging_scripts
CASA+python imaging and product creation scripts for PHANGS-ALMA
License: MIT License
We automatically identify line channels for blanking in continuum data, and we use the velocity to identify SPWs of interest. This involves a frame conversion from the LSRK or BARY frame, where we know the galaxy velocity, to the "TOPO" frame used by CASA. In theory the scripts should handle this correctly; right now they rely on the magnitude of the correction being modest compared to the line width, so that the right SPW is identified and the flagged channel range just slops slightly in one direction or the other.
The issue is that CASA's TOPO frequencies have an associated date, which needs to be extracted automatically from the measurement set in question. And because the reduction scripts often concatenate several executions on different dates, we really need to figure out how to get the relevant date for the frequencies.
This should not be too hard; we just haven't implemented it yet. Andreas has scripts to do the rest, and the analysisUtils can also do the conversion given a reference date.
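To see why the correction "slops" a channel range rather than breaking SPW identification, here is a minimal back-of-the-envelope sketch (not the pipeline's actual implementation): the LSRK-to-TOPO correction is bounded by Earth's orbital plus rotational motion, roughly +/- 30 km/s, and the rest frequency and channel width below are illustrative placeholders.

```python
# Minimal sketch: estimate how far a line shifts in frequency when moving
# between LSRK and TOPO, and compare that to the channel width.

C_KMS = 299792.458  # speed of light, km/s

def doppler_freq(rest_freq_ghz, v_kms):
    """Observed frequency for a radial velocity (radio convention)."""
    return rest_freq_ghz * (1.0 - v_kms / C_KMS)

# CO(2-1) rest frequency and a typical galaxy systemic velocity (placeholders)
rest_freq = 230.538   # GHz
v_galaxy = 1500.0     # km/s, LSRK

# Worst-case frame correction: Earth's orbital + rotational motion
v_topo_correction = 30.0  # km/s

f_lsrk = doppler_freq(rest_freq, v_galaxy)
f_topo = doppler_freq(rest_freq, v_galaxy + v_topo_correction)

shift_mhz = abs(f_lsrk - f_topo) * 1e3
chan_width_mhz = 0.976  # a typical ALMA channel width, MHz

print(f"frame shift: {shift_mhz:.2f} MHz ~ {shift_mhz/chan_width_mhz:.1f} channels")
```

A few tens of channels is large compared to a channel but small compared to a typical galaxy line width of a few hundred km/s, which is why a slightly generous flagged range absorbs the error.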
Distinguish between 0s and NaNs in the moment maps (Jiayi)
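A toy numpy sketch of the distinction (hypothetical, not the pipeline code): a pixel covered by the mask whose signal sums to zero should report 0, while a pixel the mask never covers should be NaN.

```python
# Distinguish "measured zero" from "no coverage" in a moment-0 map.
import numpy as np

cube = np.zeros((4, 3, 3))          # (velocity, y, x)
cube[1:3, 0, 0] = [1.0, 2.0]        # real emission at one pixel
mask = np.zeros_like(cube, bool)
mask[1:3, 0, 0] = True              # masked-in at the emission pixel
mask[1:3, 1, 1] = True              # masked-in but signal sums to zero

mom0 = np.where(mask.any(axis=0),
                np.nansum(np.where(mask, cube, 0.0), axis=0),
                np.nan)             # coverage -> number, no coverage -> NaN

print(mom0[0, 0], mom0[1, 1], mom0[2, 2])
```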
Lower the threshold used in moment 1 construction per the discussion led by Annie.
Carry out the high res / low res "and" operation to fill in the bright cases (lingering issue).
I'm not sure how useful this is at later stages, but at the staging step there are cases where we might get new data from a project and just want to stage that. Add a selection using the "project" field in the ms_file_key. If we use this a lot, we may end up wanting to revise the short code to the full ALMA code.
It's not clear what the best scope here is. Think about whether we can consolidate things.
This will take products and assemble a release. Follows IDL code build_release. This issue is to build the skeleton and spec tasks and recipes.
We will need to implement the reading of the key file "cleanmask_key.txt", which is stored in KeyHandler._cleanmask_keys, into KeyHandler._cleanmask_dict. Then add a function KeyHandler.get_cleanmask_for_target(). These will be needed by our ImagingHandler.
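A hypothetical sketch of the parsing and lookup described above; the real KeyHandler class and the actual cleanmask_key.txt format may differ. Assumes a simple two-column "target cleanmask_filename" text format with '#' comments.

```python
# Hypothetical key-file parsing for clean masks (format assumed).
import io

def read_cleanmask_key(lines):
    """Parse cleanmask key lines into a {target: filename} dict."""
    cleanmask_dict = {}
    for line in lines:
        line = line.split('#')[0].strip()  # strip comments and whitespace
        if not line:
            continue
        target, filename = line.split()[:2]
        cleanmask_dict[target] = filename
    return cleanmask_dict

def get_cleanmask_for_target(cleanmask_dict, target):
    """Return the clean mask file for a target, or None if absent."""
    return cleanmask_dict.get(target)

key_text = io.StringIO(
    "# target  cleanmask\n"
    "ngc4303   ngc4303_co21_cleanmask.fits\n"
)
d = read_cleanmask_key(key_text)
print(get_cleanmask_for_target(d, 'ngc4303'))
```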
Noise / other details can leave the 12m, 12m+7m, etc. slightly offset. There's a request to get these all onto exactly the same grid, so that there is just one "12m" grid, one "7m" grid. Doable with a little effort, but I'm going to leave this until we resolve it.
Make the staging script a bit smarter - it should offer an overwrite option so that it can be run and put new data in place without needing the names of the new galaxies.
Do this
Do this then close this issue.
We should add a diagnostic script that images and compares the dirty beam field by field.
We need to go through the UVDataHandler and discuss/iterate what should go into separate modules and what should remain in the handler object/module.
Right now this is done by hand, which is fine. But for generality we probably want to clean things up even more.
In theory, we'd like to get Erik's pheathering visualization folded into this more or less seamlessly. I'm just not sure about the dependencies at the other sites.
This is a lower priority issue than the others, since we don't yet have new TP data flowing in. Come back to it.
OVERALL
EXTRACTION SCRIPTS
Homogenization of parameter names across scripts.
Go towards a single mstransform call without a Hanning smooth. Use chanaverage, regridms, and mode='velocity' together.
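A sketch of what that single call might look like: the parameter set below would be passed as mstransform(**params) inside CASA, and the specific values (rest frequency, velocity grid, channel binning, file names) are placeholders, not the pipeline's actual settings.

```python
# Hypothetical single-call mstransform parameter set: channel averaging and
# velocity regridding in one pass, with Hanning smoothing explicitly off.
params = dict(
    vis='input.ms',            # placeholder file names
    outputvis='output.ms',
    datacolumn='DATA',
    chanaverage=True,
    chanbin=2,                 # average pairs of channels
    regridms=True,
    mode='velocity',           # regrid onto a velocity grid
    outframe='LSRK',
    restfreq='230.538GHz',     # CO(2-1), placeholder
    start='1200km/s',
    width='2.5km/s',
    nchan=240,
    hanning=False,             # explicitly no Hanning smooth
)
print(sorted(k for k, v in params.items() if v is True))
```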
Do we want weight reporting or plotting as a sanity check in the extraction? getMeanWeights or scripted plotms calls.
Could eventually look at other options for frequency averaging in the continuum case.
IMAGING SCRIPTS
We need to make sure that we save all files needed for cube heuristics.
We need to expose the appropriate set of convergence criteria and major-cycle-forcing criteria.
Automation of phase center is certainly possible but needs a bit of care.
uvtaper should in the end probably be a "target_resolution", with the taper solved for in quadrature: estimate the current beam size and work out the taper needed to reach the target.
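The quadrature step can be sketched as below. This is only the idealized approximation (target beam = current beam convolved with a Gaussian taper); as noted elsewhere, the real interaction of u-v coverage and taper is not this clean.

```python
# Quadrature estimate of the taper needed to reach a target resolution.
import math

def taper_for_target(current_beam_arcsec, target_beam_arcsec):
    """Gaussian taper FWHM (arcsec) to degrade current beam to target."""
    if target_beam_arcsec <= current_beam_arcsec:
        raise ValueError("target must be coarser than current beam")
    return math.sqrt(target_beam_arcsec**2 - current_beam_arcsec**2)

# e.g. a 1.0" beam tapered toward a 1.5" target
print(round(taper_for_target(1.0, 1.5), 3))
```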
Fluidly switch between cube and image; we just want one script. Also allow a subset of channels, to allow tests and/or parallelization.
CLEAN/STOPPING
Clean needs to call mode="velocity" if our mstransform does.
We need to write the intermediate analysis script that reports statistics on the current state of the cube.
Need to be happy with the upscaling of iterations + major cycle triggers as the loop progresses.
Need to refine our stopping criteria via one of a few options:
- convergence in flux (flux in model changes by <~ YYY%)
- reach some overall threshold
- run out of iterations? Seems arbitrary...
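The options above can be combined into one check; the sketch below assumes a delta-flux criterion with an iteration cap as a safety net, and the 2% threshold is a placeholder, not a decided value.

```python
# Sketch of a combined stopping criterion for the clean loop.
def converged(prev_flux, curr_flux, niter_done,
              delta_frac=0.02, max_iter=100000):
    """True if the clean loop should stop."""
    if niter_done >= max_iter:
        return True          # arbitrary, but a safety net
    if prev_flux is None or prev_flux == 0.0:
        return False         # no baseline to compare against yet
    return abs(curr_flux - prev_flux) / abs(prev_flux) < delta_frac

print(converged(10.0, 10.1, 5000))   # 1% flux change -> stop
print(converged(10.0, 11.0, 5000))   # 10% flux change -> keep going
```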
MASKING
Need three paths:
- default: just use the PB mask at ~0.2.
- make a mask: convolve to coarse resolution and build a joint mask from bright signal plus signal at low resolution. Here the "coarse resolution" is probably set by the biggest scale in the multi-scale clean, or a 2D dilation takes you to the biggest scale. In any case, the mask needs to be expansive enough to allow the use of the big scales. Jerome's recommendation was a very broad mask, so convolution to ~10" or so may even be our approach.
- import a user-supplied mask that gets aligned to the dirty cube.
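The second path can be sketched with a toy numpy example (thresholds, the boxcar stand-in for convolution, and the noise scaling are all placeholders, not the pipeline's choices): take bright pixels at native resolution, OR them with pixels significant in a smoothed version, and keep the union.

```python
# Toy joint bright + low-resolution masking on a synthetic image.
import numpy as np

def boxcar_smooth(img, half_width=1):
    """Crude coarse-resolution proxy: boxcar average via shifted copies."""
    out = np.zeros_like(img, float)
    n = 0
    for dy in range(-half_width, half_width + 1):
        for dx in range(-half_width, half_width + 1):
            out += np.roll(np.roll(img, dy, 0), dx, 1)
            n += 1
    return out / n

rng = np.random.default_rng(0)
img = rng.normal(0.0, 1.0, (32, 32))
img[10:14, 10:14] += 5.0             # an extended "source"

noise = 1.0
bright = img > 5.0 * noise           # high significance, native resolution
smooth = boxcar_smooth(img, 2)       # 5x5 boxcar: noise drops by ~5x
faint = smooth > 2.0 * (noise / 5.0) # lower threshold in the smoothed map
mask = bright | faint                # expansive union for multi-scale clean

print(bool(mask[11, 11]), int(mask.sum()), int(bright.sum()))
```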
FUTURE
interface with galbase to note sources in cube
identify lines in cube given the galaxy
By-hand parallelization can probably wait. Andreas's experience is that 4-5 cores get used on a 20-core machine, so the gain would be a factor of ~3-4. But right now a day is sort of par for the course.
Add a stopping criterion so that if the model flux goes net negative then we stop.
This represents a larger extension of the scripts. Probably need a dedicated post processing for this. We would need to modify the clean scripts to deal with this.
So far only the delivered data have mosaic parameters. Fill out the rest of these ASAP, and reducing the delivered data will be easier.
Add a third organizational script - this should handle flagging, directory swapping, and running the calibration scripts.
These go to the screen right now and the log is saved, but we should write our own small log file, showing the progression of cleaning. This could also be used to pick up the number of iterations in the "deep" clean from the end of the "bright" iterations.
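A hypothetical sketch of such a small progression log (the CSV format and field names are invented here): append one row per clean call, so the "deep" stage can pick up the iteration count where the "bright" stage stopped.

```python
# Minimal clean-progression log: one row per clean call.
import csv, io

def log_clean_step(fh, stage, niter, model_flux, peak_residual):
    """Append one row recording the state after a clean call."""
    csv.writer(fh).writerow([stage, niter, model_flux, peak_residual])

def last_niter(fh):
    """Iteration count of the last logged clean call."""
    rows = list(csv.reader(fh))
    return int(rows[-1][1]) if rows else 0

# In practice fh would be a small file next to the imaging products.
buf = io.StringIO()
log_clean_step(buf, 'bright', 10000, 4.2, 0.012)
log_clean_step(buf, 'bright', 20000, 4.4, 0.008)
buf.seek(0)
n = last_niter(buf)
print(n)
```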
But scripting this is a pain in the butt given the not-quite-straightforward interaction of u-v coverage + taper to produce the beam. We can ask around at the data reduction telecon to see if people have smart ideas.
In principle, this can gain us ~0.1 arcsec or similar.
A single end-to-end call works, but something is off in the clean call generation (or somewhere else) that stops revert_to_multiscale, etc. from allowing a partial reset.
There appear to be issues right now due to variables hanging around from previous calls in the ipython shell, so that it's not possible to just call a bunch of cleans one after the next. Fix this.
This is certainly due to the lack of encapsulation (stupid execfiles) from just running everything at the basic shell level, but we can't fix that unless we drop to calling CASA from the linux shell. That seems like overkill.
Instead, go hunt down the variables that hang around and cause the problems. These should be either deleted or trapped so that things reset on the next call.
Want to be able to run the whole script from copy -> image -> cubes -> output hands off.
Once these are done, migrate the phangs-specific processing into the postprocessHandler object, which just juggles file names and applies recipes.
The backup/replace functions in the v1 pipeline need to untangle the names. That will allow resuming from intermediate stages. This is fixed in v2; we just need to check out a v1 and fix it.
Possibly project, too. This should simplify the code inside the tasks considerably and shift all iteration to the loop. The list building routines should already be in the data.
This should (mostly) be subtraction of code rather than addition, I hope. There are actually issues for each individual routine: copying, custom scripts, line extraction, etc. We can open those or put a checklist here or whatever.
In general, add this capability to "makeMask." It's just some axis shuffling, regridding, and read-and-write. Then faint lines can/should begin with a prior based on a brighter line.
This doesn't always happen, so let's wait and see, but I encountered the case that some of the .temp files made by extractLine weren't deleted because inside CASA they were viewed as being used. On quitting CASA they could be deleted by hand.
We're now using python's logging features. Need to extend this to the casaFeather, etc. routines.
Mostly this is fighting syntax related to using the tools deep inside functions. Should take no more than a couple of hours.
Channel 0 and continuum imaging should be relatively easy to execute using the pipeline, but I haven't yet tested this. Will require a bit of test and debugging. Do this.
Finish pushing the clean loop structure to be fully arbitrary by:
- saving the previous clean output
- accepting the convergence criteria as a function with arguments of the current cube and previous cube, along with a running dictionary. Then implement our simple delta flux and threshold criteria as an example convergence function.
...
Should be only a few lines to finish this.
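A sketch of that pluggable interface (names and the dict-based stand-in for cube statistics are assumptions): the loop accepts any function f(current, previous, state) -> bool, and the delta-flux criterion is one example implementation.

```python
# Pluggable convergence-function interface for the clean loop.
def delta_flux_converged(current, previous, state, delta_frac=0.02):
    """Example criterion: stop when model flux changes by < delta_frac."""
    if previous is None:
        return False
    prev, curr = previous['model_flux'], current['model_flux']
    state['delta'] = abs(curr - prev) / max(abs(prev), 1e-30)
    return state['delta'] < delta_frac

def clean_loop(stats_per_call, is_converged):
    """Step through per-call cube stats until the criterion says stop."""
    state, previous = {}, None
    for i, current in enumerate(stats_per_call):
        if is_converged(current, previous, state):
            return i
        previous = current
    return len(stats_per_call)

# Fake per-call statistics standing in for the saved clean outputs.
calls = [{'model_flux': f} for f in (1.0, 1.5, 1.52, 1.521)]
print(clean_loop(calls, delta_flux_converged))
```

The running dictionary lets a criterion carry state (trends, counters) across calls without the loop knowing about it.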
Refactor the release building code into separate "cube" and "product" pipelines. Should mostly just be a split down the middle of the current code.
Cube : deal with round beams, units, feathering, merging of multiple fields, and then convolution to various physical resolutions.
Product : estimate noise, make the various masks, including clean masks, and then build the release.
This isn't affecting anything other than general cleanliness of the code, so come back to this once things run end-to-end for the refactored v1+.
There are hooks for user-defined overrides in the postprocess handler. Put these in and document them. Many of these existed in v1.0. Then resolve this ticket.
Document these scripts better.
For now, leave out the tasks and implementation. We should start by building the loops and infrastructure in parallel to the postprocessHandler and uvdataHandler.
That's all. I'm closing this immediately.
We want both a high res and a tapered version of most of our cubes. Once the first batch runs for all galaxies, we should update the scripts to do this and then run these.
We need python (spectral cube or np/scipy) code to collapse cubes.
Architecture is up for discussion. I think this could be its own scXXXYYY module. And probably it makes sense to have a series of functions rather than one monster function (cpropstoo collapse_cube is probably too much "monster").
We can do a full checklist, but we need:
- mom0
- tpeak
- tpeak over a 12.5 km/s (or other) window
- mom1
- vpeak (and maybe vpeak_quad)
- mom2
- equivalent width
That's a minimum list - we can for sure use more.
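A minimal numpy sketch of the function-per-product layout (as opposed to one monster routine); real code would use spectral-cube or handle units, masks, and headers properly.

```python
# One small function per collapse product.
import numpy as np

def mom0(cube, dv):
    """Integrated intensity: sum along velocity axis times channel width."""
    return np.nansum(cube, axis=0) * dv

def tpeak(cube):
    """Peak intensity along the velocity axis."""
    return np.nanmax(cube, axis=0)

def mom1(cube, vaxis):
    """Intensity-weighted mean velocity."""
    w = np.nansum(cube, axis=0)
    return np.nansum(cube * vaxis[:, None, None], axis=0) / w

nchan = 5
vaxis = np.linspace(0.0, 10.0, nchan)   # km/s
cube = np.zeros((nchan, 2, 2))
cube[2, 0, 0] = 1.0                      # emission at v = 5 km/s

print(mom0(cube, dv=2.5)[0, 0], tpeak(cube)[0, 0], mom1(cube, vaxis)[0, 0])
```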
They all need:
I would suggest making this as modular as possible while keeping sanity.
@low-sky That's you!
will go in each of the various "handlers"
Probably this is a bunch of try/except statements in the casaStuff module, or maybe something simpler in the handlers themselves that heads off any attempted CASA imports (like a quick test followed by the CASA imports if successful).
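A sketch of the guarded-import idea (the module name casatasks and the decorator are assumptions, not the repo's actual code): try the CASA import once, record whether it worked, and let CASA-dependent functions fail with a clear error outside CASA instead of crashing at import time.

```python
# Guarded CASA import: set a flag instead of crashing outside CASA.
try:
    import casatasks          # only resolvable inside a CASA environment
    HAS_CASA = True
except ImportError:
    casatasks = None
    HAS_CASA = False

def require_casa(func):
    """Decorator: make CASA-dependent functions fail with a clear error."""
    def wrapper(*args, **kwargs):
        if not HAS_CASA:
            raise RuntimeError(f"{func.__name__} requires a CASA environment")
        return func(*args, **kwargs)
    return wrapper

@require_casa
def run_tclean(**params):
    return casatasks.tclean(**params)

print(HAS_CASA)
```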
We should remove this, as the logger levels serve this function already.
Not sure - should we move this repo to the phangs team ownership?
NB a number of non-phangs people (EDGE, VERTICO, etc.) are collabs on this. Does that mess this up?
UV continuum subtraction and other data set specific processing needs to be added.
Add this, so that we can call the loops and debug the handler without doing all operations.
Start with the loops and initialization and leave out the tasks for now. Similar to postprocessHandler and uvdataHandler ...
print_uv_ranges and my by-hand inspection of the clean mask vs. residuals and image should just drop into the pipeline as functions. This isn't blocking anything, so it is probably low priority.
Not urgent. Fix eventually.
For the run through the 4303 CO 2-1 cube, I saw some artifacts - the striping over a few fields - appear over a couple channels and a couple fields. Pretty limited, but not okay to keep around.
These are not in the "bright" checkpoint, and so appear only in the deep clean. Will need to tinker a bit to see if we can change that part of the loop to avoid clean wandering off into a pathological place.