pyse's Introduction

Python Source Extractor

This is the Python Source Extractor (PySE) for radio astronomical images.

This project was formerly part of the Transient Detection Pipeline:

https://github.com/transientskp/tkp/

Installation

PySE is available on PyPI:

$ pip install radio-pyse

PySE requires Python 3.10 or later.

License

PySE is released under the BSD-2 license.

Authors

The list of authors, sorted by the number of commits:

  • Hanno Spreeuw
  • John Swinbank
  • Gijs Molenaar
  • Tim Staley
  • Evert Rol
  • John Sanders
  • Bart Scheers
  • Mark Kuiack

Developer information

PySE uses hatch to manage the different environments during development, so make sure you have hatch installed globally. You can either use your system's package manager to install hatch, or use pipx to install it as a regular user. Please ensure that you are using hatch>=1.10; otherwise you might encounter this bug.

In general, to run a command in a specific environment managed by hatch, you can do this:

$ hatch run <environment>:<command> [--options]

You can see a summary of all the environments managed by hatch like this:

$ hatch env show
             Standalone
┏━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Name    ┃ Type    ┃ Dependencies ┃
┡━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ default │ virtual │              │
├─────────┼─────────┼──────────────┤
│ test    │ virtual │ pytest       │
│         │         │ pytest-cov   │
├─────────┼─────────┼──────────────┤
│ lint    │ virtual │ black        │
│         │         │ flake8       │
│         │         │ mypy         │
│         │         │ ruff         │
└─────────┴─────────┴──────────────┘

Some common tasks using hatch are summarised below.

Package builds

hatch does package builds in an isolated environment. The package build setup also uses a dynamic build hook to generate the package version from Git repository release tags. So to do a local package build, you need to ensure all Git tags are present.

  1. Fetch all Git release tags locally.

    $ git fetch --tags
  2. You can now build a distribution (a wheel file and a source tarball) locally using:

    $ hatch build

    This creates the distribution files in the dist/ directory in the project root.

    $ tree dist/
    dist/
    ├── radio_pyse-0.3.2.dev9+gfb04dc7.d20240729-py3-none-any.whl
    └── radio_pyse-0.3.2.dev9+gfb04dc7.d20240729.tar.gz
    
  3. If you want to trigger only the build hooks (like generating the package version), you can do:

    $ hatch build --hooks-only

    This is necessary to refresh the version information if you update any of the build configuration in pyproject.toml, or if you are implementing something that depends on the version, e.g. making a new capability available only for a newer version.

Running the test suite

$ hatch run test:pytest [tests/test_iwanttorun.py] [-k match_string] [--options]
$ hatch run test:pytest --no-cov  # to disable coverage

Running formatters and static analysis tools

You can run the supported linters/formatters (see the environment definition for lint) like this:

$ hatch run lint:mypy [--options]
$ hatch run lint:flake8 [--options]
$ hatch run lint:ruff check sourcefinder
$ hatch run lint:black --check sourcefinder

Note that on the first run, mypy might need to install type stubs. You can do that with:

$ hatch run lint:mypy --install-types --non-interactive

Running scripts that use PySE

Normally a regular user would install a released version from PyPI, but to use a development version you may run such scripts like this:

$ hatch run scripts/pyse [--options]

Since the development environment is the default, you don't need to specify the <environment>: prefix in the run command.

pyse's Issues

iscasa() errors when a string is supplied as path

When porting over the accessors from tkp to use the accessors from pyse, I encountered an error when attempting to open a casacore file. The file is being opened by the test tests/test_inject.py::TestLofarCasaInject::test_no_injection in tkp. It tries to open the following file: https://github.com/transientskp/trap-test-data/tree/master/casatable/L55596_000TO009_skymodellsc_wmax6000_noise_mult10_cell40_npix512_wplanes215.img.restored.corr
The open call is made on this line:
https://github.com/transientskp/tkp/blob/8a19cd23c7141c66c1ee8e42295957bbcf809531/tests/test_inject.py#L55

It then errors with the following error:

*** Boost.Python.ArgumentError: Python argument types in
    Table.__init__(table, list, list, int, int, int)
did not match C++ signature:
    __init__(_object*, casacore::String, casacore::String, casacore::String, bool, casacore::IPosition, casacore::String, casacore::String, int, int, casacore::Vector<casacore::String, std::allocator<casacore::String> >, casacore::Vector<casacore::String, std::allocator<casacore::String> >)
    __init__(_object*, casacore::String, casacore::Record, casacore::String, casacore::String, int, casacore::Record, casacore::Record)
    __init__(_object*, std::vector<casacore::TableProxy, std::allocator<casacore::TableProxy> >, casacore::Vector<casacore::String, std::allocator<casacore::String> >, int, int, int)
    __init__(_object*, casacore::Vector<casacore::String, std::allocator<casacore::String> >, casacore::Vector<casacore::String, std::allocator<casacore::String> >, casacore::Record, int)
    __init__(_object*, casacore::String, casacore::Record, int)
    __init__(_object*, casacore::String, std::vector<casacore::TableProxy, std::allocator<casacore::TableProxy> >)
    __init__(_object*, casacore::TableProxy)
    __init__(_object*)

The one noteworthy difference I found is that pyse uses .encode():

table = casacore_table(filename.encode(), ack=False)

Where tkp does not:
https://github.com/transientskp/tkp/blob/8a19cd23c7141c66c1ee8e42295957bbcf809531/tkp/accessors/detection.py#L64

I think this was fixed by @HannoSpreeuw when moving from python2 to python3 in this commit:
transientskp/tkp@a8c7d02

But it seems the fix never made it to pyse.

Is this correct? If so, let's fix it on pyse as well :)

Cheers,
Timo
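
A minimal sketch of the idea discussed above, assuming (as the tkp version suggests) that the table constructor wants a str rather than bytes. The helper name as_table_name is hypothetical, and the casacore call from the issue is left as a comment so the snippet runs without casacore installed.

```python
import os
from pathlib import Path


def as_table_name(path):
    """Return a filesystem path as str, whether it arrived as str, bytes or Path."""
    # os.fsdecode accepts str, bytes and os.PathLike objects and returns str.
    return os.fsdecode(path)


# Hypothetical use at the call site quoted in the issue:
# table = casacore_table(as_table_name(filename), ack=False)

print(type(as_table_name(b"image.restored.corr")).__name__)
print(as_table_name(Path("casatable") / "image.restored.corr"))
```

Normalising the argument once, at the point where the table is opened, avoids scattering encode() and decode() calls over the accessors.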

Implementation of `ParamSet` is inconsistent

Description

Output from mypy

sourcefinder/extract.py:244: error: Cannot assign to a method  [method-assign]
sourcefinder/extract.py:244: error: Incompatible types in assignment (expression has type "dict[str, Uncertain]", variable has type "Callable[[], ValuesView[Any]]")  [assignment]
sourcefinder/extract.py:271: error: Value of type "Callable[[], ValuesView[Any]]" is not indexable  [index]
sourcefinder/extract.py:274: error: Unsupported right operand type for in ("Callable[[], ValuesView[Any]]")  [operator]
sourcefinder/extract.py:276: error: Unsupported target for indexed assignment ("Callable[[], ValuesView[Any]]")  [index]
sourcefinder/extract.py:278: error: Value of type "Callable[[], ValuesView[Any]]" is not indexable  [index]
sourcefinder/extract.py:279: error: Unsupported right operand type for in ("Callable[[], ValuesView[Any]]")  [operator]
sourcefinder/extract.py:280: error: Value of type "Callable[[], ValuesView[Any]]" is not indexable  [index]
sourcefinder/extract.py:295: error: "Callable[[], ValuesView[Any]]" has no attribute "keys"  [attr-defined]

The implementation is inconsistent. While it inherits from collections.abc.MutableMapping, it doesn't respect the mapping API.

The problem is here (the first error above):

self.values = {
    'peak': Uncertain(),
    'flux': Uncertain(),
    'xbar': Uncertain(),
    'ybar': Uncertain(),
    'semimajor': Uncertain(),
    'semiminor': Uncertain(),
    'theta': Uncertain(),
    'semimaj_deconv': Uncertain(),
    'semimin_deconv': Uncertain(),
    'theta_deconv': Uncertain()
}

For example, in the line mentioned in the error above, it should not be overriding the values() method. AFAIU, the implementation stores its data as a regular dict in ParamSet.values, and all downstream use looks like this: instance.values.<dict_method_call> instead of instance.<dict_method_call>.

Resolution

  1. Either MutableMapping should be removed.

    @@ -220,7 +220,7 @@ class Island(object):
             return measurement, gauss_residual
     
     
    -class ParamSet(MutableMapping):
    +class ParamSet:
         """
         All the source fitting methods should go to produce a ParamSet, which
         gives all the information necessary to make a Detection.
  2. Or the implementation should be fixed to reflect the Python mapping API.
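
As an illustration of the second option, here is a minimal sketch of a ParamSet that follows the mapping API. It is not PySE's code: the Uncertain dataclass is a simplified stand-in, and the private attribute name _values is an assumption chosen so that the inherited values() method is no longer shadowed.

```python
from collections.abc import MutableMapping
from dataclasses import dataclass


@dataclass
class Uncertain:
    """Simplified stand-in for PySE's Uncertain: a value with an error."""
    value: float = 0.0
    error: float = 0.0


class ParamSet(MutableMapping):
    """Mapping of parameter name to Uncertain, backed by a private dict."""

    _names = ("peak", "flux", "xbar", "ybar", "semimajor", "semiminor",
              "theta", "semimaj_deconv", "semimin_deconv", "theta_deconv")

    def __init__(self):
        # Private storage: the inherited values() stays a method.
        self._values = {name: Uncertain() for name in self._names}

    # The five abstract methods MutableMapping requires.
    def __getitem__(self, key):
        return self._values[key]

    def __setitem__(self, key, value):
        self._values[key] = value

    def __delitem__(self, key):
        del self._values[key]

    def __iter__(self):
        return iter(self._values)

    def __len__(self):
        return len(self._values)


params = ParamSet()
params["peak"] = Uncertain(12.5, 0.3)
print("flux" in params, len(params), params["peak"].value)
```

With this shape, downstream code would write instance["peak"] or instance.keys() rather than instance.values["peak"], and keys(), items(), values(), get() and update() come for free from MutableMapping.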

RMS map masked

I'm using radio_pyse 0.2.2 and getting a masked RMS map. The PySE messages are given below and the image file is attached. The image looks fine in DS9; I don't understand what's going on in PySE.
DISK01C0656.fits.gz

pyse --grid 128 --regions DISK01C0656.fits
Processing DISK01C0656.fits (file 1 of 1).
WARNING:sourcefinder.accessors.fitsimage:WCS units unknown; using degrees
WARNING:sourcefinder.accessors.fitsimage:End time not specified in DISK01C0656.fits, setting to start
Thresholding with det = 10.000000 sigma, analysis = 3.000000 sigma
WARNING:root:RMS map masked; sourcefinding skipped

Import error

I ran into the error below when trying to run pyse -h installed from the main branch. I believe this link should provide a straightforward solution: https://stackoverflow.com/questions/70870041/cannot-import-name-mutablemapping-from-collections

Traceback (most recent call last):
  File "/mnt/Gunther/Software/pysenv/bin/pyse", line 7, in <module>
    exec(compile(f.read(), __file__, 'exec'))
  File "/mnt/Gunther/Software/pyse/scripts/pyse", line 32, in <module>
    from sourcefinder.accessors import open as open_accessor
  File "/mnt/Gunther/Software/pyse/sourcefinder/accessors/__init__.py", line 16, in <module>
    from sourcefinder.image import ImageData
  File "/mnt/Gunther/Software/pyse/sourcefinder/image.py", line 11, in <module>
    from sourcefinder import extract
  File "/mnt/Gunther/Software/pyse/sourcefinder/extract.py", line 8, in <module>
    from collections import MutableMapping
ImportError: cannot import name 'MutableMapping' from 'collections' (/usr/lib/python3.10/collections/__init__.py)
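
For reference, the abstract base classes live in collections.abc, and the old aliases in collections were removed in Python 3.10. A short sketch of the corrected import:

```python
# The ABCs are importable from collections.abc on every supported Python 3.
from collections.abc import MutableMapping

# A plain dict is registered as a MutableMapping, so this prints True.
print(issubclass(dict, MutableMapping))
```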

Add developer documentation

  • building packages
  • versioning
  • running the test suite
  • running linters/static analysers/formatters
  • running scripts

Not python 2 compatible.

Installed with pip install radio-pyse:

$ pyse --detection 5 --radius 400 --csv --force-beam * 
Processing 20160930093114UTC_S307.5_I16x1_W6_A1.5.fits (file 1 of 2719).
Thresholding with det = 5.000000 sigma, analysis = 3.000000 sigma
/scratch/mkuiack/venv2/local/lib/python2.7/site-packages/sourcefinder/image.py:761: RuntimeWarning: invalid value encountered in greater
  (self.data_bgsubbed > analysisthresholdmap) &
Traceback (most recent call last):
  File "/scratch/mkuiack/venv2/bin/pyse", line 396, in <module>
    print_(run_sourcefinder(files, options), end=' ')
  File "/scratch/mkuiack/venv2/bin/pyse", line 388, in run_sourcefinder
    csvfile.write(csv(sr))
  File "/scratch/mkuiack/venv2/bin/pyse", line 93, in csv
    file=output)
  File "/scratch/mkuiack/venv2/local/lib/python2.7/site-packages/six.py", line 782, in print_
    _print(*args, **kwargs)
TypeError: unicode argument expected, got 'str'

Troubles with asscalar

Hello. I am getting this error when running pyse:

Traceback (most recent call last):
  File "/Users/meriembehiri/anaconda3/bin/pyse", line 32, in <module>
    from sourcefinder.accessors import open as open_accessor
  File "/Users/meriembehiri/anaconda3/lib/python3.8/site-packages/sourcefinder/accessors/__init__.py", line 13, in <module>
    from sourcefinder.accessors import detection
  File "/Users/meriembehiri/anaconda3/lib/python3.8/site-packages/sourcefinder/accessors/detection.py", line 9, in <module>
    from sourcefinder.accessors.aartfaaccasaimage import AartfaacCasaImage
  File "/Users/meriembehiri/anaconda3/lib/python3.8/site-packages/sourcefinder/accessors/aartfaaccasaimage.py", line 5, in <module>
    from sourcefinder.accessors.casaimage import CasaImage
  File "/Users/meriembehiri/anaconda3/lib/python3.8/site-packages/sourcefinder/accessors/casaimage.py", line 8, in <module>
    from sourcefinder.utility.coordinates import WCS
  File "/Users/meriembehiri/anaconda3/lib/python3.8/site-packages/sourcefinder/utility/coordinates.py", line 13, in <module>
    from astropy import wcs as pywcs
  File "/Users/meriembehiri/anaconda3/lib/python3.8/site-packages/astropy/wcs/__init__.py", line 26, in <module>
    from .wcs import *
  File "/Users/meriembehiri/anaconda3/lib/python3.8/site-packages/astropy/wcs/wcs.py", line 50, in <module>
    from astropy import units as u
  File "/Users/meriembehiri/anaconda3/lib/python3.8/site-packages/astropy/units/__init__.py", line 17, in <module>
    from .quantity import *
  File "/Users/meriembehiri/anaconda3/lib/python3.8/site-packages/astropy/units/quantity.py", line 28, in <module>
    from .quantity_helper import (converters_and_unit, can_have_arbitrary_unit,
  File "/Users/meriembehiri/anaconda3/lib/python3.8/site-packages/astropy/units/quantity_helper/__init__.py", line 10, in <module>
    from . import helpers, function_helpers
  File "/Users/meriembehiri/anaconda3/lib/python3.8/site-packages/astropy/units/quantity_helper/function_helpers.py", line 119, in <module>
    np.asscalar,
  File "/Users/meriembehiri/anaconda3/lib/python3.8/site-packages/numpy/__init__.py", line 320, in __getattr__
    raise AttributeError("module {!r} has no attribute "
AttributeError: module 'numpy' has no attribute 'asscalar'

What could it be? I looked for asscalar in the pyse files, intending to replace it, but I didn't find any.
Tell me if you have any clue please.
Thanks,
Meriem

Missing FitsImage fixes

Hey guys,

I was trying to read fits images using the pyse accessors, but ran into an error while the fits accessor from tkp worked fine. I looked into it and found that the version in pyse is from 7 years ago and is missing some fixes.

The problem I ran into is that I encounter the following line:

freq_eff = header['RESTFRQ']

but don't have 'RESTFRQ' in the header, hence a KeyError is raised.

It looks like this was addressed in the following tkp commit:
transientskp/tkp@ae5aa59

I imagine the simplest way to address this is by cargo-culting fits_image.py from tkp:
https://github.com/transientskp/tkp/blob/8a19cd23c7141c66c1ee8e42295957bbcf809531/tkp/accessors/fitsimage.py

@suvayu Shall I make a PR with the copied fits_image.py or would you prefer to tackle this yourself?

Cheers,
Timo
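
To make the failure mode concrete, here is a minimal sketch of a lookup that tolerates a missing 'RESTFRQ' keyword. It is not the tkp implementation: the function name effective_frequency is hypothetical, a plain dict stands in for the FITS header, and the fallback to a FREQ axis described by CTYPEn/CRVALn is an assumption about what a reasonable alternative might be.

```python
def effective_frequency(header):
    """Return the effective frequency from a header mapping, or None."""
    # Preferred keyword, as used by the line quoted above.
    if "RESTFRQ" in header:
        return header["RESTFRQ"]
    # Assumed fallback: search the WCS axes for a frequency axis.
    for axis in range(1, header.get("NAXIS", 0) + 1):
        if str(header.get(f"CTYPE{axis}", "")).startswith("FREQ"):
            return header.get(f"CRVAL{axis}")
    # Nothing usable: let the caller decide how to handle it.
    return None


print(effective_frequency({"RESTFRQ": 6.0e7}))
print(effective_frequency({"NAXIS": 3, "CTYPE3": "FREQ", "CRVAL3": 1.4e9}))
print(effective_frequency({"NAXIS": 2}))
```

Returning None instead of raising lets the accessor log a warning and continue, which is closer to how the other missing-metadata cases in the log output above are reported.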

Modules missing from setup.py

There seem to be a few modules missing from setup.py that generate errors like the one below. I ran into these errors with dask, dask-distributed, and psutil.

Traceback (most recent call last):
  File "/mnt/Gunther/Software/pysenv/bin/pyse", line 7, in <module>
    exec(compile(f.read(), __file__, 'exec'))
  File "/mnt/Gunther/Software/pyse/scripts/pyse", line 32, in <module>
    from sourcefinder.accessors import open as open_accessor
  File "/mnt/Gunther/Software/pyse/sourcefinder/accessors/__init__.py", line 16, in <module>
    from sourcefinder.image import ImageData
  File "/mnt/Gunther/Software/pyse/sourcefinder/image.py", line 18, in <module>
    import dask.array as da
ModuleNotFoundError: No module named 'dask'
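
A quick way to find every missing runtime dependency in one pass, rather than one traceback at a time, is to probe for the modules without importing them. This is a sketch using only the standard library; the helper name missing_modules is hypothetical, and the module list mirrors the packages named above (dask-distributed is assumed to be importable as distributed).

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if find_spec(name) is None]


# Modules reported as missing in this issue.
print(missing_modules(["dask", "distributed", "psutil"]))
```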

rms too high

I constructed a 4k*4k image with 409² 100 Jy sources with a Gaussian profile on a regular grid, inspired by the authors of this paper. They use the same size image, with 167000 sources.
I inserted those sources in the image plane on top of pure Gaussian - so uncorrelated - noise with a std of 1 Jy.

I ran

pyse --detection=50 --analysis=50 --grid=4096 --rmsmap --islands SOURCESINSERTED_100Jy.FITS

to get a single noise level, which should be close to 1 Jy, and, hopefully, 167281 detections.

Unfortunately, the rms value from the rms.fits is 19.5608 Jy, far above the ground truth of 1 Jy.

Same output with pyse.py from the tkp repo.

Generate and deploy docs

  • API docs
  • Installation instructions & developer docs
  • Multiple versions: master, latest & past releases

Reimplement `DataAccessor` for easier testing & validation

From Mypy:

sourcefinder/accessors/dataaccessor.py:90: error: "DataAccessor" has no attribute "tau_time"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:91: error: "DataAccessor" has no attribute "freq_eff"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:92: error: "DataAccessor" has no attribute "freq_bw"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:93: error: "DataAccessor" has no attribute "taustart_ts"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:94: error: "DataAccessor" has no attribute "url"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:95: error: "DataAccessor" has no attribute "beam"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:96: error: "DataAccessor" has no attribute "beam"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:97: error: "DataAccessor" has no attribute "beam"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:98: error: "DataAccessor" has no attribute "centre_ra"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:99: error: "DataAccessor" has no attribute "centre_decl"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:100: error: "DataAccessor" has no attribute "pixelsize"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:101: error: "DataAccessor" has no attribute "pixelsize"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:112: error: "DataAccessor" has no attribute "wcs"  [attr-defined]

This is because the base class uses a custom metaclass, and the base class doesn't really have these defined, but some methods use them (e.g. extract_metadata).

This kind of enforcement of API behaviour is much more cleanly achieved by using either of the following,

Make a new minor release

  • Convert to pyproject.toml (#48)
  • casacore doesn't have 3.12 wheels, restrict to <3.12
  • Restrict Python version >=3.10
  • Separate build/test dependencies from runtime requirements
  • Clean-up entry points (scripts)
  • #49
  • Automatically upload wheel to PyPI

Migrate test data from LFS to plain Git

In principle, LFS is the “right” way to manage large data files like the ones in the PySE tests.

In practice, the transientskp organization keeps running out of LFS bandwidth quota.

All of the test data files are smaller than the GitHub limits on file sizes.

I propose, therefore, to migrate this data out of LFS and into plain Git.

Perform kappa, sigma clipping using multiple CPU cores. Try Ray.

Usually, a large image is divided into many subimages because background statistics can vary over the image.
The background statistics are derived by kappa, sigma clipping.
Presently, this is done serially, so subimage after subimage.

Since this is an "embarrassingly parallel" problem, in that the output of one subimage does not depend on another, we can use all the cores of a CPU simultaneously.

Ray is just one package we can try.
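
The pattern can be sketched with the standard library alone. Note that this uses concurrent.futures rather than Ray, and a simplified pure-Python clipping loop rather than PySE's implementation; the function names, the default kappa and the use of the median as the clipping centre are all assumptions made for illustration.

```python
from concurrent.futures import ProcessPoolExecutor
from functools import partial
from statistics import median, pstdev


def kappa_sigma_clip(pixels, kappa=2.0, max_iter=50):
    """Return (median, std) of pixels after iterative kappa-sigma clipping."""
    values = list(pixels)
    for _ in range(max_iter):
        centre = median(values)
        sigma = pstdev(values)
        # Keep only the values within kappa * sigma of the centre.
        kept = [v for v in values if abs(v - centre) <= kappa * sigma]
        # Stop when nothing was clipped (or everything would be).
        if not kept or len(kept) == len(values):
            break
        values = kept
    return median(values), pstdev(values)


def clip_all(subimages, kappa=2.0, executor_cls=ProcessPoolExecutor):
    """Clip each subimage independently, one task per subimage."""
    worker = partial(kappa_sigma_clip, kappa=kappa)
    with executor_cls() as executor:
        # map preserves the order of the subimages in its results.
        return list(executor.map(worker, subimages))
```

Because each subimage is an independent task, the same clip_all call could be pointed at a different executor. On platforms that spawn worker processes, call it from under an if __name__ == "__main__": guard.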

Missing argument in `ImageData.reverse_se`

Description

Mypy tells me:

sourcefinder/image.py:468: error: Missing positional argument "anl" in call to "extract"  [call-arg]

I don't know what should be the correct value of anl here, so I
couldn't fix it.

Remarks

This method is not used anywhere in PySE. I'm guessing this was
implemented because it might be useful elsewhere in TraP? If so,
there should be explicit tests for the ImageData class testing all
potential downstream API calls. There are no such tests.
