
cdutil's Introduction

cdat

⚠️ WARNING: Maintenance-only mode until around the end of 2023.
The CDAT library is now in maintenance-only mode, with deprecation and end of support planned for around the end of calendar year 2023. Until then, the dependencies for specific CDAT packages (cdms2, cdat_info, cdutil, cdtime, genutil, libcdms) will be monitored to ensure they build and install in Conda environments. We currently support Python versions 3.7, 3.8, 3.9, and 3.10. Unfortunately, feature requests and bug fixes will no longer be addressed.
If you are interested in an alternative solution, please check out the xarray and xCDAT - Xarray Extended With Climate Data Analysis Tools projects.

CDAT builds on the following key technologies:

  1. Python and its ecosystem (e.g. NumPy, Matplotlib);
  2. Jupyter Notebooks and IPython;
  3. A toolset developed at LLNL for the analysis, visualization, and management of large-scale distributed climate data;
  4. VTK, the Visualization Toolkit, which is open source software for manipulating and displaying scientific data.

These combined tools, along with others such as the R open-source statistical analysis and plotting software and custom packages (e.g. DV3D), form CDAT and provide a synergistic approach to climate modeling, allowing researchers to advance scientific visualization of large-scale climate data sets. The CDAT framework couples powerful software infrastructures through two primary means:

  1. Tightly coupled integration of the CDAT Core with the VTK infrastructure to provide high-performance, parallel-streaming data analysis and visualization of massive climate-data sets (other tightly coupled tools include VCS, DV3D, and ESMF/ESMP);
  2. Loosely coupled integration to provide the flexibility of using tools quickly in the infrastructure such as ViSUS or R for data analysis and visualization as well as to apply customized data analysis applications within an integrated environment.

Within both paradigms, CDAT will provide data-provenance capture and mechanisms to support data analysis.

CDAT is licensed under the BSD-3 license.


We'd love to get contributions from you! Please take a look at the Contribution Documents to see how to get your changes merged in.

cdutil's People

Contributors

aashish24, dnadeau4, doutriaux1, downiec, durack1, jasonb5, linamuryanto, muryanto1, painter1, remram44, tomvothecoder

Forkers

docetom habte1345

cdutil's Issues

var.squeeze() truncates axis information

I don't think the behavior below is what a user expects - certainly not me:

so: (1, 27, 180, 360)
['time', 'LEVEL', 'LATITUDE', 'LONGITUDE']
so.squeeze()
so: (27, 180, 360)
['axis_0', 'axis_1', 'axis_2']

Migrated from: CDAT/cdat#1738
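
For reference, the behavior the report seems to expect can be sketched in plain Python, independent of cdms2. The function name squeeze_with_axes is hypothetical and only illustrates the idea: drop the length-1 dimensions and carry the ids of the remaining axes along, instead of falling back to generic names such as axis_0.

```python
def squeeze_with_axes(shape, axis_ids):
    """Drop length-1 dimensions and keep the ids of the axes that remain.

    Hypothetical helper for illustration; it is not part of cdms2 or cdutil.
    """
    if len(shape) != len(axis_ids):
        raise ValueError("shape and axis_ids must have the same length")
    # Keep only the (length, id) pairs whose length is not 1.
    kept = [(n, name) for n, name in zip(shape, axis_ids) if n != 1]
    new_shape = tuple(n for n, _ in kept)
    new_ids = [name for _, name in kept]
    return new_shape, new_ids


new_shape, new_ids = squeeze_with_axes(
    (1, 27, 180, 360), ["time", "LEVEL", "LATITUDE", "LONGITUDE"]
)
print(new_shape)  # (27, 180, 360)
print(new_ids)    # ['LEVEL', 'LATITUDE', 'LONGITUDE']
```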

Weight problems in cdutil.ANNUALCYCLE.climatology (daily data)

I have found out that when using cdutil.ANNUALCYCLE.climatology on daily data, the months of different years seem to be equally weighted.

This results in (slight but visible) errors:

  • for February: e.g. when averaging the February values for 1959, 1960 and 1961, the weights used seem to be [1, 1, 1], when they should be the actual month lengths: [28, 29, 28]
  • for months that have missing days: e.g. if the data file starts on January 2nd, but all the other January months in the time series have 31 days, the weights used seem to be [1, 1, 1, ...] when they should be [30, 31, 31, ..., 31]

Test script: https://files.lsce.ipsl.fr/public.php?service=files&t=115de1c1cb57217b85776fd08fe21b23
Test data: https://files.lsce.ipsl.fr/public.php?service=files&t=c7258c48f12d3451bfa67e5585bbedef

Running the test script above gives the following output. The difference between cdutil.ANNUALCYCLE.climatology and the computation by hand is ~1e-6 for all months, including the unweighted February. But the difference is much bigger (up to ~1e-2) when I compute a correctly weighted February mean:

python -i bug_ANNUALCYCLEclimatology.py
Variable shape = (3653, 9, 11)
Time axes go from 1959-1-1 12:0:0.0 to 1968-12-31 12:0:0.0
Latitude values = [-66.7283256  -67.84978441 -68.97123954 -70.09269035 -71.21413608
 -72.33557575 -73.45700815 -74.57843166 -75.69984422]
Longitude values = [ 120.375  121.5    122.625  123.75   124.875  126.     127.125  128.25
  129.375  130.5    131.625]
Fri Nov  6 17:35:08 2015  - Computing the climatology with cdutil.ANNUALCYCLE.climatology,
                please wait...
Fri Nov  6 17:35:26 2015  - Done!
time.clock diff = 17.36

Fri Nov  6 17:35:26 2015  - Computing the climatology by hand...
year, month, nb of selected days = 1959  2 28
year, month, nb of selected days = 1960  2 29
year, month, nb of selected days = 1961  2 28
year, month, nb of selected days = 1962  2 28
year, month, nb of selected days = 1963  2 28
year, month, nb of selected days = 1964  2 29
year, month, nb of selected days = 1965  2 28
year, month, nb of selected days = 1966  2 28
year, month, nb of selected days = 1967  2 28
year, month, nb of selected days = 1968  2 29
Difference range for month 1 = (-4.903731820604662e-06, 6.318861416332311e-06)
Difference range for February (unweighted) = (-5.2588326724389844e-06, 5.322724128120626e-06)
Difference range for February (WEIGHTED) = (-0.014045400572531008, -0.0006070483053850495)
Difference range for month 3 = (-7.192550192769431e-06, 6.8172331566529465e-06)
Difference range for month 4 = (-8.519490563685395e-06, 6.39597574547679e-06)
Difference range for month 5 = (-7.727838379878449e-06, 7.407895935784836e-06)
Difference range for month 6 = (-8.519490563685395e-06, 6.81559244952723e-06)
Difference range for month 7 = (-7.284841260002395e-06, 9.992045690410123e-06)
Difference range for month 8 = (-8.761498278886393e-06, 7.900114923131696e-06)
Difference range for month 9 = (-8.850097671597723e-06, 7.451375310552066e-06)
Difference range for month 10 = (-7.30945221505408e-06, 6.18965391652182e-06)
Difference range for month 11 = (-6.2370300426550784e-06, 6.243387851156967e-06)
Difference range for month 12 = (-6.288097772255696e-06, 6.534207223296562e-06)
Fri Nov  6 17:35:26 2015  - Finished computing the climatology by hand!
time.clock diff = 0.3
Acceleration = 57.8666666667

Migrated from: CDAT/cdat#1664
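
The weighting difference described in this report can be illustrated with a small standalone sketch. It does not use cdutil; the function weighted_mean and the February values are made up for the example. Each yearly monthly mean is weighted by the number of days that went into it, which for complete months is the calendar month length.

```python
import calendar


def weighted_mean(values, weights):
    """Weighted average of per-year monthly means (illustrative helper)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


years = [1959, 1960, 1961]
feb_means = [10.0, 13.0, 10.0]  # made-up February means, one per year

# Number of days in February for each year: [28, 29, 28]
day_counts = [calendar.monthrange(year, 2)[1] for year in years]

unweighted = weighted_mean(feb_means, [1, 1, 1])
weighted = weighted_mean(feb_means, day_counts)

print(day_counts)
print(unweighted, weighted)
```

For a month with missing days (the January 2nd case above), the same function applies if the actual number of available days, e.g. 30 instead of 31, is passed as the weight for that year.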

cdat-info in cdutil/vertical

I'm using the vertical remapping utility in cdutil and get this strange threading error. Looking at the vertical.py script, it doesn't seem to use cdat-info otherwise, and it's odd that this script has to ping PCMDIdb. Any insights would be helpful. Thank you! The error information is as follows:

File "/export/zhang40/anaconda2/envs/e3sm_diags_v2_dev_env/lib/python3.7/site-packages/cdutil-v3.0_25_gb320949-py2.7.egg/cdutil/vertical.py", line 37, in reconstructPressureFromHybrid
"cdutil.vertical.reconstructPressureFromHybrid")
File "/export/zhang40/anaconda2/envs/e3sm_diags_v2_dev_env/lib/python3.7/site-packages/cdat_info-8.1-py2.7.egg/cdat_info/cdat_info_src.py", line 229, in pingPCMDIdb
t.start()
File "/export/zhang40/anaconda2/envs/e3sm_diags_v2_dev_env/lib/python3.7/threading.py", line 847, in start
_start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread

cdutil.averager should print an error message when called with 'irregular' data

cdutil.averager should print an explicit error message when the data is on a grid it can't handle

Ideally, of course, cdutil.averager should be able to handle data on non-rectilinear grids!

One of our post-docs got the following when trying to get a global average of a CMIP5 file on the ORCA ocean grid. The sample data file (and other files with non rectilinear grids) can be downloaded from: https://files.lsce.ipsl.fr/public.php?service=files&t=ca134197a67c3cd311df6ede04747f6f

>>> import cdms2, cdutil

>>> f = cdms2.open('ORCA_grid_sample.nc')
Warning: bounds variable not found in ORCA_grid_sample.nc: time_centered_bounds
Warning: bounds variable not found in ORCA_grid_sample.nc: time_centered_bounds

>>> v = f('sosstsst', squeeze=1)

>>> v.shape
(149, 182)

>>> v.info()
*** Description of Slab sosstsst ***
id: sosstsst
shape: (149, 182)
filename: 
missing_value: 1e+20
comments: 
grid_name: grid_2
grid_type: CurvilinearGrid
time_statistic: 
long_name: sea surface temperature
units: degC
_FillValue: [  1.00000002e+20]
autoApiInfo: <AutoAPI.AutoAPI.Info instance at 0x7fa4dd4a8e60>
interval_write: 1mo
tileIndex: None
coordinates: time_centered nav_lon nav_lat
online_operation: average
cell_methods: time_counter: mean
interval_operation: 3600s
Grid has Python id 0x7fa4dd930850.
** Dimension 1 **
   id: y
   units:  
   Length: 149
   First:  0.0
   Last:   148.0
   Python id:  0x7fa4dd930890
** Dimension 2 **
   id: x
   units:  
   Length: 182
   First:  0.0
   Last:   181.0
   Python id:  0x7fa4dd930910
*** End of description for sosstsst ***

>>> global_average = cdutil.averager(v, axis='yx')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/install/cdat/versions/cdat_install_uv-2.1.0_x86_64_gcc4_VB_13/lib/python2.7/site-packages/genutil/averager.py", line 1042, in averager
    axis_order = _check_axisoptions(V, axis)
  File "/usr/local/install/cdat/versions/cdat_install_uv-2.1.0_x86_64_gcc4_VB_13/lib/python2.7/site-packages/genutil/averager.py", line 25, in _check_axisoptions
    raise AveragerError, 'Error: You have specified an invalid axis= option.'
genutil.averager.AveragerError: ('E', 'r', 'r', 'o', 'r', ':', ' ', 'Y', 'o', 'u', ' ', 'h', 'a', 'v', 'e', ' ', 's', 'p', 'e', 'c', 'i', 'f', 'i', 'e', 'd', ' ', 'a', 'n', ' ', 'i', 'n', 'v', 'a', 'l', 'i', 'd', ' ', 'a', 'x', 'i', 's', '=', ' ', 'o', 'p', 't', 'i', 'o', 'n', '.')

>>> global_average = cdutil.averager(v, axis='12')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/install/cdat/versions/cdat_install_uv-2.1.0_x86_64_gcc4_VB_13/lib/python2.7/site-packages/genutil/averager.py", line 1042, in averager
    axis_order = _check_axisoptions(V, axis)
  File "/usr/local/install/cdat/versions/cdat_install_uv-2.1.0_x86_64_gcc4_VB_13/lib/python2.7/site-packages/genutil/averager.py", line 25, in _check_axisoptions
    raise AveragerError, 'Error: You have specified an invalid axis= option.'
genutil.averager.AveragerError: ('E', 'r', 'r', 'o', 'r', ':', ' ', 'Y', 'o', 'u', ' ', 'h', 'a', 'v', 'e', ' ', 's', 'p', 'e', 'c', 'i', 'f', 'i', 'e', 'd', ' ', 'a', 'n', ' ', 'i', 'n', 'v', 'a', 'l', 'i', 'd', ' ', 'a', 'x', 'i', 's', '=', ' ', 'o', 'p', 't', 'i', 'o', 'n', '.')

Migrated from: CDAT/cdat#1464
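
A minimal sketch of the kind of explicit check this report asks for is shown below. Everything in it is hypothetical: GridNotSupportedError, check_grid_supported and the two stand-in grid classes are not part of cdutil, genutil or cdms2. The only detail taken from the tracebacks in this list is that the averager obtains latitude and longitude weights through a getWeights() call on the grid, so the sketch tests for that method before doing anything else.

```python
class GridNotSupportedError(ValueError):
    """Hypothetical exception type, used only in this sketch."""


class RectilinearGridStandIn:
    """Stand-in for a grid that can provide 1D latitude/longitude weights."""

    def getWeights(self):
        return [0.5, 0.5], [0.25, 0.25, 0.25, 0.25]


class CurvilinearGridStandIn:
    """Stand-in for a grid without a getWeights() method."""


def check_grid_supported(grid):
    """Raise a readable error if area weights cannot be generated."""
    if not callable(getattr(grid, "getWeights", None)):
        raise GridNotSupportedError(
            "cannot generate area weights: grid of type %s has no "
            "getWeights() method (non-rectilinear grid?)"
            % type(grid).__name__
        )


check_grid_supported(RectilinearGridStandIn())  # passes silently

try:
    check_grid_supported(CurvilinearGridStandIn())
except GridNotSupportedError as err:
    message = str(err)
    print(message)
```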

cdutil.ANNUALCYCLE.climatology fails with "months since *" time axis

In a file that I downloaded, the time axis looks like:

netcdf argo_2005-2017_grd {
dimensions:
        LONGITUDE = 360 ;
        LATITUDE = 180 ;
        LEVEL = 27 ;
        TIME = UNLIMITED ; // (150 currently)
variables:
...
        float TIME(TIME) ;
                TIME:units = "months since 2005-01-15" ;
                TIME:title = "Months in Monthly Means" ;
                TIME:long_name = "Months in Monthly Means" ;
                TIME:axis = "T" ;
...

When attempting to use cdutil.ANNUALCYCLE.climatology I get the following error:

import cdms2, cdutil
f = 'argo_2005-2017_grd.nc'
fH = cdms2.open(f)
t = fH('PTEMP',time=('2005','2017'))
# Calculate annualCycle climatology
tAnClim = cdutil.ANNUALCYCLE.climatology(t)

Traceback (most recent call last):
  File "makeObsClims.py", line 56, in <module>
    tAnClim = cdu.ANNUALCYCLE.climatology(t)
  File "/export/duro/anaconda2/envs/uvcdat2120/lib/python2.7/site-packages/cdutil/times.py", line 1451, in climatology
    tmp = TimeSlicer.get(self,slab,self.seasons[i],criteriaarg,statusbar=statusbar,weights=True,sum=sum)
  File "/export/duro/anaconda2/envs/uvcdat2120/lib/python2.7/site-packages/cdutil/times.py", line 433, in get
    slices,bounds,norm=self.slicer(tim,slicerarg)
  File "/export/duro/anaconda2/envs/uvcdat2120/lib/python2.7/site-packages/cdutil/times.py", line 680, in monthBasedSlicer
    b0=cdtime.reltime(bnds[i][0],units)
TypeError: 'NoneType' object has no attribute '__getitem__'

@dnadeau4 @doutriaux1
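
The traceback fails on bnds[i][0], which suggests that no time bounds were available for this axis. As an illustration only, the sketch below builds calendar-month bounds for a "months since" axis in plain Python. The function month_bounds is hypothetical and not part of cdutil or cdtime, and it assumes that the TIME values are whole-month offsets counted from the month of the reference date (0, 1, 2, ... for "months since 2005-01-15").

```python
from datetime import date


def month_bounds(base_year, base_month, offsets):
    """Return (start, end) dates of the calendar month for each offset.

    Hypothetical helper: offsets are assumed to be whole months counted
    from the month of the reference date.
    """
    bounds = []
    for offset in offsets:
        index = (base_month - 1) + int(offset)  # months since January of base_year
        start = date(base_year + index // 12, index % 12 + 1, 1)
        following = index + 1
        end = date(base_year + following // 12, following % 12 + 1, 1)
        bounds.append((start, end))
    return bounds


# Reference date 2005-01-15: offset 0 is January 2005.
bounds = month_bounds(2005, 1, [0, 1, 11, 12])
for start, end in bounds:
    print(start.isoformat(), end.isoformat())
```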

cdutil.ANNUALCYCLE.climatology() failed in cdutil 3.1.0

As discussed in an email with @doutriaux1 and @dnadeau4, the same function works in cdutil 2.10 but fails in cdutil 3.1.0.

Below is the code:

import cdms2
import cdutil

filename = 'test_data_amip_r1i1p1_mo_regrid_3x3_sgp.nc'
fin = cdms2.open(filename)
var = fin('tas')
cdutil.setTimeBoundsMonthly(var)
var_season = cdutil.ANNUALCYCLE.climatology(var).squeeze()

Below is the error:

Traceback (most recent call last):
  File "cdutil_test.py", line 8, in <module>
    var_season = cdutil.ANNUALCYCLE.climatology(var).squeeze()
  File "/Users/zhang40/anaconda2/envs/arm_diags_env_0207_1/lib/python2.7/site-packages/cdms2/avariable.py", line 1706, in squeeze
    return(MV.squeeze(self))
  File "/Users/zhang40/anaconda2/envs/arm_diags_env_0207_1/lib/python2.7/site-packages/cdms2/MV2.py", line 352, in squeeze
    maresult, axes=axes, attributes=attributes, grid=grid, id=id)
  File "/Users/zhang40/anaconda2/envs/arm_diags_env_0207_1/lib/python2.7/site-packages/cdms2/tvariable.py", line 203, in __init__
    self.initDomain(axes, copyaxes=copyaxes)
  File "/Users/zhang40/anaconda2/envs/arm_diags_env_0207_1/lib/python2.7/site-packages/cdms2/tvariable.py", line 357, in initDomain
    raise CDMSError("Wrong number of axes to initialize domain.")
cdms2.error.CDMSError: Wrong number of axes to initialize domain.

'TransientCurveGrid' object has no attribute 'getWeights'

Hi,
I am trying to use the cdutil.averager function on CMIP5 datasets.
Out of 43 models tested, 6 (BCC and GFDL models) return the same error when averaging oceanic variables (like 'tos').
I am using Transient Variables (MV2s), so weights must be generated:
Traceback (most recent call last):
File "test.py", line 15, in <module>
d_ave_xy = CDUTILaverager(tos, axis='xy')
File "/usr/local/uvcdat/2.4.0/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/genutil/averager.py", line 1078, in averager
filled_wtoptions = __check_weightoptions(V, axis, weights)
File "/usr/local/uvcdat/2.4.0/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/genutil/averager.py", line 359, in __check_weightoptions
weightoptions = area_weights(x,axisoptions)
File "/usr/local/uvcdat/2.4.0/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/genutil/averager.py", line 159, in area_weights
latwts, lonwts = dsgr.getWeights()
AttributeError: 'TransientCurveGrid' object has no attribute 'getWeights'

Here is the grid:

tos.getGrid()
<TransientCurveGrid, id: grid_1, shape: (232, 360)>

I do not have the same problem when using other TransientCurveGrid models like CNRM-CM5

Models returning errors:
bcc-csm1-1-m, bcc-csm1-1, GFDL-CM2p1, GFDL-CM3, GFDL-ESM2G, GFDL-ESM2M
A common point between these models is that they use GFDL’s Modular Ocean Model (MOM) or Generalized Ocean Layer Dynamics (GOLD) (only GFDL-ESM2G).

Is there a problem with GFDL's ocean models? Is this error known?

Thanks
Yann
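
For context on what the missing getWeights() would have to supply on a curvilinear grid: latitude and longitude weights no longer separate into two 1D arrays, so an area-weighted mean needs one weight per cell. The sketch below shows that computation in plain Python. It is not a cdutil workaround; area_weighted_mean, the cell values and the cell areas are all made up for illustration, with None marking masked (land) cells.

```python
def area_weighted_mean(values, areas):
    """Area-weighted mean over a 2D field with per-cell areas.

    values and areas are nested lists of the same shape; None in values
    marks a masked cell that is skipped. Illustrative helper only.
    """
    numerator = 0.0
    denominator = 0.0
    for value_row, area_row in zip(values, areas):
        for value, area in zip(value_row, area_row):
            if value is None:
                continue  # masked cell contributes neither value nor area
            numerator += value * area
            denominator += area
    if denominator == 0.0:
        raise ValueError("no unmasked cells to average")
    return numerator / denominator


values = [[1.0, 2.0], [None, 4.0]]  # made-up field with one masked cell
areas = [[1.0, 1.0], [2.0, 2.0]]    # made-up cell areas

mean = area_weighted_mean(values, areas)
print(mean)  # 2.75
```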
