znicholls / netcdf-scm
Simple wrappers for processing netcdf files for use in simple climate models
Home Page: https://netcdf-scm.readthedocs.io/en/latest/
Is your feature request related to a problem? Please describe.
The units of the wrangled files always match the source files. This isn't always ideal.
Describe the solution you'd like
Add the ability to specify units in wrangling. If the source file's units are "/m2" then the wrangler should automatically take the area sum of the input variable.
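As a rough sketch of the area-sum behaviour (the helper name is hypothetical; this assumes iris cubes plus a separately loaded areacella cube):
import iris
import iris.analysis
import iris.util

def take_area_sum_if_per_m2(cube, areacella):
    # hypothetical helper: only act when the variable is per unit area
    if "m-2" not in str(cube.units):
        return cube
    # broadcast the 2D cell areas over the cube's remaining dimensions
    weights = iris.util.broadcast_to_shape(
        areacella.data,
        cube.shape,
        cube.coord_dims("latitude") + cube.coord_dims("longitude"),
    )
    total = cube.collapsed(
        ["latitude", "longitude"], iris.analysis.SUM, weights=weights
    )
    total.units = cube.units * areacella.units  # e.g. W m-2 * m2 -> W
    return total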
Is your feature request related to a problem? Please describe.
The dataframes returned at the moment aren't in the openscm style.
Describe the solution you'd like
Update to use openscm dataframe style.
Describe alternatives you've considered
Leave it as is. This will lead to annoying format conversions being necessary all the time.
Additional context
The openscm project aims to unify region naming, units handling etc. to make running simple climate models much simpler for all involved. Matching this standard is a good idea.
Things are pretty messy and definitely need to be split up and tidied
Builds on #9. At the moment our conda packaging process is quite manual and it's easy to get dependencies wrong. Improving this so our conda recipe is automatically written/checked would be good.
Is your feature request related to a problem? Please describe.
At the moment we don't have a way to test our plots.
Describe the solution you'd like
A way to automatically test plots. I think copying what pyam, iris or someone similar have done is the best idea (see the sketch below).
Describe alternatives you've considered
Always running the notebooks. I think if we can actually compare plots it's a better check though.
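For example, pytest-mpl compares a figure returned by a test against a stored baseline image; a minimal sketch:
import matplotlib.pyplot as plt
import pytest

@pytest.mark.mpl_image_compare  # compared against a stored baseline image
def test_timeseries_plot():
    fig, ax = plt.subplots()
    ax.plot([2000, 2001, 2002], [14.0, 14.1, 14.2])  # placeholder data
    return fig
Baselines are generated once with pytest --mpl-generate-path=baseline and then checked on every run with pytest --mpl.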
Available here https://docs.readthedocs.io/en/latest/conda.html
Not sure if the extra build time is worth it just to be able to undo the import hacks...
The convert_scmdf_to_tuningstruc routine writes the .mat "data" field with the dimensions around the wrong way.
In the final .mat structure, the field "data" should consist of two columns (years,data), but instead it is two rows. Should be an easy transpose fix :)
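A minimal sketch of the intended layout (field name taken from the description above; the values are placeholders):
import numpy as np
import scipy.io

years = np.arange(2000, 2005)
values = np.linspace(14.0, 14.4, years.size)  # placeholder data
# column_stack gives an (n, 2) array: one row per year, columns (year, value),
# rather than the transposed two-row layout currently written
scipy.io.savemat("tuningstruc.mat", {"data": np.column_stack([years, values])})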
Describe the bug
When crunching the ACCESS1-0 CMIP5 data, the following error is thrown
'real_datetime' object has no attribute 'nanosecond'
Failing Test
def read_access_data(access_data_path):
    cube = MarbleCMIP5Cube()
    cube.load_data_from_path(TEST_ACCESS_CMIP5_FILE)
    # raises: AttributeError: 'real_datetime' object has no attribute 'nanosecond'
Expected behavior
NetCDF-SCM should be able to read this file.
System:
$ python --version
Python 3.6.6
$ conda list --export
# This file may be used to create an environment using:
# $ conda create --name <env> --file <this file>
# platform: osx-64
asn1crypto=0.24.0=py36_1003
beautifulsoup4=4.6.3=py36_1000
blas=1.0=mkl
bokeh=1.0.1=py36_1000
bzip2=1.0.6=1
ca-certificates=2018.10.15=ha4d7672_0
cartopy=0.16.0=py36h81b52dc_2
certifi=2018.10.15=py36_1000
cf_units=2.0.1=py36h7eb728f_2
cffi=1.11.5=py36h5e8e0c9_1
cftime=1.0.2.1=py36h7eb728f_0
chardet=3.0.4=py36_1003
click=7.0=py_0
cloudpickle=0.6.1=py_0
conda=4.5.11=py36_1000
conda-build=3.16.2=py36_0
conda-env=2.6.0=1
cryptography=2.3.1=py36hdffb7b8_0
cryptography-vectors=2.3.1=py36_1000
curl=7.62.0=h74213dd_0
cycler=0.10.0=py_1
cytoolz=0.9.0.1=py36h470a237_1
dask=0.20.1=py_0
dask-core=0.20.1=py_0
distributed=1.24.1=py36_1000
expat=2.2.5=hfc679d8_2
filelock=3.0.10=py_0
freetype=2.9.1=h6debe1e_4
geos=3.6.2=hfc679d8_3
glob2=0.6=py_0
hdf4=4.2.13=h951d187_2
hdf5=1.10.3=hc401514_2
heapdict=1.0.0=py36_1000
icu=58.2=hfc679d8_0
idna=2.7=py36_1002
intel-openmp=2019.0=118
iris=2.2.0=py36_1
jinja2=2.10=py_1
jpeg=9c=h470a237_1
kiwisolver=1.0.1=py36h2d50403_2
krb5=1.16.2=hbb41f41_0
libarchive=3.3.3=h823be47_0
libcurl=7.62.0=hbdb9355_0
libcxx=7.0.0=h2d50403_2
libedit=3.1.20170329=haf1bffa_1
libffi=3.2.1=hfc679d8_5
libgfortran=3.0.1=h93005f0_2
libiconv=1.15=h470a237_3
libnetcdf=4.6.1=h350cafa_11
libpng=1.6.35=ha92aebf_2
libssh2=1.8.0=h5b517e9_2
libtiff=4.0.9=he6b73bb_2
libxml2=2.9.8=h422b904_5
libxslt=1.1.32=h88dbc4e_2
llvm-meta=7.0.0=0
locket=0.2.0=py_2
lxml=4.2.5=py36hc9114bc_0
markupsafe=1.1.0=py36h470a237_0
matplotlib=3.0.1=1
matplotlib-base=3.0.1=py36h45c993b_1
mkl=2018.0.3=1
mkl_fft=1.0.6=py36_0
mkl_random=1.0.2=py36_0
msgpack-python=0.5.6=py36h2d50403_3
nc-time-axis=1.1.0=py_0
ncurses=6.1=hfc679d8_1
netcdf4=1.4.2=py36hac939d9_0
numpy=1.15.4=py36h6a91979_0
numpy-base=1.15.4=py36h8a80b8c_0
olefile=0.46=py_0
openssl=1.0.2p=h470a237_1
owslib=0.17.0=py_0
packaging=18.0=py_0
pandas=0.23.4=py36hf8a1672_0
partd=0.3.9=py_0
pillow=5.3.0=py36hc736899_0
pip=18.1=py36_1000
pkginfo=1.4.2=py_1
proj4=4.9.3=h470a237_8
psutil=5.4.8=py36h470a237_0
pycosat=0.6.3=py36h470a237_1
pycparser=2.19=py_0
pyepsg=0.3.2=py_1
pyke=1.1.1=py36_1000
pyopenssl=18.0.0=py36_1000
pyparsing=2.3.0=py_0
pyproj=1.9.5.1=py36h508ed2a_6
pyshp=2.0.0=py_0
pysocks=1.6.8=py36_1002
python=3.6.6=h5001a0f_0
python-dateutil=2.7.5=py_0
python-libarchive-c=2.8=py36_1004
pytz=2018.7=py_0
pyyaml=3.13=py36h470a237_1
readline=7.0=haf1bffa_1
requests=2.20.0=py36_1000
ruamel_yaml=0.15.71=py36h470a237_0
scipy=1.1.0=py36h28f7352_1
setuptools=40.5.0=py36_0
shapely=1.6.4=py36h164cb2d_1
six=1.11.0=py36_1001
sortedcontainers=2.0.5=py_0
sqlite=3.25.3=hb1c47c0_0
tblib=1.3.2=py_1
tk=8.6.8=ha92aebf_0
toolz=0.9.0=py_1
tornado=5.1.1=py36h470a237_0
tqdm=4.28.1=py_0
udunits2=2.2.27.6=h3a4f0e9_1
urllib3=1.23=py36_1001
wheel=0.32.2=py36_0
xz=5.2.4=h470a237_1
yaml=0.1.7=h470a237_1
zict=0.1.3=py_0
zlib=1.2.11=h470a237_3
$ pip freeze
alabaster==0.7.12
appdirs==1.4.3
appnope==0.1.0
asn1crypto==0.24.0
atomicwrites==1.2.1
attrs==18.2.0
Babel==2.6.0
backcall==0.1.0
beautifulsoup4==4.6.3
black==18.9b0
bleach==3.0.2
bokeh==1.0.1
Cartopy==0.16.0
certifi==2018.10.15
cf-units==2.0.1
cffi==1.11.5
cftime==1.0.2.1
chardet==3.0.4
Click==7.0
cloudpickle==0.6.1
codecov==2.0.15
conda==4.5.11
conda-build==3.16.2
coverage==4.5.1
cryptography==2.3.1
cryptography-vectors==2.3.1
cycler==0.10.0
cytoolz==0.9.0.1
dask==0.20.1
decorator==4.3.0
defusedxml==0.5.0
distributed==1.24.1
docutils==0.14
entrypoints==0.2.3
ExpectException==0.1.1
f90nml==1.0.2
filelock==3.0.10
flake8==3.6.0
glob2==0.6
heapdict==1.0.0
idna==2.7
imagesize==1.1.0
ipykernel==5.1.0
ipython==7.1.1
ipython-genutils==0.2.0
jedi==0.13.1
Jinja2==2.10
jsonschema==2.6.0
jupyter-client==5.2.3
jupyter-core==4.4.0
kiwisolver==1.0.1
libarchive-c==2.8
locket==0.2.0
lxml==4.2.5
MarkupSafe==1.1.0
matplotlib==3.0.1
mccabe==0.6.1
mistune==0.8.4
mkl-fft==1.0.6
mkl-random==1.0.2
more-itertools==4.3.0
msgpack==0.5.6
nbconvert==5.4.0
nbformat==4.4.0
nbresuse==0.3.0
nbval==0.9.1
nc-time-axis==1.1.0
-e [email protected]:znicholls/netcdf-scm.git@7968219efcec5b4cff0d7473166b07b758ae58ae#egg=netcdf_scm
netCDF4==1.4.2
notebook==5.7.0
numpy==1.15.4
olefile==0.46
OWSLib==0.17.0
packaging==18.0
pandas==0.23.4
pandas-datapackage-reader==0.11.1
pandocfilters==1.4.2
parso==0.3.1
partd==0.3.9
pexpect==4.6.0
pickleshare==0.7.5
Pillow==5.3.0
pkginfo==1.4.2
pluggy==0.8.0
progressbar2==3.38.0
prometheus-client==0.4.2
prompt-toolkit==2.0.7
psutil==5.4.8
ptyprocess==0.6.0
py==1.7.0
pyam-iamc==0.1.1
pycodestyle==2.4.0
pycosat==0.6.3
pycparser==2.19
pyepsg==0.3.2
pyflakes==2.0.0
Pygments==2.2.0
pyke==1.1.1
pymagicc==2.0.0a0
pyOpenSSL==18.0.0
pyparsing==2.3.0
pyproj==1.9.5.1
pyshp==2.0.0
PySocks==1.6.8
pytest==3.10.0
pytest-cov==2.6.0
python-dateutil==2.7.5
python-utils==2.3.0
pytz==2018.7
PyYAML==3.13
pyzmq==17.1.2
readme-renderer==24.0
requests==2.20.0
requests-toolbelt==0.8.0
ruamel-yaml==0.15.71
scipy==1.1.0
scitools-iris==2.2.0
seaborn==0.9.0
Send2Trash==1.5.0
Shapely==1.6.4.post1
six==1.11.0
snowballstemmer==1.2.1
sortedcontainers==2.0.5
Sphinx==1.8.1
sphinx-rtd-theme==0.4.2
sphinxcontrib-websupport==1.1.0
tblib==1.3.2
terminado==0.8.1
testpath==0.4.2
toml==0.10.0
toolz==0.9.0
tornado==5.1.1
tqdm==4.28.1
traitlets==4.3.2
twine==1.12.1
urllib3==1.23
versioneer==0.18
wcwidth==0.1.7
webencodings==0.5.1
xlrd==1.1.0
XlsxWriter==1.1.2
zict==0.1.3
- .nc format instead (#60 and #62)
- _convert_scm_timeseries_cubes_to_openscmdata to use cf name as variable rather than long_name (#63)
- *.IN style files
- .MAG writer in pymagicc (openscm/pymagicc#250)
- .nc to .MAG files with all data
- .IN files with scope to extend to user choice about variable mapping, rebasing and timeseries averaging/cutting
- .IN files in pymagicc (openscm/pymagicc#252)
- source files attribute of crunched nc files so version is captured too
- drs and cube-type flags
- @lewisjared add whatever you want
Yearly tas timeseries are available in the appendix of the AR5 WGI report:
http://www.climatechange2013.org/report/full-report/
Atlas of Global and Regional Climate Projections 44.7MB, EndNote/BibTeX
Supplementary Material: RCP2.6, RCP4.5, RCP6.0, RCP8.5, datafiles
IPCC WGI AR5 Atlas datasets included in KNMI Climate Change Atlas
ZIP Download:
http://www.climatechange2013.org/images/report/WGIAR5_AnnexI_all.zip
From the PDF:
Data and Processing:
The figures have been constructed using the CMIP5 model output available at the time of the AR5 cut-off for accepted papers (15 March 2013). This data set comprises 32/42/25/39
scenario experiments for RCP2.6/4.5/6.0/8.5 from 42 climate models (Table AI.SM2.6.1). Only concentration-driven experiments are used (i.e., those in which concentrations rather than emissions of green-house gases are prescribed) and only one ensemble member from each model is selected, even if multiple realizations exist with different initial conditions and different realizations of natural variability. Hence each model is given equal weight.
It might be possible to get the used model/run combo from this table
Table AI.SM2.6.1: The CMIP5 models used in this Annex for each of the historical and RCP scenario experiments. A number in each column is the identifier of the single ensemble member from that model that is used. A blank indicates no run was used, usually because that scenario run was not available.
If you have ideas on how to include this as a verification of your code, let me know @znicholls !
Is your feature request related to a problem? Please describe.
If we've crunched a bunch of files, there's no easy way to tell if they're up to date. For example, did they make assumptions about cell areas which are no longer necessary as we now have the cell area file.
Describe the solution you'd like
We should be able to do something like
netcdf-scm-crunch --check-up-to-date <src> <crunch-path> --database-file <database-file>
and get output which tells us which files are not up to date and why (new data file is available, new metadata file is available etc.). We might also want to make the --force flag more flexible so it's not just force or not, but rather force, update if outdated, or don't force (with the default still being don't force).
Describe alternatives you've considered
Force crunch repeatedly, super slow...
Additional context
Required to make our CMIP6 crunching sane.
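A rough sketch of the check (pure stdlib; the database layout is an assumption, mapping each source path to the hash it had when last crunched):
import hashlib
import json
import os

def outdated_files(database_file):
    with open(database_file) as f:
        database = json.load(f)  # assumed layout: {source_path: sha256}
    report = {}
    for path, crunched_hash in database.items():
        if not os.path.exists(path):
            report[path] = "source file no longer exists"
            continue
        with open(path, "rb") as f:
            current = hashlib.sha256(f.read()).hexdigest()
        if current != crunched_hash:
            report[path] = "source file has changed since crunching"
    return report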
Describe the bug
At the moment, when crunching with e.g. get_scm_timeseries, land_mask_threshold expects a percentage (i.e. 0-100) value. Sometimes data is provided with a fraction (i.e. 0-1) value. get_scm_timeseries should do some sort of sensible test.
Failing Test
tas.get_scm_timeseries(
    sftlf_cube=sftlf,
    land_mask_threshold=50,
    areacella_scmcube=None,
)
on a cube which has sftlf values from 0-1
Expected behavior
A warning that the threshold looks wrong.
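A minimal sketch of such a test (the helper name is hypothetical):
import warnings
import numpy as np

def check_land_mask_threshold(sftlf_data, land_mask_threshold):
    # hypothetical check: fractional data with a percentage-like threshold
    if land_mask_threshold > 1 and np.max(sftlf_data) <= 1:
        warnings.warn(
            "sftlf values are in [0, 1] but land_mask_threshold={} looks "
            "like a percentage".format(land_mask_threshold)
        )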
Is your feature request related to a problem? Please describe.
At the moment the setup instructions are in three places: Makefile, .travis.yml and README.rst/docs/source/development.rst (depending on whether the install is minimal or for development). This sort of duplication is extremely error prone.
Describe the solution you'd like
Some way to write these setup instructions in a 'master' file, from which the other files are automatically generated. I'm not sure how exactly this would work/look yet.
Describe alternatives you've considered
Continue to try to maintain these three places concurrently. It feels somehow dangerous...
Is your feature request related to a problem? Please describe.
To take area sums of variables, we need to know the effective area of the region in the model.
Describe the solution you'd like
Keep the effective area metadata when crunching.
We've hit our bandwidth limit on git lfs storage. This means we're locked out. GitLab seems to have nowhere near as strict limits, so I'm going to move there instead (it also has way better CI support for Docker containers etc., so that will speed things up a lot too).
pymagicc.io
Is your feature request related to a problem? Please describe.
If we don't have an sftlf file, we only get NH/SH split and no ocean/land.
Describe the solution you'd like
We should be able to guess land fractions in the absence of an sftlf file (the output tracker can tell us whether we used one or not).
Additional context
Becomes very fail-safe once #51 is merged.
@lewisjared what do you think of something like the following for writing?
import cf_units
import iris
import netcdf_scm
from iris.pandas import as_cube

ts = test_cube.get_scm_timeseries()

def to_cube(series, calendars={1: cf_units.CALENDAR_GREGORIAN}):
    cube = as_cube(series, calendars=calendars)
    cube.coord("index").rename("time")
    return cube

def get_cubes(df, calendars={1: cf_units.CALENDAR_GREGORIAN}):
    cube_list = []
    for i, (v, c) in enumerate(df.iterrows()):  # iterate the passed df, not ts
        cube = to_cube(c, calendars=calendars)
        cube.units = df.index.get_level_values("unit")[i]
        cube.long_name = df.index.get_level_values("variable")[i]
        metadata = {
            level: df.index.get_level_values(level)[i]
            for level in df.index.names
            if level not in ["unit", "variable"]
        }
        metadata["netcdf_scm_info"] = "netcdf-scm v{}".format(netcdf_scm.__version__)
        metadata.update(test_cube.cube.attributes)  # relies on outer test_cube
        cube.attributes = metadata
        # we can add parameters like so although it might be smarter
        # to add them as variables with dimension `run_no` so the mapping
        # is simpler later...
        cube.add_aux_coord(iris.coords.AuxCoord(
            1.3,
            long_name="example_scalar",
            units="K"
        ))
        cube.add_aux_coord(iris.coords.AuxCoord(
            "string here",
            long_name="example_generic",
            units="no_unit"
        ))
        cube_list.append(cube)
    return cube_list

cube_list = get_cubes(ts.timeseries())
iris.save(
    cube_list,
    "tmp.nc",
    local_keys=ts.timeseries().index.names  # this lets us separate timeseries metadata from 'info'
)
It gives something like the below. It would be horrible to read without our readers, but I think reading should be pretty trivial because we know what to expect (loop over variables, store data and metadata for timeseries, get parameters, get info, make an ScmDataFrame instance). The thing I'm really not sure about is how to save e.g. parameters vs. just metadata: should we just put all the parameters in as variables and add an extra dimension, run_no, which we can use to save them?
$ ncdump -h tmp.nc
netcdf tmp {
dimensions:
time = 120 ;
string11 = 11 ;
variables:
double toa_outgoing_longwave_flux(time) ;
toa_outgoing_longwave_flux:long_name = "toa_outgoing_longwave_flux" ;
toa_outgoing_longwave_flux:units = "W m^-2" ;
toa_outgoing_longwave_flux:activity_id = "CMIP" ;
toa_outgoing_longwave_flux:climate_model = "BCC-CSM2-MR" ;
toa_outgoing_longwave_flux:member_id = "r1i1p1f1" ;
toa_outgoing_longwave_flux:model = "unspecified" ;
toa_outgoing_longwave_flux:region = "World" ;
toa_outgoing_longwave_flux:scenario = "1pctCO2" ;
toa_outgoing_longwave_flux:coordinates = "example_generic example_scalar" ;
double time(time) ;
time:axis = "T" ;
time:units = "hours since 1970-01-01 00:00:00" ;
time:standard_name = "time" ;
time:calendar = "gregorian" ;
char example_generic(string11) ;
example_generic:units = "no_unit" ;
example_generic:long_name = "example_generic" ;
double example_scalar ;
example_scalar:units = "K" ;
example_scalar:long_name = "example_scalar" ;
double toa_outgoing_longwave_flux_0(time) ;
toa_outgoing_longwave_flux_0:long_name = "toa_outgoing_longwave_flux" ;
toa_outgoing_longwave_flux_0:units = "W m^-2" ;
toa_outgoing_longwave_flux_0:activity_id = "CMIP" ;
toa_outgoing_longwave_flux_0:climate_model = "BCC-CSM2-MR" ;
toa_outgoing_longwave_flux_0:member_id = "r1i1p1f1" ;
toa_outgoing_longwave_flux_0:model = "unspecified" ;
toa_outgoing_longwave_flux_0:region = "World|Northern Hemisphere" ;
toa_outgoing_longwave_flux_0:scenario = "1pctCO2" ;
toa_outgoing_longwave_flux_0:coordinates = "example_generic example_scalar" ;
double toa_outgoing_longwave_flux_1(time) ;
toa_outgoing_longwave_flux_1:long_name = "toa_outgoing_longwave_flux" ;
toa_outgoing_longwave_flux_1:units = "W m^-2" ;
toa_outgoing_longwave_flux_1:activity_id = "CMIP" ;
toa_outgoing_longwave_flux_1:climate_model = "BCC-CSM2-MR" ;
toa_outgoing_longwave_flux_1:member_id = "r1i1p1f1" ;
toa_outgoing_longwave_flux_1:model = "unspecified" ;
toa_outgoing_longwave_flux_1:region = "World|Southern Hemisphere" ;
toa_outgoing_longwave_flux_1:scenario = "1pctCO2" ;
toa_outgoing_longwave_flux_1:coordinates = "example_generic example_scalar" ;
// global attributes:
:branch_method = "branch" ;
:branch_time_in_child = 0. ;
:branch_time_in_parent = 0. ;
:cmor_version = "3.3.2" ;
:comment = "at the top of the atmosphere (to be compared with satellite measurements)" ;
:contact = "Dr. Tongwen Wu ([email protected])" ;
:creation_date = "2018-10-15T06:27:37Z" ;
:data_specs_version = "01.00.27" ;
:description = "DECK: 1pctCO2" ;
:experiment = "1 percent per year increase in CO2" ;
:experiment_id = "1pctCO2" ;
:external_variables = "areacella" ;
:forcing_index = 1 ;
:frequency = "mon" ;
:further_info_url = "https://furtherinfo.es-doc.org/CMIP6.BCC.BCC-CSM2-MR.1pctCO2.none.r1i1p1f1" ;
:grid = "T106" ;
:grid_label = "gn" ;
:history = "2018-10-15T06:27:35Z ; CMOR rewrote data to be consistent with CMIP6, CF-1.7 CMIP-6.2 and CF standards." ;
:initialization_index = 1 ;
:institution = "Beijing Climate Center, Beijing 100081, China" ;
:institution_id = "BCC" ;
:license = "CMIP6 model data produced by BCC is licensed under a Creative Commons Attribution ShareAlike 4.0 International License (https://creativecommons.org/licenses). Consult https://pcmdi.llnl.gov/CMIP6/TermsOfUse for terms of use governing CMIP6 output, including citation requirements and proper acknowledgment. Further information about this data, including some limitations, can be found via the further_info_url (recorded as a global attribute in this file) and at https:///pcmdi.llnl.gov/. The data producers and data providers make no warranty, either express or implied, including, but not limited to, warranties of merchantability and fitness for a particular purpose. All liabilities arising from the supply of the information (including any liability arising in negligence) are excluded to the fullest extent permitted by law." ;
:mip_era = "CMIP6" ;
:netcdf_scm_info = "netcdf-scm v1.0.0+41.gbb3285d.dirty" ;
:nominal_resolution = "100 km" ;
:original_name = "FLUT" ;
:parent_activity_id = "CMIP" ;
:parent_experiment_id = "piControl" ;
:parent_mip_era = "CMIP6" ;
:parent_source_id = "BCC-CSM2-MR" ;
:parent_time_units = "days since 1850-01-01" ;
:parent_variant_label = "r1i1p1f1" ;
:physics_index = 1 ;
:product = "model-output" ;
:realization_index = 1 ;
:realm = "atmos" ;
:references = "Model described by Tongwen Wu et al. (JGR 2013; JMR 2014; submmitted to GMD,2018). Also see http://forecast.bcccsm.ncc-cma.net/htm" ;
:run_variant = "forcing: GHG" ;
:source = "BCC-CSM 2 MR (2017): aerosol: none atmos: BCC_AGCM3_MR (T106; 320 x 160 longitude/latitude; 46 levels; top level 1.46 hPa) atmosChem: none land: BCC_AVIM2 landIce: none ocean: MOM4 (1/3 deg 10S-10N, 1/3-1 deg 10-30 N/S, and 1 deg in high latitudes; 360 x 232 longitude/latitude; 40 levels; top grid cell 0-10 m) ocnBgchem: none seaIce: SIS2" ;
:source_id = "BCC-CSM2-MR" ;
:source_type = "AOGCM" ;
:sub_experiment = "none" ;
:sub_experiment_id = "none" ;
:table_id = "Amon" ;
:table_info = "Creation Date:(30 July 2018) MD5:e53ff52009d0b97d9d867dc12b6096c7" ;
:title = "BCC-CSM2-MR output prepared for CMIP6" ;
:tracking_id = "hdl:21.14100/4f3fdd6c-bef7-4ec0-a692-5ca29f51e1ba" ;
:variable_id = "rlut" ;
:variant_label = "r1i1p1f1" ;
:Conventions = "CF-1.5" ;
}
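For the reading side, a minimal sketch of that loop (plain netCDF4; the helper is hypothetical and an ScmDataFrame would then be built from its output):
import netCDF4

def read_crunched_nc(path):
    timeseries = {}
    with netCDF4.Dataset(path) as ds:
        time = ds.variables["time"][:]
        for name, var in ds.variables.items():
            # skip the time coordinate and scalar parameter variables
            if name == "time" or "time" not in var.dimensions:
                continue
            metadata = {key: var.getncattr(key) for key in var.ncattrs()}
            timeseries[name] = (time, var[:], metadata)
        info = {key: ds.getncattr(key) for key in ds.ncattrs()}
    return timeseries, info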
Is your feature request related to a problem? Please describe.
We need a way to keep track of the files used when crunching to CSV. This provides traceability and the ability to perform smart updating in future when new land area masks become available.
These files will replace the *-failures-and-warnings.txt files written in the root of a dataset.
paper.md
Is your feature request related to a problem? Please describe.
Always running this code through a Python session or a notebook is overly slow; a command line interface would be super nice.
Describe the solution you'd like
Use the packages described in this blog post and also add in a few other tricks they recommend (making sure to update docs as well as CI etc. when doing this).
Describe alternatives you've considered
No command line options...
Is your feature request related to a problem? Please describe.
When making CMIP data reference syntax compliant files, you need to make the file, set the filename and set the filepath all correctly. However, you can actually deduce the filename and path simply from the file's attributes. Adding this feature would eliminate duplication in the file production process and make generating compliant, properly named files much easier.
Describe the solution you'd like
It would be great to just be able to make a netCDF file, then have it be named and put in a directory automatically based on the file attributes.
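A rough sketch of the deduction for CMIP6-style attributes (simplified: it omits the version and time-range components of the real data reference syntax, and the helper name is hypothetical):
import os

def drs_path_from_attributes(attrs, root="."):
    # attribute names follow the CMIP6 controlled vocabulary
    directory = os.path.join(
        root,
        attrs["mip_era"],
        attrs["activity_id"],
        attrs["institution_id"],
        attrs["source_id"],
        attrs["experiment_id"],
        attrs["variant_label"],
        attrs["table_id"],
        attrs["variable_id"],
        attrs["grid_label"],
    )
    filename = "{}_{}_{}_{}_{}_{}.nc".format(
        attrs["variable_id"],
        attrs["table_id"],
        attrs["source_id"],
        attrs["experiment_id"],
        attrs["variant_label"],
        attrs["grid_label"],
    )
    return os.path.join(directory, filename)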
Is your feature request related to a problem? Please describe.
The wrangling hasn’t been optimised so is unnecessarily slow.
Describe the solution you'd like
Profile the wrangling functions then optimise based on the results. From brief experience I think parallelisation needs to be done in separate processes as most time is spent interpolating and filtering, not reading data off disk.
Describe alternatives you've considered
Put up with it, relative to crunching it’s still quick.
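As a starting point, a profiling sketch (the wrangling entry point named here is hypothetical):
import cProfile
import pstats

# profile one wrangling call, then inspect where the cumulative time goes
cProfile.run("wrangle_files(src_dir, out_dir)", "wrangle.prof")
pstats.Stats("wrangle.prof").sort_stats("cumulative").print_stats(20)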
Is your feature request related to a problem? Please describe.
Adding new cubes requires writing a bunch of new tests and unreasonable amounts of duplication.
Describe the solution you'd like
To be able to define a new cube simply by giving it a name and giving an example of its directory structure and file name construction. A series of basic tests should be automatically built.
Describe alternatives you've considered
Continue with adding lots of tests by hand - seems cumbersome and error prone.
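A rough sketch of how the generated tests could look (pytest parametrisation; the example paths, and the idea that loading the example path is the right basic check, are assumptions):
import os
import pytest

# each new cube only needs a name plus an example directory structure and
# filename; the same basic tests then run for every entry
CUBE_SPECS = [
    (
        MarbleCMIP5Cube,
        "cmip5/1pctCO2/Amon/tas/CanESM2/r1i1p1",
        "tas_Amon_CanESM2_1pctCO2_r1i1p1.nc",
    ),
]

@pytest.mark.parametrize("cube_cls,example_dir,example_filename", CUBE_SPECS)
def test_example_path_loads(cube_cls, example_dir, example_filename):
    cube = cube_cls()
    # hypothetical round-trip: the cube should accept its own example path
    cube.load_data_from_path(os.path.join(example_dir, example_filename))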
We can't yet crunch 3D ocean data. We should be able to (wrangling is another question).
The weights come back with a range of normalisations and it's a pain. The weights should really all come back in the range [0, 1] and then there should be separate methods for working out areas etc.
I have no idea about this, @rgieseke or @lewisjared do you have an idea about what's best?
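One option, sketched below: always normalise weights to [0, 1] first and make areas a separate, explicit step:
import numpy as np

def normalise_weights(weights):
    # scale any weights array into [0, 1] so everything shares one convention
    weights = np.asarray(weights, dtype=float)
    return weights / weights.max()

def area_from_weights(weights, cell_areas):
    # effective area of a region, computed from the normalised weights
    return float(np.sum(normalise_weights(weights) * cell_areas))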
Is your feature request related to a problem? Please describe.
ScmCube.get_scm_timeseries_cubes crunches each of the regional timeseries in serial. On machines with multiple cores, this is a slow way to do this.
Describe the solution you'd like
ScmCube.get_scm_timeseries_cubes should crunch each of the regional timeseries cubes in parallel if the resources are available. The implementation needs to be clever if this is to actually provide a performance boost. In particular, we can't pickle self in order to do the parallelisation, so this will require a solution like this or this.
Describe alternatives you've considered
Just leave things being serial.
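A minimal sketch of the process-based approach (the worker is hypothetical and sidesteps pickling self by re-loading the cube inside each child process; the regions keyword is also an assumption):
from concurrent.futures import ProcessPoolExecutor

def _crunch_region(data_path, region):
    # rebuild the cube in the child process so that self never has to
    # cross the process boundary
    cube = MarbleCMIP5Cube()
    cube.load_data_from_path(data_path)
    return region, cube.get_scm_timeseries_cubes(regions=[region])

def crunch_regions_parallel(data_path, regions, max_workers=None):
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(_crunch_region, data_path, r) for r in regions]
        return dict(f.result() for f in futures)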
Is your feature request related to a problem? Please describe.
At the moment we don't have any docs or a way to automatically build them.
Describe the solution you'd like
I'd like to use sphinx, but I just have no idea how, so if anyone wants to take that on it would be very welcome. It would be great if we could include automatic testing of examples in docstrings.
Describe alternatives you've considered
N/A
Is your feature request related to a problem? Please describe.
Installing purely with pip is super hard because of iris' dependencies.
Describe the solution you'd like
Add ability to install with conda.
Describe alternatives you've considered
Not installing with conda and forcing users to install with pip. Given I didn't do this, bad idea.
The entire masking scheme needs to be re-thought. The masks should not be based on allocating cells to boxes, but rather on applying area * surface land fraction (or area * (1 - surface land fraction) where ocean boxes are sought) weights to the raw data when taking means. This requires some realm awareness too, i.e. when the realm is ocean, the land boxes shouldn't be available; when the realm is land, the ocean boxes shouldn't be available; when the realm is atmos, the weightings should work such that an area-weighted mean of e.g. World|Land and World|Ocean gives the same as World.
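A sketch of that weighting scheme (assuming sftlf is given in 0-100): the World|Land and World|Ocean weights sum to the World weights, so the consistency property above holds by construction:
import numpy as np

def region_weights(cell_areas, sftlf_percent, region):
    landfrac = np.asarray(sftlf_percent, dtype=float) / 100.0
    area = np.asarray(cell_areas, dtype=float)
    if region == "World":
        return area
    if region == "World|Land":
        return area * landfrac
    if region == "World|Ocean":
        return area * (1.0 - landfrac)
    raise ValueError("unknown region: {}".format(region))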
https://github.com/git-lfs/git-lfs/wiki/Tutorial#migrating-existing-repository-data-to-lfs
Follow the advice here (git-lfs/git-lfs#3238 (comment)) to ensure that the .gitattributes file is consistent across all history before doing the migration.