geoschem / integrated_methane_inversion

Integrated Methane Inversion workflow repository.

Home Page: https://imi.readthedocs.org

License: MIT License

Shell 21.00% Python 50.04% Jupyter Notebook 27.24% Perl 1.73%
atmospheric-chemistry atmospheric-composition atmospheric-modeling aws climate-change climate-modeling cloud-computing geos-chem greenhouse-gases inverse-modeling

integrated_methane_inversion's Introduction

Integrated Methane Inversion (IMI) Workflow

Overview:

This directory contains the source code for setting up and running the Integrated Methane Inversion with GEOS-Chem.

Documentation:

Please see the IMI readthedocs site

Reference:

Varon, D. J., Jacob, D. J., Sulprizio, M., Estrada, L. A., Downs, W. B., Shen, L., Hancock, S. E., Nesser, H., Qu, Z., Penn, E., Chen, Z., Lu, X., Lorente, A., Tewari, A., and Randles, C. A.: Integrated Methane Inversion (IMI 1.0): a user-friendly, cloud-based facility for inferring high-resolution methane emissions from TROPOMI satellite observations, Geosci. Model Dev., 15, 5787–5805, https://doi.org/10.5194/gmd-15-5787-2022, 2022.

integrated_methane_inversion's People

Contributors

djvaron, jiaweizhuang, laestrada, msulprizio, nicholasbalasus, sabourbaray, williamdowns, yantosca


integrated_methane_inversion's Issues

Default hyperthreading setting in AMI

What should we use for the default hyperthreading setting in the AMI? By default sbatch submits as many jobs as there are vCPUs, but this might be slower or only marginally faster than submitting jobs equal to the number of physical CPUs (#vCPU/2).

I think Will tested this. Right now I believe the default setting is to do hyperthreading, so num_jobs = num_vCPU.

[FEATURE REQUEST] Incorporate information on point sources

Include information on known point sources in region of interest where available.

Possibilities:

  • Display known point sources as part of the preview
  • Increase prior error for grid cells with known point sources
  • Increase prior estimate for grid cells with known point sources
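
A minimal sketch of the second option above, assuming a gridded prior-error DataArray and a hypothetical list of point-source coordinates (neither exists in the IMI today):

import xarray as xr

def inflate_prior_error(prior_error: xr.DataArray, point_sources, factor=2.0):
    """Scale up the prior error in grid cells that contain a known point source."""
    inflated = prior_error.copy()
    for lat, lon in point_sources:
        # Locate the grid cell nearest to the reported point-source coordinates
        cell = inflated.sel(lat=lat, lon=lon, method="nearest")
        # Multiply that cell's prior error by the inflation factor
        inflated.loc[dict(lat=float(cell.lat), lon=float(cell.lon))] = float(cell) * factor
    return inflated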

[FEATURE REQUEST] Average TROPOMI observations for each grid cell in GC

From Daniel:

Presently we use individual TROPOMI observations in the observation vector y, and compare to the Kx from GEOS-Chem with the TROPOMI operator applied. But there are typically many TROPOMI observations per GEOS-Chem grid cell per day, and we could average them to reduce the size of y and the resulting SO. We haven't done it this way because the TROPOMI averaging kernel (and hence the TROPOMI operator) is different for each observation, but that's not a good reason. Averaging would help reduce the dimensions of y and SO, and decrease the error correlation within SO that partly contributes to our need for the regularization coefficient gamma. Here's how to do it:

  1. Continue to apply the TROPOMI operator to the GEOS-Chem fields for each observation – no change here.
  2. Average the observations for each grid cell and day to populate y, and correspondingly average the simulated TROPOMI observations to populate Kx.
  3. Keep track of the number of observations being averaged so that we can adjust SO – we don't have a formula for that yet, but Zhen's current work on error characterization will give us that.
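
A minimal sketch of steps 2 and 3, assuming the paired TROPOMI observations and operator-applied model values already sit in a DataFrame; the column names (i_gc, j_gc, date, y_obs, y_sim) are hypothetical, not the IMI's current variable names:

import pandas as pd

def average_obs_per_cell(df: pd.DataFrame) -> pd.DataFrame:
    """Average observed and simulated values per GEOS-Chem grid cell per day."""
    grouped = df.groupby(["i_gc", "j_gc", "date"])
    out = grouped.agg(
        y_obs=("y_obs", "mean"),   # averaged TROPOMI observations -> y
        y_sim=("y_sim", "mean"),   # averaged simulated observations -> Kx
        n_obs=("y_obs", "size"),   # number of obs averaged, needed later to adjust SO
    ).reset_index()
    return out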

Adjust recommended vCPU configuration for default user

Followed the documentation to set up a first EC2 instance. In step 3 of the EC2 setup instructions, the documentation suggests a c5.9xlarge instance (36 vCPUs); however, as a new AWS user there appears to be a 32-vCPU limit, which results in the following error when attempting to launch the instance:

(screenshot of the vCPU limit error when launching the instance)

Suggested resolution: alter the documentation to recommend a c5a.8xlarge instance (32 vCPUs) for first-time users to avoid this limitation.

Can we avoid user needing to set slurm resources when submitting with sbatch?

Submitting simulations with sbatch, the user needs to know how many cores to request. Can we automate this for them?

Otherwise we need to tell them to edit all the run scripts to use the number of cores their instance has. E.g. on c5.xlarge I had to change num cores from 8 to 4.

At a minimum we should set the default num cores to 1 instead of 8, which is easier to explain.

Users also need to know:

  • how much memory to request
  • how much time to request — can this be set to infinite?

If we can't automate the memory/cores to request, then we need to provide instructions on choosing values in the Readthedocs. That would add more steps to the workflow. Avoidable?
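
One possible approach, sketched here, is for the setup script to detect the instance's CPU count and generate the sbatch headers itself; the memory and wall-time defaults below are placeholders, not IMI settings, and the physical-core estimate assumes hyperthreading is enabled:

import os

def sbatch_header(mem_mb=8000, wall_time="07-00:00"):
    """Build sbatch header lines sized to the current instance."""
    logical = os.cpu_count() or 1
    physical = max(logical // 2, 1)  # assumes 2 hyperthreads per physical core
    return "\n".join([
        "#!/bin/bash",
        f"#SBATCH -c {physical}",   # cores to request
        f"#SBATCH --mem {mem_mb}",  # placeholder default, in MB
        f"#SBATCH -t {wall_time}",  # placeholder default wall time
    ])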

[FEATURE REQUEST] Add capability to optimize offshore emissions

The current state vector creation scripts only consider regions over land. Ideally we would also include offshore emissions; however that requires emissions output from a prior simulation (perhaps even just a HEMCO standalone simulation). Hannah Nesser has developed this capability in her work and we may be able to bring that into the IMI as an option.

[BUG/ISSUE] Grid centers, edges for state vector/HEMCO

Minor grid differences between state-vector and HEMCO diagnostics emission fields:

Presently input.geos uses the minimum/maximum grid-cell centers from the gridded state vector as the domain boundaries (grid-cell edges) for the simulation. As a result, the GEOS-Chem/HEMCO domain is slightly smaller than the gridded state vector domain (0-2 grid cells along lat/lon).

Currently we are using a dimension-matching function to ensure total emissions are still properly computed across the region of interest (match_size() in src/inversion_scripts/utils.py).

It's possible this could lead to incomplete removal of the GEOS-Chem buffer zone (the "3 3 3 3" setting, i.e. 3 grid cells on each edge) from the TROPOMI analysis in some cases, but this hasn't been observed yet in test inversions.

We should consider revamping the domain/grid definitions across the IMI so that HEMCO and the state vector both produce an identical domain as close as possible to the user's settings for lat/lon boundaries, obviating the need for the match_size() function.
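
One possible approach, sketched here, is to derive the simulation domain edges from the gridded state vector by padding the outermost cell centers by half a grid spacing. This assumes a regular lat/lon grid and is not the IMI's current logic:

import xarray as xr

def domain_edges(statevector_path):
    """Return (lat_min, lat_max, lon_min, lon_max) edges matching the state vector grid."""
    ds = xr.open_dataset(statevector_path)
    dlat = float(ds.lat[1] - ds.lat[0])
    dlon = float(ds.lon[1] - ds.lon[0])
    lat_min = float(ds.lat.min()) - dlat / 2
    lat_max = float(ds.lat.max()) + dlat / 2
    lon_min = float(ds.lon.min()) - dlon / 2
    lon_max = float(ds.lon.max()) + dlon / 2
    return lat_min, lat_max, lon_min, lon_max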

[FEATURE REQUEST] Diagnostic for overfitting

Elise Penn provided the following feedback:

The Gamma regularization factor should depend on the state vector and how many observations are used. Users might unwittingly overfit to observations with a fixed Gamma value. It could be helpful to output J_A/n and J_O/m in post-processing as a diagnostic for overfitting. If there is overfitting, users could then choose a new Gamma value using Xiao Lu’s method.

This could go into the visualization notebook in an update.
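
A minimal sketch of the proposed diagnostic, with hypothetical variable names for the prior/posterior state vectors, observations, and inverse error covariances; the expectation that both ratios should be near 1 for a well-chosen Gamma follows the Xiao Lu method mentioned above:

import numpy as np

def overfit_diagnostic(xhat, xa, y, Kxhat, Sa_inv, So_inv):
    """Return (J_A/n, J_O/m); values far from 1 suggest a poorly chosen Gamma."""
    dx = xhat - xa
    dy = y - Kxhat
    Ja = dx @ Sa_inv @ dx  # prior term of the cost function
    Jo = dy @ So_inv @ dy  # observation term of the cost function
    return Ja / dx.size, Jo / dy.size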

[BUG/ISSUE] StateVector.nc fails to build for very narrow region of interest

Using a very small latitude range for the region of interest:

LonMin: 3
LonMax: 7
LatMin: 42
LatMax: 43
REGION: "EU"

BufferDeg: 5
nBufferClusters: 8
LandThreshold: 0.25
CreateStateVectorFile: true

Res: "0.5x0.625"
Met: "merra2"

IMI throws the following error when building and then attempting to read the state vector file:

/home/ubuntu/CH4_Workflow/Test_France_3days/make_state_vector_file.py:110: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
  statevector.values[land.isnull()] = 0
Traceback (most recent call last):
  File "/home/ubuntu/CH4_Workflow/Test_France_3days/make_state_vector_file.py", line 183, in <module>
    make_state_vector_file(land_cover_pth, save_pth, lat_min, lat_max, lon_min, lon_max, buffer_deg, land_threshold, k_buffer_clust)
  File "/home/ubuntu/CH4_Workflow/Test_France_3days/make_state_vector_file.py", line 110, in make_state_vector_file
    statevector.values[land.isnull()] = 0
IndexError: too many indices for array: array is 2-dimensional, but 19 were indexed

Works normally when using a larger region of interest, e.g. setting LatMax: 53.
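
A likely (untested) fix is to index the underlying numpy array with a plain boolean array, or to stay within xarray, rather than indexing with the DataArray itself; statevector and land here are the variables from make_state_vector_file.py:

# Index with a plain boolean numpy mask instead of the DataArray
statevector.values[land.isnull().values] = 0

# or equivalently, staying in xarray:
statevector = statevector.where(land.notnull(), 0)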

Add environment files for spack environment and conda environment on aws

Up to this point we have added the dependencies manually on the AMI, but going forward we should use environment files to track, load, and install the dependencies needed for the IMI workflow. This will help improve our documentation of dependencies and make future updates of the AMI easier and more predictable.

Permian/Cannon relics in the AMI

We left some relics from previous Permian/Cannon testing in the AMI. Here are two:

  1. The AMI comes with a backup_files/input_data_permian/ directory. Some of the contents may not be needed, and "permian" should not appear in the standard directory name; maybe backup_files/input_data instead.
  2. The sbatch header of run_inversion.sh contains huce_intel.

[FEATURE REQUEST] Automatic inversion ensembles

Currently users need to manually re-run the IMI with different inversion parameters to generate an inversion ensemble for better error characterization.

Add feature to automatically generate an inversion ensemble.

  • An example would be to vectorize the config input for Gamma and have the IMI run the inversion once for each value. This would necessitate a new prior/posterior run directory for each Gamma value.
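
A minimal sketch of this idea, assuming a hypothetical GammaValues list in config.yml and that run_inversion.sh could read the Gamma value from an environment variable (neither mechanism exists today):

import os
import subprocess
import yaml

with open("config.yml") as f:
    config = yaml.safe_load(f)

for gamma in config.get("GammaValues", [0.25]):
    member_dir = f"inversion_gamma_{gamma}"
    os.makedirs(member_dir, exist_ok=True)
    # Hypothetical: run_inversion.sh would read GAMMA from the environment
    subprocess.run(
        ["sbatch", "run_inversion.sh"],
        cwd=member_dir,
        env={**os.environ, "GAMMA": str(gamma)},
        check=True,
    )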

[FEATURE REQUEST] Additional regional domains offered by the IMI

The IMI originally included the following regions:

  • China/SE Asia
  • Europe
  • North America

It will also be expanded to offer:

  • South America
  • Middle East
  • Oceania
  • Africa

This primarily involves processing the global 0.25x0.3125 meteorology fields available to the IMI. The global fields are cropped to the regions above to reduce file size and speed up file I/O in the GEOS-Chem simulations run within the IMI.
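
For illustration, the cropping step amounts to an xarray subset like the following; the filename and the South America bounds are examples only, and ascending lat/lon coordinates are assumed:

import xarray as xr

ds = xr.open_dataset("GEOSFP.20180501.A3dyn.025x03125.nc")  # example filename
south_america = ds.sel(lat=slice(-59, 16), lon=slice(-88, -31))  # illustrative bounds
south_america.to_netcdf("GEOSFP.20180501.A3dyn.025x03125.SA.nc")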

[BUG/ISSUE] s3 cp error for boundary condition file

There appears to be a seemingly innocuous s3 cp (or sync) error in the IMI output file:

fatal error: An error occurred (404) when calling the HeadObject operation: Key "HEMCO/../BoundaryConditions/GEOSChem.BoundaryConditions.20180401_0000z.nc4" does not exist

The config and output files are attached to reproduce the error, but it has been noted in other runs as well.

config.yml.txt
imi_output.log

HEMCO and automating the cluster file

Melissa is working on automating the generation of the cluster file.

Just a quick note that HEMCO currently points to an irrelevant default cluster file. So we will need to make sure HEMCO knows where to look for the automatically generated one.


Bad boundary conditions directory in HEMCO_Config.rc

The UMI template for HEMCO_Config.rc has a bad target directory for the boundary condition files. Currently, HEMCO looks for the boundary conditions in:

/home/ubuntu/ExtData/HEMCO/SAMPLE_BCs/v2019-05/CH4/

but this doesn't exist in the AMI. As a result our GEOS-Chem simulations show XCH4 levels that are systematically far too low.

Instead, HEMCO should look here for the boundary conditions:

/home/ubuntu/ExtData/BoundaryConditions/

Add flexibility to choose between running IMI on AWS or on local cluster

The IMI was originally developed for running locally on Harvard's Cannon Cluster (currently in main branch). Will Downs added the capability to run the IMI on AWS (currently in the add_download branch). The updates specific to AWS should be merged into main and an option will be added for users to select whether they are running on AWS or on a local cluster. Based on the user selection, the setup script, GEOS-Chem configuration, and scripts in CH4_TROPOMI_INV will automatically choose the correct settings for that system.

AMI does not exist

The docs do not actually specify an AMI. This makes setting up an EC2 instance much more difficult.


Setup script requires pre-built ExtData folder to run successfully

Because it tries to build the state vector before running the dry-runs that build ExtData.

Running the setup script with no pre-existing ExtData folder throws two errors:

(1) can't find the landcover file for the state vector
(2) can't find the BC file for the first day of the spinup simulation

Solution may be to perform dry-run(s) earlier in the script.

Problem switching between EC2 instance types

To save money, I used a c5.xlarge instance to set up the UMI and run the spin-up simulation, and then switched to a c5.9xlarge instance to run the Jacobian simulations.

Everything works normally when I do this, but switching back to c5.xlarge from c5.9xlarge introduces a problem with slurm.

For some reason, slurm shows the node's resources as drained, and jobs are no longer initiated after submission.

Will found a fix for this:

sudo scontrol update nodename=ip-172-31-20-68 state=idle

which resets the slurm node state, if I understand correctly.

If we can't avoid this problem, then we should document it on the Readthedocs.

[BUG/ISSUE] Multiple inversions create a file-path error in the visualization notebook

The visualization notebook reads from the config.yml, which is edited when new inversions are run. If you have an instance with multiple inversions and you run the visualization notebook of an inversion that wasn't the most recently run, the prior_pth variable will refer to the most recently run inversion and you get an error. It could be worth storing a copy of config.yml somewhere after submitting a run, for future reference in the vis notebook.
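
A minimal sketch of that workaround, with an illustrative archive path:

import shutil

def archive_config(run_dir, config_path="config.yml"):
    """Snapshot the config used for this inversion into its run directory."""
    shutil.copy(config_path, f"{run_dir}/config_archive.yml")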

Add end-to-end script

This issue specifically tracks the progress of adding an end-to-end script, as originally requested in #10.

An initial (rough draft) version of this script has been pushed to the feature/EndToEndScript branch, but a lot of work is still needed.

[BUG/ISSUE] STATE "drained" after restarting instance

Using c5.9xlarge instance.

Closed the instance after running an inversion with sbatch. After restarting the instance, sbatch can no longer schedule new jobs, which sit pending with reason (Resources).

sinfo shows drained STATE:

sinfo --Node --long

Thu Feb  3 22:23:32 2022
NODELIST         NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              
ip-172-31-79-91      1    debug*     drained   36   1:36:1  70235        0      1   (null) Low RealMemory

Not sure why this is happening, but to correct it I needed to run:

sudo scontrol update nodename=ip-172-31-79-91 state=idle

Extract user settings from setup script and put in YML file instead

Users can currently modify the settings for their inversion at the top of setup_ch4_inversion.sh, but there are many options now and it can be confusing which settings users should modify. We should think about moving the settings to a separate file to clean up the setup script and make things more user friendly.

[BUG/ISSUE] Error setting up preview GC simulation

The dry run for the 1-day preview simulation throws this error:

fatal error: An error occurred (404) when calling the HeadObject operation: Key "HEMCO/../BoundaryConditions/GEOSChem.BoundaryConditions.20180401_0000z.nc4" does not exist

This path would work:

../../ExtData/HEMCO/../BoundaryConditions/GEOSChem.BoundaryConditions.20180401_0000z.nc4

Unclear why ../../ExtData is missing for this file.

All that said, the error seems to be irrelevant -- the needed boundary condition file is present and the simulation runs correctly.

Only save out StateMet and LevelEdgeDiags for base run to save on disk space

Zichong Chen wrote:

I am thinking about the Jacobian runs and the large amount of storage they consume. Assume we will run 1000 perturbation runs; we actually need the StateMet and LevelEdgeDiags output only once, since the dry-air density and pressure levels are exactly the same in each perturbation run. With that said, we could select one run to output StateMet and LevelEdgeDiags, and for the other runs turn them off in HISTORY.rc. This will save a lot of storage and will likely also make the model run faster (by avoiding the large I/O of writing that output).

I already did that, and I think I am almost done with the perturbation runs. (1) After the modification, each perturbation run consumes 33 GB instead of 195 GB (as before) in my case (quarter-degree resolution over China). (2) I chatted with Hannah Nesser and Elise Penn the other day: to calculate the observation operator, the only StateMet variable we may need is Met_AD, and for LevelEdgeDiags we may only need Met_PEDGE. But if we only need this pressure/air-density output once, it does not matter much even if we want more variables. (3) I am now using 12 TB, but will soon go down to 1 TB when my operator scripts finish running.
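
A minimal sketch of the HISTORY.rc toggle, assuming the Jacobian run directories follow a hypothetical CH4_Jacobian_* naming pattern and that a collection is disabled by commenting out its entry in the COLLECTIONS list:

import glob

def disable_met_diags(keep_dir, pattern="CH4_Jacobian_*"):
    """Comment out StateMet and LevelEdgeDiags in HISTORY.rc for all but one run."""
    for run_dir in glob.glob(pattern):
        if run_dir == keep_dir:
            continue  # leave one run directory writing StateMet/LevelEdgeDiags
        path = f"{run_dir}/HISTORY.rc"
        with open(path) as f:
            lines = f.readlines()
        with open(path, "w") as f:
            for line in lines:
                if (("'StateMet'" in line or "'LevelEdgeDiags'" in line)
                        and not line.lstrip().startswith("#")):
                    line = "#" + line  # comment out the collection entry
                f.write(line)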

Problem setting IAM roles

The UMI requires the user to set up an AWS IAM role to allow EC2 to access S3.

Will provided instructions for how to do this on the Readthedocs, but when I tried, something went wrong. The role I made was not automatically applied to the instance, and when I applied it manually it still didn’t work.

Luckily I already had an equivalent IAM role set up, but fresh AWS users may get stuck here.

[FEATURE REQUEST] Include capability to optimize annual OH

For global inversions, we should also include optimization of OH. This will require updates to the GEOS-Chem simulations to perturb OH. These perturbations may be applied globally or separately to the Northern and Southern Hemispheres, but for most applications the global perturbation is sufficient.

Comments and suggestions from Daniel Jacob

In comments on the manuscript, Daniel made some suggestions for improving the UMI user experience.

  1. Right now the user needs to think about the buffer clusters when they define their inversion domain. They also need to think about the GEOS-Chem "buffer zone" (3 pixels on all domain sides). Daniel thinks the user should only have to tell the UMI what inversion domain they're interested in, and the UMI should automate everything else -- i.e. it should add pixels around the edges for the buffer clusters and handle the buffer zone automatically.
  2. Daniel also thinks the user should have to execute just one script to make everything work end-to-end. Can we do this? For the inversion part, run_inversion.sh does the trick. For the spin-up and other simulations, can we make a bash script that runs everything in series, with wait statements between each step, and then combine this with run_inversion.sh (a rough sketch follows this list)? This seems a bit painful to me, but maybe it's worth it. Otherwise we need to defend the multi-step UMI (run setup script -> run spin-up simulation -> run Jacobian simulations -> run inversion -> visualize results).
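
The second item asks about a bash driver; as a rough illustration of the same idea in Python, stages can be serialized by submitting each one with sbatch --wait. The stage script names other than run_inversion.sh are illustrative only:

import subprocess

stages = [
    "run_setup.sh",      # illustrative name
    "run_spinup.sh",     # illustrative name
    "run_jacobian.sh",   # illustrative name
    "run_inversion.sh",  # existing IMI inversion script
]

for script in stages:
    # --wait blocks until the submitted job finishes, so the stages run in series
    subprocess.run(["sbatch", "--wait", script], check=True)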
