geoschem / geos-chem-cloud
Run GEOS-Chem easily on AWS cloud
Home Page: http://cloud.geos-chem.org
License: MIT License
The current one is just taken from my IGC8 presentation in May 2017. It will need these updates:
This is a minor point but worth recording.
Using https://github.com/yantosca/aws-env might be a good option.
Daniel suggested a survey & testing among users to see how quickly they can get started with the AWS cloud. I think this is a great idea and will add significant value to the paper.
Here's my preliminary plan:
Users/volunteers will be asked to go through one or two pre-defined workflows, and report the time spent on each stage.
Test 1 (required). The absolute minimum beginner demo, following the Quick start guide.
The entire process should only take 10~20 minutes if everything goes smoothly. Factors that could slow down the process include:
Test 2 (optional). A more complete, customizable workflow, resembling a realistic project. This requires the user to read all the beginner tutorials for additional topics like S3 and spot instances, as well as the s3://gcdata bucket. Those steps will be documented more explicitly, to replace the current research workflow section. The entire process should take 1~2 hours, if the user has already read most of the beginner tutorials like AWS-CLI, S3, and spot instances. Would it be necessary to also record the time spent on those individual chapters? They should take 1~2 hours in total, depending on how focused a user is.
Besides the time spent on each stage of the workflow, additional information includes:
We could in principle use Google Forms, but I want users to reply directly on the GitHub issue tracker. This makes the "raw data" transparently visible and ensures information credibility. All participants would need to sign up for their own GitHub accounts.
Where to obtain AWS accounts? I would like users to sign up for their own accounts using their own credit cards, but not everyone might be willing to do so. We can also send out temporary user accounts (IAM users derived from our own root account) for testing. I should be able to get some credits for this.
Collaborate with existing classes & conferences? Students from EPS200/EPS236 seem like good candidates. I would also really like to collaborate with CS205 because that class teaches a lot about scientific computing on AWS, but it is not offered in the coming semester. A great chance to gather information is the workshop at IGC9, but that is two semesters ahead. We can get some initial results by inviting users online, and gradually add more data if there's a chance to perform offline user testing.
How many samples should we get? 20 doesn't seem very compelling? Maybe 50?
CC @yantosca @lizziel @msulprizio @sdeastham @ayshaw for any comments. Let's make all our discussions transparent on GitHub. This is the approach adopted by Pangeo (https://github.com/pangeo-data/pangeo/issues) and I think this transparency is quite valuable for the community.
Need a developer guide so more people can help maintain this system. It can also be a reference for building other atmospheric models, or basically any model that uses Fortran/C and NetCDF.
Will need to cover:
On AWS, consider the pre-built Singularity module provided by AlcesFlight:
http://docs.alces-flight.com/en/stable/apps/singularity.html
Harvard's Odyssey cluster will soon have Singularity available. That's a chance to show that we can get exactly the same environment on cloud and local clusters.
https://www.rc.fas.harvard.edu/odyssey-3-the-next-generation/
Hello,
I have gone through your wiki page on Setting Unix environment variables for GEOS-Chem. However, I found that the GEOSChem_env file does not specify OMP_NUM_THREADS. In this sense, I just wonder: will make -j4 mpbuild make the geos.mp executable use all available cores automatically? If I set OMP_NUM_THREADS to the maximum number of cores I have in ~/.bashrc, will it give me the quickest speed? Previously, I have tried c5.9xlarge and c5.18xlarge, which have 18 and 36 cores, respectively. However, the latter did not double the speed of the former. Hope you could clarify these kinds of things for me. Many thanks in advance!
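As a hedged sketch of the thread setup being asked about (the halving heuristic is an assumption about c5 hyperthreading, not something from the wiki), OMP_NUM_THREADS can be set in ~/.bashrc like this:

```shell
# Sketch: set OpenMP threads to the number of physical cores.
# nproc reports logical CPUs, and c5 instances expose 2 hyperthreads per
# physical core, so halving it approximates the physical core count.
cores=$(( $(nproc) / 2 ))
if [ "$cores" -lt 1 ]; then cores=1; fi
export OMP_NUM_THREADS=$cores
export OMP_STACKSIZE=500m   # GEOS-Chem also needs a large per-thread stack
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

Going from 18 to 36 cores not doubling the speed is expected for OpenMP codes, since shared-memory scaling flattens out at high thread counts.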
Yours faithfully,
Fei
The existing geo environment is not sufficient to work with the gcpy package (https://bitbucket.org/gcst/gcpy) and the available packages for GEOS-Chem data regridding, both lat/lon and cubed-sphere. However, I am able to successfully use gcpy and regrid using xESMF with the following environment installed (see below).
The package list is based on the one from gcpy but with additional packages added, particularly for handling cubed sphere data. Overall it includes more than is needed but provides very good coverage for what users might want. Storing the yml file within the AMI would allow users to be able to create their own environments very easily. I have a README available that I put together last year concisely giving directions for how to do this if you would like to adapt it.
Note that I cloned the gcpy, xESMF, and cubedsphere packages and then manually installed them on top of my environment, e.g. pip install -e /home/ubuntu/src/xESMF. I also had to specify an older version of xarray since gcpy is not compatible with the latest xarray version (!)
file: gcpy.yml

name: gcpy
channels:
  - defaults
  - conda-forge
  - nesii/label/dev-esmf
dependencies:
  - python=3.6    # Python version 3.6
  - basemap       # Precursor to cartopy
  - bottleneck    # C-optimized array functions for NumPy
  - cartopy       # Geographic plotting toolkit
  - cython        # Transpile Python->C
  - dask          # Parallel processing library
  - esmpy         # ESMF python package
  - graphviz      # Visualize dask graph (binary)
  - future        # Python 2/3 compatibility
  - h5py          # Wrapper for HDF5
  - ipython       # IPython interpreter and tools
  - jupyter       # Jupyter federation architecture
  - matplotlib    # 2D plotting library
  - netcdf4       # Wrapper for netcdf4
  - notebook      # Notebook interface
  - numpy         # N-d array and numerics
  - pandas        # Labeled array library
  - pyresample    # Geographic resampling tools
  - scipy         # Common math/stats/science functions
  - scikit-learn  # Machine learning library
  - statsmodels   # Regression/modeling toolkit
  - seaborn       # Statistical visualizations
  - six           # Python 2/3 compatibility
  - tqdm          # Nice progressbar for longer computations
  - xarray=0.9.6  # N-d labeled array library
  - xbpch         # Interface for bpch output files
  - sphinx        # Documentation
  - pip:
    - codecov     # Coverage tool
    - xbpch       # Interface for bpch output files
    - h5pyd       # HDF5 for Amazon S3
    - h5netcdf    # Allow HDF5 backend for xarray
    - graphviz    # Visualize dask graph (python package)
    - pycodestyle # Tool to check style conventions (formerly pep8)
    - pytest-cov  # Coverage tool
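For reference, a sketch of how a user could build this environment once gcpy.yml is shipped in the AMI, plus the editable installs mentioned above (the /home/ubuntu/src paths are illustrative):

```shell
# Create the conda environment from the yml file, then install the cloned
# packages on top in editable mode. Guarded so this is a no-op where conda
# or the yml file is absent.
envname=gcpy
if command -v conda >/dev/null 2>&1 && [ -f gcpy.yml ]; then
  conda env create -f gcpy.yml
  source activate "$envname"
  pip install -e /home/ubuntu/src/xESMF
  pip install -e /home/ubuntu/src/gcpy
  pip install -e /home/ubuntu/src/cubedsphere
fi
echo "environment: $envname"
```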
Maggie Marvin at the University of Edinburgh asked whether the data on Dalhousie's FTP can also be uploaded to S3.
My questions are:
My reference is
http://wiki.seas.harvard.edu/geos-chem/index.php/Downloading_GEOS-Chem_source_code_and_data#Dalhousie_data_directory_archive
Looks like most of them are metfields and should largely duplicate s3://gcgrid/ ?
Further, custom nested fields can be cropped from the global data:
s3://gcgrid/GEOS_0.25x0.3125/GEOS_FP/
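As a sketch of the cropping idea, one could subset a nested-NA domain from a single global metfield file with NCO's ncks. The file name below is a hypothetical example, and the step assumes NCO plus requester-pays S3 access, so the actual cut is guarded:

```shell
# Sketch: crop a nested-NA box (10N-70N, 140W-40W) out of one global
# 0.25x0.3125 GEOS-FP file. The input file name is a hypothetical example.
infile=GEOSFP.20160701.A3cld.025x03125.nc
outfile=${infile%.nc}.NA.nc
# e.g. aws s3 cp --request-payer=requester \
#   s3://gcgrid/GEOS_0.25x0.3125/GEOS_FP/2016/07/$infile .
if command -v ncks >/dev/null 2>&1 && [ -f "$infile" ]; then
  ncks -d lat,10.0,70.0 -d lon,-140.0,-40.0 "$infile" "$outfile"
fi
echo "$outfile"
```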
Hi Jiawei,
I am trying to run a geosfp_2x2.5_standard run on AWS. I've followed the instructions given here (https://cloud-gc.readthedocs.io/en/latest/chapter02_beginner-tutorial/research-workflow.html#get-source-code-and-checkout-model-version) including making a GC directory, downloading the source code and Unit Tester, and changing the copyrundirs.input file with the appropriate information (GCGRID_root, data_root, etc.).
I went into the run directory and tried to compile (make realclean worked, make -j4 mpbuild NC_Diag gave an error).
Do you have any thoughts why this may have happened?
Preliminary estimation of computing cost, with spot instances:
TODO:
TODO:
Cost saving tricks:
After #2 is done, we will be able to run any type of GEOS-Chem simulation over any period. We can go beyond proof-of-concept runs and do serious research projects.
Preliminary workflow design:
The working directory should probably be in a standalone EBS volume, so it can be quickly shared between instances and will not be affected by spot instance termination. The root volume should only hold software libraries and model source code.
Instead of using my personal AMI with little metadata, we should use the AWS Marketplace to share the official GEOS-Chem AMI. It will provide better version control and a more official user interface. We will definitely NOT charge for the software. That AMI probably needs to be managed by an official Harvard ACMG account (can we use the same account for #2?). Also need to solve #1 first.
Good references are:
I will be leading an AWS workshop at the 9th International GEOS-Chem Meeting. The workshop will be on Thursday, May 9th, 3-5 pm (2 hours long). Nick Ragusa, Senior Solutions Architect at the AWS Boston office, will help organize the workshop. He will create 40~50 AWS accounts for temporary use during the workshop.
I am planning to cover the following parts:
Things to double-check before the workshop:
I also want to gather feedback as in #15. People should post their feedback under a GitHub issue. This seems like a very effective way of getting visible outcomes from the workshop. We can then reference those public user comments in papers/blog-posts/funding-proposals.
The workshop participants will need to sign up for GitHub accounts (ideally before the workshop), to post comments on the GitHub issue tracker (and star GEOS-Chem repositories🙂). After finishing the workshop, they will post feedback including:
Please leave any comments on this workshop plan under this issue. I will open another issue to gather actual workshop feedbacks.
With every version update, part of the readthedocs documentation will become out-of-date or even misleading (e.g. #31, 831121d).
All version-specific content is vulnerable to this problem. For example:
I thought about this issue previously but just realised that it might be worth the effort.
From this link we know it is quite convenient to launch an EC2 instance using aws ec2 run-instances. However, the solution the link provides only targets Linux/Mac users. For Windows users, script modifications are needed. Using the Git Bash recommended by the Quick Start Guide, we won't have the aws software by default. Installing aws on Windows can be achieved with Anaconda or Miniconda, and a line of code, e.g. export PATH=.../miniconda3/Scripts:$PATH, needs to be added to tell Git Bash where to find aws. Besides this, I also found that --block-device-mapping DeviceName="/dev/sda1",Ebs={VolumeSize=$EBS_SIZE} does not work properly when launching from Windows, but I don't know how to resolve it.
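One hedged guess at the Windows failure (an assumption, not a confirmed fix): Git Bash's MSYS layer rewrites arguments that look like POSIX paths, so /dev/sda1 can be mangled before it reaches the CLI. Keeping the whole mapping in a quoted variable, and optionally disabling path conversion for that one call, is worth trying:

```shell
# Keep the device mapping in a quoted variable so /dev/sda1 is passed through
# literally to the aws CLI.
EBS_SIZE=200   # GB, illustrative
mapping="DeviceName=/dev/sda1,Ebs={VolumeSize=$EBS_SIZE}"
echo "$mapping"
# On Git Bash, additionally disable MSYS path conversion for the call:
# MSYS_NO_PATHCONV=1 aws ec2 run-instances ... --block-device-mapping "$mapping"
```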
CfnCluster might complement #4, especially for a large number of ensemble runs. You will get an HPC-cluster-like environment, with a job scheduler handling multiple compute nodes. But it takes much longer to launch (~10 min) than basic EC2 instances (~seconds), and is quite difficult for a new user to learn. Will try this after #4 is largely finished.
CfnCluster is probably an overkill for GC-classic with OpenMP-only parallelization. But we will eventually use it for GCHP.
Model version: 12.1.0
AMI id: ami-0ee8892ae47c31be1
Instance type: c5.4xlarge
All the steps were copied from https://cloud-gc.readthedocs.io/en/latest/chapter02_beginner-tutorial/research-workflow.html except for several configurations:
In input.geos:
%%% SIMULATION MENU %%% :
Start YYYYMMDD, hhmmss : 20160701 000000
End YYYYMMDD, hhmmss : 20160703 000000
%%% NESTED GRID MENU %%%:
Save TPCORE BC's : T
Input BCs at 2x2.5? : T
Over North America? : F
TPCORE NA BC directory : BC_2x25_NA/
Over Europe? : F
TPCORE EU BC directory : BC_2x25_EU/
Over China? : T
TPCORE CH BC directory : BC_2x25_CH/
Over Asia? : F
TPCORE AS BC directory : BC_2x25_AS/
Over Custom Region? : F
TPCORE BC directory : BC_2x25/
BC timestep [sec] : 10800
LL box of BC region : 9 26
UR box of BC region : 29 41
I0_W, J0_W, I0_E, J0_E : 3 3 3 3
In HISTORY.rc
#============================================================================
EXPID: ./OutputDir/GEOSChem
COLLECTIONS: 'Restart',
#'SpeciesConc',
#'Budget',
'AerosolMass',
'Aerosols',
# %%%%% THE AerosolMass COLLECTION %%%%%
#
# Aerosol and PM2.5 mass
#
# Available for full-chemistry and aerosol-only simulations
#==============================================================================
AerosolMass.template: '%y4%m2%d2_%h2%n2z.nc4',
AerosolMass.format: 'CFIO',
AerosolMass.frequency: 00000001 000000
AerosolMass.duration: 00000001 000000
AerosolMass.mode: 'time-averaged'
AerosolMass.fields: 'AerMassBC ', 'GIGCchem',
'AerMassNH4 ', 'GIGCchem',
'AerMassNIT ', 'GIGCchem',
'AerMassPOA ', 'GIGCchem',
'AerMassSAL ', 'GIGCchem',
'AerMassSO4 ', 'GIGCchem',
'AerMassSOAGX ', 'GIGCchem',
'AerMassSOAMG ', 'GIGCchem',
'PM25 ', 'GIGCchem',
'TotalOA ', 'GIGCchem',
'TotalOC ', 'GIGCchem',
::
#==============================================================================
# %%%%% The Aerosols COLLECTION %%%%%
#
# Aerosol optical depth, surface area, number density, and hygroscopic growth
#
# Available for full-chemistry or aerosol-only simulations
#==============================================================================
Aerosols.template: '%y4%m2%d2_%h2%n2z.nc4',
Aerosols.format: 'CFIO',
Aerosols.frequency: 00000000 010000
Aerosols.duration: 00000001 000000
Aerosols.mode: 'instantaneous'
Aerosols.fields: 'AODDust ', 'GIGCchem',
'AODDustWL1_?DUSTBIN? ', 'GIGCchem',
I also made the BC_2x25_CH and OutputDir directories prior to running the model. The model ran smoothly but no BC files were generated in the BC_2x25_CH directory.
Hello, this picture shows the problem I encountered.
This picture shows the list of files in my current folder.
This picture shows the modified content of the input.geos file.
I tried to search this question in Google and got some reference answers (http://wiki.seas.harvard.edu/geos-chem/index.php/GEOS-Chem_restart_files#Restart_files_in_GEOS-Chem_12). Unfortunately, I still haven't solved this problem.
Hope to get help, thank you.
Thanks to the fixes in geoschem/GCHP#6, GCHP can now run correctly in the same software environment as GC-classic (Ubuntu 18.04, gcc 7.3.0, netCDF 4.6.0; the only addition is OpenMPI 3). I see no reason to maintain two separate AMIs whose software libraries largely duplicate each other. This also avoids duplicating ~100 GB of minimum input data.
Users can test GCHP immediately after playing with GC-classic on the cloud. It seems like a great education opportunity for users.
My only concern is that GCHP might lag behind GC-classic's version. For example, when the new version of MAPL is implemented, it is not clear how much work it would take to make it run properly on the cloud.
If there are no objections I will use the same AMI for both.
@yantosca @lizziel @msulprizio
The biggest difference between cloud and local machines is perhaps data management. Doing small-scale computation is not too different because they are all just Linux servers...
Here are several ways to preserve data after the work is done, listed from fast & expensive to slow & cheap:
The latest pricing can be found at:
Will need to go through them one-by-one. S3 tutorials can use other Earth science data on AWS to show additional benefits of the cloud.
In HEMCO_data_directories I notice that
In GEOS-Chem 12.5.0 AFCID was separated from the DEAD dust emissions extension to allow for use with offline dust emissions as well.
Now DUST_DEAD/v2018-04 no longer exists in the S3 bucket:
$ aws s3 ls --request-payer=requester s3://gcgrid/HEMCO/DUST_DEAD/
PRE v2014-07/
PRE v2019-06/
This is fine for 12.5.0+, which does not read from DUST_DEAD/v2018-04, but older versions still read it.
On Odyssey, it is linked to ../AFCID/v2018-04
$ cd /n/holylfs/EXTERNAL_REPOS/GEOS-CHEM/gcgrid/gcdata/ExtData/HEMCO/DUST_DEAD
$ ls -l v2018-04
lrwxrwxrwx 1 msulprizio jacob_gcst 17 2019-08-12 10:36 v2018-04 -> ../AFCID/v2018-04/
However, S3 doesn't support symlinks. We can fix it just like we fixed the GMI files:
# Download the real AFCID data, then recreate DUST_DEAD/v2018-04 as a symlink
DATA_ROOT=$HOME/ExtData
aws s3 cp --request-payer=requester --recursive \
    s3://gcgrid/HEMCO/AFCID/v2018-04/ $DATA_ROOT/HEMCO/AFCID/v2018-04/
cd $DATA_ROOT/HEMCO/DUST_DEAD
rmdir v2018-04               # remove the empty placeholder directory, if any
ln -s ../AFCID/v2018-04 ./   # link v2018-04 -> ../AFCID/v2018-04
Note that this issue breaks the HEMCO parser script suggested in #25 (comment). The generated script will not download from AFCID/v2018-04/, as it is not shown in the log file.
This is a long-term goal. Just put it here for record. Will focus on GC-classic right now.
TODO:
AWS has signed a formal agreement with Harvard to host the data for free. We need to start moving and managing the data.
@JiaweiZhuang and I have noted an intermittent error that can happen when trying to request spot instances for the c5 instance type. The requested spot instances would be denied with the "bad parameters" error:
In this case, the error occurs because the default subnet field (red box below) must be defaulting to an availability zone (us-east-1e) that does not have any c5 nodes. As you can see, spot prices are quoted for all availability zones except us-east-1e, which means that there are no nodes of this instance type there.
The solution is simple. Manually select a subnet in one of the availability zones for which a spot price is listed (e.g. us-east-1a):
This should cause your spot request to be fulfilled successfully.
Currently our S3 bucket only syncs the data on Harvard Odyssey, but not the extra data on Compute Canada, notably the global high-res metfields. But users can still download data from Compute Canada to EC2 (pulling data into the cloud is free). We should see how fast this is. If it is too slow we should consider adding those data to S3.
The speed can depend highly on the AWS region -- I believe that the Canada region would be very fast. Worth testing in both the us-east-1 and ca-central-1 regions.

Hi,
I am trying to run a 2x25 global complexSOA_SVPOA simulation on AWS using the latest GEOS-Chem. Again, I have followed all the steps from your tutorial except that I changed the simulation period to 20130701-20130901, and I have downloaded all the met data. Then I encountered the following error, copied from HEMCO.log:
HEMCO ERROR: Cannot find file for current simulation time: /home/ubuntu/ExtData/HEMCO/CEDS/v2018-08/1712/NO-em-anthro_CMIP_CEDS_1712.nc - Cannot get field CEDS_NO_AGR. Please check file name and time (incl. time range flag) in the config. file
I then went to s3://gcgrid to find that file (aws s3 ls --request-payer=requester s3://gcgrid/HEMCO/CEDS/v2018-08/) but unfortunately failed. Instead, I found NO-em-anthro_CMIP_CEDS_195001-201412.nc under s3://gcgrid/HEMCO/CEDS/v2018-04/. Is this a problem of missing data, or do I need to change my configuration for this period?
Time: Thursday (05/09), 3:00-5:00 pm
Location: Maxwell-Dworkin G115
Requirement: Bring a laptop with internet connection. To connect to WIFI, follow Connect to Harvard Wireless as a guest, "Connecting to Harvard wireless as a visitor" (second section).
Free temporary AWS accounts will be provided during the workshop. We have 30+ registered participants right now and can accommodate at most 40 people.
Here are the preparations you can do before the workshop. You don't have to do any of these before coming, and can instead do them during the workshop. But these steps can help you get the most out of the workshop (so you can spend less time waiting for software installations, etc.).
Sign up for a GitHub account at https://github.com/join. You will need such an account to provide workshop feedback (detailed later). You can further use the account to ask questions and report bugs about GEOSChem-on-cloud on the GitHub issue tracker. Treat it as an online forum. GitHub is also the currently recommended way to submit your code updates, so it is nice to have an account anyway. Finally, consider giving a star to the repositories under the GEOS-Chem team. Just click on the "star" button in the upper-right corner of each repository page. This "shows appreciation to the repository maintainer for their work".
Study our Python tutorial at https://github.com/geoschem/GEOSChem-python-tutorial if you have no prior Python experience. The workshop will involve a little bit of Python, but it is also fine to come without prior knowledge.
For Windows users, install Git-Bash as a Linux-like terminal. This is mentioned in more detail in our online guide:
On Windows, I highly recommend installing Git-BASH to emulate a Linux terminal, so you can follow exactly the same steps as on Mac/Linux. Simply accept all default options during installation, as the goal here is just to use Bash, not Git. Alternatively, you can use MobaXterm, Putty, Linux Subsystem or PowerShell with OpenSSH. But the Git-BASH solution should be the most painless and will also work smoothly in later steps where we add port-forwarding options to connect to Jupyter.
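For reference, the port-forwarding mentioned above typically looks like the sketch below; the key path and hostname are placeholders, and 8999 is just an example port:

```shell
# Forward local port 8999 to the same port on the EC2 instance, where Jupyter
# listens; then open http://localhost:8999 in the local browser.
# Key path and hostname are placeholders.
host="ubuntu@ec2-xx-xx-xx-xx.compute-1.amazonaws.com"
cmd="ssh -i ~/.ssh/my-key.pem -L 8999:localhost:8999 $host"
echo "$cmd"   # run this command manually in the terminal
```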
You will follow the comprehensive online tutorials at http://cloud.geos-chem.org. Due to time constraints, the workshop will only cover two essential parts:
Part 1: Working with EC2. Follow the Quick start guide. You will launch your own server on AWS, run a demo GEOS-Chem simulation, and plot output data in Jupyter notebooks.
Special instructions for Part 1: Unlike the standard tutorial, in this workshop you will use the free temporary AWS account instead of signing up for your own account, and you will terminate the server at the end of this workshop (after finishing both Parts 1 and 2), not at the end of Part 1.
Part 2: Working with S3. Follow Configuring AWSCLI and retrieving data from S3. You will upload your own files to S3 for persistent storage, retrieve GEOS-Chem input data and other public Earth science datasets from S3, and change GEOS-Chem configurations to read the new input data.
Special instructions for Part 2: When obtaining the AWS CLI credentials, follow the "IAM user account" subsection, not the "root account" subsection.
The slides used for Thursday morning's model clinic are also available here, as a summary for the online tutorial.
You will reply to this survey by simply posting comments within this GitHub issue (i.e. the current page you are looking at). See the "comment" cell at the bottom of this page, with a big, green "comment" button (you need to log in to GitHub first).
To get a feeling for what the replies will look like, see our past survey at #15. All replies under this issue should adhere to the above user-feedback format. For general discussions on workshop plans, please comment under #23.
Hi! I run GEOS-Chem based on GEOSChem_13.2.1_tutorial_20211005. I got the output successfully, but when I run it again, I encounter two problems that I have no idea how to solve.
1.
./gcclassic --dryrun > log.dryrun
Log with unique file paths written to: log.dryrun.unique
2.
I want to simulate CH4 at a horizontal resolution of 0.5x0.625. But it showed that some data (ExtData/HEMCO/SAMPLE_BCs/v2021-07/CH4/GEOSChem.BoundaryConditions.YYYYMMDD_HHNNz.nc4) does not exist on Amazon or WashU.
TODO:
Should be done jointly with #5.
It seems that IDL has not been installed on ami-0ee8892ae47c31be1. Are there any other tools that can regrid and crop restart files in preparation for nested simulation?
Hi all,
I am working on using the AWS cloud to run GEOS-Chem. I have registered my student account, but when I was launching r5.large, after clicking Launch Instances I got an error saying "You are not authorized to perform this operation. Encoded authorization failure message: ...".
When I use a normal account, I am able to launch the EC2 instance. Do you have any idea about this error?
Model version: 12.7.0
AMI Name: GEOSChem_12.7.0_tutorial_20200205
Instance type: c5.18xlarge (To prevent the EC2 from being killed or running out of memory)
Hello, we've been trying to set up an aerosol-only nested run for North America with a special focus on the western U.S., but we haven't been able to identify the complete steps to accomplish this. We've decided to use GC 12.7 because of this issue. Also, we've read in depth the following links: Nested Grid Wiki, Nested Model Clinic, Flex Grid Simulation Error, 3 hour boundary conditions for nested grid, and @FeiYao-Edinburgh's steps to run a nested simulation for complex SOA. We have also troubleshooted with the preconfigured nested grid (tropchem) to identify some errors, but we haven't succeeded at running a nested grid for the simulation of our interest (aerosol). Because of this, we would like to have your input on the steps to tackle this.
Here is our understanding of the way to proceed, along with some questions (in italics) associated with it:
I don't want to confuse anyone, so please take these steps as our notion of what might have to be done rather than what should actually be done.
1.- Creating the boundary conditions: First we create a run directory from /UT/perl/CopyRunDirs.input for offline aerosol (or should we create the BCs by activating geosfp standard?) by activating GEOSFP Aerosol only 4x5 aerosol (or is 2x2.5 better?), for the date on which we plan to generate our boundary conditions. This date has to be one day before the date of our nested run.
2.- Once the directory is generated, we go there and modify HISTORY.rc: we turn on 'Restart', 'SpeciesConc', and 'BoundaryConditions' (or should we turn on/off something else?), then we change BoundaryConditions.frequency: 00000000 030000 to 1 hour, so that we can address the hourly issue specified by @msulprizio.
3.- In the HEMCO_Config.rc file, we modify the paths to the data as explained by Bob in #39 (comment), so that we are able to download the data once we do the dry run as explained by @yantosca.
4.- Then in input.geos we modify the latitude and longitude to the area of interest, for example lat and lon for California (or is it for North America, or global, leaving it as it was?). Here, should we turn Nested grid simulation on, or should we leave it off? Also, if we want to run a nested grid for California, should we run the BCs for the global area (-180 180; -90 90) or for NA (-140 -40; 10 70), or can we do it for a custom area (let's say, the western US)? Do the nested and global dimensions have to agree, or can we do global for North America and nested for California?
%%% GRID MENU %%% :
Grid resolution : 4.0x5.0
Longitude min/max : 114.0 119.0 (questions regarding these specs)
Latitude min/max : 38.0 42.0 (questions regarding these specs)
Half-sized polar boxes?: F
Number of levels : 47
Nested grid simulation? : F (questions regarding these specs)
Buffer zone (N S E W ) : 3 3 3 3
5.- Then compile, using make realclean; make -j4 build NC_DIAG=y BPCH_DIAG=n TIMERS=1 (or is it something else, as @FeiYao-Edinburgh was doing for version 12.2?).
5.1.- Download the data using the dry run, then execute ./geos to get the boundary conditions in OutputDir.
6.- Once we have created the BCs, we go back to /UT/perl/CopyRunDirs.input and create a new directory by activating GEOSFP Aerosol only 4x5 aerosol and modifying it to nested as @FeiYao-Edinburgh did (or would it be advisable to modify it from the tropchem nested grid?):
#--------|-----------|------|------------|------------|------------|---------|
# MET | GRID | NEST | SIMULATION | START DATE | END DATE | EXTRA? |
#--------|-----------|------|------------|------------|------------|---------|
geosfp 4x5 - aerosol 2013070100 2013070101 - (questions regarding these specs)
## ======= Nested model runs ==================================================
# merra2 05x0625 as tropchem 2016070100 2016080100 -
# merra2 05x0625 na tropchem 2016070100 2016080100 -
# geosfp 025x03125 ch tropchem 2016070100 2016080100 -
geosfp 025x03125 na tropchem 2016070100 2016080100 - (questions regarding these specs)
## ======= HEMCO standalone ===================================================
We want:
## ======= Nested model runs ==================================================
# merra2 05x0625 as tropchem 2016070100 2016080100 -
# merra2 05x0625 na tropchem 2016070100 2016080100 -
# geosfp 025x03125 ch tropchem 2016070100 2016080100 -
geosfp 025x03125 na **aerosol** 2016070100 2016080100 -
7.- We modify the generated directory (either geosfp_4x5_aerosol or geosfp_025x03125_tropchem_na). How do we do this? What are we supposed to change to make this happen? This is one of the most important questions, but it was not entirely clear how @FeiYao-Edinburgh approached it here.
8.- Once the geosfp_4x5_aerosol_na directory is set up, we modify everything to make it run (how does input.geos look?). Can we select the lat and longitude of our interest (e.g. Los Angeles)?
%%% GRID MENU %%% :
Grid resolution : 025x03125 (questions regarding these specs)
Longitude min/max : 114.0 119.0 (questions regarding these specs)
Latitude min/max : 38.0 42.0 (questions regarding these specs)
Half-sized polar boxes?: F
Number of levels : 47
Nested grid simulation? : T (questions regarding these specs)
Buffer zone (N S E W ) : 3 3 3 3
9.- We modify HEMCO_Config.rc just as we did in step 3, to be able to download the data, and we also modify METDIR, to be able to read the _NA data:
METDIR: /project/data/ExtData/GEOS_0.5x0.625_NA/GEOSFP
In the same file, we also modify the path to our boundary conditions just as @msulprizio explained (for 12.7; solved for later versions):
#==============================================================================
# --- GEOS-Chem boundary condition file ---
#==============================================================================
(((GC_BCs
* BC_ $**MyPathToBoundaryConditions/OutputDir**/GEOSChem.BoundaryConditions.$YYYY$MM$DD_$HH$MNz.nc4 SpeciesBC_?ADV? 1980-2019/1-12/1-31/* RFY xyz 1 * - 1 1 (note that we substituted 1-23 with *)
)))GC_BCs
(((CHEMISTRY_INPUT
Also from the wiki
# ExtNr ExtName on/off Species
0 Base : on *
# ----- RESTART FIELDS ----------------------
--> GC_RESTART : true
--> GC_BCs : true
--> HEMCO_RESTART : true
10.- Do we modify HISTORY.rc to ask for boundary conditions, or do we not really need to do this? If we do, should we also change BoundaryConditions.frequency: 00000000 030000 to 1 hour, just as we did in step 2, or is it not really necessary? Also, in HISTORY.rc we will request 'Restart' as output. (This might bring an error later on.)
#==============================================================================
# %%%%% THE BoundaryConditions COLLECTION %%%%%
#
# GEOS-Chem boundary conditions for use in nested grid simulations
#
# Available for all simulations
#==============================================================================
BoundaryConditions.template: '%y4%m2%d2_%h2%n2z.nc4',
BoundaryConditions.format: 'CFIO',
BoundaryConditions.frequency: 00000000 030000 **Should we change this to 1 hour?**
BoundaryConditions.duration: 00000001 000000
BoundaryConditions.mode: 'instantaneous'
BoundaryConditions.LON_RANGE: -130.0 -60.0, **This does not show up in HISTORY.rc; should we add it?**
BoundaryConditions.LAT_RANGE: 10.0 60.0, **This does not show up in HISTORY.rc; should we add it?**
BoundaryConditions.fields: 'SpeciesBC_?ADV? ', 'GIGCchem',
11.- We compile using this (or is it something else? @FeiYao-Edinburgh):
make realclean; make -j4 build NC_DIAG=y BPCH_DIAG=n TIMERS=1
12.- We download the files from the dry run, and modify some file extensions manually, like ".NA.nc" to ".nc", in ExtData to be able to pull them.
13.- We then run ./geos, and then get our .nc file in OutputDir?
I am clearly missing many steps, or this might be entirely wrong, so your input would be very valuable.
**Once we have some input on how to approach this, we will share with you a public AMI to troubleshoot some of the errors that we expect to see. We will also share the following files (once we make them): GC_log.txt, HEMCO.log.txt, input.geos.txt, HISTORY.rc.txt, HEMCO_Config.rc.txt**
If the manual AMI creation is getting annoying and repetitive, there are many ways to automate the process. For example, Packer is a popular way to build virtual machine images, just like Docker for building container images.
Useful materials:
It looks pretty easy to use. The Packer template completely defines the AMI content. You basically just need source_ami to specify which base AMI ID to start with, the shell commands you want to run on top of that AMI (specified in script for long scripts or inline for short commands), and additional security settings like the AWS key.
@yantosca This might be worth taking a look at.
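To give a feel for how small such a template can be, here is a hedged sketch of a Packer JSON template (the AMI ID, region, and names are placeholders, and the provisioner command is illustrative only; check the Packer documentation for the current syntax before using):

```json
{
  "builders": [{
    "type": "amazon-ebs",
    "region": "us-east-1",
    "source_ami": "ami-xxxxxxxxxxxxxxxxx",
    "instance_type": "t2.micro",
    "ssh_username": "ubuntu",
    "ami_name": "geoschem-tutorial-{{timestamp}}"
  }],
  "provisioners": [{
    "type": "shell",
    "inline": ["sudo apt-get update -y"]
  }]
}
```

A single `packer build template.json` would then launch the base AMI, run the provisioner commands, and register the resulting image.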
We are trying to perform a 4x5 aerosol-only simulation for 20140101, following the workflow in the tutorial. However, we are having some issues when pulling data from the repository to our EC2 instance and when trying to run with ./geos.
Specifically, we have been using the last lines from deploy_GC.sh to fetch the files listed in the file specification output created by the dry run. However, when running ./geos, we found that the GFED4 files are not being pulled correctly. For instance, we wanted to include GFED4 emissions for 2014, which are only available in version /v2015-10/, but the download_data.py code was trying to pull these emissions from the 2020 version as /v2020-02/2014/GFED4_gen.025x025.201401.nc. This issue also prevents the run from proceeding, even after "manually" downloading the omitted files, since it tries to read the data from a local v2020-02/2014/ directory and not from /v2015-10/.
We suspect this may be due to omitted specifications in the input for the run and dry run, which triggers the use of the latest version of GFED4 rather than the version where the data actually is. So we were wondering where we can specify this so that we can download and run the data correctly. We were thinking of doing it in HEMCO_Config.rc, but we still need some insight.
GFED4 version not specified in run --> dry run pulls latest version of GFED4 --> data not available in latest version --> run and data download not working.
Errors in HEMCO.log:

```
Cannot find file for current simulation time: /home/ubuntu/ExtData/HEMCO/GFED4/v2020-02/2014/GFED4_gen.025x025.201401.nc
HEMCO ERROR: Error encountered in routine HCOIO_Read_Std!
HEMCO ERROR: Error in HCOIO_DATAREAD called from HEMCO ReadList_Fill: GFED_TEMP
HEMCO ERROR: Error in ReadList_Fill (3) called from HEMCO ReadList_Read
```
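If the root cause turns out to be a version path baked into HEMCO_Config.rc, one stopgap (a hypothetical helper, not an official tool) is to rewrite the vYYYY-MM path component of the relevant entries before the dry run:

```python
import re

def pin_dataset_version(line, dataset="GFED4", version="v2015-10"):
    """If the config line references the dataset, rewrite its
    vYYYY-MM path component to the pinned version."""
    if dataset in line:
        return re.sub(r"/v\d{4}-\d{2}/", f"/{version}/", line)
    return line
```

Applying this line by line to HEMCO_Config.rc would force the GFED4 entries to point at /v2015-10/ regardless of what the template defaults to.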
Starting in 13.0.0, pre-built Docker images for GCClassic and GCHP are deployed to Docker Hub on releases. I see that geos-chem-cloud is using geoschem/gchp_model. Could this be updated to use the new auto-deployed images?
GC 12.7.0 run in the Tutorial AMI.
We already solved this issue but wanted to bring it to your attention.
There are some files in the S3 gcgrid input repository that may have been corrupted when uploaded to S3.
For instance, we were getting an error when GC was trying to pull offline_dust for an Aerosol Only run. We found that this error was due to a specific damaged file:

```
/gcgrid/data/ExtData/HEMCO/OFFLINE_DUST/v2019-01/0.25x0.3125/2014/02/dust_emissions_025.20140202.nc
```

which weighed 173 KB in the gcgrid input repository. The same file on Compute Canada weighs 2.6 MB:

```
/ExtData/HEMCO/OFFLINE_DUST/v2019-01/0.25x0.3125/2014/02/dust_emissions_025.20140202.nc
```

We were able to get the run going by deleting the failing file and downloading the missing file with wget from Compute Canada. But it is worth mentioning that we weren't getting this error two weeks ago, so we believe there may have been a change in the s3://gcgrid input repository.
This fix resolved the following error:

```
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
HEMCO ERROR: Wrong dimensions: $ROOT/OFFLINE_DUST/v2019-01/0.25x0.3125/$YYYY/$MM/dust_emissions_025.$YYYY$MM$DD.nc
ERROR LOCATION: HCOIO_READ_STD (hcoio_read_std_mod.F90)
ERROR LOCATION: HCOIO_DataRead (hcoio_dataread_mod.F90)
ERROR LOCATION: ReadList_Fill (hco_readlist_mod.F90)
ERROR LOCATION: ReadList_Read (hco_readlist_mod.F90)
ERROR LOCATION: HCO_RUN (hco_driver_mod.F90)
```
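Truncated uploads like this one can often be caught before a run by flagging suspiciously small netCDF files. A minimal sketch (the helper name and size threshold are our own assumptions, so tune the threshold to the dataset):

```python
import os

def find_suspect_files(root, min_bytes=1_000_000, ext=".nc"):
    """Return netCDF files under root smaller than min_bytes,
    which may indicate a truncated download or upload."""
    suspects = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(ext):
                path = os.path.join(dirpath, name)
                if os.path.getsize(path) < min_bytes:
                    suspects.append(path)
    return suspects
```

Any file it flags can then be re-downloaded from Compute Canada and compared against the S3 copy.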
I used Spack to build a software stack for GEOS-Chem "Classic" simulations on AWS. I am documenting the steps I took here (at least for now). Maybe later I can make this a README file within the geos-chem-cloud repo.
For some background, see my video tutorial about installing gcc/gfortran 9.2.0 using Spack.
- Instance type: c5.4xlarge (with 16 vCPUs)
- AMI: GEOSChem-spack-tutorial-1 (ami-093dcad6c38c2a7f6)
This will start a cloud instance with Intel Skylake architecture.
After logging in to an AWS instance, type these commands:
```
# Unset env vars
unset NETCDF_HOME
unset NETCDF_FORTRAN_HOME
unset GC_BIN
unset GC_F_BIN
unset GC_INCLUDE
unset GC_F_INCLUDE
unset GC_LIB
unset GC_F_LIB

# Install environment-modules (similar to Lmod modules)
# This will let you use "spack load" to load a module and set PATH properly
sudo apt install environment-modules

# Initialize environment-modules
. /etc/profile.d/modules.sh

# Set the path to Spack and run the Spack setup script
export SPACK_ROOT=$HOME/spack
source $SPACK_ROOT/share/spack/setup-env.sh
```
After completing step 2, follow these steps:

```
# Display compiler info.
# Make sure that gcc@9.2.0 is displayed under linux-ubuntu-skylake_avx512.
spack config get compilers
```
```
# Then type the following commands in this order (I have added whitespace
# to better show the different parts of the commands).
# You can copy & paste these into your terminal window.
#
# Command       module           compiler     options
#------------------------------------------------------------------
spack install   netcdf-fortran   %gcc@9.2.0
spack install   perl             %gcc@9.2.0
spack install   flex             %gcc@9.2.0
spack install   cmake            %gcc@9.2.0
spack install   gmake            %gcc@9.2.0
spack install   gdb              %gcc@9.2.0
spack load      flex
spack load      texinfo
spack install   cgdb             %gcc@9.2.0
spack install   nco              %gcc@9.2.0
spack install   ncview           %gcc@9.2.0
spack install   openjdk          %gcc@9.2.0
spack install   tau              %gcc@9.2.0   ~otf2
spack install   unzip            %gcc@9.2.0   # Note: Optional
spack load      unzip                         # Note: Optional
spack install   emacs            %gcc@9.2.0   +X   # Note: Optional
```
Add this code to your ~/.bashrc file:
```
# Initialize environment-modules package
if [[ -f /etc/profile.d/modules.sh ]]; then
  source /etc/profile.d/modules.sh
fi

# Initialize spack
export SPACK_ROOT=$HOME/spack
source $HOME/spack/share/spack/setup-env.sh

# Load Spack packages and set relevant environment variables
if [[ -f ~/.init.gc-classic.gfortran92 ]]; then
  source ~/.init.gc-classic.gfortran92
fi

# Alias for loading modules compiled with gfortran 9.2.0
alias load_gf92='source ~/.init.gc-classic.gfortran92'
```
Then make sure to copy this file to your home directory, rename it to .init.gc-classic.gfortran92, and make it executable.
Then, when you open a new shell (or type source ~/.bashrc), the modules will be loaded as well.
I used the software stack that I created with Spack to create the following AMI:
GEOSChem-dev-gcc9.2.0 (ami-0db45eda61a721639). Use this to initialize your cloud instance. I will also try to copy this to an S3 bucket.
Also note: this AMI contains 1 day of GEOS-Chem data, but no GEOS-Chem code or UT folders. It is recommended to download the code and to do a dry-run to get the data for whatever period your simulation needs.
One-time setup:

1. Edit aws-env/root/.gitconfig with your name & email:

```
[user]
  name  = Bob Yantosca
  email = [email protected]
```

2. Edit the root/.bash_aliases file to add any of your favorite Unix aliases etc.

Then, once you are in a cloud instance using the GEOSChem-dev-gcc9.2.0 AMI, clone your fork of the aws-env repo to your home directory. Then type:
```
~/aws-env/initenv.sh
source ~/.bash_aliases
```
This will apply the settings of your startup scripts.
So far I have been using custom scripts to download input data from S3. They are just a bunch of aws s3 cp commands with ad-hoc --include / --exclude filters. This works fine initially, but can cause maintenance trouble in the long term: every version release adds or removes some datasets, so I always need to tweak the scripts a bit. This contrasts with the principle of "minimizing human intervention" in software deployment.
For example, 12.5.0 introduces offline grid-independent emissions with a total size of ~2 TB. Downloading the entire dataset takes too much time & space, so I need to skip downloading them by default (88f881f).
Most complications come from the HEMCO directory, which contains many emission datasets and gets updated frequently. Metfields and other files are generally quite static.
The new script might look like the hemco_data_download tool made by @yantosca. Instead of using Perl, the new script should probably be written in Python with boto3. It might use a similar config file to hemcoDataDownload.rc (which could be YAML/JSON), or parse the HEMCO_Config.rc file in the model run directory, to determine which datasets to download.
An important feature is selecting a time window, as some datasets span a long time and their total sizes are quite large. For example, OFFLINE_BIOVOC spans four years, while a typical model simulation only needs a few months:

```
$ aws s3 ls --request-payer=requester s3://gcgrid/HEMCO/OFFLINE_BIOVOC/v2019-01/0.25x0.3125/
    PRE 2014/
    PRE 2015/
    PRE 2016/
    PRE 2017/
```
The most straightforward way is probably rewriting hemco_data_download in Python, replacing wget with s3.download_file() (see the boto3 docs). It could also have an option to use ftplib to download data from the Harvard FTP server, so the same script works with multiple data sources.
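As a sketch of the time-window selection (the helper name and per-year layout are our assumptions based on the listing above; the actual transfer would still go through boto3 or the AWS CLI):

```python
def yearly_prefixes(base_prefix, start_year, end_year):
    """Build the per-year S3 prefixes a simulation window actually needs,
    so years outside the window are never downloaded."""
    return [f"{base_prefix}{year}/" for year in range(start_year, end_year + 1)]

# Example: a 2014-2015 simulation only needs two of the four years listed above.
prefixes = yearly_prefixes("HEMCO/OFFLINE_BIOVOC/v2019-01/0.25x0.3125/", 2014, 2015)
# Each prefix would then be handed to boto3 (or aws s3 cp) to fetch every
# object underneath it, remembering the requester-pays setting for gcgrid.
```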
@yantosca and Judit (@xaxis3) should be the most capable of doing this. It is not urgent right now but will save a lot of time in the long run.
Dear GEOS-Chem users,
We are excited to invite you to attend the first user testing and survey for the GEOSChem-on-cloud project! Our major purposes are:
This testing should take you at most an hour and cost less than $0.1.
Why would you bother attending if you already have GEOS-Chem running smoothly on your local computer? Here are some good reasons:
The survey will be done transparently on GitHub, so the first thing is signing up for a GitHub account if you don't have one already. GitHub is also the currently recommended way to submit your code updates, so it is nice to have an account anyway.
Please add some minimal information to your GitHub profile, such as a picture, a one-sentence personal introduction, or a link to your personal website / Google Scholar page, so we can know who you are and how to contact you. Also consider giving a star to the repositories under the GEOS-Chem team: just click the "Star" button in the upper-right corner of each repository page. This "shows appreciation to the repository maintainer for their work".
GEOS-Chem has a reasonably large number of users, but it is sad that its GitHub repo receives so little attention, because most scientists don't use GitHub. GitHub is actually a much better place for discussion than private emails, because all discussions are public and can easily be found by anyone with similar problems.
You will reply to this survey simply by posting GitHub comments within this GitHub issue (i.e. the current page you are looking at). See the comment box at the bottom of this page, with a big, green "Comment" button.
The first part is general info:
The second part is just a placeholder right now. It will be the time you spend on a minimal GEOS-Chem demo on the cloud, as explained in the next section:
We have very comprehensive documentation, but the most important part for a new user is the Quick start guide. That guide has 5 major steps. We sincerely ask you to go through them and record the time you spend on each step. The timing starts when you begin to follow the instructions for a particular step, and ends according to the following rules. Please read through the rules to get a general idea before you actually start timing.
Step 1: Sign up for an AWS account. This step is considered finished when you can log into the AWS console with your account. Note that a credit card is needed, although the cost of this short demo is negligible (~$0.1). If you don't have a credit card but still want to try the cloud, please contact us individually. For this test, I think it is very useful to see how long it takes to get the account. (If you have already signed up for an account, simply put a rough number for how long that took last time. Additional operations such as subscribing to educational credits are not included in the timing.)
Step 2: Launching a server with GEOS-Chem pre-installed: This step is considered finished when you can see the running server (also called "EC2 instance") with a green "running" icon in your AWS console.
Step 3: Log into the server and run GEOS-Chem. This step is considered finished when you can launch the simulation with the command ./geos.mp. There is no need to wait for the simulation to finish. (For Windows users, do not include the time spent installing terminal software like Git-BASH.)
Step 4: Analyze output data with Python and Jupyter: This step is considered finished when you can see the Jupyter notebook interface in your browser. You will find this step easier if you take a look at our interactive Python tutorial first. However, you don't have to know Python in order to follow this step. You can simply copy and paste the commands shown in the guide. I just want to see if people can successfully connect to the Jupyter notebook program on the cloud. Learning Python can be a separate topic.
Step 5: Shut down the server. This step is trivial, and there is no need to report the time spent on it. But do remember to shut down the server; otherwise you may be charged much more than $0.1.
Thanks very much for your time and help! Hope you find this new cloud computing capability useful!
I think most users will use emacs rather than vim. Installing emacs24 manually worked well, but it should be there by default.
Dry-run error in GEOS-Chem 12.7.0 geosfp_2x2.5_CO2 simulation on a c5.4xlarge AWS instance
Include the steps that must be done in order to reproduce the observed behavior:
```
At line 1027 of file input_mod.F
Fortran runtime error: Expected INTEGER for item 1 in formatted transfer, got REAL
(i10)
^
Error termination. Backtrace:
#0 0x14c3441e820b in require_type
   at /tmp/ubuntu/spack-stage/spack-stage-gcc-9.2.0-ss6lvrx5xrdw2fvz537f7alyak3cym2x/spack-src/libgfortran/io/transfer.c:1328
#1 0x14c3441eb27d in require_type
   at /tmp/ubuntu/spack-stage/spack-stage-gcc-9.2.0-ss6lvrx5xrdw2fvz537f7alyak3cym2x/spack-src/libgfortran/io/transfer.c:1320
#2 0x14c3441eb27d in formatted_transfer_scalar_write
   at /tmp/ubuntu/spack-stage/spack-stage-gcc-9.2.0-ss6lvrx5xrdw2fvz537f7alyak3cym2x/spack-src/libgfortran/io/transfer.c:1945
#3 0x14c3441eb64b in formatted_transfer
   at /tmp/ubuntu/spack-stage/spack-stage-gcc-9.2.0-ss6lvrx5xrdw2fvz537f7alyak3cym2x/spack-src/libgfortran/io/transfer.c:2335
#4 0x14c3441e7ee5 in wrap_scalar_transfer
   at /tmp/ubuntu/spack-stage/spack-stage-gcc-9.2.0-ss6lvrx5xrdw2fvz537f7alyak3cym2x/spack-src/libgfortran/io/transfer.c:2369
#5 0x14c3441e7ee5 in wrap_scalar_transfer
   at /tmp/ubuntu/spack-stage/spack-stage-gcc-9.2.0-ss6lvrx5xrdw2fvz537f7alyak3cym2x/spack-src/libgfortran/io/transfer.c:2346
#6 0x4c56ef in read_grid_menu
   at /home/ubuntu/tutorial/Code.GC-classic/GeosCore/input_mod.F:1027
#7 0x4cb941 in __input_mod_MOD_read_input_file
   at /home/ubuntu/tutorial/Code.GC-classic/GeosCore/input_mod.F:174
#8 0x48f56c in geos_chem
   at /home/ubuntu/tutorial/Code.GC-classic/GeosCore/main.F:386
#9 0x4033c6 in ???
#10 0x14c342f8eb96 in ???
#11 0x4033f9 in ???
#12 0xffffffffffffffff in ???
```
Please include the following:
CopyRunDirs.input.txt
HEMCO_Config.rc.txt
HISTORY.rc.txt
input.geos.txt
lastbuild.txt
log.dryrun.txt
I am new to running GEOS-Chem, and just set up an AWS account to try to follow the tutorial (https://cloud-gc.readthedocs.io/en/latest/chapter02_beginner-tutorial/quick-start.html). However, I noticed that the screen shots and options in the tutorial are not the same as what I see on AWS's website.
For example:
There are several newer versions of the GEOS-Chem tutorial in the AMI pull-down. Should we be using the one in the tutorial, or the latest version I see on AWS? Or does it matter? And if it matters, do newer versions require different computing resources than those suggested in the older version of the tutorial?
The tutorial's Choose an Instance Type step suggests 'r5.large'. Although this is an option in the current AWS interface, the information does not look the same on AWS as in the tutorial screenshot. The default on AWS is 't2.micro', which is 'Free tier eligible', but 'r5.large' is listed at 0.126 USD/hour. Are we charged to run the tutorial if we choose the suggested r5.large, or is this just for longer runs using more computing time? Please advise.
I would appreciate updated screenshots or tutorial information that matches the interface we now see on AWS, so I know whether I am 'doing it right' as I learn for the first time (or even a note that says 'updates don't matter, please follow the directions even if the screenshots no longer match').
Thank you for all of your time and effort- this is a great resource.
I am collaborating with a team of health researchers, and for our project they have requested daily near-surface PM2.5 data related to 'all source' vs 'no fires'- for reference, what we are doing is similar to Liu et al., 2017, Epidemiology: 10.1097/eDe.0000000000000556, who ran similar experiments for western North America.
I had a few questions about setting up production runs on AWS (GEOS-Chem Classic, v14, the main branch as of Dec, 2022) to generate the PM2.5 data.
Our eventual needs: ~0.5 degree spatial resolution over three regions (the continental US, tropical South America (Brazil), and a smaller area in Southeast Asia), daily PM2.5, for 2018, 2019, 2020, and 2021. We will need an 'all source' run as well as a 'no fires' run for each of these regions. (We are funds-limited, so we need to keep costs below roughly $10k.)
What is the most sensible/efficient way to set this up?
I did some testing on AWS with various c5 instance types/sizes, and it looks like running a global 4x5 degree (c5.9xlarge) to save boundary conditions, then running nested regions (c5.9xlarge or c5.12xlarge), will be the fastest/cheapest way to go, but I'd like some input please.
Can we run GEOS-Chem using data on AWS S3 from Jan 2018 through Jan 1, 2022?
The health data extends to the end of 2021, but if I set the final day of the dry run to Jan 1, 2022, I get errors (it looks like there is no Jan 1 HEMCO data?).
In the GEOS-Chem Classic v14, can I select the lat/lon bounds of the output boundary conditions (we don't need the Southern Ocean/Antarctica for example) to decrease boundary condition file size? I see this was an option in v12 (someone asked a question about running nested regions and this issue came up), but now the HISTORY file has no lat/lon selection option for the boundary condition collection:
```
#==============================================================================
# %%%%% THE BoundaryConditions COLLECTION %%%%%
#
# GEOS-Chem boundary conditions for use in nested grid simulations
#
# Available for all simulations
#==============================================================================
BoundaryConditions.template:   '%y4%m2%d2_%h2%n2z.nc4',
BoundaryConditions.frequency:  00000000 030000
BoundaryConditions.duration:   00000001 000000
BoundaryConditions.mode:       'instantaneous'
BoundaryConditions.fields:     'SpeciesBC_?ADV? ',
::
```
What is the latest month/year of complete meteorology etc data available on AWS s3?
Thanks for your help and support- I appreciate it.
I was able to use Git Bash from MS Windows 7 to log into the AWS instance. However, on Windows there are a couple of assumptions and extra steps:
(1) Git-Bash on Windows assumes that your home directory is C:/Users/YOUR-WINDOWS-NAME, where YOUR-WINDOWS-NAME is your screen name on Windows.
(2) Git-Bash on Windows assumes that your SSH keys are stored by default in C:/Users/YOUR-WINDOWS-NAME/.ssh, so this is the place to put the private key that you created to log into AWS.
(3) Git-Bash on Windows requires that this .bashrc be placed in your home directory (i.e. C:/Users/YOUR-WINDOWS-NAME):
```
#======================================================
# Set personal preferences (feel free to edit)
#======================================================

# Change to home directory
cd C:/Users/YOUR-WINDOWS-NAME

# Name of the private key you created for AWS
aws_priv_key=C:/Users/YOUR-WINDOWS-NAME/.ssh/YOUR-AWS-PRIVATE-KEY

# Set prompt (optional)
PS1="\[\e[1;93m\][YOUR-WINDOWS-NAME \W]$\[\e[0m\] "

# Launch an AWS session with the AWS private key you set up
alias login_aws="ssh -XA -i ${aws_priv_key} "

#======================================================
# Start the ssh agent and add your AWS private key
#======================================================
env=~/.ssh/agent.env

agent_load_env () { test -f "$env" && . "$env" >| /dev/null ; }

agent_start () {
    (umask 077; ssh-agent >| "$env")
    . "$env" >| /dev/null ; }

agent_load_env

# agent_run_state: 0=agent running w/ key; 1=agent w/o key; 2=agent not running
agent_run_state=$(ssh-add -l >| /dev/null 2>&1; echo $?)

if [ ! "$SSH_AUTH_SOCK" ] || [ $agent_run_state = 2 ]; then
    agent_start
    ssh-add ${aws_priv_key}
elif [ "$SSH_AUTH_SOCK" ] && [ $agent_run_state = 1 ]; then
    ssh-add ${aws_priv_key}
fi

unset env
```
This will make sure to forward the private key to your AWS instance. Replace YOUR-WINDOWS-NAME with your Windows screen name and YOUR-AWS-PRIVATE-KEY with the name of the private key you made to log in to AWS.
(4) Make sure to add the public key that corresponds to your AWS private key to GitHub, or else you won't be able to clone repos from there.