team-ocean / veros
The versatile ocean simulator, in pure Python, powered by JAX.
Home Page: https://veros.readthedocs.io
License: MIT License
Hey all,
I have run into an issue when trying to parallelize the new version of VEROS on a cluster.
The problem arises when using the PETSc linear solver, which is the default when running on CPU with more than one process.
The problem is a division by zero in the _petsc_solver function, on the line:
rel_residual = residual_norm / rhs_norm
The only way I found around this is to explicitly set which solver to use in the batch file:
export VEROS_LINEAR_SOLVER=scipy
before running the resubmit command.
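For reference, a hedged sketch of how the failing line could be guarded (the function and argument names mirror the report; the zero-RHS fallback is my assumption, not Veros' actual fix):

```python
# Hypothetical guard for the division by zero in _petsc_solver's
# rel_residual = residual_norm / rhs_norm (names taken from the report).
def relative_residual(residual_norm, rhs_norm):
    if rhs_norm == 0.0:
        # With an all-zero right-hand side there is nothing to normalise by;
        # fall back to the absolute residual instead of dividing by zero.
        return residual_norm
    return residual_norm / rhs_norm
```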
The setups I have tried that show the same problem are ACC_channel, ACC_basic, ACC_sector, and the North Atlantic setup.
The cluster uses slurm, and this is what is written in the batch file:
#!/bin/bash -l
#SBATCH -p mycluster
#SBATCH -A myaccount
#SBATCH --job-name=veros_mysetup
#SBATCH --time=23:59:59
#SBATCH --constraint=v1
#SBATCH --nodes=1
#SBATCH --ntasks=16
#SBATCH --cpus-per-task=1
#SBATCH --threads-per-core=1
#SBATCH --exclusive
export OMP_NUM_THREADS=2
export VEROS_LINEAR_SOLVER=scipy
veros resubmit -i acc -n 1 -l 31536000 \
  -c "srun --mpi=pmi2 -- veros run acc_sector.py -b jax --float-type float32 -n 4 4" \
  --callback "sbatch veros_batch.sh"
I have also tried changing the number of tasks, with no change.
The two versions of Veros I have tried are:
veros/040422_cpu_py3.9.10
veros/240322_cpu_py3.9.10
Hope this helps and is the information you need,
Cheers,
Rasmus
Just FYI: the link to the high-quality simulation GIF in the README is broken.
I've installed Veros on Ubuntu 14.04 as per the instructions. When I try to run the eady.py example, I get a failure that leads to an abort and a core dump:
~/veros/setup/eady$ python eady.py
Setting up everything
Initializing streamfunction method
determining number of land masses
Land mass and perimeter
0 5 10 15 20 25 30 35
35 111111111111111111111111111111111111
34 111111111111111111111111111111111111
33 ************************************
32 000000000000000000000000000000000000
31 000000000000000000000000000000000000
30 000000000000000000000000000000000000
29 000000000000000000000000000000000000
28 000000000000000000000000000000000000
27 000000000000000000000000000000000000
26 000000000000000000000000000000000000
25 000000000000000000000000000000000000
24 000000000000000000000000000000000000
23 000000000000000000000000000000000000
22 000000000000000000000000000000000000
21 000000000000000000000000000000000000
20 000000000000000000000000000000000000
19 000000000000000000000000000000000000
18 000000000000000000000000000000000000
17 000000000000000000000000000000000000
16 000000000000000000000000000000000000
15 000000000000000000000000000000000000
14 000000000000000000000000000000000000
13 000000000000000000000000000000000000
12 000000000000000000000000000000000000
11 000000000000000000000000000000000000
10 000000000000000000000000000000000000
9 000000000000000000000000000000000000
8 000000000000000000000000000000000000
7 000000000000000000000000000000000000
6 000000000000000000000000000000000000
5 000000000000000000000000000000000000
4 000000000000000000000000000000000000
3 000000000000000000000000000000000000
2 ************************************
1 222222222222222222222222222222222222
0 222222222222222222222222222222222222
0 5 10 15 20 25 30 35
solving for boundary contribution by island 0
solving for boundary contribution by island 1
Cannot load library: /usr/lib/libbh_ve_openmp.so: undefined symbol: _ZN7bohrium4jitk17write_source2fileERKSsRKN5boost10filesystem4pathEmS2_b
terminate called after throwing an instance of 'std::runtime_error'
what(): ConfigParser: Cannot load library
Aborted (core dumped)
Looking at the core dump doesn't seem to help, though I may be doing this wrong:
gdb ~/path_to_python core
[New LWP 3825]
[New LWP 3828]
[New LWP 3830]
[New LWP 3829]
Core was generated by `python eady.py'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007f4ca0f4ac37 in ?? ()
(gdb) bt
#0 0x00007f4ca0f4ac37 in ?? ()
#1 0x00007f4ca0f4e028 in ?? ()
#2 0x0000000000000020 in ?? ()
#3 0x0000000000000000 in ?? ()
Any suggestions?
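An undefined-symbol error like this usually means the Bohrium backend library and the runtime loading it were built against different versions. One way to isolate the load failure from Veros (a diagnostic sketch of my own, not an official tool) is to dlopen the library directly via ctypes and inspect the loader error:

```python
import ctypes

def try_load(libpath):
    """Attempt to dlopen a shared library and report the loader error.

    Returns True on success; prints the OSError (e.g. the undefined
    symbol message) and returns False on failure.
    """
    try:
        ctypes.CDLL(libpath)
        return True
    except OSError as exc:
        print(exc)
        return False

# e.g. try_load("/usr/lib/libbh_ve_openmp.so")
```

If this reproduces the undefined-symbol message, the mismatch is between the installed Bohrium components themselves rather than anything in Veros.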
I was not able to find out the reason behind a resubmission issue with the job scheduler, e.g. with:
veros-resubmit -i acc.lowres -n 50 -l 62208000 -c "python acc.py -b bohrium -v debug" --callback "/usr/bin/sbatch /groups/ocean/nutrik/veros_cases/paper/acc/veros_batch.sh"
Although jobs with a run length of up to 29 days are resubmitted fine, those with a longer run length are not resubmitted, and no errors or messages are reported.
In fact, jobs are successfully resubmitted without the scheduler (--callback "./veros_batch.sh") for any run length.
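For scale (assuming the -l argument is a run length in seconds, as the command above suggests): the failing run length of 62208000 s is 720 days, far past the reported ~29-day threshold, which makes me suspect a scheduler-side time limit rather than Veros itself. A trivial conversion:

```python
SECONDS_PER_DAY = 86400

def seconds_to_days(run_length_seconds):
    # Convert a veros-resubmit -l run length (given in seconds) to days.
    return run_length_seconds / SECONDS_PER_DAY
```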
Necessary steps:
- move assets from git lfs to external webspace and handle dynamical asset download
- clean up setup.py to comply with best practices
Hi Dion,
Nice work with Veros! I skimmed through the docs and your slides from AMS, and I have a few questions:
I am looking at your benchmarks for different problem sizes, and one thing confuses me. For 1e7 elements you report, for example for the Veros NumPy backend, 65 s on a 4-core system and 90 s on a 24-core system + GPU. Am I reading this correctly? It looks as if performance gets worse with more cores, but I cannot imagine that would be the case.
So far, we use a home-brew testing suite with limited flexibility and robustness. Ideally, we would want a testing suite that:
- is based on pytest
- reports coverage via codecov
I am trying to run the global_flexible setup from Veros' setup gallery.
Run script:
#!/bin/bash -l
#
#SBATCH -p aegir
#SBATCH -A ocean
#SBATCH --job-name=flexdeg
#SBATCH --time=23:59:59
#SBATCH --constraint=v1
#SBATCH --nodes=2
#SBATCH --ntasks=32
#SBATCH --cpus-per-task=1
#SBATCH --exclusive
##SBATCH --mail-type=ALL
##SBATCH --mail-user=<REDACTED>
##SBATCH --output=slurm.out
export OMP_NUM_THREADS=1
module load veros/23052019
srun -v --mpi=pmi2 --kill-on-bad-exit python -m mpi4py global_flexible.py -n 8 4 -b bohrium >& veros_run.log
and I am getting an MD5 checksum mismatch for the forcing & bathymetry files.
Veros log file:
srun: defined options for program `srun'
srun: --------------- ---------------------
srun: user : `nutrik'
srun: uid : 16001
srun: gid : 16000
srun: cwd : /lustre/hpc/ocean/nutrik/veros_cases/global_flexible
srun: ntasks : 32 (set)
srun: cpus_per_task : 1
srun: nodes : 2 (set)
srun: jobid : 13466356 (default)
srun: partition : default
srun: profile : `NotSet'
srun: job name : `2deg'
srun: reservation : `(null)'
srun: burst_buffer : `(null)'
srun: wckey : `(null)'
srun: cpu_freq_min : 4294967294
srun: cpu_freq_max : 4294967294
srun: cpu_freq_gov : 4294967294
srun: switches : -1
srun: wait-for-switches : -1
srun: distribution : unknown
srun: cpu_bind : default (0)
srun: mem_bind : default (0)
srun: verbose : 1
srun: slurmd_debug : 0
srun: immediate : false
srun: label output : false
srun: unbuffered IO : false
srun: overcommit : false
srun: threads : 60
srun: checkpoint_dir : /var/slurm/checkpoint
srun: wait : 0
srun: nice : -2
srun: account : (null)
srun: comment : (null)
srun: dependency : (null)
srun: exclusive : false
srun: bcast : false
srun: qos : (null)
srun: constraints : mincpus-per-node=1 mem-per-cpu=1024M
srun: geometry : (null)
srun: reboot : yes
srun: rotate : no
srun: preserve_env : false
srun: network : (null)
srun: propagate : NONE
srun: prolog : (null)
srun: epilog : (null)
srun: mail_type : NONE
srun: mail_user : (null)
srun: task_prolog : (null)
srun: task_epilog : (null)
srun: multi_prog : no
srun: sockets-per-node : -2
srun: cores-per-socket : -2
srun: threads-per-core : -2
srun: ntasks-per-node : -2
srun: ntasks-per-socket : -2
srun: ntasks-per-core : -2
srun: plane_size : 4294967294
srun: core-spec : NA
srun: power :
srun: remote command : `python -m mpi4py global_flexible.py -n 8 4 -b bohrium'
srun: launching 13466356.0 on host node172, 16 tasks: [0-15]
srun: launching 13466356.0 on host node173, 16 tasks: [16-31]
srun: route default plugin loaded
srun: Node node172, 16 tasks started
srun: Node node173, 16 tasks started
WARNING: Error in initializing MVAPICH2 ptmalloc library.Continuing without InfiniBand registration cache support.
2019-05-28 16:26:47.708 | INFO | veros.tools.assets:get_asset:73 - Downloading asset ETOPO5_Ice_g_gmt4.nc ...
[the same "Downloading asset ETOPO5_Ice_g_gmt4.nc" line is repeated once per MPI rank, 32 times in total]
2019-05-28 16:26:48.468 | INFO | veros.tools.assets:get_asset:73 - Downloading asset forcing_1deg_global_interpolated.nc ...
2019-05-28 16:26:48.496 | INFO | veros.tools.assets:get_asset:73 - Downloading asset forcing_1deg_global_interpolated.nc ...
2019-05-28 16:26:48.583 | INFO | veros.tools.assets:get_asset:73 - Downloading asset forcing_1deg_global_interpolated.nc ...
2019-05-28 16:26:48.595 | INFO | veros.tools.assets:get_asset:73 - Downloading asset forcing_1deg_global_interpolated.nc ...
2019-05-28 16:26:48.606 | INFO | veros.tools.assets:get_asset:73 - Downloading asset forcing_1deg_global_interpolated.nc ...
2019-05-28 16:26:48.611 | INFO | veros.tools.assets:get_asset:73 - Downloading asset forcing_1deg_global_interpolated.nc ...
Traceback (most recent call last):
File "/groups/ocean/software/python/gcc/3.6.7/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/groups/ocean/software/python/gcc/3.6.7/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/groups/ocean/software/mpi4py_mvapich231/gcc/3.0.1/lib/python3.6/site-packages/mpi4py/__main__.py", line 7, in <module>
main()
File "/groups/ocean/software/mpi4py_mvapich231/gcc/3.0.1/lib/python3.6/site-packages/mpi4py/run.py", line 196, in main
run_command_line(args)
File "/groups/ocean/software/mpi4py_mvapich231/gcc/3.0.1/lib/python3.6/site-packages/mpi4py/run.py", line 47, in run_command_line
run_path(sys.argv[0], run_name='__main__')
File "/groups/ocean/software/python/gcc/3.6.7/lib/python3.6/runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "/groups/ocean/software/python/gcc/3.6.7/lib/python3.6/runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "/groups/ocean/software/python/gcc/3.6.7/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "global_flexible.py", line 16, in <module>
DATA_FILES = veros.tools.get_assets('global_flexible', os.path.join(BASE_PATH, 'assets.yml'))
File "/lustre/hpc/ocean/software/veros/repo23052019/veros/tools/assets.py", line 81, in get_assets
return {key: get_asset(val['url'], val.get('md5', None)) for key, val in assets.items()}
File "/lustre/hpc/ocean/software/veros/repo23052019/veros/tools/assets.py", line 81, in <dictcomp>
return {key: get_asset(val['url'], val.get('md5', None)) for key, val in assets.items()}
File "/lustre/hpc/ocean/software/veros/repo23052019/veros/tools/assets.py", line 77, in get_asset
raise AssetError('Mismatching MD5 checksum on asset %s' % target_filename)
veros.tools.assets.AssetError: Mismatching MD5 checksum on asset forcing_1deg_global_interpolated.nc
srun: Complete job step 13466356.0 received
slurmstepd: error: *** STEP 13466356.0 ON node172 CANCELLED AT 2019-05-28T16:26:49 ***
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
srun: Complete job step 13466356.0 received
srun: Received task exit notification for 16 tasks (status=0x0009).
srun: error: node172: tasks 0-15: Killed
srun: Terminating job step 13466356.0
srun: Complete job step 13466356.0 received
srun: Received task exit notification for 16 tasks (status=0x0009).
srun: error: node173: tasks 16-31: Killed
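One plausible cause (my guess, not confirmed): every MPI rank downloads the asset concurrently into the same target path, as the repeated "Downloading asset" lines suggest, so a rank can checksum a half-written file. A sketch of a write that can never be observed half-finished, using a temporary file plus an atomic rename (fetch_bytes is a hypothetical stand-in for the actual HTTP download):

```python
import hashlib
import os
import tempfile

def fetch_asset_atomically(target_path, fetch_bytes, expected_md5=None):
    """Write an asset via a temp file and an atomic rename, so concurrent
    readers never see a partially written file at target_path."""
    data = fetch_bytes()
    if expected_md5 is not None and hashlib.md5(data).hexdigest() != expected_md5:
        raise ValueError("Mismatching MD5 checksum on asset %s" % target_path)
    # Write into the same directory so the final rename stays on one filesystem.
    dirname = os.path.dirname(os.path.abspath(target_path))
    fd, tmp_path = tempfile.mkstemp(dir=dirname)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp_path, target_path)  # atomic on POSIX filesystems
    except Exception:
        os.unlink(tmp_path)
        raise
```

In an MPI setting one would additionally let only rank 0 download and have the other ranks wait at a barrier before opening the file.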
When looking at our benchmarks, I noticed that the NumPy backend loses some ground against Fortran when using MPI. We should try to understand why that is.
The licence stated in the discussion paper is GPL, while the licence file in the repository is MIT. If you want the code to be licensed, for instance, under GPL >= 3.0, the LICENSE file should be replaced, and it is best to be precise in the paper, e.g. "... code is available under the GNU General Public License version 3.0 or, at your option, any later version."
By the way, it is fantastic that this code is developed and shared (as free software)—an important contribution to ocean modelling and oceanography!
Hi Dion,
When I change the x_origin of the Veros grid (91. --> 181.) in setup/global_1deg/global_one_degree.py, the resulting land mass and perimeter map (see below) contains New Zealand as a lake (or sea), and the model returns the following:
/groups/ocean/software/bohrium/gcc/14112017/lib64/python2.7/site-packages/bohrium/array_create.py:167: RuntimeWarning: Encountering an operation not supported by Bohrium. It will be handled by the original NumPy.
return numpy.array(ary, dtype=dtype, copy=copy, order=order, subok=subok, ndmin=ndmin, fix_biclass=False)
/lustre/hpc/ocean/nutrik/veros/veros/core/external/solve_poisson.py:75: RuntimeWarning: divide by zero encountered in divide
Z[2:-2, 2:-2] = np.where(Y != 0., 1. / Y, 1.)
Traceback (most recent call last):
File "global_one_degree.py", line 273, in <module>
simulation.setup()
File "/lustre/hpc/ocean/nutrik/veros/veros/veros.py", line 234, in setup
external.streamfunction_init(self)
File "/lustre/hpc/ocean/nutrik/veros/veros/decorators.py", line 50, in veros_method_wrapper
res = function(*args, **kwargs)
File "/lustre/hpc/ocean/nutrik/veros/veros/core/external/streamfunction_init.py", line 166, in streamfunction_init
solve_poisson.initialize_solver(vs)
File "/lustre/hpc/ocean/nutrik/veros/veros/decorators.py", line 50, in veros_method_wrapper
res = function(*args, **kwargs)
File "/lustre/hpc/ocean/nutrik/veros/veros/core/external/solve_poisson.py", line 18, in initialize_solver
preconditioner = _jacobi_preconditioner(vs, matrix)
File "/lustre/hpc/ocean/nutrik/veros/veros/decorators.py", line 50, in veros_method_wrapper
res = function(*args, **kwargs)
File "/lustre/hpc/ocean/nutrik/veros/veros/core/external/solve_poisson.py", line 76, in _jacobi_preconditioner
return scipy.sparse.dia_matrix((Z.flatten(), 0), shape=(Z.size, Z.size)).tocsr()
File "/groups/ocean/software/python/gcc/2.7.13/lib/python2.7/site-packages/scipy/sparse/base.py", line 764, in tocsr
return self.tocoo(copy=copy).tocsr(copy=False)
File "/groups/ocean/software/python/gcc/2.7.13/lib/python2.7/site-packages/scipy/sparse/dia.py", line 354, in tocoo
mask &= (self.data != 0)
File "ufuncs.pyx", line 594, in ufuncs._handle__array_ufunc__ (/groups/ocean/software/tarballs/bohrium/bohrium14112017/build/bridge/npbackend/ufuncs.c:12150)
File "bhary.pyx", line 105, in bhary.fix_biclass_wrapper.inner (/groups/ocean/software/tarballs/bohrium/bohrium14112017/build/bridge/npbackend/bhary.c:3381)
File "ufuncs.pyx", line 161, in ufuncs.Ufunc.__call__ (/groups/ocean/software/tarballs/bohrium/bohrium14112017/build/bridge/npbackend/ufuncs.c:5755)
AttributeError: 'tuple' object has no attribute 'shape'
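As an aside, the RuntimeWarning at the top comes from np.where(Y != 0., 1. / Y, 1.) evaluating 1. / Y for every element, including the zeros, before np.where selects. A warning-free equivalent (a sketch of the same preconditioner diagonal, not Veros' actual code) divides only where the denominator is non-zero:

```python
import numpy as np

def safe_reciprocal(Y):
    # Start from the fallback value 1. everywhere, then divide only where
    # Y is non-zero -- no divide-by-zero warning is ever emitted.
    Z = np.ones_like(Y)
    np.divide(1.0, Y, out=Z, where=(Y != 0.0))
    return Z
```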
A similar story happens with the setup/wave_propagation case, or the model cannot initialise the barotropic stream function solver and returns the following error:
File "..../veros/core/external/streamfunction_init.py", line 69, in streamfunction_init
raise RuntimeError("found no starting point for line integral")
Do you know how the algorithms behind the "Land mass and perimeter" setup and streamfunction_init work?
Land mass and perimeter
0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100 105 110 115 120
163 11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
162 11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
161 ****1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
160 000*******1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
159 000*2********1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
158 *******1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
157 11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
156 11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
155 11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
154 11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
153 11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
152 11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
151 11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
150 11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
149 11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
148 111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111*1111111111111111111111
147 11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111*11111111***111111111111111111111
146 11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111*****11******11111111111111111111
145 1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111**000*****66****111111111111111111
144 11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111*0000000******1111111111111111111
143 11111111111111111111111111111111111111111111111111111111111111111111111111111111111111*******00000000000*11111111111111111111
142 11111111111111111111111111111111111111111111111111111111111111111111****11111111111****00000000000000000*11111111111111111111
141 1111111111111111111111111111111111111111111111111111111111111111111**0**11111*******00000000000000000000****11111111111111***
140 111111111111111111111111111111111111111111111111111111*****************11111**00000000000000000000000000000***1111111111***00
139 1111111111111111111111111111111111111111111111111111***000000000000**111111**00000000000000000000000000000000****1111111*0000
138 11111111111111111111111111111111111111111111111111***0000000000000**1111111*00000000000000000000000000000000****111**1***0000
137 1111111111111111111111111111111111111111111111111**000000000000000*1111111**00000000000000000000000000000****1111******000000
136 11111111111111111111111111111111111111111111111111******0000000000*1111111*00000000000000000000000000000**1111****00000000000
135 1111111111111111111111111111111111111111111111111111111*0000000000**1111***00000000000000000000000000000*1*****00000000000000
134 1111111111111111111111111111111111111111111111111111111**0000000000*111**0000000000000000000000000000000***000000000000000000
133 11111111111111111111111111111111111111111111111111111111*0000000000*11**00000000000000000000000000000000000000000000000000000
132 11111111111111111111111111111111111111111111111111111*11*0000000000****000000000000000000000000000000000000000000000000000000
131 11111111111111111111111111111111111111111111111111111*11*00000000000000000000000000000000000000000000000000000000000000000000
130 1111111111111111111111111111111111111111111111111111**1**00000000000000000000000000000000000000000000000000000000000000000000
129 111111111111111111111111111111111111111111111111111*****000000000000000000000000000000000000000000000000000000000000000000000
128 111111111111111111111111111111111111111111111111111****0000000000000000000000000000000000000000000000000000000000000000000000
127 11111111111111111111111111111111111111111111111111***8***00000000000000000000000000000000000000000000000000000000000000000000
126 1111111111111111111111111111111111111111111111111**0*888****00000000000000000000000000000000000000000000000000000000000000000
125 11111111111111111111111111111111111111111111111***0**888888*00000000000000000000000000000000000000000000000000000000000000000
124 1111111111111111111111111111111111111111111*****000*8888****00000000000000000000000000000000000000000000000000000000000000000
123 111111111111111111111111111111111111111111**0000000*8****00000000000000000000000000000000000000000000000000000000000000000000
122 11111111111111111111111111111111111111111**00000000*88*0000000000000000000000000000000000000000000000000000000000000000000000
121 1111111111111111111111111111111111111111**00000000**88*0000000000000000000000000000000000000000000000000000000000000000000000
120 1111111111111111111111111111111111111111**00000****888*0000000000000000000000000000000000000000000000000000000000000000000000
119 11111111111111111111111111111111111***111**0000*8*888**0000000000000000000000000000000000000000000000000000000000000000000000
118 1111111111111111111111111111111111**0*1111******88888*00000000000000000000000000000000000000000000000000000000000000000000000
117 111111111111111111111111111111111**00*111***888888888*00000000000000000000000000000000000000000000000000000000000000000000000
116 111111111111111111111111111111111*000*111**888888*****00000000000000000000000000000000000000000000000000000000000000000000000
115 111111111111111111111111111111111**00*1***88888***000000000000000000000000000000000000000000000000000000000000000000000000000
114 1111111111111111111111111111111111*00****888****00000000000000000000000000000000000000000000000000000000000000000000000000000
113 1111111111111111111111111111111111*00000**8**00000000000000000000000000000000000000000000000000000000000000000000000000000000
112 1111111111111111111111111111111111*000000***000000000000000000000000000000000000000000000000000000000000000000000000000000000
111 1111111111111111111111111111111111*000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
110 1111111111111111111111111111111111*000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
109 111111111111111111111111111111111**000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
108 11111111111111111111111111111111**0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
107 11111111111111111111111111111111***000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
106 1111111111111111111111111111111*11*000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
105 111111111111111111111111111111**1**000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
104 111111111111111111111111111*****1*0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
103 1111111111111111111111111***000***0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
102 ****111111111111111**11***00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000***0000000
101 000**1111111111111**111*00000000***00000000000000000000000000000000000000000000000000000000000000000000000000000000*4*0000000
100 0000**111111111111**11**0000000**6*00000000000000000000000000000000000000000000000000000000000000000000000000000000***0000000
99 00000*1111111111111****00000000*66*000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
98 00000*11111111111111**000000000*66*000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
97 00000***1111111111111*000000000*66*000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
96 0000000***11111111111*000000000*66***0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
95 000000000*11111111111*000000000*6666*0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
94 000000000*11*11111111*000000000******0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
93 000000000*11***111111*00000000**7*8**0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
92 000000000*11*0*11111**0000000**7**88**000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
91 000000000*11*0**111**0000000**7*****8**00000000000000000000000000000000000000000000000000000000000000000000000000000000000000
90 000000000*11**0**1**00000000*7**0***88*00000000000000000000000000000000000000000000000000000000000000000000000000000000000000
89 000000000**11**0***00000000**7*00*8888*00000000000000000000000000000000000000000000000000000000000000000000000000000000000000
88 000000***0*111**0000000000**77**0***8**00000000000000000000000000000000000000000000000000000000000000000000000000000000000000
87 000000*1****111**00000000**7777*000***000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
86 000000*111**1111*0000000**77777*000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
85 000000***1111111*00000***77777**000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
84 00000000*1111111*000***7777777**0000000***00000000000000000000000000000000000000000000000000000000000000000000000000000000000
83 00000000**111111*000*7777777777*******0*9*00000000000000000000000000000000000000000000000000000000000000000000000000000000000
82 00000000*1*11111**00*777777777***0000*****00000000000000000000000000000000000000000000000000000000000000000000000000000000000
81 00000000****11111*00*77777777***0******1*******000000000000000000000000000000000000000000000000000000000000000000000000000000
80 00000000000*11111*****7777777**0000*00****2222*******000000000000000000000000000000000000000000000000000000000000000000000000
79 00000000000**1111111**7777777**000**0*******22222222****000000000000000000000000000000000000000000000000000000000000000000000
78 000000000000**1111*1***777777**0*00*0*3*444*22222222222***00000***00000000000000000000000000000000000000000000000000000000000
[ASCII map omitted: a ~125x78 character grid printed during streamfunction initialization, in which digits appear to label distinct land masses / regions (0, 1, 2, 5, 7, 8, 9), * marks boundary cells, and the bottom axis gives x-indices 0-120.]
Not knowing the numerics, and possibly not having read the docs carefully enough, it is unclear to me how to initialize the velocities in the model.
For a channel run, re-entrant in x, with an initial velocity of 0.1 m/s everywhere and no forcing, I tried the following in set_initial_conditions:
vs.u = update(vs.u, at[...], 0.1 * vs.maskU[..., None])
The velocity signal only lasts for one time step, and then it is gone. It does create small pressure perturbations that drive internal waves, but the mean flow of 0.1 m/s disappears immediately. Conversely, the initial conditions have psi = 0 everywhere, and then immediately on the next time step there is a stream function, but if the units are really m^3/s it is far too small.
Should I have initialized psi at the beginning instead of u, or in addition to u?
Possible through mpi4py.futures, but this will require some prototyping.
The dream:
>>> with MPIContext(n_proc=8) as ctx:
... sim = MySetup(context=ctx)
... sim.setup()
... sim.run()
>>> sim.state
<gathered state>
where workers only live inside the MPIContext and are destroyed afterwards.
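One way to prototype the lifecycle sketched above without MPI is a plain context manager around an executor. Everything named here is hypothetical: ThreadPoolExecutor stands in for mpi4py.futures.MPIPoolExecutor, and the class mirrors the "dream" snippet rather than any existing Veros API.

```python
from concurrent.futures import ThreadPoolExecutor

class WorkerContext:
    """Toy sketch of the proposed context-manager API (all names assumed)."""

    def __init__(self, n_proc):
        self.n_proc = n_proc
        self._pool = None

    def __enter__(self):
        # workers come to life only when the block is entered
        self._pool = ThreadPoolExecutor(max_workers=self.n_proc)
        return self

    def __exit__(self, *exc_info):
        # ... and are destroyed when it is left
        self._pool.shutdown(wait=True)
        self._pool = None
        return False

    def map(self, fn, items):
        # distribute work across the live workers
        return list(self._pool.map(fn, items))
```

With mpi4py installed, the executor line is the only part that would need swapping; a setup object could then hold such a context and gather state on exit.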
This sounds awesome, but I (and I'm sure many others) don't want to go through the trouble of installing it just to see what it looks like.
How can I check the bathymetry of a specific region in the Veros 4-deg setup?
Looks pretty ugly in output time series. Probably some float roundoff shenanigans.
Possible candidates:
dask.array
Hi, everyone, I just stumbled into this project after watching this JuliaCon session and let me start by saying that I really enjoy its vision and scope!
I've been reading through the docs, and it seems like you don't have the capability of running nonhydrostatic LES; do I understand that correctly?
If so, do you envision implementing it soon? From the intro I get the feeling that the nonhydrostatic solver will be ported from pyOM2 soon, but I see no mention of LES closures. I ask because I think that if LES is possible with this package, it will open up many more research avenues, given the extensibility of the model and its ability to run on multiple GPUs.
Thanks!
This mostly amounts to adding canonical names as attributes to all variables (at least those relevant for output).
Veros is leaking memory, which is particularly noticeable in long-running low-resolution runs. This seems to be caused mostly by pyamg/pyamg#198 and to a lesser degree by bh107/bohrium#360.
As a workaround, it is advisable to re-start simulations every 10,000 time steps or so.
E.g. in global_1deg, data is just assumed to be on the same grid as the model setup. Ideally, we would have a smart reader for external data that validates this assumption, and possibly does some interpolation if the grids don't match.
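As a sketch of what that validation step could look like, here is a hypothetical grids_match helper (the name and tolerance are assumptions) that a smart reader might call on each coordinate axis before accepting external data, falling back to interpolation when it returns False:

```python
import math

def grids_match(model_coords, data_coords, tol=1e-6):
    """Return True if two 1-D coordinate axes agree within `tol`.

    Hypothetical helper: a smart reader could run this on the
    longitude/latitude axes of an external forcing file instead of
    silently assuming the data lies on the model grid.
    """
    if len(model_coords) != len(data_coords):
        return False
    return all(
        math.isclose(a, b, abs_tol=tol)
        for a, b in zip(model_coords, data_coords)
    )
```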
Currently, diagnostics use the warning log level. It would be more idiomatic to use a custom level name (that may or may not have the same numerical level as warning).
Hello,
1- I have the boundary (domain) and bathymetry of my study area as ".txt" files. How can I add them to Veros? I also have sea level and wind time series data in ".txt" format. How can I use them in Veros?
2- Unfortunately, I could not find an explanation of the model output on the website. How can I plot the U and V components of the currents and the sea level?
3- Another thing that confuses me on the website: which scripts (please give their names) should I change to reproduce the physical conditions of my region?
Thank you so much
Hi Dion,
It seems impossible to make two parallel/standalone Veros runs on two GPUs on the same node. The error message is below. Is it something to do with OpenCL keys?
Do you think it would work if I had two separate installations of Veros and/or Bohrium and used them for individual runs?
Time step took 4.14e+00s
Current iteration: 17
build program: binary cache hit (key: 219cf206d832af2614aabaa6095b2a6d)
build program: start
build program: completed, success
pyopencl-invoker-cache-v1: in mem cache hit [key=1fc007e39493fed6f5e0c45672144578c269a48fd12e4bc28dc2ce3b7c1dc753]
build program: binary cache hit (key: 219cf206d832af2614aabaa6095b2a6d)
build program: start
build program: completed, success
pyopencl-invoker-cache-v1: in mem cache hit [key=1fc007e39493fed6f5e0c45672144578c269a48fd12e4bc28dc2ce3b7c1dc753]
Error code: -4
terminate called after throwing an instance of 'cl::Error'
what(): clEnqueueNDRangeKernel
Traceback (most recent call last):
File "/groups/ocean/software/veros/inst06032018/bin/veros-resubmit", line 11, in <module>
load_entry_point('veros', 'console_scripts', 'veros-resubmit')()
File "/groups/ocean/software/python/gcc/2.7.14/lib/python2.7/site-packages/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/groups/ocean/software/python/gcc/2.7.14/lib/python2.7/site-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/groups/ocean/software/python/gcc/2.7.14/lib/python2.7/site-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/groups/ocean/software/python/gcc/2.7.14/lib/python2.7/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/lustre/hpc/ocean/software/veros/repo06032018/veros/cli/veros_resubmit.py", line 81, in cli
resubmit(*args, **kwargs)
File "/lustre/hpc/ocean/software/veros/repo06032018/veros/cli/veros_resubmit.py", line 61, in resubmit
call_veros(veros_cmd, identifier, current_n, length_per_run)
File "/lustre/hpc/ocean/software/veros/repo06032018/veros/cli/veros_resubmit.py", line 46, in call_veros
raise RuntimeError("Run {} failed, exiting".format(n))
RuntimeError: Run 0 failed, exiting
This is going to be me talking to myself for a while to explore the feasibility of introducing distributed memory support to Veros.
All of these should be easy to implement with MPI.
I am worried that requiring all routines to support distributed execution would be too daunting for less experienced programmers. On top of that, we lose some interoperability with 3rd party libraries (e.g. SciPy).
If somewhat feasible, we could introduce a parameter to the @veros_method decorator:
@veros_method(dist_safe=False)
def mylocalmethod(vs):
...
Methods of this type could be executed on the main process only, but there are some challenges.
An alternative would be to simply throw an error when such a function is executed in a distributed context.
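That alternative could be prototyped with a small decorator. A minimal sketch, assuming a dist_safe flag on the decorator and a global process count (both hypothetical, not Veros' actual API):

```python
import functools

# hypothetical runtime state; in Veros this would live on a settings object
runtime = {"num_proc": 1}

def veros_method(dist_safe=True):
    """Sketch of the proposed dist_safe flag.

    Routines marked dist_safe=False refuse to run when more than one
    process is active, i.e. they simply throw an error in a distributed
    context instead of being gathered onto the main process.
    """
    def decorator(function):
        @functools.wraps(function)
        def wrapper(*args, **kwargs):
            if not dist_safe and runtime["num_proc"] > 1:
                raise RuntimeError(
                    f"{function.__name__} does not support distributed execution"
                )
            return function(*args, **kwargs)
        return wrapper
    return decorator

@veros_method(dist_safe=False)
def mylocalmethod(vs):
    # toy body standing in for a main-process-only routine
    return vs
```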
E.g., instead of introducing separate functions for global max, min, sum, zonal mean etc., we could have a generic reduction operator.
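A local-only sketch of such a generic reduction operator (the function name and dispatch-by-string design are assumptions, not Veros' actual API):

```python
# single dispatch table replacing separate global_sum / global_max / ... functions
_REDUCE_OPS = {
    "sum": sum,
    "max": max,
    "min": min,
    "mean": lambda values: sum(values) / len(values),
}

def global_reduce(values, op):
    """Reduce local values with a named operator.

    In a real distributed setting the local result would additionally be
    combined across processes (e.g. an MPI Allreduce with the matching op);
    only the local dispatch point is shown here.
    """
    try:
        reducer = _REDUCE_OPS[op]
    except KeyError:
        raise ValueError(f"unknown reduction {op!r}") from None
    return reducer(values)
```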
I only know of mpi4py, but there might be others. How difficult are they to install?
It seems like there are missing qnet reading & assignment statements in the set_initial_conditions routine of the global_1deg setup.
Two extra lines should be added:
qnet_data = self._read_forcing("q_net")
vs.qnet = update(vs.qnet, at[2:-2, 2:-2, :], -qnet_data * vs.maskT[2:-2, 2:-2, -1, npx.newaxis])
according to pyOM.
Core routine implementations should always give a reference directly in the code.
The goal is to get rid of the np injection magic in veros_methods, and to allow for specialized implementations based on the backend.
Introduce a backend module that implements all necessary functions, and dynamically dispatch the right function from the backend. Example:
def sum(arr, axis=None):
if rs.backend == 'bohrium':
return bh.sum(arr, axis=axis)
elif rs.backend == 'numpy':
return np.sum(arr, axis=axis)
...
Usage:
import veros.backend as vb
@veros_method
def my_parameterization(vs):
temp_sum = vb.sum(vs.temp)
Just opening this to encourage offering the pressure solver as an option alongside the stream function solver...
For the problems I do, I often want to specify the initial velocity (see #271), and apply a momentum source to the velocity (body forcing). The stream function formulation appears to need the initial conditions and the forcing to be explicitly separated into baroclinic and barotropic components. I think with some work and looking at the pyOM manual I could figure out how to do that, but it might be more natural for the model to simply figure it out for me by solving the pressure equation.
Thanks!
Currently, all core routines are modeled very closely after the corresponding pyOM routines, including variable names and code structure. Additionally, all gradients are calculated via index shifting and slicing. This makes the code hard to read and understand, and it has a performance impact, too, since temporary arrays cannot be freed until the (often overly long) routine has finished.
From the top of my head, possible enhancements include:
- an xarray-style shifting / gradient function instead of explicit index shifts
- rethinking the temporary flux_east / flux_north / ... arrays
- more descriptive variable names (fxa, temp, ...)
Hi there
Firstly, thanks for sharing this code!
I can run the acc and eady models out of the box which is great but when I come to run any of global_1deg, global_4deg, north_atlantic or wave_propagation models, I get the following type of error. Is this something that you have seen before?
(my_root) clim01|Tue Jan 30|00:08:52|veros-run> cd global_1deg/
(my_root) clim01|Tue Jan 30|00:08:56|global_1deg> python global_one_degree.py
/scale_akl_persistent/filesets/home/williamsjh/veros/veros/core/numerics.py:10: UserWarning: Special OpenCL implementations could not be imported
warnings.warn("Special OpenCL implementations could not be imported")
Traceback (most recent call last):
File "global_one_degree.py", line 10, in
DATA_FILES = veros.tools.get_assets("global_1deg", os.path.join(BASE_PATH, "assets.yml"))
File "/scale_akl_persistent/filesets/home/williamsjh/veros/veros/tools/assets.py", line 43, in get_assets
return {key: get_asset(val["url"], val.get("md5", None)) for key, val in assets.items()}
File "/scale_akl_persistent/filesets/home/williamsjh/veros/veros/tools/assets.py", line 43, in
return {key: get_asset(val["url"], val.get("md5", None)) for key, val in assets.items()}
File "/scale_akl_persistent/filesets/home/williamsjh/veros/veros/tools/assets.py", line 36, in get_asset
_download_file(url, target_path)
File "/scale_akl_persistent/filesets/home/williamsjh/veros/veros/tools/assets.py", line 48, in _download_file
with requests.get(url, stream=True, timeout=timeout) as response:
AttributeError: __exit__
(my_root) clim01|Tue Jan 30|00:09:02|global_1deg>
I have run wget on the source file for this example and it is there.
Thanks for any ideas, I'm a bit stuck!
Jonny
The introduction might be a bit too tacky / salesman-like. I think a slightly more factual tone might help.
Hello everyone! I have been trying to run one of the basic setups available in Veros, i.e. the global_1deg setup, on my PC. But each time I try to run it, a "killed" message shows up in the terminal. My PC has 8 GB of RAM; is that too little to run the setup?
I would also like to know what the output variable names actually mean. There are some variables, for example "bolus_depth" and "bolus_iso" from the global_4deg setup, which in spite of having the same coordinates and attributes (i.e., meridional transport) represent different things. I looked into the official documentation, but it only describes the model variables.
Thanks for any kind of help!
Hi there, me again!
Could you please let me know how to run with the force_overwrite setting? I can't seem to work it out!
Thanks
Jonny
We should enforce a consistent coding style, e.g. via flake8, or even a code formatter like black.
Hi Dion,
I cannot make Veros run on GPU.
It seems to be a problem with compiling the TDMA solver.
The Veros and Bohrium versions are up to date, i.e. cloned 1 day and 6 days ago, respectively.
Run script:
#!/bin/bash -l
export BH_STACK=opencl
veros-resubmit -i wp.05deg.cah -n 50 -l 31104000 -c "python wave_propagation.py -b bohrium -v debug" --callback "veros_gpu_run.sh"
Command line output:
Current iteration: 1
stopping integration at iteration 1
Waiting for lock wp.05deg.cah.0000.restart.h5 to be released
Timing summary:
setup time = 74.21s
main loop time = 1.24s
momentum = 0.67s
pressure = 0.00s
friction = 0.40s
thermodynamics = 0.00s
lateral mixing = 0.00s
vertical mixing = 0.00s
equation of state = 0.00s
EKE = 0.04s
IDEMIX = 0.00s
TKE = 0.51s
diagnostics and I/O = 0.00s
Traceback (most recent call last):
File "wave_propagation.py", line 413, in <module>
run()
File "/groups/ocean/software/python/gcc/2.7.14/lib/python2.7/site-packages/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/groups/ocean/software/python/gcc/2.7.14/lib/python2.7/site-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/groups/ocean/software/python/gcc/2.7.14/lib/python2.7/site-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/groups/ocean/software/python/gcc/2.7.14/lib/python2.7/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/lustre/hpc/ocean/software/veros/repo26012018/veros/tools/cli.py", line 49, in wrapped
run(*args, **kwargs)
File "wave_propagation.py", line 409, in run
simulation.run()
File "/lustre/hpc/ocean/software/veros/repo26012018/veros/veros.py", line 267, in run
momentum.momentum(self)
File "/lustre/hpc/ocean/software/veros/repo26012018/veros/decorators.py", line 50, in veros_method_wrapper
res = function(*args, **kwargs)
File "/lustre/hpc/ocean/software/veros/repo26012018/veros/core/momentum.py", line 76, in momentum
friction.implicit_vert_friction(vs)
File "/lustre/hpc/ocean/software/veros/repo26012018/veros/decorators.py", line 50, in veros_method_wrapper
res = function(*args, **kwargs)
File "/lustre/hpc/ocean/software/veros/repo26012018/veros/core/friction.py", line 82, in implicit_vert_friction
res, mask = utilities.solve_implicit(vs, kss, a_tri, b_tri, c_tri, d_tri, b_edge=b_tri_edge)
File "/lustre/hpc/ocean/software/veros/repo26012018/veros/decorators.py", line 50, in veros_method_wrapper
res = function(*args, **kwargs)
File "/lustre/hpc/ocean/software/veros/repo26012018/veros/core/utilities.py", line 52, in solve_implicit
return solve_tridiag(vs, a_tri, b_tri, c_tri, d_tri), water_mask
File "/lustre/hpc/ocean/software/veros/repo26012018/veros/decorators.py", line 50, in veros_method_wrapper
res = function(*args, **kwargs)
File "/lustre/hpc/ocean/software/veros/repo26012018/veros/core/numerics.py", line 257, in solve_tridiag
return tdma_opencl.tdma(a, b, c, d)
File "/lustre/hpc/ocean/software/veros/repo26012018/veros/core/special/tdma_opencl.py", line 58, in tdma
prg = compile_tdma(ret.shape[-1], bh.interop_pyopencl.type_np2opencl_str(a.dtype))
File "/lustre/hpc/ocean/software/veros/repo26012018/veros/core/special/tdma_opencl.py", line 35, in compile_tdma
""".format(sys_depth=sys_depth, dtype=dtype)
ValueError: Unknown format code 'd' for object of type 'str'
Traceback (most recent call last):
File "/groups/ocean/software/veros/inst26012018/bin/veros-resubmit", line 11, in <module>
load_entry_point('veros', 'console_scripts', 'veros-resubmit')()
File "/groups/ocean/software/python/gcc/2.7.14/lib/python2.7/site-packages/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/groups/ocean/software/python/gcc/2.7.14/lib/python2.7/site-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/groups/ocean/software/python/gcc/2.7.14/lib/python2.7/site-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/groups/ocean/software/python/gcc/2.7.14/lib/python2.7/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/lustre/hpc/ocean/software/veros/repo26012018/veros/cli/veros_resubmit.py", line 80, in cli
resubmit(*args, **kwargs)
File "/lustre/hpc/ocean/software/veros/repo26012018/veros/cli/veros_resubmit.py", line 60, in resubmit
call_veros(veros_cmd, identifier, current_n, length_per_run)
File "/lustre/hpc/ocean/software/veros/repo26012018/veros/cli/veros_resubmit.py", line 46, in call_veros
raise RuntimeError("Run {} failed, exiting".format(n))
RuntimeError: Run 0 failed, exiting
Implement a simple diagnostic that prints the current throughput (ratio of simulated time to real time) at regular intervals. This makes it easier to (i) estimate how long a simulation is going to take, and (ii) compare against other packages.
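A minimal sketch of such a diagnostic (class and method names are made up; the clock is injectable so the logic can be tested without waiting):

```python
import time

class ThroughputMonitor:
    """Report simulated-seconds-per-wall-second at regular intervals."""

    def __init__(self, report_interval=60.0, clock=time.monotonic):
        self.report_interval = report_interval  # wall-clock seconds between reports
        self.clock = clock
        self._last_wall = clock()
        self._sim_elapsed = 0.0

    def sample(self, simulated_seconds):
        """Call once per time step with the simulated step length.

        Returns the throughput ratio when a report is due, else None.
        """
        self._sim_elapsed += simulated_seconds
        now = self.clock()
        wall = now - self._last_wall
        if wall < self.report_interval:
            return None
        ratio = self._sim_elapsed / wall
        self._last_wall = now
        self._sim_elapsed = 0.0
        return ratio
```

A driver loop could then print e.g. "throughput: 720 model seconds per wall second" whenever sample returns a value.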
Currently, all settings and variables can be overridden at any time. This might cause subtle bugs, both from inside Veros core routines (e.g. when assigning directly to an array instead of updating its values), setups, or even higher-level code.
It may be worth considering protecting variables from being set to entirely new objects (e.g. by overloading __setitem__ in the Veros class). An equivalent check could be made for settings that cannot be safely changed after setup (this should at least warrant a warning).
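A minimal sketch of this kind of protection, here via __setattr__ on a toy state class (the class, the protected-name set, and the variable names are all assumptions for illustration):

```python
class ProtectedState:
    """Forbid rebinding of registered variables to new objects.

    Protected values can still be mutated in place (e.g. arr[...] = 0),
    but assigning an entirely new object to the attribute raises,
    catching the class of bugs described above.
    """

    _protected = frozenset({"u", "v", "temp"})  # assumed variable names

    def __setattr__(self, name, value):
        if name in self._protected and name in self.__dict__:
            raise AttributeError(
                f"variable {name!r} may only be updated in place"
            )
        super().__setattr__(name, value)
```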
Veros is using the top-level logger provided by logging. To play nicely with other packages, we should instead register a custom logger that offers more fine-grained control via the getLogger(__name__) pattern.
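The pattern itself is small; a sketch of what a Veros module could do (the "veros.diagnostics" name is illustrative of the resulting logger hierarchy):

```python
import logging

# per-module logger following the getLogger(__name__) pattern
logger = logging.getLogger("veros.diagnostics")
# stay silent unless the embedding application configures logging
logger.addHandler(logging.NullHandler())

def report_diagnostic(message):
    # library code logs through its own logger, never the root logger
    logger.info(message)
```

Applications can then silence or redirect just Veros output by configuring the "veros" logger, without touching their own handlers.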
Currently, set_grid
is expected to set the grid origin and spacings. In my experience it is usually more practical to set the grid points directly. We will have to figure out a clean way to do so without breaking compliance with PyOM, though.
Since the Poisson solver uses scipy.sparse.linalg.bicgstab, all stream function data is copied to the CPU and back once per time step. Additionally, this solver is not parallelized. This makes the stream function solver the most expensive routine on high-end GPU systems.
Ideally, an implementation will be available through Bohrium soon; if that does not happen, we could implement a solver through Bohrium's PyOpenCL interoperability, or wrap a library.