opendrr / earthquake-scenarios

Public repository for earthquake scenarios from the National catalogue / Dépôt public de scénarios de tremblements de terre du catalogue national

Home Page: https://opendrr.github.io/earthquake-scenarios/

Python 88.05% Shell 11.95%

earthquakes earthquake-data natural-hazards natural-resources-canada government-of-canada risk-modelling risk-assessment

earthquake-scenarios's Issues

Idea to add a "display_name" variable to the initialization files

The issue is partially resolved, in that we have done the bulk of the work for the necessary renaming of 16(?) scenarios, but @DamonU2 has a very good point on "add[ing] a 'display_name' variable to the initialization files", and on consistency of naming, etc., so I am reproducing his comment here as a new issue for us to really think about and improve on in the future.

As @anthonyfok said, this is a bigger issue than it seems. Specifically, for GitHub Pages, the "code name" is used to generate the "display name", as it was set up to add additional scenarios with limited manual updating. All of the links, tiles, and associated files use the "code name" for ID, so we'd have to change the "display name" down the line.

The easiest way to do this is probably to add a "display_name" variable to the initialization files, and pull that in. However, we then have the issue of someone looking at a scenario named "Georgia Strait 4.9" and downloading a file called "capilano5", and probably assuming the link is broken (side note - should Georgia Strait be Salish Sea?). This is probably a good method if the code name and display name are similar, ex. "Ottawa" to "near Ottawa", and we just want additional detail, but seems problematic otherwise, ex. "BurwashLanding" to "Denali Fault".

The best way going forward may be to add a display name variable, but also to rename the files in the case of big changes, and reprocess them. @tieganh Do we have a list of proper names for the scenarios in Jeremy's repo?
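A minimal sketch of the "display_name" idea, assuming the initialization files are INI-style and can be read with Python's configparser. The [general] section and the display_name key are hypothetical placeholders, not names confirmed by the repo; the code name is used as the fallback so existing scenarios keep working:

```python
import configparser


def display_name_for(ini_text: str, code_name: str) -> str:
    """Return the display name from an INI file, or the code name if none is set.

    The "general" section and "display_name" key are assumptions made for
    this illustration only.
    """
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # fallback covers both a missing section and a missing key
    name = parser.get("general", "display_name", fallback=code_name)
    return name.strip() or code_name


with_name = "[general]\ndescription = example\ndisplay_name = Denali Fault\n"
without_name = "[general]\ndescription = example\n"

renamed = display_name_for(with_name, "BurwashLanding")
unchanged = display_name_for(without_name, "Ottawa")
```

With this approach the links, tiles, and files could keep using the code name for ID while only the page text changes, which is the trade-off discussed above.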

Originally posted by @DamonU2 in #73 (comment)

Investigate if scripts/consequences-v3.10.0.py could be optimized

While casually observing a scripts/run_OQStandard.sh run, I noticed that OpenQuake itself would happily use all available CPU cores to do calculations in parallel (which is awesome), but some other processing steps are single-threaded and could take over 12 hours. For example:

from ps auxwww nearing the end of python3 scripts/consequences-v3.10.0.py -2 run:

user    2151  0.0  0.0   8756  3792 pts/0    S+   07:51   0:00 bash scripts/run_OQStandard.sh SCM6p5_Montreal_conv -h -r -d -o
user    2225  0.0  0.0 3065888 101008 ?      Sl   07:51   0:01 oq-dbserver
user    6603  100  0.0 2836080 263132 pts/0  Rl+  09:53 759:05 python3 scripts/consequences-v3.10.0.py -2

from top:

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
   6603 user      20   0 2836080 263132  50196 R 100.0   0.0 286:58.05 python3

from free -h:

              total        used        free      shared  buff/cache   available
Mem:          749Gi       1.4Gi       739Gi       1.0Mi       8.6Gi       744Gi
Swap:            0B          0B          0B

So, in this particular case, calculations before python3 scripts/consequences-v3.10.0.py -2 took just over 2 hours, but python3 scripts/consequences-v3.10.0.py -2 alone ~~was approaching 5 hours~~ took 12.65 hours (759 minutes), running single-threaded (not using a lot of RAM) and writing to CSV files at about 200 lines/second (487,211 lines per CSV file in this scenario):

-rw-rw-r-- 1 user group 96764235 May  2 10:40 consequences-rlz-000_-2.csv
-rw-rw-r-- 1 user group 96262336 May  2 11:28 consequences-rlz-001_-2.csv
-rw-rw-r-- 1 user group 96978159 May  2 12:15 consequences-rlz-002_-2.csv
-rw-rw-r-- 1 user group 97646335 May  2 13:03 consequences-rlz-003_-2.csv
-rw-rw-r-- 1 user group 98016335 May  2 13:50 consequences-rlz-004_-2.csv
-rw-rw-r-- 1 user group 83709311 May  2 14:31 consequences-rlz-005_-2.csv

Ditto for the python3 scripts/consequences-v3.10.0.py -1 command which is expected to take another 12 hours.

Would be an interesting exercise to profile this script and see where it is spending most of its time, and find ways to make it speedier.

(Low priority, could have)
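One possible starting point for the profiling exercise, using the standard-library cProfile and pstats modules. The write_rows_slowly function is a hypothetical stand-in workload, not code from consequences-v3.10.0.py:

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the most expensive call chains come first
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


def write_rows_slowly(n):
    """Hypothetical stand-in for a row-by-row CSV writing loop."""
    lines = []
    for i in range(n):
        lines.append(",".join(str(i * k) for k in range(5)))
    return len(lines)


result, report = profile_call(write_rows_slowly, 1000)
print(report)
```

The whole script could also be profiled without modification, for example with python3 -m cProfile -o consequences.prof scripts/consequences-v3.10.0.py -2, and the resulting file inspected with pstats afterwards.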

P.S. A quick-and-dirty script that I am using to record basic metrics:

#!/bin/bash
LOGFILE=~/logs/log_2022-05-02_cpu-ram-process.log
while true; do
  ( date; uptime; free -h; ps auxwww | grep ^user ; echo) | tee -a "${LOGFILE}"
  sleep 15
done

run_OQStandard.sh should do the hazard calculation only once per scenario

The run_OQStandard.sh script needs to be optimized to avoid repeated identical hazard calculations. The ideal behaviour, as envisioned by @tieganh, is to make only one hazard calculation per scenario, and to refer to that same haz calc for the damage baseline, damage retrofit, risk baseline, and risk retrofit calculations.

@tieganh explained in more detail in #86 (review):

… The reason to get rid of running the haz calc was because I changed how I used it in my workflow for the most part. The damage and risk calculators need a hazard calculation, so either you give them the name of one or they will do the calculation internally. Since we run both baseline and retrofit scenarios, my goal was to redo this script so that it would do the hazard calc only once and then use the '-hc' flag (need to double check this, just going off memory) in the damage and risk calculators to point them to that hazard calculation for baseline and retrofit. However, I don't think I got to that point. I only made it so it shares one haz calc for the baseline and retrofit of each calculator type (dmg and risk). I also just started skipping the hazard calc since it wasn't needed for some calculations.

I'm really open to your feedback about the best way to proceed, and have approved the changes as is. If you've got some bandwidth to change this script to do the ideal behaviour (1 haz calc per scenario and call to it for the damage baseline, damage retrofit, risk baseline, and risk retrofit calculations) then that would be the best outcome. …
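The ideal behaviour described above can be sketched as a command plan: one hazard run, then four runs that point back to it. This is only an illustration. It assumes the '-hc' flag recalled in the quoted comment corresponds to the --hc option of oq engine --run, which should be double-checked as the comment itself says, and the .ini file names are hypothetical:

```python
def plan_commands(scenario, hazard_calc_id):
    """Build the command lines for one scenario without executing them.

    hazard_calc_id is the ID reported by the hazard run; the .ini names
    below are hypothetical placeholders, not the repo's actual file names.
    """
    commands = [["oq", "engine", "--run", f"s_Hazard_{scenario}.ini"]]
    for calc in ("Damage", "Risk"):
        for variant in ("b0", "r1"):  # baseline and retrofit
            commands.append([
                "oq", "engine", "--run",
                f"s_{calc}_{scenario}_{variant}.ini",
                "--hc", str(hazard_calc_id),  # reuse the single hazard calc
            ])
    return commands


commands = plan_commands("SCM6p5_Montreal", 42)
for command in commands:
    print(" ".join(command))
```

In a real script the hazard calculation ID would have to be captured from the first run before the other four commands are built.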

Add link to OpenFile in downloads

Perhaps as a "GSC OpenFile" button similar to the link to the GitHub repo? Not sure what's best.

Link: https://github.com/OpenDRR/earthquake-scenarios/raw/master/Openfile8806_Hobbs_etal_2021_OQCanadaScenario.pdf

Create PR for branch add-13-scenarios-apr2023

For the to-do list: are there any blockers for merging this into master to get the new release assets for the 13 scenarios?

update dsra_attributes_en/fr

Add indicators from exposure to help build charts. The attribute dictionary needs to be updated since new indicators are being added.

E_BldgOccS1
E_BldgTypeG
E_BldgDesLev
Add worksheet for shakemap_hexbin
French section may need to be updated/translated

@tieganh @drotheram

Legend alignment issues in Chrome on Windows

Rename rupture_ACM7p3_LeechRiverFaultFull.xml to match s_consequences_ACM7p3_LeechRiverFullFault

Naming convention for scenarios must be consistent for downstream processing
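A small sketch of how such naming mismatches could be detected automatically. It assumes, based only on the two names in this issue, that the scenario code is whatever follows the rupture_ or s_consequences_ prefix; the helper names are hypothetical:

```python
from pathlib import PurePath

# Prefixes taken from the two names quoted in this issue
PREFIXES = ("rupture_", "s_consequences_")


def scenario_code(filename):
    """Strip the extension and any known prefix to get the scenario code."""
    stem = PurePath(filename).stem
    for prefix in PREFIXES:
        if stem.startswith(prefix):
            return stem[len(prefix):]
    return stem


def mismatched(rupture_files, consequence_names):
    """Return scenario codes that appear on one side only."""
    ruptures = {scenario_code(name) for name in rupture_files}
    consequences = {scenario_code(name) for name in consequence_names}
    return sorted(ruptures ^ consequences)


problems = mismatched(
    ["rupture_ACM7p3_LeechRiverFaultFull.xml"],
    ["s_consequences_ACM7p3_LeechRiverFullFault"],
)
print(problems)
```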

Create metadata records in FGP

Add collection metadata and records for each scenario.

Add shakemap files to download site

Make sure we have all the relevant files available for download on the GitHub Pages site.

Update scenario metadata page

Tasks:

Improve label collisions
Create download button and links for the related files

Review README wording

Check that the readme covers the necessary topics (introduce the material, point people where they need to go, answer questions someone would have, liability, etc) with appropriate wording.

Add a "ChangeLog" page to document major or breaking changes

Major or breaking changes such as the scenario renaming that was done in #77 should probably be publicly documented on some kind of "What’s New" or "ChangeLog" page, probably on GitHub Pages (web pages), or maybe a link to a ChangeLog.md Markdown file rendered as HTML.

Refresh scripts/TakeSnapshot.py

It was suggested during an internal demo/review that scripts/TakeSnapshot.py would benefit from using Python 3 argparse for parsing command-line arguments; see https://docs.python.org/3/library/argparse.html
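A rough idea of what an argparse-based interface might look like. The argument names (scenario, --output-dir, --dpi) are hypothetical and are not taken from the current TakeSnapshot.py:

```python
import argparse


def build_parser():
    """Build a command-line parser; all argument names are illustrative."""
    parser = argparse.ArgumentParser(
        description="Take a snapshot map for one earthquake scenario."
    )
    parser.add_argument("scenario", help="scenario code name")
    parser.add_argument(
        "-o", "--output-dir", default=".",
        help="directory to write the image to (default: current directory)",
    )
    parser.add_argument(
        "--dpi", type=int, default=150,
        help="output resolution (default: 150)",
    )
    return parser


# Parse an explicit list here; a script would call parse_args() with no arguments
args = build_parser().parse_args(
    ["SIM9p0_CascadiaInterfaceBestFault", "--dpi", "300"]
)
print(args.scenario, args.output_dir, args.dpi)
```

A benefit of argparse is that -h/--help output and type checking of options come for free.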

Update scenario overview map

Update the map showing all available scenarios. May have to switch to cartopy from Basemap.

Bash Script
Python Script
Can check out Take Snapshot Script for working example with CartoPy.

Incorrect display for each scenario on the earthquake scenarios GitHub page, dead links

The newest GitHub page for earthquake scenarios displays syntax for each scenario.

For the 13 new scenarios, the download links are broken because they reference v1.2.2 release assets, in which those scenarios did not exist; they exist only in v1.2.3.

Investigate missing tiles in some scenarios, regenerate as needed

Some of the newly generated tile sets seem to be missing tiles at various zoom levels (only checked dsra_sim9p0_cascadiainterfacebestfault_indicators_s_900913).
Need to investigate and regenerate (GeoServer?), and check whether the other scenarios have the same issue.

upload_assets.yml GitHub Actions workflow runs out of space

The data files have grown big enough that, as of v1.2.2, the default GitHub runner no longer has sufficient disk/build space for upload_assets.yml (was generate_assets.yml) to complete, in spite of the use of easimon/maximize-build-space in v1.2.3. For example:

https://github.com/OpenDRR/earthquake-scenarios/actions/runs/5859835239
upload-release-assets
unable to write file outputs/s_shakemap_ACM7p0_GeorgiaStraitFault_124.csv

Potential solution is to download from release assets of the previous release in a piecemeal fashion, and use Git LFS only for downloading new or changed files.
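One way to picture the piecemeal approach is as a comparison of two checksum manifests: files whose checksum is unchanged can be taken from the previous release's assets, and only new or changed files need Git LFS. The manifest format and the second file name below are hypothetical:

```python
def plan_downloads(previous, current):
    """Split the current files into (reuse from old release, fetch via LFS).

    previous and current map file name -> checksum; this manifest format
    is an assumption made for illustration.
    """
    reuse = {name for name, digest in current.items()
             if previous.get(name) == digest}
    fetch = set(current) - reuse
    return sorted(reuse), sorted(fetch)


previous = {
    "s_shakemap_ACM7p0_GeorgiaStraitFault_124.csv": "aaa",
    "example_changed.csv": "bbb",
}
current = {
    "s_shakemap_ACM7p0_GeorgiaStraitFault_124.csv": "aaa",  # unchanged
    "example_changed.csv": "ccc",                           # changed
    "example_new.csv": "ddd",                               # new
}
reuse, fetch = plan_downloads(previous, current)
print(reuse, fetch)
```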

Add CSD aggregations to download link table

We may have to change the title qualifiers (e.g. Polygon, Point) to something more descriptive. Perhaps Census Subdivision, Settled Area, Aggregated Buildings.

Explore OQ Engine >> 3.11 and set up CI tests

Current release of OpenDRR/earthquake-scenarios (as of v1.2.3) has been tested/certified to work with only OpenQuake Engine 3.11.5.

As we'll need to upgrade to a newer version of OQ Engine (such as the latest 3.18.0 release) eventually, it would be a good idea to start exploring getting OpenDRR/earthquake-scenarios to work with OQ Engine >> 3.11.

Update our code to be compatible with OQ Engine >= 3.11.5
Set up CI tests (with a GitHub Actions workflow) for multiple OQ Engine releases (e.g. 3.11.5, 3.12.1, 3.13.0, 3.14.0, 3.15.0, 3.16.7, 3.17.1, 3.18.0) and compare their calculation results
Update documentation, especially Python version requirement for different OQ Engine releases
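For the "compare their calculation results" step, a sketch of a tolerance-based comparison between the output of two OQ Engine releases. The column names asset_id and loss are hypothetical, and the tolerance is an arbitrary choice that would need to be agreed on:

```python
import csv
import io
import math


def load_values(csv_text, key_column, value_column):
    """Read one numeric column from CSV text, keyed by another column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_column]: float(row[value_column]) for row in reader}


def differing_keys(reference, candidate, rel_tol=1e-6):
    """Return keys that are missing on one side or differ beyond rel_tol."""
    keys = set(reference) | set(candidate)
    return sorted(
        key for key in keys
        if key not in reference
        or key not in candidate
        or not math.isclose(reference[key], candidate[key], rel_tol=rel_tol)
    )


# Made-up sample data standing in for results from two engine releases
old_text = "asset_id,loss\na1,100.0\na2,250.5\n"
new_text = "asset_id,loss\na1,100.00000001\na2,260.0\n"

old = load_values(old_text, "asset_id", "loss")
new = load_values(new_text, "asset_id", "loss")
diffs = differing_keys(old, new)
print(diffs)
```

A CI job could run such a check after each engine version in the matrix and fail when the list of differing keys is not empty.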

Thanks to @MPConda for the idea and inspiration!

r2 to r1 for finished scenario

@tieganh if we check https://github.com/OpenDRR/earthquake-scenarios/blob/master/FINISHED/SIM9p0_CascadiaInterfaceBestFault.md, it shows b0/r1 files, but in the finished folder for Cascadia interface we see b0/r2. I think this was from the previous iteration, where we did a dirty fix to run all the scenarios and my script renamed it back to r1.

Now that we are starting this 'final' repo for scenarios, we should revert to how it was originally planned (b0/r1), and we will implement the changes on our end. This way it should run fine with all the current and new scenarios that you will be putting in the repo. Right now the Cascadia interface best fault scenarios are labeled b0/r2, and the newer Leech River ones are b0/r1.

Would you be able to rename the Cascadia scenarios back to r1?

Changes to GitHub Pages

From Julie. Changes needed to: https://opendrr.github.io/earthquake-scenarios/en/

Need to add new scenarios
Update the descriptions with the ones in this document, or keep them as-is and technical? Scenario names should probably align with the RP ones?
Is there additional documentation?
Add: "To view our user-friendly interface to explore the information without downloading data, visit www.RiskProfiler.ca."

@DamonU2 maybe you could help out with this? I don't understand the GitHub Pages structure on this one.

Renaming scenarios in GitHub [Pages]

Hey guys - I know we already fixed the scenario names in RiskProfiler, but we need to also change them in Git. I'm sorry - I should've thought to rename them after they were first added to the repo. Is this doable now? Then for the rest of Jeremy's scenarios we can change the names while we're adding them? I just don't want us using a municipality name for a specific earthquake scenario when the scenario might not even be that close to the municipality. The names were set because we wanted scenarios to affect each of the ten riskiest municipalities. @anthonyfok @DamonU2 what do you think?
