Comments (7)
Would it be beneficial to use git lfs for the creation and versioning of the KGO files within the workflows? The files are not that big and can be stored in a separate repo that will be cloned only by a script so that bandwidth and data quotas are not used up that much.
from cospv2.0.
When I looked into this I concluded that we would hit the bandwidth limit easily, but I may not be understanding how it works.
Please can you give more details of your proposed solution?
from cospv2.0.
I assumed that testing is only needed by contributors and action runs. Given that all the datasets together are around 100 MB, the bandwidth quota won't be reached unless you get more than 10 answer-changing PRs a month, especially if the testing description is moved to the wiki for developers, so that general users won't use it much.
from cospv2.0.
Currently, the test datasets are downloaded every time the CI tests are triggered (push, pull request), so my understanding is that the bandwidth limit will be hit after 10 tests. If my understanding is correct, then we would have hit the bandwidth limit during the last couple of weeks.
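The estimate above can be written out as a quick calculation. This is only a sketch: the 1000 MB monthly quota is an assumption inferred from the figures in this thread (about 100 MB of test data, limit reached after about 10 downloads), and downloadsBeforeQuota is a hypothetical helper name.

```javascript
// Rough estimate of how many full test-data downloads fit in a monthly quota.
// quotaMB = 1000 is an ASSUMPTION inferred from this thread, not a verified limit.
function downloadsBeforeQuota(datasetMB, quotaMB = 1000) {
  if (datasetMB <= 0) {
    throw new RangeError('datasetMB must be positive');
  }
  return Math.floor(quotaMB / datasetMB);
}

// With ~100 MB of test data, every CI run that downloads it uses one "slot".
console.log(downloadsBeforeQuota(100)); // 10
```

If every push and pull request triggers a download, the number of CI runs per month is the figure to compare against, not the number of answer-changing PRs.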
from cospv2.0.
Oh. Ok. Then what can be done is this: the workflows and driver/download_test_data.sh reference only a JSON file containing a list of the *.nc files in a Google folder (a Google Apps Script can manage that). Your Google folder will contain all the KGO files with their different versions (I can show you how to set up the Google Apps Script). In this case, there will be no need to update the workflows with new Google links every time there is an answer-changing PR.
With regard to the versioning of the files, instead of vXXX, the filenames can contain the creation date and the tag/commit they were created with. In download_test_data.sh that part would be an optional argument, in case the older versions of the KGO are needed for some reason.
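A minimal sketch of what date/tag versioning could look like. The naming scheme base_YYYYMMDD_ref.nc and the helper names kgoFileName and selectKgo are hypothetical, chosen only for illustration; the optional ref argument plays the role of the optional script argument mentioned above.

```javascript
// HYPOTHETICAL naming scheme: <base>_<YYYYMMDD>_<tag-or-commit>.nc
function kgoFileName(base, date, ref) {
  // toISOString() is in UTC, e.g. "2023-01-15T00:00:00.000Z" -> "20230115"
  const stamp = date.toISOString().slice(0, 10).replace(/-/g, '');
  return `${base}_${stamp}_${ref}.nc`;
}

// Pick the file for an optional ref; with no ref, return the newest by date stamp.
function selectKgo(names, ref) {
  const parsed = names
    .map((name) => {
      const m = name.match(/^(.+)_(\d{8})_([^_]+)\.nc$/);
      return m ? { name, stamp: m[2], ref: m[3] } : null;
    })
    .filter(Boolean); // drop names that do not follow the scheme
  if (ref !== undefined) {
    const hit = parsed.find((p) => p.ref === ref);
    return hit ? hit.name : null;
  }
  parsed.sort((a, b) => a.stamp.localeCompare(b.stamp));
  return parsed.length ? parsed[parsed.length - 1].name : null;
}

const names = [
  kgoFileName('kgo_example', new Date(Date.UTC(2023, 0, 15)), 'abc1234'),
  kgoFileName('kgo_example', new Date(Date.UTC(2023, 5, 1)), 'v2.1.4'),
];
console.log(selectKgo(names));            // kgo_example_20230601_v2.1.4.nc
console.log(selectKgo(names, 'abc1234')); // kgo_example_20230115_abc1234.nc
```

Note that this scheme assumes the tag or commit itself contains no underscore.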
from cospv2.0.
Sorry for my slow reply. That sounds like a nice approach. I'm not familiar with Google Apps Script; would you be able to provide guidance on how to do that? For the moment, I'd like to keep the vXXX versioning, and perhaps move to date/tag in the future.
from cospv2.0.
You can add a script at https://script.google.com/home/start (documentation).
Before you write a script:
- Make a folder with *.nc files.
- Make an empty .txt file which will contain the list of *.nc files in the folder. It can also be a .json file.
- Get the .txt file ID and the folder ID.
- Example JavaScript function:
function write_nc_list() {
  // File that will hold the JSON list, and the folder with the KGO files
  var listfile = DriveApp.getFileById('<txt file id>');
  var folder = DriveApp.getFolderById('<folder id>');
  var list = [];
  list.push(['Name', 'ID', 'Size']); // comment this out if you do not need the header
  var files = folder.getFiles(); // iterator over all files in the folder
  while (files.hasNext()) {
    var file = files.next();
    var fname = file.getName();
    // Keep only files whose name ends in .nc
    if (fname.endsWith('.nc')) {
      list.push([fname, file.getId(), file.getSize()]);
    }
  }
  // Overwrite the list file with the JSON-encoded table
  listfile.setContent(JSON.stringify(list));
}
- This function will fill your .txt file with JSON that holds the filenames, IDs and sizes (you can add other entries, e.g. date modified), which can later be read in the GitHub workflow or in download_test_data.sh to construct download links for curl. You can also set up triggers for the script (e.g. run it every day), or just execute it manually once you add a new KGO to the folder.
This way you can have only the .txt file link hardcoded in the workflows. If you have any trouble, we can arrange a Zoom call.
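As a rough sketch of the consuming side, the JSON list could be turned into curl commands like this. The function name buildCurlCommands, the example file name and ID, and the Drive URL pattern are assumptions for illustration; the pattern presumes the files are shared publicly, and large files may need an extra confirmation step that is not handled here.

```javascript
// ASSUMPTION: publicly shared Drive files can be fetched with this URL pattern.
const DRIVE_URL = 'https://drive.google.com/uc?export=download&id=';

// Turn the JSON written by write_nc_list() into one curl command per file.
function buildCurlCommands(listJson, outDir = '.') {
  const rows = JSON.parse(listJson);
  // Skip the optional ['Name', 'ID', 'Size'] header row if it is present.
  const hasHeader = rows.length > 0 && rows[0][0] === 'Name' && rows[0][1] === 'ID';
  const data = hasHeader ? rows.slice(1) : rows;
  return data.map(
    ([name, id]) =>
      `curl -sSL -o '${outDir}/${name}' '${DRIVE_URL}${encodeURIComponent(id)}'`
  );
}

// Placeholder list in the same shape the Apps Script function produces.
const example = JSON.stringify([
  ['Name', 'ID', 'Size'],
  ['example_kgo.nc', 'FILE_ID_1', 1024],
]);
console.log(buildCurlCommands(example, 'data').join('\n'));
```

The same parsing could be done directly in the shell script; the point is only that the workflows need the single list-file link and nothing else.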
from cospv2.0.