neicnordic / bifrost
This is a service that enables job submissions to secure compute platforms from the open internet
Describe pain points or successes here until completion.
Local EGA has been suggested for use in the schizophrenia implementation plan (point 10) and in the imputation implementation plan (point 11) for the final full-scale version. This issue serves as a placeholder until we can look at it more closely.
The splitByChromosome.sh script was added because the imputation server expects each input file to contain a single chromosome. As the name suggests, the script splits VCF files into one file per chromosome.
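The splitting step can be illustrated with a minimal Python sketch. This is not the contents of splitByChromosome.sh; the function name `split_vcf_by_chromosome` and the output naming scheme are assumptions made for illustration, and the sketch only handles uncompressed, tab-separated VCF text.

```python
from pathlib import Path


def split_vcf_by_chromosome(vcf_path, out_dir):
    """Split an uncompressed VCF into one file per chromosome.

    Hypothetical helper: returns a dict mapping chromosome name to output path.
    """
    vcf_path = Path(vcf_path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    header = []   # all lines starting with '#', copied into every output
    handles = {}  # chromosome name -> open output file
    paths = {}    # chromosome name -> output path
    try:
        with vcf_path.open() as source:
            for line in source:
                if line.startswith("#"):
                    header.append(line)
                    continue
                if not line.strip():
                    continue
                # The first tab-separated column of a VCF record is CHROM.
                chrom = line.split("\t", 1)[0]
                if chrom not in handles:
                    # Assumed naming scheme: <input stem>.<chromosome>.vcf
                    paths[chrom] = out_dir / f"{vcf_path.stem}.{chrom}.vcf"
                    handles[chrom] = paths[chrom].open("w")
                    handles[chrom].writelines(header)
                handles[chrom].write(line)
    finally:
        for handle in handles.values():
            handle.close()
    return paths
```

Each output keeps the full header so that it remains a valid VCF on its own. One output file stays open per chromosome, which is fine for a few dozen chromosomes but may need rethinking for references with many contigs.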
The input files for the imputation project are sensitive and should thus be encrypted before transfer to TSD is initiated. Currently the idea is to use the crypt4gh tool for this.
Document everything before moving on to the next milestone.
After the files have been successfully transferred to TSD they need to be moved from the /tsd/pXXX/data/durable/... disk to the disk where the compute will happen.
A cron job will trigger a script that performs this move.
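A script that such a cron job could call might look like the sketch below. Everything here is an assumption for illustration: the function name `move_transferred_files`, the `.c4gh` suffix filter (the conventional extension for crypt4gh-encrypted files), and the choice to skip files that already exist at the destination. The real source and destination directories are passed in by the caller and are not guessed here.

```python
import shutil
from pathlib import Path


def move_transferred_files(source_dir, dest_dir, suffix=".c4gh"):
    """Move files with the given suffix from source_dir to dest_dir.

    Hypothetical helper: returns the list of destination paths that were moved.
    """
    source_dir = Path(source_dir)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    moved = []
    for path in sorted(source_dir.iterdir()):
        # Ignore subdirectories and files that do not match the suffix.
        if not path.is_file() or path.suffix != suffix:
            continue
        target = dest_dir / path.name
        # Never overwrite a file that is already on the compute disk.
        if target.exists():
            continue
        shutil.move(str(path), str(target))
        moved.append(target)
    return moved
```

Because the function is idempotent (a second run finds nothing left to move), it is safe to trigger repeatedly from cron.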
Below is a preliminary outline of the steps that need to be done.
And also write documentation when this is done:
Describe pain points or success here.
The plan is to use ARC to submit queries in a way that would make it possible to do computations on several of the sensitive compute infrastructures within one query.
ARC can delegate jobs to the correct sensitive compute infrastructure, submit them to the clusters, and collect the results.
ARC can also handle queuing and resource management.
How this would work in practice needs to be discussed.
The known obstacles at the moment are with authentication/authorization and privilege management.
Describe pain points or successes here until completion.
I suggest we use the YAML file format for the job config file; it is readable and has a simple syntax.
I suggest that the config file should contain the following:
What more should it contain? Is there a better solution to the incomplete/corrupted files issue?
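One common way to detect incomplete or corrupted files is to record a checksum for each input file in the job config and verify it after transfer. The sketch below assumes this approach; it is not necessarily what the config proposal contained. The config layout (a `files` list with `name` and `sha256` keys) and the function names are hypothetical, and the config is shown as an already-parsed dictionary so that the example does not depend on a YAML library.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_bad_files(job_config, base_dir):
    """Return names of files that are missing or whose checksum differs.

    job_config is assumed to look like:
        {"files": [{"name": "chr1.vcf.c4gh", "sha256": "<hex digest>"}]}
    """
    bad = []
    for entry in job_config["files"]:
        path = Path(base_dir) / entry["name"]
        if not path.is_file() or sha256_of(path) != entry["sha256"]:
            bad.append(entry["name"])
    return bad
```

A job would only be started when `find_bad_files` returns an empty list; otherwise the listed files could be transferred again.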
Document everything before moving on to the next milestone.
What input data do we need per project?
Imputation project:
VCF files.
Schizophrenia project:
13/03 - No input data from the outside, only query file with parameters.
This is all in progress, will close once enough documentation has been written.
When I run the readConfig.py script on an encrypted "schizophrenia" input file, it prints the error message "No supported encryption method" and then continues with the rest of the script. This error is not printed when running an imputation job.
Describe pain points or success here.
The job submit script has now been created, but it currently contains placeholders and still needs to be filled in with the schizophrenia-related code.
Once the input files have been transferred to TSD, they need to be moved to a scratch disk of some sort. It is not yet known where this will be.
Describe pain points or successes here until completion.
Access to the compute VM (rhel7.tsd.usit.no) has not been granted yet; this still needs to be done.
Document all changes and issues related to the config file reading script here.
Document everything before finishing and closing this milestone.