cloudos's People

Contributors

cgpu, ilevantis, ratnesh10, sk-sahu

Forkers

abrahamlifebit

cloudos's Issues

Replace else with explicit if statement in R functions

@sk-sahu thanks for being on the Jenny Bryan side of life 😄!

We can update the R functions, replacing else with an explicit if statement, in the following places:

As an example, the first snippet below, once rewritten with explicit if statements, would look like the second:

if (!r$status_code == 200) {
    stop("Something went wrong. Not able to create a cohort")
} else {
    message("Cohort named ", cohort_name, " created successfully. Bellow are the details")
    res <- httr::content(r)
    # into a dataframe
    res_df <- do.call(rbind, res)
    colnames(res_df) <- "details"
    return(res_df)
}

if (!r$status_code == 200) {
    stop("Something went wrong. Not able to create a cohort")
}
if (r$status_code == 200) {
    message("Cohort named ", cohort_name, " created successfully. Bellow are the details")
    res <- httr::content(r)
    # into a dataframe
    res_df <- do.call(rbind, res)
    colnames(res_df) <- "details"
    return(res_df)
}

Originally posted by @cgpu in #5 (comment)

Add a function to list out all the available phenotypic filters

At the moment cb_search_phenotypic_filters() only works for a few terms:

cb_search_phenotypic_filters(cloudos = my_cloudos, term = "cancer")
  • Change the argument name from term to phenotype (take a vote on this)
  • Add a function that lists all the available phenotypic filters, so that a user does not need to search and can take the phenotypic filter ID directly to the next step.

rename the dataframe with column names fails

Error in GEL

image

Error coming from

# rename the dataframe with column names
colnames(res_df_new) <- columns_names

Error Reason
One of the columns ("Date Of Death") is completely empty throughout the returned JSON
image

res <- httr::content(r) is not able to parse it properly, so one column is missing when the data frame is formed. That is the reason for the mismatch between the number of column names and the number of columns.
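One way to avoid the mismatch is to build every row from the expected column names, so that a field that is NULL or absent becomes NA instead of disappearing. The following is a minimal sketch under assumptions: `records_to_df` is a hypothetical helper, and `records` is made-up data standing in for what httr::content() might return; it is not the package's actual code.

```r
# Made-up stand-in for a parsed JSON response: "Date Of Death" is NULL everywhere
records <- list(
  list(id = 1, `Year Of Birth` = 1980, `Date Of Death` = NULL),
  list(id = 2, `Year Of Birth` = 1975, `Date Of Death` = NULL)
)

# Hypothetical helper: one data frame row per record, one column per expected name
records_to_df <- function(records, column_names) {
  rows <- lapply(records, function(rec) {
    vals <- lapply(column_names, function(col) {
      v <- rec[[col]]
      # NULL or zero-length values become NA so the column is kept
      if (is.null(v) || length(v) == 0) NA else v
    })
    names(vals) <- column_names
    as.data.frame(vals, stringsAsFactors = FALSE, check.names = FALSE)
  })
  do.call(rbind, rows)
}

columns_names <- c("id", "Year Of Birth", "Date Of Death")
res_df_new <- records_to_df(records, columns_names)
```

Because the columns are created from `columns_names` in the first place, a later `colnames(res_df_new) <- columns_names` can no longer fail on a length mismatch.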

Nomenclature for filters/queries is confusing

Currently "filter" and "query" are used somewhat randomly to mean different things.

We should rename functions and parameters to mirror the Web-UI more consistently:
image

i.e.:

  • criterion based on a particular phenotype e.g. list("id" = 4, "value" = "Cancer") should be called a phenotype.
  • a set of phenotypes combined with logical operators to narrow down the dataset to make a cohort is a query.

Add proper error message

At the moment a failed API request does stop and give an error message, but the message is not intuitive. For example:

image

  • Check the error message coming back from the request. (If some endpoints do not return an error message from the backend, ask for one to be added.)
  • Add error handling for a few common situations so that the package responds with a useful message.
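As a starting point for discussion, the two items above could be sketched as a small helper that combines the HTTP status, a status-specific hint, and whatever message the backend returned. Everything here is an assumption: `api_error_message` is a hypothetical name, the hints are illustrative, and the backend is assumed to put its text in a `message` field of the parsed body.

```r
# Hypothetical helper: build a readable error message from a status code and
# an already parsed response body (for example the result of httr::content(r))
api_error_message <- function(status_code, body = NULL,
                              action = "complete the request") {
  # Illustrative hints only; the real wording would need agreeing
  hints <- c(
    "400" = "The request was rejected; check the argument values.",
    "401" = "Authentication failed; check the API token.",
    "404" = "The resource was not found; check the ID and base URL.",
    "409" = "A resource with the same name may already exist."
  )
  hint <- unname(hints[as.character(status_code)])
  # Assumption: the backend message, if any, lives in body$message
  has_backend <- is.list(body) && is.character(body$message) &&
    length(body$message) == 1
  parts <- c(
    sprintf("Failed to %s (HTTP %s).", action, status_code),
    if (!is.na(hint)) hint,
    if (has_backend) paste("Server said:", body$message)
  )
  paste(parts, collapse = " ")
}

msg <- api_error_message(409, list(message = "Cohort name already in use"),
                         action = "create a cohort")
```

A caller would then pass the result to `stop(msg, call. = FALSE)` so the user sees the explanation rather than the internal function call.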

Improve query definitions in R

Defining CB queries in R via lists is not very user friendly, especially when there are multiple conditions. For example, even a relatively simple query with three conditions and two AND operators:

adv_query <- list(
  "operator" = "AND",
  "queries" = list(
    list( "id" = 13, "value" = list("from"="2016-01-21", "to"="2017-02-13")),
    list(
      "operator" = "AND",
      "queries" = list(
        list("id" = 4, "value" = "Cancer"),
        list("id" = 21, "value" = "Consenting")
        )
    )
  )
)

To simplify, we could include functions to define and combine individual phenotype definitions in a more modular fashion. As a starting point for discussion, the testing-new_query_syntax branch adds two new functions: new_phenotype_cont to define continuous variable phenotypes and new_phenotype_cat to define categorical variable phenotypes. They can be combined using overloaded &, | and ! operators. For example:

Get the package from the PR branch:

> git clone 'https://github.com/lifebit-ai/cloudos.git'
> cd cloudos
> git checkout  testing-new_query_syntax

In the cloudos directory enter an R session (or do so in Rstudio) and load the package + config:

> devtools::install(".")
> library(cloudos)
> cloudos_configure(base_url = "http://cohort-browser-dev-110043291.eu-west-1.elb.amazonaws.com/cohort-browser/", 
token = "...api token...",
team_id = "5f7c8696d6ea46288645a89f")

Try building new queries:

A <- new_phenotype_cont(13, "2016-01-21", "2017-02-13")
B <- new_phenotype_cat(4, "Cancer")
C <- new_phenotype_cont(13, "2016-01-21", "2017-02-13")
D <- new_phenotype_cat(4, "Cancer")

########## test 1
AB <- A & B

# load the cohort before applying the query
cohort <- cloudos::cb_load_cohort("60f96a97f2395b7f16a93c3a")

cloudos::cb_apply_filter(cohort, adv_query = unclass(AB), keep_existing_filter = FALSE)

########## test 2
AB <- A & !B

cohort <- cloudos::cb_load_cohort("60f96a97f2395b7f16a93c3a")

cloudos::cb_apply_filter(cohort, adv_query = unclass(AB), keep_existing_filter = FALSE)

########## test 3
AB <- A | !B

cohort <- cloudos::cb_load_cohort("60f96a97f2395b7f16a93c3a")

cloudos::cb_apply_filter(cohort, adv_query = unclass(AB), keep_existing_filter = FALSE)

########## test 4
AB <- (A | B) & (D | C)

cohort <- cloudos::cb_load_cohort("60f96a97f2395b7f16a93c3a")

cloudos::cb_apply_filter(cohort, adv_query = unclass(AB), keep_existing_filter = FALSE)
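For readers who have not seen operator overloading in R, here is a minimal sketch of how &, | and ! can be made to build nested query lists through an S3 `Ops` group method. It is not the code from the testing-new_query_syntax branch: the class name `cb_query_sketch`, the constructors `pheno_cat` and `pheno_cont`, and the way NOT is represented are all assumptions made for illustration.

```r
# Hypothetical constructors (named differently from the branch's functions)
pheno_cat <- function(id, value) {
  structure(list(id = id, value = value), class = "cb_query_sketch")
}
pheno_cont <- function(id, from, to) {
  structure(list(id = id, value = list(from = from, to = to)),
            class = "cb_query_sketch")
}

# S3 group method: called for &, | and ! on cb_query_sketch objects.
# For the unary "!" operator e2 is missing and is never evaluated.
Ops.cb_query_sketch <- function(e1, e2) {
  node <- switch(.Generic,
    "&" = list(operator = "AND", queries = list(unclass(e1), unclass(e2))),
    "|" = list(operator = "OR", queries = list(unclass(e1), unclass(e2))),
    # Assumption: NOT is encoded as an operator node with a single subquery
    "!" = list(operator = "NOT", queries = list(unclass(e1))),
    stop("Operator ", .Generic, " is not supported for queries")
  )
  structure(node, class = "cb_query_sketch")
}

A <- pheno_cont(13, "2016-01-21", "2017-02-13")
B <- pheno_cat(4, "Cancer")
q <- unclass(A & !B)
```

The result `q` is a plain nested list of the same shape as the hand-written adv_query shown earlier, which is why the tests above pass `unclass(AB)` to cb_apply_filter().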

Applying CBv2 query sometimes breaks the web-UI

It appears that creating a query with an AND or OR operator that only has a single subquery within it (unless it's the top node) makes the web-UI very unhappy.

Is this something to fix in the web-UI or do we need to ensure all queries have no such single nodes under operators?

example that breaks the UI:

  • PUT 'v2/cohort/{cohort_id}' with request body:
{
  "name": "test-apply-07",
  "description": "",
  "columns": [ ... ],
  "query": {
    "operator": "AND",
    "queries": [
      {
        "operator": "AND",
        "queries": [
          {
            "field": 13,
            "instance": [
              "0"
            ],
            "value": {
              "from": "2016-01-21",
              "to": "2017-02-13"
            }
          }
        ]
      },
      {
        "operator": "AND",
        "queries": [
          {
            "field": 56,
            "instance": [
              "0"
            ],
            "value": [
              "Adult C1"
            ]
          },
          {
            "field": 4,
            "instance": [
              "0"
            ],
            "value": [
              "Cancer"
            ]
          }
        ]
      }
    ]
  }
} 
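If the decision is to handle this on the package side, one option is to collapse every non-root AND/OR node that has a single subquery into that subquery before sending the request. The sketch below assumes the query is held as a nested R list with the same field names as the JSON above; `simplify_query` is a hypothetical helper, not an existing package function.

```r
# Hypothetical helper: replace non-root AND/OR nodes that have exactly one
# subquery with that subquery
simplify_query <- function(node, is_root = TRUE) {
  # Leaf nodes (a single criterion) have no "queries" element
  if (is.null(node[["queries"]])) {
    return(node)
  }
  # Simplify the children first, so collapsing works bottom-up
  node[["queries"]] <- lapply(node[["queries"]], simplify_query, is_root = FALSE)
  single_child <- length(node[["queries"]]) == 1
  if (!is_root && node[["operator"]] %in% c("AND", "OR") && single_child) {
    return(node[["queries"]][[1]])
  }
  node
}

# The query from the request body above, written as an R list
query <- list(
  operator = "AND",
  queries = list(
    list(operator = "AND", queries = list(
      list(field = 13, instance = list("0"),
           value = list(from = "2016-01-21", to = "2017-02-13"))
    )),
    list(operator = "AND", queries = list(
      list(field = 56, instance = list("0"), value = list("Adult C1")),
      list(field = 4, instance = list("0"), value = list("Cancer"))
    ))
  )
)

fixed <- simplify_query(query)
```

After simplification the first subquery is the field 13 criterion itself, while the second AND node, which has two subqueries, is left unchanged.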

Not able to apply certain phenotypic filters

For filter ID 72 (SARS-CoV-2 positive), the filter cannot be applied.

image

This is because the way the filter-value JSON is created differs from what the current implementation expects.

The package is sending:

"value":["Positive"]

But it should send:
image

Need to investigate why there is a 1 at the end.

cb_search_phenotypic_filters produces a table with multiple rows for each filter

When testing on the http://dev-gel.lifebit.ai/ instance, each row for the same filter contains one of the possible values in the possibleValues column:

> library(cloudos)
> cloudos_configure(base_url = "http://cohort-browser-dev-110043291.eu-west-1.elb.amazonaws.com/cohort-browser/", 
token = "...api token...",
team_id = "5f7c8696d6ea46288645a89f")


> cb_search_phenotypic_filters("genetic", cb_version="v1")
Total number of phenotypic filters found - 3
    display isMandatory                                                    possibleValues recruiterDescription            group
1  dropdown       FALSE                                                  Somatic, Somatic         Genetic Test                 
2  dropdown       FALSE                                                Germline, Germline         Genetic Test                 
3  dropdown       FALSE                                                      cfDNA, cfDNA         Genetic Test                 
4  checkBox       FALSE                                           Not known, Not known, 0                                      
5  checkBox       FALSE                     Li-Fraumeni syndrome, Li-Fraumeni syndrome, 1                                      
6  checkBox       FALSE Familial adenomatous polyposis, Familial adenomatous polyposis, 2                                      
7  checkBox       FALSE       Beckwith-Wiedemann syndrome, Beckwith-Wiedemann syndrome, 3                                      
8  checkBox       FALSE                                                   Other, Other, 4                                      
9  checkBox       FALSE                             Chromosomes/FISH, Chromosomes/FISH, 0                 NULL clinicalFeatures
10 checkBox       FALSE                                           Array CGH, Array CGH, 1                 NULL clinicalFeatures
11 checkBox       FALSE                         Fragile X Syndrome, Fragile X Syndrome, 2                 NULL clinicalFeatures
12 checkBox       FALSE           Single gene/panel testing, Single gene/panel testing, 3                 NULL clinicalFeatures
13 checkBox       FALSE                     Exome/genome testing, Exome/genome testing, 4                 NULL clinicalFeatures
14 checkBox       FALSE           Newborn screening testing, Newborn screening testing, 5                 NULL clinicalFeatures
   clinicalForm bucket500 bucket1000 bucket2500 bucket5000 bucket300 bucket10000       categoryPathLevel1
1    cancerForm     FALSE      FALSE      FALSE      FALSE     FALSE       FALSE         Cancer diagnosis
2    cancerForm     FALSE      FALSE      FALSE      FALSE     FALSE       FALSE         Cancer diagnosis
3    cancerForm     FALSE      FALSE      FALSE      FALSE     FALSE       FALSE         Cancer diagnosis
4    cancerForm     FALSE      FALSE      FALSE      FALSE     FALSE       FALSE         Cancer diagnosis
5    cancerForm     FALSE      FALSE      FALSE      FALSE     FALSE       FALSE         Cancer diagnosis
6    cancerForm     FALSE      FALSE      FALSE      FALSE     FALSE       FALSE         Cancer diagnosis
7    cancerForm     FALSE      FALSE      FALSE      FALSE     FALSE       FALSE         Cancer diagnosis
8    cancerForm     FALSE      FALSE      FALSE      FALSE     FALSE       FALSE         Cancer diagnosis
9        udForm     FALSE      FALSE      FALSE      FALSE     FALSE       FALSE Undiagnosed disease form
10       udForm     FALSE      FALSE      FALSE      FALSE     FALSE       FALSE Undiagnosed disease form
11       udForm     FALSE      FALSE      FALSE      FALSE     FALSE       FALSE Undiagnosed disease form
12       udForm     FALSE      FALSE      FALSE      FALSE     FALSE       FALSE Undiagnosed disease form
13       udForm     FALSE      FALSE      FALSE      FALSE     FALSE       FALSE Undiagnosed disease form
14       udForm     FALSE      FALSE      FALSE      FALSE     FALSE       FALSE Undiagnosed disease form
               categoryPathLevel2  id instances                                 name type Sorting            valueType units coding
1                Genetic findings  57         1                         Genetic Test bars           Categorical single             
2                Genetic findings  57         1                         Genetic Test bars           Categorical single             
3                Genetic findings  57         1                         Genetic Test bars           Categorical single             
4               Genetic syndromes  52         1                    Genetic syndromes bars           Categorical single             
5               Genetic syndromes  52         1                    Genetic syndromes bars           Categorical single             
6               Genetic syndromes  52         1                    Genetic syndromes bars           Categorical single             
7               Genetic syndromes  52         1                    Genetic syndromes bars           Categorical single             
8               Genetic syndromes  52         1                    Genetic syndromes bars           Categorical single             
9  Previous Investigation results 113         1 Previous genetic and genomic testing bars         Categorical multiple             
10 Previous Investigation results 113         1 Previous genetic and genomic testing bars         Categorical multiple             
11 Previous Investigation results 113         1 Previous genetic and genomic testing bars         Categorical multiple             
12 Previous Investigation results 113         1 Previous genetic and genomic testing bars         Categorical multiple             
13 Previous Investigation results 113         1 Previous genetic and genomic testing bars         Categorical multiple             
14 Previous Investigation results 113         1 Previous genetic and genomic testing bars         Categorical multiple             
    description descriptionParticipantsNo link array descriptionStability descriptionCategoryID descriptionItemType
1  Genetic Test              Not provided         20                                                               
2  Genetic Test              Not provided         20                                                               
3  Genetic Test              Not provided         20                                                               
4             ]              Not provided          5                                                               
5             ]              Not provided          5                                                               
6             ]              Not provided          5                                                               
7             ]              Not provided          5                                                               
8             ]              Not provided          5                                                               
9                            Not provided         20                                                               
10                           Not provided         20                                                               
11                           Not provided         20                                                               
12                           Not provided         20                                                               
13                           Not provided         20                                                               
14                           Not provided         20                                                               
   descriptionStrata descriptionSexed orderPhenotype instance0Name instance1Name instance2Name instance3Name instance4Name
1                                                                                                                         
2                                                                                                                         
3                                                                                                                         
4                                                                                                                         
5                                                                                                                         
6                                                                                                                         
7                                                                                                                         
8                                                                                                                         
9                                                                                                                         
10                                                                                                                        
11                                                                                                                        
12                                                                                                                        
13                                                                                                                        
14                                                                                                                        
   instance5Name instance6Name instance7Name instance8Name instance9Name instance10Name instance11Name instance12Name
1                                                                                                                    
2                                                                                                                    
3                                                                                                                    
4                                                                                                                    
5                                                                                                                    
6                                                                                                                    
7                                                                                                                    
8                                                                                                                    
9                                                                                                                    
10                                                                                                                   
11                                                                                                                   
12                                                                                                                   
13                                                                                                                   
14                                                                                                                   
   instance13Name instance14Name instance15Name instance16Name
1                                                             
2                                                             
3                                                             
4                                                             
5                                                             
6                                                             
7                                                             
8                                                             
9                                                             
10                                                            
11                                                            
12                                                            
13                                                            
14 
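One possible shape for the fix is to collapse the table to a single row per filter id and gather the possible values into one field. The sketch below uses a small made-up data frame with simplified values rather than the real API response, and `collapse_filters` is a hypothetical helper.

```r
# Made-up, simplified stand-in for the table above: one row per possible value
filters <- data.frame(
  id = c(57, 57, 57, 52, 52),
  name = c("Genetic Test", "Genetic Test", "Genetic Test",
           "Genetic syndromes", "Genetic syndromes"),
  possibleValues = c("Somatic", "Germline", "cfDNA",
                     "Not known", "Li-Fraumeni syndrome"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one row per filter, possible values joined into a string
collapse_filters <- function(df, value_col = "possibleValues", id_col = "id") {
  # Keep the first row of each filter for the columns that repeat per filter
  first_rows <- df[!duplicated(df[[id_col]]), , drop = FALSE]
  # Group the values by filter, keeping filters in order of first appearance
  groups <- factor(df[[id_col]], levels = unique(df[[id_col]]))
  values <- split(df[[value_col]], groups)
  first_rows[[value_col]] <- unname(
    vapply(values, paste, character(1), collapse = "; ")
  )
  rownames(first_rows) <- NULL
  first_rows
}

collapsed <- collapse_filters(filters)
```

A list-column would preserve the individual values better than a joined string; the string form is used here only to keep the sketch in base R.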

Rename repo to cloudos

Overview

We had a vote and brainstorming session with the bioinfo team, taking into account best practices for naming R packages.
The name cloudos is the one we voted for.
In the future we can have a Python library named pycloudos, etc.

What needs to be done

Rename the repo from cloudos-R to cloudos

Why and more context:

tl;dr

  • R package names should not have dashes (-)
  • R package names should not use a mix of CAPS and lowercase letters (A, a)

Unfortunately, this means you can’t use either hyphens or underscores, i.e., '-' or '_', in your package name

Avoid using both upper and lower case letters: doing so makes the package name hard to type and even harder to remember. For example, I can never remember if it’s Rgtk2 or RGTK2 or RGtk2.

Slack link with relevant convo:
https://lifebit-biotech.slack.com/archives/C013UHE9MQQ/p1596185485021800

PR comment with best practices suggestions:
#1 (review)

Release cloudos 0.2.0

First release:

Prepare for release:

  • devtools::build_readme()
  • urlchecker::url_check()
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • rhub::check_for_cran()

Submit to CRAN:

  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • usethis::use_github_release()
  • usethis::use_dev_version()
  • Update install instructions in README

Create a mongo dump for use in CI for predictable assertions and testing

@eBsowka we were discussing the newly added functionality with @hms1.
A main struggle when writing assertions against a demo environment is the volatility of the app and its collections.

In the past we have successfully used a MongoDB instance created on the fly in CI from mongo dumps. In this scenario we would need to take it a step further: we would have an instance of CB running that we fully control, and for each test run we would

  1. purge all available collections
  2. restore the database from the collections we saved in the dump

This would reassure us that we are not breaking anything in the package, without having to account for unpredictable changes in the CB app.

This is low priority for now, but we are keeping it open because we have committed to ensuring reproducibility for mongo-related CI tests.

Plot for phenotypic filters

As discussed earlier with Pablo, we need a plotting function (based on ggplot) that shows all the filters in a single plot.

Something like -

plot_phenotypic_filters(cohort_object)

Output -

image

output better error when trying to create already existing cohort

currently looks like this:

> cb_create_cohort("ilya-r-test1", cb_version="v1")
Error in .cb_create_cohort_v1(cohort_name = cohort_name, cohort_desc = cohort_desc,  : 
  Bad Request (HTTP 400). Failed to create a cohort.
> cb_create_cohort("ilya-r-test2", cb_version="v2")
Error in .cb_create_cohort_v2(cohort_name = cohort_name, cohort_desc = cohort_desc,  : 
  Conflict (HTTP 409). Failed to create a cohort.

Clear message for which version of CB cohort is created

The output gives no clear indication of which CB version the cohort was created in:

cb_create_cohort("test-400-desc", cohort_desc = "for testing", cb_version="v1")
Cohort created successfully.
Cohort ID:  60ec0c2f9dd71b22256fcfe0 
Cohort Name:  test-400-desc 
Cohort Description:  for testing 
Number of filters applied:  0 

cb_apply_filter should give more useful message on success

When a CBv1 cohort is used in cb_apply_filter() the function messages with the number of participants in the newly updated cohort. With a CBv2 cohort the number of participants is not given in a message.

The CBv2 API does not provide a response with the updated number of participants for PUT /cohort-browser/v2/cohort/{cohort_id}. Instead cb_participant_count() can be used to get this info.

Additionally, it would be useful to provide a web address for the cohort in the output of this function (and of others such as cb_create_cohort() and cb_load_cohort()).

potential bug in `cb_get_participants_table`?

While reviewing #80 I noticed a potential bug in cb_get_participants_table. Using the relevant branch:

library(cloudos)

cohortv2 <- cb_load_cohort("61f1bd4b3de81e52ec46d0ea")
cb_get_participants_table(cohortv2)
 Error: All elements must be size one, use `list()` to wrap.
x Element `f4i0a0` is of size 0.
Run `rlang::last_error()` to see where the error occurred.

It looks like the error is happening here:

for (n in c(list(emptyrow), res$data)) {
  # important to change NULL to NA using .null_to_na_nested
  dta <- .null_to_na_nested(n)
  # change types within lists according to col_type
  for (name in names(dta)) {
    if (is.list(dta[[name]])) {
      type_func <- col_types[[name]]
      dta[[name]] <- list(type_func(dta[[name]]))
    }
  }
  dta <- tibble::as_tibble_row(dta)
  df_list <- c(df_list, list(dta))
}

I am not familiar with this code, so someone with a better understanding should look properly, but it appears it can be fixed using something like:

  for (dta in c(list(emptyrow), res$data)) {
    # drop all null elements entirely
    dta[sapply(dta, is.null)] <- NULL
    # change types according to col_type
    for (name in names(dta)) {
      type_func <- col_types[[name]]
      dta[[name]] <- type_func(dta[[name]])
    }
    dta <- tibble::as_tibble_row(dta)
    df_list <- c(df_list, list(dta))
  }

Use credential from environment variable

Store the credentials in a file, ~/.cloudos/credential.json, and load its values into environment variables.

Something like this (Ref):

github_pat <- function() {
  pat <- Sys.getenv('GITHUB_PAT')
  if (identical(pat, "")) {
    stop("Please set env var GITHUB_PAT to your github personal access token",
      call. = FALSE)
  }

  pat
}
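Applied to this package, the same idea could look roughly like the sketch below. All names are assumptions made for illustration: the helpers `set_cloudos_env` and `load_cloudos_credentials`, the `CLOUDOS_` prefix, and the `token` and `team_id` fields, since the layout of credential.json has not been decided. Reading the file relies on the jsonlite package being installed.

```r
# Hypothetical helper: export each credential as an environment variable
# named CLOUDOS_<FIELD>, e.g. token -> CLOUDOS_TOKEN
set_cloudos_env <- function(creds) {
  stopifnot(is.list(creds), !is.null(names(creds)), all(nzchar(names(creds))))
  values <- vapply(creds, as.character, character(1))
  names(values) <- paste0("CLOUDOS_", toupper(names(creds)))
  do.call(Sys.setenv, as.list(values))
  invisible(names(values))
}

# Hypothetical loader: read the JSON file and export its top-level fields
load_cloudos_credentials <- function(path = "~/.cloudos/credential.json") {
  path <- path.expand(path)
  if (!file.exists(path)) {
    stop("Credential file not found: ", path, call. = FALSE)
  }
  set_cloudos_env(jsonlite::fromJSON(path))
}

# Example with an in-memory list instead of a real file (dummy values)
set_cloudos_env(list(token = "abc", team_id = "123"))
token <- Sys.getenv("CLOUDOS_TOKEN")
```

Functions that need the token could then call a getter in the style of github_pat() above and fail early with a clear message when the variable is empty.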
