Algorithm repository

This is the repository of algorithms for the MIP (Medical Informatics Platform).

Each algorithm, written in its native language (R, Matlab, Python, Java...), is encapsulated in a Docker container that provides the runtime environment it needs to execute.

The environment variables provided to the Docker container are used as parameters to the function or algorithm to execute.

Currently, we expect the Docker containers to be autonomous:

  • they should connect to a database and retrieve the dataset to process
  • they should process the data, taking into account the parameters given as environment variables to the Docker container
  • they should store the results into the results database.
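
As a minimal sketch of the first two steps, the snippet below reads a job description from the environment. The variable names (JOB_ID, PARAM_query, PARAM_variables, PARAM_covariables, PARAM_grouping, IN_JDBC_URL, OUT_JDBC_URL) are the ones visible in the container environments quoted in the issues further down; JobConfig and read_job_config are hypothetical names introduced for illustration, and the database connections themselves are left out.

```python
import os
from dataclasses import dataclass, field
from typing import List, Mapping


@dataclass
class JobConfig:
    """Parameters of one algorithm run, as read from the environment."""
    job_id: str
    query: str
    variables: List[str]
    covariables: List[str] = field(default_factory=list)
    grouping: List[str] = field(default_factory=list)
    in_jdbc_url: str = ""
    out_jdbc_url: str = ""


def _split(value: str) -> List[str]:
    # Comma-separated list; an empty variable means "no values".
    return [item.strip() for item in value.split(",") if item.strip()]


def read_job_config(env: Mapping[str, str] = os.environ) -> JobConfig:
    """Build a JobConfig from environment variables (hypothetical helper)."""
    return JobConfig(
        job_id=env.get("JOB_ID", ""),
        query=env.get("PARAM_query", ""),
        variables=_split(env.get("PARAM_variables", "")),
        covariables=_split(env.get("PARAM_covariables", "")),
        grouping=_split(env.get("PARAM_grouping", "")),
        in_jdbc_url=env.get("IN_JDBC_URL", ""),
        out_jdbc_url=env.get("OUT_JDBC_URL", ""),
    )
```

Passing a plain dictionary instead of os.environ makes the parsing easy to exercise outside a container.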

The results should be stored in a format that is easy to share.

  • For algorithms providing statistical analysis or machine learning, we require the results to be in PFA format in its YAML or JSON form.
  • For algorithms providing visualisations, we support different formats, including Highcharts, Vis.js, PNG and SVG.
  • For algorithms providing tabular data, we expect a JSON output in this format: Tabular Data Resource
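
For tabular output, a result could be assembled along these lines. This is a sketch that assumes the Frictionless Data layout of a Tabular Data Resource (a profile of "tabular-data-resource", a schema listing the fields, and inline data rows); the helper name to_tabular_data_resource and the field names in the usage note are hypothetical.

```python
import json
from typing import Any, Dict, List, Sequence, Tuple


def to_tabular_data_resource(
    name: str,
    fields: Sequence[Tuple[str, str]],
    rows: Sequence[Sequence[Any]],
) -> Dict[str, Any]:
    """Wrap rows in a Tabular Data Resource descriptor with inline data."""
    field_names = [field_name for field_name, _ in fields]
    for row in rows:
        if len(row) != len(field_names):
            raise ValueError("row length does not match the number of fields")
    return {
        "name": name,
        "profile": "tabular-data-resource",
        "schema": {
            "fields": [{"name": n, "type": t} for n, t in fields],
        },
        # Each row becomes an object keyed by field name.
        "data": [dict(zip(field_names, row)) for row in rows],
    }


def to_json(resource: Dict[str, Any]) -> str:
    """Serialise the descriptor as JSON text."""
    return json.dumps(resource, sort_keys=True)
```

For example, to_tabular_data_resource("summary", [("group", "string"), ("mean", "number")], [["AD", 1.5]]) yields a descriptor whose data is [{"group": "AD", "mean": 1.5}].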

List of algorithms

hbpmip/python-anova: Anova algorithm

This is a Python implementation of Anova.

hbpmip/python-correlation-heatmap: Correlation heatmap

Calculates a correlation heatmap; it only works for real variables. Run it on a single node or in distributed mode. In distributed mode, intermediate mode first calculates the covariance matrix on a single node; aggregate mode is then used to combine the statistics from multiple intermediate jobs and produce the final graph.
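
The intermediate/aggregate split can be illustrated as follows. This is only a sketch of one common way to do it, not the code of this image: it assumes each node shares its row count, column sums and sums of cross-products, which is enough to rebuild the pooled covariance and correlation matrices. The function names are hypothetical, and each node is assumed to hold at least one row.

```python
import math
from typing import Any, Dict, List, Sequence


def intermediate_stats(rows: Sequence[Sequence[float]]) -> Dict[str, Any]:
    """Statistics one node can share without sending its raw rows."""
    width = len(rows[0])
    sums = [0.0] * width
    cross = [[0.0] * width for _ in range(width)]
    for row in rows:
        for i in range(width):
            sums[i] += row[i]
            for j in range(width):
                cross[i][j] += row[i] * row[j]
    return {"n": len(rows), "sums": sums, "cross": cross}


def aggregate_correlation(parts: Sequence[Dict[str, Any]]) -> List[List[float]]:
    """Combine per-node statistics into one correlation matrix."""
    width = len(parts[0]["sums"])
    n = sum(part["n"] for part in parts)
    sums = [sum(part["sums"][i] for part in parts) for i in range(width)]
    cross = [
        [sum(part["cross"][i][j] for part in parts) for j in range(width)]
        for i in range(width)
    ]
    # Sample covariance from pooled sums:
    # (sum(x*y) - sum(x) * sum(y) / n) / (n - 1)
    cov = [
        [(cross[i][j] - sums[i] * sums[j] / n) / (n - 1) for j in range(width)]
        for i in range(width)
    ]
    return [
        [cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(width)]
        for i in range(width)
    ]
```

Aggregating the statistics of several nodes gives the same matrix as computing them on the pooled rows, which is what makes the two-step mode possible.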

hbpmip/python-distributed-pca: PCA - principal components analysis

Calculates PCA; it only works for real variables. Run it on a single node or in distributed mode. In distributed mode, intermediate mode first calculates the covariance matrix on a single node; aggregate mode is then used to combine the statistics from multiple intermediate jobs and produce the final graph.

Code is shared with hbpmip/python-correlation-heatmap

hbpmip/python-distributed-kmeans: K-means

Implementation of distributed k-means clustering (https://github.com/MRN-Code/dkmeans) in Python. It uses Single-Shot Decentralized Lloyd (https://github.com/MRN-Code/dkmeans#single-shot-decentralized-lloyd).

Intermediate mode calculates clusters on a single node, while aggregate mode merges the clusters according to the least merging error (e.g. the smallest distance between centroids).
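
The aggregate step can be pictured with the sketch below. It is one possible reading of "least merging error", assuming the error is the Euclidean distance between centroids and that each node reports its centroids together with the number of points they cover; it is not taken from the dkmeans code, and merge_clusters is a hypothetical name.

```python
import math
from typing import List, Sequence, Tuple

Cluster = Tuple[List[float], int]  # (centroid, number of points it covers)


def _distance(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def merge_clusters(clusters: Sequence[Cluster], k: int) -> List[Cluster]:
    """Merge the closest pair of centroids until only k clusters remain."""
    if k < 1:
        raise ValueError("k must be at least 1")
    merged = [(list(centroid), count) for centroid, count in clusters]
    while len(merged) > k:
        best = None  # (distance, i, j) of the closest pair seen so far
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                d = _distance(merged[i][0], merged[j][0])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        (ci, ni), (cj, nj) = merged[i], merged[j]
        total = ni + nj
        # The merged centroid is the count-weighted mean of the two centroids.
        centroid = [(a * ni + b * nj) / total for a, b in zip(ci, cj)]
        merged[i] = (centroid, total)
        del merged[j]
    return merged
```

Weighting by point counts keeps the merged centroid equal to the mean of all points covered by the two original clusters.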

hbpmip/python-histograms: Histograms

Calculates the histogram of a nominal or real variable, grouped by the nominal variables among the independent variables. Histogram edges are taken from the minValue and maxValue properties of the dependent variable. If these are not available, the edges are calculated dynamically from the dependent values (this does not work in distributed mode, though).
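
The edge selection described above might look like the following sketch. The minValue and maxValue keys come from the description; the function name histogram_edges is hypothetical, and the default of 20 bins mirrors the "Using default number of bins: 20" log line quoted in the issues below.

```python
from typing import List, Mapping, Sequence


def histogram_edges(
    meta: Mapping[str, object],
    values: Sequence[float],
    nb_bins: int = 20,
) -> List[float]:
    """Return nb_bins + 1 equally spaced edges for a real variable."""
    lo = meta.get("minValue")
    hi = meta.get("maxValue")
    if lo is None or hi is None:
        # Fallback: derive the range from the data seen by this job.
        if not values:
            raise ValueError("no values and no minValue/maxValue in metadata")
        lo, hi = min(values), max(values)
    lo, hi = float(lo), float(hi)
    width = (hi - lo) / nb_bins
    return [lo + i * width for i in range(nb_bins)] + [hi]
```

With the fallback, every node would derive its edges from its own data, so the bins of different nodes would not line up; this is presumably why dynamic edges cannot be used in distributed mode.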

hbpmip/python-jsi-hedwig: Hedwig

Hedwig method for semantic subgroup discovery (https://github.com/anzev/hedwig).

hbpmip/python-jsi-hinmine: HINMINE

The HINMINE algorithm for network-based propositionalization is an algorithm for data analysis based on network analysis methods.

The input for the algorithm is a data set containing instances with real-valued features. The purpose of the algorithm is to construct a new set of features for further analysis by other data mining algorithms. The algorithm outputs a data set with features, generated for each data instance in the input data set. The features represent how close a given instance is to the other instances in the data set. The closeness of instances is measured using the PageRank algorithm, calculated on a network constructed from instance similarities.
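
To make the idea concrete, here is a small sketch of PageRank-based features. It assumes the similarity network is already given as a symmetric weight matrix and uses personalized PageRank with a restart at each instance; the similarity measure, the damping factor of 0.85 and the function names are assumptions made for illustration, not details taken from HINMINE itself.

```python
from typing import List, Sequence


def personalized_pagerank(
    weights: Sequence[Sequence[float]],
    start: int,
    damping: float = 0.85,
    iterations: int = 100,
) -> List[float]:
    """Power iteration for PageRank with all restarts going to `start`."""
    n = len(weights)
    transition = []
    for row in weights:
        total = sum(row)
        if total <= 0:
            raise ValueError("every instance needs at least one neighbour")
        transition.append([w / total for w in row])
    restart = [1.0 if k == start else 0.0 for k in range(n)]
    rank = restart[:]
    for _ in range(iterations):
        rank = [
            (1 - damping) * restart[k]
            + damping * sum(rank[j] * transition[j][k] for j in range(n))
            for k in range(n)
        ]
    return rank


def propositionalize(weights: Sequence[Sequence[float]]) -> List[List[float]]:
    """One feature vector per instance: its closeness to every instance."""
    return [personalized_pagerank(weights, i) for i in range(len(weights))]
```

Each row of the result is a probability distribution over instances, so the new data set has as many features as the input has instances.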

hbpmip/python-knn: k-nearest neighbors

Implementation of k-nearest neighbors algorithm (https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm) in Python.

Run it on a single node or in distributed mode.

hbpmip/python-linear-regression: Linear and logistic regression

Python implementation of multivariate linear regression. It supports both continuous and categorical independent variables. Run it on a single node or in distributed mode.

Python implementation of logistic regression, one class versus the others. Only single-node mode is supported.

hbpmip/python-sgd-regression: SGD family of regressions

This is a Python implementation of scikit-learn estimators (http://scikit-learn.org/stable/modules/scaling_strategies.html) using Stochastic Gradient Descent and the partial_fit method for distributed learning.

Implemented methods:

  • linear_model - calls SGDRegressor or SGDClassifier
  • neural_network - calls MLPRegressor or MLPClassifier
  • naive_bayes - calls MixedNB (mix of GaussianNB and MultinomialNB), only works for classification tasks
  • gradient_boosting - calls GradientBoostingRegressor or GradientBoostingClassifier, does not support distributed training.
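
The role of partial_fit in distributed learning can be shown with a toy model. The class below is a pure-Python stand-in written for this illustration, not scikit-learn's SGDRegressor: it only mimics the idea that a model keeps its weights between calls, so that each node can continue training where the previous one stopped. The class name and the learning rate are assumptions.

```python
from typing import List, Sequence


class TinySGDRegressor:
    """Linear model trained by stochastic gradient descent on squared error."""

    def __init__(self, n_features: int, learning_rate: float = 0.05) -> None:
        self.weights: List[float] = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_one(self, features: Sequence[float]) -> float:
        return self.bias + sum(w * x for w, x in zip(self.weights, features))

    def partial_fit(
        self, X: Sequence[Sequence[float]], y: Sequence[float]
    ) -> "TinySGDRegressor":
        """Update the current weights with one pass over a batch."""
        for features, target in zip(X, y):
            error = self.predict_one(features) - target
            for i, value in enumerate(features):
                self.weights[i] -= self.learning_rate * error * value
            self.bias -= self.learning_rate * error
        return self


def train_across_nodes(model, batches, rounds: int = 1000):
    """Pass the same model over every node's batch, round after round."""
    for _ in range(rounds):
        for X, y in batches:
            model.partial_fit(X, y)
    return model
```

In this picture only the model state travels between nodes; the rows of each batch stay where they are.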

hbpmip/python-summary-statistics: Summary statistics

It calculates various summary statistics for the entire dataset and for every subgroup created by combining all possible values of the nominal covariates. Run it on a single node or in distributed mode.
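
The enumeration of subgroups can be sketched with itertools.product. The function name subgroup_summaries, the row layout (one dictionary per record) and the choice of count and mean as the statistics are assumptions made for this example.

```python
from itertools import product
from statistics import mean
from typing import Any, Dict, List, Mapping, Sequence


def subgroup_summaries(
    rows: Sequence[Mapping[str, Any]],
    variable: str,
    covariate_values: Mapping[str, Sequence[str]],
) -> List[Dict[str, Any]]:
    """Summarise `variable` for the whole dataset and for every subgroup."""
    names = list(covariate_values)
    groups: List[Dict[str, str]] = [{}]  # the empty filter is the entire dataset
    if names:
        # One subgroup per combination of possible covariate values.
        for combo in product(*(covariate_values[name] for name in names)):
            groups.append(dict(zip(names, combo)))
    summaries = []
    for group in groups:
        values = [
            row[variable]
            for row in rows
            if all(row[key] == value for key, value in group.items())
        ]
        summaries.append({
            "group": group,
            "count": len(values),
            "mean": mean(values) if values else None,
        })
    return summaries
```

Combinations that match no record are still reported, with a count of zero, so the output has the same shape on every node.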

hbpmip/python-tsne: t-SNE

python-tsne is a wrapper for the A-tSNE algorithm developed by N. Pezzotti. The underlying algorithm is an improvement on the Barnes-Hut tSNE (http://lvdmaaten.github.io/publications/papers/JMLR_2014.pdf) using an approximated k-nearest neighbor calculation.

hbpmip/java-jsi-clus-fire: k-nearest neighbors

hbpmip/java-jsi-clus-fr: k-nearest neighbors

hbpmip/java-jsi-clus-pct: k-nearest neighbors

hbpmip/java-jsi-clus-pct-ts: k-nearest neighbors

hbpmip/java-jsi-clus-rm: k-nearest neighbors

hbpmip/java-rapidminer-knn: 🌑 k-NN

k-NN implemented with RapidMiner. Deprecated, replaced by hbpmip/python-knn

java-rapidminer-naivebayes: 🌑 Naive Bayes

Naive Bayes implemented with RapidMiner. Deprecated, replaced by hbpmip/python-naivebayes

hbpmip/r-linear-regression: 🌑 Linear regression

Linear regression implemented in R, with support for federated results. Deprecated, replaced by hbpmip/python-linear-regression

Algorithm capabilities

Algorithm Description Predictive Federated results In production Used for Runtime engine
hbpmip/python-anova Anova ✔️ 🔜 ✔️ Regression Woken
hbpmip/python-correlation-heatmap Correlation heatmap ❌ ✔️ Visualisation Woken
hbpmip/python-distributed-pca PCA ✔️ ✔️ Visualisation Woken
hbpmip/python-distributed-kmeans K-means ✔️ ✔️ Clustering Woken
hbpmip/python-histograms Histograms ✔️ ✔️ Visualisation Woken
hbpmip/python-jsi-hedwig Hedwig ❌ ✔️ Woken
hbpmip/python-jsi-hinmine HINMINE ❌ ✔️ Woken
hbpmip/python-knn k-NN ✔️ ✔️ ✔️ Clustering Woken
hbpmip/python-linear-regression Linear regression ✔️ ✔️ ✔️ Regression Woken
hbpmip/python-linear-regression Logistic regression ✔️ ❌ ✔️ Regression, Classification Woken
hbpmip/python-sgd-regression SGD Linear model ✔️ ✔️ ✔️ Classification Woken
hbpmip/python-sgd-regression SGD Neural Network ✔️ ❌ ✔️ Classification Woken
hbpmip/python-sgd-regression SGD Naive Bayes ✔️ ❌ ✔️ Classification Woken
hbpmip/python-sgd-regression SGD Gradient Boosting ✔️ ❌ ✔️ Classification Woken
hbpmip/python-summary-statistics Summary statistics ✔️ ✔️ Data exploration Woken
hbpmip/python-tsne t-SNE ❌ ✔️ Visualisation Woken

Acknowledgements

This work has been funded by the European Union Seventh Framework Program (FP7/2007-2013) under grant agreement no. 604102 (HBP).

This work is part of SP8 of the Human Brain Project (SGA1).

algorithm-repository's People

Contributors

ajutzeler, bldrvnlw, clefourrier, ludovicc, marigold, mbreskvar, midam, mirco-nasuti, nicedexter, shay-y

algorithm-repository's Issues

KNN fails on nominal values

http://frontend/experiment/?variables=alzheimerbroadcategory&coVariables=minimentalstate

An error has occurred while running your experiment K-nearest neighbors with k=5. Here's the message:

Invalid JSON:
java.lang.Exception: com.opendatagroup.hadrian.errors.PFASemanticException: PFA semantic error at JSON line:col 1:12319 (PFA field "fcns -> toArray -> do -> 0"): array constructed with "new" has wrong type for item 0: {"type":"enum","symbols":["_28","_27","_21","_30","_29","_18","_26","_9","_24","_13","_22","_25","_16","_12","_20","_23","_19","_17","_4","_15","_11","_8"],"name":"Enum_minimentalstate"} rather than "double"

RapidMiner algorithms: numerical label not supported

woken version: 9750bdc
woken-validation version: 9750bdc
knn version: 0.2.1 or naive-bayes version: 0.2.0

When using nominal variables whose categories are integers (e.g. apoe4), RapidMiner complains (see the woken logs): "numerical label not supported".

Python-histogram fails on genetic variables

INFO:root:variable: rs17125944_c
INFO:root:groups: ['dataset', 'gender', 'agegroup', 'alzheimerbroadcategory']
INFO:root:columns: ['rs17125944_c', 'dataset', 'gender', 'agegroup', 'alzheimerbroadcategory']
Traceback (most recent call last):
  File "/main.py", line 273, in <module>
    main()
  File "/main.py", line 31, in main
    json.dumps(generate_descriptive_stats(var, groups, data, data_columns),
  File "/main.py", line 45, in generate_descriptive_stats
    output.append(generate_histogram(data, data_columns, var))
  File "/main.py", line 78, in generate_histogram
    var_categories)
  File "/main.py", line 105, in histo_nominal
    sums[v.rstrip()] += 1
KeyError: '0'

docker inspect 8ca8c5b6026a
[
{
"Id": "8ca8c5b6026a719d2c1f7b7677905c5d487d4b574ee58854367001d7ae28d4ed",
"Created": "2017-10-16T10:45:54.46321336Z",
"Path": "/docker-entrypoint.sh",
"Args": [
"compute"
],
"State": {
"Status": "exited",
"Running": false,
"Paused": false,
"Restarting": false,
"OOMKilled": false,
"Dead": false,
"Pid": 0,
"ExitCode": 1,
"Error": "",
"StartedAt": "2017-10-16T10:45:54.795594797Z",
"FinishedAt": "2017-10-16T10:45:55.313828253Z"
},
"Image": "sha256:5ad4879b87429baf105d219aee70c96575a7c4a084593574a4a48562c789747f",
"ResolvConfPath": "/var/lib/docker/containers/8ca8c5b6026a719d2c1f7b7677905c5d487d4b574ee58854367001d7ae28d4ed/resolv.conf",
"HostnamePath": "/var/lib/docker/containers/8ca8c5b6026a719d2c1f7b7677905c5d487d4b574ee58854367001d7ae28d4ed/hostname",
"HostsPath": "/var/lib/docker/containers/8ca8c5b6026a719d2c1f7b7677905c5d487d4b574ee58854367001d7ae28d4ed/hosts",
"LogPath": "",
"Name": "/mesos-3b920c1c-0295-46f7-913f-29efaf61a17f",
"RestartCount": 0,
"Driver": "overlay",
"Platform": "linux",
"MountLabel": "",
"ProcessLabel": "",
"AppArmorProfile": "docker-default",
"ExecIDs": null,
"HostConfig": {
"Binds": [
"/var/lib/mesos/slaves/dcf68ead-69c7-47cf-a4e5-5c062974548c-S0/docker/links/3b920c1c-0295-46f7-913f-29efaf61a17f:/mnt/mesos/sandbox"
],
"ContainerIDFile": "",
"LogConfig": {
"Type": "journald",
"Config": {}
},
"NetworkMode": "host",
"PortBindings": {},
"RestartPolicy": {
"Name": "no",
"MaximumRetryCount": 0
},
"AutoRemove": false,
"VolumeDriver": "",
"VolumesFrom": null,
"CapAdd": null,
"CapDrop": null,
"Dns": [],
"DnsOptions": [],
"DnsSearch": [],
"ExtraHosts": null,
"GroupAdd": null,
"IpcMode": "shareable",
"Cgroup": "",
"Links": null,
"OomScoreAdj": 0,
"PidMode": "",
"Privileged": false,
"PublishAllPorts": false,
"ReadonlyRootfs": false,
"SecurityOpt": null,
"UTSMode": "",
"UsernsMode": "",
"ShmSize": 67108864,
"Runtime": "runc",
"ConsoleSize": [
0,
0
],
"Isolation": "",
"CpuShares": 512,
"Memory": 536870912,
"NanoCpus": 0,
"CgroupParent": "",
"BlkioWeight": 0,
"BlkioWeightDevice": [],
"BlkioDeviceReadBps": null,
"BlkioDeviceWriteBps": null,
"BlkioDeviceReadIOps": null,
"BlkioDeviceWriteIOps": null,
"CpuPeriod": 0,
"CpuQuota": 0,
"CpuRealtimePeriod": 0,
"CpuRealtimeRuntime": 0,
"CpusetCpus": "",
"CpusetMems": "",
"Devices": [],
"DeviceCgroupRules": null,
"DiskQuota": 0,
"KernelMemory": 0,
"MemoryReservation": 0,
"MemorySwap": -1,
"MemorySwappiness": null,
"OomKillDisable": false,
"PidsLimit": 0,
"Ulimits": null,
"CpuCount": 0,
"CpuPercent": 0,
"IOMaximumIOps": 0,
"IOMaximumBandwidth": 0
},
"GraphDriver": {
"Data": {
"LowerDir": "/var/lib/docker/overlay/8b0cedaad7e59864d3157a855d0c8829c6827ca7f2878492ad5630f08e095058/root",
"MergedDir": "/var/lib/docker/overlay/aa995cca2fd566f1a64511665d92fe9ca5a5a34ec03f1713a00e00c91667787e/merged",
"UpperDir": "/var/lib/docker/overlay/aa995cca2fd566f1a64511665d92fe9ca5a5a34ec03f1713a00e00c91667787e/upper",
"WorkDir": "/var/lib/docker/overlay/aa995cca2fd566f1a64511665d92fe9ca5a5a34ec03f1713a00e00c91667787e/work"
},
"Name": "overlay"
},
"Mounts": [
{
"Type": "bind",
"Source": "/var/lib/mesos/slaves/dcf68ead-69c7-47cf-a4e5-5c062974548c-S0/docker/links/3b920c1c-0295-46f7-913f-29efaf61a17f",
"Destination": "/mnt/mesos/sandbox",
"Mode": "",
"RW": true,
"Propagation": "rprivate"
}
],
"Config": {
"Hostname": "hos49130",
"Domainname": "",
"User": "",
"AttachStdin": false,
"AttachStdout": true,
"AttachStderr": true,
"Tty": false,
"OpenStdin": false,
"StdinOnce": false,
"Env": [
"IN_JDBC_PASSWORD=r5s3a6c9d8p2v",
"IN_JDBC_URL=jdbc:postgresql://hos49130.intranet.chuv:31433/research",
"OUT_JDBC_USER=woken",
"PARAM_covariables=",
"PARAM_meta={"rs17125944_c":{"sql_type":"int","enumerations":[{"code":0,"label":0},{"code":1,"label":1},{"code":2,"label":2}],"description":"","methodology":"lren-nmm-volumes","label":"rs17125944_C","code":"rs17125944_c","type":"polynominal"},"agegroup":{"enumerations":[{"code":"-50y","label":"-50y"},{"code":"50-59y","label":"50-59y"},{"code":"60-69y","label":"60-69y"},{"code":"70-79y","label":"70-79y"},{"code":"+80y","label":"+80y"}],"description":"Age Group","methodology":"mip-cde","label":"Age Group","code":"agegroup","type":"polynominal"},"alzheimerbroadcategory":{"enumerations":[{"code":"AD","label":"Alzheimer's disease"},{"code":"CN","label":"Cognitively Normal"},{"code":"Other","label":"Other"}],"description":"There will be two broad categories taken into account. Alzheimer's disease (AD) in which the diagnostic is 100% certain and \"Other\" comprising the rest of Alzheimer's related categories. The \"Other\" category refers to Alzheime's related diagnosis which origin can be traced to other pathology eg. vascular. In this category MCI diagnosis can also be found. In summary, all Alzheimer's related diagnosis that are not pure.","methodology":"mip-cde","label":"Alzheimer Broad Category","code":"alzheimerbroadcategory","type":"polynominal"},"dataset":{"enumerations":[{"code":"edsd","label":"EDSD"},{"code":"adni","label":"ADNI"},{"code":"ppmi","label":"PPMI"}],"description":"Variable used to differentiate datasets.","label":"Dataset","code":"dataset","type":"polynominal"},"gender":{"enumerations":[{"code":"M","label":"Male"},{"code":"F","label":"Female"}],"description":"Gender of the patient - Sex assigned at birth","methodology":"mip-cde","label":"Gender","code":"gender","length":1,"type":"binominal"}}",
"PARAM_query=select rs17125944_c,dataset,gender,agegroup,alzheimerbroadcategory from mip_local_features where rs17125944_c is not null and dataset is not null and gender is not null and agegroup is not null and alzheimerbroadcategory is not null ",
"IN_JDBC_DRIVER=org.postgresql.Driver",
"JOB_ID=40c37a3b-c0f2-4c87-ae1f-360d490a91a3",
"OUT_JDBC_PASSWORD=aDB/neuroinfo",
"OUT_JDBC_URL=jdbc:postgresql://hos49130.intranet.chuv:31433/woken",
"PARAM_variables=rs17125944_c",
"IN_JDBC_JAR_PATH=/usr/lib/R/libraries/postgresql-9.4-1201.jdbc41.jar",
"DOCKER_IMAGE=hbpmip/python-histograms:4cb93ea",
"NODE=hos49130.intranet.chuv",
"CHRONOS_RESOURCE_CPU=0.5",
"OUT_JDBC_DRIVER=org.postgresql.Driver",
"HOST=hos49130.intranet.chuv",
"OUT_JDBC_JAR_PATH=/usr/lib/R/libraries/postgresql-9.4-1201.jdbc41.jar",
"PARAM_grouping=dataset,gender,agegroup,alzheimerbroadcategory",
"CHRONOS_RESOURCE_MEM=512.0",
"IN_JDBC_USER=research",
"CHRONOS_RESOURCE_DISK=256.0",
"MESOS_SANDBOX=/mnt/mesos/sandbox",
"CHRONOS_JOB_NAME=python_histograms_40c37a3b_c0f2_4c87_ae1f_360d490a91a3",
"MESOS_CONTAINER_NAME=mesos-3b920c1c-0295-46f7-913f-29efaf61a17f",
"mesos_task_id=ct:1508150752607:2:python_histograms_40c37a3b_c0f2_4c87_ae1f_360d490a91a3:",
"CHRONOS_JOB_OWNER=[email protected]",
"PATH=/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"LANG=C.UTF-8",
"LC_ALL=C.UTF-8",
"COMPUTE_IN=/data/in",
"COMPUTE_OUT=/data/out",
"MODEL=histograms",
"FUNCTION=python-histograms",
"CODE=histo",
"NAME=Histograms"
],
"Cmd": [
"compute"
],
"Image": "hbpmip/python-histograms:4cb93ea",
"Volumes": null,
"WorkingDir": "",
"Entrypoint": [
"/docker-entrypoint.sh"
],
"OnBuild": null,
"Labels": {
"eu.humanbrainproject.category": "Python"
}
},
"NetworkSettings": {
"Bridge": "",
"SandboxID": "b32fbbaa1aa9b729443855d9c090aa67e149387d338ebf4a858fa74e357295ed",
"HairpinMode": false,
"LinkLocalIPv6Address": "",
"LinkLocalIPv6PrefixLen": 0,
"Ports": {},
"SandboxKey": "/var/run/docker/netns/default",
"SecondaryIPAddresses": null,
"SecondaryIPv6Addresses": null,
"EndpointID": "",
"Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"IPAddress": "",
"IPPrefixLen": 0,
"IPv6Gateway": "",
"MacAddress": "",
"Networks": {
"host": {
"IPAMConfig": null,
"Links": null,
"Aliases": null,
"NetworkID": "dd644a5262cb021644570c9257a95e808a3d4fca5e813fd3513752dfd3a26e8b",
"EndpointID": "",
"Gateway": "",
"IPAddress": "",
"IPPrefixLen": 0,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"MacAddress": "",
"DriverOpts": null
}
}
}
}
]
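
The traceback fails on the key '0' (a string), while PARAM_meta above declares the enumeration codes of rs17125944_c as the integers 0, 1 and 2. One plausible cause, stated here as an assumption since main.py is not shown, is that the counters are keyed by the integer codes while the values read from the database arrive as strings. A sketch of a counter that normalises both sides to strings (count_nominal is a hypothetical name):

```python
from typing import Dict, Iterable, Mapping, Sequence


def count_nominal(
    values: Iterable[object],
    enumerations: Sequence[Mapping[str, object]],
) -> Dict[str, int]:
    """Count occurrences per category, comparing codes and values as strings."""
    counts = {str(item["code"]).strip(): 0 for item in enumerations}
    for value in values:
        key = str(value).strip()
        if key not in counts:
            # Report undeclared categories explicitly instead of a bare KeyError.
            raise ValueError(f"value {key!r} is not a declared category")
        counts[key] += 1
    return counts
```

With the enumerations of rs17125944_c, the values "0", "2 " and 1 are all counted under the string keys "0", "2" and "1".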

Tests for python-knn are failing

While running python-knn/tests/test.sh:

Run the distributed-knn...
WARNING: Dependency conflict: an older version of the 'docker-py' package may be polluting the namespace. If you're experiencing crashes, run the following command to remedy the issue:
pip uninstall docker-py; pip uninstall docker; pip install docker
Starting tests_db_1 ... done
Traceback (most recent call last):
  File "/knn.py", line 28, in <module>
    from sklearn_to_pfa.sklearn_to_pfa import sklearn_to_pfa
  File "/usr/local/lib/python3.6/site-packages/sklearn_to_pfa/sklearn_to_pfa.py", line 23, in <module>
    import titus.prettypfa
  File "/usr/local/lib/python3.6/site-packages/titus/prettypfa.py", line 26, in <module>
    from titus.pfaast import Subs
  File "/usr/local/lib/python3.6/site-packages/titus/pfaast.py", line 25, in <module>
    import titus.lib.core
  File "/usr/local/lib/python3.6/site-packages/titus/lib/core.py", line 25, in <module>
    from titus.signature import Sig
  File "/usr/local/lib/python3.6/site-packages/titus/signature.py", line 23, in <module>
    import titus.P as P
  File "/usr/local/lib/python3.6/site-packages/titus/P.py", line 20, in <module>
    from titus.datatype import Type
  File "/usr/local/lib/python3.6/site-packages/titus/datatype.py", line 23, in <module>
    import avro.io
  File "/usr/local/lib/python3.6/site-packages/avro/io.py", line 200
    bits = (((ord(self.read(1)) & 0xffL)) |
                                      ^
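
The caret points at 0xffL, an integer literal with the Python 2 long suffix, which Python 3 does not accept. This suggests (as an interpretation of the traceback, not a confirmed diagnosis) that a Python 2 build of the avro package ended up in the Python 3.6 site-packages. The check below, with the hypothetical helper compiles_on_this_interpreter, shows the difference between the two literals on whichever interpreter runs it:

```python
def compiles_on_this_interpreter(expression: str) -> bool:
    """Return True if the expression is valid syntax for the running Python."""
    try:
        compile(expression, "<snippet>", "eval")
    except SyntaxError:
        return False
    return True


# On Python 3, integers have arbitrary precision and the "L" suffix is gone,
# so only the literal without the suffix compiles.
LEGACY_LITERAL_OK = compiles_on_this_interpreter("0xffL")
MODERN_LITERAL_OK = compiles_on_this_interpreter("0xff")
```

If this reading is right, installing a Python 3 compatible build of avro in the image should let the import chain complete.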

Python histograms fails sometimes

Seen on CLM Vertex, with research + CLM datasets:

            "PARAM_variables=rs610932_a",
            "CHRONOS_JOB_NAME=python_histograms_bab479f1_1b1b_4e03_b381_dfa4f6893111",
            "PARAM_grouping=dataset,gender,agegroup,alzheimerbroadcategory",
            "OUT_JDBC_JAR_PATH=/usr/lib/R/libraries/postgresql-9.4-1201.jdbc41.jar",
            "OUT_JDBC_PASSWORD=aDB/neuroinfo",
            "PARAM_covariables=",
            "mesos_task_id=ct:1507726483217:2:python_histograms_bab479f1_1b1b_4e03_b381_dfa4f6893111:",
            "CHRONOS_RESOURCE_DISK=256.0",
            "DOCKER_IMAGE=hbpmip/python-histograms:4cb93ea",
            "MESOS_SANDBOX=/mnt/mesos/sandbox",
            "OUT_JDBC_DRIVER=org.postgresql.Driver",
            "PARAM_query=select rs610932_a,dataset,gender,agegroup,alzheimerbroadcategory from mip_local_features where rs610932_a is not null and dataset is not null and gender is not null and agegroup is not null and alzheimerbroadcategory is not null ",
            "CHRONOS_RESOURCE_CPU=0.5",
            "IN_JDBC_URL=jdbc:postgresql://hos49130.intranet.chuv:31433/research",
            "OUT_JDBC_URL=jdbc:postgresql://hos49130.intranet.chuv:31433/woken",
            "IN_JDBC_USER=research",
            "NODE=hos49130.intranet.chuv",
            "JOB_ID=bab479f1-1b1b-4e03-b381-dfa4f6893111",
            "OUT_JDBC_USER=woken",
            "PARAM_meta={\"agegroup\":{\"enumerations\":[{\"code\":\"-50y\",\"label\":\"-50y\"},{\"code\":\"50-59y\",\"label\":\"50-59y\"},{\"code\":\"60-69y\",\"label\":\"60-69y\"},{\"code\":\"70-79y\",\"label\":\"70-79y\"},{\"code\":\"+80y\",\"label\":\"+80y\"}],\"description\":\"Age Group\",\"methodology\":\"mip-cde\",\"label\":\"Age Group\",\"code\":\"agegroup\",\"type\":\"polynominal\"},\"alzheimerbroadcategory\":{\"enumerations\":[{\"code\":\"AD\",\"label\":\"Alzheimer's disease\"},{\"code\":\"CN\",\"label\":\"Cognitively Normal\"},{\"code\":\"Other\",\"label\":\"Other\"}],\"description\":\"There will be two broad categories taken into account. Alzheimer's disease (AD) in which the diagnostic is 100% certain and \\\"Other\\\" comprising the rest of Alzheimer's related categories. The \\\"Other\\\" category refers to Alzheime's related diagnosis which origin can be traced to other pathology eg. vascular. In this category MCI diagnosis can also be found. In summary, all Alzheimer's related diagnosis that are not pure.\",\"methodology\":\"mip-cde\",\"label\":\"Alzheimer Broad Category\",\"code\":\"alzheimerbroadcategory\",\"type\":\"polynominal\"},\"dataset\":{\"enumerations\":[{\"code\":\"edsd\",\"label\":\"EDSD\"},{\"code\":\"adni\",\"label\":\"ADNI\"},{\"code\":\"ppmi\",\"label\":\"PPMI\"}],\"description\":\"Variable used to differentiate datasets.\",\"label\":\"Dataset\",\"code\":\"dataset\",\"type\":\"polynominal\"},\"rs610932_a\":{\"sql_type\":\"int\",\"enumerations\":[{\"code\":0,\"label\":0},{\"code\":1,\"label\":1},{\"code\":2,\"label\":2}],\"description\":\"\",\"methodology\":\"lren-nmm-volumes\",\"label\":\"rs610932_A\",\"code\":\"rs610932_a\",\"type\":\"polynominal\"},\"gender\":{\"enumerations\":[{\"code\":\"M\",\"label\":\"Male\"},{\"code\":\"F\",\"label\":\"Female\"}],\"description\":\"Gender of the patient - Sex assigned at birth\",\"methodology\":\"mip-cde\",\"label\":\"Gender\",\"code\":\"gender\",\"length\":1,\"type\":\"binominal\"}}",
            "HOST=hos49130.intranet.chuv",
            "IN_JDBC_JAR_PATH=/usr/lib/R/libraries/postgresql-9.4-1201.jdbc41.jar",
            "CHRONOS_RESOURCE_MEM=512.0",
            "MESOS_CONTAINER_NAME=mesos-3987976c-26a3-415d-8263-f8db921fb1b4",
            "IN_JDBC_PASSWORD=r5s3a6c9d8p2v",
            "[email protected]",
            "IN_JDBC_DRIVER=org.postgresql.Driver",
            "PATH=/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
            "LANG=C.UTF-8",
            "LC_ALL=C.UTF-8",
            "COMPUTE_IN=/data/in",
            "COMPUTE_OUT=/data/out",
            "MODEL=histograms",
            "FUNCTION=python-histograms",
            "CODE=histo",
            "NAME=Histograms"

Python-histogram fails on TIV variable

Seen using latest web-analytics-starter:

0bdc026ca55c        hbpmip/python-histograms:0.3.6        "python /histograms.…"   55 seconds ago       Exited (1) 53 seconds ago

docker logs 0bdc026ca55c
INFO:root:Using default number of bins: 20
Traceback (most recent call last):
  File "/histograms.py", line 149, in <module>
    main()
  File "/histograms.py", line 37, in main
    histograms_results = compute_histograms(dep_var, indep_vars, nb_bins)
  File "/histograms.py", line 46, in compute_histograms
    histograms.append(compute_histogram(dep_var, nb_bins=nb_bins))
  File "/histograms.py", line 59, in compute_histogram
    categories, categories_labels = compute_categories(dep_var, nb_bins)
  File "/histograms.py", line 91, in compute_categories
    minimum = min(values)
ValueError: min() arg is an empty sequence

"PARAM_covariables=",
"PARAM_grouping=dataset,gender,agegroup,alzheimerbroadcategory",
"PARAM_meta={\"tiv\":{\"description\":\"Total intra-cranial volume\",\"methodology\":\"lren-nmm-volumes\",\"label\":\"TIV\",\"code\":\"tiv\",\"units\":\"cm3\",\"type\":\"real\"},\"agegroup\":{\"enumerations\":[{\"code\":\"-50y\",\"label\":\"-50y\"},{\"code\":\"50-59y\",\"label\":\"50-59y\"},{\"code\":\"60-69y\",\"label\":\"60-69y\"},{\"code\":\"70-79y\",\"label\":\"70-79y\"},{\"code\":\"+80y\",\"label\":\"+80y\"}],\"description\":\"Age Group\",\"methodology\":\"mip-cde\",\"label\":\"Age Group\",\"code\":\"agegroup\",\"type\":\"polynominal\"},\"alzheimerbroadcategory\":{\"enumerations\":[{\"code\":\"AD\",\"label\":\"Alzheimer's disease\"},{\"code\":\"CN\",\"label\":\"Cognitively Normal\"},{\"code\":\"Other\",\"label\":\"Other\"}],\"description\":\"There will be two broad categories taken into account. Alzheimer's disease (AD) in which the diagnostic is 100% certain and \\\"Other\\\" comprising the rest of Alzheimer's related categories. The \\\"Other\\\" category refers to Alzheime's related diagnosis which origin can be traced to other pathology eg. vascular. In this category MCI diagnosis can also be found. In summary, all Alzheimer's related diagnosis that are not pure.\",\"methodology\":\"mip-cde\",\"label\":\"Alzheimer Broad Category\",\"code\":\"alzheimerbroadcategory\",\"type\":\"polynominal\"},\"dataset\":{\"enumerations\":[{\"code\":\"cde_features_a\",\"label\":\"CHUV\"},{\"code\":\"cde_features_b\",\"label\":\"Brescia\"},{\"code\":\"cde_features_c\",\"label\":\"Lille\"}],\"description\":\"Variable used to differentiate datasets.\",\"methodology\":\"mip-cde\",\"label\":\"Dataset\",\"code\":\"dataset\",\"type\":\"polynominal\"},\"gender\":{\"enumerations\":[{\"code\":\"M\",\"label\":\"Male\"},{\"code\":\"F\",\"label\":\"Female\"}],\"description\":\"Gender of the patient - Sex assigned at birth\",\"methodology\":\"mip-cde\",\"label\":\"Gender\",\"code\":\"gender\",\"length\":1,\"type\":\"binominal\"}}",
"PARAM_query=SELECT \"tiv\",\"dataset\",\"gender\",\"agegroup\",\"alzheimerbroadcategory\" FROM cde_features_a WHERE \"tiv\" IS NOT NULL",
"PARAM_variables=tiv"
