haghish / mlim
mlim: single and multiple imputation with automated machine learning
License: Other
Hi,
Thank you for your constant work on this package. It is awesome. I have had no problems using it so far, and it gives very good results.
However, I have a problem with my latest dataset and cannot figure out what is causing it.
This is the error:
> dane_elnet <- mlim(as.data.frame(x_miss), m=1, seed = 2022, tuning_time = 900, algos = c("ELNET"), report = "imputation_ELNET.md")
Random Forest preimputation in progress...
data 1, iteration 1 (RAM = 138.897 GiB):
| | 0%
21:40:03.915: GLM_1_AutoML_1_20220908_214002 [GLM def_1] failed: java.lang.ArrayIndexOutOfBoundsException: Index 57 out of bounds for length 57
21:40:03.935: Empty leaderboard.
AutoML was not able to build any model within a max runtime constraint of 900 seconds, you may want to increase this value before retrying.
21:50:18.391: New models will be added to existing leaderboard mlim@@TNM (leaderboard frame=null) with already 0 models.
21:50:18.769: GLM_2_AutoML_2_20220908_215018 [GLM def_1] failed: java.lang.ArrayIndexOutOfBoundsException: Index 57 out of bounds for length 57
21:50:18.785: Empty leaderboard.
AutoML was not able to build any model within a max runtime constraint of 900 seconds, you may want to increase this value before retrying.
21:20:07.760: New models will be added to existing leaderboard mlim@@TNM (leaderboard frame=null) with already 0 models.
21:20:07.990: GLM_3_AutoML_3_20220909_212007 [GLM def_1] failed: java.lang.ArrayIndexOutOfBoundsException
21:20:08.1: Empty leaderboard.
AutoML was not able to build any model within a max runtime constraint of 900 seconds, you may want to increase this value before retrying.
connection to JAVA server failed...
Error in value[[3L]](cond) : Java server crashed. perhaps a RAM problem?
In addition: Warning message:
In .automl.fetch_state(project_name) :
The leaderboard contains zero models: try running AutoML for longer (the default is 1 hour).
The dataset seems to be cleaned and formatted nicely. I also cannot figure out what "Index 57 out of bounds for length 57" is referring to.
> str(x_miss)
'data.frame': 1641 obs. of 42 variables:
$ Wiek : num 69.6 73.1 76.5 65.1 63.3 48.4 68.8 69.5 78.2 71.1 ...
$ EBRT_BT : Factor w/ 2 levels "BT BOOST","EBRT": 2 2 2 2 2 2 2 2 2 2 ...
$ GGG : num 2 5 1 4 1 3 2 1 5 1 ...
$ cores : num 6 6 10 6 6 NA NA 6 12 6 ...
$ cores_positive : num 6 1 6 5 2 NA NA 2 12 1 ...
$ cores_positive_proc: num 1 0.167 0.6 0.833 0.333 ...
$ max_prccancer : num NA 50 100 100 NA NA 100 50 NA 20 ...
$ TURP : Factor w/ 2 levels "0_No","1_Yes": 1 1 1 1 1 1 1 1 1 1 ...
$ V_prostata : num 31.9 35.7 67.2 38.5 26.5 35.9 NA 60 NA 156 ...
$ MR_pre_EPE : Factor w/ 2 levels "0_No","1_Yes": 2 NA NA NA NA NA 1 NA NA NA ...
$ MR_pre_SVI : Factor w/ 2 levels "0_No","1_Yes": 2 NA NA NA NA NA 1 NA NA NA ...
$ PSA_density : num 0.72 0.59 1.29 1.09 1.53 4.74 NA 0.25 NA 0.07 ...
$ TNM : Factor w/ 7 levels "T1c","T2a","T2b",..: 7 3 3 6 1 6 NA 3 5 2 ...
$ ZUBROD : num 0 1 1 0 0 0 0 0 0 0 ...
$ PSAmax : num 23 21 86.4 41.8 40.5 ...
$ Risk_Group : Factor w/ 4 levels "1_low_IR","2_high_IR",..: 4 4 3 4 3 4 3 2 4 2 ...
$ ADT_pre_RT : Factor w/ 2 levels "0_No","1_Yes": 2 2 2 2 2 2 2 2 2 1 ...
$ ADT_intractu_RT : Factor w/ 2 levels "0_No","1_Yes": 2 2 2 2 2 2 2 2 2 1 ...
$ ADT_ADJ : Factor w/ 2 levels "0_No","1_Yes": 1 2 2 2 2 2 2 2 2 1 ...
$ ADT_typ : Factor w/ 5 levels "0_brak","analog",..: 3 3 3 3 3 3 3 3 3 1 ...
$ ADT_czas_pre_RT : num 100 110 104 92 74 76 101 90 113 1 ...
$ ADT_ADJ_CZAS : num 0 2.66 51.12 74.8 9.63 ...
$ ADT_czas_suma : num 5.81 11.4 59.1 80.35 13.7 ...
$ czas_PSA_pre : num NA NA 0.76 8.74 1.38 NA NA 6.44 NA 0.66 ...
$ PSA_pre_RT : num 0.08 0.04 0 41.8 23.28 ...
$ czas_RT : num 38 40 62 50 61 39 50 50 58 46 ...
$ DCp : num 38 42 44 54 58 60 60 62 72 68 ...
$ N_RT : num 1 1 1 1 1 1 0 1 1 0 ...
$ DCn : num 38 42 44 50 44 50 0 50 43.2 0 ...
$ DCbt : num 0 0 0 0 0 0 0 0 0 0 ...
$ BTfx : num 0 0 0 0 0 0 0 0 0 0 ...
$ BED3 : num 63.3 70 73.3 90 96.7 ...
$ BED1_5 : num 88.7 98 102.7 126 135.3 ...
$ FU : num 0.5 4 53.2 102.2 11.6 ...
$ Zgon : num 1 1 1 1 1 1 1 1 1 1 ...
$ BC : num 0 0 0 0 0 1 0 0 1 1 ...
$ MFS_24 : num 1 0 1 1 0 1 1 1 1 1 ...
$ FFM : num 0 0 0 0 0 1 0 0 0 1 ...
$ BC_czas : num 0.53 3.98 53.15 102.23 11.63 ...
$ FFM_czas : num 0.53 3.98 53.15 102.23 11.63 ...
$ MFS_czas_24 : num 23.72 3.98 53.15 102.76 11.63 ...
$ OS : num 23.7 112.5 53.2 102.8 84.3 ...
I have also not found any suspicious variables...
> caret::nearZeroVar(x_miss)
integer(0)
Other packages seem to describe the missing values without problems:
naniar::vis_miss(x_miss, sort_miss = T)
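A minimal base-R sketch of a few further per-column checks (share of missing values, number of factor levels, and declared-but-unused factor levels), in case they help narrow things down. The `toy` data frame and its column names are hypothetical stand-ins for `x_miss`, and the idea that unused factor levels could be related to the index error is only an assumption, not a confirmed cause.

```r
# Hypothetical toy data standing in for x_miss (not the real dataset)
toy <- data.frame(
  age   = c(69.6, 73.1, NA, 65.1),
  stage = factor(c("T1c", "T2a", NA, "T2b"),
                 levels = c("T1c", "T2a", "T2b", "T3a")),  # "T3a" is never observed
  turp  = factor(c("0_No", "0_No", "1_Yes", NA))
)

# Share of missing values in each column
na_share <- colMeans(is.na(toy))

# Number of declared levels per column (0 for non-factor columns)
n_levels <- vapply(toy, nlevels, integer(1))

# Number of levels that are declared but never occur in the data
unused_levels <- vapply(toy, function(col) {
  if (is.factor(col)) sum(table(col) == 0) else 0L
}, integer(1))

# One row per column, for a quick visual scan
summary_tab <- data.frame(na_share, n_levels, unused_levels)
print(summary_tab)
```

On the real data, replacing `toy` with `x_miss` would give the same table. If any column shows unused levels, `droplevels(x_miss)` removes them before the data is passed on.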
Any suggestion would be really helpful. Thank you in advance.
Konrad