I have about 1.9 million lines (190W) of training data and about 400,000 lines (40W) of test data. What does the error at the end of this log mean, and how can I resolve it?
2017.12.27 13:54:10 com.fenbi.mp4j.comm.CommMaster - slave num:1, port:65534
2017.12.27 13:54:10 org.apache.hadoop.ipc.CallQueueManager - Using callQueue class java.util.concurrent.LinkedBlockingQueue
2017.12.27 13:54:10 org.apache.hadoop.ipc.Server - Starting Socket Reader #1 for port 65534
2017.12.27 13:54:11 org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2017.12.27 13:54:11 com.fenbi.mp4j.comm.CommMaster - rpc server started!, rpcport=65534
2017.12.27 13:54:11 com.fenbi.ytklearn.worker.TrainWorker - configFile:config/model/flt_gbdt.conf
2017.12.27 13:54:11 com.fenbi.ytklearn.worker.TrainWorker - configPath:config/model/flt_gbdt.conf
2017.12.27 13:54:11 com.fenbi.ytklearn.worker.TrainWorker - pyTransformScript:
2017.12.27 13:54:11 com.fenbi.ytklearn.worker.TrainWorker - loginName:user
2017.12.27 13:54:11 com.fenbi.ytklearn.worker.TrainWorker - hostName:BOAXGLNJW0FEFII, hostPort:65534
2017.12.27 13:54:11 com.fenbi.ytklearn.worker.TrainWorker - threadNum:6
2017.12.27 13:54:11 com.fenbi.ytklearn.worker.TrainWorker - modelName:gbdt
2017.12.27 13:54:11 org.apache.hadoop.ipc.Server - IPC Server listener on 65534: starting
2017.12.27 13:54:11 org.apache.hadoop.ipc.Server - IPC Server Responder: starting
2017.12.27 13:54:11 com.fenbi.mp4j.comm.ProcessCommSlave - master host:BOAXGLNJW0FEFII, master port:65534
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - connecting:BOAXGLNJW0FEFII###62585, connected count:1
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - host names before sort:[BOAXGLNJW0FEFII###62585]
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - host names after sort:[BOAXGLNJW0FEFII###62585]
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - current slave's rank:0, address:BOAXGLNJW0FEFII###62585
2017.12.27 13:54:11 com.fenbi.mp4j.comm.ProcessCommSlave - this slave recv data port:62585
2017.12.27 13:54:11 com.fenbi.mp4j.comm.ProcessCommSlave - slave num:1
2017.12.27 13:54:11 com.fenbi.mp4j.comm.ProcessCommSlave - slave rank:0
2017.12.27 13:54:11 com.fenbi.mp4j.comm.ProcessCommSlave - Pid is:7748
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - Pid is:7748
2017.12.27 13:54:11 com.fenbi.mp4j.comm.ProcessCommSlave - slaves addresses:
2017.12.27 13:54:11 com.fenbi.mp4j.comm.ProcessCommSlave - BOAXGLNJW0FEFII:62585
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - this slave init finished!
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - ################ parameters ################
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - data.delim.feature_name_val_delim=ConfigString(":")
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - model.dict_path=ConfigString("config/model/feat_dict")
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - optimization.min_split_loss=ConfigInt(0)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - optimization.max_leaf_cnt=ConfigInt(16)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - model.need_dict=ConfigBoolean(false)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - model.continue_train=ConfigBoolean(false)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - data.max_feature_dim=ConfigInt(40)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - data.delim.x_delim=ConfigString("###")
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - model.feature_importance_path=ConfigString("config/model/feature_importance")
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - data.test.max_error_tol=ConfigInt(0)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - optimization.min_split_samples=ConfigInt(-1)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - optimization.watch_test=ConfigBoolean(false)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - data.delim.y_delim=ConfigString(",")
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - optimization.min_child_hessian_sum=ConfigInt(1)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - optimization.watch_train=ConfigBoolean(false)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - data.delim.features_delim=ConfigString(" ")
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - data.y_sampling=SimpleConfigList([])
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - optimization.regularization.l1=ConfigInt(0)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - optimization.regularization.l2=ConfigInt(1)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - optimization.feature_sample_rate=ConfigDouble(0.8)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - optimization.max_depth=ConfigInt(7)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - optimization.tree_grow_policy=ConfigString("loss")
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - optimization.sample_dependent_base_prediction=ConfigBoolean(false)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - optimization.regularization.learning_rate=ConfigDouble(0.1)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - optimization.silent=ConfigInt(1)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - model.dump_freq=ConfigInt(-1)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - fs_scheme=ConfigString("local")
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - optimization.tree_maker=ConfigString("data")
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - feature.approximate=SimpleConfigList([{"cols":"default","type":"no_sample"}])
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - feature.missing_value=ConfigString("value")
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - optimization.histogram_pool_capacity=ConfigInt(-1)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - data.train.data_path=ConfigString("data/flt/train.ytklearn")
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - data.train.max_error_tol=ConfigInt(0)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - model.data_path=ConfigString("config/model/gbdt.model")
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - optimization.loss_function=ConfigString("sigmoid")
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - optimization.instance_sample_rate=ConfigDouble(0.8)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - feature.filter_threshold=ConfigInt(0)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - data.unassigned_mode=ConfigString("lines_avg")
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - optimization.uniform_base_prediction=ConfigDouble(0.5)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - data.assigned=ConfigBoolean(false)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - optimization.just_evaluate=ConfigBoolean(false)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - data.test.data_path=ConfigString("data/flt/test.ytklearn")
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - feature.split_type=ConfigString("mean")
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - optimization.eval_metric=SimpleConfigList(["confusion_matrix","auc"])
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - optimization.round_num=ConfigInt(300)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - verbose=ConfigBoolean(false)
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - optimization.max_abs_leaf_val=ConfigInt(-1)
2017.12.27 13:54:11 com.fenbi.ytklearn.worker.TrainWorker - file system uri:local, URI:local, URI tostring:local
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - commonParams:GBDTCommonParams(verbose=false, dataParams=DataParams(train=DataParams.Train(data_path=data/flt/train.ytklearn, max_error_tol=0), test=DataParams.Test(data_path=data/flt/test.ytklearn, max_error_tol=0), delim=DataParams.Delim(x_delim=###, y_delim=,, features_delim= , feature_name_val_delim=:), y_sampling=[], assigned=false, unassigned_mode=lines_avg), max_feature_dim=40, modelParams=GBDTModelParams(data_path=config/model/gbdt.model, need_dict=false, dict_path=config/model/feat_dict, dump_freq=2147483647, continue_train=false, feature_importance_path=config/model/feature_importance), featureParams=GBDTFeatureParams(split_Type=MEAN, enable_missing_value=true, featureMissingParams=value, needFeaAppro=true, feaApproConfList=[Config(SimpleConfigObject({"cols":"default","type":"no_sample"}))], featureApproximateParamList=null, verbose=false, filter_threshold=0), optimizationParams=GBDTOptimizationParams(learn_type=gradient_boosting, tree_maker_type=DATA_PARALLEL, round_num=300, max_depth=7, min_child_hessian_sum=1.0, max_leaf_cnt=16, min_split_loss=0.0, min_split_samples=-1, objective=sigmoid, sigmoid_zmax=0.0, max_abs_leaf_val=-1.0, lad_refine_appr=false, tree_grow_policy=LOSSCHG_WISE, histogram_pool_capacity=-1.0, regularization=GBDTOptimizationParams.Regularization(l1=0.0, l2=1.0, learningRate=0.1), uniform_base_prediction=0.5, sample_dependent_base_prediction=false, subsample=0.8, feature_sample_rate=0.8, class_num=1, just_evaluate=false, eval_metrics=[confusion_matrix, auc], watch_train=false, watch_test=false, verbose=false))
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - have no dict, we will collect feature dict...
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - #########read train data############
2017.12.27 13:55:05 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=1] has readed lines:10000
2017.12.27 13:55:05 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=2] has readed lines:10000
2017.12.27 13:55:05 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=3] has readed lines:10000
2017.12.27 13:55:05 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=5] has readed lines:10000
2017.12.27 13:55:05 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=0] has readed lines:10000
2017.12.27 13:55:05 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=4] has readed lines:10000
2017.12.27 13:55:47 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=4] has readed lines:20000
2017.12.27 13:55:47 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=1] has readed lines:20000
2017.12.27 13:55:47 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=0] has readed lines:20000
2017.12.27 13:55:47 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=2] has readed lines:20000
2017.12.27 13:55:47 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=3] has readed lines:20000
2017.12.27 13:55:47 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=5] has readed lines:20000
2017.12.27 13:56:25 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=1] has readed lines:30000
2017.12.27 13:56:25 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=4] has readed lines:30000
2017.12.27 13:56:25 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=3] has readed lines:30000
2017.12.27 13:56:25 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=0] has readed lines:30000
2017.12.27 13:56:30 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=2] has readed lines:30000
2017.12.27 13:56:30 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=5] has readed lines:30000
2017.12.27 13:57:07 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=4] has readed lines:40000
2017.12.27 13:57:07 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=3] has readed lines:40000
2017.12.27 13:57:10 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=0] has readed lines:40000
2017.12.27 13:57:10 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=2] has readed lines:40000
2017.12.27 13:57:10 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=1] has readed lines:40000
2017.12.27 13:57:10 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=5] has readed lines:40000
2017.12.27 13:57:58 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=4] has readed lines:50000
2017.12.27 13:57:58 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=2] has readed lines:50000
2017.12.27 13:57:58 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=3] has readed lines:50000
2017.12.27 13:58:01 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=1] has readed lines:50000
2017.12.27 13:58:01 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=5] has readed lines:50000
2017.12.27 13:58:01 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=0] has readed lines:50000
2017.12.27 14:19:28 com.fenbi.mp4j.rpc.Server - [ERROR] waiting for heartbeat timeout > 600000, master will be shutdowned!
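To make the stall easier to see, here is a minimal sketch that measures the gap between the last "has readed lines" message and the heartbeat error. It only parses the timestamps and messages shown in the log above; the `summarize` helper and the shortened `LOG` excerpt are my own illustration and not part of ytk-learn or mp4j. I am assuming the `600000` in the error is in milliseconds (10 minutes), which the log itself does not state.

```python
import re
from datetime import datetime

# Excerpt copied from the log above: start of reading, one early progress
# line, the last progress line, and the error line.
LOG = """\
2017.12.27 13:54:11 com.fenbi.mp4j.rpc.Server - #########read train data############
2017.12.27 13:55:05 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=1] has readed lines:10000
2017.12.27 13:58:01 com.fenbi.mp4j.rpc.Server - [rank=0] [threadId=0] has readed lines:50000
2017.12.27 14:19:28 com.fenbi.mp4j.rpc.Server - [ERROR] waiting for heartbeat timeout > 600000, master will be shutdowned!
"""

TS_FORMAT = "%Y.%m.%d %H:%M:%S"  # matches "2017.12.27 13:54:11"
TS_LENGTH = 19                   # length of the timestamp prefix on each line
PROGRESS = re.compile(r"\[threadId=(\d+)\] has readed lines:(\d+)")


def parse_ts(line):
    """Parse the timestamp prefix of one log line."""
    return datetime.strptime(line[:TS_LENGTH], TS_FORMAT)


def summarize(log_text):
    """Return the last progress report and the silent gap before the error."""
    last_progress_ts = None
    last_lines = None
    error_ts = None
    for line in log_text.splitlines():
        match = PROGRESS.search(line)
        if match:
            last_progress_ts = parse_ts(line)
            last_lines = int(match.group(2))
        elif "[ERROR]" in line:
            error_ts = parse_ts(line)
    if last_progress_ts is None or error_ts is None:
        raise ValueError("log needs at least one progress line and one error line")
    return {
        "last_lines_per_thread": last_lines,
        "silent_seconds": int((error_ts - last_progress_ts).total_seconds()),
    }


result = summarize(LOG)
print(result)
```

On this log the gap comes out to roughly 21 minutes with no progress message, after each of the 6 threads had reported 50000 lines, so the reader appears to stall well before all of the training data is loaded.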