Comments (4)
Yes, we will check whether this problem is caused by the configuration. After analyzing the cause, we will get back to you promptly. Thank you for your feedback. @xkp793003821
from vega.
Hi @xkp793003821 , we could not reproduce this error on our side. Could you please share your full log and anything that you may have modified?
@xinghaochen
This is the source code.
My modified code is like this:
I use pytorch==1.7.
This is the log:
(torch16) cooper@gtis-System-Product-Name:~/project/vega_pro$ /home/cooper/anaconda3/envs/torch16/bin/python /home/cooper/project/vega_pro/examples/run_example.py
2020-07-08 11:03:38.386 INFO ------------------------------------------------
2020-07-08 11:03:38.387 INFO task id: 0708.110301.854
2020-07-08 11:03:38.387 INFO ------------------------------------------------
2020-07-08 11:03:38.405 INFO configure: {'general': {'worker': {'timeout': 1000.0, 'gpus_per_job': -1, 'eval_count': 10, 'evaluate_timeout': 0.1}, 'task': {'local_base_path': './tasks', 'output_subpath': 'output', 'best_model_subpath': 'best_model', 'log_subpath': 'logs', 'result_subpath': 'result', 'worker_subpath': 'workers/[step_name]/[worker_id]', 'backup_base_path': None, 'task_id': '0708.110301.854'}, 'logger': {'level': 'info'}, 'backend': 'pytorch', 'cluster': {'master_ip': None, 'listen_port': 8000, 'slaves': []}, 'model_store': {'store_path': None}, 'model_zoo': {'model_zoo_path': '/data3/model_zoo', 'local_path': None}, 'cluster_mode': <ClusterMode.Single: 0>}, 'pipeline': ['nas'], 'nas': {'pipe_step': {'type': 'NasPipeStep'}, 'dataset': {'type': 'Cifar10', 'common': {'num_workers': 0, 'data_path': '/home/cooper/datasets/'}, 'train': {'data_path': None, 'batch_size': 128, 'num_workers': 8, 'shuffle': False, 'distributed': False, 'download': False, 'imgs_per_gpu': 1, 'train_portion': 0.5, 'pin_memory': True, 'drop_last': True, 'transforms': [{'type': 'RandomCrop', 'size': 32, 'padding': 4}, {'type': 'RandomHorizontalFlip'}, {'type': 'ToTensor'}, {'type': 'Normalize', 'mean': [0.49139968, 0.48215827, 0.44653124], 'std': [0.24703233, 0.24348505, 0.26158768]}], 'cutout_length': 16}, 'val': {'data_path': None, 'batch_size': 2048, 'num_workers': 4, 'shuffle': False, 'distributed': False, 'download': False, 'imgs_per_gpu': 1, 'train_portion': 0.5, 'pin_memory': True, 'drop_last': True, 'transforms': [{'type': 'ToTensor'}, {'type': 'Normalize', 'mean': [0.49139968, 0.48215827, 0.44653124], 'std': [0.24703233, 0.24348505, 0.26158768]}]}, 'n_class': 10, 'test': {'data_path': None, 'batch_size': 1, 'num_workers': 4, 'shuffle': False, 'distributed': False, 'download': False, 'imgs_per_gpu': 1, 'train_portion': 1.0, 'pin_memory': True, 'drop_last': True, 'transforms': [{'type': 'ToTensor'}, {'type': 'Normalize', 'mean': [0.49139968, 0.48215827, 0.44653124], 'std': 
[0.24703233, 0.24348505, 0.26158768]}]}}, 'search_algorithm': {'type': 'CARSAlgorithm', 'policy': {'momentum': 0.9, 'weight_decay': 0.0003, 'parallel': False, 'num_individual': 8, 'num_individual_per_iter': 1, 'expand': 1.0, 'warmup': 0, 'num_generation': 1, 'start_ga_epoch': 3, 'ga_interval': 1, 'nsga_method': 'cars_nsga', 'pareto_model_num': 4, 'select_method': 'uniform', 'arch_optim': {'type': 'Adam', 'lr': 0.0003, 'betas': [0.5, 0.999], 'weight_decay': 0.001}, 'criterion': {'type': 'CrossEntropyLoss'}}, 'codec': 'DartsCodec'}, 'search_space': {'type': 'SearchSpace', 'modules': ['super_network'], 'super_network': {'name': 'CARSDartsNetwork', 'network': ['PreOneStem', 'normal', 'normal', 'reduce', 'normal', 'normal', 'reduce', 'normal', 'normal'], 'input_size': 32, 'init_channels': 16, 'num_classes': 10, 'auxiliary': False, 'search': True, 'normal': {'type': 'block', 'name': 'Cell', 'steps': 4, 'reduction': False, 'genotype': [[['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 2, 0], [['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 2, 1], [['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 3, 0], [['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 3, 1], [['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 3, 2], [['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 4, 0], [['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 4, 1], [['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 4, 2], [['none', 
'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 4, 3], [['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 5, 0], [['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 5, 1], [['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 5, 2], [['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 5, 3], [['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 5, 4]], 'concat': [2, 3, 4, 5]}, 'reduce': {'type': 'block', 'name': 'Cell', 'steps': 4, 'reduction': True, 'genotype': [[['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 2, 0], [['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 2, 1], [['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 3, 0], [['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 3, 1], [['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 3, 2], [['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 4, 0], [['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 4, 1], [['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 4, 2], [['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 
'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 4, 3], [['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 5, 0], [['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 5, 1], [['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 5, 2], [['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 5, 3], [['none', 'max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5'], 5, 4]], 'concat': [2, 3, 4, 5]}, 'preprocess': {'name': 'darts_stem1'}, 'linear': {'name': 'linear'}}}, 'trainer': {'type': 'Trainer', 'darts_template_file': '{default_darts_cifar10_template}', 'callbacks': 'CARSTrainerCallback', 'model_statistic': False, 'epochs': 5, 'optim': {'type': 'SGD', 'lr': 0.025, 'momentum': 0.9, 'weight_decay': 0.0003}, 'lr_scheduler': {'type': 'CosineAnnealingLR', 'T_max': 50, 'eta_min': 0.001}, 'loss': {'type': 'CrossEntropyLoss'}, 'metric': {'type': 'accuracy', 'topk': [1, 5]}, 'grad_clip': 5.0, 'seed': 10, 'unrolled': True, 'with_valid': True, 'cuda': True, 'is_detection_trainer': False, 'horovod': False, 'save_model_desc': True, 'report_freq': 10, 'model_desc': None, 'model_desc_file': None, 'hps_file': None, 'pretrained_model_file': None, 'print_step_interval': 50, 'visualize': {'train_process': {'visual': True}, 'model': {'visual': True, 'interval': 10}}, 'warmup_epochs': 5}}, 'env': {'init_method': 'tcp://127.0.0.1:8000', 'world_size': 1, 'rank': 0}}
2020-07-08 11:03:38.407 INFO ------------------------------------------------
2020-07-08 11:03:38.407 INFO pipeline steps:['nas']
2020-07-08 11:03:38.409 INFO Start pipeline step: [nas]
2020-07-08 11:03:38.418 INFO NasPipeStep started...
2020-07-08 11:03:38.418 INFO ====> vega.algorithms.nas.cars.cars_alg.search()
2020-07-08 11:03:53.51 INFO submit trainer(id=0)!
/home/cooper/anaconda3/envs/torch16/lib/python3.6/site-packages/torch/optim/lr_scheduler.py:123: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
"https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)
/home/cooper/anaconda3/envs/torch16/lib/python3.6/site-packages/torch/optim/lr_scheduler.py:143: UserWarning: The epoch parameter in scheduler.step() was not necessary and is being deprecated where possible. Please use scheduler.step() to step the scheduler. During the deprecation, if epoch is different from None, the closed form is used instead of the new chainable form, where available. Please open an issue if you are unable to replicate your use case: https://github.com/pytorch/pytorch/issues/new/choose.
warnings.warn(EPOCH_DEPRECATION_WARNING, UserWarning)
/home/cooper/anaconda3/envs/torch16/lib/python3.6/site-packages/vega/algorithms/nas/cars/cars_trainer_callback.py:82: UserWarning: torch.nn.utils.clip_grad_norm is now deprecated in favor of torch.nn.utils.clip_grad_norm_.
self.trainer.model.parameters(), self.trainer.cfg.grad_clip)
2020-07-08 11:03:59.437 INFO epoch [0/5], train step [ 0/195], loss [ 2.358, 2.358], train metrics [accuracy: 7.812, 47.656]
2020-07-08 11:04:01.137 INFO epoch [0/5], train step [ 10/195], loss [ 2.409, 2.345], train metrics [accuracy: 9.801, 51.491]
2020-07-08 11:04:02.773 INFO epoch [0/5], train step [ 20/195], loss [ 2.258, 2.354], train metrics [accuracy: 11.793, 54.948]
2020-07-08 11:04:04.462 INFO epoch [0/5], train step [ 30/195], loss [ 2.270, 2.335], train metrics [accuracy: 13.155, 57.913]
2020-07-08 11:04:06.87 INFO epoch [0/5], train step [ 40/195], loss [ 2.353, 2.286], train metrics [accuracy: 14.539, 61.776]
2020-07-08 11:04:07.735 INFO epoch [0/5], train step [ 50/195], loss [ 2.227, 2.269], train metrics [accuracy: 15.196, 63.343]
2020-07-08 11:04:09.381 INFO epoch [0/5], train step [ 60/195], loss [ 2.180, 2.242], train metrics [accuracy: 16.150, 65.049]
2020-07-08 11:04:11.41 INFO epoch [0/5], train step [ 70/195], loss [ 2.075, 2.227], train metrics [accuracy: 16.670, 66.208]
2020-07-08 11:04:12.657 INFO epoch [0/5], train step [ 80/195], loss [ 1.970, 2.210], train metrics [accuracy: 17.390, 67.419]
2020-07-08 11:04:14.311 INFO epoch [0/5], train step [ 90/195], loss [ 2.254, 2.197], train metrics [accuracy: 17.797, 68.381]
2020-07-08 11:04:15.939 INFO epoch [0/5], train step [100/195], loss [ 2.164, 2.180], train metrics [accuracy: 18.487, 69.562]
2020-07-08 11:04:17.592 INFO epoch [0/5], train step [110/195], loss [ 2.062, 2.172], train metrics [accuracy: 19.017, 69.982]
2020-07-08 11:04:19.262 INFO epoch [0/5], train step [120/195], loss [ 2.035, 2.163], train metrics [accuracy: 19.221, 70.493]
2020-07-08 11:04:20.937 INFO epoch [0/5], train step [130/195], loss [ 1.980, 2.156], train metrics [accuracy: 19.495, 70.963]
2020-07-08 11:04:22.620 INFO epoch [0/5], train step [140/195], loss [ 1.797, 2.146], train metrics [accuracy: 19.764, 71.504]
2020-07-08 11:04:24.296 INFO epoch [0/5], train step [150/195], loss [ 2.651, 2.135], train metrics [accuracy: 20.142, 72.160]
2020-07-08 11:04:25.984 INFO epoch [0/5], train step [160/195], loss [ 1.952, 2.121], train metrics [accuracy: 20.652, 72.889]
2020-07-08 11:04:27.648 INFO epoch [0/5], train step [170/195], loss [ 1.992, 2.112], train metrics [accuracy: 20.847, 73.346]
2020-07-08 11:04:29.307 INFO epoch [0/5], train step [180/195], loss [ 1.881, 2.098], train metrics [accuracy: 21.344, 73.977]
2020-07-08 11:04:30.989 INFO epoch [0/5], train step [190/195], loss [ 1.925, 2.091], train metrics [accuracy: 21.613, 74.378]
2020-07-08 11:04:32.718 INFO epoch [1/5], train step [ 0/195], loss [ 1.955, 1.955], train metrics [accuracy: 28.906, 85.938]
2020-07-08 11:04:34.385 INFO epoch [1/5], train step [ 10/195], loss [ 1.811, 1.957], train metrics [accuracy: 26.065, 81.676]
2020-07-08 11:04:36.7 INFO epoch [1/5], train step [ 20/195], loss [ 1.944, 1.925], train metrics [accuracy: 28.088, 82.254]
2020-07-08 11:04:37.686 INFO epoch [1/5], train step [ 30/195], loss [ 2.206, 1.951], train metrics [accuracy: 27.243, 81.729]
2020-07-08 11:04:39.329 INFO epoch [1/5], train step [ 40/195], loss [ 1.965, 1.934], train metrics [accuracy: 27.630, 82.679]
2020-07-08 11:04:40.963 INFO epoch [1/5], train step [ 50/195], loss [ 1.877, 1.930], train metrics [accuracy: 27.926, 82.475]
2020-07-08 11:04:42.621 INFO epoch [1/5], train step [ 60/195], loss [ 1.807, 1.922], train metrics [accuracy: 28.189, 82.851]
2020-07-08 11:04:44.293 INFO epoch [1/5], train step [ 70/195], loss [ 1.837, 1.905], train metrics [accuracy: 28.576, 83.440]
2020-07-08 11:04:45.940 INFO epoch [1/5], train step [ 80/195], loss [ 1.825, 1.894], train metrics [accuracy: 28.887, 83.642]
2020-07-08 11:04:47.594 INFO epoch [1/5], train step [ 90/195], loss [ 1.860, 1.888], train metrics [accuracy: 28.975, 83.800]
2020-07-08 11:04:49.261 INFO epoch [1/5], train step [100/195], loss [ 1.697, 1.880], train metrics [accuracy: 29.247, 84.127]
2020-07-08 11:04:50.915 INFO epoch [1/5], train step [110/195], loss [ 1.889, 1.878], train metrics [accuracy: 29.392, 84.178]
2020-07-08 11:04:52.573 INFO epoch [1/5], train step [120/195], loss [ 1.739, 1.869], train metrics [accuracy: 29.881, 84.310]
2020-07-08 11:04:54.208 INFO epoch [1/5], train step [130/195], loss [ 1.662, 1.866], train metrics [accuracy: 29.968, 84.363]
2020-07-08 11:04:55.861 INFO epoch [1/5], train step [140/195], loss [ 1.885, 1.867], train metrics [accuracy: 29.904, 84.325]
2020-07-08 11:04:57.504 INFO epoch [1/5], train step [150/195], loss [ 1.801, 1.865], train metrics [accuracy: 29.951, 84.318]
2020-07-08 11:04:59.125 INFO epoch [1/5], train step [160/195], loss [ 1.724, 1.864], train metrics [accuracy: 30.027, 84.326]
2020-07-08 11:05:00.772 INFO epoch [1/5], train step [170/195], loss [ 1.933, 1.862], train metrics [accuracy: 30.158, 84.434]
2020-07-08 11:05:02.439 INFO epoch [1/5], train step [180/195], loss [ 1.775, 1.858], train metrics [accuracy: 30.413, 84.504]
2020-07-08 11:05:04.65 INFO epoch [1/5], train step [190/195], loss [ 1.828, 1.853], train metrics [accuracy: 30.542, 84.616]
2020-07-08 11:05:05.712 INFO epoch [2/5], train step [ 0/195], loss [ 1.981, 1.981], train metrics [accuracy: 27.344, 82.812]
2020-07-08 11:05:07.352 INFO epoch [2/5], train step [ 10/195], loss [ 1.734, 1.804], train metrics [accuracy: 32.173, 85.227]
2020-07-08 11:05:09.19 INFO epoch [2/5], train step [ 20/195], loss [ 1.868, 1.774], train metrics [accuracy: 33.668, 85.826]
2020-07-08 11:05:10.651 INFO epoch [2/5], train step [ 30/195], loss [ 1.814, 1.774], train metrics [accuracy: 33.468, 86.064]
2020-07-08 11:05:12.300 INFO epoch [2/5], train step [ 40/195], loss [ 1.841, 1.774], train metrics [accuracy: 33.346, 86.109]
2020-07-08 11:05:13.964 INFO epoch [2/5], train step [ 50/195], loss [ 1.896, 1.773], train metrics [accuracy: 32.935, 86.014]
2020-07-08 11:05:15.628 INFO epoch [2/5], train step [ 60/195], loss [ 1.688, 1.768], train metrics [accuracy: 33.171, 86.091]
2020-07-08 11:05:17.272 INFO epoch [2/5], train step [ 70/195], loss [ 1.784, 1.766], train metrics [accuracy: 33.352, 86.202]
2020-07-08 11:05:18.955 INFO epoch [2/5], train step [ 80/195], loss [ 1.700, 1.762], train metrics [accuracy: 33.507, 86.381]
2020-07-08 11:05:20.651 INFO epoch [2/5], train step [ 90/195], loss [ 1.747, 1.761], train metrics [accuracy: 33.602, 86.547]
2020-07-08 11:05:22.357 INFO epoch [2/5], train step [100/195], loss [ 1.796, 1.758], train metrics [accuracy: 33.663, 86.572]
2020-07-08 11:05:24.72 INFO epoch [2/5], train step [110/195], loss [ 1.774, 1.754], train metrics [accuracy: 33.875, 86.684]
2020-07-08 11:05:25.797 INFO epoch [2/5], train step [120/195], loss [ 1.596, 1.753], train metrics [accuracy: 33.955, 86.764]
2020-07-08 11:05:27.491 INFO epoch [2/5], train step [130/195], loss [ 1.761, 1.751], train metrics [accuracy: 34.059, 86.898]
2020-07-08 11:05:29.193 INFO epoch [2/5], train step [140/195], loss [ 1.855, 1.747], train metrics [accuracy: 34.209, 86.946]
2020-07-08 11:05:30.842 INFO epoch [2/5], train step [150/195], loss [ 1.755, 1.747], train metrics [accuracy: 34.240, 86.952]
2020-07-08 11:05:32.538 INFO epoch [2/5], train step [160/195], loss [ 1.647, 1.742], train metrics [accuracy: 34.433, 87.131]
2020-07-08 11:05:34.199 INFO epoch [2/5], train step [170/195], loss [ 1.676, 1.744], train metrics [accuracy: 34.412, 87.007]
2020-07-08 11:05:35.852 INFO epoch [2/5], train step [180/195], loss [ 1.745, 1.746], train metrics [accuracy: 34.427, 87.030]
2020-07-08 11:05:37.519 INFO epoch [2/5], train step [190/195], loss [ 1.718, 1.744], train metrics [accuracy: 34.494, 87.116]
2020-07-08 11:05:39.207 INFO epoch [3/5], train step [ 0/195], loss [ 1.735, 1.735], train metrics [accuracy: 30.469, 85.156]
2020-07-08 11:05:40.879 INFO epoch [3/5], train step [ 10/195], loss [ 1.681, 1.700], train metrics [accuracy: 36.151, 89.062]
2020-07-08 11:05:42.529 INFO epoch [3/5], train step [ 20/195], loss [ 1.577, 1.681], train metrics [accuracy: 37.091, 88.951]
2020-07-08 11:05:44.190 INFO epoch [3/5], train step [ 30/195], loss [ 1.573, 1.662], train metrics [accuracy: 38.256, 89.491]
2020-07-08 11:05:45.862 INFO epoch [3/5], train step [ 40/195], loss [ 1.624, 1.666], train metrics [accuracy: 37.843, 89.234]
2020-07-08 11:05:47.494 INFO epoch [3/5], train step [ 50/195], loss [ 1.613, 1.666], train metrics [accuracy: 37.623, 89.170]
2020-07-08 11:05:49.110 INFO epoch [3/5], train step [ 60/195], loss [ 1.717, 1.677], train metrics [accuracy: 37.244, 88.922]
2020-07-08 11:05:50.754 INFO epoch [3/5], train step [ 70/195], loss [ 1.630, 1.674], train metrics [accuracy: 37.148, 88.930]
2020-07-08 11:05:52.367 INFO epoch [3/5], train step [ 80/195], loss [ 1.843, 1.676], train metrics [accuracy: 37.201, 89.034]
2020-07-08 11:05:54.10 INFO epoch [3/5], train step [ 90/195], loss [ 1.930, 1.680], train metrics [accuracy: 37.002, 88.762]
2020-07-08 11:05:55.614 INFO epoch [3/5], train step [100/195], loss [ 1.756, 1.686], train metrics [accuracy: 36.696, 88.683]
2020-07-08 11:05:57.226 INFO epoch [3/5], train step [110/195], loss [ 1.587, 1.684], train metrics [accuracy: 36.613, 88.739]
2020-07-08 11:05:58.809 INFO epoch [3/5], train step [120/195], loss [ 1.408, 1.679], train metrics [accuracy: 36.848, 88.830]
2020-07-08 11:06:00.415 INFO epoch [3/5], train step [130/195], loss [ 1.587, 1.681], train metrics [accuracy: 36.737, 88.782]
2020-07-08 11:06:02.12 INFO epoch [3/5], train step [140/195], loss [ 1.726, 1.682], train metrics [accuracy: 36.697, 88.797]
2020-07-08 11:06:03.629 INFO epoch [3/5], train step [150/195], loss [ 1.712, 1.684], train metrics [accuracy: 36.620, 88.680]
2020-07-08 11:06:05.228 INFO epoch [3/5], train step [160/195], loss [ 1.550, 1.679], train metrics [accuracy: 36.796, 88.733]
2020-07-08 11:06:06.836 INFO epoch [3/5], train step [170/195], loss [ 1.781, 1.678], train metrics [accuracy: 36.824, 88.702]
2020-07-08 11:06:08.439 INFO epoch [3/5], train step [180/195], loss [ 1.638, 1.675], train metrics [accuracy: 36.866, 88.765]
2020-07-08 11:06:10.75 INFO epoch [3/5], train step [190/195], loss [ 1.567, 1.673], train metrics [accuracy: 36.903, 88.854]
2020-07-08 11:06:12.71 INFO checkpoint saved to /home/cooper/project/vega_pro/tasks/0708.110301.854/workers/nas/0/weights_3.pt
/home/cooper/anaconda3/envs/torch16/lib/python3.6/site-packages/vega/search_space/networks/pytorch/super_network/cars_darts.py:103: UserWarning: This overload of nonzero is deprecated:
nonzero(Tensor input, *, Tensor out)
Consider using one of the following signatures instead:
nonzero(Tensor input, *, bool as_tuple) (Triggered internally at /pytorch/torch/csrc/utils/python_arg_parser.cpp:766.)
if torch.nonzero(idx).size(0) > 2:
2020-07-08 11:06:12.160 ERROR Illegal alpha.
2020-07-08 11:06:12.161 ERROR Illegal alpha.
2020-07-08 11:06:12.163 ERROR Illegal alpha.
2020-07-08 11:06:12.164 ERROR Illegal alpha.
2020-07-08 11:06:12.166 ERROR Illegal alpha.
2020-07-08 11:06:12.168 ERROR Illegal alpha.
2020-07-08 11:06:12.170 ERROR Illegal alpha.
2020-07-08 11:06:12.172 ERROR Illegal alpha.
2020-07-08 11:06:12.172 ERROR Illegal alpha.
2020-07-08 11:06:12.173 ERROR Illegal alpha.
2020-07-08 11:06:12.178 ERROR Illegal alpha.
2020-07-08 11:06:12.178 ERROR Illegal alpha.
2020-07-08 11:06:12.179 ERROR Illegal alpha.
2020-07-08 11:06:12.181 ERROR Illegal alpha.
2020-07-08 11:06:12.181 ERROR Illegal alpha.
2020-07-08 11:06:12.182 ERROR Illegal alpha.
2020-07-08 11:06:12.184 ERROR Illegal alpha.
2020-07-08 11:06:12.184 ERROR Illegal alpha.
2020-07-08 11:06:12.185 ERROR Illegal alpha.
2020-07-08 11:06:12.186 ERROR Illegal alpha.
2020-07-08 11:06:12.187 ERROR Illegal alpha.
/home/cooper/anaconda3/envs/torch16/lib/python3.6/site-packages/vega/algorithms/nas/darts_cnn/darts_codec.py:78: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
normal_param = np.array(self.darts_cfg.super_network.normal.genotype)
/home/cooper/anaconda3/envs/torch16/lib/python3.6/site-packages/vega/algorithms/nas/darts_cnn/darts_codec.py:79: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
reduce_param = np.array(self.darts_cfg.super_network.reduce.genotype)
2020-07-08 11:06:18.764 INFO Valid_acc for invidual 0 30.806478, size 0.208058
2020-07-08 11:06:25.359 INFO Valid_acc for invidual 1 33.569336, size 0.267578
2020-07-08 11:06:31.743 INFO Valid_acc for invidual 2 38.663737, size 0.225882
2020-07-08 11:06:37.837 INFO Valid_acc for invidual 3 30.265299, size 0.194714
2020-07-08 11:06:44.67 INFO Valid_acc for invidual 4 35.709635, size 0.224858
^C2020-07-08 11:06:49.205 INFO Shutdown urgently.
Hi @xkp793003821 , thanks for your feedback. I ran the code with PyTorch 1.2.0 and could not reproduce this problem. Unfortunately, I do not have a machine with PyTorch 1.7 right now and cannot analyse this problem further.
I notice that in the log you provided above, there is a warning:
UserWarning: This overload of nonzero is deprecated:
nonzero(Tensor input, *, Tensor out)
Consider using one of the following signatures instead:
nonzero(Tensor input, *, bool as_tuple) (Triggered internally at /pytorch/torch/csrc/utils/python_arg_parser.cpp:766.)
if torch.nonzero(idx).size(0) > 2:
Maybe the behavior of nonzero changed in the newest version of PyTorch.
This Illegal alpha check is to make sure that each node is connected to only two previous nodes. I think your modified code makes sense and seems to be a more elegant implementation. We may update the corresponding code in a future commit.
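The check above can be sketched as follows. This is a minimal, dependency-light illustration, not vega's code: it uses NumPy instead of PyTorch, and the function name `alpha_row_is_illegal` is ours. It assumes `idx` is a node's 0/1 connection vector over previous nodes, mirroring the logic of the `torch.nonzero(idx).size(0) > 2` test that triggers the `Illegal alpha` error:

```python
import numpy as np

def alpha_row_is_illegal(idx):
    """A row is illegal when the node is connected to more than two
    previous nodes, i.e. its connection vector has more than two
    nonzero entries (mirrors torch.nonzero(idx).size(0) > 2)."""
    return int(np.count_nonzero(idx)) > 2

print(alpha_row_is_illegal([0, 1, 1, 0]))  # two connections -> False (legal)
print(alpha_row_is_illegal([1, 1, 1, 0]))  # three connections -> True (illegal)
```

Since the deprecation warning in the log is about the `torch.nonzero` overload (`as_tuple`), a count-based formulation like the one sketched here would be insensitive to that API change.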
Related Issues (20)
- Training fails to start with vega
- vega needs to adapt to MindSpore's updated set_jit_config interface
- vega depends on torch
- adelaide-ea training fails
- vega-noah: esr-ea training fails
- How to configure on ModelArts, and where can the pretrained models be downloaded?
- sp-nas does architecture search for object detection; can it load a custom dataset?
- 'vega' is not recognized as an internal or external command, operable program or batch file
- CARS running time issue
- curve lane detection
- CARS for the Cutsom ClassificationDataset
- Precision, Recall, F1 score for classification?
- Add oriented-rcnn network
- Cannot download from Model Zoo?
- vega-noah: esr-ea/dnet-nas prints too many deprecated-interface warnings
- CARS RUN EFFOR:output file
- "Error: No such option: nprocs=1"
- TransNAS-Bench-101 download link is broken
- Noah-vega on ROCm
- The requested array has an inhomogeneous shape after 2 dimensions