

plbart's People

Contributors

gchhablani, saikat107, wasiahmad

plbart's Issues

HuggingFace Checkpoint Configurations

Hello,
I am replicating some of the experiments with the checkpoints from PLBART on HuggingFace.

On the Drive Document (), it is mentioned that the model ('uclanlp/plbart-refine-java-small') does not need a language token.
However, if decoder_start_token_id=tokenizer.lang_code_to_id["java"] is not used, the model does not work properly.
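For reference, this is roughly how I am calling the checkpoint (a minimal sketch; the src_lang/tgt_lang settings and the sample input are my own assumptions, not taken from the PLBART documentation):

from transformers import PLBartForConditionalGeneration, PLBartTokenizer

checkpoint = "uclanlp/plbart-refine-java-small"
# src_lang/tgt_lang below are my assumption for the refinement task
tokenizer = PLBartTokenizer.from_pretrained(checkpoint, src_lang="java", tgt_lang="java")
model = PLBartForConditionalGeneration.from_pretrained(checkpoint)

buggy_code = "public int add ( int a , int b ) { return a - b ; }"  # hypothetical input
inputs = tokenizer(buggy_code, return_tensors="pt")

# Generation only behaves as expected when I pass the java language token here.
outputs = model.generate(
    **inputs,
    decoder_start_token_id=tokenizer.lang_code_to_id["java"],
    max_length=128,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])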

Would that be an error in the documentation or am I confusing terms?

Thanks.

Fine-tuning error: "AttributeError: module 'sacrebleu' has no attribute '__version__'"

Issue

I followed the fine-tuning steps given here, but when I run the instructions given in step 4, I get the following error:

/content/drive/MyDrive/nl2code/projects/plbart/PLBART/scripts/text_to_code
2022-03-29 10:12:08 | INFO | numexpr.utils | NumExpr defaulting to 2 threads.
2022-03-29 10:12:09 | INFO | fairseq.tasks.text_to_speech | Please install tensorboardX: pip install tensorboardX
Traceback (most recent call last):
  File "/usr/local/bin/fairseq-train", line 5, in <module>
    from fairseq_cli.train import cli_main
  File "/usr/local/lib/python3.7/dist-packages/fairseq_cli/train.py", line 30, in <module>
    from fairseq import checkpoint_utils, options, quantization_utils, tasks, utils
  File "/usr/local/lib/python3.7/dist-packages/fairseq/__init__.py", line 40, in <module>
    import fairseq.scoring  # noqa
  File "/usr/local/lib/python3.7/dist-packages/fairseq/scoring/__init__.py", line 55, in <module>
    importlib.import_module("fairseq.scoring." + module)
  File "/usr/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/usr/local/lib/python3.7/dist-packages/fairseq/scoring/bleu.py", line 14, in <module>
    from fairseq.scoring.tokenizer import EvaluationTokenizer
  File "/usr/local/lib/python3.7/dist-packages/fairseq/scoring/tokenizer.py", line 12, in <module>
    SACREBLEU_V2_ABOVE = int(sb.__version__[0]) >= 2
AttributeError: module 'sacrebleu' has no attribute '__version__'
2022-03-29 10:12:12 | INFO | fairseq.tasks.text_to_speech | Please install tensorboardX: pip install tensorboardX
Traceback (most recent call last):
  File "/usr/local/bin/fairseq-generate", line 5, in <module>
    from fairseq_cli.generate import cli_main
  File "/usr/local/lib/python3.7/dist-packages/fairseq_cli/generate.py", line 22, in <module>
    from fairseq import checkpoint_utils, options, scoring, tasks, utils
  File "/usr/local/lib/python3.7/dist-packages/fairseq/__init__.py", line 40, in <module>
    import fairseq.scoring  # noqa
  File "/usr/local/lib/python3.7/dist-packages/fairseq/scoring/__init__.py", line 55, in <module>
    importlib.import_module("fairseq.scoring." + module)
  File "/usr/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/usr/local/lib/python3.7/dist-packages/fairseq/scoring/bleu.py", line 14, in <module>
    from fairseq.scoring.tokenizer import EvaluationTokenizer
  File "/usr/local/lib/python3.7/dist-packages/fairseq/scoring/tokenizer.py", line 12, in <module>
    SACREBLEU_V2_ABOVE = int(sb.__version__[0]) >= 2
AttributeError: module 'sacrebleu' has no attribute '__version__'
Traceback (most recent call last):
  File "evaluator.py", line 50, in <module>
    main()
  File "evaluator.py", line 26, in main
    assert len(preds) == len(gts), f"Samples of predictions and answers are not equal, {len(preds)}: {len(gts)}"
AssertionError: Samples of predictions and answers are not equal, 0: 2000
Traceback (most recent call last):
  File "/content/drive/MyDrive/nl2code/projects/plbart/PLBART/evaluation/CodeBLEU/calc_code_bleu.py", line 110, in <module>
    main()
  File "/content/drive/MyDrive/nl2code/projects/plbart/PLBART/evaluation/CodeBLEU/calc_code_bleu.py", line 85, in main
    assert len(hypothesis) == len(pre_references[i])
AssertionError
/content/drive/MyDrive/nl2code/projects/plbart/PLBART

Then I tried reinstalling sacrebleu:

!pip uninstall -y sacrebleu
!pip install sacrebleu
import sacrebleu
sacrebleu.__version__
>>>'2.0.0'
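To make sure the runtime actually picks up the new install before re-running the scripts, I also did a quick sanity check (nothing PLBART-specific, just standard Python introspection):

import importlib
import sacrebleu

print(sacrebleu.__file__)  # confirm which installation is on the import path
print(getattr(sacrebleu, "__version__", "no __version__ attribute"))
importlib.reload(sacrebleu)  # refresh the module in case it was imported before the reinstall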

Then I re-executed step 4, but got a new error:

/content/drive/MyDrive/nl2code/projects/plbart/PLBART/scripts/text_to_code
2022-03-29 10:28:29 | INFO | numexpr.utils | NumExpr defaulting to 2 threads.
2022-03-29 10:28:29 | INFO | fairseq.tasks.text_to_speech | Please install tensorboardX: pip install tensorboardX
usage: fairseq-train [-h] [--no-progress-bar] [--log-interval LOG_INTERVAL]
                     [--log-format {json,none,simple,tqdm}]
                     [--log-file LOG_FILE]
                     [--tensorboard-logdir TENSORBOARD_LOGDIR]
                     [--wandb-project WANDB_PROJECT] [--azureml-logging]
                     [--seed SEED] [--cpu] [--tpu] [--bf16]
                     [--memory-efficient-bf16] [--fp16]
                     [--memory-efficient-fp16] [--fp16-no-flatten-grads]
                     [--fp16-init-scale FP16_INIT_SCALE]
                     [--fp16-scale-window FP16_SCALE_WINDOW]
                     [--fp16-scale-tolerance FP16_SCALE_TOLERANCE]
                     [--on-cpu-convert-precision]
                     [--min-loss-scale MIN_LOSS_SCALE]
                     [--threshold-loss-scale THRESHOLD_LOSS_SCALE] [--amp]
                     [--amp-batch-retries AMP_BATCH_RETRIES]
                     [--amp-init-scale AMP_INIT_SCALE]
                     [--amp-scale-window AMP_SCALE_WINDOW]
                     [--user-dir USER_DIR]
                     [--empty-cache-freq EMPTY_CACHE_FREQ]
                     [--all-gather-list-size ALL_GATHER_LIST_SIZE]
                     [--model-parallel-size MODEL_PARALLEL_SIZE]
                     [--quantization-config-path QUANTIZATION_CONFIG_PATH]
                     [--profile] [--reset-logging] [--suppress-crashes]
                     [--use-plasma-view] [--plasma-path PLASMA_PATH]
                     [--criterion {adaptive_loss,composite_loss,cross_entropy,ctc,fastspeech2,hubert,label_smoothed_cross_entropy,latency_augmented_label_smoothed_cross_entropy,label_smoothed_cross_entropy_with_alignment,label_smoothed_cross_entropy_with_ctc,legacy_masked_lm_loss,masked_lm,model,nat_loss,sentence_prediction,sentence_ranking,tacotron2,speech_to_unit,speech_to_spectrogram,speech_unit_lm_criterion,wav2vec,vocab_parallel_cross_entropy}]
                     [--tokenizer {moses,nltk,space}]
                     [--bpe {byte_bpe,bytes,characters,fastbpe,gpt2,bert,hf_byte_bpe,sentencepiece,subword_nmt}]
                     [--optimizer {adadelta,adafactor,adagrad,adam,adamax,composite,cpu_adam,lamb,nag,sgd}]
                     [--lr-scheduler {cosine,fixed,inverse_sqrt,manual,pass_through,polynomial_decay,reduce_lr_on_plateau,step,tri_stage,triangular}]
                     [--scoring {bert_score,sacrebleu,bleu,chrf,meteor,wer}]
                     [--task TASK] [--num-workers NUM_WORKERS]
                     [--skip-invalid-size-inputs-valid-test]
                     [--max-tokens MAX_TOKENS] [--batch-size BATCH_SIZE]
                     [--required-batch-size-multiple REQUIRED_BATCH_SIZE_MULTIPLE]
                     [--required-seq-len-multiple REQUIRED_SEQ_LEN_MULTIPLE]
                     [--dataset-impl {raw,lazy,cached,mmap,fasta,huffman}]
                     [--data-buffer-size DATA_BUFFER_SIZE]
                     [--train-subset TRAIN_SUBSET]
                     [--valid-subset VALID_SUBSET] [--combine-valid-subsets]
                     [--ignore-unused-valid-subsets]
                     [--validate-interval VALIDATE_INTERVAL]
                     [--validate-interval-updates VALIDATE_INTERVAL_UPDATES]
                     [--validate-after-updates VALIDATE_AFTER_UPDATES]
                     [--fixed-validation-seed FIXED_VALIDATION_SEED]
                     [--disable-validation]
                     [--max-tokens-valid MAX_TOKENS_VALID]
                     [--batch-size-valid BATCH_SIZE_VALID]
                     [--max-valid-steps MAX_VALID_STEPS]
                     [--curriculum CURRICULUM] [--gen-subset GEN_SUBSET]
                     [--num-shards NUM_SHARDS] [--shard-id SHARD_ID]
                     [--grouped-shuffling]
                     [--update-epoch-batch-itr UPDATE_EPOCH_BATCH_ITR]
                     [--update-ordered-indices-seed]
                     [--distributed-world-size DISTRIBUTED_WORLD_SIZE]
                     [--distributed-num-procs DISTRIBUTED_NUM_PROCS]
                     [--distributed-rank DISTRIBUTED_RANK]
                     [--distributed-backend DISTRIBUTED_BACKEND]
                     [--distributed-init-method DISTRIBUTED_INIT_METHOD]
                     [--distributed-port DISTRIBUTED_PORT]
                     [--device-id DEVICE_ID] [--distributed-no-spawn]
                     [--ddp-backend {c10d,fully_sharded,legacy_ddp,no_c10d,pytorch_ddp,slowmo}]
                     [--ddp-comm-hook {none,fp16}]
                     [--bucket-cap-mb BUCKET_CAP_MB] [--fix-batches-to-gpus]
                     [--find-unused-parameters] [--gradient-as-bucket-view]
                     [--fast-stat-sync]
                     [--heartbeat-timeout HEARTBEAT_TIMEOUT]
                     [--broadcast-buffers] [--slowmo-momentum SLOWMO_MOMENTUM]
                     [--slowmo-base-algorithm SLOWMO_BASE_ALGORITHM]
                     [--localsgd-frequency LOCALSGD_FREQUENCY]
                     [--nprocs-per-node NPROCS_PER_NODE]
                     [--pipeline-model-parallel]
                     [--pipeline-balance PIPELINE_BALANCE]
                     [--pipeline-devices PIPELINE_DEVICES]
                     [--pipeline-chunks PIPELINE_CHUNKS]
                     [--pipeline-encoder-balance PIPELINE_ENCODER_BALANCE]
                     [--pipeline-encoder-devices PIPELINE_ENCODER_DEVICES]
                     [--pipeline-decoder-balance PIPELINE_DECODER_BALANCE]
                     [--pipeline-decoder-devices PIPELINE_DECODER_DEVICES]
                     [--pipeline-checkpoint {always,never,except_last}]
                     [--zero-sharding {none,os}] [--no-reshard-after-forward]
                     [--fp32-reduce-scatter] [--cpu-offload]
                     [--use-sharded-state] [--not-fsdp-flatten-parameters]
                     [--arch ARCH] [--max-epoch MAX_EPOCH]
                     [--max-update MAX_UPDATE]
                     [--stop-time-hours STOP_TIME_HOURS]
                     [--clip-norm CLIP_NORM] [--sentence-avg]
                     [--update-freq UPDATE_FREQ] [--lr LR]
                     [--stop-min-lr STOP_MIN_LR] [--use-bmuf]
                     [--skip-remainder-batch] [--save-dir SAVE_DIR]
                     [--restore-file RESTORE_FILE]
                     [--continue-once CONTINUE_ONCE]
                     [--finetune-from-model FINETUNE_FROM_MODEL]
                     [--reset-dataloader] [--reset-lr-scheduler]
                     [--reset-meters] [--reset-optimizer]
                     [--optimizer-overrides OPTIMIZER_OVERRIDES]
                     [--save-interval SAVE_INTERVAL]
                     [--save-interval-updates SAVE_INTERVAL_UPDATES]
                     [--keep-interval-updates KEEP_INTERVAL_UPDATES]
                     [--keep-interval-updates-pattern KEEP_INTERVAL_UPDATES_PATTERN]
                     [--keep-last-epochs KEEP_LAST_EPOCHS]
                     [--keep-best-checkpoints KEEP_BEST_CHECKPOINTS]
                     [--no-save] [--no-epoch-checkpoints]
                     [--no-last-checkpoints] [--no-save-optimizer-state]
                     [--best-checkpoint-metric BEST_CHECKPOINT_METRIC]
                     [--maximize-best-checkpoint-metric] [--patience PATIENCE]
                     [--checkpoint-suffix CHECKPOINT_SUFFIX]
                     [--checkpoint-shard-count CHECKPOINT_SHARD_COUNT]
                     [--load-checkpoint-on-all-dp-ranks]
                     [--write-checkpoints-asynchronously] [--store-ema]
                     [--ema-decay EMA_DECAY]
                     [--ema-start-update EMA_START_UPDATE]
                     [--ema-seed-model EMA_SEED_MODEL]
                     [--ema-update-freq EMA_UPDATE_FREQ] [--ema-fp32]
                     [--activation-fn {relu,gelu,gelu_fast,gelu_accurate,tanh,linear}]
                     [--dropout DROPOUT]
                     [--attention-dropout ATTENTION_DROPOUT]
                     [--activation-dropout ACTIVATION_DROPOUT]
                     [--adaptive-input]
                     [--encoder-embed-path ENCODER_EMBED_PATH]
                     [--encoder-embed-dim ENCODER_EMBED_DIM]
                     [--encoder-ffn-embed-dim ENCODER_FFN_EMBED_DIM]
                     [--encoder-layers ENCODER_LAYERS]
                     [--encoder-attention-heads ENCODER_ATTENTION_HEADS]
                     [--encoder-normalize-before] [--encoder-learned-pos]
                     [--encoder-layerdrop ENCODER_LAYERDROP]
                     [--encoder-layers-to-keep ENCODER_LAYERS_TO_KEEP]
                     [--max-source-positions MAX_SOURCE_POSITIONS]
                     [--decoder-embed-path DECODER_EMBED_PATH]
                     [--decoder-embed-dim DECODER_EMBED_DIM]
                     [--decoder-ffn-embed-dim DECODER_FFN_EMBED_DIM]
                     [--decoder-layers DECODER_LAYERS]
                     [--decoder-attention-heads DECODER_ATTENTION_HEADS]
                     [--decoder-normalize-before] [--decoder-learned-pos]
                     [--decoder-layerdrop DECODER_LAYERDROP]
                     [--decoder-layers-to-keep DECODER_LAYERS_TO_KEEP]
                     [--decoder-output-dim DECODER_OUTPUT_DIM]
                     [--max-target-positions MAX_TARGET_POSITIONS]
                     [--share-decoder-input-output-embed]
                     [--share-all-embeddings]
                     [--no-token-positional-embeddings]
                     [--adaptive-softmax-cutoff ADAPTIVE_SOFTMAX_CUTOFF]
                     [--adaptive-softmax-dropout ADAPTIVE_SOFTMAX_DROPOUT]
                     [--adaptive-softmax-factor ADAPTIVE_SOFTMAX_FACTOR]
                     [--layernorm-embedding] [--tie-adaptive-weights]
                     [--tie-adaptive-proj] [--no-scale-embedding]
                     [--checkpoint-activations] [--offload-activations]
                     [--no-cross-attention] [--cross-self-attention]
                     [--quant-noise-pq QUANT_NOISE_PQ]
                     [--quant-noise-pq-block-size QUANT_NOISE_PQ_BLOCK_SIZE]
                     [--quant-noise-scalar QUANT_NOISE_SCALAR]
                     [--min-params-to-wrap MIN_PARAMS_TO_WRAP] [--char-inputs]
                     [--relu-dropout RELU_DROPOUT] [--base-layers BASE_LAYERS]
                     [--base-sublayers BASE_SUBLAYERS]
                     [--base-shuffle BASE_SHUFFLE] [--export]
                     [--no-decoder-final-norm] [--pooler-dropout D]
                     [--pooler-activation-fn {relu,gelu,gelu_fast,gelu_accurate,tanh,linear}]
                     [--spectral-norm-classification-head]
                     [--source-lang SOURCE_LANG] [--target-lang TARGET_LANG]
                     [--load-alignments] [--left-pad-source]
                     [--left-pad-target] [--upsample-primary UPSAMPLE_PRIMARY]
                     [--truncate-source]
                     [--num-batch-buckets NUM_BATCH_BUCKETS] [--eval-bleu]
                     [--eval-bleu-args EVAL_BLEU_ARGS]
                     [--eval-bleu-detok EVAL_BLEU_DETOK]
                     [--eval-bleu-detok-args EVAL_BLEU_DETOK_ARGS]
                     [--eval-tokenized-bleu]
                     [--eval-bleu-remove-bpe [EVAL_BLEU_REMOVE_BPE]]
                     [--eval-bleu-print-samples] [--langs LANG]
                     [--prepend-bos] [--label-smoothing LABEL_SMOOTHING]
                     [--report-accuracy]
                     [--ignore-prefix-size IGNORE_PREFIX_SIZE]
                     [--sentencepiece-model SENTENCEPIECE_MODEL]
                     [--sentencepiece-enable-sampling]
                     [--sentencepiece-alpha SENTENCEPIECE_ALPHA]
                     [--adam-betas ADAM_BETAS] [--adam-eps ADAM_EPS]
                     [--weight-decay WEIGHT_DECAY] [--use-old-adam]
                     [--fp16-adam-stats] [--warmup-updates WARMUP_UPDATES]
                     [--force-anneal FORCE_ANNEAL]
                     [--end-learning-rate END_LEARNING_RATE] [--power POWER]
                     [--total-num-update TOTAL_NUM_UPDATE] [--pad PAD]
                     [--eos EOS] [--unk UNK]
                     data
fairseq-train: error: unrecognized arguments: --min-lr -1
2022-03-29 10:28:32 | INFO | fairseq.tasks.text_to_speech | Please install tensorboardX: pip install tensorboardX
2022-03-29 10:28:35 | INFO | fairseq_cli.generate | {'_name': None, 'common': {'_name': None, 'no_progress_bar': False, 'log_interval': 100, 'log_format': None, 'log_file': None, 'tensorboard_logdir': None, 'wandb_project': None, 'azureml_logging': False, 'seed': 1, 'cpu': False, 'tpu': False, 'bf16': False, 'memory_efficient_bf16': False, 'fp16': False, 'memory_efficient_fp16': False, 'fp16_no_flatten_grads': False, 'fp16_init_scale': 128, 'fp16_scale_window': None, 'fp16_scale_tolerance': 0.0, 'on_cpu_convert_precision': False, 'min_loss_scale': 0.0001, 'threshold_loss_scale': None, 'amp': False, 'amp_batch_retries': 2, 'amp_init_scale': 128, 'amp_scale_window': None, 'user_dir': None, 'empty_cache_freq': 0, 'all_gather_list_size': 16384, 'model_parallel_size': 1, 'quantization_config_path': None, 'profile': False, 'reset_logging': False, 'suppress_crashes': False, 'use_plasma_view': False, 'plasma_path': '/tmp/plasma'}, 'common_eval': {'_name': None, 'path': '/content/drive/MyDrive/nl2code/projects/plbart/PLBART/scripts/text_to_code/base/concode/checkpoint_best.pt', 'post_process': 'sentencepiece', 'quiet': False, 'model_overrides': '{}', 'results_path': None}, 'distributed_training': {'_name': None, 'distributed_world_size': 1, 'distributed_num_procs': 1, 'distributed_rank': 0, 'distributed_backend': 'nccl', 'distributed_init_method': None, 'distributed_port': -1, 'device_id': 0, 'distributed_no_spawn': False, 'ddp_backend': 'pytorch_ddp', 'ddp_comm_hook': 'none', 'bucket_cap_mb': 25, 'fix_batches_to_gpus': False, 'find_unused_parameters': False, 'gradient_as_bucket_view': False, 'fast_stat_sync': False, 'heartbeat_timeout': -1, 'broadcast_buffers': False, 'slowmo_momentum': None, 'slowmo_base_algorithm': 'localsgd', 'localsgd_frequency': 3, 'nprocs_per_node': 1, 'pipeline_model_parallel': False, 'pipeline_balance': None, 'pipeline_devices': None, 'pipeline_chunks': 0, 'pipeline_encoder_balance': None, 'pipeline_encoder_devices': None, 'pipeline_decoder_balance': None, 'pipeline_decoder_devices': None, 'pipeline_checkpoint': 'never', 'zero_sharding': 'none', 'fp16': False, 'memory_efficient_fp16': False, 'tpu': False, 'no_reshard_after_forward': False, 'fp32_reduce_scatter': False, 'cpu_offload': False, 'use_sharded_state': False, 'not_fsdp_flatten_parameters': False}, 'dataset': {'_name': None, 'num_workers': 1, 'skip_invalid_size_inputs_valid_test': False, 'max_tokens': None, 'batch_size': 4, 'required_batch_size_multiple': 8, 'required_seq_len_multiple': 1, 'dataset_impl': None, 'data_buffer_size': 10, 'train_subset': 'train', 'valid_subset': 'valid', 'combine_valid_subsets': None, 'ignore_unused_valid_subsets': False, 'validate_interval': 1, 'validate_interval_updates': 0, 'validate_after_updates': 0, 'fixed_validation_seed': None, 'disable_validation': False, 'max_tokens_valid': None, 'batch_size_valid': 4, 'max_valid_steps': None, 'curriculum': 0, 'gen_subset': 'test', 'num_shards': 1, 'shard_id': 0, 'grouped_shuffling': False, 'update_epoch_batch_itr': False, 'update_ordered_indices_seed': False}, 'optimization': {'_name': None, 'max_epoch': 0, 'max_update': 0, 'stop_time_hours': 0.0, 'clip_norm': 0.0, 'sentence_avg': False, 'update_freq': [1], 'lr': [0.25], 'stop_min_lr': -1.0, 'use_bmuf': False, 'skip_remainder_batch': False}, 'checkpoint': {'_name': None, 'save_dir': 'checkpoints', 'restore_file': 'checkpoint_last.pt', 'continue_once': None, 'finetune_from_model': None, 'reset_dataloader': False, 'reset_lr_scheduler': False, 'reset_meters': False, 'reset_optimizer': False, 
'optimizer_overrides': '{}', 'save_interval': 1, 'save_interval_updates': 0, 'keep_interval_updates': -1, 'keep_interval_updates_pattern': -1, 'keep_last_epochs': -1, 'keep_best_checkpoints': -1, 'no_save': False, 'no_epoch_checkpoints': False, 'no_last_checkpoints': False, 'no_save_optimizer_state': False, 'best_checkpoint_metric': 'loss', 'maximize_best_checkpoint_metric': False, 'patience': -1, 'checkpoint_suffix': '', 'checkpoint_shard_count': 1, 'load_checkpoint_on_all_dp_ranks': False, 'write_checkpoints_asynchronously': False, 'model_parallel_size': 1}, 'bmuf': {'_name': None, 'block_lr': 1.0, 'block_momentum': 0.875, 'global_sync_iter': 50, 'warmup_iterations': 500, 'use_nbm': False, 'average_sync': False, 'distributed_world_size': 1}, 'generation': {'_name': None, 'beam': 10, 'nbest': 1, 'max_len_a': 0.0, 'max_len_b': 200, 'min_len': 1, 'match_source_len': False, 'unnormalized': False, 'no_early_stop': False, 'no_beamable_mm': False, 'lenpen': 1.0, 'unkpen': 0.0, 'replace_unk': None, 'sacrebleu': True, 'score_reference': False, 'prefix_size': 0, 'no_repeat_ngram_size': 0, 'sampling': False, 'sampling_topk': -1, 'sampling_topp': -1.0, 'constraints': None, 'temperature': 1.0, 'diverse_beam_groups': -1, 'diverse_beam_strength': 0.5, 'diversity_rate': -1.0, 'print_alignment': None, 'print_step': False, 'lm_path': None, 'lm_weight': 0.0, 'iter_decode_eos_penalty': 0.0, 'iter_decode_max_iter': 10, 'iter_decode_force_max_iter': False, 'iter_decode_with_beam': 1, 'iter_decode_with_external_reranker': False, 'retain_iter_history': False, 'retain_dropout': False, 'retain_dropout_modules': None, 'decoding_format': None, 'no_seed_provided': False}, 'eval_lm': {'_name': None, 'output_word_probs': False, 'output_word_stats': False, 'context_window': 0, 'softmax_batch': 9223372036854775807}, 'interactive': {'_name': None, 'buffer_size': 0, 'input': '-'}, 'model': {'_name': 'wav2vec2', 'extractor_mode': 'default', 'encoder_layers': 12, 'encoder_embed_dim': 768, 'encoder_ffn_embed_dim': 3072, 'encoder_attention_heads': 12, 'activation_fn': 'gelu', 'layer_type': 'transformer', 'dropout': 0.1, 'attention_dropout': 0.1, 'activation_dropout': 0.0, 'encoder_layerdrop': 0.0, 'dropout_input': 0.0, 'dropout_features': 0.0, 'final_dim': 0, 'layer_norm_first': False, 'conv_feature_layers': '[(512, 10, 5)] + [(512, 3, 2)] * 4 + [(512,2,2)] + [(512,2,2)]', 'conv_bias': False, 'logit_temp': 0.1, 'quantize_targets': False, 'quantize_input': False, 'same_quantizer': False, 'target_glu': False, 'feature_grad_mult': 1.0, 'quantizer_depth': 1, 'quantizer_factor': 3, 'latent_vars': 320, 'latent_groups': 2, 'latent_dim': 0, 'mask_length': 10, 'mask_prob': 0.65, 'mask_selection': 'static', 'mask_other': 0.0, 'no_mask_overlap': False, 'mask_min_space': 1, 'require_same_masks': True, 'mask_dropout': 0.0, 'mask_channel_length': 10, 'mask_channel_prob': 0.0, 'mask_channel_before': False, 'mask_channel_selection': 'static', 'mask_channel_other': 0.0, 'no_mask_channel_overlap': False, 'mask_channel_min_space': 1, 'num_negatives': 100, 'negatives_from_everywhere': False, 'cross_sample_negatives': 0, 'codebook_negatives': 0, 'conv_pos': 128, 'conv_pos_groups': 16, 'pos_conv_depth': 1, 'latent_temp': [2.0, 0.5, 0.999995], 'max_positions': 100000, 'checkpoint_activations': False, 'required_seq_len_multiple': 1, 'crop_seq_to_multiple': 1, 'depthwise_conv_kernel_size': 31, 'attn_type': '', 'pos_enc_type': 'abs', 'fp16': False}, 'task': Namespace(_name='translation_from_pretrained_bart', all_gather_list_size=16384, amp=False, 
amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, arch='wav2vec2', azureml_logging=False, batch_size=4, batch_size_valid=4, beam=10, best_checkpoint_metric='loss', bf16=False, bpe=None, broadcast_buffers=False, bucket_cap_mb=25, checkpoint_shard_count=1, checkpoint_suffix='', combine_valid_subsets=None, constraints=None, continue_once=None, cpu=False, cpu_offload=False, criterion='cross_entropy', curriculum=0, data='/content/drive/MyDrive/nl2code/projects/plbart/PLBART/data/codeXglue/text-to-code/concode/data-bin', data_buffer_size=10, dataset_impl=None, ddp_backend='pytorch_ddp', ddp_comm_hook='none', decoding_format=None, device_id=0, disable_validation=False, distributed_backend='nccl', distributed_init_method=None, distributed_no_spawn=False, distributed_num_procs=1, distributed_port=-1, distributed_rank=0, distributed_world_size=1, diverse_beam_groups=-1, diverse_beam_strength=0.5, diversity_rate=-1.0, empty_cache_freq=0, eos=2, eval_bleu=False, eval_bleu_args='{}', eval_bleu_detok='space', eval_bleu_detok_args='{}', eval_bleu_print_samples=False, eval_bleu_remove_bpe=None, eval_tokenized_bleu=False, fast_stat_sync=False, find_unused_parameters=False, finetune_from_model=None, fix_batches_to_gpus=False, fixed_validation_seed=None, force_anneal=None, fp16=False, fp16_init_scale=128, fp16_no_flatten_grads=False, fp16_scale_tolerance=0.0, fp16_scale_window=None, fp32_reduce_scatter=False, gen_subset='test', gradient_as_bucket_view=False, grouped_shuffling=False, heartbeat_timeout=-1, ignore_unused_valid_subsets=False, iter_decode_eos_penalty=0.0, iter_decode_force_max_iter=False, iter_decode_max_iter=10, iter_decode_with_beam=1, iter_decode_with_external_reranker=False, keep_best_checkpoints=-1, keep_interval_updates=-1, keep_interval_updates_pattern=-1, keep_last_epochs=-1, langs='java,python,en_XX', left_pad_source=True, left_pad_target=False, lenpen=1.0, lm_path=None, lm_weight=0.0, load_alignments=False, load_checkpoint_on_all_dp_ranks=False, localsgd_frequency=3, log_file=None, log_format=None, log_interval=100, lr_scheduler='fixed', lr_shrink=0.1, match_source_len=False, max_len_a=0, max_len_b=200, max_source_positions=1024, max_target_positions=1024, max_tokens=None, max_tokens_valid=None, max_valid_steps=None, maximize_best_checkpoint_metric=False, memory_efficient_bf16=False, memory_efficient_fp16=False, min_len=1, min_loss_scale=0.0001, model_overrides='{}', model_parallel_size=1, nbest=1, no_beamable_mm=False, no_early_stop=False, no_epoch_checkpoints=False, no_last_checkpoints=False, no_progress_bar=False, no_repeat_ngram_size=0, no_reshard_after_forward=False, no_save=False, no_save_optimizer_state=False, no_seed_provided=False, not_fsdp_flatten_parameters=False, nprocs_per_node=1, num_batch_buckets=0, num_shards=1, num_workers=1, on_cpu_convert_precision=False, optimizer=None, optimizer_overrides='{}', pad=1, path='/content/drive/MyDrive/nl2code/projects/plbart/PLBART/scripts/text_to_code/base/concode/checkpoint_best.pt', patience=-1, pipeline_balance=None, pipeline_checkpoint='never', pipeline_chunks=0, pipeline_decoder_balance=None, pipeline_decoder_devices=None, pipeline_devices=None, pipeline_encoder_balance=None, pipeline_encoder_devices=None, pipeline_model_parallel=False, plasma_path='/tmp/plasma', post_process='sentencepiece', prefix_size=0, prepend_bos=False, print_alignment=None, print_step=False, profile=False, quantization_config_path=None, quiet=False, replace_unk=None, required_batch_size_multiple=8, required_seq_len_multiple=1, 
reset_dataloader=False, reset_logging=False, reset_lr_scheduler=False, reset_meters=False, reset_optimizer=False, restore_file='checkpoint_last.pt', results_path=None, retain_dropout=False, retain_dropout_modules=None, retain_iter_history=False, sacrebleu=True, sampling=False, sampling_topk=-1, sampling_topp=-1.0, save_dir='checkpoints', save_interval=1, save_interval_updates=0, score_reference=False, scoring='bleu', seed=1, shard_id=0, skip_invalid_size_inputs_valid_test=False, slowmo_base_algorithm='localsgd', slowmo_momentum=None, source_lang='en_XX', suppress_crashes=False, target_lang='java', task='translation_from_pretrained_bart', temperature=1.0, tensorboard_logdir=None, threshold_loss_scale=None, tokenizer=None, tpu=False, train_subset='train', truncate_source=False, unk=3, unkpen=0, unnormalized=False, update_epoch_batch_itr=False, update_ordered_indices_seed=False, upsample_primary=-1, use_plasma_view=False, use_sharded_state=False, user_dir=None, valid_subset='valid', validate_after_updates=0, validate_interval=1, validate_interval_updates=0, wandb_project=None, warmup_updates=0, write_checkpoints_asynchronously=False, zero_sharding='none'), 'criterion': {'_name': 'cross_entropy', 'sentence_avg': True}, 'optimizer': None, 'lr_scheduler': {'_name': 'fixed', 'force_anneal': None, 'lr_shrink': 0.1, 'warmup_updates': 0, 'lr': [0.25]}, 'scoring': {'_name': 'bleu', 'pad': 1, 'eos': 2, 'unk': 3}, 'bpe': None, 'tokenizer': None, 'ema': {'_name': None, 'store_ema': False, 'ema_decay': 0.9999, 'ema_start_update': 0, 'ema_seed_model': None, 'ema_update_freq': 1, 'ema_fp32': False}}
2022-03-29 10:28:36 | INFO | fairseq.tasks.translation | [en_XX] dictionary: 50001 types
2022-03-29 10:28:36 | INFO | fairseq.tasks.translation | [java] dictionary: 50001 types
2022-03-29 10:28:36 | INFO | fairseq_cli.generate | loading model(s) from /content/drive/MyDrive/nl2code/projects/plbart/PLBART/scripts/text_to_code/base/concode/checkpoint_best.pt
Traceback (most recent call last):
  File "/usr/local/bin/fairseq-generate", line 8, in <module>
    sys.exit(cli_main())
  File "/usr/local/lib/python3.7/dist-packages/fairseq_cli/generate.py", line 413, in cli_main
    main(args)
  File "/usr/local/lib/python3.7/dist-packages/fairseq_cli/generate.py", line 50, in main
    return _main(cfg, sys.stdout)
  File "/usr/local/lib/python3.7/dist-packages/fairseq_cli/generate.py", line 102, in _main
    num_shards=cfg.checkpoint.checkpoint_shard_count,
  File "/usr/local/lib/python3.7/dist-packages/fairseq/checkpoint_utils.py", line 370, in load_model_ensemble
    state,
  File "/usr/local/lib/python3.7/dist-packages/fairseq/checkpoint_utils.py", line 419, in load_model_ensemble_and_task
    raise IOError("Model file not found: {}".format(filename))
OSError: Model file not found: /content/drive/MyDrive/nl2code/projects/plbart/PLBART/scripts/text_to_code/base/concode/checkpoint_best.pt
Traceback (most recent call last):
  File "evaluator.py", line 50, in <module>
    main()
  File "evaluator.py", line 26, in main
    assert len(preds) == len(gts), f"Samples of predictions and answers are not equal, {len(preds)}: {len(gts)}"
AssertionError: Samples of predictions and answers are not equal, 0: 2000
Traceback (most recent call last):
  File "/content/drive/MyDrive/nl2code/projects/plbart/PLBART/evaluation/CodeBLEU/calc_code_bleu.py", line 110, in <module>
    main()
  File "/content/drive/MyDrive/nl2code/projects/plbart/PLBART/evaluation/CodeBLEU/calc_code_bleu.py", line 85, in main
    assert len(hypothesis) == len(pre_references[i])
AssertionError
/content/drive/MyDrive/nl2code/projects/plbart/PLBART

Then I changed the argument --min-lr to --stop-min-lr, following the Command-line Tools section of fairseq's documentation. Unfortunately, I got another error:

/content/drive/MyDrive/nl2code/projects/plbart/PLBART/scripts/text_to_code
2022-03-29 10:40:42 | INFO | numexpr.utils | NumExpr defaulting to 2 threads.
2022-03-29 10:40:42 | INFO | fairseq.tasks.text_to_speech | Please install tensorboardX: pip install tensorboardX
2022-03-29 10:40:45 | ERROR | fairseq.dataclass.utils | Error when composing. Overrides: ['common.no_progress_bar=False', 'common.log_interval=100', "common.log_format='json'", 'common.log_file=null', 'common.tensorboard_logdir=null', 'common.wandb_project=null', 'common.azureml_logging=False', 'common.seed=1234', 'common.cpu=False', 'common.tpu=False', 'common.bf16=False', 'common.memory_efficient_bf16=False', 'common.fp16=False', 'common.memory_efficient_fp16=False', 'common.fp16_no_flatten_grads=False', 'common.fp16_init_scale=128', 'common.fp16_scale_window=null', 'common.fp16_scale_tolerance=0.0', 'common.on_cpu_convert_precision=False', 'common.min_loss_scale=0.0001', 'common.threshold_loss_scale=null', 'common.amp=False', 'common.amp_batch_retries=2', 'common.amp_init_scale=128', 'common.amp_scale_window=null', 'common.user_dir=null', 'common.empty_cache_freq=0', 'common.all_gather_list_size=16384', 'common.model_parallel_size=1', 'common.quantization_config_path=null', 'common.profile=False', 'common.reset_logging=False', 'common.suppress_crashes=False', 'common.use_plasma_view=False', "common.plasma_path='/tmp/plasma'", 'common_eval.path=null', 'common_eval.post_process=null', 'common_eval.quiet=False', "common_eval.model_overrides='{}'", 'common_eval.results_path=null', 'distributed_training.distributed_world_size=1', 'distributed_training.distributed_num_procs=1', 'distributed_training.distributed_rank=0', "distributed_training.distributed_backend='nccl'", 'distributed_training.distributed_init_method=null', 'distributed_training.distributed_port=-1', 'distributed_training.device_id=0', 'distributed_training.distributed_no_spawn=False', "distributed_training.ddp_backend='no_c10d'", "distributed_training.ddp_comm_hook='none'", 'distributed_training.bucket_cap_mb=25', 'distributed_training.fix_batches_to_gpus=False', 'distributed_training.find_unused_parameters=False', 'distributed_training.gradient_as_bucket_view=False', 'distributed_training.fast_stat_sync=False', 'distributed_training.heartbeat_timeout=-1', 'distributed_training.broadcast_buffers=False', 'distributed_training.slowmo_momentum=null', "distributed_training.slowmo_base_algorithm='localsgd'", 'distributed_training.localsgd_frequency=3', 'distributed_training.nprocs_per_node=1', 'distributed_training.pipeline_model_parallel=False', 'distributed_training.pipeline_balance=null', 'distributed_training.pipeline_devices=null', 'distributed_training.pipeline_chunks=0', 'distributed_training.pipeline_encoder_balance=null', 'distributed_training.pipeline_encoder_devices=null', 'distributed_training.pipeline_decoder_balance=null', 'distributed_training.pipeline_decoder_devices=null', "distributed_training.pipeline_checkpoint='never'", "distributed_training.zero_sharding='none'", 'distributed_training.fp16=False', 'distributed_training.memory_efficient_fp16=False', 'distributed_training.tpu=False', 'distributed_training.no_reshard_after_forward=False', 'distributed_training.fp32_reduce_scatter=False', 'distributed_training.cpu_offload=False', 'distributed_training.use_sharded_state=False', 'distributed_training.not_fsdp_flatten_parameters=False', 'dataset.num_workers=1', 'dataset.skip_invalid_size_inputs_valid_test=False', 'dataset.max_tokens=null', 'dataset.batch_size=8', 'dataset.required_batch_size_multiple=8', 'dataset.required_seq_len_multiple=1', 'dataset.dataset_impl=null', 'dataset.data_buffer_size=10', "dataset.train_subset='train'", "dataset.valid_subset='valid'", 'dataset.combine_valid_subsets=null', 
'dataset.ignore_unused_valid_subsets=False', 'dataset.validate_interval=1', 'dataset.validate_interval_updates=0', 'dataset.validate_after_updates=0', 'dataset.fixed_validation_seed=null', 'dataset.disable_validation=False', 'dataset.max_tokens_valid=null', 'dataset.batch_size_valid=8', 'dataset.max_valid_steps=null', 'dataset.curriculum=0', "dataset.gen_subset='test'", 'dataset.num_shards=1', 'dataset.shard_id=0', 'dataset.grouped_shuffling=False', 'dataset.update_epoch_batch_itr=False', 'dataset.update_ordered_indices_seed=False', 'optimization.max_epoch=30', 'optimization.max_update=200000', 'optimization.stop_time_hours=0.0', 'optimization.clip_norm=0.0', 'optimization.sentence_avg=False', 'optimization.update_freq=[3]', 'optimization.lr=[5e-05]', 'optimization.stop_min_lr=-1.0', 'optimization.use_bmuf=False', 'optimization.skip_remainder_batch=False', "checkpoint.save_dir='/content/drive/MyDrive/nl2code/projects/plbart/PLBART/scripts/text_to_code/base/concode'", "checkpoint.restore_file='/content/drive/MyDrive/nl2code/projects/plbart/PLBART/pretrain/plbart_base.pt'", 'checkpoint.continue_once=null', 'checkpoint.finetune_from_model=null', 'checkpoint.reset_dataloader=True', 'checkpoint.reset_lr_scheduler=True', 'checkpoint.reset_meters=True', 'checkpoint.reset_optimizer=True', "checkpoint.optimizer_overrides='{}'", 'checkpoint.save_interval=1', 'checkpoint.save_interval_updates=0', 'checkpoint.keep_interval_updates=-1', 'checkpoint.keep_interval_updates_pattern=-1', 'checkpoint.keep_last_epochs=-1', 'checkpoint.keep_best_checkpoints=-1', 'checkpoint.no_save=False', 'checkpoint.no_epoch_checkpoints=True', 'checkpoint.no_last_checkpoints=False', 'checkpoint.no_save_optimizer_state=False', "checkpoint.best_checkpoint_metric='bleu'", 'checkpoint.maximize_best_checkpoint_metric=True', 'checkpoint.patience=3', "checkpoint.checkpoint_suffix=''", 'checkpoint.checkpoint_shard_count=1', 'checkpoint.load_checkpoint_on_all_dp_ranks=False', 'checkpoint.write_checkpoints_asynchronously=False', 'checkpoint.model_parallel_size=1', 'bmuf.block_lr=1.0', 'bmuf.block_momentum=0.875', 'bmuf.global_sync_iter=50', 'bmuf.warmup_iterations=500', 'bmuf.use_nbm=False', 'bmuf.average_sync=False', 'bmuf.distributed_world_size=1', 'generation.beam=5', 'generation.nbest=1', 'generation.max_len_a=0.0', 'generation.max_len_b=200', 'generation.min_len=1', 'generation.match_source_len=False', 'generation.unnormalized=False', 'generation.no_early_stop=False', 'generation.no_beamable_mm=False', 'generation.lenpen=1.0', 'generation.unkpen=0.0', 'generation.replace_unk=null', 'generation.sacrebleu=False', 'generation.score_reference=False', 'generation.prefix_size=0', 'generation.no_repeat_ngram_size=0', 'generation.sampling=False', 'generation.sampling_topk=-1', 'generation.sampling_topp=-1.0', 'generation.constraints=null', 'generation.temperature=1.0', 'generation.diverse_beam_groups=-1', 'generation.diverse_beam_strength=0.5', 'generation.diversity_rate=-1.0', 'generation.print_alignment=null', 'generation.print_step=False', 'generation.lm_path=null', 'generation.lm_weight=0.0', 'generation.iter_decode_eos_penalty=0.0', 'generation.iter_decode_max_iter=10', 'generation.iter_decode_force_max_iter=False', 'generation.iter_decode_with_beam=1', 'generation.iter_decode_with_external_reranker=False', 'generation.retain_iter_history=False', 'generation.retain_dropout=False', 'generation.retain_dropout_modules=null', 'generation.decoding_format=null', 'generation.no_seed_provided=False', 'eval_lm.output_word_probs=False', 
'eval_lm.output_word_stats=False', 'eval_lm.context_window=0', 'eval_lm.softmax_batch=9223372036854775807', 'interactive.buffer_size=0', "interactive.input='-'", 'ema.store_ema=False', 'ema.ema_decay=0.9999', 'ema.ema_start_update=0', 'ema.ema_seed_model=null', 'ema.ema_update_freq=1', 'ema.ema_fp32=False', 'criterion=label_smoothed_cross_entropy', 'criterion._name=label_smoothed_cross_entropy', 'criterion.label_smoothing=0.2', 'criterion.report_accuracy=False', 'criterion.ignore_prefix_size=0', 'criterion.sentence_avg=False', 'bpe=sentencepiece', 'bpe._name=sentencepiece', "bpe.sentencepiece_model='/content/drive/MyDrive/nl2code/projects/plbart/PLBART/sentencepiece/sentencepiece.bpe.model'", 'bpe.sentencepiece_enable_sampling=False', 'bpe.sentencepiece_alpha=null', 'optimizer=adam', 'optimizer._name=adam', "optimizer.adam_betas='(0.9, 0.98)'", 'optimizer.adam_eps=1e-06', 'optimizer.weight_decay=0.01', 'optimizer.use_old_adam=False', 'optimizer.fp16_adam_stats=False', 'optimizer.tpu=False', 'optimizer.lr=[5e-05]', 'lr_scheduler=polynomial_decay', 'lr_scheduler._name=polynomial_decay', 'lr_scheduler.warmup_updates=1000', 'lr_scheduler.force_anneal=null', 'lr_scheduler.end_learning_rate=0.0', 'lr_scheduler.power=1.0', 'lr_scheduler.total_num_update=null', 'lr_scheduler.lr=[5e-05]', 'scoring=bleu', 'scoring._name=bleu', 'scoring.pad=1', 'scoring.eos=2', 'scoring.unk=3']
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/hydra/_internal/config_loader_impl.py", line 513, in _apply_overrides_to_config
    OmegaConf.update(cfg, key, value, merge=True)
  File "/usr/local/lib/python3.7/dist-packages/omegaconf/omegaconf.py", line 613, in update
    root.__setattr__(last_key, value)
  File "/usr/local/lib/python3.7/dist-packages/omegaconf/dictconfig.py", line 285, in __setattr__
    raise e
  File "/usr/local/lib/python3.7/dist-packages/omegaconf/dictconfig.py", line 282, in __setattr__
    self.__set_impl(key, value)
  File "/usr/local/lib/python3.7/dist-packages/omegaconf/dictconfig.py", line 266, in __set_impl
    self._set_item_impl(key, value)
  File "/usr/local/lib/python3.7/dist-packages/omegaconf/basecontainer.py", line 398, in _set_item_impl
    self._validate_set(key, value)
  File "/usr/local/lib/python3.7/dist-packages/omegaconf/dictconfig.py", line 143, in _validate_set
    self._validate_set_merge_impl(key, value, is_assign=True)
  File "/usr/local/lib/python3.7/dist-packages/omegaconf/dictconfig.py", line 159, in _validate_set_merge_impl
    cause=ValidationError("child '$FULL_KEY' is not Optional"),
  File "/usr/local/lib/python3.7/dist-packages/omegaconf/base.py", line 101, in _format_and_raise
    type_override=type_override,
  File "/usr/local/lib/python3.7/dist-packages/omegaconf/_utils.py", line 694, in format_and_raise
    _raise(ex, cause)
  File "/usr/local/lib/python3.7/dist-packages/omegaconf/_utils.py", line 610, in _raise
    raise ex  # set end OC_CAUSE=1 for full backtrace
omegaconf.errors.ValidationError: child 'lr_scheduler.total_num_update' is not Optional
	full_key: lr_scheduler.total_num_update
	reference_type=Optional[PolynomialDecayLRScheduleConfig]
	object_type=PolynomialDecayLRScheduleConfig

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/bin/fairseq-train", line 8, in <module>
    sys.exit(cli_main())
  File "/usr/local/lib/python3.7/dist-packages/fairseq_cli/train.py", line 522, in cli_main
    cfg = convert_namespace_to_omegaconf(args)
  File "/usr/local/lib/python3.7/dist-packages/fairseq/dataclass/utils.py", line 389, in convert_namespace_to_omegaconf
    composed_cfg = compose("config", overrides=overrides, strict=False)
  File "/usr/local/lib/python3.7/dist-packages/hydra/experimental/compose.py", line 37, in compose
    with_log_configuration=False,
  File "/usr/local/lib/python3.7/dist-packages/hydra/_internal/hydra.py", line 512, in compose_config
    from_shell=from_shell,
  File "/usr/local/lib/python3.7/dist-packages/hydra/_internal/config_loader_impl.py", line 156, in load_configuration
    from_shell=from_shell,
  File "/usr/local/lib/python3.7/dist-packages/hydra/_internal/config_loader_impl.py", line 277, in _load_configuration
    ConfigLoaderImpl._apply_overrides_to_config(config_overrides, cfg)
  File "/usr/local/lib/python3.7/dist-packages/hydra/_internal/config_loader_impl.py", line 522, in _apply_overrides_to_config
    ) from ex
hydra.errors.ConfigCompositionException: Error merging override lr_scheduler.total_num_update=null
2022-03-29 10:40:47 | INFO | fairseq.tasks.text_to_speech | Please install tensorboardX: pip install tensorboardX
2022-03-29 10:40:50 | INFO | fairseq_cli.generate | {'_name': None, 'common': {'_name': None, 'no_progress_bar': False, 'log_interval': 100, 'log_format': None, 'log_file': None, 'tensorboard_logdir': None, 'wandb_project': None, 'azureml_logging': False, 'seed': 1, 'cpu': False, 'tpu': False, 'bf16': False, 'memory_efficient_bf16': False, 'fp16': False, 'memory_efficient_fp16': False, 'fp16_no_flatten_grads': False, 'fp16_init_scale': 128, 'fp16_scale_window': None, 'fp16_scale_tolerance': 0.0, 'on_cpu_convert_precision': False, 'min_loss_scale': 0.0001, 'threshold_loss_scale': None, 'amp': False, 'amp_batch_retries': 2, 'amp_init_scale': 128, 'amp_scale_window': None, 'user_dir': None, 'empty_cache_freq': 0, 'all_gather_list_size': 16384, 'model_parallel_size': 1, 'quantization_config_path': None, 'profile': False, 'reset_logging': False, 'suppress_crashes': False, 'use_plasma_view': False, 'plasma_path': '/tmp/plasma'}, 'common_eval': {'_name': None, 'path': '/content/drive/MyDrive/nl2code/projects/plbart/PLBART/scripts/text_to_code/base/concode/checkpoint_best.pt', 'post_process': 'sentencepiece', 'quiet': False, 'model_overrides': '{}', 'results_path': None}, 'distributed_training': {'_name': None, 'distributed_world_size': 1, 'distributed_num_procs': 1, 'distributed_rank': 0, 'distributed_backend': 'nccl', 'distributed_init_method': None, 'distributed_port': -1, 'device_id': 0, 'distributed_no_spawn': False, 'ddp_backend': 'pytorch_ddp', 'ddp_comm_hook': 'none', 'bucket_cap_mb': 25, 'fix_batches_to_gpus': False, 'find_unused_parameters': False, 'gradient_as_bucket_view': False, 'fast_stat_sync': False, 'heartbeat_timeout': -1, 'broadcast_buffers': False, 'slowmo_momentum': None, 'slowmo_base_algorithm': 'localsgd', 'localsgd_frequency': 3, 'nprocs_per_node': 1, 'pipeline_model_parallel': False, 'pipeline_balance': None, 'pipeline_devices': None, 'pipeline_chunks': 0, 'pipeline_encoder_balance': None, 'pipeline_encoder_devices': None, 'pipeline_decoder_balance': None, 'pipeline_decoder_devices': None, 'pipeline_checkpoint': 'never', 'zero_sharding': 'none', 'fp16': False, 'memory_efficient_fp16': False, 'tpu': False, 'no_reshard_after_forward': False, 'fp32_reduce_scatter': False, 'cpu_offload': False, 'use_sharded_state': False, 'not_fsdp_flatten_parameters': False}, 'dataset': {'_name': None, 'num_workers': 1, 'skip_invalid_size_inputs_valid_test': False, 'max_tokens': None, 'batch_size': 4, 'required_batch_size_multiple': 8, 'required_seq_len_multiple': 1, 'dataset_impl': None, 'data_buffer_size': 10, 'train_subset': 'train', 'valid_subset': 'valid', 'combine_valid_subsets': None, 'ignore_unused_valid_subsets': False, 'validate_interval': 1, 'validate_interval_updates': 0, 'validate_after_updates': 0, 'fixed_validation_seed': None, 'disable_validation': False, 'max_tokens_valid': None, 'batch_size_valid': 4, 'max_valid_steps': None, 'curriculum': 0, 'gen_subset': 'test', 'num_shards': 1, 'shard_id': 0, 'grouped_shuffling': False, 'update_epoch_batch_itr': False, 'update_ordered_indices_seed': False}, 'optimization': {'_name': None, 'max_epoch': 0, 'max_update': 0, 'stop_time_hours': 0.0, 'clip_norm': 0.0, 'sentence_avg': False, 'update_freq': [1], 'lr': [0.25], 'stop_min_lr': -1.0, 'use_bmuf': False, 'skip_remainder_batch': False}, 'checkpoint': {'_name': None, 'save_dir': 'checkpoints', 'restore_file': 'checkpoint_last.pt', 'continue_once': None, 'finetune_from_model': None, 'reset_dataloader': False, 'reset_lr_scheduler': False, 'reset_meters': False, 'reset_optimizer': False, 
'optimizer_overrides': '{}', 'save_interval': 1, 'save_interval_updates': 0, 'keep_interval_updates': -1, 'keep_interval_updates_pattern': -1, 'keep_last_epochs': -1, 'keep_best_checkpoints': -1, 'no_save': False, 'no_epoch_checkpoints': False, 'no_last_checkpoints': False, 'no_save_optimizer_state': False, 'best_checkpoint_metric': 'loss', 'maximize_best_checkpoint_metric': False, 'patience': -1, 'checkpoint_suffix': '', 'checkpoint_shard_count': 1, 'load_checkpoint_on_all_dp_ranks': False, 'write_checkpoints_asynchronously': False, 'model_parallel_size': 1}, 'bmuf': {'_name': None, 'block_lr': 1.0, 'block_momentum': 0.875, 'global_sync_iter': 50, 'warmup_iterations': 500, 'use_nbm': False, 'average_sync': False, 'distributed_world_size': 1}, 'generation': {'_name': None, 'beam': 10, 'nbest': 1, 'max_len_a': 0.0, 'max_len_b': 200, 'min_len': 1, 'match_source_len': False, 'unnormalized': False, 'no_early_stop': False, 'no_beamable_mm': False, 'lenpen': 1.0, 'unkpen': 0.0, 'replace_unk': None, 'sacrebleu': True, 'score_reference': False, 'prefix_size': 0, 'no_repeat_ngram_size': 0, 'sampling': False, 'sampling_topk': -1, 'sampling_topp': -1.0, 'constraints': None, 'temperature': 1.0, 'diverse_beam_groups': -1, 'diverse_beam_strength': 0.5, 'diversity_rate': -1.0, 'print_alignment': None, 'print_step': False, 'lm_path': None, 'lm_weight': 0.0, 'iter_decode_eos_penalty': 0.0, 'iter_decode_max_iter': 10, 'iter_decode_force_max_iter': False, 'iter_decode_with_beam': 1, 'iter_decode_with_external_reranker': False, 'retain_iter_history': False, 'retain_dropout': False, 'retain_dropout_modules': None, 'decoding_format': None, 'no_seed_provided': False}, 'eval_lm': {'_name': None, 'output_word_probs': False, 'output_word_stats': False, 'context_window': 0, 'softmax_batch': 9223372036854775807}, 'interactive': {'_name': None, 'buffer_size': 0, 'input': '-'}, 'model': {'_name': 'wav2vec2', 'extractor_mode': 'default', 'encoder_layers': 12, 'encoder_embed_dim': 768, 'encoder_ffn_embed_dim': 3072, 'encoder_attention_heads': 12, 'activation_fn': 'gelu', 'layer_type': 'transformer', 'dropout': 0.1, 'attention_dropout': 0.1, 'activation_dropout': 0.0, 'encoder_layerdrop': 0.0, 'dropout_input': 0.0, 'dropout_features': 0.0, 'final_dim': 0, 'layer_norm_first': False, 'conv_feature_layers': '[(512, 10, 5)] + [(512, 3, 2)] * 4 + [(512,2,2)] + [(512,2,2)]', 'conv_bias': False, 'logit_temp': 0.1, 'quantize_targets': False, 'quantize_input': False, 'same_quantizer': False, 'target_glu': False, 'feature_grad_mult': 1.0, 'quantizer_depth': 1, 'quantizer_factor': 3, 'latent_vars': 320, 'latent_groups': 2, 'latent_dim': 0, 'mask_length': 10, 'mask_prob': 0.65, 'mask_selection': 'static', 'mask_other': 0.0, 'no_mask_overlap': False, 'mask_min_space': 1, 'require_same_masks': True, 'mask_dropout': 0.0, 'mask_channel_length': 10, 'mask_channel_prob': 0.0, 'mask_channel_before': False, 'mask_channel_selection': 'static', 'mask_channel_other': 0.0, 'no_mask_channel_overlap': False, 'mask_channel_min_space': 1, 'num_negatives': 100, 'negatives_from_everywhere': False, 'cross_sample_negatives': 0, 'codebook_negatives': 0, 'conv_pos': 128, 'conv_pos_groups': 16, 'pos_conv_depth': 1, 'latent_temp': [2.0, 0.5, 0.999995], 'max_positions': 100000, 'checkpoint_activations': False, 'required_seq_len_multiple': 1, 'crop_seq_to_multiple': 1, 'depthwise_conv_kernel_size': 31, 'attn_type': '', 'pos_enc_type': 'abs', 'fp16': False}, 'task': Namespace(_name='translation_from_pretrained_bart', all_gather_list_size=16384, amp=False, 
amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, arch='wav2vec2', azureml_logging=False, batch_size=4, batch_size_valid=4, beam=10, best_checkpoint_metric='loss', bf16=False, bpe=None, broadcast_buffers=False, bucket_cap_mb=25, checkpoint_shard_count=1, checkpoint_suffix='', combine_valid_subsets=None, constraints=None, continue_once=None, cpu=False, cpu_offload=False, criterion='cross_entropy', curriculum=0, data='/content/drive/MyDrive/nl2code/projects/plbart/PLBART/data/codeXglue/text-to-code/concode/data-bin', data_buffer_size=10, dataset_impl=None, ddp_backend='pytorch_ddp', ddp_comm_hook='none', decoding_format=None, device_id=0, disable_validation=False, distributed_backend='nccl', distributed_init_method=None, distributed_no_spawn=False, distributed_num_procs=1, distributed_port=-1, distributed_rank=0, distributed_world_size=1, diverse_beam_groups=-1, diverse_beam_strength=0.5, diversity_rate=-1.0, empty_cache_freq=0, eos=2, eval_bleu=False, eval_bleu_args='{}', eval_bleu_detok='space', eval_bleu_detok_args='{}', eval_bleu_print_samples=False, eval_bleu_remove_bpe=None, eval_tokenized_bleu=False, fast_stat_sync=False, find_unused_parameters=False, finetune_from_model=None, fix_batches_to_gpus=False, fixed_validation_seed=None, force_anneal=None, fp16=False, fp16_init_scale=128, fp16_no_flatten_grads=False, fp16_scale_tolerance=0.0, fp16_scale_window=None, fp32_reduce_scatter=False, gen_subset='test', gradient_as_bucket_view=False, grouped_shuffling=False, heartbeat_timeout=-1, ignore_unused_valid_subsets=False, iter_decode_eos_penalty=0.0, iter_decode_force_max_iter=False, iter_decode_max_iter=10, iter_decode_with_beam=1, iter_decode_with_external_reranker=False, keep_best_checkpoints=-1, keep_interval_updates=-1, keep_interval_updates_pattern=-1, keep_last_epochs=-1, langs='java,python,en_XX', left_pad_source=True, left_pad_target=False, lenpen=1.0, lm_path=None, lm_weight=0.0, load_alignments=False, load_checkpoint_on_all_dp_ranks=False, localsgd_frequency=3, log_file=None, log_format=None, log_interval=100, lr_scheduler='fixed', lr_shrink=0.1, match_source_len=False, max_len_a=0, max_len_b=200, max_source_positions=1024, max_target_positions=1024, max_tokens=None, max_tokens_valid=None, max_valid_steps=None, maximize_best_checkpoint_metric=False, memory_efficient_bf16=False, memory_efficient_fp16=False, min_len=1, min_loss_scale=0.0001, model_overrides='{}', model_parallel_size=1, nbest=1, no_beamable_mm=False, no_early_stop=False, no_epoch_checkpoints=False, no_last_checkpoints=False, no_progress_bar=False, no_repeat_ngram_size=0, no_reshard_after_forward=False, no_save=False, no_save_optimizer_state=False, no_seed_provided=False, not_fsdp_flatten_parameters=False, nprocs_per_node=1, num_batch_buckets=0, num_shards=1, num_workers=1, on_cpu_convert_precision=False, optimizer=None, optimizer_overrides='{}', pad=1, path='/content/drive/MyDrive/nl2code/projects/plbart/PLBART/scripts/text_to_code/base/concode/checkpoint_best.pt', patience=-1, pipeline_balance=None, pipeline_checkpoint='never', pipeline_chunks=0, pipeline_decoder_balance=None, pipeline_decoder_devices=None, pipeline_devices=None, pipeline_encoder_balance=None, pipeline_encoder_devices=None, pipeline_model_parallel=False, plasma_path='/tmp/plasma', post_process='sentencepiece', prefix_size=0, prepend_bos=False, print_alignment=None, print_step=False, profile=False, quantization_config_path=None, quiet=False, replace_unk=None, required_batch_size_multiple=8, required_seq_len_multiple=1, 
reset_dataloader=False, reset_logging=False, reset_lr_scheduler=False, reset_meters=False, reset_optimizer=False, restore_file='checkpoint_last.pt', results_path=None, retain_dropout=False, retain_dropout_modules=None, retain_iter_history=False, sacrebleu=True, sampling=False, sampling_topk=-1, sampling_topp=-1.0, save_dir='checkpoints', save_interval=1, save_interval_updates=0, score_reference=False, scoring='bleu', seed=1, shard_id=0, skip_invalid_size_inputs_valid_test=False, slowmo_base_algorithm='localsgd', slowmo_momentum=None, source_lang='en_XX', suppress_crashes=False, target_lang='java', task='translation_from_pretrained_bart', temperature=1.0, tensorboard_logdir=None, threshold_loss_scale=None, tokenizer=None, tpu=False, train_subset='train', truncate_source=False, unk=3, unkpen=0, unnormalized=False, update_epoch_batch_itr=False, update_ordered_indices_seed=False, upsample_primary=-1, use_plasma_view=False, use_sharded_state=False, user_dir=None, valid_subset='valid', validate_after_updates=0, validate_interval=1, validate_interval_updates=0, wandb_project=None, warmup_updates=0, write_checkpoints_asynchronously=False, zero_sharding='none'), 'criterion': {'_name': 'cross_entropy', 'sentence_avg': True}, 'optimizer': None, 'lr_scheduler': {'_name': 'fixed', 'force_anneal': None, 'lr_shrink': 0.1, 'warmup_updates': 0, 'lr': [0.25]}, 'scoring': {'_name': 'bleu', 'pad': 1, 'eos': 2, 'unk': 3}, 'bpe': None, 'tokenizer': None, 'ema': {'_name': None, 'store_ema': False, 'ema_decay': 0.9999, 'ema_start_update': 0, 'ema_seed_model': None, 'ema_update_freq': 1, 'ema_fp32': False}}
2022-03-29 10:40:50 | INFO | fairseq.tasks.translation | [en_XX] dictionary: 50001 types
2022-03-29 10:40:50 | INFO | fairseq.tasks.translation | [java] dictionary: 50001 types
2022-03-29 10:40:50 | INFO | fairseq_cli.generate | loading model(s) from /content/drive/MyDrive/nl2code/projects/plbart/PLBART/scripts/text_to_code/base/concode/checkpoint_best.pt
Traceback (most recent call last):
  File "/usr/local/bin/fairseq-generate", line 8, in <module>
    sys.exit(cli_main())
  File "/usr/local/lib/python3.7/dist-packages/fairseq_cli/generate.py", line 413, in cli_main
    main(args)
  File "/usr/local/lib/python3.7/dist-packages/fairseq_cli/generate.py", line 50, in main
    return _main(cfg, sys.stdout)
  File "/usr/local/lib/python3.7/dist-packages/fairseq_cli/generate.py", line 102, in _main
    num_shards=cfg.checkpoint.checkpoint_shard_count,
  File "/usr/local/lib/python3.7/dist-packages/fairseq/checkpoint_utils.py", line 370, in load_model_ensemble
    state,
  File "/usr/local/lib/python3.7/dist-packages/fairseq/checkpoint_utils.py", line 419, in load_model_ensemble_and_task
    raise IOError("Model file not found: {}".format(filename))
OSError: Model file not found: /content/drive/MyDrive/nl2code/projects/plbart/PLBART/scripts/text_to_code/base/concode/checkpoint_best.pt
Traceback (most recent call last):
  File "evaluator.py", line 50, in <module>
    main()
  File "evaluator.py", line 26, in main
    assert len(preds) == len(gts), f"Samples of predictions and answers are not equal, {len(preds)}: {len(gts)}"
AssertionError: Samples of predictions and answers are not equal, 0: 2000
Traceback (most recent call last):
  File "/content/drive/MyDrive/nl2code/projects/plbart/PLBART/evaluation/CodeBLEU/calc_code_bleu.py", line 110, in <module>
    main()
  File "/content/drive/MyDrive/nl2code/projects/plbart/PLBART/evaluation/CodeBLEU/calc_code_bleu.py", line 85, in main
    assert len(hypothesis) == len(pre_references[i])
AssertionError
/content/drive/MyDrive/nl2code/projects/plbart/PLBART

I really don't know what to do this time.

My Environment

pytorch-1.10.0+cu111
google colab

translation with indentation

Hi,

I want to try translation between Java and Python. Since the current CodeXGLUE datasets represent functions on a single line, it is easy to fine-tune and test.
But what if I want to do this for Python, where indentation is very important? That is, how will the tokenization take care of it?
I saw an example in TransCoder where they used the format below for indentation, with INDENT, DEDENT and NEWLINE -

def rm_file ( path ) : NEWLINE try : NEWLINE INDENT os . remove (path) NEWLINE print ( " Deleted " )
DEDENT except : NEWLINE INDENT print ( " Error _ while _ deleting _ file " , path ) DEDENT

Can you suggest how to proceed further with indentation using PLBART?
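One possible starting point (a minimal sketch of an assumption, not the preprocessing PLBART actually uses) is Python's standard tokenize module, which already emits NEWLINE, INDENT, and DEDENT tokens, so a function can be flattened to a single line with explicit markers in the TransCoder style:

import io
import tokenize

def flatten_python(code: str) -> str:
    # Map the tokenizer's structural tokens to explicit markers and keep
    # everything else as space-separated tokens.
    pieces = []
    for tok in tokenize.generate_tokens(io.StringIO(code).readline):
        if tok.type in (tokenize.NEWLINE, tokenize.NL):
            pieces.append("NEWLINE")
        elif tok.type == tokenize.INDENT:
            pieces.append("INDENT")
        elif tok.type == tokenize.DEDENT:
            pieces.append("DEDENT")
        elif tok.type in (tokenize.ENDMARKER, tokenize.COMMENT):
            continue
        else:
            pieces.append(tok.string)
    return " ".join(pieces)

snippet = (
    "def rm_file(path):\n"
    "    try:\n"
    "        os.remove(path)\n"
    "    except OSError:\n"
    "        print('Error while deleting file', path)\n"
)
print(flatten_python(snippet))

The inverse mapping (markers back to real newlines and indentation) is what the generated Python would need before it can be executed.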

Data used for pretraining

hi!

thanks a lot for your contribution!

I was wondering how exactly you obtained the pre-training Java source code used in the paper. I read that it comes from the github_repos dataset in BigQuery.

Is it just all Java files found in the dataset or did you perform some kind of preprocessing, like removing duplicates (forks and such)?

Thanks in advance!
Gabriel.

Time required for fine-tuning

Hi, thanks for your great work!
When I fine-tuned downstream tasks on my NVIDIA 3090, I roughly calculated that 100k fine-tuning steps would take 2 days. This does not seem very acceptable. (In particular, CodeBERT takes only a few hours of fine-tuning for the same task and dataset.)

Any suggestions?
Thanks!

Size of sample is invalid since max_positions=(1024, 1024)

Hi @wasiahmad ,
I trained PLBART for Java -> Python translation, but while testing I was getting the error below -

2021-07-21 05:31:11 | INFO | train | {"epoch": 30, "train_loss": "2.69", "train_nll_loss": "0.723", "train_ppl": "1.65", "train_wps": "7795.4", "train_ups": "0.35", "train_wpb": "22402", "train_bsz": "58.2", "train_num_updates": "240", "train_lr": "1.2e-05", "train_gnorm": "0.607", "train_train_wall": "5", "train_wall": "638"}
2021-07-21 05:31:11 | INFO | fairseq_cli.train | done training in 637.0 seconds
Traceback (most recent call last):
  File "/home/jovyan/.local/bin/fairseq-generate", line 8, in <module>
    sys.exit(cli_main())
  File "/home/jovyan/.local/lib/python3.8/site-packages/fairseq_cli/generate.py", line 379, in cli_main
    main(args)
  File "/home/jovyan/.local/lib/python3.8/site-packages/fairseq_cli/generate.py", line 41, in main
    return _main(args, sys.stdout)
  File "/home/jovyan/.local/lib/python3.8/site-packages/fairseq_cli/generate.py", line 132, in _main
    itr = task.get_batch_iterator(
  File "/home/jovyan/.local/lib/python3.8/site-packages/fairseq/tasks/fairseq_task.py", line 227, in get_batch_iterator
    indices = self.filter_indices_by_size(
  File "/home/jovyan/.local/lib/python3.8/site-packages/fairseq/tasks/fairseq_task.py", line 137, in filter_indices_by_size
    raise Exception(
Exception: Size of sample #81 is invalid (=(1024, 1045)) since max_positions=(1024, 1024), skip this example with --skip-invalid-size-inputs-valid-test

I didn't understand what (1024, 1045) and (1024, 1024) mean. I'm using the default 510 for training and 9999 for testing, as below -

if [[ $SPLIT == 'test' ]]; then
    MAX_LEN=9999 # we do not truncate test sequences
else
    MAX_LEN=510
fi

Could you please suggest how to proceed further?

argument --user-dir: expected one argument

Hi,

I wanted to fine-tune PLBART for the code summarization task. While following the steps, I'm getting the error below for !bash run.sh 0 python -

usage: fairseq-train [--user-dir USER_DIR]
fairseq-train: error: argument --user-dir: expected one argument
usage: fairseq-generate [--user-dir USER_DIR]
fairseq-generate: error: argument --user-dir: expected one argument
Total: 0
Traceback (most recent call last):
  File "evaluator.py", line 205, in <module>
    res = bleuFromMaps(goldMap, predictionMap)
  File "evaluator.py", line 198, in bleuFromMaps
    return [round(s * 100.0 / num, 2) for s in score]
  File "evaluator.py", line 198, in <listcomp>
    return [round(s * 100.0 / num, 2) for s in score]
ZeroDivisionError: float division by zero

Any idea how to proceed further?

plbart-large window size

I believe there is something wrong with the window size (n_positions) of plbart-large. It prints the following:

>>> from transformers import PLBartTokenizer
>>> tokenizer = PLBartTokenizer.from_pretrained("uclanlp/plbart-large")
>>> tokenizer.model_max_length
1000000000000000019884624838656

I will be statically setting this to 1024.
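For reference, a minimal sketch of that manual override (the 1024 value comes from the issue above, not from the checkpoint config):

from transformers import PLBartTokenizer

tokenizer = PLBartTokenizer.from_pretrained("uclanlp/plbart-large")
# model_max_length falls back to a very large sentinel when the checkpoint does
# not ship the value, so set it by hand before relying on truncation.
tokenizer.model_max_length = 1024
inputs = tokenizer("def add ( a , b ) : return a + b", truncation=True)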

questions about dict.txt and data samples specify methods

  1. Which step generates the dict.txt? It seems to be generated during "fairseq-preprocess", but "fairseq-preprocess" also has a parameter "--srcdict $DICT_FILE".
  2. How does the model know where a data sample (a function, in this case) ends? It seems that you use "\n", but functions also contain "\n", so I am confused about this (see the sketch below).
  3. Also, if I run fairseq-train FILENAME_OF_FIRST_DATASAMPLE:FILENAME_OF_SECOND_DATASAMPLE:FILENAME_OF_THIRD_DATASAMPLE:...:FILENAME_OF_NTH_DATASAMPLE, will it work?

I am new to this. Thanks.
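A hedged sketch of the usual fairseq convention (an assumption, not taken verbatim from this repo's scripts): fairseq-preprocess treats each line of its input file as one example, so newlines inside a function are collapsed before binarization. The file names functions.raw and train.java below are hypothetical, with records in functions.raw assumed to be separated by blank lines.

def to_one_line(function_source: str) -> str:
    # Collapse all whitespace, including newlines, to single spaces.
    return " ".join(function_source.split())

with open("functions.raw") as fin, open("train.java", "w") as fout:
    buffer = []
    for line in fin:
        if line.strip():
            buffer.append(line)
        elif buffer:
            fout.write(to_one_line("".join(buffer)) + "\n")
            buffer = []
    if buffer:
        fout.write(to_one_line("".join(buffer)) + "\n")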

Processing only one language

Hi @wasiahmad,

I have following queries:

  1. Could you please guide me on the steps to pre-train the model for downstream tasks that require only one programming language (e.g., clone detection)? Does this library support this scenario, or do I need to make changes to accommodate it?

  2. Also, the number of GPUs is hardcoded to 8 and not taken as an input argument from the user. Will this work if we have only 1 or 2 GPUs on the system?

  3. What is the early-stopping criterion for pre-training, and which hyperparameter controls it?

Thank you,
Aman

The large bleu gap in validation and test dataset

I find that there is a big gap between the validation BLEU and the test BLEU. For example, on Java, the best validation BLEU is about 7, but the test BLEU is about 18+. Why? (The validation BLEU is reported by fairseq-train, and the test BLEU by the separate evaluation script.)

Would you mind sharing your dictionary?

A very good work!

I want to cite your work by fine-tuning some downstream tasks based on your pre-trained model.
Would you mind sharing your token dictionary? It would be very helpful for us. :)

Need help to understand unseen language token for translation task

Hi,
I am looking at the translation task. The C# language is not used for pre-training PLBART; however, we are able to fine-tune it for the unseen language token C#. I want to understand where the language token for C# is added in the data samples created for fine-tuning PLBART.

The translation.py file in the source directory has an init method which adds language tokens to the dictionary from the langs variable, which contains only java,python,en_XX as defined in PLBART/scripts/code_to_code/translation/run.sh.

Can you please help with this.

Question about size of pretrain data

Hi,
I followed the tutorial to download the pre-training data step by step. After I downloaded all the .gz files and unzipped them, I found there are only 123GB and 58.6GB of data for Java and Python respectively (1020 and 102 JSON files, respectively), which does not match the statistics in the paper (352GB and 224GB, respectively).
Have I missed something? (I saw there is a de-duplication step in that tutorial, which will reduce the data size; is that the reason?)

Can we add structural information in PLBART?

I know this is not the perfect question to add to this repo. I understand PLBART is based on a transformer architecture and deals with sequence-based text streams.

But I would be curious to know whether we can embed structural information like AST-level edges in PLBART.

Can other programming languages be used?

Hello author, I have two questions:

  1. Why do you use Java and Python?
    I see that https://github.com/facebookresearch/TransCoder provides a pre-processing pipeline for three languages (C++, Java, Python), so I just want to ask why you use only two of them (Java and Python).
  2. Can other programming languages be used?
    In this bigmodel work, they use C++, C#, Go, Java, JavaScript, Lua, PHP, Python, Ruby, Rust, Scala, and TypeScript as programming languages. However, different programming languages need very different preprocessing pipelines, and it is too difficult for me to write language-specific preprocessing code. So, could you recommend some resources that would be useful for this problem?
    Thanks!

Confused about the "max-sentences" in pretraining

Hi,

In the pretraining script, you set max-sentences to 32. Max-sentences is per GPU, so PER_GPU_TRAIN_BATCH_SIZE is 32. But max-tokens is 2048 and tokens-per-sample is 512, so PER_GPU_TRAIN_BATCH_SIZE would be 2048 / 512 = 4. Why do these two parameters conflict?

Thanks

Samples of predictions and answers are not equal, {len(preds)}: {len(gts)}

Hi,

I am trying to run the "run.sh" file, but I have the following error:

fairseq-generate: error: argument --task: invalid choice: 'translation_from_pretrained_bart' (choose from 'translation', 'translation_lev', 'language_modeling', 'sentence_prediction', 'multilingual_translation', 'translation_moe', 'semisupervised_translation', 'sentence_ranking', 'multilingual_masked_lm', 'cross_lingual_lm', 'audio_pretraining', 'translation_from_pretrained_xlm', 'denoising', 'masked_lm', 'legacy_masked_lm')
Traceback (most recent call last):
  File "/content/drive/MyDrive/Colab_Notebooks/PLBART-main/scripts/text_to_code/evaluator.py", line 46, in <module>
    main()
  File "/content/drive/MyDrive/Colab_Notebooks/PLBART-main/scripts/text_to_code/evaluator.py", line 29, in main
    assert len(preds) == len(gts), f"Samples of predictions and answers are not equal, {len(preds)}: {len(gts)}"
AssertionError: Samples of predictions and answers are not equal, 0: 2000
Traceback (most recent call last):
  File "/content/drive/MyDrive/Colab_Notebooks/PLBART-main/evaluation/CodeBLEU/calc_code_bleu.py", line 34, in <module>
    assert len(hypothesis) == len(pre_references[i])
AssertionError

Any idea please to fix this ?

about lang token position in the target

I find that in the multilingual denoising task of fairseq, the target's language token is at the end, which is unlike Table 3 in your paper, where the language token is at the start of the target. Am I wrong?

Support for FastTokenizer in huggingface

Hello, I found that there is no corresponding PLBartTokenizerFast in Hugging Face. Do you have a plan to implement a fast version of the tokenizer?

In fact, I need to call the word_ids() function of a fast tokenizer to get the list indicating which original word each tokenized token corresponds to.
word_ids = tokenized_inputs.word_ids(batch_index=i)

Or do you have any ways to calculate the original word index corresponding to each tokenized token?

Thank you very much!
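Until a fast tokenizer exists, one hedged workaround (an approximation, not equivalent to word_ids() from a fast tokenizer, since SentencePiece may split a word differently in isolation than in context) is to tokenize word by word and record the mapping yourself:

from transformers import PLBartTokenizer

tokenizer = PLBartTokenizer.from_pretrained("uclanlp/plbart-base", language_codes="base")

def encode_with_word_ids(text):
    # Tokenize each whitespace-separated word on its own and remember which
    # word every sub-token came from.
    ids, word_ids = [], []
    for w_idx, word in enumerate(text.split()):
        piece_ids = tokenizer.encode(word, add_special_tokens=False)
        ids.extend(piece_ids)
        word_ids.extend([w_idx] * len(piece_ids))
    return ids, word_ids

ids, word_ids = encode_with_word_ids("public void setName ( String name )")
print(word_ids)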

Dataset not found when execute `bash run.sh` on code-to-text task

Hi, I want to train a code-to-text model using the pretrained model. I downloaded the file code-to-text.zip; after I execute bash prepare.sh, the files in data/codeXglue/code-to-text/java are as listed below:

.
├── data-bin
│   ├── dict.en_XX.txt
│   ├── dict.java.txt
│   └── preprocess.log
├── test.jsonl
├── test.spm.en_XX
├── test.spm.java
├── train.jsonl
├── train.spm.en_XX
├── train.spm.java
├── valid.jsonl
├── valid.spm.en_XX
└── valid.spm.java

When I execute bash run.sh 0 java, I get FileNotFoundError: Dataset not found: valid (/home/.../PLBART/data/codeXglue/code-to-text/java/data-bin) and FileNotFoundError: Dataset not found: test (/home/.../PLBART/data/codeXglue/code-to-text/java/data-bin) errors. May I ask whether I forgot some steps? Which files are supposed to be in data/codeXglue/code-to-text/java? Sorry to bother you; I appreciate your help!

the parameter setting when increasing "tokens_per_sample"

The default setting for tokens_per_sample in PLBART is 512. When I increase it to 1024, an "index out of bounds" error is reported from the NVIDIA (CUDA) side, but when I set max-source-positions and max-target-positions to 2048 (the default is 1024), there is no such error. Although there is no error, I am not sure whether the setting is correct, and I am curious about the meaning of these two parameters. Also, there is another parameter, max-positions, which should be set the same as tokens-per-sample and seems to be related to the positional embeddings. What are the differences between these three positional parameters?

Also, is max-sentences the batch size across all GPUs? If max_tokens is 2048 and tokens_per_sample is 1024 and training runs on 8 GPUs, should max-sentences be 8 * (2048 / 1024) = 16?

Thanks

Question about the architecture in HuggingFace

I noticed that the model architecture from HuggingFace has a shared embedding layer (shared): Embedding(50005, 768, padding_idx=1) whereas Fairseq (used in this repo) does not.
Will the shared embedding affect the performance for code-to-text tasks?

Thanks!

Missing "java" token in Hugging Face Tokenizer

Hi,

I am trying to replicate the results of PLBART for the code refinement fine-tuning task using Hugging Face. When I tokenize methods that contain the "java" token and then decode them, the "java" token is strangely removed! Here is my code:

code = "public void METHOD_1 ( TYPE_1 VAR_1 ) throws java.lang.Exception { super . METHOD_1 ( VAR_1 ) ; METHOD_2 ( VAR_1 ) ; }"
tokenizer = model_tokenizer_class.from_pretrained("uclanlp/plbart-base", language_codes="base")
model_inputs = tokenizer([code])
print(tokenizer.decode(model_inputs['input_ids'][0], skip_special_tokens=True, clean_up_tokenization_spaces=False))
# The code output is: "public void METHOD_1 ( TYPE_1 VAR_1 ) throws .lang.Exception { super . METHOD_1 ( VAR_1 ) ; METHOD_2 ( VAR_1 ) ; }"

Also, is there any Hugging Face implementation of the code refinement task using PLBART? My implementation does not achieve the EM and BLEU reported for the test set. I ran the existing fairseq implementation and got EM: 17.67; however, my Hugging Face implementation gets EM: 5.62! What important factors should I check?
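For the disappearing token, one thing worth checking (a hedged guess, not a confirmed explanation) is whether the language codes are registered as special tokens; if they are, skip_special_tokens=True will also strip a literal "java" token during decoding:

from transformers import PLBartTokenizer

tokenizer = PLBartTokenizer.from_pretrained("uclanlp/plbart-base", language_codes="base")
print(tokenizer.additional_special_tokens)   # check whether "java" appears here

code = "throws java.lang.Exception"
ids = tokenizer(code)["input_ids"]
# Decoding without stripping special tokens shows whether "java" survived tokenization.
print(tokenizer.decode(ids, skip_special_tokens=False))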

OOM: Ran out of memory with exception || Every time I restart pre-training from last checkpoint

Hi,

I am training on a system that has a time limit of 10 hours. Every time I restart pre-training from the last checkpoint, I get an OOM error, even though the previous run with the same configuration was running properly.

2022-07-23 15:15:37 | WARNING | fairseq.trainer | OOM: Ran out of memory with exception: CUDA out of memory. Tried to allocate 2.29 GiB (GPU 2; 31.75 GiB total capacity; 28.47 GiB already allocated; 1.09 GiB free; 29.63 GiB reserved in total by PyTorch)

As a hack, I reduce MAX_TOKENS by 512 every time and then it works. But now I've reached a point where I cannot reduce MAX_TOKENS any further, yet I still need to keep training my model.

Also, I've noticed that just one GPU goes OOM. From what I've read online, the cause may be that Distributed Data Parallel loads all the data onto one GPU and then distributes the load to the rest of the GPUs, but I'm not sure how to deal with it.

Resources:

Total GPUs=8; Tesla V100-SXM2-32GB; total memory = 31.749 GB each;

My pretrain.sh is as follows:

MAX_UPDATE=100000
WARMUP_UPDATES=2000
MAX_SENTENCES=64
MAX_TOKENS=2048
TOKENS_PER_SAMPLE=512
UPDATE_FREQ=60

export CUDA_VISIBLE_DEVICES=$1
fairseq-train $DATA_DIR \
    --add-lang-token \
    --langs $langs \
    --dataset-impl 'mmap' \
    --bpe 'sentencepiece' \
    --sentencepiece-model $SPM_MODEL \
    --arch mbart_base \
    --tokens-per-sample $TOKENS_PER_SAMPLE \
    --max-tokens $MAX_TOKENS \
    --max-sentences $MAX_SENTENCES \
    --update-freq $UPDATE_FREQ \
    --layernorm-embedding \
    --multilang-sampling-alpha 0.3 \
    --train-subset train \
    --valid-subset valid \
    --required-batch-size-multiple 8 \
    --insert 0 \
    --permute-sentences 0 \
    --poisson-lambda 3.5 \
    --mask 0.3 \
    --mask-length 'span-poisson' \
    --replace-length 1 \
    --rotate 0 \
    --mask-random 0.1 \
    --task multilingual_denoising \
    --criterion cross_entropy \
    --dropout 0.1 \
    --attention-dropout 0.1 \
    --relu-dropout 0.0 \
    --weight-decay 0.01 \
    --optimizer adam \
    --adam-eps 1e-06 \
    --clip-norm 0.1 \
    --lr 3e-4 \
    --lr-scheduler polynomial_decay \
    --warmup-updates $WARMUP_UPDATES \
    --total-num-update $MAX_UPDATE \
    --max-update $MAX_UPDATE \
    --fp16 \
    --ddp-backend no_c10d \
    --no-epoch-checkpoints \
    --save-interval-updates 1000 \
    --keep-interval-updates 10 \
    --save-dir $SAVE_DIR \
    --skip-invalid-size-inputs-valid-test \
    --log-format json \
    --log-interval 10 \
    --num-workers 40 \
    --seed 1234 \
    --keep-last-epochs 20 \
    --patience 24 \
    --restore-file $SAVE_DIR/checkpoint_last.pt \
    --tensorboard-logdir $TENSORBOARD_LOGDIR \
    2>&1 | tee $SAVE_DIR/output.log

Please suggest how to deal with it.

How to make inference

I see the repo only covers fine-tuning. Can I ask how to run inference using a trained model?
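A minimal inference sketch with the Hugging Face checkpoints (the checkpoint name, target language, and generation settings below are illustrative assumptions, not the repo's official recipe; for fairseq checkpoints, the task run.sh scripts already call fairseq-generate):

from transformers import PLBartForConditionalGeneration, PLBartTokenizer

tokenizer = PLBartTokenizer.from_pretrained("uclanlp/plbart-base", language_codes="base")
model = PLBartForConditionalGeneration.from_pretrained("uclanlp/plbart-base")

code = "def add ( a , b ) : return a + b"
inputs = tokenizer(code, return_tensors="pt")
# For a fine-tuned seq2seq checkpoint, decoding starts from the target language token.
outputs = model.generate(
    **inputs,
    decoder_start_token_id=tokenizer.lang_code_to_id["en_XX"],
    max_length=64,
    num_beams=5,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))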

Script for training PLBART

Hi,

Thanks for the great contribution to code generation. We would like to explore your PLBART model. Could you share with us the script for training PLBART for code generation?

Thank you so much

About the batch size of pre-training

You said you use a batch size of 2048 in pre-training, but I see that in pretrain/pretrain.sh (or absolute.sh in the old version), the effective batch size is max-sentences * update-freq * num_of_gpus = 32 * 60 * 8 = 15360?

Problems with Hugging Face

Hi,

I noticed that you uploaded the model to hugging face transformers library which is super exciting!
However, when I tried to use it through transformers with the provided guidance, I got the following error:

>>> model = AutoModelForSeq2SeqLM.from_pretrained("uclanlp/plbart-base")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/anaconda3/envs/mdt/lib/python3.7/site-packages/transformers/models/auto/auto_factory.py", line 397, in from_pretrained
    pretrained_model_name_or_path, return_unused_kwargs=True, **kwargs
  File "/home/anaconda3/envs/mdt/lib/python3.7/site-packages/transformers/models/auto/configuration_auto.py", line 529, in from_pretrained
    config_class = CONFIG_MAPPING[config_dict["model_type"]]
  File "/home/anaconda3/envs/mdt/lib/python3.7/site-packages/transformers/models/auto/configuration_auto.py", line 278, in __getitem__
    raise KeyError(key)
KeyError: 'plbart'
This is the command I used:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("uclanlp/plbart-base")
model = AutoModelForSeq2SeqLM.from_pretrained("uclanlp/plbart-base")

Here is my configuration:
python: 3.7.11
pytorch: 1.9.1
transformers: 4.11.3

Thanks.
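The KeyError: 'plbart' typically means the installed transformers release predates PLBart support; a small version guard before loading (the exact minimum version here is an assumption):

import transformers
from packaging import version

# PLBart classes were added in a transformers release newer than 4.11.3 (assumed ~4.17+).
if version.parse(transformers.__version__) < version.parse("4.17.0"):
    raise RuntimeError("Upgrade transformers (pip install -U transformers) to get PLBart support")

from transformers import AutoModelForSeq2SeqLM
model = AutoModelForSeq2SeqLM.from_pretrained("uclanlp/plbart-base")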

unzip unsuccessful when running download.sh

I tried to fine-tune on the code refinement task from the PLBART paper. I set up the conda environment with bash install_env.sh, then downloaded the checkpoints. However, when I ran bash download.sh under data/codeXglue, I got the output below, which suggests that either the unzip or the download was unsuccessful.

(screenshot attached)

Am I missing some steps in the setup?

AssertionError while evaluating 'translation'

Hi @wasiahmad ,

I am trying out the 'translation' capabilities of PLBART and started fine-tuning as described. But I'm getting the error below during evaluation -

File "calc_code_bleu.py", line 34, in <module>
    assert len(hypothesis) == len(pre_references[i])
AssertionError

Here is a more detailed traceback -

2021-07-15 13:57:58 | INFO | fairseq_cli.train | early stop since valid performance hasn't improved for last 10 runs
2021-07-15 13:57:58 | INFO | fairseq_cli.train | begin save checkpoint
2021-07-15 13:58:19 | INFO | fairseq.checkpoint_utils | saved checkpoint /content/PLBART/scripts/code_to_code/translation/java_cs/checkpoint_last.pt (epoch 22 @ 14168 updates, score 80.08) (writing took 20.417050701000335 seconds)
2021-07-15 13:58:19 | INFO | fairseq_cli.train | end of epoch 22 (average epoch stats below)
2021-07-15 13:58:19 | INFO | train | {"epoch": 22, "train_loss": "2.08", "train_nll_loss": "0.177", "train_ppl": "1.13", "train_wps": "1562.9", "train_ups": "1.76", "train_wpb": "890.1", "train_bsz": "16", "train_num_updates": "14168", "train_lr": "4.93409e-05", "train_gnorm": "0.534", "train_train_wall": "255", "train_wall": "8414"}
2021-07-15 13:58:19 | INFO | fairseq_cli.train | done training in 8412.8 seconds
  0% 0/250 [00:00<?, ?it/s]/usr/local/lib/python3.7/dist-packages/fairseq_cli/generate.py:172: UserWarning: --sacrebleu is deprecated. Please use --scoring sacrebleu instead.
  scorer = scoring.build_scorer(args, tgt_dict)
Traceback (most recent call last):
  File "/usr/local/bin/fairseq-generate", line 8, in <module>
    sys.exit(cli_main())
  File "/usr/local/lib/python3.7/dist-packages/fairseq_cli/generate.py", line 379, in cli_main
    main(args)
  File "/usr/local/lib/python3.7/dist-packages/fairseq_cli/generate.py", line 41, in main
    return _main(args, sys.stdout)
  File "/usr/local/lib/python3.7/dist-packages/fairseq_cli/generate.py", line 172, in _main
    scorer = scoring.build_scorer(args, tgt_dict)
  File "/usr/local/lib/python3.7/dist-packages/fairseq/scoring/__init__.py", line 54, in build_scorer
    return _build_scorer(args)
  File "/usr/local/lib/python3.7/dist-packages/fairseq/registry.py", line 54, in build_x
    return builder(args, *extra_args, **extra_kwargs)
  File "/usr/local/lib/python3.7/dist-packages/fairseq/scoring/bleu.py", line 40, in __init__
    character_tokenization=self.args.sacrebleu_char_level,
AttributeError: 'Namespace' object has no attribute 'sacrebleu_char_level'
Traceback (most recent call last):
  File "/content/PLBART/evaluation/evaluator.py", line 36, in <module>
    main()
  File "/content/PLBART/evaluation/evaluator.py", line 20, in main
    assert len(refs) == len(pres)
AssertionError
Traceback (most recent call last):
  File "calc_code_bleu.py", line 34, in <module>
    assert len(hypothesis) == len(pre_references[i])
AssertionError

Could you please suggest how to proceed further?

AttributeError: module 'sentencepiece' has no attribute 'SentencePieceProcessor'

Firstly, I would like to thank you for your release and documentation. I am fine-tuning the text-to-code model, and when I run the "scripts/text_to_code/prepare.sh" file, I get the following error in "scripts/text_to_code/encode.py", line 30:

AttributeError: module 'sentencepiece' has no attribute 'SentencePieceProcessor'

Any ideas, please?
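One quick diagnostic (a hedged guess at a common cause, not a confirmed fix) is to check that the import resolves to the real sentencepiece package rather than a local file or an unrelated pip package shadowing it:

import sentencepiece

print(sentencepiece.__file__)     # should point into site-packages
print(sentencepiece.__version__)
print(hasattr(sentencepiece, "SentencePieceProcessor"))   # expected: True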

Can PLBART work on single statement?

Hi, Thanks for sharing the items.

I have a simple question.
In the dataset, most of the input data is a single method declaration.

But if the input data is a single statement, does PLBART still work well?

For example, in code-to-code tasks (translating Java code to Python code),
if we input System.out.println("hello world"); could PLBART translate it to print("hello world")?

I'm asking because I want to use the PLBART encoder as a statement embedding model.

Thanks!

How to calculate loss in PLBART?

Hi! I'm relatively new to this. I am trying to fine-tune PLBART on an SQL-natural language dataset for code synthesis task. The downside is I am using Google Colab to do this. I am downloading the pretrained PLBART model from huggingface using AutoModelForSeq2SeqLM.from_pretrained("uclanlp/plbart-base"). I am tokenizing the codes and the natural language using the tokenizer provided in the documentation for huggingface and providing the code tokens as input to the model. However, the output of the model has two outputs: logits and a large tuple which I think are the hidden state values. I feel like I should use the logits to calculate loss against the natural language tokens, but the logits are in decimals and some are negative, while the natural language tokens are probably indices that correspond to some internal vocabulary. Can you advise what to do? I apologize if this sounds like a very basic query.
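With Hugging Face seq2seq models, the usual pattern is to pass the target token ids as labels; the model shifts them internally to build the decoder inputs and returns the cross-entropy loss alongside the logits, so you rarely need to compute the loss by hand. A minimal sketch (the SQL/NL strings are placeholders, and the text_target argument assumes a reasonably recent transformers version):

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("uclanlp/plbart-base")
model = AutoModelForSeq2SeqLM.from_pretrained("uclanlp/plbart-base")

sql = ["SELECT name FROM users WHERE id = 1"]
nl = ["get the name of the user with id 1"]

inputs = tokenizer(sql, return_tensors="pt", padding=True, truncation=True)
labels = tokenizer(text_target=nl, return_tensors="pt", padding=True, truncation=True).input_ids
# Note: for real batches, pad token ids in labels are usually replaced with -100
# so the loss ignores padding positions.

outputs = model(**inputs, labels=labels)   # loss is computed internally
outputs.loss.backward()
print(outputs.loss.item(), outputs.logits.shape)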

How to use encoder and decoder respectively?

Hi! I'm trying to fine-tune PLBART. PLBART is an encoder-decoder model, but I want to add something different on top of the encoder, so I'd like to know how to use the encoder and decoder separately. Thanks!
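For the Hugging Face version of the model, the seq2seq wrapper exposes the two halves directly, so you can run the encoder on its own and attach something custom on top of its hidden states. A minimal sketch (not a full fine-tuning setup):

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("uclanlp/plbart-base")
model = AutoModelForSeq2SeqLM.from_pretrained("uclanlp/plbart-base")

encoder = model.get_encoder()   # encoder module
decoder = model.get_decoder()   # decoder module

inputs = tokenizer("def add ( a , b ) : return a + b", return_tensors="pt")
enc_out = encoder(**inputs)
print(enc_out.last_hidden_state.shape)   # (batch, seq_len, hidden_size)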

About performance on code summarization

Hi, I read your work and I think it's very nice. Recently I ran the code summarization experiment following your readme, as follows:
bash run.sh 1 [pl_lang]
and I got the following performance table:

                    Ruby    Javascript  Go      Python  Java    PHP
PLBART from paper   14.11   15.56       18.91   19.3    18.45   23.58
PLBART my run       14.36   15.28       18.08   19.92   18.48   23.8

I notice that the result of PLBART on some PLs like "Go" seems lower than the result reported in the paper.
Does every programming language have its own hyper-parameters for this task?

Vocab size issue

I found that the vocab size of the embedding layer is 50,004. However, the vocab size of the subword (BPE) tokenizer is 50,044, which causes an out-of-vocabulary problem.

I got the vocab size of the tokenizer using this code: vocab_size = len(bart.task.source_dictionary)

I faced this problem when I used the bart.encode() function.

Here is how I load the pre-trained bart model in python

bart = BARTModel.from_pretrained(model_path, checkpoint_file=model_file)

where model_path contains the pre-trained PLBART, and model_file is plbart_base.pt

I am not sure if I am doing something wrong here. Can anyone help me?

Thanks.

questions about some implementations of PLBART

  1. In the paper, it says "We mask 35% of the tokens in each instance," but it is "--mask 0.3" in https://github.com/wasiahmad/PLBART/blob/main/pretrain/pretrain.sh. Is there something I am misunderstanding?
  2. Can the pre-processing method be used for other languages, such as C?
    I used your method to train sentencepiece on a C corpus, and it gives me the error:

Vocabulary size is smaller than required_chars. 50000 vs 693453. Increase vocab_size or decrease character_coverage with --character_coverage option. 

Also, here are some related logs:

trainer_interface.cc(466) LOG(INFO) all chars count=5475384381
trainer_interface.cc(477) LOG(INFO) Done: 100% characters are covered.
trainer_interface.cc(487) LOG(INFO) Alphabet size=693450
trainer_interface.cc(488) LOG(INFO) Final character coverage=1

  3. For the parameters of fairseq-train, besides reading the source code, are there any other documents that describe these options? I found https://fairseq.readthedocs.io/en/latest/command_line_tools.html, but it is incomplete; many of your parameters are not shown in that doc.
  4. Also, do you know whether there are any provided scripts that use fairseq to pre-train other BERT-related models?
    Thanks

Mismatch when loading the checkpoints

Hi, thanks for your great work!

When I tried to load the pre-trained checkpoints and fine-tune, I came across a size mismatch problem. It seems that the dict.txt you provided does not match the checkpoints.

Here is the error message:

size mismatch for encoder.embed_tokens.weight: copying a param with shape torch.Size([50005, 768]) from checkpoint, the shape in current model is torch.Size([50001, 768]).
size mismatch for decoder.embed_tokens.weight: copying a param with shape torch.Size([50005, 768]) from checkpoint, the shape in current model is torch.Size([50001, 768]).
size mismatch for decoder.output_projection.weight: copying a param with shape torch.Size([50005, 768]) from checkpoint, the shape in current model is torch.Size([50001, 768]).

This is the script I used to get the checkpoints:
https://github.com/wasiahmad/PLBART/blob/main/pretrain/download.sh

This is the dict.txt I used:
https://github.com/wasiahmad/PLBART/blob/main/sentencepiece/dict.txt

Here is the command I used to fine-tune:
fairseq-train $PATH_2_DATA \
    --user-dir $USER_DIR --truncate-source \
    --arch mbart_base --layernorm-embedding \
    --task translation \
    --source-lang $SOURCE --target-lang $TARGET \
    --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
    --batch-size $BATCH_SIZE --update-freq $UPDATE_FREQ --max-epoch 30 \
    --optimizer adam --adam-eps 1e-06 --adam-betas '(0.9, 0.98)' \
    --lr-scheduler polynomial_decay --lr 5e-05 --min-lr -1 \
    --warmup-updates 500 --max-update 100000 \
    --dropout 0.1 --attention-dropout 0.1 --weight-decay 0.0 \
    --seed 1234 --log-format json --log-interval 100 \
    ${restore} \
    --eval-bleu --eval-bleu-detok space --eval-tokenized-bleu \
    --eval-bleu-remove-bpe sentencepiece --eval-bleu-args '{"beam": 5}' \
    --best-checkpoint-metric bleu --maximize-best-checkpoint-metric \
    --no-epoch-checkpoints --patience 5 \
    --ddp-backend no_c10d --save-dir $SAVE_DIR 2>&1 | tee ${OUTPUT_FILE};

About 'cbart' in pretrain/binarize.sh

Hello, I'm trying to run the pre-training experiment of PLBART, but I find that there is a 'cbart' directory referenced in binarize.sh (cbart/sentencepiece.bpe.model). I wonder whether the 'cbart' in the bash file should be replaced by '$SPM_DIR'.

pre-trained PLBART model checkpoint

Hi, thanks for the contribution, one more pre-trained LM with code corpora!
Is there any new progress on the release of the PLBART checkpoint? Can't wait to try it!

ImportError: cannot import name '_bleu' from 'bleu'

HI,

I have a problem when I run the "/scripts/text_to_code/run.sh" file. Here is the error:

File "/content/drive/MyDrive/Colab_Notebooks/PLBART-main/scripts/text_to_code/evaluator.py", line 5, in <module>
    from bleu import _bleu
ImportError: cannot import name '_bleu' from 'bleu' (/usr/local/lib/python3.7/dist-packages/bleu/__init__.py)
Traceback (most recent call last):
  File "/content/drive/MyDrive/Colab_Notebooks/PLBART-main/evaluation/CodeBLEU/calc_code_bleu.py", line 34, in <module>
    assert len(hypothesis) == len(pre_references[i])
AssertionError

If it's still a problem with the packages, can you send me your project's requirements.txt file, please?

Multilingual `prepare.sh` throws an error after downloading

While running prepare.sh, the following errors are thrown for all the languages in the multilingual directory:

FileNotFoundError: [Errno 2] No such file or directory: '/home/crocoder/Desktop/transformers/PLBART/multilingual/data/processed/valid.php-en_XX.php'
Traceback (most recent call last):
  File "encode.py", line 92, in <module>
    main()
  File "encode.py", line 88, in main
    process(args)
  File "encode.py", line 49, in process
    with open(args.input_source, 'r', encoding='utf-8') as f1, \
FileNotFoundError: [Errno 2] No such file or directory: '/home/crocoder/Desktop/transformers/PLBART/multilingual/data/processed/test.php-en_XX.php'

Any help on this?

Why new nl_eval.py always outputs the "0" bleu score?

I ran the experiment on code_to_text.
I used the old evaluation script "scripts/code_to_text/evaluator.py" and the new evaluation script "nl_eval.py" which you just released, and found that nl_eval.py outputs a 0 BLEU score while the old one outputs a normal BLEU score.

Preprocessing

Thanks for the great contribution to the code translation and pre-training area. After reading the related papers (TransCoder and DOBF), we have some clarification questions on the pre-processing procedure for the pre-training corpus built from the GitHub dataset on Google BigQuery. We'd be grateful if you could guide us on these.

  • Length of function: are you using any filtering to remove functions above a certain length? If so, what is the threshold length? (token level or any other criteria?)
  • Code comment extractor: can you share which libraries or toolkits you used for this, or did you implement your own comment extractor?
  • Do you have any suggestions on extracting bimodal signals (code & description) from the GitHub dataset? It seems that classes and class methods are more likely to have an NL part (as docstrings).
  • Regarding SQL queries that need to be run on the Google bigquery platform (similar to Transcoder’s approach here):
    WHERE
    NOT c.binary
    AND f.path like '%.py'

Should we change this to the condition below to filter out non-function data?

WHERE
NOT c.binary
AND file.path LIKE '%.py'
AND name = 'Python'

Thank you so much

The experiments of task classification

Hi,

Thanks for sharing these interesting works.

When I loaded the best checkpoint to reproduce the results on the defect detection dataset, I got 61.75 ACC. The result.txt file in the Google Drive also reports 61.78 ACC, while in the paper the ACC is 63.18.
Is this the best checkpoint?

Thank you so much
