Hi, in your readme it says that the --summarize flag needs to be spe

Error when using --summarize with matcher.py about ditto HOT 1 OPEN

megagonlabs commented on July 20, 2024

Error when using --summarize with matcher.py

from ditto.

Comments (1)

Ribo-Py commented on July 20, 2024

I have encountered similar issue.

`Downloading: 100%|███████████████████████████| 28.0/28.0 [00:00<00:00, 21.7kB/s]
Downloading: 100%|██████████████████████████████| 483/483 [00:00<00:00, 401kB/s]
Downloading: 100%|███████████████████████████| 226k/226k [00:00<00:00, 38.4MB/s]
Downloading: 100%|███████████████████████████| 455k/455k [00:00<00:00, 40.8MB/s]
Downloading: 100%|███████████████████████████| 256M/256M [00:05<00:00, 49.4MB/s]
Some weights of the model checkpoint at distilbert-base-uncased were not used when initializing DistilBertModel: ['vocab_layer_norm.weight', 'vocab_transform.bias', 'vocab_projector.bias', 'vocab_layer_norm.bias', 'vocab_projector.weight', 'vocab_transform.weight']

This IS expected if you are initializing DistilBertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
This IS NOT expected if you are initializing DistilBertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Selected optimization level O2: FP16 training with FP32 batchnorm and FP32 master weights.

Defaults for this optimization level are:
enabled : True
opt_level : O2
cast_model_type : torch.float16
patch_torch_functions : False
keep_batchnorm_fp32 : True
master_weights : True
loss_scale : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled : True
opt_level : O2
cast_model_type : torch.float16
patch_torch_functions : False
keep_batchnorm_fp32 : True
master_weights : True
loss_scale : dynamic
Warning: multi_tensor_applier fused unscale kernel is unavailable, possibly because apex was installed without --cuda_ext --cpp_ext. Using Python fallback. Original ImportError was: ModuleNotFoundError("No module named 'amp_C'",)
/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/apex/amp/_initialize.py:25: UserWarning: An input tensor was not cuda.
warnings.warn("An input tensor was not cuda.")
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 32768.0
/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/optim/lr_scheduler.py:134: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
"https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)
step: 0, loss: 0.7943093180656433
Traceback (most recent call last):
File "train_ditto.py", line 92, in
run_tag, hp)
File "/home/ec2-user/SageMaker/vendor_matching/ditto/ditto_light/ditto.py", line 201, in train
train_step(train_iter, model, optimizer, scheduler, hp)
File "/home/ec2-user/SageMaker/vendor_matching/ditto/ditto_light/ditto.py", line 123, in train_step
for i, batch in enumerate(train_iter):
File "/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 521, in next
data = self._next_data()
File "/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 561, in _next_data
data = self._dataset_fetcher.fetch(index) # may raise StopIteration
File "/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 49, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/ec2-user/SageMaker/vendor_matching/ditto/ditto_light/dataset.py", line 80, in getitem
left, right = combined.split(' [SEP] ')
ValueError: not enough values to unpack (expected 2, got 1)
Traceback (most recent call last):
File "matcher.py", line 7, in
import jsonlines
ModuleNotFoundError: No module named 'jsonlines'
`

from ditto.

Error when using --summarize with matcher.py about ditto HOT 1 OPEN

Comments (1)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent