Comments (5)

amin-nejad commented on September 12, 2024

I had the same issue of no metrics being printed at all. It seems to be because the logger's default level only prints warning messages, not info messages. If you are using the root logger (e.g. logger = logging.getLogger()), run this line before you create the object:

logging.basicConfig(level=logging.NOTSET)

If you are defining a custom logger yourself (e.g. logger = logging.getLogger("my-logger")), before you create the object, run this line:

logging.root.setLevel(logging.NOTSET)

Now the training process will print out the loss as well as any other metrics you passed to the learner object.
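As a rough sketch of the logging setup described above (not fast-bert's own code): the helper name enable_info_logging and the logger name "my-logger" are hypothetical, and it uses logging.INFO rather than NOTSET since INFO is enough to show info messages. One thing worth knowing is that logging.basicConfig() does nothing if the root logger already has handlers attached, which is often the case in notebooks; force=True (Python 3.8+) replaces them.

```python
import logging


def enable_info_logging(name="my-logger"):
    """Make INFO-level records visible and return a named logger.

    `name` is a hypothetical logger name used only for illustration.
    """
    # basicConfig is a no-op when the root logger already has handlers
    # (common in notebooks); force=True (Python 3.8+) replaces them.
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        force=True,
    )
    return logging.getLogger(name)


logger = enable_info_logging()
logger.info("INFO records are now emitted")
```

The returned logger has no level of its own, so it inherits INFO from the root logger and can then be passed to the learner via the logger argument.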

Alternatively, you can always view the training process, either during or after training, using tensorboard. The training process creates a folder called tensorboard containing all the event files.

from fast-bert.

kaushaltrivedi commented on September 12, 2024

Please try again with the latest version of the library.

008karan commented on September 12, 2024

I have tried the latest version and still have the same issue. I tried appending to the metric list, but then I was getting an error.

keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
 60.00% [6/10 1:05:32<43:41]
 47.85% [100/209 05:03<05:31]
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 32768.0
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 16384.0
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 8192.0
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 4096.0
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]
 100.00% [24/24 00:20<00:00]

008karan commented on September 12, 2024

@amin-nejad Thanks for the suggestion. I tried your method but still have the same issue. I added your line before creating the logger object.

import logging

import torch
from fast_bert.learner_cls import BertLearner
from fast_bert.metrics import accuracy, accuracy_multilabel, accuracy_thresh, fbeta, roc_auc

logging.basicConfig(level=logging.NOTSET)
logger = logging.getLogger()
device_cuda = torch.device("cuda")

# metrics = [{'name': 'accuracy', 'function': accuracy}]
metrics = []
metrics.append({'name': 'accuracy_thresh', 'function': accuracy_thresh})
metrics.append({'name': 'roc_auc', 'function': roc_auc})
metrics.append({'name': 'fbeta', 'function': fbeta})
metrics.append({'name': 'accuracy_single', 'function': accuracy_multilabel})

# databunch and MODEL_PATH are defined earlier (not shown in this comment)
learner = BertLearner.from_pretrained_model(
    databunch,
    pretrained_path='bert-base-uncased',
    metrics=metrics,
    device=device_cuda,
    logger=logger,
    output_dir=MODEL_PATH,
    finetuned_wgts_path=None,
    warmup_steps=50,
    multi_gpu=True,
    is_fp16=True,
    multi_label=True,
    logging_steps=50)

I don't know what's going wrong.
Also, can you tell me how to use tensorboard here? I tried adding it in the model.fit() method but got an error. I think I am missing something. Can you help?

amin-nejad commented on September 12, 2024

Not sure what the problem is, then. Changing the logging as I suggested above worked for me.

Re tensorboard, you don't really need to do anything. Once you begin training by calling learner.fit(), in the directory you have specified as your output_dir (MODEL_PATH in your case), you should see a subdirectory called tensorboard. While the model is training, it will constantly update a file in there whose name will be something like events.out.tfevents.1565791914.vm1.

In your terminal, change directory to the tensorboard directory and then run tensorboard --logdir=. (the trailing dot means the current directory). Ensure you have tensorflow installed in the environment you are using; tensorboard should be installed automatically along with it. If you are using a virtual machine, you also need to ensure you have opened port 6006.
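To check that training is actually writing event files before launching tensorboard, something like the following sketch may help. It assumes only what is described above: a tensorboard subdirectory inside output_dir containing files named like events.out.tfevents.*. The function name find_event_files and the "output" path are hypothetical placeholders.

```python
from pathlib import Path


def find_event_files(output_dir):
    """Return event files under <output_dir>/tensorboard, oldest first."""
    tb_dir = Path(output_dir) / "tensorboard"
    if not tb_dir.is_dir():
        return []
    # Search recursively in case events are written into nested run folders.
    files = [p for p in tb_dir.rglob("events.out.tfevents.*") if p.is_file()]
    return sorted(files, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    model_path = "output"  # hypothetical; use your own output_dir (MODEL_PATH)
    event_files = find_event_files(model_path)
    if event_files:
        for path in event_files:
            print(path)
        print(f"Run: tensorboard --logdir={Path(model_path) / 'tensorboard'}")
    else:
        print("No event files found yet; has learner.fit() started?")
```

If the list is empty after training has begun, the output_dir passed to the learner is probably not the directory being inspected.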

You can find more info here
