Comments (9)
At least for NTXentLoss, setting it to a large negative value (instead of float('-inf')) would be fine, because the purpose is to make particular entries 0 when passed to torch.exp. I'll have to check if it makes sense for the other places where I use float
from declutr.
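The masking idea in the comment above can be illustrated with a small pure-Python sketch. It is not the PyTorch Metric Learning code: `masked_exp` and `MASK_VALUE` are hypothetical names, and `math.exp` stands in for `torch.exp`. It only shows that a sufficiently large negative finite value produces the same zeros as `float('-inf')` once exponentiated.

```python
import math

# Hypothetical fill value: finite, and well inside the fp16 range
# (the largest finite fp16 magnitude is 65504).
MASK_VALUE = -1e4


def masked_exp(scores, mask, fill):
    """Exponentiate each score; positions where mask is False use `fill` instead."""
    return [math.exp(s if keep else fill) for s, keep in zip(scores, mask)]


scores = [0.5, 2.0, -1.0]
mask = [True, False, True]  # the middle entry should contribute nothing

with_inf = masked_exp(scores, mask, float("-inf"))
with_finite = masked_exp(scores, mask, MASK_VALUE)

# Both fills zero out the masked entry and leave the others untouched.
print(with_inf == with_finite)
```

Whether the same substitution is safe elsewhere depends on how the filled value is used afterwards; it works here only because the value is passed straight to an exponential.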
Thanks @subercui, this error arises from the PyTorch Metric Learning library. I opened an issue on Apex here but got no response :( Maybe you can open an issue on PyTorch Metric Learning?
Thanks! I'll have a look
I found a manual solution that works. Install PyTorch Metric Learning from source and change:
torch.max(neg_pairs, dim=1, keepdim=True)[0])
to
torch.max(neg_pairs, dim=1, keepdim=True)[0].half())
in NTXentLoss. Still, I think it makes sense to raise this issue on the PyTorch Metric Learning GitHub.
I think this happens because I create infinity values using Python's float('inf'). I could have an optional half_precision flag for all loss functions, and if it's True, then cast all numbers made with float() to PyTorch's half().
Ah, I think you are right. There's a discussion on this HF Transformers PR where they end up writing an assert for a similar scenario:
masked_bias = self.masked_bias.to(w.dtype)
assert masked_bias.item() != -float("inf"), "Make sure `self.masked_bias` is not `-inf` in fp16 mode"
w = torch.where(mask, w, masked_bias)
What about replacing float('inf') with a very large value instead (see here)? That way, amp can handle it automatically and there's no need for the user to specify half_precision (update: upon closer inspection of that issue, I am not sure if this will actually work).
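One thing to keep in mind when picking a "very large value" is the fp16 range itself. The sketch below is a pure-Python emulation, not PyTorch: it round-trips a float through the IEEE 754 binary16 format using `struct`'s `'e'` code, and the helper `to_fp16` is a hypothetical name that maps overflow to infinity, which is what a cast to half precision is assumed to do here. It shows that a constant chosen for fp32 can turn back into infinity in fp16, while a value within the fp16 range stays finite.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 binary16 value


def to_fp16(x):
    """Round-trip a Python float through binary16; out-of-range values become +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses to pack finite values outside the binary16 range;
        # emulate the overflow-to-infinity behaviour of a cast instead.
        return math.copysign(math.inf, x)


too_large = to_fp16(-1e9)     # fine in fp32, but overflows in fp16
in_range = to_fp16(-FP16_MAX)  # still finite in fp16

print(math.isinf(too_large), math.isinf(in_range))
```

In PyTorch, the analogous dtype-aware bound is available as `torch.finfo(dtype).min`, which avoids hard-coding a constant for one precision.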
Awesome, thanks for weighing in!
v0.9.90.dev0 supports half precision
pip install pytorch-metric-learning==0.9.90.dev0
@KevinMusgrave Awesome! Thanks a lot.