Comments (4)
Hello!
You could modify this part.
I think the issue is the following one:
CombinedTM takes as input both the contextualized representations and the vocab.
The contextualized representations are expanded to match the dimension of the vocab.
You could add a Linear layer that reduces the size of the vocab before the concatenation.
I believe this should improve performance.
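To get a rough feel for why the vocab size matters here, this is a back-of-the-envelope parameter count. It is only a sketch: the embedding size of 768, the hidden layer of 100 units, and the reduced size of 400 are assumed values for illustration, not numbers read from the library.

```python
def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


# Assumed sizes, for illustration only.
bert_size = 768      # contextualized embedding dimension (assumption)
vocab_size = 52_000  # vocabulary size discussed in this thread
hidden_size = 100    # first hidden layer after concatenation (assumption)
reduced_size = 400   # reduced dimension for both inputs (assumption)

# Embeddings expanded to the vocab size, then concatenated with the vocab input.
expanded = (
    linear_params(bert_size, vocab_size)
    + linear_params(2 * vocab_size, hidden_size)
)

# Both inputs reduced to a small dimension before concatenation.
reduced = (
    linear_params(bert_size, reduced_size)
    + linear_params(vocab_size, reduced_size)
    + linear_params(2 * reduced_size, hidden_size)
)

print(f"expanded: {expanded:,} parameters")
print(f"reduced:  {reduced:,} parameters")
```

Under these assumptions the reduced variant needs well under half the parameters in the input layers. The decoder that reconstructs the vocab is not counted and stays the same size either way.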
from contextualized-topic-models.
Hi,
I tried to add it like this, but it still doesn't work.
Is this what you meant by adding a linear layer?
35000 is my vocab size. My desired size is 52k.
Thanks in advance.
Hello!
I'd probably do something like this:
self.adapt_bert = nn.Linear(bert_size, 400)
self.adapt_vocab = nn.Linear(input_size, 400)
This way, both embedded inputs are constrained to 400 dimensions. The model will still need to reconstruct a 52K vocabulary and might struggle a bit, but there's not much we can do about that currently.
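For anyone who wants to see how those two layers could fit together, here is a minimal, self-contained sketch. The class `ReducedCombinedEncoder`, the hidden layer, the Softplus activation, and the sizes 768 and 100 are hypothetical and chosen for illustration; this is not the library's actual encoder, only the idea of projecting both inputs to 400 dimensions before concatenating them.

```python
import torch
from torch import nn


class ReducedCombinedEncoder(nn.Module):
    """Hypothetical encoder front end, not the library's own class."""

    def __init__(self, bert_size, input_size, reduced_size=400, hidden_size=100):
        super().__init__()
        # Project both inputs down to the same small dimension.
        self.adapt_bert = nn.Linear(bert_size, reduced_size)
        self.adapt_vocab = nn.Linear(input_size, reduced_size)
        # Hidden layer over the concatenated projections (assumed size).
        self.hidden = nn.Linear(2 * reduced_size, hidden_size)
        self.activation = nn.Softplus()

    def forward(self, x_bow, x_bert):
        bow = self.adapt_vocab(x_bow)    # (batch, reduced_size)
        ctx = self.adapt_bert(x_bert)    # (batch, reduced_size)
        combined = torch.cat((bow, ctx), dim=1)  # (batch, 2 * reduced_size)
        return self.activation(self.hidden(combined))


# Random stand-in data: 8 documents, 52k vocab, 768-dim embeddings (assumed).
encoder = ReducedCombinedEncoder(bert_size=768, input_size=52_000)
x_bow = torch.rand(8, 52_000)
x_bert = torch.rand(8, 768)
out = encoder(x_bow, x_bert)
print(out.shape)
```

The concatenated vector has 800 dimensions regardless of the vocab size, so only `adapt_vocab` and the decoder grow with the vocabulary.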
Hi,
Thanks for your help! I switched to a more powerful CPU to handle the 52k vocab, and it works well without constraining the embedded dimensions.
Thanks!