Comments (5)
Hi,
Thanks for the question. The code in the notebook is really just a proof of concept to guide the creation of full landscapes. The code you want to look at for expansion is in expansion.py. That's the code that takes only a small subset of all patents for inference.
If the notebook isn't getting through the L1 and L2 expansion, the expansion levels are probably too large to fit into memory or to be serialized. You could try removing the serialization sections of the code, but then every re-run will re-query and redo all that work instead of loading it from the local cache.
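The caching trade-off above can be illustrated with a small sketch. This is not the code from expansion.py; `load_or_compute`, `cache_path`, and `compute_fn` are hypothetical names, and `compute_fn` stands in for whatever function runs the expansion query. The idea is to keep the local cache when serialization works and to fall back to the in-memory result when it does not.

```python
import os
import pickle


def load_or_compute(cache_path, compute_fn):
    """Return a cached result if one exists; otherwise compute it and try to cache it.

    `compute_fn` is a hypothetical zero-argument callable standing in for the
    expensive expansion query.
    """
    if os.path.exists(cache_path):
        # Cache hit: skip the expensive query entirely.
        with open(cache_path, "rb") as f:
            return pickle.load(f)

    result = compute_fn()
    try:
        with open(cache_path, "wb") as f:
            pickle.dump(result, f)
    except (MemoryError, OverflowError, pickle.PicklingError):
        # Serialization failed (for example, the result is too large).
        # Remove any partial file and carry on with the in-memory result;
        # the next run will have to re-query.
        if os.path.exists(cache_path):
            os.remove(cache_path)
    return result
```

With this shape, removing the cache file forces a fresh query, and a failed write costs only the caching, not the run itself.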
Hope that's helpful.
Thanks.
from patents-public-data.
Hello,
Yes, it's getting through the L1 and L2 expansion (I can see l1_patents_df and l2_patents_df), but it's not clear to me where in the code the pruning happens.
Or would you iterate through the L1 and L2 patents with the model and predict the class of each patent?
Thanks!
Hello, just to close this out: I did end up iterating through the entire L1 and L2 sets, and it seemed to work well.
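For anyone following the same route, a minimal sketch of the iterate-and-predict approach might look like the following. It is not taken from the repository: `prune_expansion`, `predict_batch`, `batch_size`, and `threshold` are hypothetical names, and it assumes the model can be wrapped in a function that returns one score per patent, where a higher score means "in the landscape".

```python
from typing import Callable, Iterator, List, Sequence


def batched(items: Sequence, batch_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def prune_expansion(
    patents: Sequence,
    predict_batch: Callable[[Sequence], List[float]],
    batch_size: int = 256,
    threshold: float = 0.5,
) -> list:
    """Keep only the patents whose predicted score reaches `threshold`.

    `predict_batch` is a hypothetical wrapper around the trained model; it is
    assumed to return one score per input patent, in the same order.
    """
    kept = []
    for batch in batched(patents, batch_size):
        scores = predict_batch(batch)
        kept.extend(p for p, s in zip(batch, scores) if s >= threshold)
    return kept
```

Batching keeps memory use bounded when the L1 and L2 sets are large; the rows of l1_patents_df and l2_patents_df could be passed in as lists of records.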
Thanks!
Hi @sfd9898, do you think you could help with this one, #47?
The model has been deleted from cloud storage, and I would really like to try it. I would greatly appreciate it if you could share your local copy of the models with me, if you still have it. Thanks a lot!