Comments (8)
The randomness probably comes from the DGL sampler. You can try this: https://docs.dgl.ai/en/0.4.x/generated/dgl.random.seed.html
I think it should fix the randomness of the DGL sampler. If not, we'll fix it. Thanks.
from dgl-ke.
Thanks for the reply!
After some digging, it seems that OMP_NUM_THREADS also needs to be set to 1 to get the same outputs from the sampler, since the default edge sampler uses multi-threading when creating the negative entity list. However, the final output embeddings are still different. Is it even possible to get completely identical embeddings from different training runs, especially when using multiple processes/threads and multiple GPUs?
from dgl-ke.
With multithreading or multiprocessing, I think it's impossible to make it reproducible. I'm not sure whether any of the GPU parallel computation can make it non-deterministic as well.
from dgl-ke.
Thanks for your help!
I managed to produce deterministic results on both CPU and GPU. Here's what I've done.
- Fix the random seeds of NumPy, PyTorch and DGL. For PyTorch on GPU, the CUDA random seed and some cuDNN-related options need to be set as well (https://pytorch.org/docs/stable/notes/randomness.html).
- Set both --num_thread and --num_proc to 1, and turn off all other multi-threading options (like --async_update).
- Set OMP_NUM_THREADS=1. This step is crucial: even if you set both the thread count and the process count to 1, DGL still uses OpenMP to parallelize other work in the background, which introduces randomness.
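The seeding and OpenMP steps above can be sketched roughly as follows. This is only an illustration, not code from dgl-ke: the helper name `seed_everything` is hypothetical, and where it would be called in the training script is an assumption. Each library is seeded only if it is installed, so the snippet also runs on a machine without PyTorch or DGL.

```python
import os

# OMP_NUM_THREADS is read when the OpenMP runtime starts, so set it before
# importing numpy / torch / dgl (exporting it in the shell works as well).
os.environ["OMP_NUM_THREADS"] = "1"

import random


def seed_everything(seed: int = 0) -> list:
    """Hypothetical helper: seed every RNG that is installed.

    Returns the names of the libraries that were seeded.
    """
    random.seed(seed)
    seeded = ["random"]

    try:
        import numpy as np
        np.random.seed(seed)
        seeded.append("numpy")
    except ImportError:
        pass

    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # silently ignored without CUDA
        # cuDNN options from the PyTorch reproducibility notes
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
        seeded.append("torch")
    except ImportError:
        pass

    try:
        import dgl
        dgl.random.seed(seed)  # the call linked earlier in this thread
        seeded.append("dgl")
    except ImportError:
        pass

    return seeded


if __name__ == "__main__":
    print("seeded:", seed_everything(0))
```

The --num_thread and --num_proc options are command-line arguments of the training command, so they are passed as 1 there, with --async_update left off.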
from dgl-ke.
Good to know. Thanks for showing us how to make the training deterministic. It'll be useful for future users.
from dgl-ke.
3. OMP_NUM_THREAD
Does the second operation matter? And how do I set num_thread and num_proc to 1?
Thank you.
from dgl-ke.
@Megavoxel01, how do I set num_threads and num_proc to 1 and set OMP_NUM_THREADS=1?
from dgl-ke.
@Megavoxel01, can you please give more detailed information (the actual files to modify and the code) for getting a deterministic model?
Or is there any chance that this has been integrated into dgl-ke since then?
Thanks in advance.
from dgl-ke.