Comments (3)
Hi @Rpona,
Could you please check that MPI itself works, e.g. with:
mpirun -n 4 -ppn 1 -machinefile ~/mpd.hosts hostname
and a simple "Hello World" MPI test?
BTW, which MPI do you use?
Additionally, please check /dev/shm and clean it up if it contains a lot of stale files.
from mlsl.
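A minimal sketch of the /dev/shm check suggested above, assuming bash and GNU find. The function name list_stale_shm and the 60-minute default age are hypothetical choices for illustration, not part of MLSL or Intel MPI. The helper only lists candidate files and deletes nothing.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print regular files directly under a directory
# (default /dev/shm) that belong to the current user and were last
# modified more than a given number of minutes ago (default 60).
# It only lists files; review the output before removing anything.
list_stale_shm() {
  local dir="${1:-/dev/shm}"
  local min_age_minutes="${2:-60}"
  find "$dir" -maxdepth 1 -type f -user "$(id -u)" \
    -mmin "+${min_age_minutes}" -print
}
```

For example, `list_stale_shm /dev/shm 120` would list your own files older than two hours. It would need to be run on every host listed in the machinefile, since /dev/shm is local to each node.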
Hi @itemko,
Thanks for your reply.
I used both Intel MPI and the MPI runtime bundled with the mlsl package.
Both hit the same issue.
I used the hello world C program from the page below, and it runs fine:
http://mpitutorial.com/tutorials/mpi-hello-world/
I cleaned up the /dev/shm files, but it didn't help.
(BTW, before I compiled Intel Caffe with USE_MLSL=1, it ran fine.)
Hi @itemko,
I fixed this issue with:
export MLSL_NUM_SERVERS=0
export MLSL_STATS=1
I can run mlsl_test on multiple nodes now. Thank you!
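A minimal sketch of applying the two settings above to a single command instead of exporting them for the whole shell session. The wrapper name with_mlsl_workaround is hypothetical. The thread does not explain what the variables do, so the sketch only applies the values reported here.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run the given command with the two settings
# reported in the thread, leaving the calling shell's environment unchanged.
with_mlsl_workaround() {
  env MLSL_NUM_SERVERS=0 MLSL_STATS=1 "$@"
}
```

A possible invocation, reusing the mpirun options from earlier in the thread, is `with_mlsl_workaround mpirun -n 4 -ppn 1 -machinefile ~/mpd.hosts ./mlsl_test`. The ./mlsl_test path is an assumption, and any arguments the test needs are not shown in the thread. Whether the launcher forwards these variables to remote ranks depends on the MPI configuration.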