Comments (6)
To confirm: did you fail to compile LLVM with libtensorflow-gpu-linux-x86_64-1.15.0.tar.gz, then use tensorflow-c-api to compile LLVM and successfully train the Fuchsia demo?
During training, the pipeline alternates between data collection (compiling the given corpus with the current TF model) and training (updating the TF model). The majority of the time is spent on data collection rather than on training (data collection/compiling is much more expensive than training in our pipeline), so it's not surprising that GPU usage is low during training. Therefore, the best way to accelerate the training process is to train on a beefy machine with higher data-collection parallelism, which cuts down the data collection time.
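The loop described above (collect data by compiling the corpus with the current policy, then update the policy) can be sketched as follows. This is a toy illustration of why collection parallelism is the lever that matters; the function names (`compile_module`, `update_policy`, `train`) are hypothetical stand-ins, not the real ml-compiler-opt API:

```python
# Toy sketch of the collect/train loop described above. All names are
# illustrative stand-ins, not the actual ml-compiler-opt API.
import time
from concurrent.futures import ThreadPoolExecutor

def compile_module(args):
    """Expensive step: stand-in for compiling one module with the
    current policy and measuring its size reward."""
    module_id, policy_version = args
    time.sleep(0.01)  # compilation dominates wall-clock time
    return (module_id, policy_version)

def update_policy(policy_version, examples):
    """Cheap step: stand-in for a training update on the collected
    examples; the real policy network is small, so CPU is enough."""
    return policy_version + 1

def train(corpus, iterations=2, workers=8):
    policy_version = 0
    for _ in range(iterations):
        # Data collection is parallelized across workers; more CPU
        # cores directly shorten this dominant phase.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            examples = list(pool.map(compile_module,
                                     [(m, policy_version) for m in corpus]))
        policy_version = update_policy(policy_version, examples)
    return policy_version
```

With this structure, adding workers (i.e., CPU cores) shortens the collection phase roughly proportionally, while the update step stays cheap regardless of the hardware.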
from ml-compiler-opt.
To confirm: did you fail to compile LLVM with libtensorflow-gpu-linux-x86_64-1.15.0.tar.gz, then use tensorflow-c-api to compile LLVM and successfully train the Fuchsia demo?
Yes, that's right, and I am using a cloud server with 60 vCPUs and a Tesla T4 16 GB GPU. Does that mean I need a more powerful CPU for this training?
Yes, a more powerful CPU (or more cores) would be far more helpful; you definitely don't need a powerful GPU for the training (the CPU can probably do the training pretty fast, since the model is quite lightweight).
from ml-compiler-opt.
Thanks for trying it out and providing the feedback! We are looking into this.
I see your comment under another issue saying the Fuchsia demo worked for you, so I'm wondering whether you managed to figure out this issue?
from ml-compiler-opt.
Hi, I didn't solve this issue, but I successfully trained the Fuchsia demo following the instructions. I am now migrating the demo to my own project, where the Python script takes a long time (maybe 12 h or so), and I found that GPU usage stays below 20%. Is there any way to accelerate the training process?
from ml-compiler-opt.
Thanks a lot for saving my day~
from ml-compiler-opt.
Related Issues (20)
- 【Question】How to use GPU training, just install tensorflow-gpu? will there be better performance if using a larger model? HOT 1
- 【Question】Why use llvm-size to calculate rewards? llvm also calculates size rewards? HOT 2
- 【Question】Can you open the code of ES algorithm? HOT 2
- 【Question】What parameters need to be passed in to compile the data set? -Oz -Xclang -fembed-bitcode=all? HOT 2
- How to train a model using bin's llvmbc and llvmcmd segments? I want to optimize directly using the executable program HOT 6
- Why can’t I use llvmbc and llvmcmd of executable programs?
- questions about feature log HOT 1
- Is it not very accurate to use the size reward of the entire file as the reward for each caller-callee feature, if the file is large and has a large number of caller-callee? HOT 1
- What does the size of sequence_examples depend on, and how to set its size? HOT 5
- Does llvm-15.04 support mlgo? What versions of tensorflow and other libraries are needed? HOT 1
- Why is the length of the reward limited to 3 or more? HOT 3
- How to know the effect of model inlining when training the model? HOT 1
- how to get model.tflite file from inlining-Oz-99f0063-v1.1.tar.gz HOT 1
- Why “-static” affects the test results of the model HOT 2
- why need to calculate reward_stat? I see llvm_trainer.train use reward from sequence_example.reward HOT 1
- Can I merge all the bc files into a total bc file for training?
- How to compile other dataset using llvm's thinlto flag?
- Where do I find pretrained models for MLGOPerf? HOT 4
- `--compile_task` flag missing HOT 2
- [non-issue] MLGO Questions HOT 10