Comments (8)
Hi, soyebn,
- Though the search only prints the FLOPs of models when it terminates, the searched model is expected to have low latency if you set the objective as latency. Sorry that I didn't add a latency measurement at the end of the search process; the latency is only measured in retrain.py after the search. You could add the measurement code at the end of the search if needed.
- As the search process involves multiple paths, the predicted latency term in the loss function doesn't exactly equal the real value, as in ProxylessNAS. You could adjust the latency of the final searched model by tuning the parameters below for cost optimization.
DenseNAS/configs/imagenet_search_cfg_mbv2.yaml
Lines 41 to 42 in e479206
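From this description, the cost objective can be pictured as a log-scaled latency penalty added to the task loss. Below is a minimal sketch under that assumption; the function and argument names are hypothetical illustrations, not DenseNAS's actual API:

```python
import math

def sub_obj_loss(task_loss, predicted_latency, sub_loss_factor, log_base):
    """Hypothetical sketch of a log-scaled cost term added to the task loss.

    predicted_latency: expected latency of the super network (ms)
    sub_loss_factor:   scales the strength of the latency penalty
    log_base:          rescales the latency into log-space
    """
    cost = sub_loss_factor * (math.log(predicted_latency) / math.log(log_base))
    return task_loss + cost

# e.g. a 30 ms prediction with sub_loss_factor=0.15 and log_base=15
total = sub_obj_loss(2.0, 30.0, 0.15, 15.0)
print(total)
```

With this shape, raising sub_loss_factor pushes the search harder toward low-latency paths, which matches how the parameters are described in the comments below.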
A tuning tip: the sub_obj parameters can be tuned by resuming from a checkpoint in which the operation weights are already trained but the architecture parameters are still untrained. This can help you save computation resources.
I hope this helps!
from densenas.
Hi Jaimin,
You are an immense help already. Thanks for the reply.
I was not really worried about the latency not being printed. I just wanted a way to tell the search process the latency I am interested in. So let's say I have a 40 ms latency in mind; should I then set sub_loss_factor to 0.4 and log_base to 15.0 (or does log_base also need to be changed to 40.0)?
For the tuning tip, I should do the following, right?
arch_update_epoch: 50
if_resume: true
resume:
  load_path: 'path to weights_49.pt'
  load_epoch: 49
Hi, Soyeb,
Normally, log_base is set to the target value and sub_loss_factor is used to control the magnitude of the cost optimization. When log_base is larger, the resulting latency is lower, and vice versa.
The tuning tip you described is right!
Hi Jamin,
I read your paper one more time and understood it a little better.
I am running two experiments (both resuming from weights_49.pt), with log_base = 15 and 30. I am observing the following:
| log_base | running_latency | additional_loss_due_to_latency |
|---|---|---|
| 30 | 32.xx | 0.15 |
| 15 | 32.xx | 0.19 |
Is this along expected lines? I was thinking that in the log_base = 15 case, the running latency should settle down close to 15.
How did you obtain the various models like DenseNAS-A/B/C/Large? If you could share the parameters (log_base, sub_loss_factor) you changed to achieve those, it would guide me on how to obtain models with different latencies.
Thanks again for all your help.
Hi, Soyeb,
Does the running_latency in your table denote the latency printed during the search? That value is predicted from the architecture parameters of the super network. Because the architecture probabilities are computed by applying softmax to the architecture parameters, and those probabilities differ little from one another, the latency predicted during the search may change only slightly under different latency-optimization hyper-parameters. Moreover, the latency prediction during the search involves all the paths between blocks, while the final searched architecture keeps only one path, so the predicted latency differs from the final derived or target one. It is better to observe the latency of the final searched model rather than only the predicted latency during the search.
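Why the printed prediction barely moves can be seen from a toy version of this computation. This is a sketch under the assumption that the prediction is a softmax-probability-weighted sum over candidate-path latencies; the function names and values are hypothetical:

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of architecture parameters
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def predicted_latency(arch_params, path_latencies):
    """Expected latency of the super network: softmax over the architecture
    parameters gives path probabilities, and the prediction is the
    probability-weighted sum of the per-path latencies."""
    probs = softmax(arch_params)
    return sum(p * lat for p, lat in zip(probs, path_latencies))

# near-uniform parameters -> prediction stays close to the mean over all
# paths, which is why it changes little as the hyper-params change
print(predicted_latency([0.1, 0.0, -0.1], [20.0, 30.0, 40.0]))
```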
Indeed, log_base plays the same role as sub_loss_factor, since log_a(b) = log(b) / log(a). But log_base may help you tune the hyper-parameters more conveniently. For my experiments, I set sub_loss_factor to 0.22, 0.2, 0.15, and 0.05 for DenseNAS-A, -B, -C, and -Large respectively, and log_base to 15 for all.
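The change-of-base identity behind this can be checked directly:

```python
import math

lat = 30.0  # an example predicted latency in ms
# log_base(lat) == log(lat) / log(base): changing log_base rescales the
# cost term by a constant factor, the same effect as scaling sub_loss_factor.
for base in (15.0, 30.0):
    print(base, math.log(lat, base), math.log(lat) / math.log(base))
```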
Thanks for your interest in our work; I hope my answer helps you.
Hi Jamin,
You are right, by "running_latency" I meant the latency printed during the search.
My two rounds of search ([log_base=15, sub_loss_factor=0.15] and [log_base=15, sub_loss_factor=0.12]) have finished. Is there any code I can reuse to estimate latency based on the lat_list_densenas_mbv2_xp32 file used for searching? Were the latency numbers shown in the paper, e.g. 17.9 ms for DenseNAS-C, obtained by actually measuring the running time, or was there latency-measurement code that used a pre-computed latency table like lat_list_densenas_mbv2_xp32? I attempted this but it looks a little tricky, since lat_list_densenas_mbv2_xp32 is tied to the supernet; hence the question. Any suggestions?
Thanks for your help again.
Hi Soyeb,
I didn't implement code for predicting the latency of a derived model based on the lookup table. You could directly measure the latency of the model using the following code, or predict the latency by indexing the blocks and op types in the lookup table.
Line 142 in e479206
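Since no lookup-table predictor ships with the repo, the indexing approach could be sketched like this. The (block, op) key format, the op names, and the latency values are hypothetical stand-ins, not the real lat_list_densenas_mbv2_xp32 structure:

```python
def predict_from_lut(net_config, lut):
    """Sum per-op latencies for one derived path.

    net_config: list of (block_index, op_name) pairs of the derived model
    lut:        dict mapping (block_index, op_name) -> latency in ms
    Both are hypothetical stand-ins for the real lookup-table format.
    """
    return sum(lut[(blk, op)] for blk, op in net_config)

# toy lookup table and derived config
lut = {(0, 'mbconv_3x3'): 1.5, (1, 'mbconv_5x5'): 2.0, (2, 'skip'): 0.1}
config = [(0, 'mbconv_3x3'), (1, 'mbconv_5x5'), (2, 'skip')]
print(predict_from_lut(config, lut))  # 3.6
```

Note that such a prediction only covers the ops present in the table; a direct wall-clock measurement of the derived model remains the more reliable number, as suggested above.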
Please reopen it if needed.