Comments (20)
You might be interested in my CNN analysis tool at http://dgschwend.github.io/netscope.
The web-based tool lets you calculate the total number of operations, weights, and activation memory needed for each layer in a given Caffe network. SqueezeNet v1.0, SqueezeNet v1.1, and AlexNet are included as presets.
Processing time should be more or less proportional to the number of multiply-accumulate (MACC) operations. In embedded systems, the intermediate memory needed for the activations/feature maps is probably also relevant. SqueezeNet v1.1 is definitely an improvement there. Here's a summary:
| CNN | #MACC Operations | #Weights | #Activations |
|---|---|---|---|
| AlexNet | 1140M | 62.37M | 2.39M |
| SqueezeNet v1.1 | 388M | 1.23M | 7.84M |
| SqueezeNet v1.0 | 861M | 1.24M | 12.73M |
| Inception v3 | 3230M | 23.83M | 18.51M |
| GoogLeNet | 1600M | 6.99M | 10.37M |
| VGG-16 | 16360M | 169.8M | 30.06M |
Edit: added some other well-known CNNs for comparison. All input crops are 227x227x3.
Edit2: Fixed MACCs for VGG-16
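For reference, per-layer MACC and weight counts like those in the table come from a simple calculation over each layer's geometry. A minimal sketch in Python (the example parameters are AlexNet's conv1: 227x227x3 input, 96 filters of 11x11, stride 4):

```python
# Sketch: per-layer cost counts for a convolution layer, in the style
# of what analysis tools such as Netscope report.

def conv_layer_costs(h_in, w_in, c_in, k, c_out, stride, pad=0):
    """Return (maccs, weights, activations) for one conv layer."""
    h_out = (h_in + 2 * pad - k) // stride + 1
    w_out = (w_in + 2 * pad - k) // stride + 1
    maccs = h_out * w_out * c_out * k * k * c_in   # one MACC per kernel tap per output pixel
    weights = c_out * k * k * c_in + c_out         # filter weights + biases
    activations = h_out * w_out * c_out            # output feature-map size
    return maccs, weights, activations

# AlexNet conv1: 227x227x3 input, 96 filters of 11x11, stride 4
maccs, weights, acts = conv_layer_costs(227, 227, 3, 11, 96, 4)
print(maccs, weights, acts)   # 105415200 34944 290400
```

Summing this over all layers (plus the fully-connected layers, where MACCs simply equal the weight count) gives totals on the order of the table above.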
from squeezenet.
When you say "performance," do you mean "speed," "accuracy," or something else?
from squeezenet.
Yes, basically I want to understand the time required to process one image using the original AlexNet versus SqueezeNet at more or less the same accuracy (using GPU with cuDNN).
from squeezenet.
@forresti that's an extremely useful benchmark. When we are talking about embedding CNNs into small devices, it is not only about shrinking the model size but also about aggressively reducing the number of computations per frame.
I really think SqueezeNet is on the right track, but there are no published numbers for it yet...
from squeezenet.
@dgschwend Very nice! I have been using Netscope.
If I remember correctly, GoogLeNet-v1 has ~10x fewer MACCs than VGG-19. Could I be wrong about that?
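A quick ratio check against the numbers in the table above (which lists VGG-16 rather than VGG-19; VGG-19 is slightly more expensive still, so the gap only grows):

```python
# Sanity check of the "~10x fewer MACCs" claim using the table's values.
vgg16_maccs = 16360e6       # from the table: 16360M
googlenet_maccs = 1600e6    # from the table: 1600M
ratio = vgg16_maccs / googlenet_maccs
print(round(ratio, 1))      # 10.2 -- roughly the 10x gap mentioned
```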
from squeezenet.
@forresti You're right, I somehow missed one digit... fixed!
from squeezenet.
@dgschwend That is a must-have tool; I was just thinking about an oscilloscope for CNNs! Do you think your tool could generate visual representations of the layers too?
On the Darknet framework, there is a feature called "visualize" that generates visual representations of the filters, layer by layer; take a look:
It would be useful to have this kind of visual representation rendered in the menu that is shown when you hover the mouse over a network layer.
from squeezenet.
@Grabber This gets a little bit off-topic, maybe we can move the discussion to my netscope repository? dgschwend/netscope#1
from squeezenet.
@dgschwend BTW, I noticed you've run Netscope on Inception-v3. Do you have Caffe config files for Inception-v3? (And, better yet... a working training protocol for Inception-v3?)
http://dgschwend.github.io/netscope/#/preset/inceptionv3
from squeezenet.
@forresti You can view and edit the ".prototxt" content by clicking on the "(edit)" link near the network title (http://dgschwend.github.io/netscope/#/preset/inceptionv3)
The original model is from https://github.com/smichalowski/google_inception_v3_for_caffe, but I never tried training it.
from squeezenet.
@dgschwend Got it! Thanks a lot!
from squeezenet.
@williamjames1 @Grabber @dgschwend @forresti
By the way, we have recently released CK-Caffe, a framework for collaborative performance analysis and optimisation of Caffe across multiple platforms, libraries, models, etc.
For example, this Jupyter notebook compares the best performance per image across 4 CNNs and 4 BLAS libraries on a Samsung Chromebook 2 platform. When using OpenBLAS, SqueezeNet 1.1 is 2 times faster than SqueezeNet 1.0 and 2.4 times faster than AlexNet, broadly in line with expectations set by the SqueezeNet paper.
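A "best performance per image" number like the one in that notebook is typically the minimum wall-clock time of a forward pass over several repetitions. A minimal, self-contained sketch of that measurement (the `forward` function here is a dummy stand-in for a real framework's inference call, e.g. `net.forward()` in pycaffe):

```python
# Sketch: measuring best-case per-image latency, in the spirit of
# CK-Caffe's "best performance per image" metric.
import time

def forward():
    # placeholder workload standing in for one inference pass
    sum(i * i for i in range(10000))

def best_time_per_image(fn, repeats=10, warmup=2):
    """Return the best (minimum) wall-clock time over several runs."""
    for _ in range(warmup):          # warm caches before timing
        fn()
    times = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        times.append(time.perf_counter() - t0)
    return min(times)                # best case is less noisy than the mean

print(f"best forward pass: {best_time_per_image(forward) * 1e3:.3f} ms")
```

Taking the minimum rather than the mean filters out scheduling noise, which matters on small, busy platforms like a Chromebook.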
We also have comparisons for other platforms, models and optimisations. (We are discussing with our customer what and when we can release in addition to the core CK-Caffe framework.)
In addition, we are working on an engine for crowdsourcing benchmark results from Linux, Android, Windows, etc. platforms. Stay tuned and feel free to get in touch!
from squeezenet.
I wonder how it compares to ResNet-50 (or even ResNet-18, which would probably be closer to SqueezeNet's accuracy).
from squeezenet.
@gbrand-salesforce
That's the sort of question we are aiming to answer with CK-Caffe. If you have a deploy.prototxt and a platform of interest, we can easily run the experiments there and share the results to build common knowledge. Ping me if you are interested.
from squeezenet.
@psyhtest Looks like a very interesting project!
Feel free to benchmark my ZynqNet CNN, too, if you're interested. I started with SqueezeNet and tried to build a very well-balanced CNN architecture, which fits well onto a custom-designed FPGA accelerator (you might be interested in this part as well...).
The project report and all code from my Master Thesis "ZynqNet: An FPGA-Accelerated Embedded Convolutional Neural Network" are public. 😉
from squeezenet.
Hello all --- I'm interested in finding the fastest model architecture (i.e., the lowest number of MACCs) with reasonable accuracy (speed is more important than accuracy for me). Based on the comparisons posted here, it looks like SqueezeNet v1.1 is the best choice. From my reading, though, the Darknet reference model (https://pjreddie.com/darknet/imagenet/#reference) and the so-called QuickNet (https://arxiv.org/pdf/1701.02291.pdf) seem faster, but I have not been able to find any Caffe implementations of these. Ideally, I would like to train using Caffe in DIGITS, but I do not have the experience to implement these in Caffe from scratch.
Any thoughts or recommendations here?
from squeezenet.
In my tests using Caffe, SqueezeNet v1.1 is slightly slower than AlexNet (I was using the built-in tool for measuring forward-pass performance):
https://github.com/mrgloom/kaggle-dogs-vs-cats-solution
Off-topic: regarding layer activation and weight visualization, NVIDIA DIGITS can do this, but Netscope has nicer visualization of networks.
from squeezenet.
The GoogLeNet page on all Netscope analyzers out there shows the wrong MACC count (it's 10 times too high). Do you know why that is? @dgschwend
https://dgschwend.github.io/netscope/#/preset/googlenet
from squeezenet.
@anuragmundhada, What's your golden reference regarding the MACCs?
Let's open an issue on the http://github.com/dgschwend/netscope project for that discussion...
from squeezenet.
I took the reference from the Inception-v2 paper:
Rethinking the Inception Architecture for Computer Vision - https://arxiv.org/pdf/1512.00567.pdf
Table 3 states that the cost is 1.5 Bn Ops, which should correspond to 1.5G MACCs, if I am not wrong.
I'm opening an issue on your repo. I mentioned this here only because you had posted a comment above and seemed to have faced the same problem before correcting it.
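One common source of confusion when comparing such figures: some papers count a multiply-accumulate as one op, others count the multiply and the add as two separate ops. A quick arithmetic check of both readings against the MACC table earlier in the thread (which convention the Inception paper uses is an assumption here):

```python
# GoogLeNet cost is reported as "1.5 Bn Ops" in the Inception paper.
paper_ops = 1.5e9
netscope_maccs = 1600e6                 # GoogLeNet row in the MACC table above

maccs_one_op = paper_ops                # convention: 1 op  = 1 MACC
maccs_two_ops = paper_ops / 2           # convention: 2 ops = 1 MACC (mul + add)

print(maccs_one_op / netscope_maccs)    # 0.9375  -> consistent with the table
print(maccs_two_ops / netscope_maccs)   # 0.46875 -> off by ~2x
```

Under the one-op-per-MACC reading, 1.5 Bn Ops indeed lands close to the 1600M MACCs in the table.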
from squeezenet.