forresti / squeezenet
SqueezeNet: AlexNet-level accuracy with 50x fewer parameters
License: BSD 2-Clause "Simplified" License
Here's the snippet from the train_val.prototxt
file for SqueezeNet V1.1. Thank you.
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "pool10"
bottom: "label"
top: "loss"
#include {
# phase: TRAIN
#}
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "pool10"
bottom: "label"
top: "accuracy"
#include {
# phase: TEST
#}
}
layer {
name: "accuracy_top5"
type: "Accuracy"
bottom: "pool10"
bottom: "label"
top: "accuracy_top5"
#include {
# phase: TEST
#}
accuracy_param {
top_k: 5
}
}
I tried to run SqueezeNet in the OpenCV dnn module, but I got this OpenCV error:
Assertion failed <dim <= 2> in cv::Mat::reshape..............................
Has anyone successfully run SqueezeNet in the OpenCV dnn module?
Thanks for sharing this work. I am comparing the GPU memory utilization of the BVLC CaffeNet and SqueezeNet. The GPU Memory usage is not what I expect on Ubuntu 14.04 with a Titan X.
Idle:
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1504 G /usr/bin/X 337MiB |
| 0 2631 G compiz 113MiB |
| 0 3502 G ...s-passed-by-fd --v8-snapshot-passed-by-fd 129MiB |
| 0 10627 G /usr/bin/nvidia-settings 22MiB |
+-----------------------------------------------------------------------------+
After loading a caffe.Classifier with SqueezeNet's weights and deploy.prototxt with PyCaffe in a Jupyter notebook:
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1504 G /usr/bin/X 337MiB |
| 0 2631 G compiz 113MiB |
| 0 3502 G ...s-passed-by-fd --v8-snapshot-passed-by-fd 131MiB |
| 0 10627 G /usr/bin/nvidia-settings 22MiB |
| 0 13713 C /usr/bin/python 229MiB |
+-----------------------------------------------------------------------------+
While classifying with SqueezeNet: (t = timeit.Timer('net.predict([image], oversample=True).flatten().argsort()[:5]', 'from __main__ import net, image'); t.timeit(100))
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1504 G /usr/bin/X 337MiB |
| 0 2631 G compiz 106MiB |
| 0 3502 G ...s-passed-by-fd --v8-snapshot-passed-by-fd 137MiB |
| 0 10627 G /usr/bin/nvidia-settings 22MiB |
| 0 13713 C /usr/bin/python 543MiB |
+-----------------------------------------------------------------------------+
BVLC CaffeNet Comparison
Idle:
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1504 G /usr/bin/X 337MiB |
| 0 2631 G compiz 113MiB |
| 0 3502 G ...s-passed-by-fd --v8-snapshot-passed-by-fd 133MiB |
| 0 10627 G /usr/bin/nvidia-settings 22MiB |
+-----------------------------------------------------------------------------+
After creating a CaffeNet caffe.Classifier:
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1504 G /usr/bin/X 338MiB |
| 0 2631 G compiz 113MiB |
| 0 3502 G ...s-passed-by-fd --v8-snapshot-passed-by-fd 139MiB |
| 0 10627 G /usr/bin/nvidia-settings 22MiB |
| 0 14231 C /usr/bin/python 184MiB |
+-----------------------------------------------------------------------------+
While classifying with CaffeNet: (t = timeit.Timer('net.predict([image], oversample=True).flatten().argsort()[:5]', 'from __main__ import net, image'); t.timeit(100))
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1504 G /usr/bin/X 338MiB |
| 0 2631 G compiz 113MiB |
| 0 3502 G ...s-passed-by-fd --v8-snapshot-passed-by-fd 139MiB |
| 0 10627 G /usr/bin/nvidia-settings 22MiB |
| 0 14231 C /usr/bin/python 465MiB |
+-----------------------------------------------------------------------------+
SqueezeNet appears to use more GPU memory than the reference BVLC CaffeNet. Am I missing something?
I'm trying to reproduce this example http://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/00-classification.ipynb using SqueezeNet, but for this picture https://github.com/BVLC/caffe/blob/master/examples/images/cat.jpg the predicted class is 278, which is n02119789 kit fox, Vulpes macrotis
from https://github.com/HoldenCaulfieldRye/caffe/blob/master/data/ilsvrc12/synset_words.txt
Is this normal, or is something wrong?
Here is the full code:
import numpy as np
import matplotlib.pyplot as plt
# The caffe module needs to be on the Python path;
import sys
caffe_root = '/home/myuser/Downloads/caffe/'  # Change this line! (trailing slash needed for the path join below)
sys.path.insert(0, caffe_root + 'python')
# If you get "No module named _caffe", either you have not built pycaffe or you have the wrong path.
import caffe
caffe.set_mode_cpu()
model_def = '/home/myuser/Desktop/GeneralDBCreator/python/SqueezeNet/SqueezeNet_v1.0/deploy.prototxt'
model_weights = '/home/myuser/Desktop/GeneralDBCreator/python/SqueezeNet/SqueezeNet_v1.0/squeezenet_v1.0.caffemodel'
net = caffe.Net(model_def,      # defines the structure of the model
                model_weights,  # contains the trained weights
                caffe.TEST)     # use test mode (e.g., don't perform dropout)
net.blobs['data'].reshape(1,         # batch size
                          3,         # 3-channel (BGR) images
                          227, 227)  # image size is 227x227
image_path= '/home/myuser/Desktop/GeneralDBCreator/python/cat.jpg' # Change this line !
image = caffe.io.load_image(image_path)
mu= np.array([104.0069879317889, 116.66876761696767, 122.6789143406786])
#mu= np.array([104, 117, 123])
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2,0,1)) # move image channels to outermost dimension
transformer.set_mean('data', mu) # subtract the dataset-mean value in each channel
transformer.set_raw_scale('data', 255) # rescale from [0, 1] to [0, 255]
transformer.set_channel_swap('data', (2,1,0)) # swap channels from RGB to BGR
transformed_image = transformer.preprocess('data', image)
net.blobs['data'].data[...] = transformed_image
output = net.forward()
output_prob = output['prob'][0] # the output probability vector for the first image in the batch
print 'predicted class is:', output_prob.argmax()
labels_file = '/home/myuser/Desktop/GeneralDBCreator/python/synset_words.txt'
labels = np.loadtxt(labels_file, str, delimiter='\t')
print 'output label:', labels[output_prob.argmax()]
What image preprocessing was performed on the ImageNet images to achieve the stated feedforward top-5 accuracy?
E.g., resize uniformly so that the smallest dimension is 256, then center crop?
I've had a tough time figuring this out, and any help would be much appreciated.
Many thanks!
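To make the candidate pipeline in the question concrete, here is a minimal sketch of the geometry of "resize the shorter side to 256, then center crop". It only illustrates that one option and does not claim to be the preprocessing the SqueezeNet authors used. The helper name resize_then_center_crop_box is hypothetical, and the 227 crop size is an assumption taken from the 227x227 input mentioned elsewhere in these issues.

```python
def resize_then_center_crop_box(width, height, short_side=256, crop=227):
    """Return ((new_w, new_h), (left, top, right, bottom)).

    Hypothetical helper: scales the image so that its shorter side equals
    `short_side` (keeping the aspect ratio), then centers a crop x crop box
    inside the resized image.
    """
    scale = short_side / float(min(width, height))
    new_w = int(round(width * scale))
    new_h = int(round(height * scale))
    left = (new_w - crop) // 2
    top = (new_h - crop) // 2
    return (new_w, new_h), (left, top, left + crop, top + crop)


if __name__ == "__main__":
    # A 1280x720 frame is resized to 455x256, then the central 227x227 window is kept.
    print(resize_then_center_crop_box(1280, 720))
```

The alternative mentioned in a later question (squashing directly to 256x256, ignoring the aspect ratio) would correspond to calling the helper with an already square 256x256 image.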
@macd @forresti @antingshen @samster25 @terrychenism
Hi, I use the command
caffe.exe time --model=SqueezeNet_v1.1_deploy.prototxt -gpu 0 -iterations 100
to measure the time.
AlexNet: 11 ms, SqueezeNet: 30 ms.
Even with cuDNN v4, SqueezeNet still takes two to three times as long as AlexNet.
Do you have any advice?
As noticed by @milakov, it's quite surprising to see that the following 1x1 convolutional layer has a padding of 1:
layer {
name: "conv10"
type: "Convolution"
bottom: "fire9/concat"
top: "conv10"
convolution_param {
num_output: 1000
pad: 1
kernel_size: 1
}
}
Any comments?
Hi
Can you explain why we need the random seed? I noticed that the random seed for ImageNet classification is set to 34. I also trained a model for face verification; without a random seed, it sometimes doesn't converge, but when I set the random seed to 1000, it always converges. Can you explain how to choose the random seed for different training tasks?
Hi,
Has anyone tried to train a binary-weight/activation model of SqueezeNet? I'm trying to do so with XNOR-Net, but I can't get past 30% top-1 accuracy with binary weights only and 24% top-1 accuracy with binary weights and binary activations. I was expecting top-1 accuracies similar to binarized AlexNet (~50% with binary weights, ~40% with binary weights and binary activations, respectively).
Help is highly appreciated.
Thanks,
Alex
The comments on the SqueezeNet models indicate that the .caffemodel files include the training data. The deploy.caffemodel files' weights and biases are 0.0. Is this intentional? It wasn't evident from the comments. I see the squeezenet_v1.1.caffemodel does have non-zero weights and biases.
Look at any of the parameter data:
import caffe
import numpy
# caffe.TEST replaces the literal 1; the variable is named net to match the lines below
net = caffe.Net("deploy.prototxt", caffe.TEST, weights="deploy.caffemodel")
net.params['conv1'][0].data[...] # weights
net.params['conv1'][1].data[...] # biases
I have trained the two versions of SqueezeNet successfully, thanks @forresti!
When training the one with residual connections, I am stuck. Whatever learning policy I use, whether the one shipped in this repo or a plain step policy, I cannot train it to the results given in the paper. The accuracy is a bit lower than SqueezeNet v1.0.
I know that I should post this in that repo, but I can't find the issues tab there.
Could anyone shed some light on this? Thanks in advance!
Hello,
Thank you for your contribution and beautiful work.
May I ask you to also share the log files from your training with us?
Thank you in advance
How can SqueezeNet be compressed with Deep Compression? Can you share the training code for SqueezeNet with Deep Compression?
I'm trying to measure the speed of inference on a single image with SqueezeNet. When I run it on the CPU, SqueezeNet seems fast enough (compared to VGG). But on the GPU, SqueezeNet gets much slower, even slower than on the CPU.
Does anyone know why it gets slow on the GPU? Should I change something in SqueezeNet when I run it on the GPU?
Here are some results of the experiments I ran for SqueezeNet vs. VGG, comparing their speeds on both CPU and GPU.
On CPU, SqueezeNet is much faster than VGG16.
[inference time]
VGG average response time: 2.21110591888[sec/image]
SqueezeNet average response time: 0.288291954994[sec/image]
On GPU, VGG16 gets much faster, even faster than SqueezeNet, and SqueezeNet gets even slower than it was on CPU.
[inference time]
VGG16 average response time: 0.0961683591207[sec/image] # gets very fast
SqueezeNet average response time: 1.50337402026[sec/image] # gets very slow <= why?
Thanks!
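One thing that may be worth ruling out with measurements like these is one-time setup cost being counted in the average. This is only an assumption about the measurement, not a diagnosis of the numbers above. A minimal sketch of a timing helper that discards a few warm-up calls before measuring follows; the name time_per_call and the stand-in workload are hypothetical, and in practice the callable would wrap the actual forward pass.

```python
import time


def time_per_call(fn, warmup=5, runs=20):
    """Average wall-clock seconds per call of fn(), ignoring warm-up calls."""
    for _ in range(warmup):
        fn()  # discarded: may include one-time setup cost
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    # Stand-in workload; a real measurement would wrap the network's forward pass.
    avg = time_per_call(lambda: sum(range(10000)))
    print("average: %.6f sec/call" % avg)
```

Timing the same callable with and without the warm-up phase gives a quick indication of how much of the per-image time is setup rather than inference.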
First, thank you for sharing this awesome work.
I am trying to fine-tune SqueezeNet on my own dataset (which is basically a subset of ImageNet labels).
Changes made in order to fine-tune, inspired by this:
- Renamed conv10 to conv10-new.
- Added a param block to conv10-new to increase the learning rate for this layer:
param {
lr_mult: 5
decay_mult: 1
}
param {
lr_mult: 10
decay_mult: 0
}
- Changed conv10-new num_output to my own number of classes.
- Decreased base_lr by a factor of 10 to 0.004.
(Tried several numbers; so far the above performed best.)
While I was able to do this with AlexNet, with SqueezeNet my accuracy is about 20% lower. Any tips for fine-tuning?
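For reference, Caffe scales the solver's base_lr by each parameter's lr_mult and the solver's weight_decay by decay_mult, so the settings above can be turned into effective per-parameter values. The sketch below does that arithmetic; the helper name effective_rates is hypothetical, and the weight_decay value is an assumed placeholder because the post does not state it.

```python
def effective_rates(base_lr, weight_decay, lr_mult, decay_mult):
    """Per-parameter learning rate and weight decay after Caffe's multipliers."""
    return base_lr * lr_mult, weight_decay * decay_mult


if __name__ == "__main__":
    base_lr = 0.004        # from the post
    weight_decay = 0.0002  # assumed value, for illustration only
    # First param block applies to the weights, second to the biases.
    print("weights:", effective_rates(base_lr, weight_decay, lr_mult=5, decay_mult=1))
    print("biases: ", effective_rates(base_lr, weight_decay, lr_mult=10, decay_mult=0))
```

With these multipliers the new layer's weights train at roughly 0.02 and its biases at roughly 0.04, while the pretrained layers keep whatever multipliers the original prototxt gives them.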
Do you have a plan to release deploy.prototxt?
I plan to run classify.py from Caffe to test SqueezeNet performance on images I have and compare it with AlexNet.
Hi,
SqueezeNet is a really cool architecture! I have added it to my caffenet-variants benchmark, and it looks even better than CaffeNet.
https://github.com/ducha-aiki/caffenet-benchmark/blob/master/Architectures.md
Name | Accuracy | LogLoss | Comments |
---|---|---|---|
CaffeNet128-2048 | 0.470 | 2.36 | Pool5 = 3x3,fc6-fc7=2048 |
CaffeNet128-4096 | 0.497 | 2.24 | Pool5 = 3x3, fc6-fc7=4096 |
SqueezeNet128 | 0.530 | 2.08 | Reference SqueezeNet solver, but linear lr_policy and batch_size=256 (320K iters) |
SqueezeNet128+ELU | 0.555 | 1.95 | Reference solver, but linear lr_policy and batch_size=256 (320K iters).ELU |
Note that, for speed reasons, I use image size = 128 px, so the performance of all nets is degraded compared to the classical 227 px.
I'd like to suggest a slightly different solver setup for SqueezeNet.
According to my tests on caffenet128, a linear lr_policy works better than the squared one used in your solver:
https://github.com/ducha-aiki/caffenet-benchmark/blob/master/Lr_policy.md
Name | Accuracy | LogLoss | Comments |
---|---|---|---|
Step 100K | 0.470 | 2.36 | Default caffenet solver, max_iter=320K |
Poly lr, p=0.5, sqrt | 0.483 | 2.29 | bvlc_quick_googlenet_solver, All the way worse than "step", leading at finish |
Poly lr, p=2.0, sqr | 0.483 | 2.299 | |
Poly lr, p=1.0, linear | 0.493 | _2.24_ | |
Best regards, Dmytro.
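For readers comparing the three "Poly lr" rows, Caffe's poly policy computes the rate as base_lr * (1 - iter/max_iter)^power. The sketch below evaluates it for the three powers in the table; the function name poly_lr is hypothetical and the base_lr value is an assumed placeholder, while max_iter=320K comes from the table.

```python
def poly_lr(base_lr, iteration, max_iter, power):
    """Caffe-style "poly" policy: base_lr * (1 - iter/max_iter) ** power."""
    return base_lr * (1.0 - float(iteration) / max_iter) ** power


if __name__ == "__main__":
    base_lr, max_iter = 0.01, 320000  # base_lr is an assumed value
    for power, label in [(0.5, "sqrt"), (1.0, "linear"), (2.0, "sqr")]:
        halfway = poly_lr(base_lr, max_iter // 2, max_iter, power)
        print("p=%.1f (%s): lr at halfway = %.6f" % (power, label, halfway))
```

At the halfway point the sqrt schedule still has the highest rate and the squared schedule the lowest, which is the difference between the rows being compared.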
Hi, it's magic to see the parameters squeezed so much, great work. Two issues when I run "caffe time" on the model on a Titan X:
SqueezeNet:
I0722 09:33:39.867264 18424 caffe.cpp:377] Average Forward pass: 128.444 ms.
I0722 09:33:39.867269 18424 caffe.cpp:379] Average Backward pass: 307.341 ms.
I0722 09:33:39.867275 18424 caffe.cpp:381] Average Forward-Backward: 436.085 ms.
AlexNet:
I0722 09:34:11.348625 18438 caffe.cpp:377] Average Forward pass: 91.4737 ms.
I0722 09:34:11.348630 18438 caffe.cpp:379] Average Backward pass: 175.433 ms.
I0722 09:34:11.348635 18438 caffe.cpp:381] Average Forward-Backward: 267.041 ms.
Did I do something wrong, or are these the issues that come with increasing the number of layers?
Thank you so much.
This is my training log.
https://gist.github.com/kli-nlpr/e0705a0d58a04178b8e6dbe554e7f072
The training loss is always about 6.9....
I use the same train_val.prototxt and solver.prototxt as yours.
Thanks.
Hi, it seems that the original v1.0 had nice dimension relationships:
227 -> (227-7)/2+1=111 -> (111-3)/2+1=55 etc.
But in v1.1 we start to get:
227 -> (227-3)/2+1=113 -> (113-3)/2+1=56 etc.
To get the output of conv1 to be 111, the input image would have to be shrunk to 223x223. I'm not sure exactly how Caffe handles this, but something seems mismatched in v1.1. Any ideas?
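The arithmetic above can be checked with a short sketch. It assumes the layer shapes implied by the post (conv1 is 7x7 stride 2 in v1.0 and 3x3 stride 2 in v1.1, followed by a 3x3 stride 2 pool) and relies on the fact that Caffe rounds convolution output sizes down but pooling output sizes up. The function names conv_out and pool_out are hypothetical.

```python
import math


def conv_out(size, kernel, stride=1, pad=0):
    """Caffe convolution output size (rounds down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride=1, pad=0):
    """Caffe pooling output size (rounds up, unlike convolution)."""
    return int(math.ceil((size + 2 * pad - kernel) / float(stride))) + 1


if __name__ == "__main__":
    # v1.0: conv1 is 7x7/2, pool1 is 3x3/2
    c = conv_out(227, kernel=7, stride=2)
    print("v1.0:", c, pool_out(c, kernel=3, stride=2))
    # v1.1: conv1 is 3x3/2, pool1 is 3x3/2
    c = conv_out(227, kernel=3, stride=2)
    print("v1.1:", c, pool_out(c, kernel=3, stride=2))
```

Because pooling rounds up, an even-sized input such as 56 still yields a well-defined output (28 for a 3x3 stride 2 pool); the last window simply extends past the edge of the feature map.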
Any intuition as to why increasing the number of anchors from 9 to 16 decreases mAP by 10%? I was expecting that, even if it did not increase mAP, it would at least stay the same. Is it due to replacing the FC layer with a conv layer in your SqueezeDet vs. YOLO?
Hello guys, I want to train SqueezeNet from scratch on my own data instead of the ImageNet dataset. I would like to ask whether that is possible. I am new to Caffe; I already know TensorFlow, but it seems there is no SqueezeNet version for TensorFlow.
Hi,
I have a question about the training data resolution of SqueezeNet. Is it resized to 256x256, or resized so that the smaller side is 256 px? I have used both resolutions to test your pretrained model, and the 256x256 resolution seems to give a higher accuracy.
I wrote my own network. Every time I train it, my total_loss decreases to 0.6~0.7 and does not go down any further. I tried tuning the bias and stddev, but nothing worked.
What is more, the accuracy does not increase; it looks like a random number.
What can I do to fix the network?
Hi, I have noticed that you put ReLU after classifier, which is not a common practice. Is there some reason for it?
layer {
name: "conv10"
type: "Convolution"
bottom: "fire9/concat"
top: "conv10"
convolution_param {
num_output: 1000
kernel_size: 1
weight_filler {
type: "gaussian"
mean: 0.0
std: 0.01
}
}
}
layer {
name: "relu_conv10"
type: "ReLU"
bottom: "conv10"
top: "conv10"
}
layer {
name: "pool10"
type: "Pooling"
bottom: "conv10"
top: "pool10"
pooling_param {
pool: AVE
global_pooling: true
}
}
Hi
When fine-tuning SqueezeNet, should some layers be frozen?
Thanks
Hi, is there any plan to, or is it possible to provide the model for SqueezeNet v1.1 with Residual Connections trained via Dense→Sparse→Dense (DSD) Training?
Hi,
I have downloaded the provided model, but when I looked at its properties, the type is PCX image, not binary.
How can I get a binary type?
Hello
I wonder why we need to use pad=1 in the conv10 layer.
What is the goal of using padding with a 1x1 kernel?
Thanks in advance for your help.
Alex
layer {
name: "conv10"
type: "Convolution"
bottom: "fire9/concat"
top: "conv10"
convolution_param {
num_output: 1000
pad: 1
kernel_size: 1
}
}
conv10 is a layer that has only 1x1 conv filters. Could anyone please tell me why pad is set to 1? Thanks.
Hi,
I want to know the performance improvement for SqueezeNet with reference to AlexNet.
Any idea?
Thanks.
William. J.
The best-performing recent net is ResNet. Is there a SqueezeNet version of ResNet?
While trying out the SqueezeNet variants (1.0 and 1.1) on a Jetson TX1 dev board with TensorRT 1.0.0, I got the following error:
Parameter check failed in addPooling, condition: windowSize.h > 0 && windowSize.w > 0 && windowSize.h*windowSize.w < MAX_KERNEL_DIMS_PRODUCT
error parsing layer type Pooling index 64
I believe this refers to the following layer in the definitions (identical in both variants):
layer {
name: "pool10"
type: "Pooling"
bottom: "conv10"
top: "pool10"
pooling_param {
pool: AVE
global_pooling: true
}
}
I've just got the following advice from NVIDIA:
TensorRT caffe parser doesn't support global pooling, so it's just taking the H and W parameters from the network definition, and those default to 0.
The API check is complaining that there isn't a valid pooling layer definition.
If you replace the global pooling with an explicitly defined window, TensorRT should work.
Alas, I'm not a Caffe expert, so I'm struggling a bit with how to do that. Can anyone suggest please how the SqueezeNet definitions should be updated, so as to maintain the recognition accuracy?
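Following NVIDIA's suggestion, one option is to write the pooling window out explicitly: an average pool whose window covers the whole input computes the same mean as global pooling, so accuracy should not change as long as the input resolution stays fixed. The sketch below generates such a layer definition. The helper name explicit_avg_pool_layer is hypothetical, the 14x14 size is only a placeholder, and the real height and width of conv10 must be read from the loaded network (for example from net.blobs['conv10'].data.shape in pycaffe). Whether TensorRT then accepts the layer has not been verified here.

```python
def explicit_avg_pool_layer(name, bottom, top, height, width):
    """Build prototxt text for an AVE pooling layer with a fixed window."""
    return (
        'layer {\n'
        '  name: "%s"\n'
        '  type: "Pooling"\n'
        '  bottom: "%s"\n'
        '  top: "%s"\n'
        '  pooling_param {\n'
        '    pool: AVE\n'
        '    kernel_h: %d\n'
        '    kernel_w: %d\n'
        '    stride: 1\n'
        '  }\n'
        '}\n'
    ) % (name, bottom, top, height, width)


if __name__ == "__main__":
    # 14x14 is a placeholder; use the actual spatial size of conv10.
    print(explicit_avg_pool_layer("pool10", "conv10", "pool10", 14, 14))
```

The trade-off is that the definition no longer adapts to other input sizes, which global_pooling handled automatically.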
I fine-tuned on my own data based on the train_val.prototxt, in which I changed num_output to 12 (I prepared just 12 classes of people) and the name of conv10 to myconv10. During training, the accuracy reached 1 quickly, as shown below:
I0122 16:43:48.676445 13661 solver.cpp:218] Iteration 40 (0.0557035 iter/s, 718.088s/40 iters), loss = -nan
I0122 16:43:48.676497 13661 solver.cpp:237] Train net output #0: accuracy = 1
I0122 16:43:48.676512 13661 solver.cpp:237] Train net output #1: accuracy_top5 = 1
I0122 16:43:48.676530 13661 solver.cpp:237] Train net output #2: loss = -nan (* 1 = -nan loss)
I0122 16:43:48.676544 13661 sgd_solver.cpp:105] Iteration 40, lr = 0.03984
But sadly, when doing the prediction, I found that the output of the prob layer is nan. Here is the result:
output {'prob': array([[[[nan]],
[[nan]],
[[nan]],
[[nan]],
[[nan]],
[[nan]],
[[nan]],
[[nan]],
[[nan]],
[[nan]],
[[nan]],
[[nan]]]], dtype=float32)}
Has anybody encountered this before?
@forresti I want to use your idea to squeeze my own big network, but I have no idea how to implement it. Can you give me some ideas on how to squeeze any big model? Can I use the original model to fine-tune the squeezed model?
This work is very exciting! The provided weights work as expected. The prototxt works out of the box with the default ilsvrc2012 lmdb data that came with Caffe's examples.
However, my training loss from scratch has not decreased even after the full 85k iterations. I tried rebuilding the latest version of caffe, running a second time, and increasing the batch size by 4x: none of these attempts seemed to help. Am I correct in understanding that the model is meant to be trained end-to-end without tricks like layer-by-layer training or anything like that?
To help me diagnose my problem, would it be possible for you to provide a reference set of initialization weights caffemodel (or/and one of your earliest intermediate snapshots)?
Thank you for your help!
Is much of it the training image data? Thank you!
I am getting this:
2017-07-14 23:09:09.599393-0500 Light[10405:2914525] [core] Error Domain=com.apple.CoreML Code=1 "Input image feature image does not match model description" UserInfo={NSLocalizedDescription=Input image feature image does not match model description, NSUnderlyingError=0x1c0a5ae50 {Error Domain=com.apple.CoreML Code=1 "Image is not valid width 227, instead is 1280" UserInfo={NSLocalizedDescription=Image is not valid width 227, instead is 1280}}}
Hello,
I'm trying to use this model, but the numbers in the various layers look weird.
Shouldn't the input be 3x224x224, as in the paper, instead of 3x227x227?
And what does it mean that the first dimension is 10?
btw, awesome work.
The accuracy is 0, and the loss is too high all the time when I run the model on cifar10.
Do I need to delete avg pooling layer?
SqueezeNet 1.0 and 1.1 are now available as built-in models in the official pytorch/vision repo. In addition to the model implementation, I pre-trained SqueezeNet 1.0 and 1.1 on ImageNet for the PyTorch model zoo, and the accuracy is even slightly better than that of the original Caffe models:
Model | Top-1 accuracy | Top-5 accuracy |
---|---|---|
SqueezeNet 1.0 | 58.000% | 80.488% |
SqueezeNet 1.1 | 58.184% | 80.514% |
Links: code in the repo, discussion is in the PR
I think a very interesting combination with SqueezeNet is RFCN or YOLO for object detection. I'm trying to port SqueezeNet from Caffe to Darknet + YOLO.
Could someone help to review it?
It is a port from v1.1
squeezenet.cfg
[net]
batch=64
subdivisions=1
height=227
width=227
channels=3
momentum=0.9
decay=0.0005
learning_rate=0.001
policy=steps
steps=20,40,60,80,20000,30000
scales=5,5,2,2,.1,.1
max_batches=40000
[crop]
crop_width=227
crop_height=227
flip=0
angle=0
saturation = 1.5
exposure = 1.5
# SqueezeNet: conv1
[convolutional]
filters=64
size=3
stride=2
activation=relu
# SqueezeNet: pool1
[maxpool]
size=3
stride=2
# SqueezeNet: fire2/squeeze1x1
[convolutional]
filters=16
size=1
activation=relu
# SqueezeNet: fire2/expand1x1
[convolutional]
filters=64
size=1
activation=relu
# SqueezeNet: fire2/expand3x3
[convolutional]
filters=64
size=3
pad=1
activation=relu
# SqueezeNet: fire2/concat
[route]
layers=-3
# SqueezeNet: fire3/squeeze1x1
[convolutional]
filters=16
size=1
activation=relu
# SqueezeNet:fire3/expand1x1
[convolutional]
filters=64
size=1
activation=relu
# SqueezeNet: fire3/expand3x3
[convolutional]
filters=64
size=3
pad=1
activation=relu
# SqueezeNet: fire3/concat
[route]
layers=-3
# SqueezeNet: pool3
[maxpool]
size=3
stride=2
# SqueezeNet: fire4/squeeze1x1
[convolutional]
filters=32
size=1
activation=relu
# SqueezeNet: fire4/expand1x1
[convolutional]
filters=128
size=1
activation=relu
# SqueezeNet: fire4/expand3x3
[convolutional]
filters=128
size=3
pad=1
activation=relu
# SqueezeNet: fire4/concat
[route]
layers=-3
# SqueezeNet: fire5/squeeze1x1
[convolutional]
filters=32
size=1
activation=relu
# SqueezeNet: fire5/expand1x1
[convolutional]
filters=128
size=1
activation=relu
# SqueezeNet: fire5/expand3x3
[convolutional]
filters=128
size=3
pad=1
activation=relu
# SqueezeNet: fire5/concat
[route]
layers=-3
# SqueezeNet: pool5
[maxpool]
size=3
stride=2
# SqueezeNet: fire6/squeeze1x1
[convolutional]
filters=48
size=1
activation=relu
# SqueezeNet: fire6/expand1x1
[convolutional]
filters=192
size=1
activation=relu
# SqueezeNet: fire6/expand3x3
[convolutional]
filters=192
size=3
pad=1
activation=relu
# SqueezeNet: fire6/concat
[route]
layers=-3
# SqueezeNet: fire7/squeeze1x1
[convolutional]
filters=48
size=1
activation=relu
# SqueezeNet: fire7/expand1x1
[convolutional]
filters=192
size=1
activation=relu
# SqueezeNet: fire7/expand3x3
[convolutional]
filters=192
size=3
pad=1
activation=relu
# SqueezeNet: fire7/concat
[route]
layers=-3
# SqueezeNet: fire8/squeeze1x1
[convolutional]
filters=64
size=1
activation=relu
# SqueezeNet: fire8/expand1x1
[convolutional]
filters=256
size=1
activation=relu
# SqueezeNet: fire8/expand3x3
[convolutional]
filters=256
size=3
pad=1
activation=relu
# SqueezeNet: fire8/concat
[route]
layers=-3
# SqueezeNet: fire9/squeeze1x1
[convolutional]
filters=64
size=1
activation=relu
# SqueezeNet: fire9/expand1x1
[convolutional]
filters=256
size=1
activation=relu
# SqueezeNet: fire9/expand3x3
[convolutional]
filters=256
size=3
pad=1
activation=relu
# SqueezeNet: fire9/concat
[route]
layers=-3
# SqueezeNet: drop9
[dropout]
probability=.5
# SqueezeNet: conv10
[convolutional]
filters=1000
size=1
activation=relu
# SqueezeNet: pool10
[avgpool]
# YoLo: output = (5 * 2 + CLASSES) * SIDE^2
[connected]
output=784
activation=linear
# YoLo
[detection]
classes=1
coords=4
rescore=1
side=7
num=3
softmax=0
sqrt=1
jitter=.2
object_scale=1
noobject_scale=.5
class_scale=1
coord_scale=5
I'm not sure exactly how to port these cases:
weight_filler {
type: "xavier"
}
weight_filler {
type: "gaussian"
mean: 0.0
std: 0.01
}
Concat layers are strange too; I don't know what index to use in [route] frame=-?
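As a side check on the [connected] layer in the cfg above: the comment gives the formula (5 * 2 + CLASSES) * SIDE^2, which assumes 2 boxes per cell, while the [detection] section sets num=3. A minimal sketch of the YOLO v1 output-size arithmetic, assuming each cell predicts num boxes of (coords + 1 confidence) values plus the class scores, shows that output=784 matches num=3. The function name yolo_v1_output_size is hypothetical.

```python
def yolo_v1_output_size(side, num, classes, coords=4):
    """Length of the detection vector: side^2 cells, each with
    `num` boxes of (coords + 1 confidence) values plus class scores."""
    return side * side * (num * (coords + 1) + classes)


if __name__ == "__main__":
    # Values from the [detection] section of the cfg above.
    print(yolo_v1_output_size(side=7, num=3, classes=1))  # 784
```

With num=2, as in the comment's formula, the same settings would give 539 instead, so the value in the cfg is consistent with the [detection] parameters rather than with the comment.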
Hi,
I've trained a new model based on SqueezeNet V1.1, and it achieves 61% top-1 accuracy on ImageNet without sacrificing parameter count or efficiency.
I've uploaded my model to this repository: https://github.com/miaow1988/SqueezeNet_v1.2
Would you please add my repository to your README.md file, so that more people can learn about this work?
Jie
I use TensorFlow. How do I convert the .caffemodel to .pkl? Thank you!
Hi, first of all, thanks for sharing.
Which layer is best for feature extraction? Have you run any tests on this?
Thanks.
Hello,
I am wondering why v1.1 does not have a deploy.prototxt. Was a commit missed? Thank you.
Do you have any plans to release a pre-trained SqueezeNet model with the DSD technique applied? Thanks.
Hello, how can SqueezeNet be used to compress other network models, such as Faster R-CNN? Could you explain in detail how to do this? Thank you.