
Comments (19)

shicai commented on May 24, 2024

first, add a softmax layer to the end of your deploy.prototxt:

layer {
  name: "prob"
  top: "prob"
  type: "Softmax"
  bottom: "fc6"
}

then, try the following script:

import numpy as np
import caffe

# load the model; the file names below are placeholders for your own deploy/caffemodel files
net = caffe.Net('deploy.prototxt', 'DenseNet_161.caffemodel', caffe.TEST)
img_path = 'cat.jpg'  # placeholder: path to your test image

nh, nw = 224, 224
im = caffe.io.load_image(img_path)
im = caffe.io.resize_image(im, [nh, nw])

img_mean = np.array([103.939, 116.779, 123.68], dtype=np.float32)
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))  # HWC to CHW
transformer.set_channel_swap('data', (2, 1, 0))  # RGB to BGR
transformer.set_raw_scale('data', 255)  # [0,1] to [0,255]
transformer.set_mean('data', img_mean)
transformer.set_input_scale('data', 0.017)

net.blobs['data'].reshape(1, 3, nh, nw)
net.blobs['data'].data[...] = transformer.preprocess('data', im)
out = net.forward()
prob = out['prob']
prob = np.squeeze(prob)

idx = np.argsort(-prob)
print(idx[0:5])
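If you prefer not to edit the deploy prototxt, the same top-5 indices can be read off the raw fc6 scores with a softmax computed in numpy; a minimal sketch, using the net and blob names from the snippet above:

# alternative to adding a "prob" layer: softmax over the fc6 scores in numpy
net.forward()
logits = np.squeeze(net.blobs['fc6'].data)
logits = logits - logits.max()                  # subtract max for numerical stability
prob = np.exp(logits) / np.sum(np.exp(logits))  # softmax
print(np.argsort(-prob)[:5])                    # top-5 class indices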

shicai commented on May 24, 2024

scale is a multiplication op, like this:
im[:,:,0] = (im[:,:,0] - 103.94) * 0.017
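Broadcasting extends that to all three channels at once; a minimal numpy sketch, assuming an H x W x 3 float image in BGR order with values in [0, 255]:

import numpy as np

bgr_mean = np.array([103.94, 116.78, 123.68], dtype=np.float32)

def preprocess(im):
    # subtract the per-channel BGR mean, then multiply by the 0.017 scale
    return (im - bgr_mean) * 0.017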

shicai commented on May 24, 2024

if you are using caffe, you can do it like this:

  transform_param {
    scale: 0.017
    mirror: false
    crop_size: 224
    mean_value: [103.94,116.78,123.68]
  }

flyyufelix commented on May 24, 2024

It works now. Thanks!

JinmingZhao commented on May 24, 2024

Hi @shicai @flyyufelix, I have the same problem: the prediction result is not correct. I use the labels from:
https://gist.github.com/shicai/fa9f98edc23521382955d4731636d1af
and my test code is below. Thanks!

def test_prediction():
    model_path = 'DenseNet_161.caffemodel'
    proto_path = 'DenseNet_161.prototxt'
    img_path = sys.argv[1]

    # initilize (defined elsewhere) loads the net from the prototxt and caffemodel on the given GPU
    net = initilize(prototext_path=proto_path, model_path=model_path, gpuId=0)

    int2label_path = 'caffe_image_labels'
    with open(int2label_path, 'r') as f:
        lines = f.readlines()
    int2label = [line.strip() for line in lines]

    # caffe.io.load_image returns RGB in [0, 1]; only used here to choose the resize shape
    img = caffe.io.load_image(img_path)  # H x W x 3
    nh, nw = resize_shape(img, min_size=256)  # keep aspect ratio (resize_shape is a helper defined elsewhere)
    # nh, nw = 224, 224

    # ref: https://github.com/flyyufelix/DenseNet-Keras/blob/master/test_inference.py
    import cv2
    img = cv2.resize(cv2.imread(img_path), (nw, nh)).astype(np.float32)  # BGR, values in [0, 255]
    img[:, :, 0] = (img[:, :, 0] - 103.94) * 0.017
    img[:, :, 1] = (img[:, :, 1] - 116.78) * 0.017
    img[:, :, 2] = (img[:, :, 2] - 123.68) * 0.017
    print(img.shape)
    img = img.transpose((2, 0, 1))  # HWC to CHW
    transformed_img = img

    print(transformed_img, transformed_img.shape)
    net.blobs['data'].reshape(1, 3, nh, nw)
    net.blobs['data'].data[...] = transformed_img
    # print(net.blobs['data'].data[...])
    print('forward')
    net.forward()
    ft = net.blobs['conv5_blk/bn'].data  # relu5_blk
    print(ft.shape)

    prob = net.blobs['fc6'].data
    print(prob.shape)
    prob = np.reshape(prob, (1000,))
    prob = np.exp(prob) / np.sum(np.exp(prob))  # softmax over the fc6 scores
    print(prob[249], prob[251], np.max(prob))
    print(int2label[np.argmax(prob)])

JinmingZhao commented on May 24, 2024

Thanks for your reply, but the prediction is still wrong:

def test_prediction():
    model_path = 'DenseNet_161.caffemodel'
    proto_path = 'DenseNet_161.prototxt'
    img_path = sys.argv[1]

    net = initilize(prototext_path=proto_path, model_path=model_path, gpuId=0)

    int2label_path = 'caffe_image_labels'
    with open(int2label_path, 'r') as f:
        lines = f.readlines()
    int2label = [line.strip() for line in lines]
    int2label = np.asarray(int2label)

    nh, nw = 224, 224
    im = caffe.io.load_image(img_path)
    im = caffe.io.resize_image(im, [nh, nw])

    img_mean = np.array([103.939, 116.779, 123.68], dtype=np.float32)
    transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
    transformer.set_transpose('data', (2, 0, 1))  # HWC to CHW
    transformer.set_channel_swap('data', (2, 1, 0))  # RGB to BGR
    transformer.set_raw_scale('data', 255)  # [0,1] to [0,255]
    transformer.set_mean('data', img_mean)
    transformer.set_input_scale('data', 0.017)

    net.blobs['data'].reshape(1, 3, nh, nw)
    net.blobs['data'].data[...] = transformer.preprocess('data', im)
    out = net.forward()
    prob = out['prob']
    prob = np.squeeze(prob)

    idx = np.argsort(-prob)
    print(idx[0:5])
    print(int2label[idx[0:5]])

The model was downloaded from the Baidu disk,
the picture is 'cat.jpg',
the labels are from https://gist.github.com/shicai/fa9f98edc23521382955d4731636d1af,
and the prediction result is:
[892 681 916 644 650]
["'n04548280 wall clock'" "'n03832673 notebook, notebook computer'"
"'n06359193 web site, website, internet site, site'"
"'n03729826 matchstick'" "'n03759954 microphone, mike'"]

Thanks,
Jinming

shicai commented on May 24, 2024

Maybe your model is corrupted; please download the model again.

JinmingZhao commented on May 24, 2024

Hi @shicai, the prediction with the DenseNet_121 model is OK, but I have re-downloaded DenseNet_161 and it still does not work. Could you please check the DenseNet_161 model on the Baidu disk?

shicai commented on May 24, 2024

It should be OK, since it has been tested by others on GitHub. Or you can try to download the model from Google Drive.

JinmingZhao commented on May 24, 2024

@shicai I have tested the model from Google Drive and the result is the same: still wrong. But if I just change 161 to 121, the result is right. Could you test the 161 model?

JinmingZhao commented on May 24, 2024

And I will download the 169 and 201 models to verify this problem.

JinmingZhao commented on May 24, 2024

@shicai
Here are some test results:

With image shape 224 x 224: the {121, 169, 201} models are OK, but the 161 model (Baidu disk and Google Drive) cannot predict cat.jpg or a dog picture (whole body). It can predict a husky dog (head only); the top-5 results are ["'n02108915 French bulldog'" "'n02110185 Siberian husky'" "'n02808304 bath towel'" "'n02085620 Chihuahua'" "'n02097298 Scotch terrier, Scottish terrier, Scottie'"].

With image shape 256 x ? (256 is the shorter edge, aspect ratio kept): the {121, 169, 201} models are OK, but the 161 model does not work.

So is there something wrong with the 161 model?

shicai commented on May 24, 2024

please check this:
md5sum DenseNet_161.caffemodel = 26fee6531e67a7c239e10fa009ca2a57
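To compare against that value without the md5sum tool, here is a minimal Python sketch (assuming the caffemodel sits in the current directory):

import hashlib

# read the file in 1 MB chunks and print its MD5 digest
md5 = hashlib.md5()
with open('DenseNet_161.caffemodel', 'rb') as f:
    for chunk in iter(lambda: f.read(1 << 20), b''):
        md5.update(chunk)
print(md5.hexdigest())  # expected: 26fee6531e67a7c239e10fa009ca2a57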

shicai commented on May 24, 2024

I downloaded DenseNet161 from Google Drive.
It works well, and the top 5 predicted labels are: [ 0 1 389 397 392]

JinmingZhao commented on May 24, 2024

26fee6531e67a7c239e10fa009ca2a57 DenseNet_161.caffemodel (Google Drive)
26fee6531e67a7c239e10fa009ca2a57 DenseNet_161.caffemodel.old (Baidu disk)

shicai commented on May 24, 2024

No, I used a tench image, n01440764_37.JPEG, from the ImageNet dataset.

shicai commented on May 24, 2024

When using cat.jpg from the Caffe examples, the result is [282 285 281 263 287].

JinmingZhao commented on May 24, 2024

Oh, I found the reason: I had modified the prototxt from

name: "DENSENET_161"
input: "data"
input_dim: 1
input_dim: 3
input_dim: 224
input_dim: 224

to

name: "DENSENET_161"
input: "data"
input_dim: 1
input_dim: 3
input_dim: 1
input_dim: 1

I did this because I had seen a prototxt in openpose written this way:

input: "image"
input_dim: 1
input_dim: 3
input_dim: 1 # This value will be defined at runtime
input_dim: 1 # This value will be defined at runtime

so I thought it would be OK here too. Would you please explain the reason? I haven't used Caffe before, so I don't understand some details.

Thank you very much! Sorry, it was my mistake.

shicai commented on May 24, 2024

The four values of input_dim indicate N, C, H, W respectively.
N=1 means a single image, C=3 means a color image with RGB channels.
The last two values are the height and width of the input image.
openpose uses a customized Caffe; maybe some image processing steps have been changed.
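As a concrete illustration of that mapping, a minimal pycaffe sketch (blob name 'data' as in the deploy files quoted above) that reads those four values and sets the spatial size at runtime instead of hard-coding it in the prototxt:

# the data blob shape is exactly the four input_dim values: (N, C, H, W)
print(net.blobs['data'].data.shape)   # e.g. (1, 3, 224, 224)

# set the real input size at runtime, then propagate the new shapes
# through the rest of the network before calling forward()
net.blobs['data'].reshape(1, 3, 224, 224)
net.reshape()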
