bestivictory / ilgnet

Python 100.00%

ilgnet's Introduction

ILGnet

This is an open-source project for image aesthetic evaluation built on the Caffe deep learning framework, completed by the Victory team of Besti.

In this paper we investigate the image aesthetics classification problem, i.e., automatically classifying an image as having low or high aesthetic quality, which is a challenging problem beyond image recognition. Deep convolutional neural network (DCNN) methods have recently shown promising results for image aesthetics assessment. A powerful Inception module has recently been proposed and shows very high performance in object classification. However, the Inception module has not yet been considered for the image aesthetics assessment problem. In this paper, we propose a novel DCNN structure codenamed ILGNet for image aesthetics classification, which introduces the Inception module and connects intermediate Local layers to the Global layer for the output. In addition, we use GoogLeNet, an image classification CNN pre-trained on the ImageNet dataset, and fine-tune our connected local and global layer on the large-scale aesthetics assessment AVA dataset [1]. The experimental results show that the proposed ILGNet outperforms state-of-the-art results in image aesthetics assessment on the AVA benchmark.

The AVA dataset

For a fair comparison, we adopted the same strategy as previous work to construct two sub-datasets of AVA.

[1] Naila Murray, Luca Marchesotti, Florent Perronnin. AVA: A Large-Scale Database for Aesthetic Visual Analysis. Computer Vision and Pattern Recognition (CVPR), 2012.

• AVA1: We chose a score of 5 as the boundary to divide the dataset into a high quality class and a low quality class. This gives 74,673 low quality images and 180,856 high quality images. The training and test sets contain 235,529 and 20,000 images, respectively.

• AVA2: To increase the gap between images with high aesthetic quality and images with low aesthetic quality, we first sort all images by their mean scores. Then we pick the top 10% of images as good and the bottom 10% as bad. This selects 51,106 images from the AVA dataset. These images are evenly and randomly divided into a training set and a test set, each containing 25,553 images.
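The two partition rules above can be sketched in a few lines of Python. This is a minimal illustration, not the code in './src': the function names are hypothetical, the rating histogram layout (ten vote counts for scores 1 to 10) is an assumption about the AVA metadata, and treating a mean of exactly 5 as low quality is an arbitrary choice made here.

```python
import random


def mean_score(vote_counts):
    """Mean of a rating histogram; vote_counts[i] is the number of votes for score i + 1."""
    total = sum(vote_counts)
    return sum((i + 1) * c for i, c in enumerate(vote_counts)) / total


def ava1_label(mean, boundary=5.0, delta=0.0):
    """AVA1-style label: 1 = high quality, 0 = low quality, None = inside the delta band."""
    if mean > boundary + delta:
        return 1
    if mean <= boundary - delta:  # assumption: a mean exactly on the boundary counts as low
        return 0
    return None


def ava2_split(mean_by_id, fraction=0.10, seed=0):
    """AVA2-style split: bottom/top `fraction` by mean score, then an even random split."""
    ranked = sorted(mean_by_id, key=mean_by_id.get)  # ascending mean score
    k = int(len(ranked) * fraction)
    bad = [(image_id, 0) for image_id in ranked[:k]]
    good = [(image_id, 1) for image_id in ranked[len(ranked) - k:]]
    labeled = bad + good
    random.Random(seed).shuffle(labeled)
    half = len(labeled) // 2
    return labeled[:half], labeled[half:]
```

The counts quoted above are internally consistent: 74,673 + 180,856 = 255,529 = 235,529 + 20,000 for AVA1, and 2 × 25,553 = 51,106 for AVA2.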

How to test

Please use the Caffe test tools to measure accuracy.
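As a rough sketch of what such a test run might look like, the helper below builds a `caffe test` command line. The helper name, the file names passed to it, and the batch size are assumptions for illustration; the model file must be a network definition with a TEST-phase data layer and an accuracy layer, and the batch size must match the one set in that file.

```python
import math


def caffe_test_command(model, weights, num_images, batch_size, gpu=None):
    """Build the argument list for `caffe test` so the test set is covered once."""
    # One iteration processes one batch, so round up to cover every image.
    iterations = math.ceil(num_images / batch_size)
    cmd = ["caffe", "test",
           "-model", model,
           "-weights", weights,
           "-iterations", str(iterations)]
    if gpu is not None:
        cmd += ["-gpu", str(gpu)]
    return cmd
```

For example, `caffe_test_command("train.prototxt", "ILGnet-AVA1.caffemodel", 20000, 50)` yields 400 iterations, and the list can be passed to `subprocess.run(cmd, check=True)`. If the batch size does not divide the test set size, the last batch is expected to wrap around to the start of the data, so the reported accuracy counts a few images twice.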

Accuracy on the random partition in './data'

The accuracy we achieve on the AVA1 dataset is 81.68% with δ=0, and it rises to 82.66% using Inception V4.

The accuracy we achieve on the AVA2 dataset is 85.50%, and it rises to 85.53% using Inception V4.

We achieve state-of-the-art aesthetic classification accuracy.

The random partition programs are in './src'.

The Trained Models

Each trained model is larger than 500 MB.

You can download them from the BaiduYun cloud disk or Google Drive:

BaiduYun Links:

ILGnet-AVA1.caffemodel

ILGnet-AVA2.caffemodel

Google Drive Links:

ILGnet-AVA1.caffemodel

ILGnet-AVA2.caffemodel

Note: The previous deploy.prototxt was wrong. We have uploaded the corrected file. Thanks for your suggestion.

Our paper

Xin Jin, Jingying Chi, Siwei Peng, Yulu Tian, Chaochen Ye and Xiaodong Li. Deep Image Aesthetics Classification using Inception Modules and Fine-tuning Connected Layer. The 8th International Conference on Wireless Communications and Signal Processing (WCSP), Yangzhou, China, 13-15 October, 2016 pdf(5.94MB) oral presentation(19.1MB) arXiv(1610.02256) [Project]

If you find our model/method/dataset useful, please cite our work:

@inproceedings{DBLP:conf/wcsp/JinCPTYL16,

author = {Xin Jin and Jingying Chi and Siwei Peng and Yulu Tian and Chaochen Ye and Xiaodong Li},

title = {Deep image aesthetics classification using inception modules and fine-tuning connected layer},

booktitle = {8th International Conference on Wireless Communications {&} Signal Processing, {WCSP} 2016, Yangzhou, China, October 13-15, 2016},

pages = {1--6},

year = {2016},

crossref = {DBLP:conf/wcsp/2016},

url = {http://dx.doi.org/10.1109/WCSP.2016.7752571},

doi = {10.1109/WCSP.2016.7752571},

timestamp = {Fri, 16 Dec 2016 12:48:17 +0100},

biburl = {http://dblp.uni-trier.de/rec/bib/conf/wcsp/JinCPTYL16},

bibsource = {dblp computer science bibliography, http://dblp.org}

}

Latest edit

Jan 15, 2017

ilgnet's Issues

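Caffe beginner here: could you provide more detailed usage instructions?

Thanks to the authors for open-sourcing this excellent software. I have just started with Caffe and would like to test this code, but I am not sure where to begin. Could the authors provide more detailed installation and usage instructions? Thank you.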

How to do the prediction via pretrained model?

This is my script to load the pretrained model and make a prediction. However, it cannot tell apart images that are obviously pretty or ugly, so I suspect my input is not right.
There are several points I am not sure about:

Is the input RGB or BGR?
Should the input scale be 0~255 rather than 0~1?
The AVA1_mean file is (3, 256, 256); should I crop it to (227, 227, 3) and subtract that from each image?
If possible, could anyone post a script showing how to correctly read an image, load the model, and make a prediction? It would be much appreciated.

import caffe
import numpy as np
from PIL import Image

def preprocess_image(fp, ava1mean):
    # Load as RGB and resize to the 227 x 227 network input.
    im = Image.open(fp).convert("RGB")
    im = im.resize((227, 227))
    im = np.asarray(im).astype(np.float32)  # (227, 227, 3), values in 0..255
    if im.ndim != 3:
        raise ValueError("expected an H x W x C image")
    # im = im[:, :, ::-1]  # shall we convert RGB -> BGR?
    im -= ava1mean
    return im  # (227, 227, 3)


# Mean image is stored as (3, 256, 256); move channels last and center-crop.
ava1mean = np.load("../ILGnet/mean/AVA1_mean.npy")
ava1mean = ava1mean.transpose(1, 2, 0)[14:241, 14:241, :]  # (227, 227, 3)

inputs = [preprocess_image("../ugly.jpg", ava1mean)]
classifier = caffe.Classifier("deploy2.prototxt", "ILGnet-AVA1.caffemodel",
                              image_dims=[227, 227])

print(classifier.predict(inputs, True))
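For comparison, here is a sketch of the preprocessing that Caffe models conventionally expect: pixel values in 0..255, BGR channel order, and a channels-first (C, H, W) layout. Whether this particular model and mean file follow that convention is an assumption that the repository does not confirm, and both function names are hypothetical.

```python
import numpy as np


def center_crop_chw(arr, size):
    """Center-crop a (C, H, W) array to (C, size, size)."""
    _, height, width = arr.shape
    top = (height - size) // 2
    left = (width - size) // 2
    return arr[:, top:top + size, left:left + size]


def to_caffe_blob(rgb_hwc, mean_chw_bgr):
    """Convert an (H, W, 3) RGB image in 0..255 to a mean-subtracted (3, H, W) BGR array."""
    img = np.asarray(rgb_hwc, dtype=np.float32)
    if img.ndim != 3 or img.shape[2] != 3:
        raise ValueError("expected an H x W x 3 image")
    bgr = img[:, :, ::-1]          # RGB -> BGR (assumed Caffe convention)
    chw = bgr.transpose(2, 0, 1)   # (H, W, C) -> (C, H, W)
    return chw - mean_chw_bgr      # assumption: the mean is stored as BGR, (C, H, W)
```

With a (3, 256, 256) mean, `center_crop_chw(mean, 227)` takes rows and columns 14 to 240, the same window as the `[14:241]` slice in the script above.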

Random output value for same image

When I run test.py on the same image multiple times, I get different output values each time. The variation is large, ranging from 0.1 to 0.8 and 0.9. I'm using Caffe 1.0.0.

Unknown bottom blob 'label' (layer 'loss1/loss', bottom index 1)

As a beginner, I tried running this code:
import sys

import numpy as np
import matplotlib.pyplot as plt

caffe_root = '/opt/caffe/'
# Make pycaffe importable before importing it.
sys.path.insert(0, caffe_root + 'python')
import caffe

MODEL_FILE = caffe_root + 'ILGnet/deploy.prototxt'
PRETRAINED = caffe_root + 'ILGnet/ILGnet-AVA2.caffemodel'
IMAGE_FILE = caffe_root + 'examples/images/cat.jpg'
mean_file = caffe_root + 'ILGnet/AVA2_mean.npy'

caffe.set_mode_cpu()
net = caffe.Classifier(MODEL_FILE, PRETRAINED,
                       mean=np.load(mean_file).mean(1).mean(1),  # per-channel mean
                       channel_swap=(2, 1, 0),
                       raw_scale=255,
                       image_dims=(227, 227))
input_image = caffe.io.load_image(IMAGE_FILE)
plt.imshow(input_image)
prediction = net.predict([input_image])
plt.plot(prediction[0])
plt.show()
print('predicted class: %d' % prediction[0].argmax())
I don't know what is wrong. I hope you can help.

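Question about the exact training parameters for AVA1

Hello,
Your train.prototxt is the one used for AVA2. Is the train.prototxt used for AVA1 training exactly the same?
When I train with AVA1_solver.prototxt plus train.prototxt, the loss quickly reaches 87.3365. Even after lowering the learning rate and training for 100,000 iterations with batchsize=48, the accuracy is still only about 75%.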
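The value 87.3365 is recognizable: it equals -ln(FLT_MIN), where FLT_MIN is the smallest positive normal single-precision float (2^-126). To my knowledge, Caffe's softmax loss clamps the predicted probability to FLT_MIN before taking the logarithm, so a loss stuck at this value usually means the probability of the true class has collapsed to zero, which typically indicates divergence. The sketch below reproduces the arithmetic; the function name is hypothetical.

```python
import math

# Smallest positive normal float32 value (FLT_MIN in C).
FLT_MIN = 2.0 ** -126


def clamped_log_loss(prob):
    """Negative log-likelihood with the probability clamped to FLT_MIN."""
    return -math.log(max(prob, FLT_MIN))
```

Here `clamped_log_loss(0.0)` evaluates to 126 × ln 2 ≈ 87.3365, while a chance-level two-class prediction, `clamped_log_loss(0.5)`, gives about 0.6931.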

difference between train.prototxt and ILGNet_v4.prototxt

I want to know what is the difference between train.prototxt and ILGNet_v4.prototxt. I would appreciate it if you could provide me with more information.

Why is the output for the same image different in two test runs?

I used the pretrained model you provide (https://pan.baidu.com/s/1slMv4yp) and only modified the model name and image name in your test code. But in two different runs I got different outputs, for example {good:0.6, bad:0.4} and {good:0.4, bad:0.6}. This confuses me, and I look forward to your answer.

Why do the test set sizes in this repository and in the paper differ?

In the paper, the test set contains 19,930 images, but in this repository it contains 20,000. The README says that the test set in this repository is a random partition, so the test accuracy differs: 81.68% in this repository versus 79.25% in the paper. Could you please provide the image IDs of the test set used in the paper? Thank you very much.

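Tested 100,000 images and the results seem far from ideal; here is a screenshot of the highest-scoring images:

I crawled 100,000 high-resolution images, including exquisite, mediocre, and poor ones, and tested them on a server with the latest ILGnet script and ILGnet-AVA2.caffemodel. These are the highest-scoring images:

Very ordinary, mediocre images rank at the top, which is far from the official examples. In fact, these 100,000 images include many beautiful, artistically evocative ones, and many master-level photographs also receive only average aesthetic scores. Could something still be slightly wrong in the official code?

Where can the AVA dataset be downloaded?

Where can the AVA dataset be downloaded?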

How to deploy? : )

I am running into some issues trying to deploy the code. When I try to deploy the code, the temp_wl and loss1/classifier_wl layers are initialized randomly, so the output is random and doesn't work. As you suggested, I removed the following code:

weight_filler {
  type: "xavier"
}
bias_filler {
  type: "constant"
  value: 0.2
}

But then, all the weights and biases were 0.

Could you advise me on how to properly deploy your pre-trained model? Do I need to modify deploy.prototxt? Currently, I am using deploy.prototxt and ILGnet-AVA1.caffemodel. Your test.py did not seem to work for me.
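One possible cause of randomly initialized layers, offered here as an assumption and not a confirmed diagnosis: Caffe copies trained weights by layer name, so any parameterized layer in deploy.prototxt whose name does not appear in the .caffemodel keeps its filler initialization. The hypothetical helper below lists such layers, given the two sets of names.

```python
def unmatched_layers(deploy_layer_names, model_layer_names):
    """Return deploy-net layers that have no same-named layer in the trained model."""
    trained = set(model_layer_names)
    return [name for name in deploy_layer_names if name not in trained]


# With pycaffe, the deploy-side names could be read from net.params.keys();
# the model-side names would come from parsing the .caffemodel as a NetParameter
# message. The lists below are made-up placeholders, not the real layer names.
deploy_names = ["conv1/7x7_s2", "temp_wl", "loss1/classifier_wl"]
model_names = ["conv1/7x7_s2", "temp", "loss1/classifier"]
missing = unmatched_layers(deploy_names, model_names)
```

If the returned list is non-empty, renaming those layers in deploy.prototxt to match the trained model (or using the corrected deploy.prototxt) would be the first thing to try.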

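Which pretrained model is best when fine-tuning on my own dataset?

Hello, and thank you for the paper and the code; I have learned a lot from them. This is my first time using Caffe, so there are a few things I do not understand and would like to ask about:

About data input: should I first generate LMDB files from train.txt/val.txt with something like create_imagenet.sh? Can Caffe take images directly as input?
If I want to train on my own dataset, should the pretrained model be the ILGnet-AVA1.caffemodel you provide, or a caffemodel pretrained only on ImageNet? (If the latter, where can I download it?)
Looking forward to your reply. Best wishes!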
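On the data input question, both Caffe's `convert_imageset` tool (which builds LMDB) and its ImageData layer (which reads image files directly) consume a plain text list with one `path label` pair per line. A minimal sketch for writing such a list follows; the function name and the sample paths are hypothetical, and the 0/1 label convention simply mirrors the low/high quality classes used above.

```python
def write_list_file(samples, list_path):
    """Write (relative_path, label) pairs as 'path label' lines, one per image."""
    with open(list_path, "w") as handle:
        for rel_path, label in samples:
            if label not in (0, 1):
                raise ValueError("label must be 0 (low) or 1 (high): %r" % (label,))
            if " " in rel_path:
                # The list format is space-separated, so paths with spaces are ambiguous.
                raise ValueError("path must not contain spaces: %r" % (rel_path,))
            handle.write("%s %d\n" % (rel_path, label))
```

A call such as `write_list_file([("images/0001.jpg", 1), ("images/0002.jpg", 0)], "train.txt")` produces a two-line file that can serve as the source list for either route.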
