yiling-chen / view-finding-network

A deep ranking network that learns to find good compositions in a photograph.

License: GNU General Public License v3.0

Python 100.00%

dataset evaluation deep-learning aesthetics tensorflow

view-finding-network's Introduction

View Finding Network

This repository contains the dataset and scripts used in the following article:

Yi-Ling Chen, Jan Klopp, Min Sun, Shao-Yi Chien, Kwan-Liu Ma, "Learning to Compose with Professional Photographs on the Web", in Proc. of ACM Multimedia 2017. (Supplemental)

News: Check out this PyTorch implementation if you are interested.

Dependencies

You will need TensorFlow (version > 1.0), skimage, tabulate, and pillow installed on your system to run the scripts.
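A quick way to see which of these packages are missing is sketched below. The mapping from package names to import names (`skimage` for scikit-image, `PIL` for pillow) and the helper name `missing_packages` are illustrative assumptions, and the sketch does not check the TensorFlow version.

```python
import importlib.util

# Package name -> importable module name (assumed mapping for this README).
REQUIRED = {
    "tensorflow": "tensorflow",
    "scikit-image": "skimage",
    "tabulate": "tabulate",
    "pillow": "PIL",
}


def missing_packages(required=REQUIRED):
    """Return the sorted names of packages whose module cannot be found."""
    return sorted(
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    )


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```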

Download the dataset

Clone the repository to your local disk.
Under a command line window, run the following command to get the training images from Flickr:

$ python download_images.py -w 4

The above command will launch 4 worker threads to download the images to a default folder (./images).

Training

Run create_dbs.py to generate the TFRecords files used by TensorFlow.
Run vfn_train.py to start training.

$ python vfn_train.py --spp 0

The above example starts training with SPP disabled. Alternatively, you can enable SPP with either the max or the avg pooling option.

$ python vfn_train.py --pooling max

Note that if you changed the output filenames when running create_dbs.py, you will need to provide the new filenames to vfn_train.py. Take a look at the script to check out other available parameters or run the following command.

$ python vfn_train.py -h

Evaluation

We provide an evaluation script to reproduce our evaluation results on the Flickr cropping dataset. For example,

$ python vfn_eval.py --spp false --snapshot snapshots/model-wo-spp

You will need to get sliding_window.json and the test images from the Flickr cropping dataset and specify the path of your model when running vfn_eval.py. You can also try our pre-trained model, which can be downloaded from here.

If you want to get the aesthetic score of a patch, please take a look at the example featured by ModelDepot.

Questions?

If you have questions/suggestions, feel free to send an email to (yiling dot chen dot ntu at gmail dot com).

If this work helps your research, please cite the following article:

@inproceedings{chen-acmmm-2017,
  title={Learning to Compose with Professional Photographs on the Web},
  author={Yi-Ling Chen and Jan Klopp and Min Sun and Shao-Yi Chien and Kwan-Liu Ma},
  booktitle={ACM Multimedia 2017},
  year={2017}
}

view-finding-network's People

Forkers

fangyizhang peternara marvin521 993917172 duke24k convexsetgithub a1rb4ck ak9250 shineyusong gaosandy ahuirecome xuman2019 s-p-z wlaikuan sahar-github woshidandan

view-finding-network's Issues

How to train a model using my own image dataset

Hi, thank you so much for sharing your code! I have some questions and hope you can help me.
I am trying to train a model on my own image dataset, but I am puzzled by the file dataset.pkl and do not know how to create it. Could you tell me what information the file contains and how to build my own dataset?
It would be great if you could help me. Thanks a lot.

[clarify] 1.create_database , 2.sliding_window.json

Dear Yiling,
Thanks for your help!
Can you help clarify:

how you created the database?
how sliding_window.json is generated?

Details:

how you created the database?

From the training script, I see that the loss is feature_vec multiplied by a loss matrix, and the loss matrix looks like two diagonal matrices of 1s concatenated.

Did you create the TF database such that feature_vec matmul loss_matrix means feature_vec(full) - feature(crop)?

how sliding_window.json is generated?
Is the generation script available on github too?
I saw that you mentioned faster-rcnn, which makes me wonder whether sliding_window.json is related to faster-rcnn.

BR,
JimmyYS
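The interpretation proposed in this question can be illustrated with a small plain-Python sketch. It is not taken from the repository: it assumes the batch holds B full-image scores followed by B crop scores, that the matrix is [I | -I], and that a hinge with a hypothetical margin of 1.0 is applied to the differences.

```python
def pairwise_difference_matrix(b):
    """Build the b x 2b matrix [I | -I] as nested lists (assumed layout)."""
    return [
        [(1.0 if j == i else 0.0) - (1.0 if j == i + b else 0.0)
         for j in range(2 * b)]
        for i in range(b)
    ]


def matvec(matrix, vector):
    """Multiply a nested-list matrix by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def hinge_ranking_loss(scores, margin=1.0):
    """Mean hinge loss on score(full) - score(crop).

    `scores` is assumed to contain the full-image scores first,
    followed by the scores of the matching crops.
    """
    b = len(scores) // 2
    diffs = matvec(pairwise_difference_matrix(b), scores)
    return sum(max(0.0, margin - d) for d in diffs) / b


if __name__ == "__main__":
    # Two pairs: full-image scores (2.0, 0.5), crop scores (0.0, 1.0).
    print(hinge_ranking_loss([2.0, 0.5, 0.0, 1.0]))
```

Under these assumptions, multiplying by the matrix is just a compact way to subtract each crop score from the score of its source image.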

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape

Hi, thank you for your help! I ran into an error. Running the pre-trained model model-wo-spp works, but running the other pre-trained model, model-spp-max, fails with: InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [9216,1000] rhs shape= [12544,1000]. Which parameter should I change?
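The two sizes in this error are consistent with a graph built without SPP being asked to load a checkpoint trained with SPP. The arithmetic can be checked with the short sketch below, which assumes a 256-channel final convolutional feature map (an assumption, not something confirmed in this thread).

```python
CHANNELS = 256  # assumed number of channels in the last conv feature map


def bins_per_channel(fc_input_size, channels=CHANNELS):
    """Return how many pooled values each channel contributes to the FC input."""
    if fc_input_size % channels != 0:
        raise ValueError("size is not a multiple of the channel count")
    return fc_input_size // channels


if __name__ == "__main__":
    print(bins_per_channel(9216))   # 36, e.g. a single 6 x 6 grid
    print(bins_per_channel(12544))  # 49, more bins than a single 6 x 6 grid
```

If this reading is right, the variable in the graph (lhs) expects 36 values per channel while the checkpoint (rhs) stores 49, so the evaluation script probably has to be run with SPP enabled and with the pooling mode that matches the checkpoint. The training examples in the README use --spp and --pooling; whether vfn_eval.py accepts the same options should be checked in the script itself.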

How to download ICDB dataset

How can I download the Image Cropping Database?

A problem about vfn_eval

Hello, thanks for sharing your code! I ran into a problem when trying vfn_eval: the output bounding boxes appear random even though the input image is the same, so the results differ every time I run vfn_eval. I have not been able to solve this problem.

evaluation error: InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [9216,1000] rhs shape= [12544,1000]

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [9216,1000] rhs shape= [12544,1000]

A problem when downloading images

multiprocessing.pool.MaybeEncodingError: Error sending result: '<multiprocessing.pool.ExceptionWithTraceback object at 0x0000020386853F98>'. Reason: 'TypeError("cannot serialize '_io.BufferedReader' object",)

code:
pool.map(fetch_image, URLs)

How can I solve this? Thanks.
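This error typically means that a worker raised an exception carrying an open file-like object, which cannot be sent back to the parent process. One possible workaround, sketched below, is to catch exceptions inside the worker and return only plain values. The body of `fetch_image` here is a stand-in and `safe_fetch_image` and `download_all` are hypothetical names, not functions from download_images.py.

```python
from multiprocessing.pool import ThreadPool


def fetch_image(url):
    """Stand-in for the real download function (hypothetical body)."""
    if "bad" in url:
        raise IOError("could not fetch " + url)
    return len(url)


def safe_fetch_image(url):
    """Run fetch_image and report failures as plain strings.

    Defined at module level so it could also be used with a process pool.
    """
    try:
        return (url, True, fetch_image(url))
    except Exception as exc:
        return (url, False, "%s: %s" % (type(exc).__name__, exc))


def download_all(urls, workers=4):
    """Map safe_fetch_image over the URLs with a pool of worker threads."""
    pool = ThreadPool(workers)
    try:
        return pool.map(safe_fetch_image, urls)
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    for url, ok, info in download_all(["http://example.com/a.jpg",
                                       "http://example.com/bad.jpg"]):
        print(url, "ok" if ok else "failed", info)
```

The sketch uses a thread pool, which suits I/O-bound downloads and avoids pickling altogether. The same wrapper should also work with multiprocessing.Pool, because each result is a tuple of strings, booleans, and numbers.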

the range of the output score

First of all, thanks for your work and sharing.
I'm using the output score of VFN (the output of score_func) as my evaluation parameter, but I want normalized output. So could you please tell me the upper limit of the VFN output score? Thanks a lot!
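This thread does not state an upper limit. If the score is the raw output of a ranking layer, it may have no fixed range (an assumption). One workaround is to normalize scores relative to the set of candidate crops being compared, as in this minimal sketch with a hypothetical helper `min_max_normalize`:

```python
def min_max_normalize(scores):
    """Rescale a list of scores to [0, 1] relative to the list itself.

    If all scores are equal, every item gets 0.5 since no ranking exists.
    """
    if not scores:
        return []
    lowest, highest = min(scores), max(scores)
    if highest == lowest:
        return [0.5 for _ in scores]
    span = highest - lowest
    return [(s - lowest) / span for s in scores]


if __name__ == "__main__":
    print(min_max_normalize([1.0, 3.0, 2.0]))
```

Scores normalized this way are only comparable within the same candidate set, not across different images.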

python 3 _pickle.UnpicklingError: the STRING opcode argument must be quoted

I got an error here:
db = pkl.load(open("dataset.pkl", "rb"))

I had to run dos2unix.py to convert the pkl file. Maybe you could add the dos2unix script to the project. It is listed a couple of posts down:
https://stackoverflow.com/questions/45368255/error-in-loading-pickle
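The same line-ending conversion can be done in memory before unpickling. The sketch below assumes that dataset.pkl is a text-protocol pickle whose line endings were converted to CRLF on checkout; the helper names and the `latin1` encoding choice are assumptions. Replacing bytes like this is only safe for text-protocol pickles, since it could corrupt a binary one.

```python
import pickle


def normalize_newlines(data):
    """Convert CRLF line endings to LF in a bytes object."""
    return data.replace(b"\r\n", b"\n")


def load_text_pickle(path, encoding="latin1"):
    """Load a text-protocol pickle after fixing its line endings.

    The encoding argument controls how byte strings written by
    Python 2 are decoded under Python 3.
    """
    with open(path, "rb") as handle:
        data = handle.read()
    return pickle.loads(normalize_newlines(data), encoding=encoding)


if __name__ == "__main__":
    db = load_text_pickle("dataset.pkl")
    print(type(db))
```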

Can't restore from pre-trained model

DataLossError (see above for traceback): Unable to open table file ./model-spp-max/model-spp-max.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
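This message is commonly reported when the path given for restoring is the .data-00000-of-00001 shard itself rather than the checkpoint prefix shared by the .data, .index, and .meta files. An incomplete download is another possibility. A minimal sketch for deriving the prefix, using a hypothetical helper `checkpoint_prefix`:

```python
import re

# Suffixes that TensorFlow appends to a checkpoint prefix on disk.
_CHECKPOINT_SUFFIX = re.compile(r"\.(data-\d{5}-of-\d{5}|index|meta)$")


def checkpoint_prefix(path):
    """Strip a checkpoint file suffix, leaving the prefix used for restoring."""
    return _CHECKPOINT_SUFFIX.sub("", path)


if __name__ == "__main__":
    print(checkpoint_prefix("./model-spp-max/model-spp-max.data-00000-of-00001"))
```

With the path from the error above, the value to pass as the snapshot would be ./model-spp-max/model-spp-max.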

About your pre-trained model

The pre-trained model you provide contains many tensors, such as Variable_5/Adam_1, Variable_1, ranker/fc6w, ranker/fc6b, ranker/fc6b/Adam, ranker/fc6w/Adam, and so on. Which ones should I use for fc6W and fc6b in network.py? In other words, what is the correspondence between these tensor values and the weights in network.py, and how should I use them? Thank you very much!
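Names ending in /Adam or /Adam_1 follow the pattern TensorFlow's Adam optimizer uses for its per-variable slot values, so they are optimizer state rather than network weights. The sketch below filters them out of a list of checkpoint names. The helper `model_variables` is hypothetical, and matching ranker/fc6w and ranker/fc6b to fc6W and fc6b in network.py is a guess based on the names alone.

```python
import re

# Adam slot variables ("x/Adam", "x/Adam_1") and its two scalar accumulators.
_OPTIMIZER_NAME = re.compile(r"(/Adam(_\d+)?$)|(^beta[12]_power$)")


def model_variables(names):
    """Keep only names that do not look like Adam optimizer state."""
    return [name for name in names if not _OPTIMIZER_NAME.search(name)]


if __name__ == "__main__":
    checkpoint_names = [
        "Variable_5/Adam_1", "Variable_1", "ranker/fc6w",
        "ranker/fc6b", "ranker/fc6b/Adam", "ranker/fc6w/Adam",
    ]
    print(model_variables(checkpoint_names))
```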