view-finding-network's Introduction

View Finding Network

This repository contains the dataset and scripts used in the following article:

Yi-Ling Chen, Jan Klopp, Min Sun, Shao-Yi Chien, Kwan-Liu Ma, "Learning to Compose with Professional Photographs on the Web", in Proc. of ACM Multimedia 2017. (Supplemental)

News: Check out this PyTorch implementation if you are interested.

Dependencies

You will need TensorFlow (version > 1.0), skimage, tabulate, and pillow installed on your system to run the scripts.

Download the dataset

  • Clone the repository to your local disk.
  • In a command-line window, run the following command to get the training images from Flickr:
$ python download_images.py -w 4

The above command will launch 4 worker threads to download the images to a default folder (./images).

Training

  • Run create_dbs.py to generate the TFRecords files used by TensorFlow.
  • Run vfn_train.py to start training.
$ python vfn_train.py --spp 0

The example above starts training with SPP disabled. To enable SPP instead, choose either the max or the avg pooling option:

$ python vfn_train.py --pooling max

Note that if you changed the output filenames when running create_dbs.py, you must pass the new filenames to vfn_train.py. See the script for the other available parameters, or run the following command.

$ python vfn_train.py -h

Evaluation

We provide an evaluation script to reproduce our evaluation results on the Flickr cropping dataset. For example,

$ python vfn_eval.py --spp false --snapshot snapshots/model-wo-spp

You will need to get sliding_window.json and the test images from the Flickr cropping dataset and specify the path of your model when running vfn_eval.py. You can also try our pre-trained model, which can be downloaded from here.

If you want to get the aesthetic score of a patch, please take a look at the example featured by ModelDepot.

Questions?

If you have questions/suggestions, feel free to send an email to (yiling dot chen dot ntu at gmail dot com).

If this work helps your research, please cite the following article:

@inproceedings{chen-acmmm-2017,
  title={Learning to Compose with Professional Photographs on the Web},
  author={Yi-Ling Chen and Jan Klopp and Min Sun and Shao-Yi Chien and Kwan-Liu Ma},
  booktitle={ACM Multimedia 2017},
  year={2017}
}

view-finding-network's People

Contributors

kloppjp, yiling-chen

view-finding-network's Issues

the range of the output score

First of all, thanks for your work and sharing.
I'm using the output score of VFN (the output of score_func) as my evaluation parameter, but I want normalized output. So could you please tell me the upper limit of the VFN output score? Thanks a lot!
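
The README does not state an upper limit for the score, so one workaround is to normalize empirically over the scores you actually compare. The following is a minimal sketch under that assumption; min_max_normalize is a hypothetical helper, not part of the repository, and it only rescales scores relative to each other within one set of candidate crops.

```python
def min_max_normalize(scores):
    """Rescale a list of raw scores to the range [0, 1].

    Hypothetical helper: it assumes the raw scores have no known fixed
    bounds, so the bounds are taken from the scores themselves.
    """
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All candidates scored the same; avoid dividing by zero.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


# Example: raw scores of three candidate crops from the same image.
normalized = min_max_normalize([2.0, 4.0, 3.0])
```

Values normalized this way are comparable only within the same set of candidates, not across images.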

How to train a model using my own image dataset

Hi, thank you so much for sharing your code! I have some questions and hope you could help me.
I am trying to train a model on my own image dataset, but I am puzzled by the file dataset.pkl. I don't know how to create it. Could you tell me what information the file contains and how to build my own dataset?
It would be great if you could help. Thanks a lot.

A problem about vfn_eval

Hello, thanks for sharing your code! I found a problem when I tried vfn_eval. The output bounding boxes are random even though the input image is the same: the results differ each time I run vfn_eval. I can't solve this problem.

Can't restore from the pre-trained model

DataLossError (see above for traceback): Unable to open table file ./model-spp-max/model-spp-max.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
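
A frequent cause of this message with TensorFlow 1.x checkpoints is passing the .data-00000-of-00001 shard to the restore call instead of the checkpoint prefix. Whether that is what happened here is an assumption; the sketch below only shows how to derive the prefix from a shard path. checkpoint_prefix is a hypothetical helper, not part of the repository.

```python
import re


def checkpoint_prefix(path):
    """Strip a checkpoint shard, index, or meta suffix from a path.

    Hypothetical helper: returns the path unchanged when no known
    suffix is present.
    """
    return re.sub(r"\.(data-\d{5}-of-\d{5}|index|meta)$", "", path)


# The path from the error message above, reduced to its prefix.
prefix = checkpoint_prefix("./model-spp-max/model-spp-max.data-00000-of-00001")
```

The resulting prefix (./model-spp-max/model-spp-max) is the value to try as the snapshot path.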

[clarify] 1.create_database , 2.sliding_window.json

Dear Yiling,
Thanks for your help!
Can you help clarify:

  1. how you created the database?
  2. how sliding_window.json is generated?

Details:

  1. how you created the database?

From the training script, I saw that the loss is feature_vec multiplied by a loss matrix,
and that the loss matrix looks like two diagonal matrices of 1s concatenated.

Did you create the TF database such that
feature_vec matmul loss_matrix
means
feature_vec(full) - feature(crop)?

  2. how sliding_window.json is generated?
    Is the generation script available on GitHub too?
    I saw that you mentioned Faster R-CNN, which makes me wonder whether sliding_window.json is related to Faster R-CNN.

BR,
JimmyYS
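
The reading proposed in question 1 can be checked with a small worked example. This is a sketch of that interpretation only, not confirmed against the training script: it assumes the scores are stacked as [full images..., crops...], that the matrix is [I | -I], and that a hinge with margin 1.0 is applied. All names are hypothetical.

```python
def build_pair_matrix(n):
    """Return the n x 2n matrix [I | -I] as nested lists (assumed layout)."""
    return [
        [(1.0 if j == i else 0.0) - (1.0 if j == i + n else 0.0)
         for j in range(2 * n)]
        for i in range(n)
    ]


def matvec(matrix, vector):
    """Multiply a nested-list matrix by a flat vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def hinge_ranking_loss(full_scores, crop_scores, margin=1.0):
    """Mean hinge loss over pairs; zero when full beats crop by the margin."""
    n = len(full_scores)
    # Row i of the product is score(full_i) - score(crop_i).
    diffs = matvec(build_pair_matrix(n), full_scores + crop_scores)
    return sum(max(0.0, margin - d) for d in diffs) / n


diffs = matvec(build_pair_matrix(2), [3.0, 1.0, 1.0, 2.0])
loss = hinge_ranking_loss([3.0, 1.0], [1.0, 2.0])
```

Under these assumptions, multiplying by the concatenated matrix does produce one full-minus-crop difference per pair, which is what the question describes.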

About your pre-trained model

The pre-trained model you provide contains many tensors, such as Variable_5/Adam_1, Variable_1,
ranker/fc6w, ranker/fc6b, ranker/fc6b/Adam, ranker/fc6w/Adam and so on. Which ones should I choose for fc6W and fc6b in network.py? In other words, what is the correspondence between these tensor values and the weights in network.py, and how should I use them? Thank you very much!
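
Names ending in /Adam or /Adam_1 are the slot variables that TensorFlow's Adam optimizer stores next to each trained variable, so they are optimizer state rather than model weights. A minimal sketch for filtering them out of a list of checkpoint names follows; model_variables is a hypothetical helper, and how the remaining names map to the weights in network.py is not verified here.

```python
import re

# Matches Adam slot suffixes such as "/Adam" and "/Adam_1".
_ADAM_SLOT = re.compile(r"/Adam(_\d+)?$")


def model_variables(names):
    """Keep only names that are not Adam optimizer slots."""
    return [n for n in names if not _ADAM_SLOT.search(n)]


names = [
    "Variable_5/Adam_1", "Variable_1",
    "ranker/fc6w", "ranker/fc6b",
    "ranker/fc6b/Adam", "ranker/fc6w/Adam",
]
weights = model_variables(names)
```

For the names listed in the post, this leaves Variable_1, ranker/fc6w and ranker/fc6b.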

A problem while downloading images

multiprocessing.pool.MaybeEncodingError: Error sending result: '<multiprocessing.pool.ExceptionWithTraceback object at 0x0000020386853F98>'. Reason: 'TypeError("cannot serialize '_io.BufferedReader' object",)

code:
pool.map(fetch_image, URLs)

How can I solve this? Thanks.
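
This error usually means that a worker raised or returned an object holding an open file handle (an HTTP error object is a typical case), which the pool cannot pickle when sending it back to the parent process. One possible workaround, sketched below, is to catch exceptions inside the worker and return only plain values. safe_fetch, fetch_all and fake_fetch are hypothetical names; fake_fetch stands in for the repository's fetch_image, and a thread pool is used only to keep the demo self-contained.

```python
import pickle
from multiprocessing.pool import ThreadPool


def safe_fetch(url, fetch):
    """Call fetch(url) and return a plain (url, ok, message) tuple."""
    try:
        fetch(url)
        return (url, True, "")
    except Exception as exc:
        # Return text instead of the exception object, which may hold
        # an open file handle that cannot be pickled.
        return (url, False, "{}: {}".format(type(exc).__name__, exc))


def fetch_all(urls, fetch, workers=4):
    """Run safe_fetch over all URLs with a small worker pool."""
    with ThreadPool(workers) as pool:
        return pool.map(lambda u: safe_fetch(u, fetch), urls)


def fake_fetch(url):
    # Stand-in downloader: fails for URLs containing "missing".
    if "missing" in url:
        raise IOError("HTTP 404")


results = fetch_all(
    ["http://example.com/a.jpg", "http://example.com/missing.jpg"],
    fake_fetch,
)
pickle.dumps(results)  # works: results contain only tuples, bools and strings
```

With a process pool, the wrapper passed to pool.map would need to be a top-level function rather than a lambda, since lambdas cannot be pickled either.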
