catrank's Introduction

What's in here?

This repo contains pretrained image-only and image + text models that predict relative upvotes on Reddit. The models were trained on /r/pics, /r/aww, /r/cats, /r/FoodPorn, /r/MakeupAddiction, and /r/RedditLaqueristas, so if you want to know whether an image would probably be upvoted within these communities, you've come to the right place! If you want to read more about the technical details, check out the project page and paper here.

What is required to run this package?

To install requirements, run

pip install -r requirements.txt

How do I score images?

If you want to score an image according to the /r/aww community, run

python score_example.py examples/bodhi.jpg aww

which outputs:

examples/bodhi.jpg		34.8/100

The first column is the filename, and the second column is the image's score out of 100 (higher is better). The score is a percentile: it reports where the image's score falls among the scores on a test split.
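To make the percentile idea concrete, here is a minimal sketch of how a raw score could be turned into a score out of 100. It is not taken from `score_example.py`: the function name `percentile_of` and the test-split values are hypothetical, and the repo's actual computation may differ.

```python
from bisect import bisect_right


def percentile_of(raw_score, test_scores):
    """Return the percentage of test-split scores that are <= raw_score."""
    ordered = sorted(test_scores)
    rank = bisect_right(ordered, raw_score)  # count of scores <= raw_score
    return 100.0 * rank / len(ordered)


# Hypothetical raw model outputs on a held-out test split (made-up numbers).
test_scores = [-1.2, -0.4, 0.0, 0.3, 0.9, 1.5, 2.1, 2.8]

print(f"{percentile_of(0.5, test_scores):.1f}/100")  # prints 50.0/100
```

Under this reading, a score of 34.8/100 would mean the image scored at or above roughly 35% of the test images for that subreddit.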

How do I score images plus their captions?

If you want to score a cat alongside a caption according to the /r/cats community, you can do

python score_example.py examples/taz.jpg cats --caption "Please don't sit on me!"

which outputs:

examples/taz.jpg		please dont sit on me		55.8/100
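Note that the caption is echoed in lowercase with its punctuation removed. The sketch below shows one normalization that is consistent with that output. It is an assumption for illustration only: `normalize_caption` is a hypothetical helper, and the preprocessing inside `score_example.py` may be different.

```python
import string


def normalize_caption(caption):
    """Lowercase a caption and drop ASCII punctuation (assumed preprocessing)."""
    table = str.maketrans("", "", string.punctuation)
    # split/join collapses any repeated whitespace left behind
    return " ".join(caption.lower().translate(table).split())


print(normalize_caption("Please don't sit on me!"))  # prints: please dont sit on me
```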

How do I score lots of images/captions?

If you want to score many images/captions at once, pass --list_mode True. In this mode, the image and caption arguments are treated as paths to text files. The image text file has one filename per line, and the caption text file has one caption per line. The first line of the image file should correspond to the first line of the caption file, and so on. For example, you can run

python score_example.py examples/example_image_list.txt --caption examples/example_caption_list.txt cats --list_mode True

which outputs:

examples/bodhi.jpg		who says bulldogs cant be c...	22.1/100
examples/lizzy.jpg            	my 20 year old little girl ...	99.4/100
examples/taz.jpg              	please dont sit on me         	55.8/100

Unsurprisingly, the model doesn't like a dog (Bodhi) being posted in /r/cats, though the model likes the story about an elderly cat (Lizzy). As an interesting experiment, you can check the effect the captions had on the scores by running

python score_example.py examples/example_image_list.txt cats --list_mode True

and comparing to the previous output.
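If you save both outputs, the comparison can be scripted. The following is a rough sketch, assuming each output line is tab-separated with the filename first and the `NN.N/100` score last (as in the examples above). The helpers `parse_scores` and `caption_effect` are hypothetical, and the sample text uses made-up filenames and numbers rather than real model scores.

```python
def parse_scores(output_text):
    """Map filename -> score, assuming tab-separated lines ending in 'NN.N/100'."""
    scores = {}
    for line in output_text.splitlines():
        fields = [f.strip() for f in line.split("\t") if f.strip()]
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        filename, score_field = fields[0], fields[-1]
        scores[filename] = float(score_field.split("/")[0])
    return scores


def caption_effect(with_captions, image_only):
    """Return filename -> (score with caption - score without) for shared files."""
    return {
        name: round(with_captions[name] - image_only[name], 1)
        for name in with_captions
        if name in image_only
    }


# Made-up output text for illustration only; these are not real model scores.
run_with = "a.jpg\t\tsome caption\t60.0/100\nb.jpg\t\tother caption\t35.5/100\n"
run_without = "a.jpg\t\t50.0/100\nb.jpg\t\t40.0/100\n"

effect = caption_effect(parse_scores(run_with), parse_scores(run_without))
for name, delta in effect.items():
    print(f"{name}\t{delta:+.1f}")
```

In practice you would redirect each command's output to a file and pass the file contents to `parse_scores` instead of the inline strings. A positive difference means the caption raised the score.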

I want to train my own models!

If you want to train your own models, you'll need to get the datasets that these were trained on, which are not in this repo. They are available for download here.

Citation and contact

If you find the models here useful, please cite our paper!

@inproceedings{hessel2017cats,
	title={Cats and Captions vs. Creators and the Clock: Comparing Multimodal Content to Context in Predicting Relative Popularity},
	author={Hessel, Jack and Lee, Lillian and Mimno, David},
	booktitle={Proceedings of the 26th International Conference on the World Wide Web},
	year={2017},
	organization={International World Wide Web Conferences Steering Committee}
}

If you have any questions, you can contact [email protected]
