
goodreads_genre_recognition's Introduction

701project

Task: Recognising a book's genre, based on the Goodreads dataset. We use several kinds of features to recognise the genre: text features (description and title), image features (book covers), and numerical features (such as number of pages, ratings, and more).

Important: The data files are on GitHub, under the data directory. Take the entire contents of that directory from the repository.

GitHub: dgabsi/701project

(Updates were made from danielaneuralx, which is my working GitHub account, but all of the work is mine.)

Main notebooks are:

Please run them in this order (the first one preprocesses the data):

  • eda_goodreads.ipynb (exploration of the dataset)
  • inference_goodreads.ipynb (learning models)

Report: 701projectreport_bookgenre.pdf

Project structure:

  • root
    • data
      • images-source (directory; don't delete. It holds the dataset images and exists on GitHub)
        • 1.jpg (image file)
        • 2.jpg (image file)
        • ..(230K images)
        • books_images_names.csv
        • goodreads_imagestxt.txt
      • images-train (directory; it and its contents will be created)
      • images-val (directory; it and its contents will be created)
      • images-test (directory; it and its contents will be created)
      • goodreads_books_eng_f1.csv (don't delete; this is the first dataset CSV)
      • goodreads_books_eng_f2.csv (don't delete; this is the second dataset CSV)
    • goodreads (package)
      • __init__.py
      • baseline.py
      • conv_goodreads.py
      • custom_nn_with_embeddings
      • results_utils.py
      • utils.py
    • configuration.yml (very important, don't delete; it holds the hyperparameter configuration and general parameters)
    • eda_goodreads.ipynb
    • inference_goodreads.ipynb
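
Before running the notebooks, it may help to confirm that the required inputs are in place. The following is a minimal sketch and is not part of the repository: the helper name `missing_inputs` is hypothetical, and the paths are simply the ones marked "don't delete" in the structure listed above.

```python
from pathlib import Path

# Paths the README marks as required inputs (relative to the project root).
REQUIRED_PATHS = [
    "configuration.yml",
    "data/goodreads_books_eng_f1.csv",
    "data/goodreads_books_eng_f2.csv",
    "data/images-source",
]


def missing_inputs(root="."):
    """Return the required paths that do not exist under `root`."""
    root = Path(root)
    return [p for p in REQUIRED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_inputs(".")
    if missing:
        print("Missing required inputs:", ", ".join(missing))
    else:
        print("All required inputs found.")
```

Run it from the project root. The images-train, images-val and images-test directories are deliberately not checked, since they are created later.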

Required packages:

  • tensorflow
  • matplotlib
  • scikit-learn
  • pyyaml
  • numpy
  • pandas
  • tensorflow-addons (for the F1 metric)
  • nltk
  • gensim
  • spacy
  • os, pickle, shutil (Python standard library; no installation needed)

You should also run: python -m spacy download en_core_web_md
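
Some of the packages above are imported under a name that differs from the install name (scikit-learn as `sklearn`, pyyaml as `yaml`, tensorflow-addons as `tensorflow_addons`). As a minimal sketch that assumes nothing about the repository's own code, the hypothetical helper `missing_modules` below reports which modules cannot be found in the current environment. It does not check the spaCy model.

```python
import importlib.util

# Import names for the third-party packages listed above.
# Note: some differ from the pip install names.
THIRD_PARTY_MODULES = [
    "tensorflow",
    "matplotlib",
    "sklearn",            # installed as scikit-learn
    "yaml",               # installed as pyyaml
    "numpy",
    "pandas",
    "tensorflow_addons",  # installed as tensorflow-addons
    "nltk",
    "gensim",
    "spacy",
]


def missing_modules(names):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(THIRD_PARTY_MODULES)
    print("Missing modules:", missing if missing else "none")
```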

For any problem or question, please contact me at [email protected]

Contributors: danielaneuralx, dgabsi
