
bankclassify's Introduction

BankClassify - automatically classify your bank statement entries

Note: This is not 'finished' software. I use it for dealing with my bank statements, but it is not 'production-ready' and may crash or do strange things. It is also set up for my particular usage, so it may not work for you. However, I hope it will be a useful resource.

This code classifies each entry in your bank statement into categories such as 'Supermarket', 'Petrol', 'Eating Out' etc. It learns from previously classified data and from the corrections you make when it guesses a category incorrectly, so its performance improves over time.

How to use

  1. Install the required libraries: pip install -r requirements.txt

  2. Run the code in example.py as a demonstration. This will interactively classify the example bank statement data in Statement_Example.txt and save the results in AllData.csv. In the interactive classification you will be presented with a list of categories (with ID numbers), the details of a transaction, and a guessed category. You have three choices:

    • To accept the guessed category, just press Enter
    • To correct the classifier to a category that is in the list shown, enter the ID number of the category and press Enter
    • To add a new category, type the name of the category and press Enter
  3. Examine the output in AllData.csv manually, or run bc._prep_for_analysis() and look at bc.in and bc.out for incomings and outgoings respectively. Both have a cat column containing the category.
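
The three choices in the interactive step can be sketched as a small function. This is only an illustration of the decision logic described above, not the code BankClassify actually uses: the name resolve_choice, the dict of ID numbers to category names, and the way a new ID is assigned are all assumptions made for the example.

```python
def resolve_choice(user_input, guess, categories):
    """Turn the text typed at the prompt into a category name.

    categories is a hypothetical dict mapping ID number -> category name.
    """
    text = user_input.strip()

    # Choice 1: just pressing Enter accepts the guessed category.
    if text == "":
        return guess

    # Choice 2: an ID number selects an existing category from the list.
    if text.isdecimal():
        cat_id = int(text)
        if cat_id not in categories:
            raise ValueError(f"No category with ID {cat_id}")
        return categories[cat_id]

    # Choice 3: anything else is treated as the name of a new category.
    # Assumption: new categories get the next free ID number.
    new_id = max(categories, default=0) + 1
    categories[new_id] = text
    return text


categories = {1: "Supermarket", 2: "Petrol", 3: "Eating Out"}
accepted = resolve_choice("", "Petrol", categories)      # Enter pressed
corrected = resolve_choice("3", "Petrol", categories)    # ID number entered
added = resolve_choice("Rent", "Petrol", categories)     # new name entered
```

Keeping this logic separate from the prompt itself makes it easy to check each of the three cases without typing anything interactively.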

To use it with your own data:

  • If you use Santander UK as your bank: just run bc.add_data(filename) with the filename of your downloaded statement file. Delete AllData.csv first, though, or the example data will be used as part of the training data.
  • If you use another bank: write your own function to read the statement data from your bank. It must return a pandas DataFrame with the columns date, desc and amount. Add this function to the BankClassify class and call it instead of _read_santander_file.
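
As a rough sketch of what such a reader might look like, the function below parses a CSV export and returns a DataFrame with the date, desc and amount columns. The function name read_mybank_file, the input column names (Date, Description, Amount) and the DD/MM/YYYY date format are assumptions for illustration only; adjust them to whatever your bank actually exports.

```python
import io

import pandas as pd


def read_mybank_file(source):
    """Read a hypothetical bank CSV export into date/desc/amount columns."""
    raw = pd.read_csv(source)

    # Assumed input headers; rename them to the names BankClassify expects.
    df = raw.rename(
        columns={"Date": "date", "Description": "desc", "Amount": "amount"}
    )

    # Assumed date format DD/MM/YYYY; change the format string if yours differs.
    df["date"] = pd.to_datetime(df["date"], format="%d/%m/%Y")
    df["amount"] = df["amount"].astype(float)

    # Return only the three required columns, in a fixed order.
    return df[["date", "desc", "amount"]]


# Small in-memory sample standing in for a downloaded statement file.
sample = io.StringIO(
    "Date,Description,Amount\n"
    "03/01/2020,TESCO STORES,-23.50\n"
    "04/01/2020,SALARY,1500.00\n"
)
statement = read_mybank_file(sample)
```

Written as a method on the BankClassify class, the same body would take self as its first argument.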

Known issues

For Barclays, the CSV file sometimes contains commas within the 'memo' (transaction description) column. You can either patch your data manually before you run the tool, or accept that the workaround currently implemented may discard information that follows the comma.
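
One possible way to patch the data beforehand is to merge the surplus fields back into the memo column and write the row out with proper quoting. This is a sketch, not the workaround implemented in BankClassify: the function names, the example column layout and the position of the memo column are all assumptions, and it only works if the memo is the only column that can contain stray commas.

```python
import csv
import io


def repair_row(fields, n_columns, memo_index):
    """Merge surplus fields back into the memo column of one row."""
    extra = len(fields) - n_columns
    if extra <= 0:
        return list(fields)  # row already has the expected width
    # The memo was split into (extra + 1) pieces; join them back together.
    memo = ",".join(fields[memo_index:memo_index + extra + 1])
    return fields[:memo_index] + [memo] + fields[memo_index + extra + 1:]


def patch_csv(text, memo_index):
    """Return CSV text in which memos containing commas are quoted."""
    rows = list(csv.reader(io.StringIO(text)))
    n_columns = len(rows[0])  # assumption: the header row has the right width
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in rows:
        writer.writerow(repair_row(row, n_columns, memo_index))
    return out.getvalue()


# Hypothetical layout with the memo in the middle column (index 1).
broken = "date,memo,amount\n01/01/2020,SHOP, LONDON,-5.00\n"
patched = patch_csv(broken, memo_index=1)
```

After patching, a standard CSV reader sees the memo as a single field, so nothing after the comma is lost.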

bankclassify's People

Contributors

robintw, takesthebiscuit, tornikenats, mudroprogramer
