Git Product home page Git Product logo

icsv2ledger's Introduction

icsv2ledger

This is a command-line utility to convert CSV files of transactions, such as you might download from an online banking service, into the format used by John Wiegley's excellent Ledger system.

The 'i' stands for interactive. Here's what it's designed to do:

  • For each CSV input you give it, it creates a Ledger output.

  • As it runs through the entries in the CSV file, it tries to guess which Ledger account and Ledger payee they should be posted against, based on your historical decisions.

  • It shows you which account, payee, (optionally tags), it's going to use, giving you the opportunity to change it. If it got it right, just hit return.

  • When you are entering an account/payee name, you get auto-completion if you press the Tab key. You don't have to match the start of the name, so typing 'foo[tab]' inserts 'Expenses:Food'.

  • When you are entering a tag, you get auto-completion if you press the Tab key. If you would like to remove a tag from the list of tags just prefix the tag with a minus '-'. When you are done with the tags just hit return.

  • It stores the history of auto-completion in a mapping file, for converting transaction descriptions onto payee/account(/tag) names. You can also edit this by hand. It can load this in future as the basis of its guesses.

  • The payee/account names used in the auto-completion are read both from the mapping file and, optionally, from a Ledger file or files. It runs ledger payees and ledger accounts to get the names. It can also scan a separate ledger file containing 'account' definitions. The tags are only read from the mapping file.

Synopsis

icsv2ledger.py [options] -a STR [infile [outfile]]

Arguments summary

infile                input filename or stdin in CSV syntax
outfile               output filename or stdout in Ledger syntax

Options summary

Options can either be used from command line or in configuration file. --account is a mandatory option on command line. --config-file, --src-account and --help are only usable from command line.

--account STR, -a STR
                      ledger account used as source
--src-account STR
                      ledger account used as source, overrides --account option
--clear-screen, -C    clear screen for every transaction
--cleared-character {*,!, }
                      character to clear a transaction
--config-file FILE, -c FILE
                      configuration file
--credit INT          CSV column number matching credit amount
--csv-date-format STR
                      date format in CSV input file
--csv-decimal-comma   comma as decimal separator in the CSV
--currency STR        the currency of amounts
--date INT            CSV column number matching date
--debit INT           CSV column number matching debit amount
--default-expense STR
                      ledger account used as destination
--delimiter           CSV delimiter
--desc STR            CSV column number matching description
--effective-date INT  CSV column number matching effective date
--encoding STR        text encoding of CSV input file
--incremental         append output as transactions are processed
--ledger-binary FILE  path to ledger binary  
--ledger-date-format STR
                      date format for ledger output file
--ledger-decimal-comma
                      comma as decimal separator in the ledger
--ledger-file FILE, -l FILE
                      ledger file where to read payees/accounts
--mapping-file FILE   file which holds the mappings
--accounts-file FILE  file which holds a list of allowed accounts
--quiet, -q           do not prompt if account can be deduced
--reverse             reverse the order of entries in the CSV file
--skip-dupes          detect transactions that have already been imported and skip
--confirm-dupes       detect transactions that have already been imported and prompt to skip
--skip-lines INT      number of lines to skip from CSV file
--skip-older-than     skip entries more than X days old
--tags, -t            prompt for transaction tags
--template-file FILE  file which holds the template
--prompt-add-mappings prompt before adding entries to mapping file
--entry-review        displays summary of ledger formatted entry and prompts before committing
-h, --help            show this help message and exit

Options

Options can either be used from command line or in configuration file. From command line the syntax is --long-option VALUE with dashes, and in configuration file the syntax is long_option=VALUE with underscores.

There is an order of precedence for options. First hard coded defaults (documented below) are used, overridden by options from configuration file if any, and finally overridden by options from command line if any.

--account STR, -a STR

is the ledger account used as source for ledger transactions. This is the only mandatory option on command line. Default is Assets:Bank:Current.

When used from command line, it is both the section name in configuration file and the account name. Account name could then be overridden in configuration file. See section Configuration file example where SAV from command line is overridden with account=Assets:Bank:Savings Account.

--src-account STR

similar to --account option, it is the ledger account used as source for ledger transactions but allows the --account option to be overridden after the config file has been parsed. This is a command-line only option and must not be provided in any section of the config file. Use of this option allows users to treat sections of the config file as generic import recipes that can be used to import all files that use the same layout while providing a means to specify the ledger source account to use during the importing of transactions.

--clear-screen, -C

will clear the screen before every prompting. Default is False.

--cleared-character {*,!, }

is the character to mark a transaction as cleared. Ledger possible value are * or ! or . Default is *.

--config-file FILE, -c FILE

is configuration filename.

The file used will be first found in that order:

  1. Filename given on command line with --config-file,
  2. .icsv2ledgerrc in current directory,
  3. .icsv2ledgerrc in home directory.

--credit INT

is the CSV file column which contains credit amounts. The first column in the CSV file is numbered 1. Default is 4.

See also documentation of --debit option for negating amounts.

--csv-date-format STR

describes the date format in the CSV file.

See the python documentation for the various format codes supported in this expression.

--csv-decimal-comma

will assume that number use the comma ',' as decimal in the csv.

If the --ledger-decimal-comma option is not set, comma will be converted into dot.

--currency STR

is the currency of amounts. Default is locale currency_symbol.

--date INT

is the CSV file column which contains the transaction date. Default is 1.

--debit INT

is the CSV file column which contains debit amounts. Default is 3.

If your bank writes all amounts in same column, credits as positive amounts and debits as negative amounts, then set credit to correct column and debit to 0.

If your bank writes debits as a negative number and you want to negate the amount, then use --debit=-3. It will negate amounts in column 3 and use them as debits amounts.

--default-expense STR

is the default ledger account used as destination (generally an expense) for ledger transactions. Default is Expenses:Unknown.

--delimiter STR

is the CSV delimiter character. Default is ,. Special characters can be expressed using standard escape sequences, such as \t for a tab.

--desc STR

is the CSV file column which contains the transaction description as supplied by the bank. Default is 2.

This description will be used as the input for determining which payee and account to use by the auto-completion.

It is possible to provide a comma separated list of CSV column indices (like desc=2,5) that will concatenate fields in order to form a unique description. That enriched description will serve as base for the mapping.

--effective-date INT

is the CSV column number which contains the date to be used as the effective date. Default is 0. Use of this option currently requires a template file. See section Transaction template file.

--encoding STR

is the text encoding of the CSV input file. Default is utf-8. The encoding should be specified if the CSV file contains non-ASCII characters (typically in the transaction description) in an encoding other than UTF-8.

--incremental

appends output as transactions are processed. The default flow is to process all CSV input and then output the result. When --incremental is specified, output is written after every transaction. This allows one to stop (ctrl-c) and restart to progressively process a CSV file (--skip-dupes is a useful companion option).

--ledger-binary

is the path to the ledger binary. Not neccessary if it is in PATH or is at either /usr/bin/ledger or /usr/local/bin/ledger

--ledger-date-format STR

describes the date format to be used when creating ledger entries. If --ledger-date-format is defined, then --csv-date-format must also be defined to be able to convert dates. If --ledger-date-format is not defined, then the date from CSV file is reused.

See the python documentation for the various format codes supported in this expression.

--ledger-decimal-comma

will assume that number should be print using the comma ',' as decimal when creating ledger entries.

If the --csv-decimal-comma option is not set, dot will be converted into comma.

--ledger-file FILE, -l FILE

is ledger filename where to get the list of already defined accounts and payees.

The file used will be first found in that order:

  1. Filename given on command line with --ledger-file,
  2. .ledger in current directory,
  3. .ledger in home directory.

--mapping-file FILE

is the file which holds the mapping between the description and the payee/account names to use. See section Mapping file.

The file used will be first found in that order:

  1. Filename given on command line with --mapping-file,
  2. .icsv2ledgerrc-mapping in current directory,
  3. .icsv2ledgerrc-mapping in home directory.

Warning: the file must exists so that mappings are added to the file.

--accounts-file FILE

is an optional file that can be used to hold a master list of all account names, and will be used as a source for account names. See section Accounts file.

The file used will be first found in that order:

  1. Filename given on command line with --accounts-file,
  2. .icsv2ledgerrc-accounts in current directory,
  3. .icsv2ledgerrc-accounts in home directory.

--quiet, -q

will not prompt if account can be deduced from existing mapping. Default is False.

--reverse

will print ledger entries in reverse of their order in the CSV file.

--skip-dupes

will attempt to detect duplicate transactions in ledger file by comparing MD5Sum of transactions. The MD5Sum is calculated from the formatted CSV values including the source account. The source account is included to avoid false positives on generic transaction descriptions when the source account is different and thus should not be considered a duplicate. MD5Sum of existing transactions are as a ; MD5Sum: ... comment in the current ledger file (which means your output template will need this comment). This can help if you download statements without using a precise date range. A useful pattern is to include MD5Sum comments for both "sides" of a transaction if you download from multiple sources that resolve to a single transaction (e.g. paying a credit card from checking). Note: use of this flag by itself will detect and skip duplicate entries automatically with no interaction from user. If you want to be prompted and determine whether to skip or not see --confirm-dupes.

--confirm-dupes

same as --skip-dupes but will prompt user to indicate if they want the detected duplicate entry to be skipped or treated as a valid entry. This is useful when importing transactions that commonly contain generic descriptions.

--skip-lines INT

is the number of lines to skip from the beginning of the CSV file. Default is 1.

--tags, -t

will interactively prompt for transaction tags. Default is False.

The normal behavior is for one description to prompt for payee and account, and store this in mapping file. By setting this option, the description can also be mapped to additional tags.

At the prompt: fill a tagname and press Enter key as many times as you need tags. Remove an existing tag by preceding it with minus, like -tagname. When finished, press Enter key on an empty line.

This --tags option only prompt for tags. You have to add ; {tags} in your template to make tags appear in generated Ledger transactions.

--template-file FILE

is template filename, which contains the template to use when generating ledger transactions. See section Transaction template file.

The file used will be first found in that order:

  1. Filename given on command line with --template-file,
  2. .icsv2ledgerrc-template in current directory,
  3. .icsv2ledgerrc-template in home directory.

--skip-older-than DAYS

will not process any entries in the CSV file which are more than DAYS old. If DAYS is negative then the entire CSV file is processed.

--prompt-add-mappings

will prompt user before adding entries to the mapping file. This is useful when you would prefer to manually adjust an existing entry or add the entry manually to the mapping file.

--entry-review

allows the ability to review the generated ledger entry and Commit, Modify or Skip the entry. If the entry is not committed then the values for payee, account and optionally tags is prompted for again.

Example

The below command will use the [SAV] section of the configuration file to process the CSV file.

./icsv2ledger.py -a SAV file.csv

Configuration file example

The following is an example configuration file where you can save your icsv2ledger's options.

A configuration file typically contains one section per bank account to be imported. In the below example there are two bank accounts: SAV and CHQ.

[SAV]
account=Assets:Bank:Savings Account
currency=AUD
date=1
csv_date_format=%d-%b-%y
ledger_date_format=%Y/%m/%d
desc=6
credit=2
debit=-1
mapping_file=mappings.SAV

[SAV_addons]
beneficiary=3
purpose=4


[CHQ]
account=Assets:Bank:Cheque Account
currency=AUD
date=1
csv_date_format=%d/%m/%Y
ledger_date_format=%Y/%m/%d
desc=2
credit=3
debit=4
mapping_file=mappings.CHQ
skip_lines=0

Addons

In section Configuration file example the SAV_addons section enables to save a CSV field value to a tag value. Those tags can then be used, for the SAV account, in your own transaction template:

 ; purpose: {addon_purpose}
 ; beneficiary: {addon_beneficiary}

Mapping file

A typical mapping file might look like:

/SAFEWAY/,Safeway,Expenses:Food
/ITUNES.*/,iTunes,Expenses:Entertainment
THE WRESTLERS INN,"The ""Wrestlers"" Inn",Expenses:Food
/MACY'S/,"Macy's, Inc.",Expenses:Food
MY COMPANY 1234,My Company,Income:Salary
MY COMPANY 1234,My Company 1234,Income:Salary:Tips
MY TRANSFER 1,Transfer to Savings,Transfers:Savings,transfer_to=Assets:Savings

It uses simple string-matching by default, but if you put a '/' at the start and end of a string it will instead be interpreted as a regular expression.

Mapping is based on your historical decisions. Later matching entries overwrite earlier ones, that is in example above MY COMPANY 1234 will be mapped to My Company 1234 and Income:Salary:Tips.

Experimental You can use transfer_to= to another asset to make the transfer to record in a "transfer" double-entry pattern. In the example above for the Transfers:Savings account with the transfer_to=Assets:Savings would create the following entries:

2012/01/01 Transfer to Savings Transfers:Savings $100 Assets:Checking

2012/01/01 Transfer to Savings Assets:Savings $100 Transfers:Savings

You can additionally add a file= value after transfer_to= to write the second entry in another file. This is useful if you split your accounts per file and want to write the first transaction in the checking file and the second in the savings file.

Accounts File

To prevent inconsistencies it is possible to user ledger --strict mode, along with a file that defines a list of allowable accounts. (See the ledger 3 manual, section 4.6 'Keeping it Consistent')

The accounts file should look like:

account Expenses:Food
account Expenses:Entertainment
account Income:Salary
account Income:Salary:Tips

All other lines will be ignored so you if you have a single ledger file that has account definitions mixed throughout it, it is safe (although potentially time consuming) to pass it to icsv2ledger as the accounts-file.

Transaction template file

The built-in default template is as follows:

{date} {cleared_character} {payee}
    ; MD5Sum: {md5sum}
    ; CSV: {csv}
    {debit_account:<60}    {debit_currency} {debit}
    {credit_account:<60}    {credit_currency} {credit}
    {tags}

Details on how to format the template are found in the Format Specification Mini-Language.

The values that can be used are: date, effective_date, cleared_character, payee, transaction_index, debit_account, debit_currency, debit, credit_account, credit_currency, credit, tags, md5sum, csv. And also the addon tags like addon_xxxx. See section Addons.

Runtime Requirements

icsv2ledger should work in a vanilla Python 2.7 or 3.x environment, as it uses only base packages.

Note that the 'ledger' binary must be installed in the local PATH in which icsv2ledger is used, as the binary is invoked for various operations.

Contributing

Feedback/contributions most welcome.

Known Issues

On Mac OS X when CSV is passed via stdin to icsv2ledger you may not see any prompts offering defaults and asking for your input. This is due to an inferior readline library (libedit) installed by default on Mac OS X. Install a proper readline library and you're good to go.

% sudo easy_install readline

On Windows the default Python installation does not provide a readline library. The pyreadline library provides native python emulation of this functionality and must be installed to run this utility.

Author

icsv2ledger was originally created by Quentin Stafford-Fraser but includes valuable contributions from many others, including Peter Ross, Alexis Hildebrandt, Thierry and Eric Entzel.

See also

ledger, hledger

icsv2ledger's People

Contributors

afh avatar ajamian avatar belidzs avatar d-e-s-o avatar daucourt avatar eentzel avatar garyp avatar hfs avatar iartarisi avatar kalafut avatar kevincaseiras avatar mlamby avatar mondjef avatar petdr avatar peterdc avatar quentinsf avatar riclage avatar tbm avatar thdox avatar themiurgo avatar tikank avatar vanicat avatar wrycta avatar zacchiro avatar zdw avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

icsv2ledger's Issues

Mapping Regex is not matching

If the pattern you are try to match is at the start of the string then it will match otherwise it will not find an match

Patten: /TFR-TO 1234456/

Desciption to check for match: "WU-005 TFR-TO 123456"

will not work unless I change the match code to search as follows:

            if m[0].search(entry.desc, 0):
                payee, account, tags = m[1], m[2], m[3]
                found = True

thanks

Brian

please support "-tag" syntax to remove metadata values

[ context: this thread on the Ledger-cli maling list https://groups.google.com/forum/#!topic/ledger-cli/AfGKE2Svv0s ]

To remove a metadata tag "FOO", icsv2ledger supports the "-FOO" syntax, which is great.
However, to remove a metadata value "FOO:BAR", one currently has to write "-FOO:BAR" which can get quite tedious when "BAR" is a long string. It would be nice if icsv2ledger would support the "-FOO" syntax to remove the whole "FOO:BAR" metadata value.

With many thanks for maintaining icsv2ledger!

Use credit account from input/mapping instead of global one

I want to use budgets like:

assets:budgets:food
assets:budgets:clothing

which means I want to use a different credit account for each transaction.

Currently this is not possible because the global variable self.credit_account is used in the output file.
The expexted input prompt could be something like:

2017/12/22 Food Market                       -9
Payee [Food Market] > food_market
Debit Account [Expenses:Food] > expenses:food
Credit Account [Assets:Budgets:Food] > assets:budgets:food

Using libedit on Mac OS X breaks raw_input prompt

Hi guys,
I've been using icsv2ledger some more and noticed that on my workstation running Mac OS X
the prompts from raw_input are not presented when I pass in csv data via stdin.

easy_installing readline support fixes the problem. It seems the libedit that Apple ships by default
is inferior to the official readline. Do any of you also use Mac OS X? And can you verify the problem?
if yes I think the README should include a little note on the issue.

Here's a little test script I used

 #! /usr/bin/env python
 #  when readline or rlcomleter are imported the prompt given to raw_input does not show
 # on a Mac OS X system that by default ships with libedit.
 # % easy_install readline
 # fixes the problem

import sys
import readline
import rlcompleter

def main():
  message = sys.stdin.readlines()
  sys.stdin = open('/dev/tty')
  foo = raw_input('prompt> ')

if __name__ == "__main__":
      main()

Provide a working example

Nothing fancy, just 3 lines of csv converted to 3 entries of ledger.

I tried for 20 Minutes now and couldn't get anything working, so it would be nice to see how it is intended to work.

metadata values inlined on a single line, without separator

I recently updated to the current HEAD of icsv2ledger, and tags formatting no longer works properly for me. I'm using the following template:

{date} {cleared_character} {payee}
    ; Label: {addon_label}
    {tags}
    {debit_account:<35}    {debit:>9} {debit_currency}
    {credit_account:<35}    {credit:>9} {credit_currency}

And when multiple tags are associated to a transaction, I get entries like this:

2017-04-19 * train ticket
    ; Label: some stuff that looks OK
    ; Author: Person1Card: 12345678X-Payee: RATP
    Expenses:Transport:Train
    Assets:Cmut:Checking                      -10.00 EUR

The actual tags I entered in icsv2ledger here are "Author", "Card", and "X-Payee"; all are actually metadata values.
Note the lack of spacing between tags on line 3.

I see two issues here:

  1. metadata values (as opposed to tags) should be on separate lines to be valid (AFAICT from the ledger manual)
  2. either way, when tags are on a single line, they should be separated to avoid creating spurious tags that do not correspond to what you entered in icsv2ledger

ability to extract column substrings

[ context: this thread on the Ledger-cli maling list https://groups.google.com/forum/#!topic/ledger-cli/AfGKE2Svv0s ]

Currently, the granularity of icsv2ledger addons is that of columns: extra columns present in the input CSV can be referenced from templates.
It would be nice if addons granularity could be more fine grained than that, for instance by supporting (named) grouping in regexps and adding the ability to reference them from templates. This would allow very flexible information extraction for banks that put a lot of information in transaction descriptions.

With many thanks for maintaining icsv2ledger!

Credit/debit columns using $0.00 instead of blank creates unbalanced transaction

My bank exports csv files with credit/debit columns, and the unused column contains 0.00, rather than being blank. An example file:

Bank Account,Date,Narrative,Debit Amount,Credit Amount,Categories,Serial                                                                                         
123456789,01/03/2015,"WITHDRAWAL BY PAYMENT SHOP1", 19.39 , 0.00,POS,                                                                                            
123456789,01/03/2015,"WITHDRAWAL BY PAYMENT SHOP2", 23.60 , 0.00,POS,                                                                                            
123456789,01/03/2015,"WITHDRAWAL BY PAYMENT SHOP3", 17.48 , 0.00,POS,                                                                                            
123456789,01/03/2015,"DEPOSIT", 0.00 , 100.00,CASH,

This results in an output ledger file with unbalanced transactions, as the 0.00 is transferred straight onto one of the accounts in the ledger transaction:

2015/03/01 * Shop 1
    Expenses:Buying Stuff                                           AUD 19.39
    Assets:Bank                                      AUD 0.00

2015/03/01 * Shop 2
    Expenses:Buying Stuff                                           AUD 23.60
    Assets:Bank                                      AUD 0.00

2015/03/01 * Shop 3
    Expenses:Buying Stuff                                           AUD 17.48
    Assets:Bank                                      AUD 0.00

2015/03/01 * Deposit to account
    Assets:Cash In Wallet                                           AUD 0.00
    Assets:Bank                                      AUD 100.00

This causes ledger to complain the transactions are unbalanced:

While parsing file "bugfix.ledger", line 17: 
While balancing transaction from "bugfix.ledger", lines 13-17:
> 2015/03/01 * Shop 3
>     Expenses:Buying Stuff                                           AUD 17.48
>     Assets:Bank                                      AUD 0.00
Unbalanced remainder is:
           AUD 17.48
Amount to balance against:
           AUD 17.48
Error: Transaction does not balance
While parsing file "bugfix.ledger", line 23: 
While balancing transaction from "bugfix.ledger", lines 19-23:
> 2015/03/01 * Deposit to account
>     Assets:Cash In Wallet                                           AUD 0.00
>     Assets:Bank                                      AUD 100.00
Unbalanced remainder is:
          AUD 100.00
Amount to balance against:
          AUD 100.00
Error: Transaction does not balance 

This doesn't seem to happen if the csv file leaves the unused column blank for each transaction.

No way to have multiple currencies

If I have a file with the currency symbol for credit and debit, the currency symbol is stripped and the transaction is not balanced.

Example CSV:

07/02/19,ALIEXPRESS,$ 12.40,₪ 46.32,
07/02/19,ALIEXPRESS,$ 8.29,₪ 30.97,
07/02/19,ALIEXPRESS.COM,$ 10.98,₪ 41.02,

this will result in transactions like:

2019-02-03 * ALIEXPRESS.COM
    ; MD5Sum: c4d25e113867328107d41df888567599
    ; CSV: 03/02/19,ALIEXPRESS.COM,$ 7.60,₪ 28.34,
    Expenses:Unknown                                                ₪ 7.60
    Liabilities:CreditCard:CalFix                                   ₪ -28.34

2019-02-03 * ALIEXPRESS.COM
    ; MD5Sum: ec1c415dbec58336d539fe46a279bf8b
    ; CSV: 03/02/19,ALIEXPRESS.COM,$ 3.36,₪ 12.53,
    Expenses:Unknown                                                ₪ 3.36
    Liabilities:CreditCard:CalFix                                   ₪ -12.53

2019-02-03 * WWW.ALIEXPRESS.COM
    ; MD5Sum: 1e95004ef338600cfd343c75c88d5b82
    ; CSV: 03/02/19,WWW.ALIEXPRESS.COM,$ 10.58,₪ 39.45,
    Expenses:Unknown                                                ₪ 10.58
    Liabilities:CreditCard:CalFix                                   ₪ -39.45

There should be a way to just get the raw transaction.

bug: fails without explicitly setting a --ledger-file value

Thank you for an excellent utility!

I started using about a year ago. A couple months ago I finally updated my local clone to your latest changes and found that the commands I had been using were failing. After some debug, it appears that the codeflow without a --ledger-file passed bypasses the only logic that initializes a couple of sets, without these initialized the code fails later.

Also noticed that the utility now assumes the ledger executable is in the current PATH.

I have a bugfix in my fork of the tool, as well as a small edit to the README stating that the ledger binary must be in the current shell PATH. Happy to open a pull request.

confirm/edit review step after each transaction

[ context: this thread on the Ledger-cli maling list https://groups.google.com/forum/#!topic/ledger-cli/AfGKE2Svv0s ]

It would be nice if icsv2ledger could provide an extra (maybe optional) step to review/edit each transactions after it has generated it.
That would shield from mistyping transaction information and, more generally, improve the user sense of being in control of what's happening while going through many transactions at a time.

With many thanks for maintaining icsv2ledger!

Blank lines at end of csv file cause IndexError

If you have extra lines at the end of a csv file it causes this error

$ ./icsv2ledger.py -a amex amextest.csv amextest.txt
Traceback (most recent call last):
  File "./icsv2ledger.py", line 631, in <module>
    main()
  File "./icsv2ledger.py", line 628, in main
    process_input_output(options.infile, options.outfile)
  File "./icsv2ledger.py", line 610, in process_input_output
    ledger_lines = process_csv_lines(csv_lines)
  File "./icsv2ledger.py", line 621, in process_csv_lines
    options)
  File "./icsv2ledger.py", line 314, in __init__
    self.date = fields[options.date - 1]
IndexError: list index out of range

It seems like a common enough mistake that there should be some way to handle it, either to ignore blank lines and throw a warning, or stop if there are blank lines at the end of a csv file and show a message that they need to be removed.

Credits and Debits are both treated as having the same sign

My bank not only separates credits and debits into different columns, but also adds a sign. This means that icsv2ledger will take money out of checking and put it into income, or (if I flip the credit and debit columns) will take money out of expenses and put it into checking.

Invalid ledger file encoding

Thanks for the great tool! Here is an error.

Given file in UTF-8, under Windows 10, Powershell 5, Python 3, I call:

py -3 .\icsv2ledger\icsv2ledger.py -c .icsv2ledgerrc -a ROCKET --quiet --encoding utf-8 .\rub.pdf.csv

And it breaks with:

Traceback (most recent call last):
  File ".\icsv2ledger\icsv2ledger.py", line 867, in <module>
    main()
  File ".\icsv2ledger\icsv2ledger.py", line 744, in main
    csv_comments = csv_from_ledger(options.ledger_file)
  File ".\icsv2ledger\icsv2ledger.py", line 579, in csv_from_ledger
    for line in f:
  File "C:\Users\art\AppData\Local\Programs\Python\Python35-32\lib\encodings\cp1251.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x98 in position 1020: character maps to <undefined>

How to fix it?

UPDATE.

578:    with open(ledger_file) as f:
579:        for line in f:

Add mapping account string to invoke a prompt

It'd be cool if I could have a mapping line that sets the payee, but prompts for the account. That way, I can have a rule to match, say "/Amazon.*/," but set accounts based on each individual transaction.

Documentation: --desc should be integer

The docs state

--desc STR            CSV column number matching description
--effective-date INT  CSV column number matching effective date

which should read

--desc INT            CSV column number matching description
--effective-date INT  CSV column number matching effective date

Or even better, specify CSV column by name since ordering of columns might be arbitrary depending on tools prior in the CSV transformation chain.

how do I undo last input?

hello,
Is there an option to go back to the previous entry?
If I make a mistake when entering a payee or account, do I need to start all over processing that csv file?
thank you,

What is the correct .icsv2ledgerrc format

The point of the .icsv2ledgerrc file is that it is not necessary to specify its location or usage when executing ./icsv2ledger.py right?
And then the sole purpose of this file (.icsv2ledgerrc) is to specify the location of the configfile (configfile.txt) right?
Can anyone provide a sample .icsv2ledgerrc file for me to see?
My .ics2ledgerrc file reads as follows:
config_file=/home/user/configs/configfile.txt

And I am getting this error when running ./icsv2ledgerrc without specifying a config file:
Traceback (most recent call last):
File "./icsv2ledger.py", line 982, in
main()
File "./icsv2ledger.py", line 797, in main
options = parse_args_and_config_file()
File "./icsv2ledger.py", line 206, in parse_args_and_config_file
config.read(args.config_file)
File "/usr/lib/python3.5/configparser.py", line 696, in read
self._read(fp, filename)
File "/usr/lib/python3.5/configparser.py", line 1077, in _read
raise MissingSectionHeaderError(fpname, lineno, line)
configparser.MissingSectionHeaderError: File contains no section headers.
file: '/home/anil/.icsv2ledgerrc', line: 1
'config_file=/home/anil/configs/configfile.txt\n'

thank you,

Do you want any of these features?

Hi,

First: icsv2ledger is great. It is by far the most helpful addition to ledger for me.

I've been making some small additions that you might be interested in. If your are, let me know and I can structure a PR. I didn't want to just hit you with a PR since you may want some but not all changes, style might not be to your liking, etc. Check out: https://github.com/kalafut/icsv2ledger/commits/master

In summary they are:

  • --incremental mode. This changes the output behavior to both: a) append b) output after every transaction. I like to watch and tweak the output in my editor as I go. This would also address #60 I think.
  • --skip-dupes. Ignored "duplicates", which are currently defined as those CSV lines that already appear as a CSV: ... comments in the ledger file. This is a conservative approach vs MD5. I worry that if a bank changes anything (caps, quote style, spacing around commas), the MD5-style dupe detection would be hopelessly lost. At least with the full CSV line, one could improve the logic to detect "close enough" (e.g. only CSV formatting changes or whatever). This naturally relies on CSV: being in the template.
  • Trim trailing whitespace on output. (Does anyone actually like trailing WS?)
  • Allow ctrl-c without showing an ugly exception. I frequently exit and tweak the mapping file, then reload.

Running all of these together has really improved my workflow. I enter and exit the program frequently, adjust the output and mappings as I go, and with --skip-dupes I make constant forward progress.

Output overwrites existing file rather than appending

Happy to make a pull request changing this behavior, but it may change some folks workflows if they are expecting the file to be overwritten. Perhaps there is a particular reason. My expectation would be that if the specified outfile doesn't exist it is created otherwise it is appended to.

--skip-dupes does not work

I run the following command and then after entering a few transactions I Ctrl-C to exit then re-run the command expecting that the transactions that I have already previously imported to be skipped as a result of using the --skip-dupes option.

./icsv2ledger.py -c config -a PCF_CHQ_JEFF --skip-dupes --incremental --ledger-file /root/ledger/test.dat /root/ledger/csv_imports/PCF.csv /root/ledger/test1.dat

I am using the default embedded template that includes the ;CSV comment line which as I understand is used by the --skip-dupes option.

here is an example of what gets added to the outfile, am I missing something or is this not the way the --skip-dupes suppose to work?

2016/04/08 * TRANSFER IN
; MD5Sum: 3817f6ad5b24d1c65cb08568b27d0d88
; CSV: 04/08/2016, TRANSFER IN,,198
Liabilities:Bank:PCF:LoC
Assets:Bank:PCF:Checking $ 198

Effective date cannot be optional

I'm using a config file that reads effective_date from a CSV field and assigns is to the transaction via a template like this: {date}={effective_date}.

If the effective_date CSV field is empty the program crashes (ValueError: time data '' does not match format '%d/%m/%Y')

Is there any way the template assignment of the effective_date tag to be optional, ie. only take place when effective_date is actually not null?

LE: It turns out the error is only raised when there is a custom ledger_date_format (which would be applied over a null value). I still think it would be best if there was a way to only add the ={effective_date} when the date actually exists; but I don't know how to do this with python format specification (or whether it's even possible in the first place).

--ledger_file doesn't catch errors

If for some reason the ledger file is corrupted, icsv2ledger doesn't give a warning or stop execution when using the --ledger_file option to get existing account and payees

Why not use the raw_csv hexdigest to detect duplicates?

I import csv files which which will produce duplicate postings (in terms of desc, credit, debit).

With the --skip-dupes option set these postings are ignored. This problem can be resolved by using the md5sum of the raw_csv line.

This observation is also noted at:

icsv2ledger/icsv2ledger.py

Lines 558 to 560 in 55629c7

# We also record this - in future we may use it to avoid duplication
#self.md5sum = hashlib.md5(self.raw_csv.encode('utf-8')).hexdigest()
self.md5sum = hashlib.md5(','.join(x.strip() for x in (self.date,self.desc,self.credit,self.debit,self.credit_account)).encode('utf-8')).hexdigest()

Why not un-comment that line? Or provide an option to use the raw_csv input as a skip-dupes criteria?

Any example for multiple sources resolving one single transaction ?

Hi

Does anyone have a minimal working example for this particular case?:

A useful pattern is to include MD5Sum comments for both "sides" of a transaction if you download from multiple sources that resolve to a single transaction (e.g. paying a credit card from checking).

Excerpt from

--skip-dupes

will attempt to detect duplicate transactions in ledger file by comparing MD5Sum of transactions. The MD5Sum is calculated from the formatted CSV values including the source account. The source account is included to avoid false positives on generic transaction descriptions when the source account is different and thus should not be considered a duplicate. MD5Sum of existing transactions are as a ; MD5Sum: ... comment in the current ledger file (which means your output template will need this comment). This can help if you download statements without using a precise date range. A useful pattern is to include MD5Sum comments for both "sides" of a transaction if you download from multiple sources that resolve to a single transaction (e.g. paying a credit card from checking). Note: use of this flag by itself will detect and skip duplicate entries automatically with no interaction from user. If you want to be prompted and determine whether to skip or not see --confirm-dupes.

--skip-dupes does not, somewhat unexpectedly, skip the prompts to enter payee and account

It seems the prompt for entering payee and account still appears, even though that line of the csv has already been "dealt with".

What I got:

~ echo "2018/01/01,someshop,-5\n" > in.csv
~ icsv2ledger.py --skip-lines=0 --csv-date-format=%Y/%m/%d --credit 3 --debit 0 --desc 2 --skip-dupes --account Assets:Checking in.csv out.csv

2018/01/01 someshop                                 -5
Payee [someshop] >
Account [Expenses:Unknown] >
~ cat out.csv
2018/01/01 * someshop
    ; MD5Sum: 57d3077de9bff6d607111ed9e3a882d0
    ; CSV: 2018/01/01,someshop,-5
    Expenses:Unknown
    Assets:Bank:Current                                             £ -5

~ icsv2ledger.py --skip-lines=0 --csv-date-format=%Y/%m/%d --credit 3 --debit 0 --desc 2 --skip-dupes --account Assets:Checking in.csv out.csv

2018/01/01 someshop                                 -5
Payee [someshop] >
Account [Expenses:Unknown] >
~ cat out.csv
2018/01/01 * someshop
    ; MD5Sum: 57d3077de9bff6d607111ed9e3a882d0
    ; CSV: 2018/01/01,someshop,-5
    Expenses:Unknown
    Assets:Bank:Current                                             £ -5

What I expected:

~ echo "2018/01/01,someshop,-5\n" > in.csv
~ icsv2ledger.py --skip-lines=0 --csv-date-format=%Y/%m/%d --credit 3 --debit 0 --desc 2 --skip-dupes --account Assets:Checking in.csv out.csv

2018/01/01 someshop                                 -5
Payee [someshop] >
Account [Expenses:Unknown] >
~ cat out.csv
2018/01/01 * someshop
    ; MD5Sum: 57d3077de9bff6d607111ed9e3a882d0
    ; CSV: 2018/01/01,someshop,-5
    Expenses:Unknown
    Assets:Bank:Current                                             £ -5

~ icsv2ledger.py --skip-lines=0 --csv-date-format=%Y/%m/%d --credit 3 --debit 0 --desc 2 --skip-dupes --account Assets:Checking in.csv out.csv
<<<Nothing here, or perhaps some helpful message>>>
~ cat out.csv
2018/01/01 * someshop
    ; MD5Sum: 57d3077de9bff6d607111ed9e3a882d0
    ; CSV: 2018/01/01,someshop,-5
    Expenses:Unknown
    Assets:Bank:Current                                             £ -5

Add support for the "transfer" account pattern

When the csv row being imported refers to a transfer to another account, it would be great if icsv2ledger could support the "transfer" pattern as described here:

Using the Transfer account pattern, you'd tweak your CSV > ledger
conversion to generate the following records:

2012/01/01 Transfer from Checking to Savings
 ; Generated from Checking account CSV > ledger
 Transfers:Checking_Savings  $100
 Assets:Checking

2012/01/01 Transfer from Checking to Savings
 ; Generated from Savings account CSV > ledger
 Assets:Savings  $100
 Transfers:Checking_Savings

According to the original poster, this pattern has the following advantages, e.g.:

  1. No manual tweaking after a CSV > ledger conversion.
  2. You can immediately know where any transfer went wrong (and which
    account didn't register it)
  3. When everything has cleared, the balance in the Transfer account is
    zero so it doesn't appear in any balance reports

I could try to make a PR to support the above but not sure how to approach it. Maybe we can add a third parameter to the mapping file to indicate a "transfer" pattern account?

should learn tags from input ledger file

[ context: this thread on the Ledger-cli maling list https://groups.google.com/forum/#!topic/ledger-cli/AfGKE2Svv0s ]

It would be nice if icsv2ledger could learn from the input ledger file (--ledger-file) not only account names and payees, but also tags.

With many thanks for maintaining icsv2ledger!

Only saves to ledger file when --incremental option is used

After running the command:
./icsv2ledger.py -c config.txt -a CHASE ChaseCredit.csv ledger.dat
the ledger.dat file is empty unless I use the --incremental option.

The above command works on previous commits, which makes me believe it is a bug.

I will be submitting a PR with a fix if you are interested.
To fix this, my PR makes icsv2ledger always write to the ledger file incrementally. Therefore the only functionality of the --incremental flag is to append to the ledger file instead of overwriting. Let me know if you would prefer it be renamed to --append. I didn't want to break backwards compatibility. Consequently I fixed the incompatibility between the --reverse and --incremental options.

Account naming restriction in account file should be mentioned in documentation

I have a ledger with account names containing non-us characters such as ääüß, etc. (The ledger contains transaction data in German).

When invoking icsv2ledger with the option --accounts-file these account names are not accepted.

This is due to the regexp patter at https://github.com/quentinsf/icsv2ledger/blob/master/icsv2ledger.py#L623

I believe it could be helpful to mention this restriction at https://github.com/quentinsf/icsv2ledger/blob/master/README.md#accounts-file

or alternatively allow special characters in the regexp pattern.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.