railslove / cmxl
your friendly MT940 SWIFT file parser for bank statements
Home Page: http://railslove.com
License: MIT License
The latest version on RubyGems is 0.1.3, but there have been some changes since then.
Ruby 3.3 changes the parameters to Regexp.new, which in turn breaks rchardet19's universal detector, which uses an incompatible parameter style.
Can the rchardet19 dependency be changed to rchardet, as the former seems to have been abandoned in 2014?
In the case of chargebacks there is additional information in the :61: fields (OCMT and CHGS data).
The current regular expression completely fails in parsing these :61: lines because of that additional data.
Changing the regular expression of Cmxl::Fields::Transaction to
/^(?<date>\d{6})(?<entry_date>\d{4})?(?<storno_flag>R?)(?<funds_code>[CD]{1})(?<currency_letter>[a-zA-Z])?(?<amount>\d{1,12},\d{0,2})(?<swift_code>(?:N|F).{3})(?<reference>NONREF|.{0,16})((?:\/\/)(?<bank_reference>[^\r\n]*))?((?:[\r\n])?((?:\/OCMT\/)(?<ocmt>[^\/]*)(?:\/)(?:\/CHGS\/)(?<chgs>[^\/]*)(?:\/)))?/i
fixes that and additionally gives the OCMT and CHGS fields.
The changed part is: the whole bank reference group is now optional and can contain any characters except for CR-LF. After that there may be an additional block separated by CR-LF containing /OCMT/3a15num with an optional slash at the end followed by /CHGS/3a15num with an optional slash at the end.
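As a rough check of the proposal, the sketch below runs the suggested pattern against a chargeback-style line. The sample line and the constant name are made up for illustration and do not come from a real statement. Note that, as written, the pattern appears to require the slash after both the OCMT and the CHGS value, even though the description calls it optional.

```ruby
# Pattern copied from the proposal above; the constant name exists only in this sketch.
CHARGEBACK_PATTERN = /^(?<date>\d{6})(?<entry_date>\d{4})?(?<storno_flag>R?)(?<funds_code>[CD]{1})(?<currency_letter>[a-zA-Z])?(?<amount>\d{1,12},\d{0,2})(?<swift_code>(?:N|F).{3})(?<reference>NONREF|.{0,16})((?:\/\/)(?<bank_reference>[^\r\n]*))?((?:[\r\n])?((?:\/OCMT\/)(?<ocmt>[^\/]*)(?:\/)(?:\/CHGS\/)(?<chgs>[^\/]*)(?:\/)))?/i

# Hypothetical :61: content: a reversed debit followed by OCMT/CHGS data on the next line.
sample = "1810011001RD12,34NMSCNONREF//ABC123\n/OCMT/EUR12,34//CHGS/EUR1,00/"

match = CHARGEBACK_PATTERN.match(sample)

# Print the named groups that matter for chargebacks.
%w[storno_flag funds_code amount bank_reference ocmt chgs].each do |name|
  puts "#{name}: #{match[name].inspect}"
end
```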
I have :86: at the end of a statement. Per https://www.sepaforcorporates.com/swift-for-corporates/account-statement-mt940-file-format-overview/ this tag is to be treated as meta data at the statement level:
Tag 86 – Information to Account Owner
Optional – 6x65x
Additional information about the statement as a whole
In debugging it in the parser, this tag is getting associated as meta data for tag 64 (Available Balance) which immediately precedes it AND does not have an add_meta_data method. I believe this information should be added at the statement level as meta data. In my case the value is ":86:/OSDR/HSBCPLPW" which is the SWIFT code or BIC.
First of all thanks for sharing the code.
I updated cmxl in my project from 0.2.0 to 1.4.1, and one of my tests that parses an MT940 file started failing.
I quickly found the cause: generation_date in the statement tries to read the date from field :20 or :13.
In my case the bank (Commerzbank) sends field :20 with only the reference; no extra values are provided.
Field :13 is not provided at all because it is MT942.
def generation_date
  field(20).date || field(13).date
end
Calling field(13) returns nil, and calling date on it then fails with
NoMethodError: undefined method `date' for nil:NilClass
Field :20 exists in both MT940 and MT942, so even if no generation date is provided in that field, field(20).date does not fail; it only returns nil. But then field(13).date is called and raises the error mentioned above.
The generation_date method should check that field :13 is present before calling date on it.
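One possible shape for such a check is sketched below with Ruby's safe navigation operator. FakeField and FakeStatement are hypothetical stand-ins written for this example; they are not cmxl classes, and the real fix inside cmxl may look different.

```ruby
# Stand-in for a parsed field; only the `date` reader matters here.
FakeField = Struct.new(:date)

# Stand-in for a statement that looks fields up by tag number.
class FakeStatement
  def initialize(fields)
    @fields = fields
  end

  # Returns the field for the tag, or nil when the bank did not send it.
  def field(tag)
    @fields[tag]
  end

  # `&.` returns nil instead of raising NoMethodError when a field is missing.
  def generation_date
    field(20)&.date || field(13)&.date
  end
end

# Field :20 is present but carries no date; field :13 is absent.
statement = FakeStatement.new({ 20 => FakeField.new(nil) })
puts statement.generation_date.inspect
```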
Hello!
It seems to me that there are no CI runs connected to commits right now, is that right?
missing tags are:
I use the value returned by the sha method on statements and transactions to find them in a database.
This works fine for me, but a few weeks ago I thought I had lost some transactions in my database.
In fact they were all there, but their SHA hashes were identical.
There are two cases.
First case:
Debit transfers on the same day, with the same amount and the same receiver account; only the transaction information (invoice number) differs.
The SHA hash is the same because all fields in :61 are identical. The difference is in :86.
My quick fix is to build my own SHA from :61 plus the information from :86.
Second case:
A credit transfer with all values identical. The sender accidentally made the same transaction twice on the same day. This case is really rare, but it happened in the real world. My fix for this is to also build my own hash and add an increment to the raw transaction data (source).
So my questions to discuss:
Are the hashes meant to be used to identify transactions?
If yes, should cmxl handle these rare cases, or should the software that uses cmxl handle them?
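The two workarounds described above could be combined roughly as follows. This is only a sketch under assumptions: transaction_keys is a hypothetical helper, the inputs are the raw :61: and :86: strings of one statement in file order, and nothing here reflects how cmxl computes its own sha.

```ruby
require "digest"

# Builds one key per transaction from the raw :61: and :86: text plus an
# occurrence counter, so exact duplicates within one statement still differ.
def transaction_keys(pairs)
  seen = Hash.new(0)
  pairs.map do |source_61, source_86|
    base = "#{source_61}\n#{source_86}"
    seen[base] += 1
    Digest::SHA256.hexdigest("#{base}\n#{seen[base]}")
  end
end

# Made-up sample data: same :61: everywhere, and the last two entries are exact duplicates.
pairs = [
  ["1901310131D10,00NTRFNONREF", "Invoice 1001"],
  ["1901310131D10,00NTRFNONREF", "Invoice 1002"],
  ["1901310131D10,00NTRFNONREF", "Invoice 1002"]
]

keys = transaction_keys(pairs)
puts keys.uniq.length
```

The counter makes the key depend on the order of identical entries within a statement, so it only stays stable if the same statement is always delivered with the same content.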
Lines 20 to 21 in 5e41274
Worst case is that someone might send differing data to CodeClimate. That aside, it may be annoying if all forks post to the same CodeClimate Project.
I actually don't know if our source file is valid, but we have a file (real file, coming from a bank) that has opening_date and closing_date set at 2019/12/30, but one transaction with an individual date of 2019/12/31.
The whole case shouldn't be possible, but in this case cmxl's parsed "entry_date" is "1230" and the "date" field is "191231", which is the correct date. Not only is the entry_date field wrong (I guess it's taking its value from the global opening/closing dates), but it also lacks the year.
For now, is it safe to use "date" instead of "entry_date"?
I recently had problems with MT940, so I upgraded to 1.1.0, and since then I have had issues (but it may also be the bank changing the format).
MT940 piece:
:61:180627D79,NMSCXXXX3550//MA-20-00084395
28/06/1812:15 PIZZA HUT MA3550
:86:XXXX3550 /TYPE/631/PAYM CARTE
result:
#<Cmxl::Fields::Transaction:0x0000000007ad8c78 @tag="61", @modifier=nil, @source="180627D79,NMSCXXXX3550//MA-20-00084395\n28/06/1812:15 PIZZA HUT MA3550", @data={"date"=>"180627", "entry_date"=>nil, "storno_flag"=>"", "funds_code"=>"D", "currency_letter"=>nil, "amount"=>"79,", "swift_code"=>"NMSC", "reference"=>"XXXX3550//MA-20-", "bank_reference"=>nil, "supplementary"=>"00084395"}, @match=#<MatchData "180627D79,NMSCXXXX3550//MA-20-00084395" date:"180627" entry_date:nil storno_flag:"" funds_code:"D" currency_letter:nil amount:"79," swift_code:"NMSC" reference:"XXXX3550//MA-20-" bank_reference:nil supplementary:"00084395">, @details=#<Cmxl::Fields::StatementDetails:0x0000000007ad8570 @tag="86", @modifier=nil, @source="XXXX3550 /TYPE/631/PAYM CARTE", @data={"transaction_code"=>"XXX", "details"=>"X3550 /TYPE/631/PAYM CARTE", "seperator"=>"X"}, @match=#<MatchData "XXXX3550 /TYPE/631/PAYM CARTE" transaction_code:"XXX" details:"X3550 /TYPE/631/PAYM CARTE" seperator:"X">>>
Why do I get this reference and bank_reference? Is this the correct result of processing?
I would expect a reference of "XXXX3550", a bank_reference of "MA-20-00084395" and a supplementary of "28/06/1812:15 PIZZA HUT MA3550".
I use Cmxl.config[:statement_separator] = /\r?\n-\r?\n(?:[^:]*\r?\n)+/m
Hi, I’m currently having problems with Deutsche Bank. The error message I’m getting is:
Cmxl::Field::LineFormatError: Wrong line format: ":08 Karten?25nr. 5355999999999975 Origi?26nal 49,00 USD 1 EUR/1,\r\n12385?27 USD Entgelt 0,44 EUR?30DEUTDEDBFRA?31DE1950070024000402\r\n0480?32DEUTSCHE BANK"
The transaction that’s causing the issue is:
:61:190425D44,04NMSCNONREF
:86:106?109075/658?20EREF+000000000193592204?21MREF+CN3R3U?22CRED+DE7
600200000132558?23SVWZ+STARTER//8449273399/US?24 22-04-2019T03:46
:08 Karten?25nr. 5355999999999975 Origi?26nal 49,00 USD 1 EUR/1,
12385?27 USD Entgelt 0,44 EUR?30DEUTDEDBFRA?31DE1950070024000402
0480?32DEUTSCHE BANK
Note the line break and the :08 right after the newline, which does not indicate a new section but is just a part of the transaction details that has been inconveniently split.
Unfortunately I don’t know right now if this is a configuration issue from my side or a case that Cmxl currently cannot handle...
Error message:
/Users/xxxx/.rvm/gems/ruby-2.2.1/gems/cmxl-0.1.3/lib/cmxl/field.rb:51:in `parse': Wrong line format: "{1:F01AXISINBBAXXXXXXXXXXXXX}{2:I940XXXXXXXXAXXXN}{4:" (Cmxl::Field::LineFormatError)
Perhaps the statement_separator I'm using doesn't match my file:
Cmxl.config[:statement_separator] = /\n-.\n/m
Here is the MT940 file I'm trying to parse and convert:
{1:F01AXISINBBAXXXXXXXXXXXXX}{2:I940XXXXXXXXAXXXN}{4:
:20:MT940/78274
:25:xxxxxxxxxxxxxx
:28C:3/1
:60F:C160201INR0,00
:61:1602080208CR1338474,92NMSC89962273
:86:-TX BRN-REF NO.0234FIR1600187 USD 19995/RLZ
:61:1602120212DR390000,00NCHK73401
:86:-CHQ73401 TX BRN-CLG-CHQ PAID TO SANTHANA
:61:1603050305DR27300,00NCHK73403
:86:-CHQ73403 TX BRN-CLG-CHQ PAID TO SANTHANA
:61:1604060406DR39000,00NCHK73404
:86:-CHQ73404 TX BRN-CLG-CHQ PAID TO SANTHANA
:61:1604150415DR33249,00NCHK73405
:86:-CHQ73405 TX BRN-CLG-CHQ PAID TO CHENNAI AIRCONDITIONERS
:61:1604280428DR5000,00NCHK73406
:86:-CHQ73406 TX VERVE FINANCIAL SERVICES PVT LTD
:61:1604300430DR392751,00NCHK73407
:86:-CHQ73407 TX BY SALARY
:61:1605040504DR35100,00NCHK73408
:86:-CHQ73408 TX BRN-CLG-CHQ PAID TO P SANTHANA GOPALA KRISHNA
:61:1605070507DR5000,00NCHK73402
:86:-CHQ73402 TX BRN-CLG-CHQ PAID TO RELIANCE COMMUNICATI
:61:1605090509DR76917,00NMSC315222
:86:-TX INB/NEFT/AXIR161301965349/Reliance Comm/Reliance B
:61:1605120512CR1322831,02NMSC96582848
:86:-TX BRN-REF NO.0234FIR1600665 USD 19995/RLZ
:61:1605170517DR61425,00NTRF7335278
:86:-TX INB/IFT/ANITHA SURESH/Admin Reimbursement
:61:1605250525DR38000,00NCHK73409
:86:-CHQ73409 TX VERVE FINANCIAL SERVICES
:62F:C160527INR1557563,94
-}
Is there a specific reason for depending on the rchardet19 gem, or could it be upgraded to the latest rchardet?
It also contains support for Ruby 1.9.
https://github.com/jmhodges/rchardet
If we use, e.g., the git gem, which has rchardet as a dependency, some constants conflict with the rchardet19 version.
gems/rchardet19-1.3.7/lib/rchardet19.rb:59: warning: already initialized constant CharDet::VERSION
gems/rchardet-1.8.0/lib/rchardet/version.rb:2: warning: previous definition of VERSION was here
I know it's only a warning, but I'd like to resolve it.
Hey Bumi,
first of all: great job with cmxl :)
I was just about to play around with cmxl and received the following warning:
~/.rvm/gems/ruby-2.7.1@cmxl-test/gems/cmxl-1.4.6/lib/Cmxl.rb:39: warning: Using the last argument as keyword parameters is deprecated
I'm using ruby 2.7.1 and Cmxl 1.4.6.
For testing purposes I used your example described in the Simple usage part of the README with the fixture file mt940-iso8859-1.txt.
Best wishes
As of commit d692be7 there is a check in place for statements spanning a year boundary. If a statement is parsed that has an entry date in February and a date in January, the entry_date is wrongly set back to an earlier year. The following field 61 gets parsed with an entry_date of 2018-02-01 but should be 2019-02-01.
:61:1901310201DR1,6NMSCNONREF//XXXXXXXXXX
This happens because of the assumption that the entry_date is always before the date and not the other way around, but many banks date their monthly statements like that.
Instead of checking whether entry_date.month is bigger than date.month, as a simple fix I suggest checking whether entry_date.month is bigger than date.month+6. The same should be checked the other way around: if entry_date.month is smaller than date.month-6, then add a year to entry_date, to catch constellations like the one above when there is a year boundary in between:
:61:1812310101DR3498,06NTRFNONREF//XXXXXXXXXX
A more sophisticated fix would be to make sure that the date lies between the opening and closing dates (60F and 62F) of the statement, and if not, move it there. This would require access to the statement and can therefore not be done while parsing the 61 field without coupling it to the Statement class.
Perhaps it's easier to not guess the year at all, and allow the client to supply their own year determination function? The current behaviour could stay in place as a default for backwards compatibility reasons.
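A client-supplied year determination function could look something like the sketch below. It does not implement the ±6 month comparison literally; instead it tries the previous, same and next year and keeps the candidate closest to the value date, which should give the same result for the two lines quoted above. infer_entry_date is a hypothetical helper name, not part of cmxl.

```ruby
require "date"

# Picks the year for a 4-digit MMDD entry date so that it lands as close as
# possible to the 6-digit YYMMDD value date (previous, same, or next year).
def infer_entry_date(value_yymmdd, entry_mmdd)
  value_date = Date.strptime(value_yymmdd, "%y%m%d")
  month = entry_mmdd[0, 2].to_i
  day = entry_mmdd[2, 2].to_i

  # Skip years in which the month/day combination does not exist (e.g. Feb 29).
  candidates = [-1, 0, 1].map do |offset|
    year = value_date.year + offset
    Date.new(year, month, day) if Date.valid_date?(year, month, day)
  end.compact

  candidates.min_by { |candidate| (candidate - value_date).abs }
end

puts infer_entry_date("190131", "0201")
puts infer_entry_date("181231", "0101")
```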
In MT940 (example here: https://www.kontopruef.de/mt940s.shtml) you need to accept up to two letters in the :61: funds_code
In your code (cmxl/lib/cmxl/fields/transaction.rb) you check for one letter, 'C' or 'D', but in the case of a Storno (reversal) you get 'RC' or 'RD', which would cause an error in cmxl.
Can you fix this issue, please?
Greetings Ghost
SWIFT headers like {1:F01AXISINBBAXXXXXXXXXXXXX}{2:I940XXXXXXXXAXXXN}{4: should be supported.
see also issue #8
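Until such headers are supported, one conceivable workaround is to strip the envelope before handing the text to the parser. The sketch below assumes a file containing a single message whose text block starts with {4: and ends with -}; strip_swift_envelope is a hypothetical helper, not a cmxl method.

```ruby
# Removes everything up to and including the "{4:" line and the closing "-}",
# leaving only the tagged lines (:20:, :25:, ...). Assumes one message per string.
def strip_swift_envelope(raw)
  body = raw.sub(/\A.*?\{4:\r?\n/m, "")
  body.sub(/\r?\n-\}\s*\z/, "")
end

# Shortened version of the file quoted in the issue above.
raw = "{1:F01AXISINBBAXXXXXXXXXXXXX}{2:I940XXXXXXXXAXXXN}{4:\n" \
      ":20:MT940/78274\n" \
      ":25:xxxxxxxxxxxxxx\n" \
      ":62F:C160527INR1557563,94\n" \
      "-}\n"

puts strip_swift_envelope(raw)
```

Text without an envelope passes through unchanged, so the helper can be applied to every file regardless of whether the bank wraps its statements.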
The sepa['eref'] field contains the end-to-end ID of SEPA transactions.
Since this is a widely used identifier (that is also used to match the transaction with a debit/credit), we should add an easy reader to the transaction class.
They add some header lines in front of each statement.
I am not sure I am allowed to attach the file here.
It is not possible to cover it by line parser as there is no :tag: here
1601 25V3241A1XAXXX00001
0000 30BMCIMAMCXXXX00001
940 02
:20:BMCI
...
My first idea is to prefix the file contents with "\r\n-\r\n" and use a statement separator such as /\r?\n-\r?\n([^:].*\r?\n)+/m, which should consume this header, but I still get a wrong line format error...
Scenario: I'm trying to parse a file with a header containing a line with a custom, non-digit tag, e.g. :NS:some-description.
Available options:
Cmxl.config[:raise_line_format_errors] = false
None of these is helpful. However, I think there are some solutions to the problem:
self.parse(line)
inside field.rb
I can make a PR if you want.
I have a case like this: :61:1908150815D-104,12NMSCNONREF//010F214191270328
It fails to parse correctly, likely because of the D-104,12. I'm not certain whether that's correct or not, though. I guess it comes down to whether the - is allowed. Perhaps you've seen such a case before?
The current regex to parse the statement and sequence numbers doesn't follow the SWIFT specs as declared here.
The examples :28C:235/1 and :28C:235/1 provided in the spec will not match because the regex expects 5 digits for the statement and 3 to 5 digits for the sequence number. The sequence number should be optional and the regex should allow shorter statement and sequence numbers. I would suggest:
/(?<statement_number>\d{1,5})(?:\/(?<sequence_number>\d{1,5}))?/
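A quick way to try the suggested pattern is sketched below. The constant name is made up for the example, and the values are the spec example quoted above, the same number without a sequence, and the 3/1 value from the MT940 file earlier on this page.

```ruby
# Pattern copied from the suggestion above; the constant name exists only in this sketch.
STATEMENT_NUMBER_PATTERN = /(?<statement_number>\d{1,5})(?:\/(?<sequence_number>\d{1,5}))?/

["235/1", "235", "3/1"].each do |value|
  match = STATEMENT_NUMBER_PATTERN.match(value)
  puts "#{value}: statement=#{match[:statement_number]} sequence=#{match[:sequence_number].inspect}"
end
```

The pattern is not anchored, so a stricter variant might wrap it in \A and \z to reject trailing characters; whether that is desirable depends on how the field content is passed in.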
There are three acceptable codes for transaction type ('S', 'N', and 'F') in Field 61 of MT940, but the parser in Cmxl::Fields::Transaction only handles two of them ('N' and 'F').
See this link for more information about the acceptable codes.
I believe this issue can be resolved simply by adding this missing third letter to the swift_code group of the regex, so that it becomes as follows.
%r{^(?<date>\d{6})(?<entry_date>\d{4})?(?<storno_flag>R?)(?<funds_code>[CD]{1})(?<currency_letter>[a-zA-Z])?(?<amount>\d{1,12},\d{0,2})(?<swift_code>(?:S|N|F).{3})(?<reference>NONREF|(.(?!\/\/)){,16}([^\/]){,1})((?:\/\/)(?<bank_reference>[^\n]{,16}))?((?:\n)(?<supplementary>.{,34}))?$}
Here is an example line that is not currently handled correctly.
:61:1911181118CR653,00S445328556-76501096
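The sketch below applies the proposed pattern to the content of that example line (without the :61: tag prefix) to see which groups it fills. The constant name is an assumption made for this example only.

```ruby
# Pattern copied from the proposal above; the constant name exists only in this sketch.
TRANSACTION_PATTERN = %r{^(?<date>\d{6})(?<entry_date>\d{4})?(?<storno_flag>R?)(?<funds_code>[CD]{1})(?<currency_letter>[a-zA-Z])?(?<amount>\d{1,12},\d{0,2})(?<swift_code>(?:S|N|F).{3})(?<reference>NONREF|(.(?!\/\/)){,16}([^\/]){,1})((?:\/\/)(?<bank_reference>[^\n]{,16}))?((?:\n)(?<supplementary>.{,34}))?$}

# Content of the example :61: line quoted above.
content = "1911181118CR653,00S445328556-76501096"

match = TRANSACTION_PATTERN.match(content)

%w[date entry_date funds_code currency_letter amount swift_code reference].each do |name|
  puts "#{name}: #{match[name].inspect}"
end
```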