forcewake / flatfile
FlatFile is a library to work with flat files
License: MIT License
This is a continuation of the conversation from #24 (comment).
// Init multi record file engine
var engine =
EngineFactory.GetEngine(
new[]
{
typeof (SettlementHeaderRecord),
typeof (TransactionRecord),
typeof (TransactionDetailRecord),
typeof (SettlementTrailerRecord)
},
s =>
{
if (String.IsNullOrEmpty(s) || s.Length < 1) return null;
switch (s[0])
{
case 'H':
return typeof(SettlementHeaderRecord);
case 'D':
return typeof(TransactionRecord);
case 'P':
return typeof(TransactionDetailRecord);
case 'V':
return typeof(SettlementTrailerRecord);
}
return null;
});
Is there a way to use attribute mapping with multiple record types?
[FixedLengthFile]
public abstract class MyDataBase
{
protected MyDataBase(char lineType)
{
if (lineType <= 0) throw new ArgumentOutOfRangeException(nameof(lineType));
_lineType = lineType;
}
private readonly char _lineType;
[FixedLengthField(1, 1)]
public char LineType
{
get { return _lineType; }
set
{
if (value != _lineType)
throw new NotSupportedException($"The value must be '{_lineType}'.");
}
}
}
[FixedLengthFile]
public sealed class MyDataHeader : MyDataBase
{
public MyDataHeader()
: base('H')
{ }
[FixedLengthField(2, 13, Padding = Padding.Right, PaddingChar = ' ')]
public string FormatVersion { get; set; }
[FixedLengthField(15, 22, Padding = Padding.Right, PaddingChar = ' ')]
public string Filename { get; set; }
[FixedLengthField(37, 6, PaddingChar = '0')]
public int JobId { get; set; }
[FixedLengthField(43, 10, PaddingChar = '0')]
public int NoOfTransactionLines { get; set; }
[FixedLengthField(53, 12, PaddingChar = '0')]
public int SumOfPoints { get; set; }
}
[FixedLengthFile]
public sealed class MyDataTransaction : MyDataBase
{
public MyDataTransaction()
: base('T')
{ }
[FixedLengthField(2, 6, PaddingChar = '0')]
public int JobId { get; set; }
[FixedLengthField(8, 10, PaddingChar = '0')]
public int TransactionEntryNo { get; set; }
[FixedLengthField(18, 2, PaddingChar = '0')]
public int TransactionType { get; set; }
[FixedLengthField(20, 10, PaddingChar = '0')]
public int Points { get; set; }
}
Now I have no idea how to read this. Could be something like:
var factory = new FixedLengthFileEngineFactory();
using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(FixedFileSample)))
{
var flatFile = factory.GetEngine<MyDataBase>();
var records = flatFile.Read<MyDataBase>(stream).ToArray();
}
But I cannot make the types MyDataHeader and MyDataTransaction known to the engine class. The C# XML serializer supports this with [XmlIncludeAttribute].
Great work here!
Using FixedLengthFile/FixedLengthField.
It would be helpful if ITypeConverter instances had access to things like:
Right now I'm not seeing a way to provide the specific details about the exact thing causing the problem.
Within ITypeConverter I can only see something like:
HandleEntryReadError:
This feature request might be somewhat specific to FixedLengthFile with lots of fields: it can be a pain to figure out where the problem is when you have hundreds of characters of text smashed together, let alone provide a detailed/automatic report to a data source provider, say via email.
Thanks for the help!
It would be great to expand the FixedLengthFileMultiEngine to include Write capability including master/detail records in a recursive manner.
Do you have any plans to move to .NET Standard? I see an issue (#55) with a PR that solves the problem. Do you plan to keep maintaining this library?
I noticed the existence of a method Read(StreamReader streamReader) in the class FixedLengthFileMultiEngine, but it is not in the corresponding interface: FixedLengthFileMultiEngine
e.g. File with one row where Age is null
FirstName,LastName,Age,Date of Birth
John,Doh,,09/06/1947
Here's the map:
[DelimitedFile(Delimiter = ",", Quotes = "\"", HasHeader = true)]
public class PersonRecordWithAttributeMap
{
[DelimitedField(1)]
public string FirstName { get; set; }
[DelimitedField(2)]
public string LastName { get; set; }
[DelimitedField(3,NullValue = "", Name = "Age")]
public int? Age { get; set; }
}
For the age column NullValue="" produces a parse error. I've also tried NullValue=null. Same behaviour. Only seems to work if the raw file uses a magic string to represent null.
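One possible workaround can be sketched without any FlatFile API: treat an empty or whitespace-only field as null before converting it. `NullableParsing` and `ParseNullableInt` are hypothetical names; how such a helper would be wired into the library (for example through a type converter) is an assumption and is not shown.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of FlatFile.
public static class NullableParsing
{
    // Maps an empty or whitespace-only field to null; otherwise parses it as an integer.
    public static int? ParseNullableInt(string raw)
    {
        if (string.IsNullOrWhiteSpace(raw)) return null;
        return int.Parse(raw, NumberStyles.Integer, CultureInfo.InvariantCulture);
    }
}
```

With the sample row above, the empty Age field would yield null instead of a parse error.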
Hi all,
Is it possible to define the EOL character(s)?
Thank you
Hey!
First off, very good library my friend! :)
Well, I have a request/observation and I'm not sure if this is the right place to do it, but here it goes:
A "Fixed Length" field by definition should not violate its length, either you pad it with something or you truncate it in order to achieve the expected length...however the current code is allowing strings with larger values to go through :(
Class: FlatFile.FixedLength.Implementation.FixedLengthLineBuilder
Method: TransformFieldValue
Block:
if (lineValue.Length >= field.Length)
{
return lineValue;
}
Can this be changed? :)
Thanks!
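The behaviour requested above (pad short values, truncate long ones) could look roughly like the sketch below. It is plain C# and does not use FlatFile types; `FixedWidth.Fit` and its parameters are hypothetical names, and truncating on the right is an assumption about the desired behaviour.

```csharp
using System;

// Hypothetical helper, not part of FlatFile.
public static class FixedWidth
{
    // Returns a string of exactly `length` characters:
    // longer values are cut off on the right, shorter values are padded.
    public static string Fit(string value, int length, char paddingChar = ' ', bool padLeft = false)
    {
        if (length < 0) throw new ArgumentOutOfRangeException(nameof(length));
        value = value ?? string.Empty;

        if (value.Length > length)
            return value.Substring(0, length); // truncate instead of letting the value through

        return padLeft
            ? value.PadLeft(length, paddingChar)
            : value.PadRight(length, paddingChar);
    }
}
```

A caller that prefers an error over silent truncation could throw in the `value.Length > length` branch instead.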
I can parse a file with lots of records. If I have a string representing just one record, there should be a shortcut to parse it. (I could use a MemoryStream, but that's really overkill here.)
I need the line number to be passed around in FixedLengthFileMultiEngine.cs,
in the context of:
/// <summary>
/// The type selector function used to determine the layout for a given line
/// </summary>
readonly Func<string, Type> typeSelectorFunc;
Current usage:
var type = typeSelectorFunc(line);
Updated usage:
var type = typeSelectorFunc(line, lineNumber);
Is that something you would accept? I'm asking so I know how I need to consume this library going forward. As of now NuGet works, but I will need a different approach if this is something you would not want to include.
The use case here is switching record types based on detecting whether we're parsing the first or last line in a file.
Thanks! Great library!
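A rough sketch of what a line-number-aware selector could look like, assuming the proposed `(line, lineNumber)` shape. The record classes and `LineNumberSelector` are hypothetical, and because the last line can only be detected if the total line count is known, this sketch captures it in a closure.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record types, for illustration only.
public sealed class HeaderRecord { }
public sealed class DetailRecord { }
public sealed class TrailerRecord { }

public static class LineNumberSelector
{
    // Builds a selector that receives the line text and its 1-based line number.
    public static Func<string, int, Type> Create(int totalLines)
    {
        return (line, lineNumber) =>
        {
            if (string.IsNullOrEmpty(line)) return null;
            if (lineNumber == 1) return typeof(HeaderRecord);
            if (lineNumber == totalLines) return typeof(TrailerRecord);
            return typeof(DetailRecord);
        };
    }

    // Applies the selector to every line, counting from 1.
    public static List<Type> Classify(IReadOnlyList<string> lines)
    {
        var selector = Create(lines.Count);
        var result = new List<Type>(lines.Count);
        for (var i = 0; i < lines.Count; i++)
            result.Add(selector(lines[i], i + 1));
        return result;
    }
}
```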
When working with a fixed-length layout, I need a property that indicates whether a field must be truncated if its content exceeds the defined maximum length.
Just accept it as a parameter and pass it to the underlying StreamReader. Or, even better, accept TextReader|TextWriter instead of Stream.
Hi,
I really love this library, specifically the fluent syntax for attribute mapping.
I wanted to know whether there are any plans for .NET Core 3.1 support.
If there is already a plan, I would love to contribute to it.
If I have multiple empty fields at the end of a row, I get the separator as the value.
Explained better here:
https://stackoverflow.com/questions/52017242/flatfile-library-delimited-layout-wrong-parsing-when-multiple-fields-are-empty
I don't know if I'm missing some attribute configuration.
I downloaded the code, and in the debugger it seems to be a problem in the DelimitedLineParser class, at this check:
if (line.Length > linePosition + delimiterSize)
At first glance, could the following line be a solution?
if (line.Length >= linePosition + delimiterSize)
It's probably just me being thick, so apologies in advance.
With a fixed-width file, I can use attributes to describe the mappings on the object. I can then use factory.GetEngine, which uses the attributes to understand the layout.
With a delimited file, I can also use attributes - but I cannot do factory.GetEngine (I can see this was recently removed).
Does this mean that with delimited files I cannot use attributes? Do I have to manually create the layout when working with delimited files - as opposed to how it works with fixed width files? Or am I just missing something?
Thanks in advance.
I see that a new NuGet release has been created, but I can't tell what's changed, as there are no tags or release notes and the change log hasn't been updated in ages.
I've been playing around with some ideas for a multi-record file engine for fixed width files, something similar to the MultiRecordEngine in FileHelpers.
There are a couple of things to consider (mostly related to reading; I haven't thought much about writing yet):
ArrayList with all the parsed records, putting the burden on the user to loop through the list and determine what to do with each record one by one. That makes it easy on us, but harder on the user. I think we can do better with something like one of the following:
i. T where T : interface that serves as a simple marker interface. That allows us to return List<T> rather than ArrayList. The user still needs to go through the list item by item (or use a LINQ query with OfType<>), but with the added benefit of more LINQ functionality available on the returned list.
ii. List<object>. Less runtime type safety, but doesn't require the user to create a marker interface.
iii. Dictionary<Type, List<T>> that contains a List<T> for each record type passed to the multi-record engine. The user can then access results[typeof(SomeRecord)] to get the results for that type.
iv. GetResults<T>() on the engine after reading is complete. This then returns the appropriate List to the user and feels more polished from their perspective, I feel.
v. List<T> for each record type, automatically named something like MyRecordList. This has a certain "cool factor" to it, but ultimately I think I favor option iv as it requires less guessing on the user's part.
I think I'm going to take a crack at implementing a fixed-width file engine (and factory) that takes a param array of types to parse and implements the GetResults<T> concept. However, before I go too far down the rabbit hole, I wanted to see what thoughts you had.
Thanks.
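To make the GetResults<T> idea concrete, here is a minimal sketch of a per-type result store. It is not the engine's actual implementation; `RecordStore`, `Add`, and the internal dictionary are hypothetical and only illustrate how parsed records could be bucketed by type and handed back as a typed list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical container, for illustration only.
public sealed class RecordStore
{
    private readonly Dictionary<Type, List<object>> _byType =
        new Dictionary<Type, List<object>>();

    // Files a parsed record under its concrete runtime type.
    public void Add(object record)
    {
        if (record == null) throw new ArgumentNullException(nameof(record));
        var type = record.GetType();
        if (!_byType.TryGetValue(type, out var list))
        {
            list = new List<object>();
            _byType[type] = list;
        }
        list.Add(record);
    }

    // Returns the records of exactly type T, in the order they were added.
    public IReadOnlyList<T> GetResults<T>()
    {
        return _byType.TryGetValue(typeof(T), out var list)
            ? list.Cast<T>().ToList()
            : new List<T>();
    }
}
```

Keying on the exact runtime type keeps lookups cheap, at the cost of not returning subclasses when a base type is requested.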
I might be missing a setting, but is there a way to verify the overall length of the line?
I was setting up some test cases for errors, and one of the ones I tried was making a line way longer than it is supposed to be, but it doesn't seem to hit any catches.
I would expect it to at least hit truncateIfExceeded = false.
I am parsing using IFlatFileMultiEngine and [FixedLengthField] attributes.
There are a number of fields I don't care about. I thought I could just define the fields I need, simply skipping the irrelevant ones, like this:
[FixedLengthField(1, 10)]
public string ClientId { get; set; }
[FixedLengthField(100, 50)]
public string Address { get; set; }
I'd expect the 'gap' (data in positions 11..99) to be ignored. However, it looks like the 'index' parameter isn't used in this case, and Address would start from 11 rather than 100.
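The expected behaviour (each field read from its own 1-based start position, with gaps ignored) can be sketched in plain C#. `FixedSlice.Field` is a hypothetical helper and says nothing about how the library's parser currently works.

```csharp
using System;

// Hypothetical helper, not part of FlatFile.
public static class FixedSlice
{
    // Extracts up to `length` characters starting at the 1-based position `index`.
    // If the line is shorter than expected, returns whatever is available.
    public static string Field(string line, int index, int length)
    {
        if (line == null) throw new ArgumentNullException(nameof(line));
        if (index < 1) throw new ArgumentOutOfRangeException(nameof(index));
        if (length < 0) throw new ArgumentOutOfRangeException(nameof(length));

        var start = index - 1;
        if (start >= line.Length) return string.Empty;
        return line.Substring(start, Math.Min(length, line.Length - start));
    }
}
```

With this, `Field(line, 1, 10)` and `Field(line, 100, 50)` read ClientId and Address independently, and positions 11..99 are never touched.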
I realized this cannot be used in projects that require strong-named assemblies. Are you accepting pull requests so I can make one?
For a fixed length file with multiple records, how do you keep track of the order that records appear?
For example, a file with a header record type and a detail record type:
Header
Detail
Detail
Detail
Header
Detail
Header
Detail
Detail
In this case the detail lines belong to the header rows above them.
GetRecords returns all records of a given type, how can we keep track of which detail lines belong to which headers?
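One way to keep the association, sketched independently of FlatFile, is to walk the lines in file order and attach each detail to the most recent header. `HeaderGroup`, `MasterDetail.Group`, and the `isHeader` predicate are hypothetical names; in practice the predicate would be whatever rule identifies a header line.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types, for illustration only.
public sealed class HeaderGroup
{
    public HeaderGroup(string header) { Header = header; }
    public string Header { get; }
    public List<string> Details { get; } = new List<string>();
}

public static class MasterDetail
{
    // Groups detail lines under the header line that precedes them.
    public static List<HeaderGroup> Group(IEnumerable<string> lines, Func<string, bool> isHeader)
    {
        var groups = new List<HeaderGroup>();
        HeaderGroup current = null;
        foreach (var line in lines)
        {
            if (isHeader(line))
            {
                current = new HeaderGroup(line);
                groups.Add(current);
            }
            else if (current != null)
            {
                current.Details.Add(line);
            }
            else
            {
                throw new FormatException("Detail line found before any header.");
            }
        }
        return groups;
    }
}
```

For the nine-line example above, this produces three groups holding 3, 1, and 2 detail lines.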
When a property is null, the resulting column is never padded. This is true whether or not AllowNull was used with a non-null value. I would actually expect the output to be padded in both of those scenarios.
I did not find any property to specify how to convert dates in my flat file to a property of type DateTime, and when I tried to read the file, an exception was thrown converting 20170712 to type DateTime.
public class BDIHeader
{
public int Tipo { get; set; }
public string NomeArquivo { get; set; }
public string Origem { get; set; }
public int Destino { get; set; }
public DateTime DataGeracao { get; set; }
public DateTime DataPregao { get; set; }
public string HoraMinuto { get; set; }
}
public sealed class BDIHeaderLayout : FixedLayout<BDIHeader>
{
public BDIHeaderLayout()
{
this.WithMember(x => x.Tipo, c => c.WithLength(2))
.WithMember(x => x.NomeArquivo, c => c.WithLength(8))
.WithMember(x => x.Origem, c => c.WithLength(8))
.WithMember(x => x.Destino, c => c.WithLength(4))
.WithMember(x => x.DataGeracao, c => c.WithLength(8))
.WithMember(x => x.DataPregao, c => c.WithLength(8))
.WithMember(x => x.HoraMinuto, c => c.WithLength(4));
}
}
public void Read()
{
//
var factory = new FixedLengthFileEngineFactory();
using (var stream = new FileInfo(enderecoArquivoBDI).Open(FileMode.Open, FileAccess.Read, FileShare.Read))
{
// If using attribute mapping, pass an array of record types
// rather than layout instances
var layouts = new ILayoutDescriptor<IFixedFieldSettingsContainer>[]
{
new BDIHeaderLayout(),new BDIIndiceLayout()
};
var flatFile = factory.GetEngine(layouts,
line =>
{
// For each line, return the proper record type.
// The mapping for this line will be loaded based on that type.
// In this example, the first two characters determine the
// record type.
if (String.IsNullOrEmpty(line) || line.Length < 2) return null;
switch (line.Substring(0, 2))
{
case "00":
return typeof(BDIHeader);
//case "01":
// return typeof(BDIIndice);
//case "02":
// return typeof(BDINegociosPapelLayout);
//case "99":
// return typeof(BDITrailerLayout);
}
return null;
});
flatFile.Read(stream);
var header = flatFile.GetRecords<BDIHeader>().FirstOrDefault();
//var indices = flatFile.GetRecords<BDIIndice>().ToList();
//var negocios = flatFile.GetRecords<BDINegociosPapelLayout>();
//var trailer = flatFile.GetRecords<BDITrailer>().FirstOrDefault();
}
}
line being read
00BDIN9999BOVESPA 999920170713201707131807
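For reference, the default string-to-DateTime conversion does not understand a compact value such as 20170713, whereas `DateTime.ParseExact` with the `yyyyMMdd` format does. The sketch below is plain .NET; `BdiDates.ParseCompactDate` is a hypothetical helper, and plugging it into the layout (for example via a custom converter) is an assumption that is not shown.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of FlatFile.
public static class BdiDates
{
    // Parses an 8-digit date such as "20170713" (year, month, day).
    public static DateTime ParseCompactDate(string raw)
    {
        return DateTime.ParseExact(
            raw, "yyyyMMdd", CultureInfo.InvariantCulture, DateTimeStyles.None);
    }
}
```

Given the field lengths in BDIHeaderLayout (2 + 8 + 8 + 4 characters before DataGeracao), the first date in the sample line sits at zero-based offset 22, so `ParseCompactDate(line.Substring(22, 8))` would read it.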
Objective:
Allow Delimited Files with multiple record types to be loaded via FlatFile.
Context:
We were happily using FlatFile to load all of our position-dependent files with multiple record types, but then we tried to implement a delimited file with multiple record types and found we couldn't. So we figured we would have a stab at implementing this by following the approach used by the existing implementation.
Disclaimers:
We need this to meet project deadlines.
I have never contributed to an open-source / GitHub project before, so while I have attempted to follow convention and figure out what the rules are, please let me know of any oversights or mistakes.
Let's say I have a fixed-width document with 7000 columns and I want to skip random sections, say 750-950, 1000-4000.... Is there a better way to skip these sections than to create a single string Ignored { get; } property and basically skip these sections via known length offsets?
I've got this class:
[DelimitedFile(Delimiter = "|", Quotes = "\"", HasHeader = true)]
class SchemaRow
{
[DelimitedField(1)] public int Ordinal {get; set;}
[DelimitedField(2)] public Guid GUID { get; set; }
[DelimitedField(3)] public string KeyItem { get; set; }
[DelimitedField(4)] public string SNLxlKeyField {get;set;}
[DelimitedField(5)] public string ProductCaption { get; set; }
[DelimitedField(6)] public string DataType { get; set; }
[DelimitedField(7, NullValue="|")] public string Magnitude { get; set; }
[DelimitedField(8, NullValue = "|")] public int? Length { get; set; }
}
It blows up parsing this row where the last 2 fields happen to be empty. I can't seem to find a value to set on the NullValue attribute parameter to make this work. Rows where the last field is populated but the second-to-last field is empty work fine.
5|40F7CE96-DC46-4EFC-AAC4-C76199C1E769|1403|132092|End of Fiscal Period Date|smalldatetime||
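As a reference for what the row should parse to, a plain split on the delimiter yields eight fields, the last two being empty strings. The sketch below ignores quoting and is not the library's parser; `PipeRow.SplitKeepingEmpty` is a hypothetical name.

```csharp
using System;

// Hypothetical helper, not part of FlatFile; no quote handling.
public static class PipeRow
{
    // Splits on the delimiter and keeps empty fields, including trailing ones.
    public static string[] SplitKeepingEmpty(string line, char delimiter)
    {
        if (line == null) throw new ArgumentNullException(nameof(line));
        return line.Split(delimiter); // empty entries are kept by default
    }
}
```

Under that reading, Magnitude and Length both receive an empty string, which a NullValue of "" would be expected to map to null.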
It would be nice to be able to export and read in a mapping so a user can map files and save that mapping to a database and pull up later. A simple json format might work.
The converter is null on the field settings at runtime. I'm going to take a look and see if I can fix it.
You mention attributes in this example, but it's not clear how to use them:
// If using attribute mapping, pass an array of record types
// rather than layout instances
Could you clarify specifically using delimited attributes what this might look like?
I've run into a processing case where the file is following a non-standard approach of RPad and LPad.
This could be worked around with type converter, but was wondering if there was a built-in way to support this case.
Thanks!
\Implementation\FixedLengthLineBuilder.cs
protected override string TransformFieldValue(IFixedFieldSettingsContainer field, string lineValue)
This doesn't check for null fields and instead pads the field no matter what.
I ended up needing to add the following before the padding output to the above function:
if (field.IsNullable && lineValue == field.NullValue)
return lineValue;
Documentation and examples should be up to date, so it would be good to add some examples for the multi-record file engine (#10, #11).
@petersondrew could you please help me with this?
I use the fixed-length attributes to define my structure.
I have a given format with yyyyMMdd hh:mm. How can I specify this for a DateTime property? I found no attribute and no sample showing how to map DateTime types.
I have a file with a structure like this:
HEADER
RECORD TYPE1
RECORD TYPE2
RECORD TYPE3
RECORD TYPE3
RECORD TYPE1
RECORD TYPE2
RECORD TYPE4
RECORD TYPE5
TAIL
....
in which there are many groups that start with HEADER and end with TAIL.
This is an example: Test.txt, and these are the specs: CBI-RND-001 6_05_ENG.pdf
I could use the read from stream with multiple fixed record types, but when I get the list of RECORD TYPEx I don't know in which group the record was read.
Is it possible to at least have the line number of each record?
Thank you
Regards
Are there any plans to support .NET Core 1.1/2.0?
Using FixedLengthFileEngineFactory.
When there is a type conversion problem like "is not a valid value for Int32", it doesn't get handled by "handleEntryReadError:" and instead throws a global exception:
at System.ComponentModel.BaseNumberConverter.ConvertFrom(ITypeDescriptorContext context, CultureInfo culture, Object value)
at System.ComponentModel.TypeConverter.ConvertFromString(String text)
at FlatFile.Core.Extensions.TypeChangeExtensions.Convert(String input, Type targetType) in TypeChangeExtensions.cs:line 13
at FlatFile.Core.Base.LineParserBase`2.GetFieldValueFromString(TFieldSettings fieldSettings, String memberValue) in LineParserBase.cs:line 44
at FlatFile.FixedLength.Implementation.FixedLengthLineParser.ParseLine[TEntity](String line, TEntity entity) in FixedLengthLineParser.cs:line 23
at FlatFile.Core.Base.FlatFileEngine`2.TryParseLine[TEntity](String line, Int32 lineNumber, TEntity& entity) in FlatFileEngine.cs:line 122
at FlatFile.Core.Base.FlatFileEngine`2.d__8`1.MoveNext() in FlatFileEngine.cs:line 93
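Until conversion failures are routed through the error handler, one defensive pattern is to convert with `TryParse` and record the failing line and field explicitly. The sketch below is plain C# and does not reflect FlatFile internals; `FieldError` and `SafeConvert.TryReadInt` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical types, for illustration only.
public sealed class FieldError
{
    public FieldError(int lineNumber, string fieldName, string rawValue)
    {
        LineNumber = lineNumber;
        FieldName = fieldName;
        RawValue = rawValue;
    }

    public int LineNumber { get; }
    public string FieldName { get; }
    public string RawValue { get; }

    public override string ToString() =>
        $"Line {LineNumber}, field '{FieldName}': '{RawValue}' is not a valid Int32.";
}

public static class SafeConvert
{
    // Converts without throwing; on failure, records which line and field were at fault.
    public static bool TryReadInt(string raw, int lineNumber, string fieldName,
        ICollection<FieldError> errors, out int value)
    {
        if (int.TryParse(raw, NumberStyles.Integer, CultureInfo.InvariantCulture, out value))
            return true;

        errors.Add(new FieldError(lineNumber, fieldName, raw));
        return false;
    }
}
```

The collected errors carry enough context to build a per-field report instead of a bare stack trace.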
I need to handle a record that has a variable length last field, while the previous fields in the record are all fixed length.
We can fairly easily handle this if I add an Unbounded or VariableLength boolean property to the FieldSettings and Attributes. Then if that value is set on the field, the fixed length line parser can read to the end of the string (or to the specified field length, whichever comes first).
The other option is to extend the file engine factory to allow passing a custom line parser, or a Func<> to select the line parser on a per-line basis, or we can register a number of line parsers with the engine itself (each tied to a particular layout or Type) and let the FixedLengthLineParserFactory determine the appropriate line parser on a per-line basis.
Any thoughts on any of those approaches? Something like the second option is a more complex approach, but also adds a ton of flexibility without needing to constantly add support for edge cases (like this) to the library.
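To illustrate the second option, here is a minimal sketch of a registry that ties a parse function to each record type and dispatches per line. It is only a shape for discussion: `LineParserRegistry`, `Register`, and `Parse` are hypothetical and do not correspond to existing FlatFile types.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical registry, for illustration only.
public sealed class LineParserRegistry
{
    private readonly Dictionary<Type, Func<string, object>> _parsers =
        new Dictionary<Type, Func<string, object>>();

    // Associates a parse function with the record type it produces.
    public void Register<T>(Func<string, T> parser) where T : class
    {
        if (parser == null) throw new ArgumentNullException(nameof(parser));
        _parsers[typeof(T)] = line => parser(line);
    }

    // Looks up the parser registered for the selected record type and runs it.
    public object Parse(Type recordType, string line)
    {
        if (!_parsers.TryGetValue(recordType, out var parser))
            throw new InvalidOperationException(
                $"No line parser registered for {recordType.Name}.");
        return parser(line);
    }
}
```

A record with a variable-length last field would then get its own parser without the default one needing to know about the edge case.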
Hello,
I'm creating a FixedLengthFileEngineFactory as below (I copied the code from issue #30), but I can no longer do this:
var flatFile = factory.GetEngine();
It seems the GetEngine() call no longer has a parameterless overload and now wants a layout descriptor, like this:
var container = new FieldsContainer();
var descriptor = new LayoutDescriptorBase(container);
var flatFile = factory.GetEngine(descriptor);
Doing the above fails to write the "MyRecords" file to the stream, though, and I get exceptions when writing to the stream. I'm using the latest version, 0.2.51.0.
Any idea what's missing here, and why am I not able to do it the way the sample below shows?
Thank you!
-- Sample code below --
[FixedLengthFile]
class MyRecord
{
[FixedLengthField(1, 4)]
public string Prefix { get; set; }
}
var factory = new FixedLengthFileEngineFactory();
var flatFile = factory.GetEngine();
return flatFile.Read(stream);
Hi,
I use fixed-length attributes like the following:
[FixedLengthFile]
public class MyFile
{
[FixedLengthField(1, 1)]
public string LineType { get; set; }
[FixedLengthField(2, 13, Padding = Padding.Right, PaddingChar = ' ')]
public string FormatVersion { get; set; }
// TODO: What to write here? I am missing something like DecimalPlaces = 2.
[FixedLengthField(15, 15, PaddingChar = '0')]
public decimal Amount { get; set; }
}
How can I use fixed length with decimal values? - the specification allows 2 decimal places. I need to specify that somewhere...
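If the specification stores amounts as zero-padded digits with two implied decimal places (an assumption; the post does not show a sample value), the conversion can be sketched in plain C#. `ImpliedDecimal` and its methods are hypothetical helpers that could back a custom converter; they are not a FlatFile feature.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of FlatFile.
public static class ImpliedDecimal
{
    private static decimal Scale(int decimalPlaces)
    {
        if (decimalPlaces < 0) throw new ArgumentOutOfRangeException(nameof(decimalPlaces));
        var scale = 1m;
        for (var i = 0; i < decimalPlaces; i++) scale *= 10m;
        return scale;
    }

    // Reads digits such as "000000000012345" as 123.45 when decimalPlaces is 2.
    public static decimal Parse(string raw, int decimalPlaces)
    {
        var digits = long.Parse(raw.Trim(), NumberStyles.Integer, CultureInfo.InvariantCulture);
        return digits / Scale(decimalPlaces);
    }

    // Writes a non-negative amount as zero-padded digits without a decimal separator.
    public static string Format(decimal value, int length, int decimalPlaces)
    {
        if (value < 0) throw new ArgumentOutOfRangeException(nameof(value));
        var scaled = decimal.Round(value * Scale(decimalPlaces), 0, MidpointRounding.AwayFromZero);
        var text = scaled.ToString("0", CultureInfo.InvariantCulture);
        if (text.Length > length)
            throw new OverflowException("Value does not fit in the field.");
        return text.PadLeft(length, '0');
    }
}
```

Negative amounts are rejected here because the sign convention for such fields varies between specifications.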