iryndin / jdbf
Java utility to read/write DBF files
Hi, there is an issue with the Character field's length: byte 17 should be treated as the high part of the length word for Character fields (see http://www.autopark.ru/ASBProgrammerGuide/DBFSTRUC.HTM). Here is how I've fixed it:
switch (type) {
    case Character:
        length = (fieldBytes[17] << 8) | (fieldBytes[16] & 0xff);
        break;
    default:
        length = fieldBytes[16];
        if (length <= 0) {
            length = 256 + length;
        }
}
I have a Visual FoxPro dbf with an fpt that causes processing to halt with no error message during MemoReader.read(), called as part of DbfRecord.getMemoAsString(). After some debugging, I discovered that the InputStream.skip() call below was skipping only 8192 bytes instead of the number of bytes requested.
memoInputStream.skip(memoHeader.getBlockSize()*offsetInBlocks);
According to documentation at https://docs.oracle.com/javase/8/docs/api/java/io/InputStream.html#skip-long- the InputStream.skip() method may skip fewer bytes than requested:
Skips over and discards n bytes of data from this input stream. The skip method may, for a variety of reasons, end up skipping over some smaller number of bytes, possibly 0. This may result from any of a number of conditions; reaching end of file before n bytes have been skipped is only one possibility. The actual number of bytes skipped is returned. If n is negative, the skip method for class InputStream always returns 0, and no bytes are skipped. Subclasses may handle the negative value differently.
My workaround was to add a new method in IOUtils:
public static long inputStreamSkip(InputStream stream, long bytesToSkip) throws IOException {
    long bytesSkipped = stream.skip(bytesToSkip);
    long totalBytesSkipped = bytesSkipped;
    while (totalBytesSkipped < bytesToSkip && bytesSkipped > 0) {
        bytesSkipped = stream.skip(bytesToSkip - totalBytesSkipped);
        totalBytesSkipped += bytesSkipped;
    }
    return totalBytesSkipped;
}
and then replace the above MemoReader.read() line with:
IOUtils.inputStreamSkip( memoInputStream, memoHeader.getBlockSize()*offsetInBlocks );
I have also experienced similar behavior with the DbfReader.seek() method but didn't know why until now. I suspect that all InputStream.skip() calls could potentially suffer from this same issue.
Please add support for dBase Level 7
http://www.dbase.com/Knowledgebase/INT/db7_file_fmt.htm
I had difficulty reading an Integer field from a file.
I tried DbfRecord.getString(fieldName) and DbfRecord.getBigDecimal(fieldName) but received unexpected results.
I found a workaround with the toMap() method, but the easiest way seems to be the getInteger method of DbfRecord. Unfortunately, this method is private.
Reading dbf files created with jdbf fails if there is one (or more) field of Integer type.
String fieldsInfo = "CODE,C,12,0|TITLE,C,40,0|CATEGORY,I,5,0";
DbfMetadata meta1 = DbfMetadataUtils.fromFieldsString(fieldsInfo);
meta1.setType(DbfFileTypeEnum.FoxBASEPlus1);
DbfWriter writer = new DbfWriter(meta1, out); // out is an OutputStream that sends the file to the client side.
writer.setStringCharset(Charset.forName("ISO-8859-1")); // ISO-LATIN-1
Map<String, Object> map = new HashMap<>();
map.put("CODE", "1");
map.put("TITLE", "FIRST");
map.put("CATEGORY", 1);
writer.write(map);
writer.close();
No error is raised during creation process.
String fieldsInfo = "CODE,C,12,0|TITLE,C,40,0|CATEGORY,N,5,0";
Changing the definition to N converts the field to Float, and the dbf file can then be read (with an external dbf browser).
Many thanks.
Hi Ivan, I've been using your jar for 3 months and found the following problem: JDBF repeats some records when reading, so the total read differs from the file. The test dbf file is attached.
Thank You
Writes to a dbf file are broken - the header is not written properly.
For now, deleted records are read together with non-deleted ones.
However, there is no way to check whether a record is deleted.
We should add one: a boolean flag that indicates whether a record is deleted.
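In the standard DBF layout, the first byte of each record is a deletion marker: 0x2A ('*') for deleted records and 0x20 (space) for active ones. A minimal sketch of such a flag, assuming access to the raw record bytes (the class and method names here are hypothetical, not jdbf's API):

```java
// Sketch: check the DBF deletion marker, which is the first byte
// of each record ('*' = deleted, ' ' = active).
public class DeletedFlagSketch {
    public static boolean isDeleted(byte[] recordBytes) {
        return recordBytes.length > 0 && recordBytes[0] == 0x2A;
    }

    public static void main(String[] args) {
        byte[] deleted = {0x2A, 'A', 'B'};
        byte[] active  = {0x20, 'A', 'B'};
        System.out.println(isDeleted(deleted)); // true
        System.out.println(isDeleted(active));  // false
    }
}
```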
The reader.read() returns null despite there being records. Sometimes it can read the first few records before returning null. Other libraries function correctly.
.dbf files come from Paradox 11.
Other libraries were capable of returning my rows so I'll just use either one of the other two that I have tested.
I'm parsing a file and I get this error when I call reader.nextRecord()
while ((rowObjects = reader.nextRecord()) != null) {
    for (int i = 0; i < rowObjects.length; i++) {
        Double dataNumber = Double.parseDouble(String.valueOf(rowObjects[i]).trim());
        System.out.println(new BigDecimal(dataNumber).toPlainString());
    }
}
I am getting values in exponential (e) notation, and the values are wrong.
DBF Files : https://github.com/chinna1048/dbf_files
According to the DBF file standard by dBase, header bytes 1-3 describe the date of last update; add 1900 to the YEAR byte to determine the actual year.
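A minimal sketch of that decoding, under the stated assumption that the header stores YY MM DD at offsets 1-3 and that the year is YEAR + 1900 (class and method names are mine, not jdbf's):

```java
import java.time.LocalDate;

// Sketch: decode the "date of last update" from DBF header bytes 1-3
// (YY MM DD), adding 1900 to the year byte as the standard prescribes.
public class HeaderDateSketch {
    public static LocalDate lastUpdate(byte[] header) {
        int year  = 1900 + (header[1] & 0xFF); // unsigned year byte + 1900
        int month = header[2] & 0xFF;
        int day   = header[3] & 0xFF;
        return LocalDate.of(year, month, day);
    }

    public static void main(String[] args) {
        byte[] header = new byte[32];
        header[1] = 124; // 1900 + 124 = 2024
        header[2] = 5;
        header[3] = 17;
        System.out.println(lastUpdate(header)); // 2024-05-17
    }
}
```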
Instead of:
DbfRecord rec;
while ((rec = reader.read()) != null) {
    System.out.println("Record #" + rec.getRecordNumber() + ": " + rec.toMap());
}
it would be nicer to write:
for (final DbfRecord rec : reader.getRecordIterator()) {
    System.out.println("Record #" + rec.getRecordNumber() + ": " + rec.toMap());
}
Using an iterator avoids keeping a mutable reference to DbfRecord.
Dear iryndin,
Thank you for this library it is very helpful.
Is it possible to add the ability to read and write the CDX file?
Don't load DBF and MEMO files into memory when reading. Make loading into memory optional.
If the file type flag is greater than 0x7F, DbfFileTypeEnum does not correctly determine the DBF file type. This is due to incorrect byte -> int conversion.
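A hedged sketch of the conversion issue: in Java, byte is signed, so any flag above 0x7F reads as a negative int unless masked with 0xFF first (the class and method names below are mine, not the library's):

```java
// Sketch: unsigned byte -> int conversion for a DBF file type flag.
// A flag like (byte) 0x8B reads as -117 through plain widening;
// masking with 0xFF restores the intended unsigned value 0x8B.
public class FileTypeFlagSketch {
    public static int unsignedFlag(byte typeByte) {
        return typeByte & 0xFF;
    }

    public static void main(String[] args) {
        byte flag = (byte) 0x8B;
        System.out.println(flag);               // -117 (signed view)
        System.out.println(unsignedFlag(flag)); // 139, i.e. 0x8B
    }
}
```

Comparing the masked value against the enum's flag constants avoids the misclassification described above.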
I have some problems reading float fields.
getFieldsStringRepresentation() returns all fields, but toMap() does not display the value:
meta.getFieldsStringRepresentation() = TARGET_FID,N,9,0|nummer,C,12,0|k_superfic,F,19,11|Nome_zone_,C,12,0
rec.toMap = {TARGET_FID=327284, nummer=315, Nome_zone_=RE
Thanks for your help.
Paolo
While trying to read from an old database, I get this exception:
java.text.ParseException: Unparseable date: "
at java.text.DateFormat.parse(DateFormat.java:366)
at net.iryndin.jdbf.util.JdbfUtils.parseDate(Unknown Source)
at net.iryndin.jdbf.core.DbfRecord.getDate(Unknown Source)
at net.iryndin.jdbf.core.DbfRecord.toMap(Unknown Source)
I am using a Visual FoxPro table (type: VisualFoxPro1) and want to read it in Java. I always get a blank or strange value (wrong place, wrong column, or a partial value) when I use getString(fieldName) or getBytes(fieldName). Example: I have a column called COMPNAME which has the value "World Bank N.V.".
To get the value by field name, I write rec.getString("COMPNAME") and it returns blank.
However, if I get the full row value with rec.getBytes(), I can see "World Bank N.V.".
Please advise on what I did wrong. I have set the charset to cp1252 correctly.
Prepare JDBF 3.0 for Maven Central release. This includes:
When I read my dbf, the last record always occurs twice
import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.text.ParseException;
import net.iryndin.jdbf.core.DbfRecord;
import net.iryndin.jdbf.reader.DbfReader;
public class JDBFTest {
    public static void main(String[] args) throws IOException, ParseException {
        DbfRecord rec;
        DbfReader reader = new DbfReader(new File("./src/gds_im.dbf"));
        while ((rec = reader.read()) != null) {
            rec.setStringCharset(Charset.forName("Cp866"));
            System.out.println(rec.toMap());
        }
        reader.close();
    }
}
Check this and make sure the last record is read only once.
Date is shifted 100 years forward.
But for "Visual FoxPro" files date parsing is correct.
So, we need to parse dates based on file type.
It would be nice if jdbf could be added as a driver to DBeaver.
I found DANS DBF, which works with jdbc-csv but is read-only and no longer developed.
I think the task is about the interface. I'm not a Java developer, but I can help with tests.
Is it possible to append records to existing dbf?
What is best way, if we want to write 200k rows?
I don't know for sure what version of FoxPro my client is running, but it's old and the type metadata gives me a value of FoxPro2x.
I got this error when using getMemoAsString(): it would show the first entry, then fail on the second. After some debugging and googling, I realized that BufferedInputStream throws this error if the skip() value is greater than the buffer size. BUFFER_SIZE in MemoReader is set to 8192, while memoHeader.getBlockSize()*offsetInBlocks was giving me huge numbers like 18510208, far greater than 8192. So I set BUFFER_SIZE to 185102080 and it went through with no errors.
This is probably not the best solution to the problem, but I wanted to mention it. Maybe there should be a setter for the buffer size.
Exception in thread "Main Thread" java.io.IOException: Resetting to invalid mark
at java.io.BufferedInputStream.reset(BufferedInputStream.java:416)
at net.iryndin.jdbf.reader.MemoReader.read(MemoReader.java:57)
at net.iryndin.jdbf.core.DbfRecord.getMemoAsString(DbfRecord.java:162)
at net.iryndin.jdbf.core.DbfRecord.getMemoAsString(DbfRecord.java:170)
at TestMain.test(TestMain.java:53)
at TestMain.main(TestMain.java:29)
InputStream dbf = new FileInputStream("/home/paul/Desktop/DBF/file.DBF");
InputStream memo = new FileInputStream("/home/paul/Desktop/DBF/file.FPT");
DbfRecord rec;
DbfReader reader = new DbfReader(dbf, memo);
DbfMetadata meta = reader.getMetadata();
System.out.println("Read DBF Metadata: " + meta);
int recCounter = 0;
while ((rec = reader.read()) != null) {
    rec.setStringCharset(stringCharset);
    System.out.println(rec.toMap());
    System.out.println(rec.getMemoAsString("CL_SERVCOM"));
}
There are some problems with corrupted files. Two examples :
1) Empty file
a) Replace test/resources/data1/gds_im.dbf with an empty file of the same name
b) Run mvn test
The program throws a NullPointerException:
java.lang.NullPointerException
at net.iryndin.jdbf.util.DbfMetadataUtils.parseHeaderUpdateDate(DbfMetadataUtils.java:67)
at net.iryndin.jdbf.util.DbfMetadataUtils.fillHeaderFields(DbfMetadataUtils.java:56)
at net.iryndin.jdbf.reader.DbfReader.readHeader(DbfReader.java:59)
at net.iryndin.jdbf.reader.DbfReader.readMetadata(DbfReader.java:45)
at net.iryndin.jdbf.reader.DbfReader.<init>(DbfReader.java:35)
Actually, the return value of dbfInputStream.read(bytes) in DbfReader.java (line 57) is not checked. I suggest throwing an IOException if that call returns fewer than the 16 expected bytes.
2) Small file
a) Replace test/resources/data1/gds_im.dbf with a file of the same name containing the 0x02 byte 16 times
b) Run mvn test
The program runs in an infinite loop. The reason is almost the same as above: in DbfMetadataUtils.readFields, line 83, the return value of inputStream.read(fieldBytes) is not checked. It should return JdbfUtils.FIELD_RECORD_LENGTH, but it returns 0. At line 91, inputStream.read() will then return -1, which is different from JdbfUtils.HEADER_TERMINATOR, so the loop never breaks.
It's not purely theoretical: I ran into these bugs with empty or corrupted files.
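The checked read suggested above (fail when fewer bytes arrive than expected) can be sketched as a small helper; the name readFully and its placement are assumptions, not jdbf's actual API:

```java
import java.io.IOException;
import java.io.InputStream;

// Sketch of a checked read: loop until the requested number of bytes
// has been read, and throw if the stream ends early, so truncated or
// corrupted DBF files fail fast instead of causing NPEs or infinite loops.
public class CheckedReadSketch {
    public static void readFully(InputStream in, byte[] buf) throws IOException {
        int off = 0;
        while (off < buf.length) {
            int n = in.read(buf, off, buf.length - off);
            if (n < 0) {
                throw new IOException("Unexpected end of stream: expected "
                        + buf.length + " bytes, got " + off);
            }
            off += n;
        }
    }
}
```

Using this in readHeader() and readFields() would turn both the NullPointerException and the infinite loop into a clear IOException.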
Hi,
First of all, thank you to iryndin for sharing his great work.
I was having problems reading Double values from a dbf, so I solved it by adding this in DbfRecord.java:
In the toMap() method, I have added the following case:
case Double: map.put(name, getDouble(name)); break;
And these new methods:
public Double getDouble(String fieldName) {
    byte[] bytes = getBytes(fieldName);
    return Double.longBitsToDouble(readLong(bytes));
}

protected long readLong(byte[] bytes) {
    // Cast each masked byte to long before shifting; otherwise the
    // (bytes[3] & 0xFF) << 24 term is a signed int and can sign-extend
    // when added to the long accumulator.
    long value = 0;
    value += (long) (bytes[7] & 0xFF) << 56;
    value += (long) (bytes[6] & 0xFF) << 48;
    value += (long) (bytes[5] & 0xFF) << 40;
    value += (long) (bytes[4] & 0xFF) << 32;
    value += (long) (bytes[3] & 0xFF) << 24;
    value += (long) (bytes[2] & 0xFF) << 16;
    value += (long) (bytes[1] & 0xFF) << 8;
    value += (bytes[0] & 0xFF);
    return value;
}
I hope this will help you.
Regards.
The source makes very little use of Java SE 7 features, so why not make the code Java SE 6 compliant?
I know that Java SE 6 was released on December 11, 2006, so it is almost ten years old. But some companies still use it (see https://plumbr.eu/blog/java/java-version-and-vendor-data-analyzed-2016-edition: 9.56% for Java SE 6).
I need a dbf library that works with Java SE 6, and I noticed there are only two diamond operators in the sources and three try-with-resources statements in the tests. It took me five minutes to make the jdbf code Java SE 6 compliant.
To sum up:
public BigDecimal getBigDecimal(String fieldName) {
    DbfField f = getField(fieldName);
    String s = getString(fieldName);
    if (s == null || "".equals(s)) {
        return null;
    }
    //MathContext mc = new MathContext(f.getNumberOfDecimalPlaces());
    //return new BigDecimal(s, mc);
    return new BigDecimal(s.trim());
}
When the DBF file has Chinese headers, they cannot be recognized.
Enhancement: parse fractional parts.
My dirty hack is:
public BigDecimal getBigDecimal(String fieldName) {
    DbfField f = getField(fieldName);
    String s = getString(fieldName);
    if (s == null || s.trim().length() == 0) {
        return null;
    } else {
        s = s.trim();
    }
    if (s.contains(NUMERIC_OVERFLOW)) {
        return null;
    }
    int a = f.getNumberOfDecimalPlaces();
    if (a == 0) {
        return new BigDecimal(s);
    } else {
        s = s.replace(',', '.');
        MathContext mc = new MathContext(a);
        return new BigDecimal(s, mc);
    }
}
Hello,
There is an infinite loop problem on JDK 1.8 or newer when jdbf loads a DBF file whose size is over Integer.MAX_VALUE (2^31 - 1).
The infinite loop occurs in DbfMetadataUtils.readFields(). In this method, jdbf reads the DBF header fields in a while loop; the main logic is:
The problem is the InputStream.available() method, which is overridden by BufferedInputStream. Check the source code:
public synchronized int available() throws IOException {
    int n = count - pos;
    int avail = getInIfOpen().available();
    return n > (Integer.MAX_VALUE - avail)
           ? Integer.MAX_VALUE
           : n + avail;
}
Obviously, when the DBF size is over Integer.MAX_VALUE, available() always returns Integer.MAX_VALUE.
So the calculation in step 5 is actually (Integer.MAX_VALUE - Integer.MAX_VALUE), which means the inputStream always skips 0 bytes, reads the first 32 bytes of the stream repeatedly, and ends up in an infinite loop.
To fix the problem, we should calculate the skip distance another way. Notice that there is a local variable headerLength that records the bytes read so far in the while loop; it seems that inputStream.skip(headerLength) would be the solution.
However, readHeader() is called before readFields() in DbfReader.readMetadata(). Checking readHeader(), we can see that it reads 32 bytes of the inputStream, so the skip should include those 32 bytes.
Finally, the fixed code in readFields() is:
public static void readFields(DbfMetadata metadata, InputStream inputStream) throws IOException {
    ...
    while (true) {
        ...
        //long oldAvailable = inputStream.available(); not needed anymore
        int terminator = inputStream.read();
        if (terminator == -1) {
            throw new IOException("The file is corrupted or is not a dbf file");
        } else if (terminator == JdbfUtils.HEADER_TERMINATOR) {
            break;
        } else {
            inputStream.reset();
            inputStream.skip(headerLength + JdbfUtils.FIELD_RECORD_LENGTH); // JdbfUtils.FIELD_RECORD_LENGTH is 32
        }
    }
    ...
}
I found that jdbf works well before JDK 1.8, JDK 1.6 for example.
The reason there is no infinite loop on JDK 1.6 is that the BufferedInputStream.available() implementation in JDK 1.6 is different:
public synchronized int available() throws IOException {
    return getInIfOpen().available() + (count - pos);
}
There is no Integer.MAX_VALUE check in JDK 1.6; although the return type is still int, this doesn't affect the follow-up processing.
The current master branch does not compile:
[ERROR] src/main/java/net/iryndin/jdbf/util/DbfMetadataUtils.java:[37,25] unreported exception java.io.IOException; must be caught or declared to be thrown
No FPT generated for memo
Error in Visual FoxPro:
"file is not marked with code page"
"not a table"