osiegmar / fastcsv
CSV library for Java that is fast, RFC-compliant and dependency-free.
Home Page: https://fastcsv.org/
License: MIT License
Hi, would you maybe consider making the API classes non-final?
I want to use your library, but I don't want to introduce PowerMock to be able to write my tests, or write wrapper classes just to circumvent the final modifier.
Thank you
I got an error when using this with Kotlin on Android API level 23. I found that Android is missing many classes, such as those in java.time and java.nio. Can you fix this incompatibility?
Sorry, this is probably not the best forum for this question, but I'm not quite sure where else to ask this - feel free to point me somewhere else for this discussion.
I see a number of references to a v2, which will probably introduce a number of improvements but also breaking changes. When are you thinking you might release v2? I see some commits from January on the version2-rewrite branch, but can't quite get a sense of how far along you are.
Also, it looks like v2 will require Java 8 - is that correct?
Hi,
I know you've put in some extra effort to provide CsvRow.getFields() as an unmodifiable List, but it would also be nice if the underlying String[] were directly available.
The reason, in my case, is that my application was built on another CSV parser before and handled all rows as String[]. Swapping in your library therefore requires me to convert the List<String> from getFields() back to String[] to avoid rewriting too much code. This feels both cumbersome and unnecessary, since the arrays are already available within the CsvRow, just not accessible. It would be nice to expose them directly.
Since many libraries deal with String[] rows, I think this could actually help FastCSV serve as a drop-in replacement for several of them.
Greetings from Stuttgart
Is your feature request related to a problem? Please describe.
Basically, CsvWriter violates the Single Responsibility Principle by handling both high-level CSV formatting and low-level I/O buffering, which creates problems for non-trivial uses.
Some use cases rely on a previously obtained Writer that is used to write more data than just a single CSV file. For example, the same stream may contain multiple CSVs, or other data at the end.
The user cannot simply call writer.flush(), because the writer is internally wrapped in a CachingWriter, so the last bytes may never get written until csvWriter.close() is issued, which closes the underlying writer as well.
Once a Writer is passed in, its state is unknown until CsvWriter.close() is called!
Describe the solution you'd like
Allow constructing a CsvWriter without touching the passed writer in any way. Let CsvWriter deal with building the CSV data structure, and let the passed Writer deal with the low-level I/O.
Describe alternatives you've considered
There is no alternative short of modifying the source code.
RFC 4180 compliance
Would this feature comply to RFC 4180?
Yes, but the important point is that the code will be more correct, because it won't interfere with external code passed to it.
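Until such an option exists, one possible workaround can be sketched with plain java.io (the CloseShieldWriter class below is hypothetical, not part of FastCSV): pass the CSV writer a wrapper that forwards flush() but swallows close(), so closing the CSV writer flushes its internal buffer without closing the shared underlying Writer.

```java
import java.io.FilterWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

/**
 * Sketch of a workaround (not part of FastCSV): a wrapper that forwards
 * flush() but swallows close(), so closing the CSV writer -- which flushes
 * its internal buffer -- does not close the shared underlying Writer.
 */
public class CloseShieldWriter extends FilterWriter {

    public CloseShieldWriter(final Writer out) {
        super(out);
    }

    @Override
    public void close() throws IOException {
        // Flush everything downstream, but keep the underlying Writer open
        // so the caller can continue writing non-CSV data to the same stream.
        out.flush();
    }

    public static void main(final String[] args) throws IOException {
        final StringWriter shared = new StringWriter();
        final Writer shield = new CloseShieldWriter(shared);
        shield.write("a,b,c\n");
        shield.close(); // would be triggered by csvWriter.close()
        shared.write("trailing non-CSV data\n"); // stream is still usable
        System.out.print(shared);
    }
}
```

The same shared stream then remains writable after the CSV section is closed, at the cost of having to close the underlying Writer yourself.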
Currently, CsvRow has an originalLineNumber property, which represents the line number in the file, but not the record number in the table.
Example:
planet,text // line 1 in file and line 1 in table view
Earth,"opan // line 2 in file and line 2 in table view
adsad
sdfsdf
sdfsdfsd
sdfsdfsd
sdfsfdsf"
Mars,marscool // line 8 in file, but line 3 in table view
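The difference between the two numbering schemes can be illustrated with a small quote-aware counter (a standalone sketch, not FastCSV code; the class and method names are made up): a line break inside a quoted field advances the file line number but not the record number.

```java
/**
 * Illustrative sketch (not FastCSV code): counts physical lines versus
 * logical CSV records, treating line breaks inside quoted fields as part
 * of the field. For the example above this yields 8 lines but 3 records.
 */
public class LineVsRecord {

    static int physicalLines(final String csv) {
        int lines = 1;
        for (int i = 0; i < csv.length(); i++) {
            if (csv.charAt(i) == '\n') {
                lines++;
            }
        }
        return lines;
    }

    static int logicalRecords(final String csv) {
        int records = 1;
        boolean inQuotes = false;
        for (int i = 0; i < csv.length(); i++) {
            final char c = csv.charAt(i);
            if (c == '"') {
                // Naive toggle; an escaped "" pair toggles twice, which is
                // harmless for counting newlines outside quoted sections.
                inQuotes = !inQuotes;
            } else if (c == '\n' && !inQuotes) {
                records++;
            }
        }
        return records;
    }

    public static void main(final String[] args) {
        final String csv = "planet,text\n"
            + "Earth,\"opan\nadsad\nsdfsdf\nsdfsdfsd\nsdfsdfsd\nsdfsfdsf\"\n"
            + "Mars,marscool";
        System.out.println(physicalLines(csv));   // 8
        System.out.println(logicalRecords(csv));  // 3
    }
}
```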
A null value is currently written as the string "null" by default. There should be a setter to override this default null representation.
Hi,
I'm a Java programmer; I saw the benchmark statistics and they were awesome. I have used the CSV reader as well.
How is it so fast? Which design choice makes it fast, and is missing from other libraries?
I'm very curious to know. Since I was unable to find an email address, I'm filing this as a feature request - please don't mind.
Thanks
Thank you for your work on this great product! Its proven performance has significantly improved our application.
We had initially been using version 1.0.4, which greatly improved the performance of our CSV parsing (over commons-csv, which we had been using previously). We recently tried upgrading to version 2.1.0. I like the new API; however, we noticed a significant performance degradation compared to 1.0.4. We have a somewhat unique data format, which involves an embedded CSV list within a CSV column. This is how our data looks:
NAME,NUMBER,WIDGETS_LIST
john doe,123456,"""thequickbrownfoxjumpedoverthelazydog"""
john smith,7890123,"""thequickbrownfoxjumpedoverthelazydog1"",""thequickbrownfoxjumpedoverthelazydog2"""
The WIDGETS_LIST column is a variable-length list formatted as an embedded CSV string. Each item in the list is usually around 200 characters long.
With fastcsv 1.0.4 we would parse the data with code like this:
class Parser {
List<Client> parseCsv(Path file) throws IOException {
List<Client> clients = new ArrayList<>();
CsvReader csvReader = new CsvReader();
try(var parser = csvReader.parse(file, StandardCharsets.UTF_8)) {
CsvRow row;
while( (row = parser.nextRow()) != null) {
String name = row.getField(0);
String number = row.getField(1);
List<String> widgets = parseWidgets(row.getField(2));
clients.add(new Client(name, number, widgets));
}
}
return clients;
}
List<String> parseWidgets(String data) throws IOException {
CsvReader csvReader = new CsvReader();
CsvParser parser = csvReader.parse(new StringReader(data));
CsvRow row = parser.nextRow();
return row != null ? List.copyOf(row.getFields()) : List.of();
}
}
With fastcsv 2.1.0 we parse with code like this:
class Parser {
List<Client> parseCsv(Path file) throws IOException {
try(var parser = CsvReader.builder().build(file)) {
return parser.stream()
.map(row -> {
String name = row.getField(0);
String number = row.getField(1);
List<String> widgets = parseWidgets(row.getField(2));
return new Client(name, number, widgets);
})
.toList();
}
}
List<String> parseWidgets(String data) {
return CsvReader.builder().build(data)
.stream().flatMap(r -> r.getFields().stream())
.toList();
}
}
Very surprisingly, the fastcsv 2.1.0 code takes around twice as long to parse the CSV data as version 1.0.4. It seems to be related to the embedded CSV string, since for other data without the embedded CSV, 2.1.0 is actually faster than 1.0.4. However, I cannot figure out why the embedded CSV causes such a significant slowdown. To get meaningful performance results, we benchmarked with a CSV file containing about 1 million rows, and processed the same file 10 times per run.
Additional context
Java distribution and version to be used (output of java -version):
openjdk version "17.0.2" 2022-01-18
OpenJDK Runtime Environment Temurin-17.0.2+8 (build 17.0.2+8)
OpenJDK 64-bit Server VM Temurin-17.0.2+8 (build 17.0.2+8, mixed mode, sharing)
In CsvAppender.appendField(final String value)
final char[] valueChars = value.toCharArray();
...
for (final char c : valueChars) {
This creates a temporary array that is only used for iteration. It is easy to avoid:
for (int i = 0; i < value.length(); i++) {
final char c = value.charAt(i);
IMO, the extra index checks in charAt() weigh less than the impact of "new char[length]" on the GC. But maybe I am wrong.
BTW, thanks for this nice easy to use library!
You can remove the throws clause, because nothing in the method can throw the exception.
public CsvParser parse(final Reader reader) /* nothing here throws IOException */ {
return new CsvParser(Objects.requireNonNull(reader, "reader must not be null"),
fieldSeparator, textDelimiter, containsHeader, skipEmptyRows,
errorOnDifferentFieldCount);
}
We are getting an OOM error when trying to read a large file - say 100 MB, with more than 1000 records in it.
Can anyone help with this? Can we read in chunks?
I cannot find any option to disable the textDelimiter.
Any idea?
The copyLen variable is not reset back to zero after being used in the CR and LF logic:
} else if (c == CR) {
if (copyLen > 0) {
localCurrentField.append(localBuf, localCopyStart, copyLen);
copyLen = 0; // FIXME missing this line <=======
}
localLine.addField(localCurrentField.toStringAndReset());
localPrevChar = c;
localCopyStart = localBufPos;
break;
} else if (c == LF) {
if (localPrevChar != CR) {
if (copyLen > 0) {
localCurrentField.append(localBuf, localCopyStart, copyLen);
copyLen = 0; // FIXME missing this here too! <========
}
localLine.addField(localCurrentField.toStringAndReset());
localPrevChar = c;
localCopyStart = localBufPos;
break;
}
localCopyStart = localBufPos;
} else {
Many thanks for such a great piece of software, we chose to use your library for some big-data processing because we found it had vastly better performance than anything else! Huge thanks!
We ran into a slightly obscure bug parsing some huge CSVs: a field that appears at the end of a row that is quoted but empty gets silently dropped. Here's an example:
"foo",""
If you run this test you'll see it fails:
@Test
public void handlesEmptyQuotedFieldsAtEndOfRow() throws IOException {
assertEquals(readCsvRow("foo,\"\"").getField(1), "");
}
We ran into this because we receive CSVs that have all fields quoted, even empty ones, and couldn't work out why accessing the final field would sometimes lead to an ArrayIndexOutOfBoundsException.
I've attempted a fix, which I'll raise a PR for momentarily. I've done my best to stick to the performance-sensitive methods you are using, but I'm eager for feedback if I've done anything not to your liking!
Please let me know if we can help in any other way!
@osiegmar
Unable to save to CSV due to an exception; below are my stack trace and my code.
No virtual method toPath()Ljava/nio/file/Path; in class Ljava/io/File;
if (entityList.size() > 0) {
File storageDir = new File(Environment.getExternalStorageDirectory() + "/"
+ this.getString(R.string.app_name));
boolean success = true;
if (!storageDir.exists()) {
success = storageDir.mkdirs();
}
if (success) {
// String baseDir = getExternalStorageDirectory().getAbsolutePath();
// String filePath = baseDir + "/" + "Demo.csv";
File file = new File(storageDir, "contacts.csv");
CsvWriter csvWriter = new CsvWriter();
Collection<String[]> data = new ArrayList<>();
data.add(new String[]{"Name", "Phone Number"}); // header belongs outside the loop
for (ContactEntity d : entityList) {
data.add(new String[]{d.getName(), d.getName()});
}
try {
csvWriter.write(file, StandardCharsets.UTF_8, data);
Log.v(TAG, "csv file created");
} catch (IOException e) {
e.printStackTrace();
}
} else {
Toast.makeText(this, "Directory does not exist", Toast.LENGTH_SHORT).show();
}
} else {
Toast.makeText(this, "No data to export csv", Toast.LENGTH_SHORT).show();
}
Hi,
I have a problem using the csvReader.
I have created the CSV file using the write-full-CSV-at-once writer, including a header and custom settings:
csvWriter.setFieldSeparator(';');
csvWriter.setLineDelimiter("\r\n".toCharArray());
csvWriter.setAlwaysDelimitText(true);
Now I am trying to read the CSV file using the read-full-file-with-header-at-once reader.
My first issue is that I get an ArrayIndexOutOfBoundsException, and I don't know if this is caused by the CSV file being large.
public Map<String, String> readingFileAtOnceHeader(File file) throws IOException {
Map<String, String> personMap = new HashMap<>();
CsvReader csvReader = new CsvReader();
csvReader.setContainsHeader(true);
CsvContainer csv = csvReader.read(file, StandardCharsets.UTF_8);
for (CsvRow row : csv.getRows()) {
personMap.put(row.getField("PersonID"), row.getField("CivilRegistrationNumber"));
}
return personMap;
}
Then I tried adding
csvReader.setTextDelimiter('\'');
which solved the problem (however, this is not what I want to add).
The second issue is that I am trying to read two of the header fields, as illustrated in the code above, but both of them are null and I can't figure out why. I tried with an index as well: row.getField(0) returns the whole row of data, not only the data at index 0, while row.getField(0) throws an IndexOutOfBoundsException.
Hi,
I am trying to read from a CSV file containing a bit more than 2 million rows, then do a simple mapping to something I can use to finally insert into a database. However, I am getting the error "GC overhead limit exceeded", as it creates a lot of temporary objects.
I read the other issue regarding temporary objects, but as far as I understood it concerns writing a CSV file, while I am getting this error when reading one.
First column of line: id;name ;firstname;age;�������;�����
Russian-language values in rows and fields are not read correctly.
When using the following methods:
read(final File file, final Charset charset)
at CsvReader.java:107
read(final Path path, final Charset charset)
at CsvReader.java:122
devices < API 26 receive this exception:
java.lang.NoSuchMethodError: No virtual method toPath()Ljava/nio/file/Path; in class Ljava/io/File; or its super classes (declaration of 'java.io.File' appears in /system/framework/core-oj.jar) at de.siegmar.fastcsv.reader.CsvReader.read(CsvReader.java:109)
Most likely this is because some methods in the java.nio package are not available below API 26:
Call requires API level 26 (current min is 19): java.nio.file.Paths#get
A suggestion would be to declare a minimum API requirement for these methods, or to read the file differently internally when the API level is below 26.
Hi!
I have an issue here. I know that you have written the code like this, but this line gives null:
CsvContainer csvContainer = csvReader.read(file, StandardCharsets.UTF_8);
Even if I have a header. I understand csvContainer being null if the CSV file is empty, but it should not be null if the file has at least one row or a header.
Edit:
Found out that this can be done this way.
public void newRow(String rowText) {
try {
CsvParser csvParser = csvReader.parse(file, StandardCharsets.UTF_8);
Collection<String[]> data = new ArrayList<>();
csvParser.nextRow(); // Need to call this to get the header
data.add(csvParser.getHeader().toArray(new String[0])); // Add header
CsvRow csvRow;
while((csvRow = csvParser.nextRow()) != null)
data.add(csvRow.getFields().toArray(new String[0])); // Add existing lines to data
data.add(rowText.split(delimiter)); // Add the new line to data with a new line
csvWriter.write(file, StandardCharsets.UTF_8, data); // Auto close
} catch (IOException e) {
dialogs.exception("Cannot add new rows", e);
}
}
FastCSV should have a method to append text to files. I know you have such a method, but it first removes all existing data, then writes.
It's a great library and I will use it with Deeplearning4j. But I wonder if you could provide an interface to make it easier to use?
By easier I mean that a programmer should not need to write this much code to handle null values and exceptions:
package se.danielmartensson.tools;
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import de.siegmar.fastcsv.reader.CsvContainer;
import de.siegmar.fastcsv.reader.CsvParser;
import de.siegmar.fastcsv.reader.CsvReader;
import de.siegmar.fastcsv.reader.CsvRow;
import de.siegmar.fastcsv.writer.CsvAppender;
import de.siegmar.fastcsv.writer.CsvWriter;
import javafx.scene.control.Alert.AlertType;
/**
* The reason why we are using FastCSV and not SQLite, is due to memory use.
* @author Daniel Mårtensson
*
*/
public class CSVHandler {
private Dialogs dialogs = new Dialogs();
private CsvReader csvReader;
private CsvWriter csvWriter;
private File file;
private String delimiter;
/**
* Constructor
* @param fileHandler File handler object
* @param filePath Path to our file
* @param delimiter Separator "," or ";" etc.
* @param headers String that contains name of columns with delimiter as separator
*/
public CSVHandler(FileHandler fileHandler, String filePath, String delimiter, String headers) {
file = fileHandler.loadFile(filePath);
this.delimiter = delimiter;
csvWriter = new CsvWriter();
csvReader = new CsvReader();
csvReader.setFieldSeparator(delimiter.charAt(0));
csvWriter.setFieldSeparator(delimiter.charAt(0));
/*
* Check if file has 0 rows = empty
*/
if(getTotalRows() == 0)
newHeader(headers); // Write our header if we don't have one
csvReader.setContainsHeader(true);
}
/**
* Get a single cell
* @param row Row index
* @param header Header name
* @return String
*/
public String getCell(int row, String header) {
try {
CsvContainer csvContainer = csvReader.read(file, StandardCharsets.UTF_8);
int totalRows = csvContainer.getRowCount();
if(row > totalRows)
dialogs.alertDialog(AlertType.WARNING, "Index", "Index out of bounds: " + row + " > " + totalRows);
else
for (int i = 0; i < totalRows; i++)
if(i == row)
return csvContainer.getRow(i).getField(header); // Success!
return ""; // Nothing happens!
} catch (IOException | NullPointerException e) {
dialogs.exception("Cannot get cell. Returning empty string", e);
return ""; // Empty
}
}
/**
* Return a complete row
* @param row Row number that we want to return
* @return
*/
public List<String> getRow(int row) {
try {
CsvContainer csvContainer = csvReader.read(file, StandardCharsets.UTF_8);
return csvContainer.getRow(row).getFields();
}catch(IOException | NullPointerException e) {
dialogs.exception("Cannot get rows. Return List<String> = null", e);
return null;
}
}
/**
* Set one value to a single cell
* @param row Row number
* @param header Our string header
* @param cellValue Our value that we want to insert
*/
public void setCell(int row, String header, String cellValue) {
try {
/*
* Get total columns and get the current cell value in a row
*/
CsvContainer csvContainer = csvReader.read(file, StandardCharsets.UTF_8);
CsvRow csvRow = csvContainer.getRow(row);
String currentCell = csvRow.getField(header);
int totalColumns = csvContainer.getRow(row).getFields().size();
/*
* Search for column index by searching for a know cell value
*/
int columIndex = 0;
while(columIndex < totalColumns)
if(csvRow.getField(columIndex).equals(currentCell))
break;
else
columIndex++;
/*
* Insert cellValue in column and insert row in container
*/
csvRow.getFields().set(columIndex, cellValue);
csvContainer.getRows().set(row, csvRow); // TODO: Test removing this line
/*
* Collect and write all
*/
writeAll(csvContainer);
} catch (IOException | NullPointerException e) {
dialogs.exception("Cannot set cell", e);
}
}
/**
* Replace a whole row
* @param row row number
* @param text text with delimiter separator
*/
public void replaceRow(int row, String text) {
try {
/*
* Replace all items in a row
*/
CsvContainer csvContainer = csvReader.read(file, StandardCharsets.UTF_8);
CsvRow csvRow = csvContainer.getRow(row);
String[] list = text.split(String.valueOf(delimiter));
int totalColumns = csvContainer.getRow(row).getFields().size();
if(list.length == totalColumns) {
for(int i = 0; i < list.length; i++)
csvRow.getFields().set(i, list[i]);
/*
* Insert row in container
*/
csvContainer.getRows().set(row, csvRow); // TODO: Test removing this line
/*
* Collect and write all
*/
writeAll(csvContainer);
}else {
dialogs.alertDialog(AlertType.ERROR, "Insert", "Not same dimension as CSV file");
}
} catch (IOException | NullPointerException e) {
dialogs.exception("Cannot replace row", e);
}
}
/**
* Search for a cell value in a given column
* @param cellValue The cell in form of a string
* @param header Name of the column
* @return boolean
*/
public boolean exist(String cellValue, String header) {
try {
CsvContainer csvContainer = csvReader.read(file, StandardCharsets.UTF_8);
if(csvContainer == null)
return false; // Nothing has been added, except the header
for(int i = 0; i < csvContainer.getRowCount(); i++)
if(cellValue.equals(csvContainer.getRow(i).getField(header)) == true)
return true; // Yes
return false; // Nope
} catch (IOException | NullPointerException e) {
dialogs.exception("Cannot check existence. Returning false", e);
return false;
}
}
/**
* Find on which row cellValue is on a header
* @param cellValue
* @param header
* @return int
*/
public int findRow(String cellValue, String header) {
try {
CsvContainer csvContainer = csvReader.read(file, StandardCharsets.UTF_8);
for(int i = 0; i < csvContainer.getRowCount(); i++)
if(cellValue.equals(csvContainer.getRow(i).getField(header)) == true)
return i; // Yes
return 0; // Nope
} catch (IOException | NullPointerException e) {
dialogs.exception("Cannot find row index. Returning 0", e);
return 0;
}
}
/**
* Delete the whole row at least if we got a row
* @param row row number
*/
public void deleteRow(int row) {
try {
/*
* Remove a selected row
*/
CsvContainer csvContainer = csvReader.read(file, StandardCharsets.UTF_8);
csvContainer.getRows().remove(row);
writeAll(csvContainer);
} catch (IOException | NullPointerException e) {
dialogs.exception("Cannot delete row", e);
}
}
/**
* Write all to the file
* @param csvContainer The CsvContainer object
* @throws IOException
*/
private void writeAll(CsvContainer csvContainer) throws IOException {
/*
* Collect and write all
*/
Collection<String[]> data = new ArrayList<>();
for(CsvRow csvRow : csvContainer.getRows())
data.add(csvRow.getFields().toArray(new String[0]));
csvWriter.write(file, StandardCharsets.UTF_8, data); // Auto close
}
/**
* Write a new header to the CSV file - This won't give us csvAppender == null if we have empty file
* @param rowText Enter the string
*/
public void newHeader(String rowText) {
try {
Collection<String[]> data = new ArrayList<>();
data.add(rowText.split(delimiter)); // Add the header data
csvWriter.write(file, StandardCharsets.UTF_8, data); // Auto close
} catch (IOException | NullPointerException e) {
dialogs.exception("Cannot write new header", e);
}
}
/**
* Create a new row
* @param rowText
*/
public void newRow(String rowText) {
try {
CsvParser csvParser = csvReader.parse(file, StandardCharsets.UTF_8);
Collection<String[]> data = new ArrayList<>();
CsvRow csvRow = csvParser.nextRow(); // Need to call this to get the header
data.add(csvParser.getHeader().toArray(new String[0])); // Add header
data.add(csvRow.getFields().toArray(new String[0])); // Add the row under the header
while((csvRow = csvParser.nextRow()) != null) {
data.add(csvRow.getFields().toArray(new String[0])); // Add existing lines to data
}
data.add(rowText.split(delimiter)); // Add the new line to data with a new line
csvWriter.write(file, StandardCharsets.UTF_8, data); // Auto close
} catch (IOException e) {
dialogs.exception("Cannot add new rows", e);
}
}
/**
* Return total rows
* @return int total rows
*/
public int getTotalRows() {
try {
CsvContainer csvContainer = csvReader.read(file, StandardCharsets.UTF_8);
if(csvContainer == null)
return 0; // Null means no rows here
return csvContainer.getRowCount();
} catch (IOException e) {
dialogs.exception("Cannot find total rows. Returning 0", e);
return 0;
}
}
/**
* Return total columns, in this case, it's on row index 0
* @return int total columns
*/
public int getTotalColumns() {
try {
CsvContainer csvContainer = csvReader.read(file, StandardCharsets.UTF_8);
if(csvContainer == null)
return 0; // Null means no rows here
return csvContainer.getRow(0).getFields().size();
} catch (IOException e) {
dialogs.exception("Cannot find total columns. Returning 0.", e);
return 0;
}
}
}
I noticed that FastBufferedWriter has a flush method which is never called:
// https://github.com/osiegmar/FastCSV/blob/master/src/main/java/de/siegmar/fastcsv/writer/FastBufferedWriter.java#L70-L73
@Override
public void flush() throws IOException {
flushBuffer();
out.flush();
}
FastBufferedWriter uses flushBuffer:
@Override
public void write(final char[] cbuf, final int off, final int len) throws IOException {
if (pos + len >= buf.length) {
flushBuffer();
}
if (len >= buf.length) {
out.write(cbuf, off, len);
} else {
System.arraycopy(cbuf, off, buf, pos, len);
pos += len;
}
}
private void flushBuffer() throws IOException {
out.write(buf, 0, pos);
pos = 0;
}
Should FastBufferedWriter use flush instead of flushBuffer? Or what is the purpose of flush, and when should it be used? The README does not provide any information about it.
I need a way to enforce per-field quoting in order to generate CSV for a PostgreSQL COPY statement: there, an unquoted empty field is treated as NULL, while a quoted empty string is treated as an empty string:
1,,3 ---> NULL
1,"",3 ---> ""
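For illustration, the desired output could be produced by a hand-rolled helper like this (a standalone sketch; the PgCsvRow class is hypothetical and not part of the CsvAppender API):

```java
import java.util.StringJoiner;

/**
 * Illustrative sketch (not FastCSV API): renders one CSV row so that null
 * values become unquoted empty fields and empty strings become quoted
 * empty fields -- the distinction PostgreSQL's COPY ... CSV relies on.
 */
public class PgCsvRow {

    static String render(final String... fields) {
        final StringJoiner row = new StringJoiner(",");
        for (final String field : fields) {
            if (field == null) {
                row.add("");             // NULL -> unquoted empty field
            } else if (field.isEmpty()) {
                row.add("\"\"");         // empty string -> quoted empty field
            } else if (field.contains(",") || field.contains("\"")
                    || field.contains("\n")) {
                // quote and double any embedded quotes
                row.add('"' + field.replace("\"", "\"\"") + '"');
            } else {
                row.add(field);
            }
        }
        return row.toString();
    }

    public static void main(final String[] args) {
        System.out.println(render("1", null, "3")); // 1,,3
        System.out.println(render("1", "", "3"));   // 1,"",3
    }
}
```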
The current CsvAppender API doesn't support such behavior. Possible solutions:
- an appendDelimitedField method in addition to appendField
- making the alwaysDelimitText field mutable, so that the consumer can turn it on/off before appending a specific field

Describe the bug
random access by offset is incorrect.
testfile: Item.csv
runtime log:
CsvRow[originalLineNumber=1, startingOffset=0, fields=[JHXMMLCARMY0926DYG0111RL, 139707794, Women’s V Neck Nightshirt Cotton Casual Sleepwear Short Sleeve Nightgown S-XXL, ACTIVE, PUBLISHED, , Clothing, 16.32, USD, 16.32, 0.0, , 2038356, VALUE, 0.48, "LB", , Seller Fulfilled, , 2A5KQQ6BAE5S, 05432968344899, , http://www.walmart.com/ip/Women-s-V-Neck-Nightshirt-Cotton-Casual-Sleepwear-Short-Sleeve-Nightgown-S-XXL/139707794, https://i5.walmartimages.com/asr/f277aaf6-4bf0-4635-be9b-9ecf8826bbfa.c175ab2fc00cdfa902ee3408d6d4c586.jpeg, UNNAV, ["UNNAV"], Carlendan, 10/29/2021, 12/31/2049, 10/29/2021, 10/29/2021, 0, , Y, , , , ], comment=false]
CsvRow[originalLineNumber=1, startingOffset=0, fields=[, ], comment=false]
To Reproduce
JUnit test to reproduce the behavior:
private static void randomAccessFile() {
try {
final Path path = Paths.get(System.getProperty("user.dir") + "/data/Item.csv");
// collect row offsets (could also be done in larger chunks)
final List<Long> offsets;
try (CsvReader csvReader = CsvReader.builder().build(path, UTF_8)) {
offsets = csvReader.stream()
.map(CsvRow::getStartingOffset)
.collect(Collectors.toList());
}
// random access read with offset seeking
try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "r");
FileInputStream fin = new FileInputStream(raf.getFD());
InputStreamReader isr = new InputStreamReader(fin, UTF_8);
CsvReader reader = CsvReader.builder().build(isr);
CloseableIterator<CsvRow> iterator = reader.iterator()) {
// seek to file offset of row 5
raf.seek(offsets.get(5));
reader.resetBuffer();
System.out.println(iterator.next());
// seek to file offset of row 8
raf.seek(offsets.get(8));
reader.resetBuffer();
System.out.println(iterator.next());
}
} catch (final IOException e) {
throw new UncheckedIOException(e);
}
}
Hi, I am trying to write using the appendLine() function, but the file is empty after I call it.
Is this a bug?
#18 happens to me too.
With or without endLine()
I think I found a bug in the CsvWriter class:
"Caused by: java.lang.NoSuchMethodError: No virtual method toPath()Ljava/nio/file/Path; in class Ljava/io/File; or its super classes (declaration of 'java.io.File' appears in /system/framework/core-oj.jar)
at de.siegmar.fastcsv.writer.CsvWriter.append(CsvWriter.java:148)"
It doesn't work on Android 7.0, but it works on Android 8.0 on the same phone.
I already tried forcing Android Studio to use Java VERSION_1_8 and VERSION_1_7, but it's still the same.
QuoteStrategy.EMPTY is convenient if I want to differentiate empty strings from null values in the output file.
However, there is no such parameter in CsvReader, which means I cannot read back the original data.
Below is a unit test demonstrating this:
/**
* Writes a single row of special values, reads back the file, and tests
* that read values exactly match the original values.
*/
@Test
public void test() throws IOException {
String[] values = new String[]{
"Simple text",
"Multiline\ntext",
// a string containing a comma
"1,2",
// a string with double quotes
"\"Hello\"",
// a string containing a single character: a double quote
"\"",
// an empty string
"",
// a null value
null
};
File tmp = new File("C:/tmp/csv.txt");
// write the csv file
try (CsvWriter csv = CsvWriter.builder()
.quoteStrategy(QuoteStrategy.EMPTY)
.build(tmp.toPath(), StandardCharsets.UTF_8)) {
csv.writeRow(values);
}
// read back the file
String[] readValues = null;
try (CsvReader csv = CsvReader.builder()
.skipEmptyRows(true)
.build(tmp.toPath(), StandardCharsets.UTF_8)) {
for (CsvRow row : csv) {
readValues = new String[row.getFieldCount()];
for (int i = 0; i < readValues.length; i++) {
readValues[i] = row.getField(i);
}
}
}
Assert.assertNotNull(readValues);
// this fails because of the null value read back as an empty string
Assert.assertArrayEquals(values, readValues);
}
It would be very nice to have the QuoteStrategy parameter in the reader.
When running the example on a device below Android 8.0 Oreo, this LogCat log is created after the crash:
06-14 08:57:12.906 21834-24375/? E/AndroidRuntime: FATAL EXCEPTION: AsyncTask #5
Process: com.app.my, PID: 21834
java.lang.NoSuchMethodError: No virtual method toPath()Ljava/nio/file/Path; in class Ljava/io/File; or its super classes (declaration of 'java.io.File' appears in /system/framework/core-oj.jar)
at de.siegmar.fastcsv.writer.CsvWriter.append(CsvWriter.java:149)
at com.app.my.utils.AirDataUtils.writePath(MyUtils.java:29)
at com.app.my.activities.MainActivity$56$1$1.run(MainActivity.java:2860)
at android.os.AsyncTask$SerialExecutor$1.run(AsyncTask.java:243)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1133)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:607)
at java.lang.Thread.run(Thread.java:762)
After updating the device to Android 8.0 Oreo, it works fine.
My build config:
compileSdkVersion 27
buildToolsVersion '27.0.3'
minSdkVersion 19
targetSdkVersion 27
The OutputStreamWriter is always opened with the WRITE option, so the file content gets replaced every time.
Many Writer implementations are already buffered and fast enough. Also, many use cases start with an already existing BufferedWriter or similar; adding an extra layer only adds another temporary buffer copy.
For example, you may have code like this working for different use cases:
GZIPOutputStream zout = new GZIPOutputStream(
new CipherOutputStream(new FileOutputStream(file), c));
return new BufferedWriter(new OutputStreamWriter(zout, "utf-8"));
Assuming you can't change the code above or you don't want to be forced to do this:
if (usingFastCsv)
return new OutputStreamWriter(zout, "utf-8")
else
return new BufferedWriter(new OutputStreamWriter(zout, "utf-8"));
Can you add some way to construct an appender without wrapping the writer?
Or I will try a pull request...
Hi,
Currently, to write anything using CsvWriter, a file, a file path, or a writer is supported. I wanted to write a gzipped CSV file, so I did something like the code below, where stream is a GZIPOutputStream. But it only accepts a byte array, and since the write(...) method provides a char[], I had to convert it to a byte array. I guess that's why it takes the same time as FasterXML.
Writer writer = new Writer() {
@Override
public void write(@NotNull char[] cbuf, int off, int len) throws IOException {
byte[] b = new byte[len];
for (int i = 0; i < len; i++) {
b[i] = (byte) cbuf[off + i]; // lossy cast: corrupts non-ASCII characters
}
stream.write(b);
}
@Override
public void flush() throws IOException {
stream.flush();
}
@Override
public void close() throws IOException {
stream.close();
}
};
Thanks.
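For what it's worth, the lossy char-to-byte cast can be avoided entirely by wrapping the GZIPOutputStream in an OutputStreamWriter with an explicit charset and letting it do the encoding. A self-contained sketch (the class and method names are made up for illustration; the resulting Writer could then be handed to any Writer-accepting API):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/**
 * Sketch: writing gzip-compressed text without a custom Writer. The
 * OutputStreamWriter handles char-to-byte encoding (here UTF-8), so no
 * lossy (byte) cast is needed and non-ASCII characters survive.
 */
public class GzipCsvSketch {

    static byte[] writeGzipped(final String content) throws IOException {
        final ByteArrayOutputStream bout = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(
                new GZIPOutputStream(bout), StandardCharsets.UTF_8)) {
            writer.write(content);
        } // closing flushes and finishes the gzip stream
        return bout.toByteArray();
    }

    static String readGzipped(final byte[] gz) throws IOException {
        try (GZIPInputStream gin =
                new GZIPInputStream(new ByteArrayInputStream(gz))) {
            return new String(gin.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(final String[] args) throws IOException {
        final String csv = "name,city\nJürgen,Köln\n"; // non-ASCII survives
        System.out.println(readGzipped(writeGzipped(csv)).equals(csv)); // true
    }
}
```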
result is [aaa, a",c, ccc]
Hello! First of all, I'd like to both thank you and congratulate you - it's a great library and very straightforward to use.
The only not-so-straightforward part of using this to export data to a CSV on my Android application was a combination of:
This is my first Android app, so maybe I'm making some rookie mistakes, but I ended up doing this, based on this method:
// <Activity code>
// called from a Button click, prompts the 'save As' window for the user to pick a file name and location
private void getCSVExportingFile() {
Intent intent = new Intent(Intent.ACTION_CREATE_DOCUMENT);
intent.addCategory(Intent.CATEGORY_OPENABLE);
intent.setType("text/csv");
intent.putExtra(Intent.EXTRA_TITLE, CSVExporter.buildName());
startActivityForResult(intent, Constants.CSV_CREATE_FILE_INTENT_CODE);
}
// this is called after the 'save As' window exits, with data == null if it fails or non-null if it succeeds.
@Override
protected void onActivityResult(int requestCode, int resultCode, @Nullable Intent data) {
super.onActivityResult(requestCode, resultCode, data);
if (requestCode == Constants.CSV_CREATE_FILE_INTENT_CODE && resultCode == RESULT_OK) {
Uri uri = null;
if (data != null) {
uri = data.getData();
try {
CSVExporter.exportToCSV(this.getContentResolver().openOutputStream(uri)); // <- this (1/2)
} catch (FileNotFoundException e) {
e.printStackTrace();
}
}
}
}
// <I called this CSVExporter, but could be anything>
// the actual dumping function
public static void exportToCSV(OutputStream openOutputStream) {
new Thread(() -> {
try (CsvWriter csv = CsvWriter.builder().build(new OutputStreamWriter(openOutputStream))) { // <- this (2/2)
writeHeader(csv);
List<SomeRecord> data = someClass.getRecords();
for(SomeRecord item : data){
writeRow(csv,item);
}
} catch (IOException e) {
e.printStackTrace();
}
}).start();
}
So what bothers me is that IMO this is the smoothest you can do.
Looking at the source code, the path version doesn't seem to do things much differently than what I did. Now, while this may be unnecessary, I think there's a quality-of-life improvement in adding a builder that takes either the Uri or the OutputStream as the parameter. Also, with little change, I think the above code could be used as a more realistic version of the CsvWriter file() example, making this library super plug-n-play. Going from the current example to the final version of this code was a big leap, IMO.
Thanks again!
I want to use FastCSV to implement a large-file CSV editor using JavaFX.
I plan to parse only individual rows, at render time. This means I need to use the CSV parser to parse one String line at a time.
Is it possible to have something like this?
CsvParser parser = new CsvParser();
parser.parse(line98);
parser.reset();
parser.parse(line99);
I mean I parse a line, then I may parse another, etc. For getting optimal memory usage, I would instantiate the CSVParser only one time.
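Reuse of a single parser instance aside, parsing one self-contained line is a small job in itself. Below is a simplified, hypothetical sketch of what such per-line parsing does; this is not FastCSV's parser, and parseLine is a made-up helper that assumes the line holds no embedded line breaks:

```java
import java.util.ArrayList;
import java.util.List;

public class LineParserDemo {
    // Hypothetical helper: splits one CSV line into fields,
    // honoring double quotes and "" as an escaped quote.
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        cur.append('"'); // escaped quote
                        i++;
                    } else {
                        inQuotes = false;
                    }
                } else {
                    cur.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                fields.add(cur.toString());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        fields.add(cur.toString());
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parseLine("a,\"b,c\",d")); // [a, b,c, d]
    }
}
```

A real reader can also be pointed at a single line by wrapping it in a java.io.StringReader, at the cost of one small object per line.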
I wanted to have support for commented lines as well. This will help in ignoring lines marked as comments in the CSV format.
Hi, the Travis build is failing ;-(...
Describe the bug
I am trying to read a file of approx. 340 MB in size. After I reach line number 809, I get this error:
Caused by: java.io.IOException: Maximum buffer size 8388608 is not enough to read data
To Reproduce
Try reading a CSV with 800k rows.
Code:
CsvReader reader = CsvReader.builder()
.fieldSeparator('\t')
.quoteCharacter('"')
.commentStrategy(CommentStrategy.NONE)
.skipEmptyRows(true)
.errorOnDifferentFieldCount(true)
.build(path, charset);
reader.forEach(System.out::println);
Additional context
java version "1.8.0_201"
Similar to OpenCSV's CsvToBeanBuilder
In the current version, some class names are not clear (see #7 (comment)).
Here are some suggestions for 2.x:
old name | new name |
---|---|
CsvReader | CsvReaderFactory |
CsvParser | CsvReader |
CsvWriter | CsvWriterFactory |
CsvAppender | CsvWriter |
CsvWriterFactory factory = new CsvWriterFactory().fieldSeparator(';');
try(StringWriter writer = new StringWriter()) {
try(CsvWriter csv = factory.create(writer)) {
...
}
}
Path file = ...;
try(CsvWriter csv = factory.create(file, StandardCharsets.UTF_8)) {
...
}
old name | new name |
---|---|
CsvReader | CsvReaderSettings |
CsvParser | CsvReader |
CsvWriter | CsvWriterSettings |
CsvAppender | CsvWriter |
CsvWriterSettings settings = new CsvWriterSettings();
settings.setFieldSeparator(';');
try(StringWriter writer = new StringWriter()) {
try(CsvWriter csv = CsvWriter.create(settings, writer)) {
...
}
}
Path file = ...;
try(CsvWriter csv = CsvWriter.create(settings, file, StandardCharsets.UTF_8)) {
...
}
old name | new name |
---|---|
CsvReader | CsvReaderSettings |
CsvParser | CsvReader |
CsvWriter | CsvWriterSettings |
CsvAppender | CsvWriter |
CsvWriterSettings settings = CsvWriterSettings.builder().fieldSeparator(';').build();
try(StringWriter writer = new StringWriter()) {
try(CsvWriter csv = CsvWriter.create(settings, writer)) {
...
}
}
Path file = ...;
try(CsvWriter csv = CsvWriter.create(settings, file, StandardCharsets.UTF_8)) {
...
}
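The third variant relies on an immutable settings object produced by a fluent builder. A minimal sketch of that pattern, where the class below is a stand-in to illustrate the design, not the proposed API itself:

```java
// Hypothetical immutable settings class illustrating the builder variant:
// the settings object is frozen once build() is called, so it can be
// shared safely between multiple writers and threads.
public final class SettingsBuilderDemo {
    private final char fieldSeparator;

    private SettingsBuilderDemo(Builder b) {
        this.fieldSeparator = b.fieldSeparator;
    }

    public char fieldSeparator() {
        return fieldSeparator;
    }

    public static Builder builder() {
        return new Builder();
    }

    public static final class Builder {
        private char fieldSeparator = ','; // sensible default

        public Builder fieldSeparator(char c) {
            this.fieldSeparator = c;
            return this; // fluent chaining
        }

        public SettingsBuilderDemo build() {
            return new SettingsBuilderDemo(this);
        }
    }

    public static void main(String[] args) {
        SettingsBuilderDemo s = SettingsBuilderDemo.builder()
                .fieldSeparator(';')
                .build();
        System.out.println(s.fieldSeparator()); // ;
    }
}
```

Compared with the mutable-settings variant above, this makes invalid intermediate states impossible once the object is built.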
Hi, there are commits in master, but no releases have been made in two years; is it possible for you to make a release?
Specifically interested in the following fix.
Thanks!
Hello!
Do you plan to add another method, where the InputStream is the first parameter?
Thanks.
public CsvContainer read(final InputStream stream) throws IOException {
Objects.requireNonNull(stream, "stream must not be null");
try (final Reader reader = newInputStreamReader(stream)) {
return read(reader);
}
}
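Until such an overload exists, the stream can be wrapped in an InputStreamReader by the caller and passed to the existing read(Reader) method. A stdlib-only sketch of the wrapping itself (the sample data and the UTF-8 charset are assumptions):

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class StreamToReaderDemo {
    public static void main(String[] args) throws IOException {
        InputStream stream = new ByteArrayInputStream(
                "a,b\n1,2\n".getBytes(StandardCharsets.UTF_8));

        // Wrapping the stream yourself has the same effect as the
        // requested overload -- and it makes the charset explicit.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(stream, StandardCharsets.UTF_8))) {
            System.out.println(reader.readLine()); // a,b
        }
    }
}
```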
Hi, I'm looking for a fast CSV writer Java lib, in append mode.
I've discovered your lib and I wonder if you have some benchmark/proof that your implementation is faster than others, like Apache Commons CSV for example (http://commons.apache.org/proper/commons-csv/)?
Thanks in advance for your answer.
Regards,
When the file has only one row, the getField method returns null.
Please help with this exception:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException
at java.base/java.lang.System.arraycopy(Native Method)
at de.siegmar.fastcsv.reader.ReusableStringBuilder.append(ReusableStringBuilder.java:65)
at de.siegmar.fastcsv.reader.RowReader.readLine(RowReader.java:74)
at de.siegmar.fastcsv.reader.CsvParser.nextRow(CsvParser.java:85)
at de.siegmar.fastcsv.reader.CsvReader.read(CsvReader.java:147)
at de.siegmar.fastcsv.reader.CsvReader.read(CsvReader.java:126)
at com.teamtrade.fundamental.report.screener.ReportScreener.readCsvFile(ReportScreener.java:69)
at com.teamtrade.fundamental.report.screener.ReportScreener.readQuarterReports(ReportScreener.java:141)
at com.teamtrade.fundamental.report.screener.ReportScreener.main(ReportScreener.java:45)
private static CsvContainer readCsvFile(Path path) throws IOException {
CsvReader csvReader = new CsvReader();
csvReader.setFieldSeparator('\t');
csvReader.setContainsHeader(true);
return csvReader.read(path, StandardCharsets.UTF_8); // line number 69
}
CSV file name 'txt.tsv'. It is in this archive: https://www.sec.gov/files/dera/data/financial-statement-and-notes-data-sets/2015q2_notes.zip
Can anyone please tell me the steps on how to use this in my Android application?
Thank You!
Maybe I'm missing something obvious .. I want to simply add a row of data to a CSV file that already exists on disk.
Using CsvWriter and writeRow does not create a row at the bottom of the file. I noticed things changed in the version overhaul, and the CsvAppender and appendLine stuff is gone.
So, using the CsvWriter how can you open an existing CSV file and add a row of new data?
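One approach, assuming the writer's builder accepts any java.io.Writer: open the existing file in append mode yourself and hand that Writer over. A stdlib-only sketch of the append step (the file contents are made up):

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class AppendDemo {
    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("data", ".csv");
        // simulate a CSV file that already exists on disk
        Files.write(file, "a,b,c\n".getBytes(StandardCharsets.UTF_8));

        // APPEND keeps the existing rows and positions the writer at the
        // end; this Writer could then be passed to a CSV writer builder.
        try (Writer w = Files.newBufferedWriter(file, StandardCharsets.UTF_8,
                StandardOpenOption.APPEND)) {
            w.write("d,e,f\n");
        }

        System.out.println(Files.readAllLines(file)); // [a,b,c, d,e,f]
    }
}
```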
I want to create a big CSV file using the CsvAppender. I'm using this code:
for (int i = 0; i < writeBuffer.size(); i++) {
String[] array = writeBuffer.get(i).toArray(new String[0]);
csvAppender.appendLine(array);
}
writeBuffer being a List<List<String>>. This buffer can have more than 500 lines.
When I finish the processing, the resulting file only has 148 lines and the last one is incomplete.
I have also tried flushing every 100 lines, but then the next lines are not written.
Maybe I am using the library in an incorrect way?
Thanks in advance.
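The symptoms (truncated output, last line incomplete) typically mean the appender was never closed, so the tail of its internal buffer was lost; closing it in a try-with-resources block flushes everything. A stdlib-only sketch of the effect, with BufferedWriter standing in for the appender:

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class CloseDemo {
    public static void main(String[] args) throws IOException {
        // Without close(), a short row sits in the buffer and never
        // reaches the disk.
        Path bad = Files.createTempFile("bad", ".csv");
        Writer leaky = Files.newBufferedWriter(bad, StandardCharsets.UTF_8);
        leaky.write("row1\n");
        System.out.println("unclosed: " + Files.size(bad)); // unclosed: 0

        // try-with-resources closes (and therefore flushes) the writer,
        // so every row ends up in the file.
        Path good = Files.createTempFile("good", ".csv");
        try (Writer w = Files.newBufferedWriter(good, StandardCharsets.UTF_8)) {
            w.write("row1\n");
        }
        System.out.println("closed: " + Files.size(good)); // closed: 5

        leaky.close(); // tidy up the deliberately-leaked demo writer
    }
}
```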
I was wondering whether it might be possible for NamedCsvReader and CsvReader to share a common interface, or perhaps for NamedCsvReader to extend CsvReader? (And similarly for NamedCsvRow and CsvRow.)
Here's the use case I have in my head...
I want to use FastCSV to process a user-specified CSV, which may or may not have headers (but the user will tell me whether it does or not). At the moment, I have to write essentially the same code twice because NamedCsvReader and CsvReader are completely separate classes, as are NamedCsvRow and CsvRow. What would make my code much neater and easier to manage would be something along the following lines...
ICsvReader csvReader;
if(hasHeaders){
csvReader = NamedCsvReader.builder().build(path, charset);
}else{
csvReader = CsvReader.builder().build(path, charset);
}
csvReader.stream().forEach(row -> {
//Do something here with each row, casting to NamedCsvRow where necessary and appropriate
});
Perhaps there's a reason why it hasn't been written this way (or perhaps I've missed an existing way of doing this), but perhaps something that could be considered in a future release?
Hello, I have a problem using CsvReader:
Csv file:
Field1 Field2 Field3 Field4 Field5
1Value1 1Value2 1Value3 1Value4 1Value5
2Value1 2Value2 2Value3 2Value4 2Value5
3Value1 3Value2 3Value3 3Value4 3Value5
Simple program:
CsvReader reader = new CsvReader();
reader.setTextDelimiter('\t');
reader.setContainsHeader(true);
CsvContainer csv = reader.read(Paths.get("test.csv"), StandardCharsets.UTF_8);
for (CsvRow row : csv.getRows()) {
System.out.println(row);
System.out.println(row.getField(0));
System.out.println(row.getField(1));
}
Output:
CsvRow{originalLineNumber=2, fields={Field1 Field2 Field3 Field4 Field5=1Value1 1Value2 1Value3 1Value4 1Value5}}
1Value1 1Value2 1Value3 1Value4 1Value5
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
at java.util.Arrays$ArrayList.get(Arrays.java:3841)
at de.siegmar.fastcsv.reader.CsvRow.getField(CsvRow.java:66)
at org.mycompany.fastcsvtest.Main.readTest(Main.java:92)
at org.mycompany.fastcsvtest.Main.main(Main.java:28)
Any thoughts?
Thanks.