FastCSV

FastCSV is a lightning-fast, dependency-free CSV library for Java that conforms to RFC 4180.


The primary use cases of FastCSV include:

  • In big data applications: efficiently reading and writing data on a massive scale.
  • In small data applications: serving as a lightweight library without additional dependencies.

Benchmark & Compatibility

Note

This selected benchmark is based on the Java CSV library benchmark suite.

FastCSV combines high performance with strict RFC 4180 compliance when writing, and it can still read somewhat garbled CSV data. See JavaCsvComparison for details.

Features

As one of the most popular CSV libraries for Java on GitHub, FastCSV comes with a wide range of features:

  • Crafted with natural intelligence, ❤️, and AI assistance ✨
  • Enables ultra-fast reading and writing of CSV data
  • Has zero runtime dependencies
  • Maintains a small footprint
  • Provides a null-free and developer-friendly API
  • Compatible with GraalVM Native Image
  • Delivered as an OSGi-compliant bundle
  • Actively developed and maintained
  • Well-tested and documented

CSV specific

  • Compliant with RFC 4180 – including:
    • Newline and field separator characters in fields
    • Quote escaping
  • Configurable field separator
  • Supports line endings CRLF (Windows), LF (Unix) and CR (classic Mac OS)
  • Supports Unicode characters
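
To make the RFC 4180 rules above concrete, here is a minimal plain-Java sketch of field quoting: a field is wrapped in double quotes only if it contains the separator, a quote, or a line break, and embedded quotes are doubled. This is an illustration of the rule, not FastCSV's implementation; the class and method names (`Rfc4180Quoting`, `quoteIfNeeded`, `toRecord`) are made up for this example.

```java
// Illustrative only: hypothetical helper, not part of the FastCSV API.
final class Rfc4180Quoting {

    private Rfc4180Quoting() {
    }

    /** Quotes a field only when it contains a separator, quote, CR or LF. */
    static String quoteIfNeeded(String field, char separator) {
        boolean needsQuotes = field.indexOf(separator) >= 0
            || field.indexOf('"') >= 0
            || field.indexOf('\r') >= 0
            || field.indexOf('\n') >= 0;
        if (!needsQuotes) {
            return field;
        }
        // Escape embedded quotes by doubling them, then wrap the field in quotes.
        return '"' + field.replace("\"", "\"\"") + '"';
    }

    /** Joins the fields into one record terminated by CRLF. */
    static String toRecord(char separator, String... fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(quoteIfNeeded(fields[i], separator));
        }
        return sb.append("\r\n").toString();
    }

    public static void main(String[] args) {
        System.out.print(toRecord(',', "plain", "a,b", "say \"hi\"", "line1\nline2"));
    }
}
```

With a configurable separator, the same check applies to whatever character is chosen, which is why `quoteIfNeeded` takes the separator as a parameter.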

Reader specific

  • Supports reading of some non-compliant (real-world) data
  • Preserves line break character(s) within fields
  • Preserves the starting line number (even with skipped and multi-line records) – helpful for error messages
  • Auto-detection of line delimiters (CRLF, LF, or CR – can also be mixed)
  • Configurable data validation
  • Supports (optional) header records (get field based on field name)
  • Supports skipping empty lines
  • Supports commented lines (skipping & reading) with configurable comment character
  • Configurable field modifiers (e.g., to trim fields)
  • Flexible callback handlers (e.g., to directly map to domain objects)
  • BOM support (UTF-8, UTF-16 LE/BE, UTF-32 LE/BE)
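
As a rough illustration of what BOM support involves, the following plain-Java sketch inspects the first bytes of some input and reports which of the encodings listed above the byte order mark indicates. It is an assumption-laden sketch, not FastCSV's code: the class `BomSniffer` and its `detect` method are hypothetical names, and the result is returned as a charset name rather than a `Charset`.

```java
import java.util.Optional;

// Illustrative only: hypothetical helper, not part of the FastCSV API.
final class BomSniffer {

    private BomSniffer() {
    }

    /** Returns the charset name indicated by a leading BOM, if any. */
    static Optional<String> detect(byte[] head) {
        // The 4-byte UTF-32 marks are checked first because the UTF-32 LE
        // mark begins with the same two bytes as the UTF-16 LE mark.
        if (startsWith(head, 0x00, 0x00, 0xFE, 0xFF)) {
            return Optional.of("UTF-32BE");
        }
        if (startsWith(head, 0xFF, 0xFE, 0x00, 0x00)) {
            return Optional.of("UTF-32LE");
        }
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) {
            return Optional.of("UTF-8");
        }
        if (startsWith(head, 0xFE, 0xFF)) {
            return Optional.of("UTF-16BE");
        }
        if (startsWith(head, 0xFF, 0xFE)) {
            return Optional.of("UTF-16LE");
        }
        return Optional.empty();
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            // Mask to compare bytes as unsigned values.
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] utf8 = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 0x61};
        System.out.println(detect(utf8).orElse("no BOM"));
    }
}
```

Note the inherent ambiguity in this simple approach: UTF-16 LE content that starts with a NUL character right after the mark is indistinguishable from a UTF-32 LE mark.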

Writer specific

  • Supports flexible quoting strategies (e.g., to differentiate between empty and null)
  • Supports writing comments
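
The idea of differentiating between empty and null through quoting can be sketched in plain Java as follows: a null field is written as nothing at all, while an empty string is written as an explicit pair of quotes. This only illustrates the concept under that assumed convention; `EmptyVsNullQuoting` and its methods are hypothetical names and do not reflect how FastCSV's quote strategies are implemented.

```java
// Illustrative only: hypothetical helper, not part of the FastCSV API.
final class EmptyVsNullQuoting {

    private EmptyVsNullQuoting() {
    }

    /** Encodes one field so that null and "" remain distinguishable. */
    static String encode(String field) {
        if (field == null) {
            return ""; // null: nothing between the separators
        }
        boolean needsQuotes = field.isEmpty() // empty string: explicit ""
            || field.indexOf(',') >= 0
            || field.indexOf('"') >= 0
            || field.indexOf('\r') >= 0
            || field.indexOf('\n') >= 0;
        return needsQuotes ? '"' + field.replace("\"", "\"\"") + '"' : field;
    }

    /** Joins the encoded fields with commas (no line delimiter appended). */
    static String toLine(String... fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(encode(fields[i]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toLine("a", null, "", "b")); // prints a,,"",b
    }
}
```

A reader that follows the same convention can then map an unquoted empty field back to null and a quoted empty field back to an empty string.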

Requirements

  • for 3.x version: Java ⩾ 11 (Android 13 / API level 33)
  • for 2.x version: Java ⩾ 8 (Android 8 / API level 26)

Note

Android is not Java and is not officially supported. Nevertheless, some basic checks are included in the continuous integration pipeline to verify that the library should work with Android.

CsvReader examples

Iterative reading of some CSV data from a string

CsvReader.builder().ofCsvRecord("foo1,bar1\nfoo2,bar2")
    .forEach(System.out::println);

Iterative reading of a CSV file

try (CsvReader<CsvRecord> csv = CsvReader.builder().ofCsvRecord(file)) {
    csv.forEach(System.out::println);
}

Iterative reading of some CSV data with a header

CsvReader.builder().ofNamedCsvRecord("header 1,header 2\nfield 1,field 2")
    .forEach(rec -> System.out.println(rec.getField("header 2")));

Iterative reading of some CSV data with a custom header

CsvCallbackHandler<NamedCsvRecord> callbackHandler =
    new NamedCsvRecordHandler("header1", "header2");

CsvReader.builder().build(callbackHandler, "field 1,field 2")
    .forEach(rec -> System.out.println(rec.getField("header2")));

Custom settings

CsvReader.builder()
    .fieldSeparator(';')
    .quoteCharacter('"')
    .commentStrategy(CommentStrategy.SKIP)
    .commentCharacter('#')
    .skipEmptyLines(true)
    .ignoreDifferentFieldCount(false)
    .acceptCharsAfterQuotes(false)
    .detectBomHeader(false);

IndexedCsvReader examples

Indexed reading of a CSV file

try (IndexedCsvReader<CsvRecord> csv = IndexedCsvReader.builder().ofCsvRecord(file)) {
    CsvIndex index = csv.getIndex();

    System.out.println("Items of last page:");
    int lastPage = index.getPageCount() - 1;
    List<CsvRecord> csvRecords = csv.readPage(lastPage);
    csvRecords.forEach(System.out::println);
}

CsvWriter examples

Iterative writing of some data to a writer

var sw = new StringWriter();
CsvWriter.builder().build(sw)
    .writeRecord("header1", "header2")
    .writeRecord("value1", "value2");

System.out.println(sw);

Iterative writing of a CSV file

try (CsvWriter csv = CsvWriter.builder().build(file)) {
    csv
        .writeRecord("header1", "header2")
        .writeRecord("value1", "value2");
}

Custom settings

CsvWriter.builder()
    .fieldSeparator(',')
    .quoteCharacter('"')
    .quoteStrategy(QuoteStrategies.ALWAYS)
    .commentCharacter('#')
    .lineDelimiter(LineDelimiter.LF);

Sponsoring and partnerships

YourKit was used to optimize the performance and footprint of FastCSV. YourKit is the creator of YourKit Java Profiler, YourKit .NET Profiler, and YourKit YouMonitor.
