Java read csv file with OpenCSV for efficient data parsing

Reading a CSV file in Java seems simple until you hit a comma inside a quoted field, a Windows CRLF line ending, or a 500 MB export from a legacy system. This guide shows the best ways to read a CSV file in Java correctly, from a quick BufferedReader solution to safer library-based parsing with OpenCSV and Apache Commons CSV.

Quick answer

  • Use BufferedReader for simple CSV files you fully control.
  • Use OpenCSV when you want quick setup and mapping rows directly to Java objects.
  • Use Apache Commons CSV when you need header-based access and more control over CSV formats.
  • Avoid split(",") for real-world CSV files with quoted values or multiline fields.
  • For large files, parse row by row instead of loading the whole file into memory.
At a glance:

  • Simple internal CSV: BufferedReader
  • Quoted values / real-world imports: OpenCSV
  • Headers and named-column access: Apache Commons CSV
  • Very large files: streaming, row-by-row parsing

What you’ll learn

  • Native Java I/O: Read simple CSV files with BufferedReader.
  • OpenCSV: Parse quoted fields, custom delimiters, and map rows directly to Java POJOs.
  • Apache Commons CSV: Handle headers, Excel-style exports, and custom CSV formats.
  • Large-file processing: Read lazily and avoid memory issues.
  • Production-safe parsing: Handle encoding, validation, malformed rows, and performance pitfalls.

Best way to read CSV file in Java

A CSV file looks simple until production data arrives. The main reason naive parsing fails is that CSV rules are not the same as plain string splitting. Quoted commas, embedded newlines, BOM prefixes, and non-comma delimiters all break fragile code.

Common failure cases and why naive parsing breaks:

  • Comma inside a field ("Smith, John",30): splits on the wrong comma
  • Embedded newline ("line1\nline2",x): breaks line-by-line assumptions
  • Escaped quote ("He said ""Hi"""): mismatched quote pairs
  • Semicolon delimiter (European Excel exports): wrong delimiter assumed
  • BOM prefix (UTF-8 files from Windows): the first field may contain invisible bytes
  • Mixed line endings (CRLF and LF in one file): can leave \r in values

For simple internal files, native Java can be enough. For anything exported from Excel, a database, or a third-party system, a CSV library is the safer choice.
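To make the first failure mode concrete, here is a minimal sketch (plain JDK, nothing assumed beyond a hard-coded sample line) of what split(",") does to a quoted field:

```java
public class NaiveSplitDemo {
    public static void main(String[] args) {
        // One CSV record with two fields; the first field contains a comma
        String line = "\"Smith, John\",30";

        // Naive splitting ignores the quotes entirely
        String[] parts = line.split(",", -1);

        System.out.println(parts.length); // 3 pieces instead of 2
        System.out.println(parts[0]);     // "Smith  (a broken half of the name)
    }
}
```

A CSV-aware parser would return exactly two fields here: Smith, John and 30.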

Read CSV file in Java with BufferedReader

If your CSV file is simple and predictable, BufferedReader is the fastest way to start. It has no external dependency and works well for internal files without quoted commas or multiline fields.

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class BasicCSVReader {
    public static List<String[]> readCSV(String filename) throws IOException {
        List<String[]> records = new ArrayList<>();

        try (BufferedReader reader = Files.newBufferedReader(
                Paths.get(filename), StandardCharsets.UTF_8)) {

            reader.readLine(); // skip header
            String line;

            while ((line = reader.readLine()) != null) {
                String[] fields = line.split(",", -1);
                records.add(fields);
            }
        }

        return records;
    }
}
  1. Use Files.newBufferedReader() with an explicit charset.
  2. Skip the header if your file has one.
  3. Use split(",", -1) to preserve empty trailing fields.
  4. Wrap file access in try-with-resources.
  5. Use this only for simple CSV content you control.
Limitations:
  • split(",") cannot correctly parse quoted commas.
  • Multiline fields will silently break your data.
  • Manual parsing is fine for quick scripts, not for messy production input.
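The multiline caveat is easy to reproduce. This sketch feeds an in-memory record (one quoted field containing a newline) to BufferedReader, which hands back half a record per readLine() call:

```java
import java.io.BufferedReader;
import java.io.StringReader;

public class MultilineBreakDemo {
    public static void main(String[] args) throws Exception {
        // A single CSV record whose quoted field spans two lines
        String csv = "\"line1\nline2\",x\n";

        try (BufferedReader reader = new BufferedReader(new StringReader(csv))) {
            System.out.println(reader.readLine()); // "line1    <- half a record
            System.out.println(reader.readLine()); // line2",x  <- the other half
        }
    }
}
```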

If your separator is not a comma, see Java split string by delimiter for regex-safe delimiter handling and common pitfalls.

Read CSV file in Java with OpenCSV

OpenCSV is a strong choice when you want quick setup, safer parsing, and direct mapping from CSV rows to Java objects. It is especially useful when your input contains quoted values, custom separators, or headers.

Maven:

<dependency>
    <groupId>com.opencsv</groupId>
    <artifactId>opencsv</artifactId>
    <version>LATEST_STABLE_VERSION</version>
</dependency>

Read row by row:

import com.opencsv.CSVReader;
import com.opencsv.CSVReaderBuilder;

import java.io.FileReader;
import java.nio.charset.StandardCharsets;

public class OpenCSVBasic {
    public static void main(String[] args) throws Exception {
        // FileReader(String, Charset) requires Java 11+; always name the charset explicitly
        try (CSVReader reader = new CSVReaderBuilder(
                new FileReader("users.csv", StandardCharsets.UTF_8))
                .withSkipLines(1)
                .build()) {

            String[] row;
            while ((row = reader.readNext()) != null) {
                System.out.println(row[0] + " | " + row[1]);
            }
        }
    }
}

Use a custom separator:

import com.opencsv.CSVParser;
import com.opencsv.CSVParserBuilder;
import com.opencsv.CSVReader;
import com.opencsv.CSVReaderBuilder;

import java.io.FileReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class OpenCSVCustomSeparator {
    public static void main(String[] args) throws Exception {
        CSVParser parser = new CSVParserBuilder()
                .withSeparator(';')
                .withQuoteChar('"')
                .build();

        // FileReader(String, Charset) requires Java 11+
        try (CSVReader reader = new CSVReaderBuilder(
                new FileReader("data.csv", StandardCharsets.UTF_8))
                .withCSVParser(parser)
                .withSkipLines(1)
                .build()) {

            String[] row;
            while ((row = reader.readNext()) != null) {
                System.out.println(Arrays.toString(row));
            }
        }
    }
}
  • Best for quick implementation and POJO mapping
  • Handles quoted values and multiline fields correctly
  • Good fit for Spring Boot import flows
  • Safer than manual parsing for real-world CSV input

Read CSV file in Java with Apache Commons CSV

Apache Commons CSV is a great option when you want clean header-based access, format presets, and predictable parsing behavior. It is especially useful for ETL code, batch imports, and files that come from Excel or external systems.

Maven:

<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-csv</artifactId>
    <version>LATEST_STABLE_VERSION</version>
</dependency>

Read CSV with headers:

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Path;

public class CommonsCsvExample {
    public static void main(String[] args) throws Exception {
        try (Reader reader = Files.newBufferedReader(Path.of("users.csv"));
             CSVParser parser = CSVFormat.DEFAULT
                     .builder()
                     .setHeader()
                     .setSkipHeaderRecord(true)
                     .setIgnoreHeaderCase(true)
                     .setTrim(true)
                     .build()
                     .parse(reader)) {

            for (CSVRecord record : parser) {
                System.out.println(record.get("name") + " - " + record.get("email"));
            }
        }
    }
}

Use a custom format:

import org.apache.commons.csv.CSVFormat;

CSVFormat format = CSVFormat.DEFAULT
        .builder()
        .setDelimiter('|')
        .setQuote('"')
        .setHeader()
        .setSkipHeaderRecord(true)
        .build();
  • Excellent for header-based parsing
  • Cleaner than index-based access when column order may change
  • Good choice for production parsing and data pipelines
  • Supports predefined and custom CSV formats

Which Java CSV library should you use?

  • BufferedReader: when you control the file format and it is simple. Main strength: no dependency.
  • OpenCSV: when you want quick setup and bean mapping. Main strength: easy to implement.
  • Apache Commons CSV: when you want named headers and format control. Main strength: readable production code.

Read CSV header and map rows to Java objects

If you want cleaner code, map CSV rows to Java objects instead of working with raw string arrays. This makes your import code easier to validate, test, and pass into service classes.

import com.opencsv.bean.CsvBindByName;
import com.opencsv.bean.CsvToBeanBuilder;

import java.io.FileReader;
import java.util.List;

public class Product {
    @CsvBindByName(column = "name")
    private String name;

    @CsvBindByName(column = "price")
    private double price;

    @CsvBindByName(column = "category")
    private String category;

    public String getName() {
        return name;
    }

    public double getPrice() {
        return price;
    }

    public String getCategory() {
        return category;
    }
}

public class BeanReader {
    public static List<Product> read(String filename) throws Exception {
        return new CsvToBeanBuilder<Product>(new FileReader(filename))
                .withType(Product.class)
                .withIgnoreLeadingWhiteSpace(true)
                .build()
                .parse();
    }
}
  • Use @CsvBindByName when your CSV has headers
  • Use @CsvBindByPosition when it does not
  • Object mapping keeps your import layer cleaner than raw arrays
  • This pattern works especially well in Spring Boot services
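If you want the same row-to-object shape without adding a dependency, positional mapping can also be hand-rolled. A minimal sketch, using a hypothetical Product record that mirrors the three-column bean above:

```java
public class ManualMapping {
    // Hypothetical record matching the name/price/category layout used above
    record Product(String name, double price, String category) {}

    // Map one parsed row to an object by position
    static Product fromFields(String[] fields) {
        return new Product(fields[0], Double.parseDouble(fields[1]), fields[2]);
    }

    public static void main(String[] args) {
        Product p = fromFields(new String[] {"Widget", "9.99", "tools"});
        System.out.println(p.name() + " costs " + p.price());
    }
}
```

This keeps conversion logic in one place, but unlike OpenCSV's header-based binding it silently breaks if column order changes.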

Once you start mapping parsed data into Java classes, DTO design becomes important. See What is DTO in Spring Boot for a cleaner way to structure import and API layers.

Read CSV with different separators and encodings

One of the most common production issues is a file that looks correct in a text editor but breaks in Java. In most cases, the root cause is either the wrong delimiter or the wrong character encoding.

Common encodings and how to request them in Java:

  • UTF-8 (modern systems): StandardCharsets.UTF_8
  • ISO-8859-1 (legacy European data): StandardCharsets.ISO_8859_1
  • Windows-1252 (older Windows exports): Charset.forName("windows-1252")
  • UTF-16 (some spreadsheet tools): StandardCharsets.UTF_16

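The charset choice matters because the same bytes decode to different text under different charsets. A quick self-contained demonstration:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetDemo {
    public static void main(String[] args) {
        // The word "café" as Windows-1252 bytes (é is the single byte 0xE9)
        byte[] bytes = {99, 97, 102, (byte) 0xE9};

        System.out.println(new String(bytes, Charset.forName("windows-1252"))); // café
        System.out.println(new String(bytes, StandardCharsets.UTF_8)); // caf� (0xE9 alone is invalid UTF-8)
    }
}
```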
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class EncodingExample {
    public static void main(String[] args) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(
                Path.of("data.csv"), StandardCharsets.UTF_8)) {

            String line;
            while ((line = reader.readLine()) != null) {
                String[] fields = line.split(";", -1);
                System.out.println(Arrays.toString(fields));
            }
        }
    }
}
  • Always specify the charset explicitly
  • Use UTF-8 for new projects whenever possible
  • Confirm the delimiter before writing parsing logic
  • Switch to a library when delimiters may appear inside quoted fields
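BOM handling deserves a special note: Windows tools often prepend a UTF-8 BOM (\uFEFF), which otherwise ends up glued to your first header name. A minimal sketch of stripping it manually (libraries such as commons-io also provide BOMInputStream for this):

```java
import java.io.BufferedReader;
import java.io.StringReader;

public class BomStripDemo {
    // Remove a leading UTF-8 BOM if one is present
    static String stripBom(String s) {
        return s.startsWith("\uFEFF") ? s.substring(1) : s;
    }

    public static void main(String[] args) throws Exception {
        // Simulates a file whose first line carries a BOM
        String content = "\uFEFFname,age\nAda,36\n";

        try (BufferedReader reader = new BufferedReader(new StringReader(content))) {
            String header = stripBom(reader.readLine());
            System.out.println(header); // name,age
        }
    }
}
```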

When CSV data uses pipes, tabs, or semicolons, delimiter handling becomes a parsing issue, not just a string issue. See Java split string by delimiter for regex-safe splitting patterns.

How to read large CSV files in Java

For large CSV files, the main rule is simple: process rows as a stream instead of loading everything into memory. In practice, that means using BufferedReader, OpenCSV’s readNext(), or Apache Commons CSV iteration.

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Path;

public class LargeCsvExample {
    public static void main(String[] args) throws Exception {
        try (Reader reader = Files.newBufferedReader(Path.of("large-file.csv"));
             CSVParser parser = CSVFormat.DEFAULT
                     .builder()
                     .setHeader()
                     .setSkipHeaderRecord(true)
                     .build()
                     .parse(reader)) {

            for (CSVRecord record : parser) {
                // Process one row at a time
                System.out.println(record.get(0));
            }
        }
    }
}
  • Prefer row-by-row iteration over readAll() for large files
  • Keep processing logic close to the read loop
  • Use parallel processing only after profiling
  • Avoid storing the full file in memory unless you truly need it
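With the JDK alone, the same streaming idea can be expressed with Files.lines(), which reads lazily. A self-contained sketch (it writes a small temp file first so it can run anywhere):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class StreamingExample {
    public static void main(String[] args) throws IOException {
        // Create a small sample file so the example is runnable as-is
        Path file = Files.createTempFile("rows", ".csv");
        Files.writeString(file, "id,name\n1,Ada\n2,Linus\n", StandardCharsets.UTF_8);

        // Files.lines() streams lazily: lines are read on demand, not all at once
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            long dataRows = lines.skip(1).count(); // skip the header row
            System.out.println(dataRows);          // 2
        }

        Files.delete(file);
    }
}
```

Note that Files.lines() splits on line terminators, so it shares BufferedReader's multiline-field limitation; for quoted multiline fields, stick with the library iterators above.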

Common CSV reading errors in Java

  • Using split(",") on real-world CSV — this fails on quoted commas
  • Loading the whole file into memory — risky for large imports
  • Skipping validation — malformed rows then move into business logic
  • Ignoring encoding — this breaks non-English text
  • Using hard-coded column indexes everywhere — fragile when column order changes

Production-safe CSV processing means reading row by row, validating field counts, logging malformed records, and mapping valid rows into structured Java objects as early as possible.

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class ValidatingCSVReader {

    public record ParseResult(List<String[]> valid, List<String> errors) {}

    public static ParseResult parse(String filename) throws IOException {
        List<String[]> valid = new ArrayList<>();
        List<String> errors = new ArrayList<>();
        int lineNum = 1; // the header occupies line 1, so data rows start at line 2

        try (BufferedReader reader = Files.newBufferedReader(Paths.get(filename))) {
            reader.readLine(); // skip header
            String raw;

            while ((raw = reader.readLine()) != null) {
                lineNum++;
                String[] fields = raw.split(",", -1);

                if (fields.length != 3) {
                    errors.add("Line " + lineNum + ": expected 3 fields, got " + fields.length);
                    continue;
                }

                valid.add(fields);
            }
        }

        return new ParseResult(valid, errors);
    }
}
  1. Track line numbers from the start
  2. Validate field count before indexing into arrays
  3. Log skipped rows instead of silently dropping them
  4. Separate valid records from parsing errors
  5. Keep your import layer observable and debuggable

Building real Java backends? CSV parsing is usually just the first step. In production systems, it quickly connects to DTO mapping, validation, Spring Boot endpoints, response design, and import pipelines. That is why strong Java developers do not just parse files; they design safe data flows end to end.

Frequently Asked Questions

How do I read a CSV file in Java without a library?

Use BufferedReader with Files.newBufferedReader() to read the file line by line, then split each line with line.split(",", -1). This works well for simple CSV files but does not correctly handle quoted commas or multiline fields.

Should I use OpenCSV or Apache Commons CSV?

OpenCSV is a strong option when you want fast setup and bean mapping. Apache Commons CSV is excellent when you need headers, readable named-column access, and more control over parsing behavior.

Can I access CSV columns by header name?

Yes. Apache Commons CSV supports header-based access such as record.get("name"), and OpenCSV can map header names directly to Java fields with bean annotations.

How do I read a large CSV file without running out of memory?

Read the file row by row instead of loading it fully into memory. Use BufferedReader, OpenCSV’s readNext(), or Apache Commons CSV iteration for memory-safe processing.

Why does split(",") break on real-world CSV files?

Because CSV fields can contain commas inside quotes. A plain string split does not understand CSV rules, so valid rows get broken into the wrong number of columns.

How do I map CSV rows to Java objects?

Use OpenCSV’s bean mapping. Annotate fields with @CsvBindByName or @CsvBindByPosition, then parse with CsvToBeanBuilder to get typed Java objects instead of raw arrays.