Read Huge Excel File (500K Rows) in Java

Read very large Excel file with date and non-date numbers

With the latest version of Excel Streaming Reader, which is 2.1.0, this problem is gone.

Using your test-file.xlsx and the following code:

import java.io.InputStream;

import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Cell;

import com.monitorjbl.xlsx.StreamingReader;

public class PoiStreamingExample {

    // Resolved relative to this class's package on the classpath.
    private static final String FILE_NAME = "./test-file.xlsx";

    public static void main(String[] args) {
        try (InputStream is = PoiStreamingExample.class.getResourceAsStream(FILE_NAME);
             Workbook workbook = StreamingReader.builder()
                     .rowCacheSize(100)  // number of rows to keep in memory
                     .bufferSize(4096)   // buffer size used when reading the InputStream
                     .open(is)) {
            Sheet sheet = workbook.getSheetAt(0);
            for (Row r : sheet) {
                // Build one comma-separated line per row.
                StringBuilder rowString = new StringBuilder();
                for (Cell c : r) {
                    // Check the length instead of comparing strings with !=,
                    // which compares references rather than contents.
                    if (rowString.length() > 0) {
                        rowString.append(",");
                    }
                    rowString.append(c.getStringCellValue());
                }
                System.out.println(rowString);
            }
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}

It prints:

Number,Date (mostly),Date (mostly)
123456,3/26/19
123456,8/8/19,7/7/20
123456,2/26/19,3/11/19
123456,2/7/19,3/14/19
123456,3/11/19,4/9/19
123456,3/12/19,4/19/19
7890123,3/29/19,8/23/19
7890123,7/29/20
7890123,2/25/19,3/26/19
7890123,4/3/19,4/25/19
7890123,4/12/19,5/14/19
7890123,6/17/19,6/25/19
7890123,4/18/19,5/30/19
7890123,4/22/19,5/21/19
7890123,9/11/19,10/16/19
7890123,6/18/19,6/25/19
123,43550
smith,43550
jones,43550
43550,43550
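
The last four rows print 43550 where the earlier rows print dates. If those cells hold a date stored without a date format (an assumption; the answer above does not say so), the number is an Excel serial date. Under Excel's default 1900 date system, a serial above 60 can be converted by adding it as days to 1899-12-30. A minimal sketch using only java.time; the class and method names are hypothetical:

```java
import java.time.LocalDate;

final class ExcelSerialDate {

    // Day zero for the 1900 date system. Only valid for serials above 60,
    // because Excel counts a nonexistent 1900-02-29 as serial 60.
    private static final LocalDate EPOCH = LocalDate.of(1899, 12, 30);

    // Converts a whole-number Excel serial to a calendar date.
    static LocalDate toLocalDate(long serial) {
        return EPOCH.plusDays(serial);
    }

    public static void main(String[] args) {
        System.out.println(toLocalDate(43550)); // prints 2019-03-26
    }
}
```

43550 maps to 2019-03-26, the same day as the 3/26/19 shown in the first data row. When working with POI cells directly, DateUtil.getJavaDate performs this kind of conversion.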

Writing a large resultset to an Excel file using POI

Oh. I think you're writing the workbook out 944,000 times. Your wb.write(bos) call is inside the inner loop, and I don't think that matches the semantics of the Workbook class. From what I can tell in the Javadocs of that class, that method writes the entire workbook to the specified output stream. So as the workbook grows, each call writes out every row you've added so far, once for every row. Move the wb.write(bos) call to after the loop so the workbook is written only once.

This also explains why you're seeing exactly one row: only the first workbook written to the file (the one with a single row) is displayed, and the 7GB that follow it are junk.
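
To see why the file balloons, here is a small model of the two loop shapes. It does not use POI: writeAll is a hypothetical stand-in for a method that, like Workbook.write as described above, emits every row added so far. Calling it inside the loop emits 1 + 2 + ... + n rows in total, while calling it once after the loop emits n.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class WriteOnceDemo {

    // Hypothetical stand-in for a whole-workbook write: emits every row held so far.
    static void writeAll(List<String> rows, ByteArrayOutputStream out) {
        for (String row : rows) {
            byte[] bytes = (row + "\n").getBytes(StandardCharsets.UTF_8);
            out.write(bytes, 0, bytes.length);
        }
    }

    // The pattern from the question: the write call sits inside the loop.
    static long linesWrittenInsideLoop(int n) {
        List<String> rows = new ArrayList<>();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < n; i++) {
            rows.add("row" + i);
            writeAll(rows, out); // rewrites all rows added so far
        }
        return countLines(out);
    }

    // The fix: add all rows first, then write once.
    static long linesWrittenAfterLoop(int n) {
        List<String> rows = new ArrayList<>();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < n; i++) {
            rows.add("row" + i);
        }
        writeAll(rows, out);
        return countLines(out);
    }

    // Counts newline-terminated rows in the output.
    static long countLines(ByteArrayOutputStream out) {
        long lines = 0;
        for (byte b : out.toByteArray()) {
            if (b == '\n') {
                lines++;
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(linesWrittenInsideLoop(1000)); // prints 500500
        System.out.println(linesWrittenAfterLoop(1000));  // prints 1000
    }
}
```

By the same formula, n(n + 1) / 2, writing inside the loop for 944,000 rows emits 445,568,472,000 rows in total.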

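Which is the best API to read large Excel files in Java?

Try Apache POI. It reads .xlsx files through its XSSF API, which has been available since POI 3.5. For very large files, loading the whole workbook into memory can be too expensive, so consider a streaming approach instead.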


