java.lang.OutOfMemoryError: GC overhead limit exceeded excel reader
I am getting a java.lang.OutOfMemoryError: GC overhead limit exceeded exception when I try to run the program below. This program's main method access' a specified directory and iterates over all the files that contain .xlsx. This works fine as I tested it before any of the other logic. And the method it is calling xlsx which basically converts the xlsx file into csv and appends it to an existing file works fine as well. But when I put that in the for loop, this is when I get this exception. I am guessing it there is a conflict when after it has opened the xlsx and converted it the csv and its time to open the second maybe I have to somehow close this line:
File inputFile = new File("C:Usersedennis.ADDesktoptest"+nameOfFile);
Thats my only guess right now, that it when this file is interfering when the second iteration of the loop comes. I am using the Apache POI libraries to manipulate the excel files. Thanks in Advance!
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
public class ExcelMan {
public static void main(String[] args) throws FileNotFoundException {
int i =0;
File dir = new File("C:Usersedennis.ADDesktoptest");
for (File child : dir.listFiles()) {
//initializing whether the sheet sent to method is first or not, and //counting iterations for each time the for loop as run
boolean firstSheet = true;
i++;
String nameOfFile = child.getName();
if (nameOfFile.contains(".xlsx")){
System.out.println(nameOfFile);
if (i != 0)
firstSheet = false;
File inputFile = new File("C:Usersedennis.ADDesktoptest"+nameOfFile);
// writing excel data to csv
File outputFile = new File("C:Usersedennis.ADDesktoptestmemb.csv");
xlsx(inputFile, outputFile, firstSheet);
}
// }
}
}
static void xlsx(File inputFile, File outputFile, boolean firstSheet) {
// For storing data into CSV files
StringBuffer data = new StringBuffer();
try {
FileOutputStream fos = new FileOutputStream(outputFile, true);
// Get the workbook object for XLSX file
XSSFWorkbook wBook = new XSSFWorkbook(new FileInputStream(inputFile));
// Get first sheet from the workbook
XSSFSheet sheet = wBook.getSheetAt(7);
Row row;
Cell cell;
// Iterate through each rows from first sheet
java.util.Iterator<Row> rowIterator = sheet.iterator();
while (rowIterator.hasNext()) {
if (firstSheet != true)
rowIterator.next();
row = rowIterator.next();
// For each row, iterate through each columns
java.util.Iterator<Cell> cellIterator = row.cellIterator();
while (cellIterator.hasNext()) {
cell = cellIterator.next();
switch (cell.getCellType()) {
case Cell.CELL_TYPE_BOOLEAN:
data.append(cell.getBooleanCellValue() + "^");
break;
case Cell.CELL_TYPE_NUMERIC:
data.append(cell.getNumericCellValue() + "^");
break;
case Cell.CELL_TYPE_STRING:
data.append(cell.getStringCellValue() + "^");
break;
case Cell.CELL_TYPE_BLANK:
data.append("" + "^");
break;
default:
data.append(cell + "^");
}
}
data.append("rn");
}
fos.write(data.toString().getBytes());
fos.close();
} catch (Exception ioe) {
ioe.printStackTrace();
}
}
}
Additional Info:
Below is the stacktrace
MR.xlsx
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.attr(Cur.java:3039)
at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.attr(Cur.java:3060)
at org.apache.xmlbeans.impl.store.Locale$SaxHandler.startElement(Locale.java:3250)
at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.reportStartTag(Piccolo.java:1082)
at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.parseAttributesNS(PiccoloLexer.java:1802)
at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.parseOpenTagNS(PiccoloLexer.java:1521)
at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.parseTagNS(PiccoloLexer.java:1362)
at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.parseXMLNS(PiccoloLexer.java:1293)
at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.parseXML(PiccoloLexer.java:1261)
at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.yylex(PiccoloLexer.java:4808)
at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.yylex(Piccolo.java:1290)
at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.yyparse(Piccolo.java:1400)
at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.parse(Piccolo.java:714)
at org.apache.xmlbeans.impl.store.Locale$SaxLoader.load(Locale.java:3439)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1270)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1257)
at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:345)
at org.openxmlformats.schemas.spreadsheetml.x2006.main.WorksheetDocument$Factory.parse(Unknown Source)
at org.apache.poi.xssf.usermodel.XSSFSheet.read(XSSFSheet.java:138)
at org.apache.poi.xssf.usermodel.XSSFSheet.onDocumentRead(XSSFSheet.java:130)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.onDocumentRead(XSSFWorkbook.java:286)
at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:159)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:207)
at ExcelMan.xlsx(ExcelMan.java:71)
at ExcelMan.main(ExcelMan.java:47)
The excel files are pretty big, there is going to be around 30 or so in the directory and the biggest one is about 170 MB, with these file sizes should I change from POI ?
Whats the size of your excel file? I had a similar problem once, creating csv
out of xls
. In my case i had to switch to the event driven model, take a look at XSSF and SAX (Event API). I too ran out of memory (with -Xmx8g
)
A quote from the linked site:
Further effort on HSSF is going to focus on the following major areas:
Files do not need to be closed. As long as you aren't maintaining references to them they will be GCd as they fall out of scope.
The line if (i != 0)
will always evaluate to true since you are incrementing the variable i at least once before hitting this conditional. Thus firstSheet is always set to false.
The line
File inputFile = new File("C:Usersedennis.ADDesktoptest"+nameOfFile);
is creating new files. However, you already have a file object for this path represented by child
You are always writing to the same file, which you recreate a file object and new FileOutputStream for every time you iterate over the initial directories children even though all the writes are to the same file.
You are not closing your FileOutputStream in a finally block and it may not be properly closing your FileOutputStream under error conditions.
Use StringBuilder instead of StringBuffer unless you need synchronized methods for building the string.
Consider using a FileWriter instead of an intermediary StringBuilder. Instead of writing to a Builder use
PrintWriter writer = new PrintWriter(new BufferedWriter(new FileWriter(outputFile, true))))
Instead of doing data.append
use writer.print
or writer.println
Note: PrintWriter and Buffered Writer wrappers aren't strictly necessary, but useful.
If you refer to the XSSFWorkbook javadocs for the constructors options you will see it says "Using an InputStream requires more memory than using a File, so if a File is available then you should instead do something like 'example follows'" http://poi.apache.org/apidocs/org/apache/poi/xssf/usermodel/XSSFWorkbook.html#XSSFWorkbook(java.io.InputStream)
Increasing your heap size will likely be a workable solution if all else fails. Assuming you don't have the potential for significantly larger files than what you are currently testing with. Increase heap size in Java
链接地址: http://www.djcxy.com/p/75024.html