In order to do some statistical analysis I need to extract values in a column of an Excel sheet. I have been using the Apache POI package to read from Excel files, and it wo
Just wanted to add: if your file has a header row and you are not sure of the column index, but want to pick columns by their header (column name), you can try something like this:
// assumes datatypeSheet is an already opened Sheet
int column1Index = -1; // stays -1 until the header is found
List<Double> column1Data = new ArrayList<Double>();
for(Row r : datatypeSheet)
{
    // table header row
    if(r.getRowNum() == 0)
    {
        // getting specific column's index
        Iterator<Cell> headerIterator = r.cellIterator();
        while(headerIterator.hasNext())
        {
            Cell header = headerIterator.next();
            // "column1" is a placeholder for the header text you are looking for
            if(header.getStringCellValue().equalsIgnoreCase("column1"))
            {
                column1Index = header.getColumnIndex();
            }
        }
    }
    else if(column1Index >= 0)
    {
        Cell column1Cell = r.getCell(column1Index);
        if(column1Cell != null)
        {
            if(column1Cell.getCellType() == Cell.CELL_TYPE_NUMERIC)
            {
                // adding to a list
                column1Data.add(column1Cell.getNumericCellValue());
            }
            else if(column1Cell.getCellType() == Cell.CELL_TYPE_FORMULA && column1Cell.getCachedFormulaResultType() == Cell.CELL_TYPE_NUMERIC)
            {
                // formula whose cached result is numeric
                column1Data.add(column1Cell.getNumericCellValue());
            }
        }
    }
}
Excel files are row based rather than column based, so the only way to get all the values in a column is to look at each row in turn. There's no quicker way to get at the columns, because cells in a column aren't stored together.
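To make the row-based layout concrete, here is a toy model in plain Java. It is not POI itself: each row is a sparse map from column index to value, standing in for a sheet's rows and cells. The class name `RowStoreSketch` and its method are made up for illustration, and it assumes Java 9 or later for `List.of` and `Map.of`. Reading one column means visiting every row and asking it for that cell, skipping rows where the cell is missing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class RowStoreSketch {

    // Collects the numeric values found at columnIndex, one row at a time.
    // Rows with no cell at that index, or with a non-numeric cell, are skipped.
    static List<Double> readColumn(List<Map<Integer, Object>> rows, int columnIndex) {
        List<Double> values = new ArrayList<>();
        for (Map<Integer, Object> row : rows) {
            Object cell = row.get(columnIndex); // null when the row has no such cell
            if (cell instanceof Number) {
                values.add(((Number) cell).doubleValue());
            }
        }
        return values;
    }

    public static void main(String[] args) {
        List<Map<Integer, Object>> rows = List.of(
                Map.of(0, "name", 1, "score"), // header row: text only
                Map.of(0, "a", 1, 1.5),
                Map.of(0, "b"),                // no cell in column 1
                Map.of(0, "c", 1, 4));
        System.out.println(readColumn(rows, 1)); // prints [1.5, 4.0]
    }
}
```

The cost is one lookup per row, which is the same shape of work the POI loop does.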
Your code probably wants to be something like:
// Newer POI releases use the CellType enum (CellType.NUMERIC, CellType.FORMULA)
// in place of these int constants.
List<Double> values = new ArrayList<Double>();
for(Row r : sheet) {
    Cell c = r.getCell(columnNumber);
    if(c != null) {
        if(c.getCellType() == Cell.CELL_TYPE_NUMERIC) {
            values.add(c.getNumericCellValue());
        } else if(c.getCellType() == Cell.CELL_TYPE_FORMULA && c.getCachedFormulaResultType() == Cell.CELL_TYPE_NUMERIC) {
            // formula cell whose cached result is numeric
            values.add(c.getNumericCellValue());
        }
    }
}
That'll then give you all the numeric cell values in that column.
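Since the question mentions statistical analysis, a possible next step is to summarise that list. The following is only a sketch in plain Java, with no POI involved. The class name `ColumnStats` and its methods are hypothetical, and it computes the sample standard deviation (dividing by n - 1), which may not be the variant you need.

```java
import java.util.List;

public class ColumnStats {

    // Arithmetic mean; returns NaN for an empty list.
    static double mean(List<Double> values) {
        return values.stream()
                .mapToDouble(Double::doubleValue)
                .average()
                .orElse(Double.NaN);
    }

    // Sample standard deviation (n - 1 in the denominator).
    // Returns NaN when there are fewer than two values.
    static double sampleStdDev(List<Double> values) {
        if (values.size() < 2) {
            return Double.NaN;
        }
        double m = mean(values);
        double sumOfSquares = 0.0;
        for (double v : values) {
            sumOfSquares += (v - m) * (v - m);
        }
        return Math.sqrt(sumOfSquares / (values.size() - 1));
    }

    public static void main(String[] args) {
        // Stand-in for the list filled from the column.
        List<Double> values = List.of(2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0);
        System.out.println("mean = " + mean(values));
        System.out.println("sample std dev = " + sampleStdDev(values));
    }
}
```

For anything beyond simple summaries, a dedicated statistics library is probably a better fit than hand-rolled formulas.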