问题
Please find a Code snippet below
public class DataDriven_GetDataExcel {
public static void main(String[] args) throws IOException, EncryptedDocumentException, InvalidFormatException {
//1 Getting Control over File
FileInputStream fis = new FileInputStream("C:\\Users\\bewosaurabh\\Documents\\GetDataFile.xlsx");
//2 Creating a Workbook
Workbook wb = WorkbookFactory.create(fis);
//3 Getting Control over Sheet
Sheet sh = wb.getSheet("Sheet1");
.
......
What I don't understand is why we need to create a Workbook before reading the Excel file? An Excel file is also called as Workbook (as we can see in below picture).
When we Create an excel file that means we are creating a Workbook. From there, we access the Sheets followed by rows and columns.
I don't understand why we write write WorkbookFactory.create(fis);
when we already have a 'Workbook'
We should have some methods to get the Workbook we have created like we have for Rows(getRow), Sheets (getSheet), Cells (getCell).
Can you help me understand POI?
回答1:
What Workbook wb = WorkbookFactory.create(fis);
does is:
Instantiating a Java
object, which implements Workbook, from the content of the file read using the file input stream. After that the Workbook
object is then located in memory. And only after having access to this Workbook
object we can use its methods.
If we would use Workbook wb = WorkbookFactory.create(file);
, that is using a File
instead of an InputStream
, then the WorkbookFactory
would create the Workbook
object directly from the file. The advantage of this is that not the whole file content must be read into the memory. So we have a lower memory footprint. The disadvantage is that the file which is opened for reading cannot be used for writing the same time. So we cannot write changings we have made using the methods of Workbook
into the same file we have read the Workbook
from.
If memory footprint is a bigger issue, then for XSSF (*.xlsx
), we can get at the underlying XML data and process it using XSSF and SAX (Event API). Using this we need not to instantiate a Workbook
object. Instead we are reading and parsing the XML
directly from the OPCPackage
which is a ZipPackage
in case of XSSF (*.xlsx
) since a *.xlsx
is simply a ZIP
archive containing a directory structure containing XML
files and other files.
Since a *.xlsx
is simply a ZIP
archive we also could opening it as FileSystem gotten from FileSystems and then process its content totally independent from third party libraries. But this will be the most challenging approach.
来源:https://stackoverflow.com/questions/45708114/why-we-need-to-create-a-workbook-before-reading-the-excel-file-using-apache-poi