How do I read empty but formatted Excel cells with Apache POI?

Date: 2021-05-18 20:23:52

I have a method for reading Excel cells using Apache POI, and it works fine. Well... almost fine.

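import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public static ArrayList<ArrayList<String>> readXLsXFile() throws FileNotFoundException, IOException {

        ArrayList<ArrayList<String>> outListaExcel = new ArrayList<>();

        // pathToExcelFile is assumed to be a field defined elsewhere in the class
        FileInputStream fis = new FileInputStream(pathToExcelFile);
        XSSFWorkbook workbook = new XSSFWorkbook(fis);
        XSSFSheet sheetAr = workbook.getSheetAt(0);
        Iterator<Row> rowsAr = sheetAr.rowIterator();
        while (rowsAr.hasNext()) {
            Row row1 = rowsAr.next();
            Iterator<Cell> cellsAr = row1.cellIterator();
            ArrayList<String> arr = new ArrayList<>();
            while (cellsAr.hasNext()) {
                Cell cell1 = cellsAr.next();
                // Styled but empty cells are still returned here, as empty strings
                arr.add(String.valueOf(cell1));
            }
            outListaExcel.add(arr);
        }
        return outListaExcel;
    }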

If cells are formatted, for example if the whole of column A has borders, the method keeps reading the empty cells and gives me empty strings. How can I ignore those empty (formatted) cells?

So readXLsXFile will give me an ArrayList with

[0] -> [1][2]
[1] -> [3][4] 

But it will also give me ten more nodes containing empty strings, because column A is formatted with borders.

Edit after Gagravarr's answer:

I can check whether subList is empty and, if so, not add it to mainList. But with very large .xls files, and many of them, that will take too long, and in general I don't think it is good practice.

My question was whether there is something for rows, like there is for cells, that I have overlooked.

ArrayList<ArrayList<String>> mainLista = new ArrayList<ArrayList<String>>();
for (int rowNum = rowStart; rowNum < rowEnd; rowNum++) {
    Row r = sheet.getRow(rowNum);
    if (r == null) {
        continue; // this row is not defined in the file at all
    }
    int lastColumn = r.getLastCellNum();
    ArrayList<String> subList = new ArrayList<String>();
    for (int cn = 0; cn < lastColumn; cn++) {
        // POI 4.x and later; older releases spell this Row.RETURN_BLANK_AS_NULL
        Cell c = r.getCell(cn, Row.MissingCellPolicy.RETURN_BLANK_AS_NULL);
        if (c != null) {
            subList.add(c.getStringCellValue());
        }
    }
    // I don't think this is a good way to do it,
    // because it still reads the empty rows
    if (!subList.isEmpty()) {
        mainLista.add(subList);
    }
}

1 answer

#1


As explained in the Apache POI documentation under "Iterate over rows and cells", the iterators only give you the rows and cells that are defined in the file, meaning those that have or once had content.

If you want to fetch cells with full control over blank or empty cells, you need to instead use something like:

// Decide which rows to process
int rowStart = Math.min(15, sheet.getFirstRowNum());
int rowEnd = Math.max(1400, sheet.getLastRowNum());

for (int rowNum = rowStart; rowNum < rowEnd; rowNum++) {
   Row r = sheet.getRow(rowNum);
   if (r == null) {
      // This whole row is undefined in the file; handle it as needed
      continue;
   }

   int lastColumn = Math.max(r.getLastCellNum(), MY_MINIMUM_COLUMN_COUNT);

   for (int cn = 0; cn < lastColumn; cn++) {
      // POI 4.x and later; older releases spell this Row.RETURN_BLANK_AS_NULL
      Cell c = r.getCell(cn, Row.MissingCellPolicy.RETURN_BLANK_AS_NULL);
      if (c == null) {
         // The spreadsheet is empty in this cell
      } else {
         // Do something useful with the cell's contents
      }
   }
}

If you want to fetch blank cells as well (typically those with styling but no values), try one of the other missing cell policies, e.g. RETURN_NULL_AND_BLANK.
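On the follow-up about rows: I am not aware of a row-level counterpart to the missing cell policies, so a row that exists only because of styling still has to be visited once. Below is a minimal sketch of doing the cell filtering and the empty-row check in a single pass. It deliberately has no POI dependency, and the names NonBlankRows, collect and cellText are hypothetical, not part of any library. The assumption is that cellText returns null or an empty string for a blank cell.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class NonBlankRows {

    /**
     * Collects the text of non-blank cells row by row and drops rows
     * that contain no text at all (for example, rows that only carry styling).
     *
     * @param rows     any iterable of rows, where each row iterates over its cells
     * @param cellText hypothetical adapter: returns the cell's text,
     *                 or null / "" when the cell is blank
     */
    public static <R extends Iterable<C>, C> List<List<String>> collect(
            Iterable<R> rows, Function<C, String> cellText) {
        List<List<String>> out = new ArrayList<>();
        for (R row : rows) {
            if (row == null) {
                continue; // undefined row
            }
            List<String> values = new ArrayList<>();
            for (C cell : row) {
                String text = (cell == null) ? null : cellText.apply(cell);
                if (text != null && !text.isEmpty()) {
                    values.add(text);
                }
            }
            if (!values.isEmpty()) {
                out.add(values); // keep only rows with at least one value
            }
        }
        return out;
    }
}
```

Since POI's Sheet iterates over Row and Row iterates over Cell, something like NonBlankRows.collect(sheet, formatter::formatCellValue) with a DataFormatter named formatter should fit this shape. That relies on formatCellValue returning an empty string for blank cells, which is my understanding of its behavior; check it against your POI version. The per-row emptiness check is cheap compared with parsing the file, so the extra pass over styled rows is unlikely to be the bottleneck.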
