How can I copy whole sheets from a large Excel file without parsing them with Apache POI?

Time: 2021-09-15 15:40:19

I am new as a registered user. I have always found my answers here, but now I have to ask.

I am working with the latest apache-poi release, 3.8 (from 2012/03/26), and I have to manipulate a single file with multiple sheets, only one of which contains a large amount of data (over 1,000 columns and 10,000 rows).

I only need to add more columns to the big sheet. With the tools that currently exist, I should therefore use SAX to read it and SXSSF to rewrite it.

The Excel file is already preformatted with different styles and images in every sheet, so it would be helpful to be able to make a copy of the file without the big sheet.

Here is my question: how can I make a copy of a sheet with SAX (from the input stream) without parsing it? I tried to do it like in here, but the sheets field in XSSFWorkbook is private.

It would be great to have something like an SXSSFWriter.SheetIterator, if that is in the POI developers' future plans.

Thanks for reading,

Arthur

**Update**

The file is too big to be opened as an ordinary XSSFWorkbook (OutOfMemoryError). Would it be possible to create an XSSFSheet from an InputStream, as in the following?

    XSSFReader.SheetIterator iter =
            (XSSFReader.SheetIterator) xssfReader.getSheetsData();
    int index = 0;
    while (iter.hasNext()) {
        InputStream stream = iter.next();
        String sheetName = iter.getSheetName();
        if (!sheetName.equalsIgnoreCase("BigSheetThatIDontWant")) {
            // Hypothetical constructor: this is the API the question asks for.
            Sheet newSheet = new XSSFSheet(stream);
        }
        stream.close(); // close every stream, including the skipped sheet's
        ++index;
    }

Thanks a lot for your answers.

2 Solutions

#1


4  

You will have to read the file.

Regarding the second question, see public java.util.Iterator<XSSFSheet> XSSFWorkbook.iterator()

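It allows for-each loops:

    // "package" is a reserved word in Java, so the variable is named pkg here.
    OPCPackage pkg = OPCPackage.open("workbook.xlsx"); // placeholder path
    XSSFWorkbook wb = new XSSFWorkbook(pkg);
    for (XSSFSheet sheet : wb) {
        // work with each sheet here
    }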
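If the workbook cannot be loaded as a whole, the reading itself can still be done as a stream. Below is a minimal sketch, assuming the raw XML of a sheet is available as an InputStream (for example the streams returned by XSSFReader.getSheetsData() in the question) and that rows and cells use the unprefixed element names row and c. The class name SheetRowCounter is made up for illustration. It only counts rows and cells with the JDK's SAX parser, so the sheet is never held in memory:

```java
import java.io.InputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: counts rows and cells of one sheet's XML while streaming it. */
class SheetRowCounter extends DefaultHandler {
    int rows;
    int cells;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        // The default parser is not namespace-aware, so qName carries the raw tag name.
        if ("row".equals(qName)) {
            rows++;
        } else if ("c".equals(qName)) {
            cells++;
        }
    }

    /** Parses the given sheet XML stream and returns the handler holding the counts. */
    static SheetRowCounter count(InputStream sheetXml) throws Exception {
        SheetRowCounter handler = new SheetRowCounter();
        SAXParserFactory.newInstance().newSAXParser().parse(sheetXml, handler);
        return handler;
    }
}
```

The same handler shape could forward each row to an SXSSF writer instead of counting, which is the read-with-SAX, write-with-SXSSF combination the question describes.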

#2


2  

Through my experience with POI and copy operations, I came to realize that, as long as performance is not critical, the safer and easier way to copy one or more sheets is to load the whole workbook, delete the unnecessary sheets, and then save the result to another file.

And +1 to Andy for the iterator.
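When the workbook is too large to load, as in the question, another option worth considering is to treat the .xlsx as what it is on disk, a ZIP archive, and copy its entries byte for byte without touching the XML inside. The sketch below uses only java.util.zip. RawPartCopy and copyParts are hypothetical names, and the part path in the usage note is just the typical layout, not something taken from the question.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Predicate;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper: copies the parts of an .xlsx (a ZIP container) without parsing them. */
class RawPartCopy {

    /**
     * Streams every entry whose name is accepted by keep from source to target.
     * Returns the number of entries copied.
     */
    static int copyParts(Path source, Path target, Predicate<String> keep) throws IOException {
        int copied = 0;
        byte[] buffer = new byte[8192];
        try (ZipInputStream in = new ZipInputStream(Files.newInputStream(source));
             ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(target))) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                if (!keep.test(entry.getName())) {
                    continue; // skip it; the next getNextEntry() moves past its data
                }
                // Re-create the entry by name and pipe its bytes through a small buffer.
                out.putNextEntry(new ZipEntry(entry.getName()));
                int n;
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                out.closeEntry();
                copied++;
            }
        }
        return copied;
    }
}
```

For instance, copyParts(in, out, name -> !name.equals("xl/worksheets/sheet2.xml")) would leave that one part out. Dropping a sheet part this way still leaves references to it in xl/workbook.xml, xl/_rels/workbook.xml.rels and [Content_Types].xml, so those parts would have to be adjusted as well before the result is a valid workbook.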
