Convert a large xlsx file to csv without using PHPExcel

Date: 2022-10-04 02:04:25

I have a large xlsx file (90 MB). Loading it with PHPExcel gives me:

Warning: simplexml_load_string(): Memory allocation failed : growing buffer

I tried loading the file with every method documented here, and also set memory_limit = -1 in php.ini.

I am trying to convert the xlsx file to a csv file so that it can be loaded easily.

Is there any way to convert an xlsx file to csv without using PHPExcel?

4 solutions

#1


3  

You can use Python:

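import csv
import os
import xlrd  # xlrd 2.0+ reads only .xls; an .xlsx input needs xlrd < 2.0 or another reader

filepath = '.'  # assumption: directory that holds result.xls
wb = xlrd.open_workbook(os.path.join(filepath, 'result.xls'))
sheet = wb.sheet_by_index(0)
# Python 3: text mode with newline='' lets the csv module control line endings
with open(os.path.join(filepath, 'result.csv'), 'w', newline='', encoding='utf-8') as fp:
    wr = csv.writer(fp, quoting=csv.QUOTE_ALL)
    for rownum in range(sheet.nrows):
        wr.writerow(sheet.row_values(rownum))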

#2


2  

XLSX files are zip archives. If you unzip your XLSX file and look in the folder xl/worksheets, you will find one XML file for each sheet of the workbook.
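As a quick check of that layout, here is a minimal Python sketch that lists the per-sheet XML parts without unpacking the archive to disk. The function name list_worksheet_parts is made up for illustration; only the standard zipfile module is used.

```python
import zipfile


def list_worksheet_parts(xlsx_path):
    """Return (member name, uncompressed size in bytes) for each sheet XML file."""
    with zipfile.ZipFile(xlsx_path) as archive:
        return sorted(
            (info.filename, info.file_size)
            for info in archive.infolist()
            # Sheet parts live directly under xl/worksheets/ and end in .xml;
            # the .rels files under xl/worksheets/_rels/ are skipped by the suffix test.
            if info.filename.startswith("xl/worksheets/")
            and info.filename.endswith(".xml")
        )
```

The uncompressed size is worth a look: a 90 MB xlsx can expand to a much larger XML file, which is probably why loading it in one piece exhausts memory.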

You may want to extract these XML files first and then parse the XML content element by element, so that the buffer needed to hold each element stays small. You can write your own PHP script to read the extracted file, or use an XML parser, to turn each sheet into XML objects and then dump them to CSV.

The resulting XML is structured something like this example (the important information is inside sheetData):

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" mc:Ignorable="x14ac" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac">
    <dimension ref="A1:J42"/>
    <sheetViews>
        <sheetView workbookViewId="0">
            <selection activeCell="C7" sqref="C7"/>
        </sheetView>
    </sheetViews>
    <sheetFormatPr defaultRowHeight="12.75" x14ac:dyDescent="0.2"/>
    <cols>
        <col min="1" max="1" width="18.140625" style="1" customWidth="1"/>
        <col min="2" max="16384" width="9.140625" style="1"/>
    </cols>
    <sheetData>
        <row r="1" spans="1:10" x14ac:dyDescent="0.2">
            <c r="B1" s="1" t="s"><v>0</v></c>
            <c r="C1" s="1" t="s"><v>1</v></c>
            <c r="D1" s="1" t="s"><v>2</v></c>
        </row>
        <row r="2" spans="1:10" x14ac:dyDescent="0.2">
            <c r="A2" s="1" t="s"><v>4</v></c><c r="B2" s="1"><v>200</v></c>
            <c r="C2" s="1"><v>200</v></c>
            <c r="D2" s="1"><v>100</v></c><c r="E2" s="1"><v>200</v></c>
        </row>
        <row r="3" spans="1:10" x14ac:dyDescent="0.2">
            <c r="A3" s="1" t="s"><v>10</v></c><c r="C3" s="1"><f>6*125</f><v>750</v></c>
            <c r="H3" s="1" t="s"><v>6</v></c><c r="I3" s="1"><v>130</v></c>
        </row>
    </sheetData>
    <pageMargins left="0.7" right="0.7" top="0.75" bottom="0.75" header="0.3" footer="0.3"/>
    <pageSetup paperSize="0" orientation="portrait" horizontalDpi="0" verticalDpi="0" copies="0"/>
</worksheet>

That is, you need to look at each cell (c tag) of each row (row tag) in the XML:

worksheet.sheetData.row[i].c[j].v

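and take the content of the value (v tag). For cells marked t="s", that value is an index into the shared string table in xl/sharedStrings.xml rather than the text itself.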
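The element-by-element approach can be sketched in Python with the standard library alone. This is an illustration under stated assumptions, not a complete converter: the helper names (column_index, load_shared_strings, iter_rows, xlsx_sheet_to_csv) are invented here, the default sheet path xl/worksheets/sheet1.xml is assumed, and dates, number formats and phonetic text runs are not handled.

```python
import csv
import re
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the worksheet XML shown above, in ElementTree's {uri}tag form.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def column_index(cell_ref):
    """Turn a cell reference such as 'C3' into a zero-based column index (2)."""
    letters = re.match(r"[A-Z]+", cell_ref).group(0)
    index = 0
    for ch in letters:
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


def load_shared_strings(archive):
    """Read xl/sharedStrings.xml (if present) into a list, one item at a time."""
    strings = []
    if "xl/sharedStrings.xml" not in archive.namelist():
        return strings
    with archive.open("xl/sharedStrings.xml") as stream:
        for _, elem in ET.iterparse(stream):
            if elem.tag == NS + "si":
                # An item may be split into several <t> runs; join them.
                strings.append("".join(t.text or "" for t in elem.iter(NS + "t")))
                elem.clear()  # drop the parsed children to keep memory low
    return strings


def iter_rows(archive, sheet_path, shared):
    """Yield one list of cell values per <row>, padding skipped columns with ''."""
    with archive.open(sheet_path) as stream:
        for _, elem in ET.iterparse(stream):
            if elem.tag != NS + "row":
                continue
            row = []
            for cell in elem.iter(NS + "c"):
                cell_type = cell.get("t")
                if cell_type == "inlineStr":
                    value = "".join(t.text or "" for t in cell.iter(NS + "t"))
                else:
                    value = cell.findtext(NS + "v", default="")
                    if cell_type == "s" and value != "":
                        value = shared[int(value)]  # <v> is a shared-string index
                ref = cell.get("r")
                col = column_index(ref) if ref else len(row)
                row.extend([""] * (col - len(row)))  # fill gaps such as a missing A1
                row.append(value)
            yield row
            elem.clear()  # free this row's cells before parsing the next row


def xlsx_sheet_to_csv(xlsx_path, csv_path, sheet_path="xl/worksheets/sheet1.xml"):
    """Stream one worksheet of an xlsx file into a CSV file."""
    with zipfile.ZipFile(xlsx_path) as archive:
        shared = load_shared_strings(archive)
        with open(csv_path, "w", newline="", encoding="utf-8") as out:
            writer = csv.writer(out)
            for row in iter_rows(archive, sheet_path, shared):
                writer.writerow(row)
```

Usage would look like xlsx_sheet_to_csv("big.xlsx", "big.csv"), where both file names are placeholders. Only the shared string table is held in memory in full; each row is discarded once written. For a formula cell such as C3 in the example, the cached result in the v tag (750) is exported, not the formula.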

#3


1  

Online converter, up to 100 MB file size:

http://www.zamzar.com/convert/xlsx-to-csv/

A tutorial covering three methods:

http://www.ehow.com/how_6921725_convert-xlsx-file-csv.html

Hope this helps...

#4


0  

You can do this with excel-vba; the code below scans A1 -> A10 and exports the first 5 columns of the "DATA" tab of the current workbook.

Sub exportCSV()

  Dim wkRange As Range
  Dim c As Range ' current cell of the scanned range

  Dim myPath As String, myFileName As String
  Dim fn As Integer ' File number
  Dim cLine As String ' current line to be written to file

  ' create output file:
  myPath = "C:\local\"
  myFileName = "out.csv"
  fn = FreeFile
  Open myPath & myFileName For Append As #fn
  Set wkRange = ThisWorkbook.Sheets("DATA").Range("$A1:$A10")
  For Each c In wkRange
  ' select your columns with "offset"
    cLine = c.Offset(0, 0).Value & ","
    cLine = cLine & c.Offset(0, 1).Value & ","
    cLine = cLine & c.Offset(0, 2).Value & ","
    cLine = cLine & c.Offset(0, 3).Value & ","
    cLine = cLine & c.Offset(0, 4).Value
    Print #fn, cLine
  Next
  Close #fn
  MsgBox "done!"

End Sub
