How do I read a complex Excel file with merged cells and save it as an XML file using VB.NET?

Time: 2021-06-24 14:01:19

I have this code, which reads an Excel file and saves it as an XML file.

Imports Microsoft.Office.Interop.Excel
Imports System.Xml
Imports System.IO

Module Module1
Sub Main()
    Try
        Dim excel As Application = New Application
        Dim filename As String = "person"
        Dim file_extension As String
        Dim path As String = "C:\Users\"
        Dim w As Workbook
        Try
            file_extension = "xlsx"
            w = excel.Workbooks.Open(path & filename + "." & file_extension)
        Catch ex As Exception
            file_extension = "xls"
            w = excel.Workbooks.Open(path & filename + "." & file_extension)
        End Try

        For i As Integer = 1 To w.Sheets.Count
            Dim sheet As Worksheet = w.Sheets(i)
            Dim r As Range = sheet.UsedRange
            Dim array(,) As Object = r.Value(XlRangeValueDataType.xlRangeValueDefault)

            If array IsNot Nothing Then

                Dim bound0 As Integer = array.GetUpperBound(0)
                Dim bound1 As Integer = array.GetUpperBound(1)

                Dim settings As XmlWriterSettings = New XmlWriterSettings()
                settings.Indent = True

                Using writer As XmlWriter = XmlWriter.Create(filename + ".xml", settings)
                    writer.WriteStartDocument()
                    writer.WriteStartElement(filename)
                    For j As Integer = 2 To bound0
                        writer.WriteStartElement(sheet.Name)
                        For x As Integer = 1 To bound1
                            writer.WriteElementString(array(1, x), array(j, x))
                        Next
                        writer.WriteEndElement()
                    Next
                    writer.WriteEndElement()
                    writer.WriteEndDocument()
                End Using
            End If
        Next
        w.Close()
    Catch ex As Exception
        Console.WriteLine("MS Excel file is invalid.")
        Console.WriteLine(ex.Message)
        Console.ReadKey()
    End Try
End Sub
End Module

For example, suppose this is my Excel file:

File name: person.xlsx, sheet name: personfile

Name     Age     Gender
John     5       M
Jane     4       F

Then the XML output looks like this:

<person>
 <personfile>
  <Name>John</Name>
  <Age>5</Age>
  <Gender>M</Gender>
 </personfile>
 <personfile>
  <Name>Jane</Name>
  <Age>4</Age>
  <Gender>F</Gender>
 </personfile>
</person>

It is saved as person.xml.

My question: what if the Excel file has merged cells, and how do I fix the resulting error? When the file has merged cells, the program reports:

ERROR: Index and length must refer to a location within the string
Parameter name: length

Here's the sample Excel file that I am supposed to retrieve.

P.S. The sheet also contains combo boxes.

2 solutions

#1


This works on a test sheet I made with a couple of different merged cell situations:

Sub Main()
    Try
        Dim excel As Application = New Application
        Dim filename As String = "person"
        Dim file_extension As String
        Dim path As String = "C:\Users\"
        Dim w As Workbook
        Try
            file_extension = "xlsx"
            w = excel.Workbooks.Open(path & filename & "." & file_extension)
        Catch ex As Exception
            file_extension = "xls"
            w = excel.Workbooks.Open(path & filename & "." & file_extension)
        End Try

        For i As Integer = 1 To w.Sheets.Count
            'sheet and r are late-bound (As Object), so this needs Option Strict Off
            Dim sheet As Object = w.Sheets(i)
            Dim r As Object = sheet.UsedRange

            'Changes to your original code begin here

            Dim bound0 As Integer = r.Rows.Count
            Dim bound1 As Integer = r.Columns.Count
            'Row 0 and column 0 stay unused so the 1-based loops below still work
            Dim array(bound0, bound1) As Object
            For a As Integer = 1 To bound0
                For b As Integer = 1 To bound1
                    'Read cell by cell; a cell that cannot be read is treated as empty
                    Try
                        array(a, b) = r.Cells(a, b).Value
                    Catch
                        array(a, b) = Nothing
                    End Try
                Next
            Next

            If array IsNot Nothing Then 'I left this in, though I can't imagine how it could be needed now

                Dim settings As XmlWriterSettings = New XmlWriterSettings()
                settings.Indent = True

                'Note: every sheet writes to the same file name, as in your original code
                Using writer As XmlWriter = XmlWriter.Create(filename & ".xml", settings)
                    writer.WriteStartDocument()
                    writer.WriteStartElement(filename)
                    For j As Integer = 2 To bound0
                        writer.WriteStartElement(sheet.Name)
                        For x As Integer = 1 To bound1
                            If array(j, x) IsNot Nothing Then
                                'Walk left to the nearest non-empty header cell, stopping at column 1
                                Dim h As Integer = x
                                Do Until h = 1 OrElse array(1, h) IsNot Nothing
                                    h -= 1
                                Loop
                                'Skip the value if no header was found to its left
                                If array(1, h) IsNot Nothing Then
                                    writer.WriteElementString(Convert.ToString(array(1, h)), Convert.ToString(array(j, x)))
                                End If

                                'No more changes to your code after this point

                            End If
                        Next
                        writer.WriteEndElement()
                    Next
                    writer.WriteEndElement()
                    writer.WriteEndDocument()
                End Using
            End If
        Next
        w.Close()
        excel.Quit() 'Release the Excel instance started above
    Catch ex As Exception
        Console.WriteLine("MS Excel file is invalid.")
        Console.WriteLine(ex.Message)
        Console.ReadKey()
    End Try
End Sub
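
A possible variation, offered only as a sketch: Excel stores the value of a merged block in its top-left cell, and the other cells in the block read as empty. Instead of walking left through the header row, the value could be looked up through the cell's MergeArea. The module and function names below (MergedCellSketch, GetEffectiveValue) are made up for illustration and are not part of the code above. The sketch needs a reference to the Excel interop assembly and a running Excel, so treat it as untested.

```vbnet
Imports Microsoft.Office.Interop.Excel

Module MergedCellSketch
    ' Hypothetical helper: returns the value displayed for a single cell,
    ' even when the cell is a non-top-left member of a merged block
    ' (where its own Value is Nothing).
    Function GetEffectiveValue(cell As Range) As Object
        ' For a single cell, MergeCells is True when the cell belongs to a merged block.
        If CBool(cell.MergeCells) Then
            ' MergeArea is the whole merged block; its first cell holds the value.
            Dim topLeft As Range = CType(cell.MergeArea.Cells(1, 1), Range)
            Return topLeft.Value
        End If
        Return cell.Value
    End Function
End Module
```

With early-bound variables, the read loop could then use array(a, b) = GetEffectiveValue(CType(r.Cells(a, b), Range)). Every cell of a merged block would then carry the same value, so a merged data cell would be written once per column it spans. Whether that is wanted depends on the sheet.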

#2


The code treats the table as a two-dimensional array with no merged cells. The best approach would be to apply it only to the part(s) of the table that fit those criteria, i.e. the parts that don't have merged cells in them.

Depending on how fixed or varied the structure is from document to document, this can be easy or very hard.

Assuming the data you need is always in the same fixed place, you can set the r variable to the relevant range instead of the whole sheet.
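
As a rough sketch of that idea, the helper below reads a fixed block instead of sheet.UsedRange. The names RangeSketch and ReadBlock and the address "A3:C10" used in the note are placeholders, not taken from the question. It assumes the Excel interop reference from the original code and has not been run against a real workbook.

```vbnet
Imports Microsoft.Office.Interop.Excel

Module RangeSketch
    ' Hypothetical helper: reads only a fixed block of the sheet.
    ' "address" is a placeholder such as "A3:C10"; substitute the real location.
    Function ReadBlock(sheet As Worksheet, address As String) As Object(,)
        Dim r As Range = sheet.Range(address)
        ' For a multi-cell range, Value returns a 1-based two-dimensional array.
        ' A single-cell range returns a scalar instead, so this cast would fail there.
        Return CType(r.Value(XlRangeValueDataType.xlRangeValueDefault), Object(,))
    End Function
End Module
```

In the question's code, Dim array(,) As Object = ReadBlock(sheet, "A3:C10") would then replace the two lines that set r and array, with the first row of the block serving as the header row.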
