将字符串转换为xml并插入Sql Server

时间:2022-05-17 21:28:30

We have a SQL Server 2008 R2 database table with XML stored in a column of VARCHAR data type.

我们有一个SQL Server 2008 R2数据库表,其中XML存储在VARCHAR数据类型的列中。

I now have to fetch some of the elements of the xml.

我现在必须获取xml的一些元素。

So I want to first convert the xml stored as a VARCHAR data type, to an xml stored as xml data type.

所以我想首先将存储为VARCHAR数据类型的xml转换为存储为xml数据类型的xml。

Example :

示例:

Table A

表A.

Id(int) , ProductXML (varchar(max))

Table B

表B.

Id(int), ProductXML(XML)

I want to convert the ProductXML from Table A into XML data type and insert into Table B.

我想将表A中的ProductXML转换为XML数据类型并插入表B.

I tried using the CAST() and CONVERT() function as shown below :

我尝试使用CAST()和CONVERT()函数,如下所示:

insert into TableB (ProductXML)
select CAST(ProductXML as XML) from TableA;

Similarly tried convert but I get an error

同样尝试转换,但我收到一个错误

XML Parsing : unable to switch encoding

XML解析:无法切换编码

Is there any way I can convert the varchar entries in the table into XML entries ?

有什么办法可以将表中的varchar条目转换为XML条目吗?

About the XML: it is huge with many nodes, and its structure changes dynamically.

关于XML:它有很多节点,其结构动态变化。

Example : One row can have and XML entry for 1 product and another row can have an xml entry for multiple products.

示例:一行可以包含1个产品的XML条目,另一行可以包含多个产品的xml条目。

2 个解决方案

#1


15  

Give us a sample of your XML as all these would work:

给我们一个XML样本,因为所有这些都可行:

CONVERT(XML, '<root><child/></root>')
CONVERT(XML, '<root>          <child/>         </root>', 1)
CAST('<Name><FName>Carol</FName><LName>Elliot</LName></Name>'  AS XML)

Also you might have to cast it to nvarchar or varbinary first (from Microsoft documentation):

您也可能必须先将其强制转换为nvarchar或varbinary(来自Microsoft文档):

You can parse any of the SQL Server string data types, such as [n][var]char, [n]text, varbinary,and image, into the xml data type by casting (CAST) or converting (CONVERT) the string to the xml data type. Untyped XML is checked to confirm that it is well formed. If there is a schema associated with the xml type, validation is also performed. For more information, see Compare Typed XML to Untyped XML.

您可以通过转换(CAST)或转换(CONVERT)字符串将任何SQL Server字符串数据类型(例如[n] [var] char,[n] text,varbinary和image)解析为xml数据类型xml数据类型。检查未分类的XML以确认它是否格式正确。如果存在与xml类型关联的模式,则还会执行验证。有关更多信息,请参阅将Typed XML与Untyped XML进行比较。

XML documents can be encoded with different encodings (for example, UTF-8, UTF-16, windows-1252). The following outlines the rules on how the string and binary source types interact with the XML document encoding and how the parser behaves.

XML文档可以使用不同的编码进行编码(例如,UTF-8,UTF-16,windows-1252)。下面概述了字符串和二进制源类型如何与XML文档编码交互以及解析器的行为方式的规则。

Since nvarchar assumes a two-byte unicode encoding such as UTF-16 or UCS-2, the XML parser will treat the string value as a two-byte Unicode encoded XML document or fragment. This means that the XML document needs to be encoded in a two-byte Unicode encoding as well to be compatible with the source data type. A UTF-16 encoded XML document can have a UTF-16 byte order mark (BOM), but it does not need to, since the context of the source type makes it clear that it can only be a two-byte Unicode encoded document.

由于nvarchar采用双字节unicode编码(如UTF-16或UCS-2),因此XML解析器会将字符串值视为双字节Unicode编码的XML文档或片段。这意味着XML文档需要以双字节Unicode编码进行编码,以便与源数据类型兼容。 UTF-16编码的XML文档可以具有UTF-16字节顺序标记(BOM),但它不需要,因为源类型的上下文清楚地表明它只能是一个双字节的Unicode编码文档。

The content of a varchar string is treated as a one-byte encoded XML document/fragment by the XML parser. Since the varchar source string has a code page associated, the parser will use that code page for the encoding if no explicit encoding is specified in the XML itself If an XML instance has a BOM or an encoding declaration, the BOM or declaration needs to be consistent with the code page, otherwise the parser will report an error.

通过XML解析器将varchar字符串的内容视为单字节编码的XML文档/片段。由于varchar源字符串具有关联的代码页,因此如果XML本身未指定显式编码,则解析器将使用该代码页进行编码如果XML实例具有BOM或编码声明,则BOM或声明需要为与代码页一致,否则解析器将报告错误。

The content of varbinary is treated as a codepoint stream that is passed directly to the XML parser. Thus, the XML document or fragment needs to provide the BOM or other encoding information inline. The parser will only look at the stream to determine the encoding. This means that UTF-16 encoded XML needs to provide the UTF-16 BOM and an instance without BOM and without a declaration encoding will be interpreted as UTF-8.

varbinary的内容被视为直接传递给XML解析器的代码点流。因此,XML文档或片段需要内联提供BOM或其他编码信息。解析器只会查看流以确定编码。这意味着UTF-16编码的XML需要提供UTF-16 BOM和没有BOM的实例,并且没有声明编码将被解释为UTF-8。

If the encoding of the XML document is not known in advance and the data is passed as string or binary data instead of XML data before casting to XML, it is recommended to treat the data as varbinary. For example, when reading data from an XML file using OpenRowset(), one should specify the data to be read as a varbinary(max) value:

如果事先不知道XML文档的编码,并且在转换为XML之前将数据作为字符串或二进制数据而不是XML数据传递,则建议将数据视为varbinary。例如,当使用OpenRowset()从XML文件读取数据时,应该将要读取的数据指定为varbinary(max)值:

select CAST(x as XML) 
from OpenRowset(BULK 'filename.xml', SINGLE_BLOB) R(x)

SQL Server internally represents XML in an efficient binary representation that uses UTF-16 encoding. User-provided encoding is not preserved, but is considered during the parse process.

SQL Server在内部使用UTF-16编码的高效二进制表示形式表示XML。用户提供的编码不会保留,但会在解析过程中考虑。

Solution:

解:

CONVERT(XML, CONVERT(NVARCHAR(max), ProductXML))

#2


2  

This worked for me:

这对我有用:

select CAST(REPLACE(CAST(column3 AS NVARCHAR(MAX)),'utf-8','utf-16') AS XML) from table

#1


15  

Give us a sample of your XML as all these would work:

给我们一个XML样本,因为所有这些都可行:

CONVERT(XML, '<root><child/></root>')
CONVERT(XML, '<root>          <child/>         </root>', 1)
CAST('<Name><FName>Carol</FName><LName>Elliot</LName></Name>'  AS XML)

Also you might have to cast it to nvarchar or varbinary first (from Microsoft documentation):

您也可能必须先将其强制转换为nvarchar或varbinary(来自Microsoft文档):

You can parse any of the SQL Server string data types, such as [n][var]char, [n]text, varbinary,and image, into the xml data type by casting (CAST) or converting (CONVERT) the string to the xml data type. Untyped XML is checked to confirm that it is well formed. If there is a schema associated with the xml type, validation is also performed. For more information, see Compare Typed XML to Untyped XML.

您可以通过转换(CAST)或转换(CONVERT)字符串将任何SQL Server字符串数据类型(例如[n] [var] char,[n] text,varbinary和image)解析为xml数据类型xml数据类型。检查未分类的XML以确认它是否格式正确。如果存在与xml类型关联的模式,则还会执行验证。有关更多信息,请参阅将Typed XML与Untyped XML进行比较。

XML documents can be encoded with different encodings (for example, UTF-8, UTF-16, windows-1252). The following outlines the rules on how the string and binary source types interact with the XML document encoding and how the parser behaves.

XML文档可以使用不同的编码进行编码(例如,UTF-8,UTF-16,windows-1252)。下面概述了字符串和二进制源类型如何与XML文档编码交互以及解析器的行为方式的规则。

Since nvarchar assumes a two-byte unicode encoding such as UTF-16 or UCS-2, the XML parser will treat the string value as a two-byte Unicode encoded XML document or fragment. This means that the XML document needs to be encoded in a two-byte Unicode encoding as well to be compatible with the source data type. A UTF-16 encoded XML document can have a UTF-16 byte order mark (BOM), but it does not need to, since the context of the source type makes it clear that it can only be a two-byte Unicode encoded document.

由于nvarchar采用双字节unicode编码(如UTF-16或UCS-2),因此XML解析器会将字符串值视为双字节Unicode编码的XML文档或片段。这意味着XML文档需要以双字节Unicode编码进行编码,以便与源数据类型兼容。 UTF-16编码的XML文档可以具有UTF-16字节顺序标记(BOM),但它不需要,因为源类型的上下文清楚地表明它只能是一个双字节的Unicode编码文档。

The content of a varchar string is treated as a one-byte encoded XML document/fragment by the XML parser. Since the varchar source string has a code page associated, the parser will use that code page for the encoding if no explicit encoding is specified in the XML itself If an XML instance has a BOM or an encoding declaration, the BOM or declaration needs to be consistent with the code page, otherwise the parser will report an error.

通过XML解析器将varchar字符串的内容视为单字节编码的XML文档/片段。由于varchar源字符串具有关联的代码页,因此如果XML本身未指定显式编码,则解析器将使用该代码页进行编码如果XML实例具有BOM或编码声明,则BOM或声明需要为与代码页一致,否则解析器将报告错误。

The content of varbinary is treated as a codepoint stream that is passed directly to the XML parser. Thus, the XML document or fragment needs to provide the BOM or other encoding information inline. The parser will only look at the stream to determine the encoding. This means that UTF-16 encoded XML needs to provide the UTF-16 BOM and an instance without BOM and without a declaration encoding will be interpreted as UTF-8.

varbinary的内容被视为直接传递给XML解析器的代码点流。因此,XML文档或片段需要内联提供BOM或其他编码信息。解析器只会查看流以确定编码。这意味着UTF-16编码的XML需要提供UTF-16 BOM和没有BOM的实例,并且没有声明编码将被解释为UTF-8。

If the encoding of the XML document is not known in advance and the data is passed as string or binary data instead of XML data before casting to XML, it is recommended to treat the data as varbinary. For example, when reading data from an XML file using OpenRowset(), one should specify the data to be read as a varbinary(max) value:

如果事先不知道XML文档的编码,并且在转换为XML之前将数据作为字符串或二进制数据而不是XML数据传递,则建议将数据视为varbinary。例如,当使用OpenRowset()从XML文件读取数据时,应该将要读取的数据指定为varbinary(max)值:

select CAST(x as XML) 
from OpenRowset(BULK 'filename.xml', SINGLE_BLOB) R(x)

SQL Server internally represents XML in an efficient binary representation that uses UTF-16 encoding. User-provided encoding is not preserved, but is considered during the parse process.

SQL Server在内部使用UTF-16编码的高效二进制表示形式表示XML。用户提供的编码不会保留,但会在解析过程中考虑。

Solution:

解:

CONVERT(XML, CONVERT(NVARCHAR(max), ProductXML))

#2


2  

This worked for me:

这对我有用:

select CAST(REPLACE(CAST(column3 AS NVARCHAR(MAX)),'utf-8','utf-16') AS XML) from table