CakePHP Xml utility library triggers DOMDocument warning

Time: 2022-06-01 18:59:20

I'm generating XML in a view with CakePHP's Xml core library:

$xml = Xml::build($data, array('return' => 'domdocument'));
echo $xml->saveXML();

The view is fed from the controller with an array:

$this->set(
    array(
        'data' => array(
            'root' => array(
                array(
                    '@id' => 'A & B: OK',
                    'name' => 'C & D: OK',
                    'sub1' => array(
                        '@id' => 'E & F: OK',
                        'name' => 'G & H: OK',
                        'sub2' => array(
                            array(
                                '@id' => 'I & J: OK',
                                'name' => 'K & L: OK',
                                'sub3' => array(
                                    '@id' => 'M & N: OK',
                                    'name' => 'O & P: OK',
                                    'sub4' => array(
                                        '@id' => 'Q & R: OK',
                                        '@'   => 'S & T: ERROR',
                                    ),
                                ),
                            ),
                        ),
                    ),
                ),
            ),
        ),
    )
);

For whatever reason, CakePHP issues an internal call like this:

$dom = new DOMDocument;
$key = 'sub4';
$childValue = 'S & T: ERROR';
$dom->createElement($key, $childValue);

... which triggers a PHP warning:

Warning (2): DOMDocument::createElement(): unterminated entity reference               T [CORE\Cake\Utility\Xml.php, line 292]

... because (as documented) DOMDocument::createElement does not escape values. However, the warning only appears for certain nodes, as the test case illustrates.
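The difference can be reproduced without CakePHP. The following is only a minimal sketch, not CakePHP's actual code: it temporarily installs an error handler (the `$warnings` and `$attrOnly` names are illustrative) and shows that a value set through setAttribute() is escaped, while the second argument of createElement() is parsed as markup and raises the warning.

```php
<?php
// Collect warnings instead of printing them, so the script keeps running.
$warnings = array();
set_error_handler(function ($severity, $message) use (&$warnings) {
    $warnings[] = $message;
    return true; // mark the warning as handled
});

$dom = new DOMDocument();

// Attribute values are treated as plain text and escaped on output.
$attrOnly = $dom->createElement('sub4');
$attrOnly->setAttribute('id', 'Q & R: OK');
$attrXml = $dom->saveXML($attrOnly);

// The second argument is parsed as XML content, so a bare "&" raises a warning.
$dom->createElement('sub4', 'S & T: ERROR');

restore_error_handler();

echo $attrXml, "\n"; // <sub4 id="Q &amp; R: OK"/>
foreach ($warnings as $warning) {
    echo $warning, "\n"; // the captured "unterminated entity reference" message
}
```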

Am I doing something wrong, or did I just hit a bug in CakePHP?

4 answers

#1


15  

This might be a bug in PHP's DOMDocument::createElement() method, but you can avoid it: create the text node separately and append it to the element node.

$dom = new DOMDocument;
$dom
  ->appendChild($dom->createElement('element'))
  ->appendChild($dom->createTextNode('S & T: ERROR'));
var_dump($dom->saveXml());

Output: https://eval.in/134277

string(58) "<?xml version="1.0"?>
<element>S &amp; T: ERROR</element>
"

This is the intended way to add text nodes to a DOM. You always create a node (element, text, CDATA, ...) and append it to its parent node. You can add more than one node, and different kinds of nodes, to one parent, as in the following example:

$dom = new DOMDocument;
$p = $dom->appendChild($dom->createElement('p'));
$p->appendChild($dom->createTextNode('Hello '));
$b = $p->appendChild($dom->createElement('b'));
$b->appendChild($dom->createTextNode('World!'));
echo $dom->saveXml();

Output:

<?xml version="1.0"?>
<p>Hello <b>World!</b></p>
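Applied to the sub4 node from the question, the same pattern might look like the sketch below. The createTextElement() helper is hypothetical (it is not part of PHP or CakePHP); it only wraps the createElement(), setAttribute() and createTextNode() calls so that both the attributes and the text are escaped by the DOM extension itself.

```php
<?php
// Hypothetical helper: builds <name attr="...">text</name> and lets the DOM
// extension escape both the attribute values and the text content.
function createTextElement(DOMDocument $dom, $name, $text, array $attributes = array())
{
    $element = $dom->createElement($name);
    foreach ($attributes as $attrName => $attrValue) {
        $element->setAttribute($attrName, $attrValue);
    }
    // The text goes into its own text node instead of createElement()'s second argument.
    $element->appendChild($dom->createTextNode($text));
    return $element;
}

$dom = new DOMDocument();
$sub4 = createTextElement($dom, 'sub4', 'S & T: ERROR', array('id' => 'Q & R: OK'));
$dom->appendChild($sub4);

echo $dom->saveXML($sub4); // <sub4 id="Q &amp; R: OK">S &amp; T: ERROR</sub4>
```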

#2


4  

This happens because the DOMDocument methods expect the value to already be well-formed markup; a bare character such as & breaks the content and generates an unterminated entity reference error.

Just run the value through htmlentities() before using it to create elements:

$dom = new DOMDocument;
$key = 'sub4';
$childValue = htmlentities('S & T: ERROR');
$dom->createElement($key, $childValue);
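One caveat worth illustrating: htmlentities() also converts non-ASCII characters into named HTML entities, and XML itself predefines only five entities (&amp;amp;, &amp;lt;, &amp;gt;, &amp;quot; and &amp;apos;). The sketch below compares it with htmlspecialchars() and the ENT_XML1 flag (available since PHP 5.4); the sample string and variable names are made up for the illustration.

```php
<?php
// "Caf\xC3\xA9" is "Café" encoded as UTF-8.
$value = "Caf\xC3\xA9 & Bar";

// htmlentities() also turns é into the HTML-only named entity &eacute;.
$viaEntities = htmlentities($value, ENT_QUOTES, 'UTF-8');

// htmlspecialchars() with ENT_XML1 only touches the XML special characters.
$viaSpecialChars = htmlspecialchars($value, ENT_XML1, 'UTF-8');

echo $viaEntities, "\n";     // Caf&eacute; &amp; Bar
echo $viaSpecialChars, "\n"; // Café &amp; Bar
```

For values that end up in an XML document, the second form is usually the safer choice, because the result contains no entity that an XML parser would not know.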

#3


0  

It is because of this character: &. You need to replace it with the corresponding entity, &amp;. To perform the translation, you can use the htmlspecialchars function. You have to escape the value when writing to the nodeValue property. As quoted from a bug report from 2005:

ampersands ARE properly encoded when setting the property textContent. Unfortunately they are not encoded when the text string is passed as the optional second argument to DOMElement::createElement. You must create a text node, set the textContent, then append the text node to the new element.

htmlspecialchars($string, ENT_QUOTES, 'UTF-8');

This is the translation table:

'&' (ampersand) becomes '&amp;'
'"' (double quote) becomes '&quot;' when ENT_NOQUOTES is not set.
"'" (single quote) becomes '&#039;' (or &apos;) only when ENT_QUOTES is set.
'<' (less than) becomes '&lt;'
'>' (greater than) becomes '&gt;'
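To make the effect of the quote flags in this table concrete, here is a small sketch that runs one made-up input string through htmlspecialchars() with ENT_NOQUOTES, ENT_COMPAT and ENT_QUOTES. The variable names are illustrative only.

```php
<?php
$input = '<a href="x">Tom & Jerry\'s</a>';

// ENT_NOQUOTES: neither kind of quote is converted.
$noQuotes = htmlspecialchars($input, ENT_NOQUOTES, 'UTF-8');

// ENT_COMPAT: double quotes are converted, single quotes are left alone.
$compat = htmlspecialchars($input, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES: both double and single quotes are converted.
$quotes = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

echo $noQuotes, "\n"; // &lt;a href="x"&gt;Tom &amp; Jerry's&lt;/a&gt;
echo $compat, "\n";   // &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry's&lt;/a&gt;
echo $quotes, "\n";   // &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&#039;s&lt;/a&gt;
```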

This script will do the translations recursively:

<?php
function clean($type) {
  if(is_array($type)) {
    foreach($type as $key => $value){
        $type[$key] = clean($value);
    }
    return $type;
  } else {
    $string = htmlspecialchars($type, ENT_QUOTES, 'UTF-8');
    return $string;
  }
}

$data = array(
    'data' => array(
        'root' => array(
            array(
                '@id' => 'A & B: OK',
                'name' => 'C & D: OK',
                'sub1' => array(
                    '@id' => 'E & F: OK',
                    'name' => 'G & H: OK',
                    'sub2' => array(
                        array(
                            '@id' => 'I & J: OK',
                            'name' => 'K & L: OK',
                            'sub3' => array(
                                '@id' => 'M & N: OK',
                                'name' => 'O & P: OK',
                                'sub4' => array(
                                    '@id' => 'Q & R: OK',
                                    '@' => 'S & T: ERROR',
                                ) ,
                            ) ,
                        ) ,
                    ) ,
                ) ,
            ) ,
        ) ,
    ) ,
);
$data = clean($data);

Output

Array
(
    [data] => Array
        (
            [root] => Array
                (
                    [0] => Array
                        (
                            [@id] => A &amp; B: OK
                            [name] => C &amp; D: OK
                            [sub1] => Array
                                (
                                    [@id] => E &amp; F: OK
                                    [name] => G &amp; H: OK
                                    [sub2] => Array
                                        (
                                            [0] => Array
                                                (
                                                    [@id] => I &amp; J: OK
                                                    [name] => K &amp; L: OK
                                                    [sub3] => Array
                                                        (
                                                            [@id] => M &amp; N: OK
                                                            [name] => O &amp; P: OK
                                                            [sub4] => Array
                                                                (
                                                                    [@id] => Q &amp; R: OK
                                                                    [@] => S &amp; T: ERROR
                                                                )
                                                        )
                                                )
                                        )
                                )
                        )
                )
        )
)

#4


-1  

The problem seems to be in nodes that have both attributes and a value, and thus need to use the @ syntax:

'@id' => 'A & B: OK',  // <-- Handled as plain text
'name' => 'C & D: OK', // <-- Handled as plain text
'@' => 'S & T: ERROR', // <-- Handled as raw XML

I've written a little helper function:

protected function escapeXmlValue($value){
    return is_null($value) ? null : htmlspecialchars($value, ENT_XML1, 'UTF-8');
}

... and take care of calling it manually when I create the array:

'@id' => 'A & B: OK',
'name' => 'C & D: OK',
'@' => $this->escapeXmlValue('S & T: NOW WORKS FINE'),

It's hard to say whether it's a bug or a feature, since the documentation doesn't mention it.
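If calling the helper by hand for every node becomes tedious, the same idea could be applied to the whole array in one pass. The sketch below is hypothetical: escapeRawValues() is not a CakePHP function, and it assumes (based on the observation above) that only values stored under the bare '@' key are passed on unescaped. If that assumption does not hold for a given CakePHP version, values could end up escaped twice.

```php
<?php
// Hypothetical helper: walks the array and escapes only the values stored
// under the bare '@' key, leaving '@id', 'name', etc. untouched.
function escapeRawValues(array $data)
{
    foreach ($data as $key => $value) {
        if (is_array($value)) {
            $data[$key] = escapeRawValues($value);
        } elseif ($key === '@' && $value !== null) {
            $data[$key] = htmlspecialchars($value, ENT_XML1, 'UTF-8');
        }
    }
    return $data;
}

$node = array(
    'sub4' => array(
        '@id' => 'Q & R: OK',
        '@'   => 'S & T: ERROR',
    ),
);

$escaped = escapeRawValues($node);
echo $escaped['sub4']['@id'], "\n"; // Q & R: OK
echo $escaped['sub4']['@'], "\n";   // S &amp; T: ERROR
```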
