How do I parse HTML with PHP DOMDocument?

Time: 2022-10-30 09:32:47

I have an HTML block here:

<div class="title">
    <a href="http://test.com/asus_rt-n53/p195257/">
        Asus RT-N53
    </a>
</div>
<table>
    <tbody>
        <tr>
            <td class="price-status">
                <div class="status">
                    <span class="available">Yes</span>
                </div>
                <div name="price" class="price">
                    <div class="uah">758<span> ua.</span></div>
                    <div class="usd">$&nbsp;62</div>
                </div>

How do I parse the link (http://test.com/asus_rt-n53/p195257/), title (Asus RT-N53) and price (758)?

Curl code here:

$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;
$dom->loadHTML($content);
$xpath = new DOMXPath($dom);
$models = $xpath->query('//div[@class="title"]/a');
foreach ($models as $model) {
    echo $model->nodeValue;
    $prices = $xpath->query('//div[@class="uah"]');
    foreach ($prices as $price) {
        echo $price->nodeValue;
    }
}

1 solution

#1


One ugly solution is to cast the price text to an integer, which keeps only the leading digits:

echo (int) $price->nodeValue;
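
The cast works because PHP reads the leading digits of a string and ignores whatever follows. A minimal sketch of that behavior, using hard-coded strings as stand-ins for $price->nodeValue (the second string, with a thousands separator, is a made-up case and not from the question):

```php
<?php
// Stand-in for $price->nodeValue from the sample HTML (assumed for illustration).
$text = '758 ua.';

// (int) keeps the leading numeric part and drops the trailing text.
$amount = (int) $text;
var_dump($amount); // int(758)

// The cast stops at the first non-digit, so a space used as a
// thousands separator would silently truncate the value.
var_dump((int) '1 758 ua.'); // int(1)

// Stripping every non-digit character first is more robust for such strings.
$digits = (int) preg_replace('/\D+/', '', '1 758 ua.');
var_dump($digits); // int(1758)
```

For the markup in the question the plain cast is enough; the preg_replace variant only matters if prices can contain separators.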

Or you can query for the span inside the div and remove it from the price node (inside the prices foreach loop):

// Query relative to the current $price node so the span found is its own child.
$span = $xpath->query('span', $price)->item(0);
if ($span !== null) {
    $price->removeChild($span);
}
echo $price->nodeValue;

Edit:

To retrieve the link, simply call getAttribute() on the anchor node and ask for its href attribute:

echo $model->getAttribute('href');
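
Putting the pieces together, here is a sketch that pulls the link, title and price out of the sample markup. The closing tags at the end of the table are added here so the snippet is complete (the block in the question is cut off), and the variable names $link, $title and $priceValue are only illustrative. Instead of removing the span, it selects the div's own text node with text(), which is a swapped-in alternative to the two approaches above.

```php
<?php
// Sample markup adapted from the question; closing tags added so it is complete.
$content = <<<'HTML'
<div class="title">
    <a href="http://test.com/asus_rt-n53/p195257/">
        Asus RT-N53
    </a>
</div>
<table><tbody><tr><td class="price-status">
    <div name="price" class="price">
        <div class="uah">758<span> ua.</span></div>
        <div class="usd">$&nbsp;62</div>
    </div>
</td></tr></tbody></table>
HTML;

// Collect libxml warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($content);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// The anchor carries both the link (href attribute) and the title (text content).
$anchor = $xpath->query('//div[@class="title"]/a')->item(0);
$link   = $anchor->getAttribute('href');
$title  = trim($anchor->nodeValue);

// text() selects only the div's own text node, skipping the <span> label.
$priceNode  = $xpath->query('//div[@class="uah"]/text()')->item(0);
$priceValue = (int) trim($priceNode->nodeValue);

echo $link, "\n", $title, "\n", $priceValue, "\n";
```

On a page with several products, the same queries would need to be run relative to each product's container node (by passing it as the second argument to query()) rather than from the document root.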
