Using XPath: how to exclude text in nested elements

Time: 2022-04-30 14:30:53

If I have some HTML like the following:

<div class="unique_id">
  <h1 class="parseasinTitle">
    <span> Game Title </span>
  </h1>
  Game Developer
</div>

Is there a way to use XPath to get just the "Game Developer" part of the text? After searching around, I tried:

//div[@class='unique_id' and not(self::h1/span)]

But that still gives me the entire text "Game Title Game Developer".

2 answers

#1 (score: 5)

div[@class = 'unique_id']/text()[not(normalize-space() = '')]

or

div[@class = 'unique_id']/text()[last()]

depending on context.

Note that you still have to trim the resulting text node.
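
As a rough illustration of what the two expressions select and why trimming is still needed, here is a minimal Python sketch. It is not the answer's own code. Python's standard xml.etree.ElementTree supports only a subset of XPath and cannot evaluate a text() step, so the sketch walks the div's direct text nodes by hand (the element's .text plus each child's .tail). The helper names direct_text_nodes and non_blank_direct_text are hypothetical, and the class attribute is quoted so the snippet parses as XML.

```python
import xml.etree.ElementTree as ET

# The question's snippet, with the class attribute quoted so it is well-formed XML.
SNIPPET = """
<div class="unique_id">
  <h1 class="parseasinTitle">
    <span> Game Title </span>
  </h1>
  Game Developer
</div>
"""

def direct_text_nodes(element):
    """Return the text nodes that are direct children of element, in document order."""
    nodes = [element.text]                           # text before the first child element
    nodes.extend(child.tail for child in element)    # text following each child element
    return [n for n in nodes if n is not None]

def non_blank_direct_text(element):
    """Approximate text()[not(normalize-space() = '')], then trim each result."""
    return [n.strip() for n in direct_text_nodes(element) if n.strip()]

div = ET.fromstring(SNIPPET.strip())

# Approximation of div[@class = 'unique_id']/text()[not(normalize-space() = '')]
non_blank = non_blank_direct_text(div)

# Approximation of div[@class = 'unique_id']/text()[last()], before and after trimming
last_raw = direct_text_nodes(div)[-1]
last_trimmed = last_raw.strip()

print(non_blank)
print(repr(last_raw))
print(last_trimmed)
```

The untrimmed last text node still carries the surrounding newlines and indentation, which is why the result has to be trimmed. With a full XPath 1.0 engine the two expressions could be run verbatim instead.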

#2 (score: 0)

The conditions in square brackets (the "predicate") filter the node they are attached to. The div is not an h1, so self::h1/span selects nothing, the negation is satisfied, and the whole div is returned. If you had used child instead of self, which was probably your original intent, you still would not get the expected text: you would get nothing, because the expression would mean "find a div with class unique_id that does not have an h1/span child".
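
The difference between the two predicates can be sketched in plain Python. This is only an emulation under stated assumptions: ElementTree does not support not() or the self axis, so each predicate is spelled out by hand, and the variable names self_test, child_test and string_value are hypothetical.

```python
import xml.etree.ElementTree as ET

# The question's markup, with the class attribute quoted so it parses as XML.
div = ET.fromstring(
    '<div class="unique_id">'
    '<h1 class="parseasinTitle"><span> Game Title </span></h1>'
    ' Game Developer'
    '</div>'
)

# not(self::h1/span): true unless the context node is itself an h1 with a span child.
# The context node is the div, so self::h1 is empty and the negation holds.
self_test = not (div.tag == "h1" and div.find("span") is not None)

# not(child::h1/span): true only if the div has no h1 child containing a span.
# The div does have one, so the negation fails and the div would be dropped.
child_test = div.find("h1/span") is None

# A predicate only keeps or drops the div. A kept div still has its full string
# value, which is the concatenation of all descendant text.
string_value = " ".join("".join(div.itertext()).split())

print(self_test, child_test)
print(string_value)
```

Either way the predicate never strips the h1 text out of the div. It only decides whether the div is selected at all.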

If you want text, specify text():

//div/text()[last()]
