How do I select this element in JSOUP?

Time: 2022-10-30 14:21:35

This is the HTML structure:

How do I select this element in JSOUP?

Element link = doc.select("div.subtabs p").first();

That does not seem to work. How do I select that p?

5 answers

#1


23  

The DIV with the class="subtabs" is not in fact the parent of the p element but instead is the sibling of p. To retrieve the p, you'll need to first get a reference to the parent DIV that has the id="content":

Element link = doc.select("div#content > p").first();

Additionally, you'll need the > symbol to indicate that you're selecting a child of div#content.

parent > child: child elements that descend directly from parent, e.g. div.content > p finds p elements; and body > * finds the direct children of the body tag

If you get stuck with a JSOUP CSS selector in the future, check out the JSOUP Selector Syntax cookbook, which has some nice examples and explanations.
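
The difference between the two selectors can be checked with a small sketch. The question's HTML is not shown here, so the markup below is an assumption reconstructed from the answers (a div with id="content" that holds a div.subtabs and sibling p elements); the class and method names are hypothetical.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ChildSelectorDemo {
    // Assumed markup: the original HTML is not available, so this layout is a guess.
    static final String HTML =
        "<div id=\"content\">"
      + "<div class=\"subtabs\"><a href=\"#\">Tab</a></div>"
      + "<p>Target paragraph</p>"
      + "<p>Second paragraph</p>"
      + "</div>";

    // Returns the text of the first p that is a direct child of div#content, or null.
    static String firstChildParagraph(String html) {
        Document doc = Jsoup.parse(html);
        Element p = doc.select("div#content > p").first();
        return p == null ? null : p.text();
    }

    // True only if some p sits inside div.subtabs, which the question's selector requires.
    static boolean originalSelectorMatches(String html) {
        Document doc = Jsoup.parse(html);
        return doc.select("div.subtabs p").first() != null;
    }

    public static void main(String[] args) {
        System.out.println(firstChildParagraph(HTML));      // Target paragraph
        System.out.println(originalSelectorMatches(HTML));  // false
    }
}
```

Note that first() returns null when nothing matches, so the result is worth checking before it is used.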

#2


4  

Use div#content p. The p is not a child of .subtabs.
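
As a rough illustration of how this descendant selector differs from the child selector (>), the sketch below counts the matches of each. The markup, including the nested div.note, is assumed for the example and is not the question's actual HTML; the class and method names are hypothetical.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class DescendantSelectorDemo {
    // Assumed markup: one p directly under div#content, one nested a level deeper.
    static final String HTML =
        "<div id=\"content\">"
      + "<div class=\"subtabs\"></div>"
      + "<p>Direct child</p>"
      + "<div class=\"note\"><p>Nested paragraph</p></div>"
      + "</div>";

    // Counts how many elements a CSS query matches in the given HTML.
    static int countMatches(String html, String cssQuery) {
        Document doc = Jsoup.parse(html);
        return doc.select(cssQuery).size();
    }

    public static void main(String[] args) {
        // A space matches p at any depth below div#content.
        System.out.println(countMatches(HTML, "div#content p"));   // 2
        // > matches only p elements that are direct children.
        System.out.println(countMatches(HTML, "div#content > p")); // 1
    }
}
```

Both selectors return the same first element for this markup, but the descendant form can pick up unintended paragraphs if the page nests more content.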

#3


1  

The p tag you are trying to extract is not a child of the div. It is a sibling. The parent div's id is content and the p tag you want is the first p tag within its parent. So use doc.select("div#content > p").first();

The # selects by id, and > means the right-hand side is a direct child of the left-hand side. So the statement means: get the first paragraph that is a direct child of the div whose id is content.

#4


1  

The Chrome SelectorGadget is very helpful for constructing CSS selectors for jSoup, simply by pointing and clicking. It has saved me hours of development time when trying to target specific fields.

#5


0  

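Element link = doc.select("div.subtabs + p").first();

The + combinator finds a p element that is immediately preceded by a sibling div.subtabs. Note that select returns Elements, so first() is needed to get a single Element.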
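
The adjacent-sibling approach can be tried with a short sketch. The markup is assumed, since the question's HTML is not shown, and the class and method names are hypothetical. The general-sibling combinator (~) is included only for contrast.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class SiblingSelectorDemo {
    // Assumed markup: div.subtabs followed by two sibling paragraphs.
    static final String HTML =
        "<div id=\"content\">"
      + "<div class=\"subtabs\"><a href=\"#\">Tab</a></div>"
      + "<p>Target paragraph</p>"
      + "<p>Second paragraph</p>"
      + "</div>";

    // Text of the p that directly follows div.subtabs, or null if there is none.
    static String adjacentParagraph(String html) {
        Document doc = Jsoup.parse(html);
        Elements matches = doc.select("div.subtabs + p"); // select returns Elements
        Element p = matches.first();                      // null when nothing matched
        return p == null ? null : p.text();
    }

    // Number of p siblings that come anywhere after div.subtabs.
    static int followingParagraphCount(String html) {
        return Jsoup.parse(html).select("div.subtabs ~ p").size();
    }

    public static void main(String[] args) {
        System.out.println(adjacentParagraph(HTML));       // Target paragraph
        System.out.println(followingParagraphCount(HTML)); // 2
    }
}
```

The + form depends on the p coming directly after div.subtabs; if another element is inserted between them, it stops matching, whereas div#content > p does not rely on that ordering.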
