I'm trying to get a piece of HTML, something like:
<tr class="myclass-1234" rel="5678">
<td class="lst top">foo 1</td>
<td class="lst top">foo 2</td>
<td class="lst top">foo-5678</td>
<td class="lst top nw" style="text-align:right;">
<span class="nw">1.00</span> foo
</td>
<td class="top">01.05.2015</td>
</tr>
I'm completely new to jsoup, and the first thing that came to mind was to select the row by its class name, but the number 1234 is dynamically generated. Is there a way to select it by part of the class name, or is there a better approach?
2 solutions
#1
Assuming a simple HTML document containing two tr elements, only one of which has the class you mentioned, this code shows how to get that tr using a CSS selector.
The CSS selector tr[class^=myclass] explained: select all elements of type "tr" with a class attribute that starts (^) with myclass:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class Example {
    public static void main(String[] args) {
        String html = "<html><body><table><tr class=\"myclass-1234\" rel=\"5678\">"
                + "<td class=\"lst top\">foo 1</td>"
                + "<td class=\"lst top\">foo 2</td>"
                + "<td class=\"lst top\">foo-5678</td>"
                + "<td class=\"lst top nw\" style=\"text-align:right;\">"
                + "<span class=\"nw\">1.00</span> foo"
                + "</td>"
                + "<td class=\"top\">01.05.2015</td>"
                + "</tr><tr><td>Not to be selected</td></tr></table></body></html>";
        Document doc = Jsoup.parse(html);

        Elements selectAllTr = doc.select("tr");
        // Should be 2
        System.out.println("tr elements in html: " + selectAllTr.size());

        Elements trWithStartingClassMyClass = doc.select("tr[class^=myclass]");
        // Should be 1
        System.out.println("tr elements with class \"myclass*\" in html: " + trWithStartingClassMyClass.size());
        System.out.println(trWithStartingClassMyClass);
    }
}
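One caveat with the starts-with selector: it tests the raw class attribute string, so it would miss a row whose attribute is, say, "odd myclass-1234". The sketch below is only an illustration of one way around that, assuming the prefix is always "myclass-"; the class name DynamicClassSketch, the helper dynamicSuffix and the extra "odd" class are made up for this example and are not part of the question. It uses the contains operator (*=) to find the row and then reads the dynamic part from the element's class names.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class DynamicClassSketch {
    // Assumed fixed prefix of the dynamically generated class name.
    private static final String PREFIX = "myclass-";

    // Hypothetical helper: returns the part after "myclass-" for the first
    // matching row, or null when no row carries such a class.
    static String dynamicSuffix(String html) {
        Document doc = Jsoup.parse(html);
        // *= matches the prefix anywhere in the raw class attribute value.
        Element row = doc.select("tr[class*=" + PREFIX + "]").first();
        if (row == null) {
            return null;
        }
        // Check each individual class name so "odd myclass-1234" still works.
        for (String name : row.classNames()) {
            if (name.startsWith(PREFIX)) {
                return name.substring(PREFIX.length());
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String html = "<table><tr class=\"odd myclass-1234\" rel=\"5678\">"
                + "<td>foo 1</td></tr></table>";
        System.out.println(dynamicSuffix(html));
    }
}
```

If the class attribute only ever holds the single myclass-NNNN value, the plain tr[class^=myclass] selector is enough and this extra step is unnecessary.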
#2
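doc.select("tr[class~=myclass.*]");
This will select any tr whose class attribute matches the regular expression myclass.*. The match is not anchored, so myclass may appear anywhere in the attribute value; write tr[class~=^myclass] to require it at the start.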
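To see what anchoring changes, here is a small sketch that uses only java.util.regex and no jsoup at all. It assumes the [attr~=regex] selector behaves like Matcher.find(), that is, the pattern may match anywhere in the attribute value. The class RegexAnchorSketch and the method matchesAnywhere are names invented for this illustration.

```java
import java.util.regex.Pattern;

public class RegexAnchorSketch {
    // Unanchored search: true if the regex matches any part of the value.
    static boolean matchesAnywhere(String regex, String classAttr) {
        return Pattern.compile(regex).matcher(classAttr).find();
    }

    public static void main(String[] args) {
        // Unanchored pattern: matches wherever "myclass" appears.
        System.out.println(matchesAnywhere("myclass.*", "myclass-1234"));
        System.out.println(matchesAnywhere("myclass.*", "odd myclass-1234"));

        // Anchored pattern: the whole value must be "myclass-" plus digits.
        System.out.println(matchesAnywhere("^myclass-\\d+$", "myclass-1234"));
        System.out.println(matchesAnywhere("^myclass-\\d+$", "odd myclass-1234"));
    }
}
```

Under that assumption, a selector such as tr[class~=^myclass-\d+$] would be the strict variant; inside a Java string literal the backslash has to be doubled.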