How to get the text in a DIV

Time: 2022-11-13 20:22:34

I'm building a scraper and I've come across some HTML I don't know how to parse. The markup looks like this:

<div>
  <span>SomeHeader</span>
  "Some text"

  <span>SomeOtherHeader</span>
  "More text"
</div>

In JS or jQuery, I want to find "SomeHeader" and get the "Some text" that follows it, without the "More text".

Any help is appreciated!

3 solutions

#1

You can use the :contains() selector to find elements that contain some text, but it matches substrings rather than the exact text. For example, $("span:contains(Text)") selects both of the spans below.

<span>Text</span>
<span>Text text</span>

Instead, use the .filter(function) method to compare each element's text exactly and keep only the matching element. Once you have that element, read its nextSibling property to get the text node that follows it.

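// Keep only the span whose text is exactly "SomeHeader".
var targetSpan = $("div > span").filter(function() {
    return $(this).text() === "SomeHeader";
});
// nextSibling is the text node right after the span; trim() drops the surrounding whitespace.
var text = targetSpan[0].nextSibling.nodeValue.trim();

console.log(text);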
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<div>
  <span>SomeHeader</span>
  "Some text"
  <span>SomeOtherHeader</span>
  "More text"
</div>
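
For anyone who wants the same idea without jQuery, here is a minimal sketch in plain JavaScript. The helper name textAfterHeader is made up for illustration (it is not a DOM or jQuery API), and it assumes the headers are direct child span elements of the container, as in the question's markup. It relies only on standard node properties: firstChild, nextSibling, nodeType, nodeName, textContent and nodeValue.

```javascript
// Hypothetical helper, not part of the DOM or jQuery.
// Returns the trimmed text that directly follows the <span> whose text is
// exactly headerText, or null when no such span is a child of container.
function textAfterHeader(container, headerText) {
  for (let node = container.firstChild; node; node = node.nextSibling) {
    // nodeType 1 is an element node.
    const isSpan = node.nodeType === 1 && node.nodeName.toUpperCase() === "SPAN";
    if (!isSpan || node.textContent.trim() !== headerText) {
      continue;
    }
    // Join the consecutive text nodes (nodeType 3) after the span,
    // stopping at the first node that is not a text node.
    let text = "";
    for (let sib = node.nextSibling; sib && sib.nodeType === 3; sib = sib.nextSibling) {
      text += sib.nodeValue;
    }
    return text.trim();
  }
  return null;
}
```

In a browser this could be called as textAfterHeader(document.querySelector("div"), "SomeHeader"). Unlike reading a single nextSibling, it does not throw when the header is the last child, and it returns an empty string when the header is immediately followed by another element.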

#2

Once you have a reference to the DIV element, you can read its textContent property (it is a property, not a method) to get all the text in the DIV and its descendants. Then it's just a matter of locating the part you're looking for. You could use a regular expression that captures everything between "SomeHeader" and "SomeOtherHeader", such as /SomeHeader([\s\S]*?)SomeOtherHeader/, to extract what you want.
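
A rough sketch of the textContent plus regular expression idea, working on a plain string such as the value of div.textContent. The names textBetween and escapeRegExp are hypothetical helpers, not built-in functions, and the sketch assumes the header strings do not also appear inside the body text (a text-only approach cannot tell a header from ordinary content).

```javascript
// Hypothetical helper: escape characters that are special in a RegExp
// so the markers are matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: return the trimmed text found between startMarker and
// endMarker (or the end of the string if endMarker never appears).
// Returns null when startMarker is not found.
function textBetween(allText, startMarker, endMarker) {
  const pattern = new RegExp(
    escapeRegExp(startMarker) + "([\\s\\S]*?)(?:" + escapeRegExp(endMarker) + "|$)"
  );
  const match = pattern.exec(allText);
  return match ? match[1].trim() : null;
}

// Stand-in for div.textContent of the markup in the question.
const sampleText = '\n  SomeHeader\n  "Some text"\n\n  SomeOtherHeader\n  "More text"\n';
console.log(textBetween(sampleText, "SomeHeader", "SomeOtherHeader")); // "Some text" (quotes included)
```

The lazy group ([\s\S]*?) stops at the first occurrence of the end marker, so only the text belonging to the first header is returned.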

#3

You may try something like this:

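// Walk every child node of the div, including text nodes.
$('div')
    .contents()
    .each(function () {
        // Match the header exactly, then read the text node that follows it.
        if ($(this).text() === "SomeHeader" && this.nextSibling) {
            alert(this.nextSibling.nodeValue.trim());
        }
    });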

Example: https://jsfiddle.net/DinoMyte/bko2wsbu/1/
