JavaScript: How do I strip HTML tags from a string? (duplicate)

Time: 2022-08-27 17:08:14

Possible Duplicate:
Strip HTML from Text JavaScript

How can I strip the HTML from a string in JavaScript?

4 Answers

#1


176  

Using the browser's parser is probably the best bet in current browsers. The following will work, with these caveats:

  • Your HTML is valid within a <div> element. HTML contained within <body> or <html> or <head> tags is not valid within a <div> and may therefore not be parsed correctly.
  • textContent (the DOM standard property) and innerText (non-standard) properties are not identical. For example, textContent will include text within a <script> element while innerText will not (in most browsers). This only affects IE <=8, which is the only major browser not to support textContent.
  • The HTML does not contain <script> elements.
  • The HTML is not null.
  • The HTML comes from a trusted source. Using this with arbitrary HTML allows arbitrary untrusted JavaScript to be executed. This example is from a comment by Mike Samuel on the duplicate question: <img onerror='alert("could run arbitrary JS here")' src=bogus>

Code:

var html = "<p>Some HTML</p>";
var div = document.createElement("div");
div.innerHTML = html;
var text = div.textContent || div.innerText || "";
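
The trusted-source caveat can be softened in browsers that provide DOMParser: a document returned by parseFromString is never rendered and has scripting disabled, so, as far as I know, markup like the onerror example is parsed without running. The following is a minimal sketch under that assumption, not part of the original answer; stripHtmlInert and its optional second parameter are names made up for this example.

```javascript
// Sketch: parse into an inert document instead of a live <div>.
// `stripHtmlInert` and `ParserCtor` are hypothetical names for illustration.
function stripHtmlInert(html, ParserCtor) {
  // Use the browser's DOMParser unless a stand-in constructor is supplied.
  var Parser = ParserCtor || DOMParser;
  // Treat null/undefined as an empty string (the "not null" caveat above).
  var source = html == null ? "" : String(html);
  var doc = new Parser().parseFromString(source, "text/html");
  // textContent still includes the text of any <script> elements.
  return doc.body.textContent || "";
}
```

In a browser, stripHtmlInert("<p>Some HTML</p>") would be called with one argument; the second parameter exists only so the function can be exercised outside a browser.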

#2


173  

cleanText = strInputCode.replace(/<\/?[^>]+(>|$)/g, "");

Distilled from this website (web.archive).
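
To make the behavior of this pattern concrete, here is a small sketch that wraps it in a helper and runs it on a few inputs. The function name stripTagsRegex is an assumption made for this example; the pattern itself is the one shown above. The second and third calls show two limits of a regex approach: entities are left undecoded, and a ">" inside an attribute value ends the match early.

```javascript
// The pattern from the answer, wrapped in a hypothetical helper.
function stripTagsRegex(input) {
  return String(input).replace(/<\/?[^>]+(>|$)/g, "");
}

// Well-formed markup: tags are removed, text is kept.
console.log(stripTagsRegex("<p>Hello, <b>World</b></p>")); // Hello, World

// Entities are not decoded; the regex only removes tags.
console.log(stripTagsRegex("Tom &amp; Jerry")); // Tom &amp; Jerry

// A ">" inside an attribute value ends the match early,
// so part of the attribute leaks into the output.
console.log(JSON.stringify(stripTagsRegex('<a title="a > b">link</a>'))); // " b\">link"
```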

#3


39  

var html = "<p>Hello, <b>World</b>";
var div = document.createElement("div");
div.innerHTML = html;
alert(div.innerText); // Hello, World

That's pretty much the best way of doing it; you're letting the browser do what it does best -- parse HTML.


Edit: As noted in the comments below, this is not the most cross-browser solution. The most cross-browser solution would be to recursively go through all the children of the element and concatenate all text nodes that you find. However, if you're using jQuery, it already does it for you:

alert($("<p>Hello, <b>World</b></p>").text());

Check out the text method.
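
For readers not using jQuery, the recursive approach mentioned in the edit can be sketched roughly as follows. This is an illustration under assumptions, not the answerer's code: collectText is a made-up name, and the function relies only on the standard nodeType, nodeValue, and childNodes properties.

```javascript
// Hypothetical helper: concatenate the values of all descendant text nodes.
function collectText(node) {
  var TEXT_NODE = 3; // numeric value of Node.TEXT_NODE in the DOM
  if (node.nodeType === TEXT_NODE) {
    return node.nodeValue;
  }
  var text = "";
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    // Elements recurse; comments and other childless nodes add nothing.
    text += collectText(children[i]);
  }
  return text;
}

// Browser-only usage, guarded so the snippet also loads outside a browser.
if (typeof document !== "undefined") {
  var container = document.createElement("div");
  container.innerHTML = "<p>Hello, <b>World</b></p>";
  console.log(collectText(container));
}
```

Like textContent, this sketch also returns the text inside script elements, so the earlier caveats about scripts and untrusted input still apply.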

#4


20  

I know this question has an accepted answer, but I feel that it doesn't work in all cases.

For completeness and since I spent too much time on this, here is what we did: we ended up using a function from php.js (which is a pretty nice library for those more familiar with PHP but also doing a little JavaScript every now and then):

http://phpjs.org/functions/strip_tags:535

It seemed to be the only piece of JavaScript code which successfully dealt with all the different kinds of input I stuffed into my application. That is, without breaking it – see my comments about the <script /> tag above.
