Extract all links from a string

Time: 2023-02-05 16:55:30

I have a JavaScript variable containing the HTML source code of a page (not the source of the current page), and I need to extract all links from it. Any clues as to the best way of doing this?

Is it possible to create a DOM for the HTML in the variable and then walk that?

4 answers

#1


7  

I don't know if this is the recommended way, but it works (JavaScript only):

var rawHTML = '<html><body><a href="foo">bar</a><a href="narf">zort</a></body></html>';

// Build a detached element; it is never attached to the current page.
var doc = document.createElement("html");
doc.innerHTML = rawHTML;
var links = doc.getElementsByTagName("a");
var urls = [];

// Collect the raw href attribute of every link.
for (var i = 0; i < links.length; i++) {
    urls.push(links[i].getAttribute("href"));
}
alert(urls);
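
The snippet above collects the raw attribute values (`foo`, `narf`), which are relative. If absolute URLs are needed, one option is to resolve them against the address the HTML was fetched from, using the standard `URL` constructor. A minimal sketch; the helper name `resolveLinks` and the base address `https://example.com/dir/page.html` are assumptions for illustration, not part of the original answer:

```javascript
// Hypothetical helper: resolve raw href values against the URL of the
// page the HTML came from. Values that cannot be parsed are skipped.
function resolveLinks(hrefs, baseUrl) {
    var resolved = [];
    for (var i = 0; i < hrefs.length; i++) {
        if (hrefs[i] === null) continue; // <a> without an href attribute
        try {
            resolved.push(new URL(hrefs[i], baseUrl).href);
        } catch (e) {
            // Not a valid URL relative to baseUrl; ignore it.
        }
    }
    return resolved;
}

// Assumed base URL, for illustration only.
var absolute = resolveLinks(["foo", "narf", null], "https://example.com/dir/page.html");
console.log(absolute);
```

The `null` check matters because `getAttribute("href")` returns `null` for anchors without an `href`, and `new URL(null, base)` would otherwise treat it as the string "null".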

#2


6  

If you're using jQuery, I believe you can do this very easily:

// Wrap the markup in a detached <div> so that top-level <a> elements are found too.
var doc = $('<div>').html(rawHTML);
var links = $('a', doc);

http://docs.jquery.com/Core/jQuery#htmlownerDocument

#3


3  

This is especially useful if you need to replace links:

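// Non-greedy and case-insensitive, so each <a>...</a> element is matched separately.
var linkReg = /<a\s[\s\S]*?<\/a>/gi;

// text holds the HTML source.
var linksInText = text.match(linkReg);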
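
Since this answer is about replacing links, here is a sketch of how the same idea could pull out the `href` values or rewrite them with a callback. Regular expressions only approximate HTML parsing, so treat this as a best-effort approach. The pattern `hrefReg`, the sample `text`, and the `https://example.com/` prefix are assumptions for illustration:

```javascript
// Sample input; in practice `text` would hold the page's HTML source.
var text = '<p><a href="foo">bar</a> and <A class="x" href=\'narf\'>zort</A></p>';

// Group 1: everything from "<a" up to and including "href=".
// Group 2: the opening quote. Group 3: the href value.
// Assumes the href is quoted; unquoted attribute values are not handled.
var hrefReg = /(<a\s[^>]*?href\s*=\s*)(["'])(.*?)\2/gi;

// Collect every href value.
var hrefs = [];
var match;
while ((match = hrefReg.exec(text)) !== null) {
    hrefs.push(match[3]);
}
console.log(hrefs);

// Rewrite each href by prefixing it with an assumed base address.
var rewritten = text.replace(hrefReg, function (whole, prefix, quote, href) {
    return prefix + quote + "https://example.com/" + href + quote;
});
console.log(rewritten);
```

Capturing the part before the value as its own group lets the callback rebuild the tag without touching other attributes.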

#4


1  

If you're running Firefox, yes, you can! It's called DOMParser; check it out:

DOMParser is mainly useful for applications and extensions based on Mozilla platform. While it's available to web pages, it's not part of any standard and level of support in other browsers is unknown.
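
In current browsers, `DOMParser.parseFromString` accepts the `"text/html"` type, which yields a detached document that can be queried without touching the current page. A rough sketch follows; the function name `extractLinks` and the optional `ParserCtor` parameter (there only so the function can be exercised outside a browser) are assumptions for illustration:

```javascript
// Hypothetical helper: parse an HTML string into a detached document
// and return the href attribute of every <a> element that has one.
// ParserCtor defaults to the browser's DOMParser; the parameter is an
// illustrative addition so a substitute can be supplied elsewhere.
function extractLinks(rawHTML, ParserCtor) {
    var Parser = ParserCtor || globalThis.DOMParser;
    if (typeof Parser !== "function") {
        throw new Error("DOMParser is not available in this environment");
    }
    var doc = new Parser().parseFromString(rawHTML, "text/html");
    var anchors = doc.querySelectorAll("a[href]");
    var urls = [];
    for (var i = 0; i < anchors.length; i++) {
        urls.push(anchors[i].getAttribute("href"));
    }
    return urls;
}
```

In a browser, calling `extractLinks(rawHTML)` with the string from answer #1 returns the raw attribute values, the same as the `createElement` approach.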
