How do I extract image links from a website and download them with wget?

Time: 2022-09-03 21:47:10

I really want to download images from a website, but I don't know wget well enough to do it. The images are hosted on a separate website. How do I pull the image links out of the page, using cat or something similar, so that I can download them all with wget? The wget part is all I know. An example site would be Reddit.com.


  wget -i download-file-list.txt

2 Answers

#1


9  

Try this:

wget -r -l 1 -A jpg,jpeg,png,gif,bmp -nd -H http://reddit.com/some/path

This recurses one level deep starting from the page http://reddit.com/some/path and downloads only files ending in "jpg", "jpeg", "png", "gif", or "bmp". It does not create a directory structure (remove -nd if you want directories). The -H option lets it span hosts, which matters here because the images are hosted on a separate site.
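For readability, the same command can be written with wget's long option names. The sketch below wraps it in a shell function; the name fetch_images is a hypothetical helper chosen for illustration, not part of wget.

```shell
# Hypothetical wrapper: download the images linked from one page.
fetch_images() {
    # --recursive --level=1 : follow links one level deep from the start page
    # --accept=...          : keep only files with these suffixes
    # --no-directories      : save everything into the current directory
    # --span-hosts          : also follow links that point to other hosts
    wget --recursive --level=1 \
         --accept=jpg,jpeg,png,gif,bmp \
         --no-directories --span-hosts \
         "$1"
}

# Example, using the placeholder URL from the answer:
#   fetch_images http://reddit.com/some/path
```

The long options map one-to-one onto -r, -l 1, -A, -nd and -H, so the behavior should be identical to the short form.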


#2


2  

I would use the Perl module WWW::Mechanize. The following dumps all links on the page to stdout:


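use strict;
use warnings;
use WWW::Mechanize;

# Fetch the page, then print every link on it as an absolute URL.
my $mech = WWW::Mechanize->new();
$mech->get("URL");
$mech->dump_links(undef, 1);  # undef = print to STDOUT; 1 = absolute URLs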

Replace URL with the actual url you want.
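To get from a page full of links to the download-file-list.txt in the question, one option is to filter for image suffixes in the shell. This is a minimal sketch under stated assumptions: the page has already been saved locally (page.html is a placeholder name), the links are absolute and double-quoted, and the URLs end directly in the image suffix with no query string. The function name extract_image_links is hypothetical.

```shell
# Hypothetical helper: read HTML on stdin, print unique absolute image URLs,
# one per line. Only matches double-quoted src/href attributes.
extract_image_links() {
    grep -oEi '(src|href)="https?://[^"]+\.(jpe?g|png|gif|bmp)"' |
        sed -E 's/^[^"]*"//; s/"$//' |
        LC_ALL=C sort -u
}

# Example usage (page.html is a placeholder for a page saved beforehand):
#   extract_image_links < page.html > download-file-list.txt
#   wget -i download-file-list.txt
```

A regular expression is a rough tool for HTML, so for pages with relative links or unusual quoting, a real parser such as the WWW::Mechanize approach above is likely to be more reliable.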

