Python BeautifulSoup: giving multiple tags to findAll

Time: 2021-03-26 22:37:33

I'm looking for a way to use findAll to get two tags, in the order they appear on the page.

Currently I have:

import requests
from bs4 import BeautifulSoup

def get_soup(url):
    request = requests.get(url)
    page = request.text
    soup = BeautifulSoup(page, 'html.parser')
    # Problem line: the expression 'hr' and 'strong' evaluates to just 'strong'
    get_tags = soup.findAll('hr' and 'strong')
    for each in get_tags:
        print(each)

If I run this on a page that contains only 'em' tags or only 'strong' tags, it returns all of those tags. If I run it on a page that contains both, it returns only the 'strong' tags.

Is there a way to do this? My main concern is preserving the order in which the tags are found.

2 answers

#1


55  

You could pass a list to find either hr or strong tags. The results come back in the order the tags appear in the document:

tags = soup.find_all(['hr', 'strong'])
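
To check the ordering concern from the question, here is a minimal sketch. It assumes bs4 is installed and uses the built-in html.parser; the sample markup is made up for illustration and does not come from the original page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, invented for this illustration.
html = "<p><strong>first</strong></p><hr/><p><strong>second</strong></p><hr/>"

soup = BeautifulSoup(html, "html.parser")

# A list filter matches any tag whose name is in the list.
tags = soup.find_all(["hr", "strong"])

# Collect the tag names in the order find_all returned them.
names = [tag.name for tag in tags]
print(names)  # ['strong', 'hr', 'strong', 'hr']
```

The names alternate exactly as the tags do in the markup, so no extra sorting step should be needed.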

#2


4  

Use regular expressions:

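import re
get_tags = soup.findAll(re.compile(r'^(hr|strong)$'))

The expression r'^(hr|strong)$' matches tags named exactly hr or strong. The anchors matter: Beautiful Soup applies the pattern with search(), so an unanchored r'(hr|strong)' would also match any tag whose name merely contains one of those strings.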
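
One thing to watch with the regex approach is how the pattern is applied to tag names. The sketch below compares an unanchored and an anchored pattern. It assumes bs4 with the built-in html.parser, and the <thread> tag is a made-up custom tag used only to show the difference.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup; <thread> is an invented tag whose name contains "hr".
html = "<thread>a</thread><hr/><strong>b</strong>"
soup = BeautifulSoup(html, "html.parser")

# Unanchored: matches any tag name that contains "hr" or "strong".
loose = [t.name for t in soup.find_all(re.compile(r"(hr|strong)"))]

# Anchored: matches only tags named exactly "hr" or "strong".
strict = [t.name for t in soup.find_all(re.compile(r"^(hr|strong)$"))]

print(loose)   # ['thread', 'hr', 'strong']
print(strict)  # ['hr', 'strong']
```

For ordinary HTML the two patterns will usually agree, but anchoring avoids surprises on pages with custom or unusual tag names.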
