Scraper won't stop clicking the next-page button

Time: 2021-07-29 00:57:53

I've written a Python script using Selenium to collect the names and corresponding addresses displayed for a search with the keyword "Saskatoon". The results span multiple pages. My script does almost everything it should, except for one thing.

  1. It keeps running even when there are no more pages to traverse. The last page still shows the ">" sign for the next-page option, and it is not grayed out.

Here is the link: Page_link

Search_keyword: Saskatoon (in the city/town field).

Here is what I've written:

import time

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome()
wait = WebDriverWait(driver, 10)
driver.get("above_link")
time.sleep(3)

# Type the keyword into the city/town field and submit the search.
search_input = driver.find_element(By.ID, "cityField")
search_input.clear()
search_input.send_keys("Saskatoon")
search_input.send_keys(Keys.ENTER)

# Keep clicking the next-page link until the wait times out.
# Problem: the link is still visible on the last page, so this never times out.
while True:
    try:
        wait.until(EC.visibility_of_element_located((By.LINK_TEXT, "›"))).click()
        time.sleep(2)
    except TimeoutException:
        break
driver.quit()

BTW, I've taken the name and address part out of this script, since I don't think it is relevant here. Thanks.

1 solution

#1


3  

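You can use the class attribute of the > button: on the last page it is "ng-scope disabled", while on all other pages it is "ng-scope". Matching only the exact class "ng-scope" means the wait finds nothing on the last page, times out, and the loop breaks: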

wait.until(EC.visibility_of_element_located((By.XPATH, "//li[@class='ng-scope']/a[.='›']"))).click()
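
To see why the exact class match is enough, here is a minimal sketch that runs the same predicate against two small pager snippets using only the standard library. The markup in `MIDDLE_PAGE` and `LAST_PAGE` is hypothetical, modeled on the class values described above; the real page's HTML may differ. The helper name `has_active_next` is also just for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical pager markup (an assumption, not copied from the real page).
MIDDLE_PAGE = (
    "<ul>"
    "<li class='ng-scope'><a>‹</a></li>"
    "<li class='ng-scope'><a>›</a></li>"
    "</ul>"
)
LAST_PAGE = (
    "<ul>"
    "<li class='ng-scope'><a>‹</a></li>"
    "<li class='ng-scope disabled'><a>›</a></li>"
    "</ul>"
)

# Same predicate as the Selenium XPath, with a leading "." because
# ElementTree only accepts relative paths.
NEXT_XPATH = ".//li[@class='ng-scope']/a[.='›']"


def has_active_next(markup: str) -> bool:
    """Return True if an enabled next-page link exists in the markup."""
    root = ET.fromstring(markup)
    # @class='ng-scope' is an exact string comparison, so an <li> whose
    # class is 'ng-scope disabled' does not match.
    return root.find(NEXT_XPATH) is not None


print(has_active_next(MIDDLE_PAGE))
print(has_active_next(LAST_PAGE))
```

In the Selenium loop the same thing happens through the wait: once the next-page item carries the extra "disabled" class, the XPath matches no element, the wait raises after its timeout, and the except branch ends the loop.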
