Selenium:检索向下滚动时加载的数据

时间:2022-08-24 16:50:14

I'm trying to retrieve elements in a page that has an ajax-load scroll-down functionality alla Twitter. For some reason this isn't working properly. I added some print statements to debug it and I always get the same amount of items and then the function returns. What am I doing wrong here?

我正在尝试检索具有ajax-load向下滚动功能的页面中的元素。由于某种原因,这不能正常工作。我添加了一些打印语句来调试它,我总是得到相同数量的项目,然后函数返回。我在这做错了什么?

wd = webdriver.Firefox()
wd.implicitly_wait(3)

def get_items(items):
    print len(items)
    wd.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    # len(items) and len(wd.find_elements-by...()) both always seem to return the same number
    # if I were to start the loop with while True: it would work, but of course... never end
    while len(wd.find_elements_by_class_name('stream-item')) > len(items):
        items = wd.find_elements_by_class_name('stream-item')
        print items
        wd.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    return items

def test():
    get_page('http://twitter.com/')
    get_items(wd.find_elements_by_class_name('stream-item'))

2 个解决方案

#1


4  

Try putting a sleep in between

试着在两者之间睡一觉

wd = webdriver.Firefox()
wd.implicitly_wait(3)

def get_items(items):
    print len(items)
    wd.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    # len(items) and len(wd.find_elements-by...()) both always seem to return the same number
    # if I were to start the loop with while True: it would work, but of course... never end

    sleep(5) #seconds
    while len(wd.find_elements_by_class_name('stream-item')) > len(items):
        items = wd.find_elements_by_class_name('stream-item')
        print items
        wd.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    return items

def test():
    get_page('http://twitter.com/')
    get_items(wd.find_elements_by_class_name('stream-item'))

Note: The hard sleep is just for demonstrating that it works. Please use the waits package to wait for a smart condition instead.

注意:艰难的睡眠只是为了证明它有效。请使用等待包来等待智能状态。

#2


0  

The condition in the while loop was the issue for my use case. It was an infinite loop. I fixed the problem by using a counter :

while循环中的条件是我的用例的问题。这是一个无限循环。我用一个计数器解决了这个问题:

def get_items(items):

    item_nb = [0, 1] # initializing a counter of number of items found in page

    while(item_nb[-1] > item_nb[-2]):   # exiting the loop when no more new items can be found in the page

        items = wd.find_elements_by_class_name('stream-item')
        time.sleep(5)
        browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")

        item_nb.append(len(items))

    return items

```

#1


4  

Try putting a sleep in between

试着在两者之间睡一觉

wd = webdriver.Firefox()
wd.implicitly_wait(3)

def get_items(items):
    print len(items)
    wd.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    # len(items) and len(wd.find_elements-by...()) both always seem to return the same number
    # if I were to start the loop with while True: it would work, but of course... never end

    sleep(5) #seconds
    while len(wd.find_elements_by_class_name('stream-item')) > len(items):
        items = wd.find_elements_by_class_name('stream-item')
        print items
        wd.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    return items

def test():
    get_page('http://twitter.com/')
    get_items(wd.find_elements_by_class_name('stream-item'))

Note: The hard sleep is just for demonstrating that it works. Please use the waits package to wait for a smart condition instead.

注意:艰难的睡眠只是为了证明它有效。请使用等待包来等待智能状态。

#2


0  

The condition in the while loop was the issue for my use case. It was an infinite loop. I fixed the problem by using a counter :

while循环中的条件是我的用例的问题。这是一个无限循环。我用一个计数器解决了这个问题:

def get_items(items):

    item_nb = [0, 1] # initializing a counter of number of items found in page

    while(item_nb[-1] > item_nb[-2]):   # exiting the loop when no more new items can be found in the page

        items = wd.find_elements_by_class_name('stream-item')
        time.sleep(5)
        browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")

        item_nb.append(len(items))

    return items

```