Find the nth element by class name using Selenium Python

Question:

I just started using selenium yesterday to help scrape some data and I'm having a difficult time wrapping my head around the selector engine. I know lxml, BeautifulSoup, jQuery and Sizzle have similar engines. But what I'm trying to do is:

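  1. Wait 10 seconds for the page to finish loading
  2. Make sure there are ten or more span.eN elements (two are present on the initial page load, and more load afterwards)
  3. Then start processing the data with beautifulsoup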

I am struggling with the selenium conditions of either finding the nth element or locating the specific text that only exists in an nth element. I keep getting errors (timeout, NoSuchElement, etc)

    from selenium import webdriver
    url = "http://someajaxiandomain.com/that-injects-html-after-pageload.aspx"
    wd = webdriver.Chrome()
    wd.implicitly_wait(10)
    wd.get(url)
    # what I've tried
    # .find_element_by_xpath("//span[@class='eN'][10]"))
    # .until(EC.text_to_be_present_in_element(By.CSS_SELECTOR, "css=span[class='eN']:contains('foo')"))

You need to understand the concept of Explicit Waits and Expected Conditions to wait for.
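
As a rough illustration of what an explicit wait does, the sketch below polls a condition until it returns something truthy or a timeout expires. `wait_until` is a hypothetical helper written for this explanation, not part of Selenium. The real `WebDriverWait.until` works along the same lines, but it raises Selenium's own `TimeoutException` and, by default, ignores `NoSuchElementException` while polling.

```python
import time


def wait_until(driver, condition, timeout=10.0, poll_interval=0.5):
    """Simplified, hypothetical stand-in for WebDriverWait(driver, timeout).until(condition)."""
    deadline = time.monotonic() + timeout
    while True:
        # An expected condition is just a callable that receives the driver
        result = condition(driver)
        if result:
            # Any truthy value ends the wait and is handed back to the caller
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)
```

Seen this way, an Expected Condition is any callable that takes the driver and returns a truthy value once the page is in the state you need.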

In your case, you can write a custom Expected Condition that waits until the number of elements found by a locator reaches at least n:

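from selenium.common.exceptions import StaleElementReferenceException

class wait_for_n_elements_to_be_present(object):
    """Wait condition: true once the locator matches at least `count` elements."""

    def __init__(self, locator, count):
        self.locator = locator  # e.g. (By.CSS_SELECTOR, 'span.eN')
        self.count = count

    def __call__(self, driver):
        try:
            # find_elements is the public API; it returns an empty list
            # (rather than raising) when nothing matches yet
            elements = driver.find_elements(*self.locator)
            return len(elements) >= self.count
        except StaleElementReferenceException:
            return False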

Usage:

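from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait

n = 10  # minimum number of span.eN elements to wait for

wait = WebDriverWait(driver, 10)  # driver is the WebDriver instance (wd in the question)
wait.until(wait_for_n_elements_to_be_present((By.CSS_SELECTOR, 'span.eN'), n))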
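
If you need the nth element itself rather than just a count, a variation on the same idea can return it. The class below is a minimal sketch: `nth_element_present` is a hypothetical name, not a built-in Selenium condition, and it relies only on `driver.find_elements` and on the fact that `WebDriverWait.until` returns whatever truthy value the condition produces.

```python
class nth_element_present:
    """Hypothetical wait condition: return the n-th (1-based) match, or False."""

    def __init__(self, locator, n):
        self.locator = locator  # e.g. (By.CSS_SELECTOR, 'span.eN')
        self.n = n

    def __call__(self, driver):
        elements = driver.find_elements(*self.locator)
        # Returning False makes the wait poll again; returning the
        # element ends the wait and passes the element to the caller
        if len(elements) >= self.n:
            return elements[self.n - 1]
        return False
```

It would be used like the condition above, for example `tenth = WebDriverWait(driver, 10).until(nth_element_present((By.CSS_SELECTOR, 'span.eN'), 10))`.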


Alternatively, you could use a built-in Expected Condition such as presence_of_element_located or visibility_of_element_located and wait for a single span.eN element to be present or visible, for example:

from selenium.webdriver.support import expected_conditions as EC

wait = WebDriverWait(driver, 10)
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, 'span.eN')))
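
For the other case in the question, text that appears in only one of the matched elements, the built-in `text_to_be_present_in_element` takes a locator tuple and a text argument, but it checks only the first element the locator finds. A custom condition can scan every match instead. This is a sketch under that assumption: `any_element_contains_text` is a hypothetical name, and it does not handle elements going stale while their text is being read.

```python
class any_element_contains_text:
    """Hypothetical wait condition: first matching element whose text contains `text`."""

    def __init__(self, locator, text):
        self.locator = locator  # e.g. (By.CSS_SELECTOR, 'span.eN')
        self.text = text

    def __call__(self, driver):
        for element in driver.find_elements(*self.locator):
            # element.text is the element's visible text
            if self.text in element.text:
                return element
        # Nothing matched yet, so the wait keeps polling
        return False
```

For example, `WebDriverWait(driver, 10).until(any_element_contains_text((By.CSS_SELECTOR, 'span.eN'), 'foo'))` would return the first span.eN whose text contains "foo".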