Find the nth element by class name using Selenium Python
I just started using Selenium yesterday to help scrape some data, and I'm having a difficult time wrapping my head around the selector engine. I know lxml, BeautifulSoup, jQuery and Sizzle have similar engines. What I'm trying to do is:
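- Wait up to 10 seconds for the page to load completely
- Make sure ten or more span.eN elements are present (two load with the initial page, and more load afterwards)
- Then start processing the data with BeautifulSoup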
I am struggling with the Selenium conditions for either finding the nth element or locating specific text that only exists in the nth element. I keep getting errors (timeout, NoSuchElement, etc.).
from selenium import webdriver

url = "http://someajaxiandomain.com/that-injects-html-after-pageload.aspx"
wd = webdriver.Chrome()
wd.implicitly_wait(10)
wd.get(url)
# what I've tried
# .find_element_by_xpath("//span[@class='eN'][10]"))
# .until(EC.text_to_be_present_in_element(By.CSS_SELECTOR, "css=span[class='eN']:contains('foo')"))
You need to understand the concepts of Explicit Waits and the Expected Conditions they wait for.
In your case, you can write a custom Expected Condition that waits until the number of elements found by a locator is at least n:
from selenium.common.exceptions import StaleElementReferenceException

class wait_for_n_elements_to_be_present(object):
    """Expected condition: true once at least `count` elements match `locator`."""
    def __init__(self, locator, count):
        self.locator = locator
        self.count = count

    def __call__(self, driver):
        try:
            # Use the public find_elements API; it returns an empty list when nothing matches
            elements = driver.find_elements(*self.locator)
            return len(elements) >= self.count
        except StaleElementReferenceException:
            return False
Usage:
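from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait

n = 10  # specify how many elements to wait for
wait = WebDriverWait(driver, 10)  # driver is the WebDriver instance (wd in the question)
wait.until(wait_for_n_elements_to_be_present((By.CSS_SELECTOR, 'span.eN'), n))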
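To make the explicit-wait mechanics concrete, here is a minimal, browser-free sketch of the kind of polling that `wait.until` performs with a condition like the one above. `FakeDriver`, `at_least_n`, and `poll_until` are hypothetical names invented for this illustration, and the loop is a simplified stand-in rather than Selenium's actual implementation (the real `WebDriverWait` raises its own `TimeoutException` and ignores `NoSuchElementException` by default).

```python
import time


class FakeDriver:
    """Hypothetical stand-in for a WebDriver: one more element 'loads' per lookup."""

    def __init__(self, initial=2):
        self.loaded = initial

    def find_elements(self, by, value):
        # Simulate AJAX content arriving between polls.
        elements = ["span.eN #%d" % i for i in range(1, self.loaded + 1)]
        self.loaded += 1
        return elements


def at_least_n(locator, count):
    """Condition factory: truthy once `count` or more elements match `locator`."""
    def condition(driver):
        elements = driver.find_elements(*locator)
        # Return the elements themselves so the caller can use them directly.
        return elements if len(elements) >= count else False
    return condition


def poll_until(driver, condition, timeout=10.0, poll_frequency=0.5):
    """Call `condition(driver)` repeatedly until it is truthy or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition(driver)
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll_frequency)


# The fake page starts with two elements, as in the question, and grows per poll.
driver = FakeDriver(initial=2)
found = poll_until(driver, at_least_n(("css selector", "span.eN"), 10),
                   timeout=5.0, poll_frequency=0.01)
print(len(found))
```

Returning the matched elements instead of a bare `True` is a design choice in this sketch: `until` hands back whatever truthy value the condition produced, so the caller gets the elements without a second lookup.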
You could probably also have used a built-in Expected Condition such as presence_of_element_located or visibility_of_element_located and waited for a single span.eN element to be present or visible, for example:
from selenium.webdriver.support import expected_conditions as EC
wait = WebDriverWait(driver, 10)
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, 'span.eN')))
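Once a wait has succeeded, the nth match can be picked by index from `find_elements`, which sidesteps the positional XPath tried in the question: `//span[@class='eN'][10]` matches a span that is the tenth such child of its own parent, whereas `(//span[@class='eN'])[10]` is the tenth match in the whole document. The sketch below assumes a hypothetical helper `nth_element` and a hypothetical `StubDriver` so that it runs without a browser; with a real driver the call would be `nth_element(driver, (By.CSS_SELECTOR, 'span.eN'), 10)`.

```python
def nth_element(driver, locator, n):
    """Return the n-th (1-based) element matching `locator`, or None if there are fewer."""
    elements = driver.find_elements(*locator)
    return elements[n - 1] if 1 <= n <= len(elements) else None


class StubDriver:
    """Hypothetical stand-in for a WebDriver so the sketch runs without a browser."""

    def __init__(self, texts):
        self.texts = texts

    def find_elements(self, by, value):
        # A real driver would query the page; the stub returns canned results.
        return list(self.texts)


stub = StubDriver(["item %d" % i for i in range(1, 13)])
locator = ("css selector", "span.eN")  # same tuple as (By.CSS_SELECTOR, 'span.eN')
tenth = nth_element(stub, locator, 10)
missing = nth_element(stub, locator, 20)
```

Returning `None` for an out-of-range index avoids an `IndexError` when fewer elements have loaded than expected, which is the situation the explicit wait is meant to rule out.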