Selenium: "Max retries exceeded with URL" on the second browser.get()

Problem description:

I'm looking to traverse a URL array and open each URL for web scraping with Selenium. The problem is that as soon as I hit the second browser.get(url), I get "Max retries exceeded with URL" and "No connection could be made because the target machine actively refused it".

I've added the rest of the code below, although it's just BeautifulSoup stuff.

from bs4 import BeautifulSoup
import time
from selenium import webdriver
from selenium.webdriver import Chrome
from selenium.webdriver.chrome.options import Options
import json

chrome_options = Options()  
chromedriver = webdriver.Chrome(executable_path='C:/Users/andre/Downloads/chromedriver_win32/chromedriver.exe', options=chrome_options)
urlArr = ['https://link1', 'https://link2', '...']

for url in urlArr:
   with chromedriver as browser:
      browser.get(url)
      time.sleep(5)
      # Click a button (the eighth link on the page)
      browser.find_elements_by_tag_name('a')[7].click()

      # Scroll to the bottom a few times so lazy-loaded content appears
      browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")
      time.sleep(2)
      for i in range(2):
         browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")
         time.sleep(5)

      html = browser.page_source
      page_soup = BeautifulSoup(html, 'html.parser')
      boxes = page_soup.find("div", {"class": "rpBJOHq2PR60pnwJlUyP0"})
      videos = page_soup.findAll("video", {"class": "_1EQJpXY7ExS04odI1YBBlj"})

The other posts on here say this happens when you open too many pages at once and the server shuts you out, but that's not my issue: the error above happens whenever I call browser.get(url) more than once.

What's going on? Thanks.

Solved the problem: you have to recreate the webdriver for each URL. In recent Selenium releases, leaving a `with chromedriver as browser:` block calls the driver's quit() method, so the browser session is already closed by the time the next loop iteration calls browser.get(url). Creating a fresh driver inside the loop sidesteps that.
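
To see why the second get() fails without launching a real browser, here is a minimal mock of the behavior. FakeDriver is a hypothetical stand-in, not Selenium code; it only mimics the relevant detail that `__exit__` calls quit(), which closes the session:

```python
class FakeDriver:
    """Hypothetical stand-in for Selenium's WebDriver: in recent Selenium
    releases, exiting a `with driver as ...:` block calls driver.quit(),
    which closes the underlying browser session."""
    def __init__(self):
        self.session_open = True

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.quit()  # this is what silently ends the session

    def quit(self):
        self.session_open = False

    def get(self, url):
        if not self.session_open:
            raise ConnectionError("Max retries exceeded with url: " + url)

driver = FakeDriver()
with driver as browser:
    browser.get("https://link1")   # first page: session still open
# the with-block has exited, so quit() has already run
try:
    driver.get("https://link2")    # second get(): session is gone
except ConnectionError as err:
    print(err)                     # mirrors the error from the question
```

Since the original loop re-enters the same `with chromedriver as browser:` block on every iteration, every get() after the first hits a dead session.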

from bs4 import BeautifulSoup
import time
from selenium import webdriver
from selenium.webdriver import Chrome
from selenium.webdriver.chrome.options import Options
import json


urlArr = ['https://link1', 'https://link2', '...']

for url in urlArr:
   chrome_options = Options()
   # A fresh driver per URL: the with-block below quits the driver on
   # exit, so the previous session can't be reused anyway.
   chromedriver = webdriver.Chrome(executable_path='C:/Users/andre/Downloads/chromedriver_win32/chromedriver.exe', options=chrome_options)
   with chromedriver as browser:
      browser.get(url)
      time.sleep(5)
      # Click a button (the eighth link on the page)
      browser.find_elements_by_tag_name('a')[7].click()

      # Scroll to the bottom a few times so lazy-loaded content appears
      browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")
      time.sleep(2)
      for i in range(2):
         browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")
         time.sleep(5)

      html = browser.page_source
      page_soup = BeautifulSoup(html, 'html.parser')
      boxes = page_soup.find("div", {"class": "rpBJOHq2PR60pnwJlUyP0"})
      videos = page_soup.findAll("video", {"class": "_1EQJpXY7ExS04odI1YBBlj"})
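
Recreating the driver works, but it launches a new Chrome process for every URL. A lighter alternative is to create one driver, drop the `with` block (whose exit is what calls quit()), and quit exactly once after the loop. A sketch of that shape; `scrape_urls` is a hypothetical helper name, and the demo uses a stand-in driver that just records calls (with real Selenium you would pass `webdriver.Chrome(options=chrome_options)` instead):

```python
def scrape_urls(urls, driver):
    """Visit every URL with a single driver instance and quit exactly once.
    This avoids the 'Max retries exceeded' error: quit() runs only after
    the whole loop finishes, not (via a with-block) after the first page."""
    try:
        for url in urls:
            driver.get(url)
            # per-page work (clicks, scrolling, BeautifulSoup) goes here
    finally:
        driver.quit()  # always release the browser, even on an error

# Demo with a stand-in driver that records calls instead of browsing.
class RecordingDriver:
    def __init__(self):
        self.calls = []

    def get(self, url):
        self.calls.append(("get", url))

    def quit(self):
        self.calls.append(("quit",))

d = RecordingDriver()
scrape_urls(["https://link1", "https://link2"], d)
print(d.calls)  # both pages visited, then a single quit at the end
```

The try/finally also guarantees the browser process is cleaned up if one of the pages raises mid-loop, which the per-URL recreation approach doesn't handle on its own.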