Python 3: requests does not get the full content of a web page

Problem description:

I am testing the requests module to fetch the content of a web page, but when I inspect the result, it does not contain the full content of the page.

Here is my code:

import requests
from bs4 import BeautifulSoup

url = "https://shop.nordstrom.com/c/womens-dresses-shop?origin=topnav&cm_sp=Top%20Navigation-_-Women-_-Dresses&offset=11&page=3&top=72"
page = requests.get(url)

soup = BeautifulSoup(page.content, 'html.parser')
print(soup.prettify())

Likewise, when I view the page source in the Chrome browser, I do not see the full content.

Is there a way to get the full content of the example page I have provided?

The page is rendered with JavaScript, which makes additional requests to fetch more data. You can fetch the complete page with Selenium.

from bs4 import BeautifulSoup
from selenium import webdriver
driver = webdriver.Chrome()
url = "https://shop.nordstrom.com/c/womens-dresses-shop?origin=topnav&cm_sp=Top%20Navigation-_-Women-_-Dresses&offset=11&page=3&top=72"
driver.get(url)
soup = BeautifulSoup(driver.page_source, 'html.parser')
driver.quit()
print(soup.prettify())
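
Reading page_source immediately after get() can still miss content that arrives through later requests. A minimal sketch of one way around that, assuming you know a CSS selector that only matches once the data has loaded, is to use Selenium's explicit wait. The function name fetch_rendered_html, the 15-second timeout, and the "article" selector in the usage comment are placeholders of mine, not taken from the page in question.

```python
def fetch_rendered_html(url, css_selector, timeout=15):
    """Load url in Chrome and return the HTML once css_selector is present."""
    # Imported inside the function so this module loads even where
    # selenium is not installed; move to the top of the file if you prefer.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Block until at least one matching element is in the DOM;
        # raises TimeoutException if none appears within `timeout` seconds.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
        )
        return driver.page_source
    finally:
        driver.quit()  # always close the browser, even on timeout


# Example call; "article" is a placeholder selector, not one verified
# against the actual page:
# html = fetch_rendered_html(url, "article")
```

The returned string can be passed to BeautifulSoup exactly as driver.page_source is above. Waiting on a concrete element is usually more reliable than a fixed time.sleep(), since it returns as soon as the content is there.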

For other solutions, see my answer to Scraping Google Finance (BeautifulSoup).
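
As a quick check for whether a page is rendered client-side in the first place, you can look at how much visible text the raw HTML carries compared with how many scripts it loads. The sketch below uses only the standard library; the StaticContentProbe class, the probe helper, and the sample HTML are illustrative assumptions, not the real response from the site.

```python
from html.parser import HTMLParser


class StaticContentProbe(HTMLParser):
    """Count <script> tags and collect text visible without running JavaScript."""

    def __init__(self):
        super().__init__()
        self.script_count = 0
        self.visible_text = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.script_count += 1
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that is outside <script>/<style>.
        if self._skip_depth == 0 and data.strip():
            self.visible_text.append(data.strip())


def probe(html):
    """Return (number of <script> tags, visible text) for an HTML string."""
    parser = StaticContentProbe()
    parser.feed(html)
    parser.close()
    return parser.script_count, " ".join(parser.visible_text)


# Hypothetical response body: an empty container that a script fills in later.
sample_html = """
<html><body>
  <div id="root"></div>
  <script src="/static/app.js"></script>
  <script>window.__DATA__ = {"products": ["Dress A", "Dress B"]};</script>
</body></html>
"""

script_count, text = probe(sample_html)
print(script_count)  # 2
print(repr(text))    # ''
```

With requests you would call probe(page.text). Several scripts and little or no visible text suggest the content is filled in by JavaScript, so a plain HTTP request will not see it.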