Extracting the word that follows a specific word from a web page with Python

Problem description:

I am writing a simple web scraper script to extract a single word from a web page. The word I need changes regularly, but it always comes after a word that never changes, so I can search for that fixed word.

Here is my script so far:

#!/bin/python

import requests
response = requests.get('http://vpnbook.com/freevpn')
print(response.text)

This obviously prints the whole HTML of the page, but the only bit I need is the password:

<li>All bundles include UDP53, UDP 25000, TCP 80, TCP 443 profile</li>
<li>Username: <strong>vpnbook</strong></li>
<li>Password: <strong>binbd5ar</strong></li>
</ul>

How could I print ONLY 'binbd5ar' (or whatever replaces it) to STDOUT?

Answer

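from bs4 import BeautifulSoup
import requests

response = requests.get('http://vpnbook.com/freevpn')
soup = BeautifulSoup(response.text, 'html.parser')

# Narrow the search to the first column of the pricing section.
pricing = soup.find(id='pricing')
first_column = pricing.find('div', {'class': 'one-third'})

# Stays None if no matching list item is found, so print() cannot raise NameError.
password = None
# Walk the <li> tags only; iterating the <ul> directly also yields whitespace text nodes.
for li in first_column.find('ul', {'class': 'disc'}).find_all('li'):
    if 'password' in li.get_text().lower():
        # The value sits in the <strong> tag that follows the label.
        password = li.find('strong').text
print(password)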
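
If you would rather not depend on the page layout (the 'pricing' id and the 'one-third' and 'disc' classes), a regular expression anchored on the fixed label is another option. The following is only a minimal sketch: the helper name word_after is made up for illustration, and it assumes the value is always wrapped in a <strong> tag directly after the label, as in the snippet from the question. It runs on that snippet instead of the live page; in the real script you would pass response.text.

```python
import re


def word_after(html, label):
    """Return the text inside the <strong> tag that follows `label`, or None.

    Hypothetical helper: assumes the markup looks like
    'Label: <strong>value</strong>', as in the question.
    """
    pattern = re.escape(label) + r'\s*<strong>\s*([^<\s]+)\s*</strong>'
    match = re.search(pattern, html, flags=re.IGNORECASE)
    return match.group(1) if match else None


# Sample markup copied from the question.
sample = """
<li>All bundles include UDP53, UDP 25000, TCP 80, TCP 443 profile</li>
<li>Username: <strong>vpnbook</strong></li>
<li>Password: <strong>binbd5ar</strong></li>
</ul>
"""

print(word_after(sample, 'Password:'))
```

Regular expressions are brittle against arbitrary HTML, so this is only reasonable while the label and the <strong> wrapper stay as they are; the BeautifulSoup version copes better with changes in attributes or nesting.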
