AttributeError: 'NoneType' object has no attribute 'parent'
from urllib.request import urlopen
from bs4 import BeautifulSoup
html = urlopen("http://www.pythonscraping.com/pages/page3.html")
soup = BeautifulSoup(html.read())
print(soup.find("img", {"src": "../img/gifts/img1.jpg"}).parent.previous_sibling.get_text())
The code above works fine, but the code below does not. It raises the attribute error stated in the title. Can anyone tell me why?
from urllib.request import urlopen
from bs4 import BeautifulSoup
html = urlopen("http://www.pythonscraping.com/pages/page3.html")
soup = BeautifulSoup(html.read())
price = soup.find("img", {"src=": "../img/gifts/img1.jpg"}).parent.previous_sibling.get_text()
print(price)
Thanks! :)
If you compare the first and the second version, you'll notice that:
First: soup.find("img", {"src": "../img/gifts/img1.jpg"}).parent.previous_sibling.get_text()
- Note: "src"
Second: soup.find("img", {"src=": "../img/gifts/img1.jpg"}).parent.previous_sibling.get_text()
- Note: "src="
The second version raises AttributeError: 'NoneType' object has no attribute 'parent' because find() found no img tag with an attribute named "src=" equal to "../img/gifts/img1.jpg" in the provided soup. It therefore returned None, and None has no .parent.
So, if you remove the = from "src=" in the second version, it should work.
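A minimal sketch of the difference, assuming a small inline table in place of the real page. SAMPLE_HTML and price_for are hypothetical names used only for illustration, and "html.parser" is chosen so that nothing beyond bs4 needs to be installed:

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page, so no network access is needed.
SAMPLE_HTML = (
    "<table><tr>"
    "<td>$15.00</td>"
    '<td><img src="../img/gifts/img1.jpg"></td>'
    "</tr></table>"
)

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Correct key: the attribute is named "src".
found = soup.find("img", {"src": "../img/gifts/img1.jpg"})

# Typo from the question: no tag has an attribute literally named "src=",
# so find() returns None instead of a tag.
missing = soup.find("img", {"src=": "../img/gifts/img1.jpg"})


def price_for(img_tag):
    """Return the text of the cell before the image's cell, or None if no image was found."""
    if img_tag is None:
        return None
    return img_tag.parent.previous_sibling.get_text()


print(price_for(found))
print(price_for(missing))
```

Checking the result of find() for None before chaining .parent turns a confusing AttributeError into a case you can handle explicitly.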
By the way, you should explicitly specify which parser you want to use; otherwise bs4 will emit the following warning:
UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.
To get rid of this warning, change code that looks like this:
BeautifulSoup([your markup])
to this:
BeautifulSoup([your markup], "lxml")
So, as the warning message says, you just have to change soup = BeautifulSoup(html.read()) to soup = BeautifulSoup(html.read(), 'lxml'), for example.
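As a small illustration of passing the parser explicitly, here is a sketch that uses a hypothetical inline string (markup) instead of the downloaded page. It names "html.parser", which ships with Python; "lxml" is passed the same way, assuming the lxml package is installed:

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only.
markup = "<p>Hello, <b>world</b></p>"

# Naming the parser avoids the warning and keeps behavior consistent across systems.
# Swap in "lxml" here if that package is installed.
soup = BeautifulSoup(markup, "html.parser")

text = soup.get_text()
print(text)
```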