Parsing HTML line by line
Question:
I am parsing an HTML webpage with Python and Beautiful Soup (though I am open to other solutions). Is it possible to parse the file based on a line of the HTML source, e.g. get the td tag from line 3?
Answer:
Consider this example: http://www.pythonforbeginners.com/python-on-the-web/web-scraping-with-beautifulsoup/
It shows line-by-line processing and matching of href; in your case you would match td instead.
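Also consider: soup.find_all("td", limit=3)
Note that limit=3 returns at most the first three td tags in document order; it does not select the td on line 3.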
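If what you want is really "the td that starts on line 3 of the source", here is a minimal sketch using only the standard library's html.parser rather than Beautiful Soup. HTMLParser.getpos() reports the line number at which the tag currently being handled begins. The names TdLineCollector, td_on_line and SAMPLE_HTML are made up for this illustration, and the sketch assumes td elements are not nested inside each other. Recent Beautiful Soup releases also appear to expose a sourceline attribute on tags when built with html.parser, but check your installed version before relying on it.

```python
from html.parser import HTMLParser


class TdLineCollector(HTMLParser):
    """Collect the text of every <td> whose opening tag starts on target_line."""

    def __init__(self, target_line):
        super().__init__()
        self.target_line = target_line  # 1-based line number in the HTML source
        self.cells = []                 # text of each matching <td>
        self._capturing = False         # True while inside a matching <td>
        self._buffer = []               # text fragments of the current <td>

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            line, _offset = self.getpos()  # position where this tag begins
            if line == self.target_line:
                self._capturing = True
                self._buffer = []

    def handle_data(self, data):
        if self._capturing:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._capturing:
            self.cells.append("".join(self._buffer).strip())
            self._capturing = False


def td_on_line(html, line_number):
    """Return the text of all <td> elements that start on the given source line."""
    parser = TdLineCollector(line_number)
    parser.feed(html)
    parser.close()
    return parser.cells


# Hypothetical sample page: line 3 of the source holds two cells.
SAMPLE_HTML = """<table>
<tr><td>first</td></tr>
<tr><td>second</td><td>third</td></tr>
</table>"""

if __name__ == "__main__":
    print(td_on_line(SAMPLE_HTML, 3))
```

Keep in mind that line numbers depend on how the server happens to format the page, so selecting by line is fragile; matching on tag structure or attributes is usually more robust if the page allows it.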