How to fetch a product's image and title from Amazon?
I am trying to build a list of products based on Amazon's unique product codes.
For example: https://www.amazon.in/gp/product/B00F2GPN36
where B00F2GPN36 is the unique code.
I want to fetch the product's image and title into an Excel list, under the columns "product image" and "product name".
I tried html.getElementsById("productTitle") and html.getElementsByTagName.
I am also unsure which variable type to declare for storing this information; I have tried declaring it as Object and as HTMLHtmlElement.
I tried to pull the HTML document and search it for the data.
Code:
Enum READYSTATE
    READYSTATE_UNINITIALIZED = 0
    READYSTATE_LOADING = 1
    READYSTATE_LOADED = 2
    READYSTATE_INTERACTIVE = 3
    READYSTATE_COMPLETE = 4
End Enum

Sub parsehtml()
    Dim ie As InternetExplorer
    Dim topics As Object
    Dim html As HTMLDocument

    Set ie = New InternetExplorer
    ie.Visible = False
    ie.navigate "https://www.amazon.in/gp/product/B00F2GPN36"

    Do While ie.READYSTATE <> READYSTATE_COMPLETE
        Application.StatusBar = "Trying to go to Amazon.in...."
        DoEvents
    Loop
    Application.StatusBar = ""

    Set html = ie.document
    Set topics = html.getElementsById("productTitle")
    Sheets(1).Cells(1, 1).Value = topics.innerText

    Set ie = Nothing
End Sub
I expect cell A1 to contain "Milton Thermosteel Carafe Flask, 2 litres, Silver" (without the quotation marks), and I want to pull the image in the same way.
But there is always some error, such as:
1. Run-time error '13': Type mismatch, when I used "Dim topics As HTMLHtmlElement"
2. Run-time error '438': Object doesn't support this property or method
Note: I added the required libraries as references via Tools > References.
A faster approach is to use XHR (an XMLHTTP request) instead of automating a browser, and to write the results out from an array to the sheet:
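Option Explicit

' Requires a reference to "Microsoft HTML Object Library" (for HTMLDocument).
Public Sub GetInfo()
    Dim html As HTMLDocument, results()
    Set html = New HTMLDocument

    ' Request the page directly rather than driving Internet Explorer
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "https://www.amazon.in/gp/product/B00F2GPN36", False
        .send
        html.body.innerHTML = .responseText
        ' results(0) = product title, results(1) = image URL from the data-old-hires attribute
        With html
            results = Array(.querySelector("#productTitle").innerText, .querySelector("#landingImage").getAttribute("data-old-hires"))
        End With
    End With

    With ThisWorkbook.Worksheets("Sheet1")
        .Cells(1, 1) = results(0)
        Dim file As String
        ' DownloadFile is a user-defined helper (not part of this Sub) that saves the image and returns its path
        file = DownloadFile("C:\Users\User\Desktop\", results(1)) 'your path to download file
        ' Insert the picture and position it over cell B1
        With .Pictures.Insert(file)
            .Left = ThisWorkbook.Worksheets("Sheet1").Cells(1, 2).Left
            .Top = ThisWorkbook.Worksheets("Sheet1").Cells(1, 2).Top
            .Width = 75
            .Height = 100
            .Placement = 1
        End With
    End With

    Kill file   ' delete the temporary download
End Sub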
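
On the two errors in the question: error 438 is consistent with calling getElementsById, which MSHTML does not provide. The method is getElementById (singular), and it returns a single element rather than a collection. The type mismatch most likely comes from HTMLHtmlElement being the type of the <html> root element, whereas #productTitle is a span, so the generic IHTMLElement (or Object) is a safer declaration. Below is a minimal sketch of that, run against a small hand-written HTML string rather than a live page. The names ParseProductSnippet and DemoParseProductSnippet, the sample markup, and the example.com URL are all made up for illustration.

```vba
Option Explicit

' Requires a reference to "Microsoft HTML Object Library" (Tools > References).
' Hypothetical helper: returns Array(title, imageUrl) parsed from an HTML string.
Public Function ParseProductSnippet(ByVal pageHtml As String) As Variant
    Dim html As HTMLDocument
    Dim titleEl As IHTMLElement   ' generic element interface; HTMLHtmlElement is only the <html> root
    Dim imageEl As IHTMLElement
    Dim title As String, imageUrl As String

    Set html = New HTMLDocument
    html.body.innerHTML = pageHtml

    ' getElementById (singular) returns one element, or Nothing if the id is absent
    Set titleEl = html.getElementById("productTitle")
    Set imageEl = html.getElementById("landingImage")

    If Not titleEl Is Nothing Then title = Trim$(titleEl.innerText)
    ' getAttribute returns Null when the attribute is missing; "& vbNullString" turns that into ""
    If Not imageEl Is Nothing Then imageUrl = imageEl.getAttribute("data-old-hires") & vbNullString

    ParseProductSnippet = Array(title, imageUrl)
End Function

' Hypothetical demo using made-up markup, not Amazon's actual page
Public Sub DemoParseProductSnippet()
    Dim sample As String, parts As Variant
    sample = "<span id=""productTitle"">Sample title</span>" & _
             "<img id=""landingImage"" data-old-hires=""https://example.com/sample.jpg"">"
    parts = ParseProductSnippet(sample)
    Debug.Print parts(0)
    Debug.Print parts(1)
End Sub
```

The Nothing checks matter in practice: if the page layout changes or the request is blocked, the ids may be missing, and reading .innerText from Nothing raises run-time error 91.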