Question
I am trying to track the delivery status of shipments and display it on an Excel sheet.
This website, https://webcsw.ocs.co.jp/csw/ECSWG0201R00003P.do, displays tracking data when the "Air WayBill No." is entered.
I managed to open Internet Explorer, enter the Air WayBill number, and click the search button:
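Dim IE As Object
Dim document As Object
Set IE = CreateObject("InternetExplorer.Application")
IE.Navigate "https://webcsw.ocs.co.jp/csw/ECSWG0201R00000P.do"
IE.Visible = True
' Wait until the page has finished loading (readyState 4 = complete)
While IE.Busy Or IE.readyState <> 4
    DoEvents
Wend
Set document = IE.document
With document
    ' Enter the Air WayBill number from the worksheet
    .getElementsByName("edtAirWayBillNo")(0).Value = ThisWorkbook.Sheets("Sheet3").Range("B2").Value
    ' Click the first element carrying both the "button" and "btn_ex" classes
    .getElementsByClassName("button btn_ex")(0).Click
End With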
I couldn't find any identifying attributes such as name, id or class on the cells that hold the data.
How do I retrieve data from the chart section, where the elements are marked only by 'tbody', 'tr' and 'td'?
I tried to use the .getElementsByTagName method.
This is the section of the website's HTML from which I need to retrieve data:
<table border="0" cellpadding="0" cellspacing="0" id="" style="border:#d0d0d0 1px dotted;" width="100%">
<tbody id="chart_header">
<tr>
<td rowspan="1" colspan="1" width="90px">Air WayBill No.</td>
<td rowspan="1" colspan="3" width="370px">Latest Tracking Record</td>
<td rowspan="1" colspan="1" width="150px">Shipper</td>
<td rowspan="1" colspan="1" width="150px">Receiver</td>
<td rowspan="1" colspan="1" width="40px">Pcs</td>
<td rowspan="1" colspan="1" width="80px">Actual Weight</td>
<td rowspan="1" colspan="1" width="70px">Vol. Weight</td>
</tr>
</tbody>
<tbody id="chart" style="height: auto">
<!-- record start -->
<tr>
<td>
<a href="#0" shape="rect">
25017894414
</a>
</td>
<td width="160px">
<div style=" position:relative; width:100%;align:left;vertical-align:middle;">
<div style="position:absolute;top:0pt;left: 1pt; margin: 1px;">
Fri
</div>
<div style="position:absolute;top:0pt;left:25pt;">
04Sep2020
</div>
<div style="position:absolute;top:0pt;left:80pt;">
09:40
</div>
</div>
</td>
<td width="90px">
<input type="text" value="Product Scanned" style="width:90px;" tabindex="-1" class="readonly_left" readonly="readonly">
</td>
<td width="130px" style="border-width:1px 1px 1px 0px;">
<img src="./image/tpStatus_BLUE4.gif" width="130px" height="16px" class="middle">
</td>
<td>
<input type="text" value="SUZHOU/CHINA" style="width:145px;" tabindex="-1" class="readonly_left" readonly="readonly">
</td>
<td>
<input type="text" value="AICHI KEN/JAPAN" style="width:145px;" tabindex="-1" class="readonly_left" readonly="readonly">
</td>
<td class="t_right">
<input type="text" value="1" style="width:40px;" tabindex="-1" class="readonly_right" readonly="readonly">
</td>
<td class="t_right">
<input type="text" value="1.9kg" style="width:70px;" tabindex="-1" class="readonly_right" readonly="readonly">
</td>
<td class="t_right">
<input type="text" value="1.2kg" style="width:70px;" tabindex="-1" class="readonly_right" readonly="readonly">
</td>
</tr>
<!-- record end -->
</tbody>
</table>
Answer 1:
Provided you wait for the results to load, you should be able to use ie.document.querySelector("#charttitle + table") to grab the table, then use the clipboard to copy the outerHTML of that node to Excel as a table. You could loop until the table has results, with a time-out (preferable), or use an explicit wait.
Here, #charttitle + table is a CSS selector that matches the table which is the adjacent sibling of the element with id charttitle.
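The wait-with-time-out idea mentioned above could be sketched roughly as follows. This is only an illustration: WaitForResultsTable and maxWaitSec are hypothetical names, and it assumes that the table matched by "#charttitle + table" appears only once results have loaded. If the table is already present before the search completes, the condition would need to test for a row inside it instead.

```vba
' Hypothetical helper (not part of the original answer): polls for the
' results table and returns Nothing if it has not appeared in time.
Function WaitForResultsTable(ByVal ie As Object, _
                             Optional ByVal maxWaitSec As Long = 10) As Object
    Dim startTime As Date
    Dim tbl As Object

    startTime = Now
    Do
        DoEvents
        ' The document can be briefly unavailable while the page reloads,
        ' so errors are ignored for this single lookup.
        On Error Resume Next
        Set tbl = ie.document.querySelector("#charttitle + table")
        On Error GoTo 0
        If Not tbl Is Nothing Then Exit Do
    Loop While DateDiff("s", startTime, Now) < maxWaitSec

    Set WaitForResultsTable = tbl
End Function
```

After clicking the search button, a caller could write Set tbl = WaitForResultsTable(ie) and check tbl Is Nothing before going any further.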
' Add a wait condition here, after the click that submits the search
Dim clipboard As Object
Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
clipboard.SetText ie.document.querySelector("#charttitle + table").outerHTML
clipboard.PutInClipboard
ActiveSheet.Cells(1, 1).PasteSpecial
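If the clipboard route is not suitable, the cell values could also be read directly. The following is a sketch that assumes the live page matches the HTML posted in the question: records are 'tr' rows inside the tbody with id chart, and most values sit in the value attribute of read-only input elements. ReadChartCells is a hypothetical name, and the output location (starting at A1 of the active sheet) is chosen only for illustration.

```vba
' Hypothetical sketch: copy each cell of the #chart rows into the sheet.
Sub ReadChartCells(ByVal ie As Object)
    Dim trNodes As Object, tdNodes As Object, inputNodes As Object
    Dim r As Long, c As Long

    Set trNodes = ie.document.querySelectorAll("#chart tr")
    For r = 0 To trNodes.Length - 1
        Set tdNodes = trNodes.Item(r).getElementsByTagName("td")
        For c = 0 To tdNodes.Length - 1
            Set inputNodes = tdNodes.Item(c).getElementsByTagName("input")
            If inputNodes.Length > 0 Then
                ' Value is held in a read-only <input>
                ActiveSheet.Cells(r + 1, c + 1).Value = inputNodes.Item(0).Value
            Else
                ' Plain text cell (e.g. the Air WayBill number or the date)
                ActiveSheet.Cells(r + 1, c + 1).Value = Trim$(tdNodes.Item(c).innerText)
            End If
        Next c
    Next r
End Sub
```

Cells without text or an input, such as the one holding only the status image, would simply be written as empty strings.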
To get every table that follows the charttitle element, use querySelectorAll with the CSS general sibling combinator ~:
Dim tables As Object, i As Long
Set tables = ie.document.querySelectorAll("#charttitle ~ table")
You then need to loop with For i = 0 To tables.Length - 1, access the current table inside the loop with tables.Item(i).outerHTML, and write each one out starting at the appropriate output row (for example, the first empty row below the previously pasted table).
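The loop described above might look something like this. It is a sketch under a few assumptions: PasteAllTables is a hypothetical name, the results have already loaded, and column A is populated in every pasted row, so that it can be used to find the next free row.

```vba
' Hypothetical sketch: paste each matched table below the previous one.
Sub PasteAllTables(ByVal ie As Object)
    Dim tables As Object, clipboard As Object
    Dim i As Long, nextRow As Long

    Set tables = ie.document.querySelectorAll("#charttitle ~ table")
    ' Late-bound MSForms DataObject, as used in the snippet above
    Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")

    For i = 0 To tables.Length - 1
        With ActiveSheet
            ' Last used row in column A; move one down unless the sheet is empty
            nextRow = .Cells(.Rows.Count, 1).End(xlUp).Row
            If Not IsEmpty(.Cells(nextRow, 1).Value) Then nextRow = nextRow + 1

            clipboard.SetText tables.Item(i).outerHTML
            clipboard.PutInClipboard
            .Cells(nextRow, 1).PasteSpecial
        End With
    Next i
End Sub
```

If column A can be blank in the pasted data, a different column or another last-row technique would be needed.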
Read about CSS selectors here:
https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors
And about finding the last row here:
https://www.rondebruin.nl/win/s9/win005.htm
Remember to check if scraping is allowed under the terms of service.
Source: https://stackoverflow.com/questions/63758241/web-scraping-without-specified-name-id-or-class-attached-to-the-data