Here is an example web page I am trying to get data from. http://www.makospearguns.com/product-p/mcffgb.htm
The xpath was taken from chrome development tools, and f
I had a similar issue (Chrome inserting tbody elements when you do Copy as XPath). As others answered, you have to look at the actual page source, though the browser-given XPath is a good place to start. I've found that often, removing tbody tags fixes it, and to test this I wrote a small Python utility script to test XPaths:
#!/usr/bin/env python
import sys, requests
from lxml import html
if (len(sys.argv) < 3):
print 'Usage: ' + sys.argv[0] + ' url xpath'
sys.exit(1)
else:
url = sys.argv[1]
xp = sys.argv[2]
page = requests.get(url)
tree = html.fromstring(page.text)
nodes = tree.xpath(xp)
if (len(nodes) == 0):
print 'XPath did not match any nodes'
else:
# tree.xpath(xp) produces a list, so always just take first item
print (nodes[0]).text_content().encode('ascii', 'ignore')
(that's Python 2.7, in case the non-function "print" didn't give it away)