web-scraping

Scrapy does not find text in Xpath or Css

我与影子孤独终老i submitted on 2021-01-07 02:16:15
Question: I've been at this one for a few days, and no matter what I try, I cannot get Scrapy to extract text that is in one element. To spare you all the code, here are the important pieces. The setup does grab everything else off the page, just not this text.
from scrapy.selector import Selector
start_url = "https://www.tripadvisor.com/VacationRentalReview-g34416-d12428323-On_the_Beach_Wide_flat_beach_Sunsets_Gulf_view_Sharks_teeth_Shells_Fish-Manasota_Key_F.html"
#BASIC ITEM AND SPIDER YADA, SPARE

How can I check if either xpath exists and then return the value if text is present?

自闭症网瘾萝莉.ら submitted on 2021-01-07 01:44:38
Question: I'm having trouble with the second r.html.xpath request. When there is a special deal on an item, the second XPath changes from //*[@id="priceblock_ourprice"] to //*[@id="priceblock_dealprice"]. This causes the script to fail, since the right XPath cannot be returned. How can I include this second XPath that only shows up occasionally? I would like to check whether either XPath exists and, if so, return its value, or otherwise return N/A. The first URL that is searched has the ourprice XPath and the second URL has
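One way to approach the "either XPath, else N/A" requirement is to try the candidate expressions in order and fall back to a default. The sketch below is only an illustration: it uses the standard library's xml.etree.ElementTree as a stand-in for r.html.xpath, and the helper name first_price and the sample fragments are hypothetical, not taken from the question.

```python
import xml.etree.ElementTree as ET

# The two element ids named in the question, tried in this order.
PRICE_XPATHS = [
    ".//*[@id='priceblock_ourprice']",
    ".//*[@id='priceblock_dealprice']",
]


def first_price(root, xpaths=PRICE_XPATHS, default="N/A"):
    """Return the text of the first XPath that matches and has text."""
    for xp in xpaths:
        node = root.find(xp)
        if node is not None:
            text = "".join(node.itertext()).strip()
            if text:
                return text
    return default


# Hypothetical fragments standing in for the two product pages.
regular = ET.fromstring("<div><span id='priceblock_ourprice'>$19.99</span></div>")
deal = ET.fromstring("<div><span id='priceblock_dealprice'>$14.99</span></div>")
missing = ET.fromstring("<div><span id='other'>no price here</span></div>")

print(first_price(regular))
print(first_price(deal))
print(first_price(missing))
```

With a full XPath 1.0 engine, the two expressions could probably also be combined with the union operator (//*[@id="priceblock_ourprice"] | //*[@id="priceblock_dealprice"]); ElementTree's XPath subset does not support that, which is why the sketch loops instead.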

How can I check if either xpath exists and then return the value if text is present?

对着背影说爱祢 submitted on 2021-01-07 01:43:10
Question: I'm having trouble with the second r.html.xpath request. When there is a special deal on an item, the second XPath changes from //*[@id="priceblock_ourprice"] to //*[@id="priceblock_dealprice"]. This causes the script to fail, since the right XPath cannot be returned. How can I include this second XPath that only shows up occasionally? I would like to check whether either XPath exists and, if so, return its value, or otherwise return N/A. The first URL that is searched has the ourprice XPath and the second URL has

trying to close popover - python - selenium - Glassdoor

血红的双手。 submitted on 2021-01-05 11:08:47
Question: I'm trying to close a popover while scraping Glassdoor for jobs (it keeps popping up from time to time, so I need to close it every time). I've tried quite a few things. Please help!
Tried closing it by looking for the close button:
driver.find_element_by_class_name("SVG_Inline modal_closeIcon").click()
Tried catching an ElementClickInterceptedException when the bot couldn't click on the next company, and everywhere else there was a click:
element = WebDriverWait(driver, 3).until(EC.presence_of
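A likely problem with the first attempt is that find_element_by_class_name expects a single class name, while "SVG_Inline modal_closeIcon" is two classes separated by a space. A minimal sketch of one workaround follows, converting the class list to a CSS selector. The helper names are hypothetical, the driver is assumed to be a Selenium WebDriver, and the string "css selector" is the value behind Selenium's By.CSS_SELECTOR.

```python
def css_for_classes(class_attr):
    """Turn a space-separated class attribute into a compound CSS selector."""
    return "".join("." + name for name in class_attr.split())


def close_popover_if_present(driver, class_attr="SVG_Inline modal_closeIcon"):
    """Click the close icon if it is on the page; return True if clicked."""
    # find_elements (plural) returns an empty list instead of raising
    # when nothing matches, so no try/except is needed for the lookup.
    matches = driver.find_elements("css selector", css_for_classes(class_attr))
    if not matches:
        return False
    matches[0].click()
    return True
```

Calling close_popover_if_present(driver) before each click on the next company would be one place to use it. Whether the click itself succeeds still depends on the popover being visible and interactable at that moment.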

How to scraping iframe using selenium?

会有一股神秘感。 submitted on 2021-01-05 07:28:32
Question: I want to extract all the comments on a website. The website uses an iframe for the comment section. I have already tried to scrape it using Selenium, but unfortunately I can only scrape 1 comment. How do I scrape the rest of the comments and save them to csv or xmls? Code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome()
page = driver
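Getting only one comment often comes from using a single-element lookup instead of a plural one. The sketch below shows the general pattern of switching into the iframe, collecting every match, and writing the results to CSV. It assumes a Selenium WebDriver is passed in; the function names, the frame reference, and the comment selector are hypothetical because the question's page is not shown.

```python
import csv


def collect_comments(driver, frame, comment_css):
    """Switch into the iframe, read every matching element, switch back."""
    driver.switch_to.frame(frame)
    try:
        # "css selector" is the value behind Selenium's By.CSS_SELECTOR.
        # find_elements (plural) returns all matches, not just the first.
        elements = driver.find_elements("css selector", comment_css)
        return [el.text for el in elements]
    finally:
        # Always return to the top-level document.
        driver.switch_to.default_content()


def write_comments_csv(comments, path):
    """Write one comment per row under a single 'comment' header."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["comment"])
        writer.writerows([c] for c in comments)
```

If the comments load lazily, a wait or a scroll step inside the frame may also be needed before collecting them; that part is left out here.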

How to scraping iframe using selenium?

☆樱花仙子☆ submitted on 2021-01-05 07:27:08
Question: I want to extract all the comments on a website. The website uses an iframe for the comment section. I have already tried to scrape it using Selenium, but unfortunately I can only scrape 1 comment. How do I scrape the rest of the comments and save them to csv or xmls? Code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome()
page = driver

PHP: Simple HTML DOM Parser - how to get the element which has certain content?

早过忘川 submitted on 2021-01-05 06:49:54
Question: In PHP I'm using the Simple HTML DOM Parser class. I have an HTML file which has multiple A-tags. Now I need to find the tag that has a certain text inside. For example:
$html = "<a id='tag1'>A</a> <a id='tag2'>B</a> <a id='tag3'>C</a> ";
$dom = str_get_html($html);
$tag = $dom->find("a[plaintext=B]");
The above example doesn't work, since plaintext can only be used as an attribute. Any ideas?
Answer 1: <?php include("simple_html_dom.php"); $html = "<a id='tag1'>A</a> <a id='tag2'>B</a> <a id=

PHP: Simple HTML DOM Parser - how to get the element which has certain content?

若如初见. submitted on 2021-01-05 06:47:53
Question: In PHP I'm using the Simple HTML DOM Parser class. I have an HTML file which has multiple A-tags. Now I need to find the tag that has a certain text inside. For example:
$html = "<a id='tag1'>A</a> <a id='tag2'>B</a> <a id='tag3'>C</a> ";
$dom = str_get_html($html);
$tag = $dom->find("a[plaintext=B]");
The above example doesn't work, since plaintext can only be used as an attribute. Any ideas?
Answer 1: <?php include("simple_html_dom.php"); $html = "<a id='tag1'>A</a> <a id='tag2'>B</a> <a id=

Python requests 401 error but url opens in browser

末鹿安然 submitted on 2021-01-04 03:13:46
Question: I am trying to pull the JSON from this location: https://www.nseindia.com/api/option-chain-indices?symbol=BANKNIFTY This opens fine in my browser, but using requests in Python throws a 401 permission error. I have tried adding headers with different arguments, but to no avail. Interestingly, the JSON on this page does not open in the browser either until https://www.nseindia.com is opened separately. I believe it requires some kind of authentication, but I am surprised it works in the browser
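The observation that the JSON only opens after the home page has been visited suggests the API may depend on cookies set by that first visit. The sketch below illustrates the idea with the standard library rather than requests: it opens the home page with a cookie-aware opener, then requests the API with the same opener. The header values and function names are assumptions, and the site may apply further checks, so this is not guaranteed to avoid the 401.

```python
import json
import urllib.request
from http.cookiejar import CookieJar

HOME_URL = "https://www.nseindia.com"
API_URL = "https://www.nseindia.com/api/option-chain-indices?symbol=BANKNIFTY"

# Assumed browser-like headers; the exact set the site expects is not known.
HEADERS = {
    "User-Agent": "Mozilla/5.0",
    "Accept": "application/json, text/plain, */*",
    "Accept-Language": "en-US,en;q=0.9",
}


def make_opener():
    """Build an opener that keeps cookies between requests."""
    return urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar())
    )


def fetch_json(opener, home_url=HOME_URL, api_url=API_URL, headers=HEADERS):
    """Open the home page first to collect cookies, then request the API."""
    home_req = urllib.request.Request(home_url, headers=headers)
    with opener.open(home_req, timeout=10) as resp:
        resp.read()  # body is discarded; only the cookies matter
    api_req = urllib.request.Request(api_url, headers=headers)
    with opener.open(api_req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs real network requests):
# data = fetch_json(make_opener())
```

The same pattern carries over to requests by using one requests.Session for both calls, since a session also keeps cookies between requests.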

Python requests 401 error but url opens in browser

拥有回忆 submitted on 2021-01-04 03:13:08
Question: I am trying to pull the JSON from this location: https://www.nseindia.com/api/option-chain-indices?symbol=BANKNIFTY This opens fine in my browser, but using requests in Python throws a 401 permission error. I have tried adding headers with different arguments, but to no avail. Interestingly, the JSON on this page does not open in the browser either until https://www.nseindia.com is opened separately. I believe it requires some kind of authentication, but I am surprised it works in the browser