simple-html-dom

How can we get specific links using simple html dom

邮差的信 submitted on 2019-12-12 03:23:08
Question: I used this script, which I found on the official Simple HTML DOM site, to find hyperlinks in a website: foreach($html->find('a') as $element) echo $element->href . '<br>'; It returned all the links found on the website, but I want only specific links from it. Is there a way of doing this in Simple HTML DOM? This is the HTML code for those specific links: <a class="z" href="http://www.bbc.co.uk/news/world-middle-east-16893609" target="_blank" rel="follow">middle east</a> where this is
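One possible way to narrow the result to links like the one in the question is to select on the class attribute. In Simple HTML DOM a CSS-style selector such as find('a.z') should do this. The sketch below shows the same idea with PHP's built-in DOM extension (DOMDocument and DOMXPath) instead, so that it runs without the library. The function name linksByClass and the second sample link are assumptions made for illustration.

```php
<?php
// Return the href of every <a> element that carries the given class.
// linksByClass is a hypothetical helper name, not part of any library.
function linksByClass(string $html, string $class): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them,
    // since real-world HTML is rarely perfectly valid.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Match the class as a whole word inside the class attribute.
    $query = sprintf(
        '//a[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );

    $links = [];
    foreach ($xpath->query($query) as $a) {
        $links[] = $a->getAttribute('href');
    }
    return $links;
}

// Sample markup: the first link is from the question, the second is made up.
$html = '<a class="z" href="http://www.bbc.co.uk/news/world-middle-east-16893609" target="_blank" rel="follow">middle east</a>'
      . '<a href="http://example.com/other">other</a>';

print_r(linksByClass($html, 'z'));
```

Only the link carrying class "z" is returned; the unclassed link is skipped.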

HTTP_ACCESS returned when invoking file_get_contents (and also simple_html_dom)

帅比萌擦擦* submitted on 2019-12-12 02:54:50
Question: I'm trying to get the contents of a page this way: <?php include_once 'simple_html_dom.php'; $opts = array('http' => array( 'method' => 'GET', 'timeout' => 10 ) ); $domain = "http://www.esperandoaramon.com"; //$domain = "http://www.google.com"; $context = stream_context_create($opts); $input = @file_get_contents($domain,false,$context) or die("Could not access file: $domain"); echo($input); ?> I can get the contents of www.google.com this way; unfortunately, the other domain gives me only this

Which to use? file_get_contents, file_get_html, or cURL? [closed]

守給你的承諾、 submitted on 2019-12-12 02:04:44
Question: I need to scrape data from a table on a web page. I'd then like to store this data in an array, so that I can later store it in a database. I'm very unfamiliar with this functionality, so I'd like to use the simplest method possible. Which should I use? file_get_contents, file
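Whichever function fetches the page, the table still has to be turned into an array afterwards. A minimal sketch of that step follows, using PHP's built-in DOM extension rather than Simple HTML DOM so that it is self-contained. The function name tableToArray and the sample table are assumptions for illustration; a real page would first be fetched into $html.

```php
<?php
// Convert every table row in an HTML string into an array of cell texts.
// tableToArray is a hypothetical helper name.
function tableToArray(string $html): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // keep parser warnings quiet
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $rows = [];
    foreach ($xpath->query('//table//tr') as $tr) {
        $cells = [];
        // Header and data cells that are direct children of this row.
        foreach ($xpath->query('./th | ./td', $tr) as $cell) {
            $cells[] = trim($cell->textContent);
        }
        if ($cells !== []) {
            $rows[] = $cells;
        }
    }
    return $rows;
}

// Made-up sample table standing in for the scraped page.
$html = '<table>'
      . '<tr><th>Name</th><th>Score</th></tr>'
      . '<tr><td>Ann</td><td>7</td></tr>'
      . '<tr><td>Bob</td><td>9</td></tr>'
      . '</table>';

print_r(tableToArray($html));
```

Each inner array is one row, which maps naturally onto one database insert per row.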

PHP file_get_html is not working

大兔子大兔子 submitted on 2019-12-12 01:25:00
Question: I am using the simple_html_dom library, but for one URL I cannot get the HTML content; I get a 503 error instead. See my code below. $base = 'http://www.amazon.com/gp/offer-listing/B001F0M4K8/ref=dp_olp_all_mbc/183-8463780-9861412?ie=UTF8&condition=new'; echo $html = file_get_html($base); Error: Warning: file_get_contents(http://www.amazon.com/gp/offer-listing/B001F0M4K8/ref=dp_olp_all_mbc/183-8463780-9861412?ie=UTF8&condition=new) [function.file-get-contents]: failed to open stream: HTTP
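Some servers answer with an error status when a request does not look like it comes from a browser, so one thing that may be worth trying is sending explicit request headers through a stream context. The sketch below only shows how such a context could be built and inspected. Whether it helps for the URL in the question is not known, and the header values and the names buildHttpContext and fetchPage are assumptions for illustration.

```php
<?php
// Build an HTTP stream context that sends browser-like request headers.
// The header values are illustrative assumptions, not known-good settings.
function buildHttpContext(int $timeoutSeconds = 10)
{
    return stream_context_create([
        'http' => [
            'method'        => 'GET',
            'timeout'       => $timeoutSeconds,
            'user_agent'    => 'Mozilla/5.0 (compatible; ExampleFetcher/1.0)',
            'header'        => "Accept: text/html\r\nAccept-Language: en\r\n",
            // Return the response body even when the status is 4xx/5xx,
            // so the error page can be inspected instead of getting false.
            'ignore_errors' => true,
        ],
    ]);
}

// Fetch a URL and return [status line, body]; body is false on network failure.
function fetchPage(string $url): array
{
    $body = @file_get_contents($url, false, buildHttpContext());
    // The HTTP wrapper fills $http_response_header in the local scope.
    $status = isset($http_response_header[0]) ? $http_response_header[0] : null;
    return [$status, $body];
}

// Inspect the options without touching the network.
print_r(stream_context_get_options(buildHttpContext()));
```

Calling fetchPage($base) would then return the status line alongside the body, which makes it easier to see what the server actually answered.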

fetching DIV that are inside unordered list and list item with simple html dom

元气小坏坏 submitted on 2019-12-11 16:39:58
Question: How can I get the DIVs that are inside a ul and li, as in the code shown below? There are a lot of DIVs inside a single li; what method should I use? The code I have used (below) fetches all li elements, including those I don't need. I need your thoughts on how I can obtain, for example, <div class="single-event__code"> from <li class="single-event live">. See the PHP and CSS code below: <div class="app_content"> <ul class="js-events-container">...</li> <li class="single-event live">...</li> <li class="single-event
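The selection described in the question, a div with class single-event__code nested in an li that carries both single-event and live, can be sketched with an XPath query. This uses PHP's built-in DOM extension rather than Simple HTML DOM. Because the original markup is truncated, the sample HTML, its cell contents and the function name liveEventCodes are assumptions for illustration.

```php
<?php
// Collect the text of every <div class="single-event__code"> that sits
// inside an <li> carrying both the "single-event" and "live" classes.
// liveEventCodes is a hypothetical helper name.
function liveEventCodes(string $html): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // keep parser warnings quiet
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // XPath test for "class attribute contains this whole word".
    $hasClass = 'contains(concat(" ", normalize-space(@class), " "), " %s ")';
    $query = '//li[' . sprintf($hasClass, 'single-event')
           . ' and ' . sprintf($hasClass, 'live') . ']'
           . '//div[' . sprintf($hasClass, 'single-event__code') . ']';

    $codes = [];
    foreach ($xpath->query($query) as $div) {
        $codes[] = trim($div->textContent);
    }
    return $codes;
}

// Made-up markup shaped like the fragment in the question.
$html = '<div class="app_content"><ul class="js-events-container">'
      . '<li class="single-event live">'
      . '<div class="single-event__code">A1</div>'
      . '<div class="single-event__name">first</div></li>'
      . '<li class="single-event">'
      . '<div class="single-event__code">B2</div></li>'
      . '</ul></div>';

print_r(liveEventCodes($html));
```

Only the div inside the "live" list item is matched; the second li lacks that class and is skipped.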

Trouble getting source code from a webpage

南楼画角 submitted on 2019-12-11 15:18:20
Question: I've written a script in PHP to get the HTML content or source code of a webpage, but I could not get it to work. When I execute my script, it opens the page itself. How can I get the HTML elements or source code? This is the script: <?php include "simple_html_dom.php"; function get_source($url) { $ch = curl_init(); curl_setopt($ch, CURLOPT_URL, $url); curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1); curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); $htmlContent = curl_exec($ch); curl_close($ch); $dom = new

File not getting read using file_get_html

可紊 submitted on 2019-12-11 15:13:32
Question: I am using cURL to store a webpage in a txt file and then reading the file to parse its content. For some websites this works fine, but for others file_get_html returns null. I have checked that the txt file is generated with data, but the content cannot be read. For such a website, file_get_html also returns null when I pass it the direct link. I have added a user agent, but it still does not work. Finally, I removed all the content from the file except one div tag; at that time it read the

Simple HTML DOM - replace all occurrences of a certain word - without affecting attributes

眉间皱痕 submitted on 2019-12-11 13:15:03
Question: I was wondering: how would I change all occurrences of a certain word in an HTML document, but only outside of tags? Example: Let's say I want to replace all occurrences of myWordToReplace with <a href="#">myWordToReplace</a>. So this HTML <p data-something="myWordToReplace"> myWordToReplace andSomeOtherText</p> should yield <p data-something="myWordToReplace"> <a href="#">myWordToReplace</a> andSomeOtherText</p> I was trying to achieve this with regex, but it's also a mess - I thought perhaps a DOM
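One way to leave attributes alone is to touch only text nodes, since attribute values are never text nodes in the DOM tree. The sketch below does this with PHP's built-in DOM extension rather than Simple HTML DOM. The function name wrapWordInLinks and the wrapper div are assumptions for illustration, and the input is assumed to be plain ASCII.

```php
<?php
// Wrap each occurrence of $word found in text nodes in <a href="#">...</a>,
// leaving attribute values untouched. wrapWordInLinks is a hypothetical name.
function wrapWordInLinks(string $html, string $word): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // keep parser warnings quiet
    // The wrapper div gives the fragment a single root element; the flags
    // stop libxml from adding <html>/<body> tags and a doctype.
    $doc->loadHTML(
        '<div>' . $html . '</div>',
        LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
    );
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Snapshot the text nodes first, because the loop modifies the tree.
    $textNodes = iterator_to_array($xpath->query('//text()'));

    foreach ($textNodes as $node) {
        $parts = explode($word, $node->nodeValue);
        if (count($parts) < 2) {
            continue; // word not present in this text node
        }
        $fragment = $doc->createDocumentFragment();
        foreach ($parts as $i => $part) {
            if ($i > 0) {
                // Re-insert the word between the split parts, as a link.
                $link = $doc->createElement('a');
                $link->setAttribute('href', '#');
                $link->appendChild($doc->createTextNode($word));
                $fragment->appendChild($link);
            }
            if ($part !== '') {
                $fragment->appendChild($doc->createTextNode($part));
            }
        }
        $node->parentNode->replaceChild($fragment, $node);
    }

    // Serialise the children of the wrapper div back to a string.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$html = '<p data-something="myWordToReplace"> myWordToReplace andSomeOtherText</p>';
echo wrapWordInLinks($html, 'myWordToReplace'), "\n";
```

The data-something attribute keeps its value because the XPath query //text() never visits attributes.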

file_get_html() not working?

眉间皱痕 submitted on 2019-12-11 10:43:21
Question: I am trying to get the title and meta description of a page by providing the URL of the target page, but file_get_html() always returns FALSE. Any suggestions? By the way, I have enabled the PHP extension php_openssl. <?php include("inc/simple_html_dom.inc.php"); $contents = file_get_html("https://www.facebook.com"); if($contents !=FALSE) //always skips if condition { foreach($contents->find('title') as $element) { $title = $element->plaintext; } foreach($contents->find('meta[description]')
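Separately from the FALSE return value, note that a selector of the form meta[description] matches meta elements that have an attribute called description, whereas the description usually lives in <meta name="description" content="...">. In Simple HTML DOM that would be find('meta[name=description]'), reading the content attribute. The sketch below shows the same extraction with PHP's built-in DOM extension. The function name titleAndDescription and the sample page are assumptions for illustration.

```php
<?php
// Extract the <title> text and the content of <meta name="description">.
// titleAndDescription is a hypothetical helper name.
function titleAndDescription(string $html): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // keep parser warnings quiet
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // string(...) yields '' when nothing matches, so no null checks are needed.
    $title = $xpath->evaluate('string(//title)');
    $description = $xpath->evaluate(
        'string(//meta[@name="description"]/@content)'
    );

    return ['title' => trim($title), 'description' => trim($description)];
}

// Made-up sample page standing in for the fetched document.
$html = '<html><head><title> Example page </title>'
      . '<meta name="description" content="A short summary.">'
      . '</head><body></body></html>';

print_r(titleAndDescription($html));
```

Both values come back as empty strings when the page has no title or description, which avoids undefined-variable notices later on.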

Google Scholar profile scrape PHP

一个人想着一个人 submitted on 2019-12-11 10:28:12
Question: I would like to scrape publications from a Google Scholar profile with SimpleHtmlDom. I have a script for scraping the projects, but the problem is that I am only able to scrape the projects that are shown. When I use a URL like this $html->load_file("http://scholar.google.se/citations?user=Sx4G9YgAAAAJ"); only 20 projects are shown. I can increase the number when I change the URL $html->load_file("https://scholar.google.se/citations?user=Sx4G9YgAAAAJ&hl=&view_op=list_works&pagesize=100");