simple-html-dom

getting an error reading simpleDomObject

为君一笑 submitted on 2019-12-13 04:51:34
Question: I have the following template file, named 'test.html':
    <div class='title'>TEST</div>
And I have the following PHP code:
    <?
    include "simplehtmldom/simple_html_dom.php";
    $dom = file_get_html( "test.html" );
    echo $dom->outertext;
    ?>
So far so good; this displays the file test.html. But when I try to change something I get an error:
    <?
    include "simplehtmldom/simple_html_dom.php";
    $dom = file_get_html( "test.html" );
    $dom->find('.title')->innertext = "changed";
    echo $dom->outertext;
    ?>
Warning :
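The warning text is cut off here, but one likely cause is visible in the second snippet: in simple_html_dom, find() called without an index returns an array of matches, so setting ->innertext on that result does not reach any element. The sketch below shows the indexed form. It assumes the library sits at the include path used in the question, and it parses an inline copy of the template with str_get_html() rather than reading test.html.

```php
<?php
// Assumption: the library lives at the same path the question uses.
include "simplehtmldom/simple_html_dom.php";

// Inline copy of the template from the question.
$dom = str_get_html("<div class='title'>TEST</div>");

// With an index, find() returns a single element, or null when nothing matches.
$title = $dom->find('.title', 0);
if ($title !== null) {
    $title->innertext = "changed";
}

// Serialise the modified document back to a string.
$output = $dom->save();
echo $output;
```

To change every matching element instead of only the first, loop over the array that find('.title') returns and set innertext on each item.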

PHP Simple HTML DOM Scrape External URL

帅比萌擦擦* submitted on 2019-12-13 04:42:45
Question: I'm trying to build a personal project of mine, but I'm a bit stuck using the Simple HTML DOM class. What I'd like to do is scrape a website and retrieve all the content, and its inner HTML, that matches a certain class. My code so far is:
    <?php
    error_reporting(E_ALL);
    include_once("simple_html_dom.php");
    //use curl to get html content
    $url = 'http://www.peopleperhour.com/freelance-seo-jobs';
    $html = file_get_html($url);
    //Get all data inside the <div class="item-list">
    foreach(

Parsing an HTML page that has two different formats for the same elements

折月煮酒 submitted on 2019-12-13 04:22:30
Question: The same HTML page contains two different formats of the same content. The first is:
    <div class="gs"><h3 class="gsr"><a href="http://www.example1.com/">title1</a>
The second is:
    <div class="gs"><h3 class="gsr"><span class="gsc"></span><a href="http://www.example2.com/">title2</a>
How can I get the links and titles with one piece of code that handles both formats using simple_html_dom? I've tried this code, but it doesn't work:
    foreach($html->find('h3[class=gsr]') as $docLink){
    $link =
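In both layouts the <a> is a descendant of the h3 with class gsr, so a descendant selector may be enough to cover them with one loop. The sketch below is one way to try that, not necessarily what the original thread settled on. It assumes simple_html_dom.php is on the include path, and the closing </h3> and </div> tags in the sample markup are added here for illustration because the question's snippets are unclosed.

```php
<?php
// Assumption: the library file is reachable on the include path.
include_once "simple_html_dom.php";

// Both formats from the question; closing tags added for illustration.
$html = str_get_html(
    '<div class="gs"><h3 class="gsr"><a href="http://www.example1.com/">title1</a></h3></div>' .
    '<div class="gs"><h3 class="gsr"><span class="gsc"></span><a href="http://www.example2.com/">title2</a></h3></div>'
);

// Match every <a> nested anywhere inside an <h3 class="gsr">,
// whether or not a <span> precedes it.
$results = [];
foreach ($html->find('h3[class=gsr] a') as $anchor) {
    $results[] = [
        'link'  => $anchor->href,
        'title' => $anchor->plaintext,
    ];
}

print_r($results);
```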

PHP - Simple HTML DOM Parser - Table Issue

霸气de小男生 submitted on 2019-12-13 01:54:03
Question: I'm receiving some data from cURL and want to grab information from it so I can save it in another database. The result of the cURL request is a whole HTML page, so I'm using Simple HTML DOM Parser to get what I want. The problem is, I want the values of a table, but I'm getting just the titles. Here's the page:
    <div id="conteudo">
    <body>
    <div id="tab">
    <ul>
    <li><a href="#tabs-a">Test1</a></li>
    <li><a href="#tabs-b">Test2</a></li>
    <li><a href="#tabs-c">Test3</a></li>
    </ul>
    <div id="tabs-1">
    <div>
    <table id="d1

Extract doctype with simple_html_dom

十年热恋 submitted on 2019-12-13 01:18:24
Question: I am using simple_html_dom to parse a website. Is there a way to extract the doctype?
Answer 1: You can use the file_get_contents function to get all the HTML data from the website. For example:
    <?php
    $html = file_get_contents("http://google.com");
    $html = str_replace("\n","",$html);
    $get_doctype = preg_match_all("/(<!DOCTYPE.+\">)<html/i",$html,$matches);
    $doctype = $matches[1][0];
    ?>
Answer 2: You can use $html->find('unknown'). This works, at least, in version 1.11 of the simplehtmldom library. I use it as
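The pattern in the first answer expects the doctype to end in a double quote directly before <html, so it would likely miss a short HTML5 declaration such as <!DOCTYPE html>. A looser variant, using only PHP's built-in preg_match() and no simple_html_dom calls, can be sketched as follows. extract_doctype() is a hypothetical helper name, the inline strings stand in for fetched pages, and the pattern assumes the declaration contains no '>' character (that is, no internal subset).

```php
<?php
/**
 * Return the first <!DOCTYPE ...> declaration found in $html, or null.
 * Hypothetical helper; assumes the declaration itself contains no '>'.
 */
function extract_doctype(string $html): ?string
{
    if (preg_match('/<!DOCTYPE[^>]*>/i', $html, $matches) === 1) {
        return $matches[0];
    }
    return null;
}

// Sample inputs standing in for downloaded pages.
$html5 = "<!DOCTYPE html>\n<html><head></head><body></body></html>";
$html4 = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
       . '"http://www.w3.org/TR/html4/strict.dtd"><html></html>';

$doctype5 = extract_doctype($html5);
$doctype4 = extract_doctype($html4);
$missing  = extract_doctype('<html></html>');

echo $doctype5, "\n";
echo $doctype4, "\n";
var_dump($missing);
```

Returning null when no declaration is present lets the caller tell "no doctype" apart from a match without inspecting array offsets.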

simplehtmldom - SSL operation failed with code 1. OpenSSL Error messages

北城余情 submitted on 2019-12-13 00:55:19
Question: I'm using http://simplehtmldom.sourceforge.net/ and file_get_contents() in my web app. file_get_contents() works fine on localhost, but when I upload the web app to the server (Windows Server 2012 R2) I get this error. How can I fix it?
> Warning: file_get_contents(): SSL operation failed with code 1. OpenSSL Error messages: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed in E:\cfnic.com\includes\class\PHP_Simple_HTML_DOM_Parser.php on line 75
> Warning: file_get
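A "certificate verify failed" message often means that PHP on the server cannot find a CA bundle with which to validate the remote certificate, although the excerpt does not confirm that this is the cause here. One option, sketched below, is to pass file_get_contents() a stream context that names a bundle explicitly. build_verified_context() is a hypothetical helper, and the bundle path and URL are placeholders, not values from the question.

```php
<?php
/**
 * Build a stream context that verifies the peer against an explicit CA bundle.
 * Hypothetical helper; $caFile must point at a real PEM bundle on the server.
 */
function build_verified_context(string $caFile)
{
    return stream_context_create([
        'ssl' => [
            'cafile'           => $caFile, // CA bundle used to validate the peer
            'verify_peer'      => true,    // keep certificate verification on
            'verify_peer_name' => true,    // also check the host name
        ],
    ]);
}

// Placeholder path (an assumption): adjust to where the bundle actually lives.
$context = build_verified_context('C:/php/extras/ssl/cacert.pem');
$options = stream_context_get_options($context);

// Usage, left commented out because it needs network access and a real bundle:
// $html = file_get_contents('https://example.com/', false, $context);
```

Setting openssl.cafile in php.ini is an alternative that applies the bundle to every request. Turning verify_peer off would also silence the warning, but it removes the protection that verification provides.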

Simple HTML Dom - find text between divs

倾然丶 夕夏残阳落幕 submitted on 2019-12-12 09:07:25
Question: I need to extract the text in between divs here ("The third of four...") using the Simple HTML DOM PHP library. I have tried everything I can think of! next_sibling() returns the comment, and next_sibling()->next_sibling() returns the <br/> tag. Ideally I would like to get all the text from the end of the first comment to the next </div> tag.
    <div class="left"> Bla-bla..
    <div class="float">Bla-bla... </div><!--/end of div.float-->
    <br />The third of four performances in the Society's Morning

Can't separate cells properly with simplehtmldom

允我心安 submitted on 2019-12-12 04:48:26
Question: I am trying to write a web scraper. I want to get all the cells in a row. The row before the one I want has THOROUGHBRED MEETINGS as its plain-text value. I can successfully get this row, but I can't figure out how to get the next row's children, which are the cells (<td> tags).
    if ($foundTag = FindTagByText("THOROUGHBRED MEETINGS", $html)) {
        $cell = $foundTag->parent();
        $row = $cell->parent();
        $nextRow = $row->next_sibling();
        echo "Row: ".$row->plaintext."<br />\n";
        echo "Next Row: ".
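Once $nextRow holds the following <tr>, its cells should be reachable with find('td') (or children()) on that element. The sketch below illustrates this on a made-up table. FindTagByText() from the question is not shown in the excerpt, so the marker row is located here with a plain loop instead. The include path and all table contents are assumptions.

```php
<?php
// Assumption: the library file is reachable on the include path.
include_once "simple_html_dom.php";

// Made-up table for illustration only.
$html = str_get_html(
    '<table>' .
    '<tr><td><b>THOROUGHBRED MEETINGS</b></td></tr>' .
    '<tr><td>Track A</td><td>Track B</td><td>Track C</td></tr>' .
    '</table>'
);

$cells = [];
foreach ($html->find('tr') as $row) {
    // Look for the marker row by its plain-text content.
    if (strpos($row->plaintext, 'THOROUGHBRED MEETINGS') !== false) {
        $nextRow = $row->next_sibling();
        if ($nextRow !== null) {
            // Collect the text of each <td> in the row after the marker.
            foreach ($nextRow->find('td') as $cell) {
                $cells[] = trim($cell->plaintext);
            }
        }
        break;
    }
}

print_r($cells);
```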

How to work around PHP advanced html dom's conversion of entities?

荒凉一梦 submitted on 2019-12-12 04:39:20
Question: How can I work around advanced_html_dom.php str_get_html's conversion of HTML entities, short of applying htmlentities() to every element's content? Despite http://archive.is/YWKYp#selection-971.0-979.95
    The goal of this project is to be a DOM-based drop-in replacement for PHP's simple html dom library. ... If you use file/str_get_html then you don't need to change anything.
I find on
    include 'simple_html_dom.php';
    $set = str_get_html('<html><title> </title></html>');
    echo ($set->find('title',0)

Removing unwanted elements from table simple_html_dom

社会主义新天地 submitted on 2019-12-12 03:45:49
Question: I am fetching a page that contains some style tags, a table and other non-vital content. I'm storing this in a transient, and fetching it all with AJAX:
    $result_match = file_get_contents( 'www.example.com' );
    set_transient( 'match_results_details', $result_match, 60 * 60 * 12 );
    $match_results = get_transient( 'match_results_details' );
    if ( $match_results != '') {
        $html = new simple_html_dom();
        $html->load($match_results);
        $out = '';
        $out .= '<div class="match_info_container">';
        if (