Searching XML Feeds for Keywords

Submitted by 非 Y 不嫁゛ on 2019-12-04 18:15:19

35-40 RSS feeds are a lot of requests for one script to fetch and parse all at once. Your bottleneck is most likely the requests, not the parsing, so you should separate the concerns. Have one script that requests the feeds one at a time, one every minute or so, and stores the results locally. Then have another script parse those temporary results, save the matches, and remove the files every 15-30 minutes.
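As a rough sketch of that split, the two scripts could share a spool directory: the fetcher writes one file per feed and the parser drains whatever is there. The function names, the directory layout, and the use of an MD5 hash of the URL as a file name are all assumptions made for illustration, not part of the original setup.

```php
<?php
// Hypothetical helpers: the names and the spool-directory layout are assumptions.

/**
 * Fetcher side: store one downloaded feed body as a temporary file.
 * Returns the path that was written.
 */
function storeFeedSnapshot(string $spoolDir, string $feedUrl, string $xml): string
{
    if (!is_dir($spoolDir)) {
        mkdir($spoolDir, 0777, true);
    }
    // One file per feed URL; a newer fetch overwrites the older snapshot.
    $path = $spoolDir . '/' . md5($feedUrl) . '.xml';
    file_put_contents($path, $xml, LOCK_EX);
    return $path;
}

/**
 * Parser side: pass every stored snapshot to $handler, then delete it.
 * Returns the number of snapshots processed.
 */
function drainSnapshots(string $spoolDir, callable $handler): int
{
    $count = 0;
    foreach (glob($spoolDir . '/*.xml') ?: [] as $path) {
        $handler(file_get_contents($path));
        unlink($path);
        $count++;
    }
    return $count;
}
```

The fetcher job would call storeFeedSnapshot() for one feed per run, and the parser job would call drainSnapshots() with a callback that does the keyword search and database insert.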

You could use XPath to search the XML directly... Something like:

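// Load the raw feed XML into a DOM tree
$dom = new DOMDocument();
$dom->loadXML($feedXml);
$xpath = new DOMXPath($dom);

// Union of items whose title or description contains "foo"
$query = '//item[contains(title, "foo")] | //item[contains(description, "foo")]';
$matchingNodes = $xpath->query($query);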

$matchingNodes will then be a DOMNodeList of all the matching item nodes, which you can save in the database...
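To make the "save those in the database" step more concrete, here is a minimal sketch that turns the node list into plain arrays ready for an insert. The function name findMatchingItems and the choice of title and link as the saved fields are assumptions; it also assumes the keyword contains no double quote, and the match is case-sensitive because contains() is.

```php
<?php
/**
 * Sketch: collect the title and link of every <item> whose title or
 * description contains $keyword. Hypothetical helper, not a library API.
 * Assumes $keyword contains no double quote.
 */
function findMatchingItems(string $feedXml, string $keyword): array
{
    $dom = new DOMDocument();
    $dom->loadXML($feedXml);
    $xpath = new DOMXPath($dom);

    $query = '//item[contains(title, "' . $keyword . '")]'
           . ' | //item[contains(description, "' . $keyword . '")]';

    $rows = [];
    foreach ($xpath->query($query) as $item) {
        // Relative expressions are evaluated against the matched <item> node.
        $rows[] = [
            'title' => $xpath->evaluate('string(title)', $item),
            'link'  => $xpath->evaluate('string(link)', $item),
        ];
    }
    return $rows;
}
```

Each returned row can then be bound to a prepared INSERT statement. An item that matches in both title and description appears only once, because an XPath union does not contain duplicate nodes.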

So to adjust this to your real-world example, you could either build the query to do all the searching for you in one shot:

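$parts = array();
foreach ($keywords as $keyword) {
    // Assumes $keyword contains no double quote, which would break the XPath string literal
    $parts[] = '//item[contains(title, "' . $keyword . '")]';
    $parts[] = '//item[contains(description, "' . $keyword . '")]';
}
// Join every per-keyword expression into a single union query
$query = implode(' | ', $parts);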

Or just re-query for each keyword... Personally, I'd build one giant query, since then all the matching is done in compiled C code (and hence should be more efficient than looping in PHP land and aggregating the results there)...
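Two things to watch with the one-shot query: contains() is case-sensitive, and a keyword containing a quote character breaks a hand-built string literal. The sketch below is one possible way to handle both in XPath 1.0, using translate() to lower-case the ASCII letters and concat() to quote awkward keywords. Both function names are hypothetical, and it assumes $keywords is non-empty.

```php
<?php
/** Quote an arbitrary string as an XPath 1.0 literal (hypothetical helper). */
function xpathLiteral(string $s): string
{
    if (strpos($s, '"') === false) {
        return '"' . $s . '"';
    }
    if (strpos($s, "'") === false) {
        return "'" . $s . "'";
    }
    // Both quote types present: glue double-quote-free pieces with concat().
    return 'concat("' . implode('", \'"\', "', explode('"', $s)) . '")';
}

/**
 * Build one union query matching any keyword in title or description,
 * ignoring case for ASCII letters only. Assumes $keywords is non-empty.
 */
function buildKeywordQuery(array $keywords): string
{
    $upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $lower = 'abcdefghijklmnopqrstuvwxyz';
    $parts = [];
    foreach ($keywords as $keyword) {
        $needle = xpathLiteral(strtolower($keyword));
        foreach (['title', 'description'] as $field) {
            $parts[] = sprintf(
                '//item[contains(translate(%s, "%s", "%s"), %s)]',
                $field, $upper, $lower, $needle
            );
        }
    }
    return implode(' | ', $parts);
}
```

The result can be passed straight to $xpath->query(). Non-ASCII letters are left untouched by this translate() call, so keywords that rely on accented or non-Latin case folding would need a different approach.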
