rss

how to check uniqueness (non duplication) of a post in an rss feed

丶灬走出姿态 submitted on 2019-11-29 11:47:39
When retrieving posts from an RSS feed and caching or saving them in a database, how can I (1) determine that an item is the same post (for example, when typos are fixed in the feed, or when the title or the date changes), and (2) find feeds that talk about the same topic (for example, the same story from different sources)? Are there any best practices for these things? Thanks a lot.

Some RSS feeds have a guid element as an identifier. Posts with a shared guid are probably duplicates. Some RSS feeds just put the URL in there to indicate that a post's uniqueness is tied to its URL. Note that if the URL matches
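The guid-based check described in the answer could be sketched as follows. This is a minimal illustration, not an established recipe: the function names `item_key` and `new_items`, the sample feed, and the fallback order (guid, then link, then a hash of title and description) are all assumptions made for the example.

```python
import hashlib
import xml.etree.ElementTree as ET

# Hypothetical sample feed: the first two items share a guid (a typo was fixed).
SAMPLE_FEED = """<rss version="2.0"><channel>
  <item><guid>post-1</guid><title>Helo world</title></item>
  <item><guid>post-1</guid><title>Hello world</title></item>
  <item><link>http://example.com/b</link><title>Other post</title></item>
</channel></rss>"""


def item_key(item):
    """Return an identity key for one <item> element.

    Preference order (an assumption of this sketch): guid, then link,
    then a hash of title + description as a last resort.
    """
    for tag in ("guid", "link"):
        text = (item.findtext(tag) or "").strip()
        if text:
            return f"{tag}:{text}"
    content = (item.findtext("title") or "") + "\n" + (item.findtext("description") or "")
    return "hash:" + hashlib.sha256(content.encode("utf-8")).hexdigest()


def new_items(rss_xml, seen_keys):
    """Return the items whose key is not yet in seen_keys, recording their keys."""
    fresh = []
    for item in ET.fromstring(rss_xml).iter("item"):
        key = item_key(item)
        if key not in seen_keys:
            seen_keys.add(key)
            fresh.append(item)
    return fresh
```

In a real application `seen_keys` would be backed by the database (for example a unique column) rather than an in-memory set. The hash fallback only catches exact repeats; an edited title would produce a new key, so it does not solve the "typo fixed" case for feeds without a guid or link.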

Images in RSS feed

落花浮王杯 submitted on 2019-11-29 10:56:12
Question: Whenever I see images in an RSS feed, they are embedded in CDATA rather than surrounded by tags. In my feed, I would like the images to show up without doing that. Whether in the browser, in a feed reader (Bloglines), or through FeedBurner, the following structure does not show images, although it is valid RSS. Does anyone have experience with this?

    <item>
      <category>Viewbook</category>
      <title>Widget</title>
      <description>Learn more about our widgets.</description>
      <link>http://www.widget.com

Feedparser.parse() 'SSL: CERTIFICATE_VERIFY_FAILED'

瘦欲@ submitted on 2019-11-29 10:36:00
I'm having an SSL issue when parsing an HTTPS RSS feed with feedparser, and I don't know what to do, as I can't find any documentation on this error in relation to feedparser:

    >>> import feedparser
    >>> feed = feedparser.parse(rss)
    >>> feed
    {'feed': {}, 'bozo': 1, 'bozo_exception': URLError(SSLError(1, u'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)'),), 'entries': []}
    >>> feed["items"]
    []
    >>>

Thank you, cmidi, for the answer, which was to "monkey patch" using ssl._create_default_https_context = ssl._create_unverified_context:

    import feedparser
    import ssl
    if hasattr(ssl
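A minimal sketch of the monkey patch named in the answer, wrapped in a helper so that it is applied explicitly. The function name `allow_unverified_https` is an assumption for this example; the two `ssl` attributes are underscore-prefixed (private) names, so their availability can vary between Python versions, which is why the sketch checks with `hasattr` first. The patch only affects HTTPS requests made through the standard library's `urllib`/`http.client`, and it turns certificate verification off for the whole process, so it is best limited to feeds you trust.

```python
import ssl


def allow_unverified_https():
    """Make standard-library HTTPS requests skip certificate verification.

    Returns True if the patch was applied, or False when this Python
    version does not provide ssl._create_unverified_context.
    """
    if hasattr(ssl, "_create_unverified_context"):
        # Replace the factory used to build the default HTTPS context.
        ssl._create_default_https_context = ssl._create_unverified_context
        return True
    return False


# Typical use (feedparser is a third-party package, not imported here):
#   allow_unverified_https()
#   feed = feedparser.parse(url)
```

If the underlying problem is a missing or outdated CA bundle on the machine, fixing that is the safer route, since it keeps verification enabled.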

craigslist rss feed

爷,独闯天下 submitted on 2019-11-29 08:48:33
I'm trying to parse the data from a Craigslist RSS feed. This is the feed URL: http://www.craigslist.org/about/best/all/index.rss. I'm using jFeed, and my code is given below:

    jQuery(function() {
        jQuery.getFeed({
            url: 'proxy.php?url=http://www.craigslist.org/about/best/all/index.rss',
            success: function(feed) {
                jQuery('#result').append('<h2>' + feed.title + '</h2>');
            }
        });
    });

However, neither the feed title nor any other property of the feed is displayed. If I just try to print the feed to the screen, I get 'Object Object', which means it correctly returned the feed. Anybody know what I am

Google Feed Loader API ignoring XML attributes

≯℡__Kan透↙ submitted on 2019-11-29 08:43:52
Google's feed loader appears to ignore attributes when converting to JSON. I'm using jQuery to grab a feed via AJAX. The actual RSS XML feed can be seen here, and the response from the AJAX call can be seen here. I need to access the url attribute of the <enclosure> tags, but neither appears in the response. For reference, the code I am using is:

    function getFeed(url) {
        url = 'http://ajax.googleapis.com/ajax/services/feed/load?v=1.0&num=10&callback=?&q=' + encodeURIComponent(url);
        $.ajax({
            type: 'GET',
            url: url,
            dataType: 'jsonp',
            cache: false,
            success: function(d) {
                alert(JSON.stringify

How to check whether RTMP or HLS URLs exist or return a 404 error in Swift

匆匆过客 submitted on 2019-11-29 08:27:06
I need to parse some data from an RSS feed and open the related links in Swift 2. For example, I want to check whether this link is valid:

    rtmp://185.23.131.187:1935/live/jomhori1

or this one:

    http://185.23.131.25/hls-live/livepkgr/_defint_/liveevent/livestream.m3u8

My code to check whether the URL is valid:

    let urlPath: String = "http://185.23.131.25/hls-live/livepkgr/_defint_/liveevent/livestream.m3u8"
    let url: NSURL = NSURL(string: urlPath)!
    let request: NSURLRequest = NSURLRequest(URL: url)
    let response: AutoreleasingUnsafeMutablePointer<NSURLResponse?>=nil
    var valid : Bool!
    do { _ =

Error tolerant XML reader

自闭症网瘾萝莉.ら submitted on 2019-11-29 08:05:37
Does anyone have, make, or sell an error-tolerant XML reader for .NET? Yes, I know: XML isn't designed to have errors in it and should be rejected if it's not valid, and so on. But sadly the real world is imperfect and developers do make mistakes, and I still want to be able to read their feeds even if I'm missing the odd element here or there because it wasn't encoded properly or had some other error in it. So please, no "fix the source" or "reject it" answers. Does anyone have a component that can recover from and handle common mistakes in XML files?

Look around HTML Parser, 'cause html is

A simple RSS reader in C#

£可爱£侵袭症+ submitted on 2019-11-29 06:09:42
A simple RSS reader in C#, adapted from a VB version. Thanks to aowind for the technical support! Source code:

    using System;
    using System.Drawing;
    using System.Collections;
    using System.ComponentModel;
    using System.Windows.Forms;
    using System.Data;
    using System.Xml;
    using System.IO;
    using System.Threading;

    namespace YuLRSSReader
    {
        /// <summary>
        /// Summary description for Form1.
        /// </summary>
        public class Form1 : System.Windows.Forms.Form
        {
            private System.Windows.Forms.Label label1;
            private System.Windows.Forms.Label label2;
            private System.Windows.Forms.Label label3;
            private System.Windows.Forms.TextBox textBox1;
            private System.Windows.Forms.Button button1;
            private System

Performance analysis | Analyzing Linux memory usage

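99封情书 submitted on 2019-11-29 05:36:41
This post covers two ways to inspect memory usage on Linux: using commands such as ps and top, and reading the files under /proc/[pid]/. It briefly introduces how to use the commands and what some of their parameters mean, and describes the contents of the files under /proc/[pid]/ in more detail. The content comes from Google searches and my own notes; corrections are welcome.

Ways to inspect memory on Linux

There are several ways to inspect memory on Linux, for example the commands ps, top, free, and pmap, or the /proc filesystem. In most cases ps, top, pmap, and free are enough. If you need a more detailed and precise picture of memory usage for the whole machine or for a single process, use /proc.

Using commands

free: shows information about available and used system memory
ps: shows process information, static (the current state)
top: shows process information, dynamic
pstree: shows the process tree
pmap: shows process information for a given process ID

ps vs top

The ps command provides a one-off snapshot: it shows the system's process information as it was at the moment the command ran. The top command reflects dynamic process information, refreshed every 3 s by default. Both ps and top read process state from the /proc directory; the kernel exposes all kinds of useful information about the current processes in this pseudo-directory.

Common ps commands:

ps -aux: shows all processes on the system
ps -l: shows only the processes related to your own bash session

For a detailed explanation of the top command, see http:/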
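As an illustration of the /proc approach, the sketch below parses the text of a /proc/[pid]/status file and extracts a memory field such as VmRSS (resident set size, reported in kB). The choice of the status file and the helper names `parse_proc_status` and `memory_kib` are assumptions for this example, not something taken from the post.

```python
import os


def parse_proc_status(text):
    """Parse the 'Key:<tab>value' lines of a /proc/[pid]/status file into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            fields[key.strip()] = value.strip()
    return fields


def memory_kib(fields, name):
    """Return a memory field such as 'VmRSS' as an integer number of kB, or None."""
    value = fields.get(name)
    if value is None:
        return None
    number, _, unit = value.partition(" ")
    return int(number) if unit.strip() == "kB" else None


if __name__ == "__main__":
    # Only meaningful on Linux, where /proc exists.
    path = f"/proc/{os.getpid()}/status"
    if os.path.exists(path):
        with open(path) as handle:
            status = parse_proc_status(handle.read())
        print("VmRSS (kB):", memory_kib(status, "VmRSS"))
```

Keeping the parsing separate from the file access makes the helpers usable on text captured from another machine as well.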

Official Facebook RSS feed for a Page

回眸只為那壹抹淺笑 submitted on 2019-11-29 03:05:30
Question: Many people have described how to obtain the RSS data feed for a Facebook page. For example: http://ahrengot.com/tutorials/facebook-rss-feed/ The following URL provides the feed for Coca-Cola's page: http://www.facebook.com/feeds/page.php?format=rss20&id=40796308305 However, I cannot find any documentation on facebook.com that describes this interface. Does anyone know whether this interface is officially supported by Facebook? I don't want to reference it in my code only to have it