How to Discover RSS Feeds for a Given URL

Found something that I wanted:

Google's AJAX Feed API has both a load feed function and a lookup feed function (Docs here).

a) Load feed provides the feed (and feed status) in JSON

b) Lookup feed provides the RSS feed for a given URL

There's also a find feed function that searches for RSS feeds based on a keyword.

I'm planning to use this with jQuery's $.getJSON.

How to find RSS feed of a particular website?

You might be able to find it by looking at the source of the home page (or blog). Look for a line that looks like this:

<link rel="alternate" type="application/rss+xml" title="RSS Feed" href="http://example.org/rss" />

The href value will be where the RSS is located.

How to get the feed URL(s) from a website?

It is not common practice for a website to return its RSS feed in response to a request for the home page that asks for the application/rss+xml MIME type in the Accept header. The Mozilla documentation you linked describes an approach I have never seen in many years of working with RSS as a developer.

A more established and widely adopted method for a site to identify its RSS feed is a technique called RSS Autodiscovery. Open the site's home page and look for this tag in the HEAD section:

<link rel="alternate" type="application/rss+xml" title="RSS"
href="http://feeds.example.com/rss-feed">

The type attribute can be any of the MIME types for RSS, Atom or JSON Feed feeds.
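
As a rough illustration of autodiscovery, here is a minimal Python sketch that uses only the standard library. The names FeedLinkParser and discover_feeds are made up for this example, and the set of accepted MIME types is an assumption that can be widened or narrowed. The HTML is assumed to be already downloaded into a string.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types treated as feeds in this sketch (an assumption; extend as needed).
FEED_TYPES = {
    "application/rss+xml",
    "application/atom+xml",
    "application/feed+json",
}


class FeedLinkParser(HTMLParser):
    """Collects href values of <link rel="alternate"> tags with a feed type."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        mime = attr.get("type", "").strip().lower()
        href = attr.get("href", "").strip()
        if "alternate" in rels and mime in FEED_TYPES and href:
            # Resolve relative hrefs against the URL of the page.
            self.feeds.append(urljoin(self.base_url, href))


def discover_feeds(html, base_url):
    """Return the feed URLs advertised by the given HTML document."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds
```

Calling discover_feeds(html, page_url) returns a list of absolute feed URLs, which is empty when the page advertises none.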

To find whether the given URL is an RSS Feed URL or not

There are a few things you can try, off the top of my head:

  1. See what Content-Type the server returns for the given URL. However, this may not be definitive, since a server does not necessarily return the correct header.
  2. Try to parse the content of the URL as RSS and see if it is successful - this is likely the only definitive proof that a given URL is an RSS feed.
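
Both checks can be sketched in Python with the standard library. The function names content_type_hint and feed_format are invented for this example, and treating generic XML types as "maybe" is an assumption rather than a rule. The header value and the response body are assumed to have been fetched already.

```python
import xml.etree.ElementTree as ET

FEED_CONTENT_TYPES = {"application/rss+xml", "application/atom+xml"}
# Servers often label feeds with a generic XML type, so these only count as "maybe".
GENERIC_XML_TYPES = {"application/xml", "text/xml"}


def content_type_hint(header_value):
    """Return 'feed', 'maybe' or 'no' for a Content-Type header value."""
    mime = header_value.split(";", 1)[0].strip().lower()
    if mime in FEED_CONTENT_TYPES:
        return "feed"
    if mime in GENERIC_XML_TYPES:
        return "maybe"
    return "no"


def feed_format(body):
    """Return 'rss' or 'atom' if the body parses as such a feed, else None."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return None
    # Drop a namespace prefix such as {http://www.w3.org/2005/Atom}feed.
    local_name = root.tag.rsplit("}", 1)[-1].lower()
    return {"rss": "rss", "feed": "atom"}.get(local_name)
```

The header check is only a hint, whereas feed_format looks at the document itself. For content from untrusted servers, a hardened XML parser would be a safer choice than the default one.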

Extract RSS Feed url from

In general, a website that offers RSS feed(s) indicates this in the HTML head of at least the home page, and some do so on every single page.

Here is an example of such a link tag for an RSS feed:

<link href="http://snapwebsites.org/rss.xml"
title="Snap! A C++ Open Source CMS RSS"
type="application/rss+xml"
rel="alternate">

Note that the type varies slightly between websites. For example, some websites use text instead of application as the first part of the type (which is wrong, but XML is text after all). Atom feeds use application/atom+xml. A site may also offer both formats.

If that's not available, then you'd have to check the home page or other pages for anchor links to an RSS feed, which means:

  • Parse the HTML
  • Look for anchors
  • Read the href attribute
  • Check the destination to see whether it returns an XML file
  • If you get an XML file (starts with <?xml ...) then check the root tag:
      1. 'rss' -- RSS format (version is an attribute)
      2. 'feed' -- Atom format
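
The first three steps of this fallback could look roughly like the following Python sketch. AnchorCollector and candidate_links are hypothetical names, and the HTML is assumed to be available as a string. Fetching each candidate and inspecting its root tag is left out here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class AnchorCollector(HTMLParser):
    """Collects absolute http(s) URLs from <a href="..."> tags, in page order."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        url = urljoin(self.base_url, href.strip())
        # Skip mailto:, javascript: and similar schemes, and avoid duplicates.
        if urlparse(url).scheme in ("http", "https") and url not in self.urls:
            self.urls.append(url)


def candidate_links(html, base_url):
    """Return the de-duplicated anchor targets found in the HTML document."""
    collector = AnchorCollector(base_url)
    collector.feed(html)
    return collector.urls
```

Every URL returned by candidate_links is only a candidate. Whether it is a feed can be decided only after requesting it and looking at what comes back.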

I have an example on the following page that includes the <link ...> tag in the header:

http://snapwebsites.org/implementation/feature-requirements/feed-feature-core-atom-rss-20-etc

I have to say, without that link, it is quite a bit harder to find the RSS feeds. That being said, on many websites the feed files use an extension (.rss, .atom, .xml), which can be used to simplify the search. Yet, more and more, feed URLs look like directory names: .../blah or .../foo could be either a standard HTML page or a feed. In that case the only way to tell is to read the file at the destination and check its format. The Content-Type of the HTTP reply should also be application/rss+xml or application/atom+xml, like the type=... attribute of the link tag.
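
One way to use the extension as a hint, without relying on it, is to try the feed-like URLs first. The sketch below assumes a list of candidate URLs already exists. The function name feed_like_first is made up for this example.

```python
from urllib.parse import urlparse

FEED_EXTENSIONS = (".rss", ".atom", ".xml")


def feed_like_first(urls):
    """Order URLs so that those whose path ends in a feed-like extension come first."""

    def rank(url):
        path = urlparse(url).path.lower()
        return 0 if path.endswith(FEED_EXTENSIONS) else 1

    # sorted() is stable, so the original order is kept within each group.
    return sorted(urls, key=rank)
```

Nothing is discarded, so extension-less URLs such as .../blah still get checked, just later.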


As a side note, although it is very unlikely (I have not really seen it on a live website), the Link: ... HTTP header can carry the same links as the <link ...> tag found in the HTML head. If you have access to the HTTP headers (here is how to do it in PHP), it is worth checking whether one of those links points to an RSS feed.
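
For illustration, a simplified Python sketch of reading feed links out of a Link header value follows. The name feeds_from_link_header is invented for this example. The parsing is deliberately naive and would break on a quoted parameter value that contains a "<" character, so treat it as a starting point only.

```python
import re

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}

# One link: <target> followed by everything up to the next "<".
LINK_VALUE = re.compile(r'<([^>]*)>([^<]*)')
# One parameter: ;name="quoted value" or ;name=bare-value
PARAM = re.compile(r';\s*([\w*-]+)\s*=\s*(?:"([^"]*)"|([^;,\s"]*))')


def feeds_from_link_header(value):
    """Return targets with rel=alternate and a feed MIME type from a Link header value."""
    feeds = []
    for target, params in LINK_VALUE.findall(value):
        attrs = {}
        for name, quoted, bare in PARAM.findall(params):
            # Keep the first occurrence of each parameter.
            attrs.setdefault(name.lower(), quoted or bare)
        rels = attrs.get("rel", "").lower().split()
        if "alternate" in rels and attrs.get("type", "").lower() in FEED_TYPES:
            feeds.append(target)
    return feeds
```

The returned targets may be relative, in which case they still need to be resolved against the URL that was requested.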

Find feed rss for a given URL: Feedbag error?

I have not yet found out why this is occurring. Since I am not going to update this question or answer in the future, you can check the current status of my issue on GitHub by clicking here.

I hope the developer has seen my issue, but so far I have not received any tips.