Reading/Parsing RSS feed using ROME

ROME is an open source Java library for parsing, generating and publishing RSS and Atom feeds. With ROME you can read any available RSS or Atom feed without worrying about its format or version. The core library depends on the JDOM XML parser.

Atom is another feed format along the same lines as RSS, but it differs in some aspects, such as its protocol and payloads.

RSS is a way to share and publish content, which can be anything from news to small pieces of information. Its main component is XML: you publish your content on the web as XML, and you are free to subscribe to whatever you like from others.

Why use Rome instead of other available readers

The ROME project started with the motivation of ‘ESCAPE’, where each letter stands for:
E – Easy to use. Just give it a URL and forget about the feed's type and version; you get the output in the format you like.
S – Simple. The structure is simple, and the complications are all hidden from developers.
C – Complete. It handles all versions of RSS and Atom feeds.
A – Abstract. It provides an abstraction over the various syndication specifications.
P – Powerful. Don’t worry about the format; let ROME handle it.
E – Extensible. It is built on a simple pluggable architecture so that new formats can be added later.

Dependency

ROME has the following dependencies:
J2SE 1.4+, JDOM 1.0, and the jar files rome-0.8.jar, purl-org-content-0.3.jar and jdom.jar
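
If one of these jars is missing, the program compiles but fails at runtime with a NoClassDefFoundError (a reader reports exactly this for jdom.jar in the comments below). The following is a minimal sketch of a classpath check that uses only the JDK. The class RomeClasspathCheck and its isPresent method are hypothetical helpers written for this illustration and are not part of ROME; the two class names it looks up are taken from the imports and the stack trace that appear on this page.

```java
// Hypothetical helper, not part of ROME: reports whether the classes
// that the sample program needs are visible on the classpath.
class RomeClasspathCheck {

    // Returns true if the named class can be loaded, false otherwise.
    static boolean isPresent(String className) {
        try {
            // 'false' means: load the class but do not run its static initializers
            Class.forName(className, false, RomeClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] required = {
            "com.sun.syndication.io.SyndFeedInput", // shipped in the ROME jar
            "org.jdom.input.JDOMParseException"     // shipped in jdom.jar
        };
        for (String name : required) {
            System.out.println(name + " -> " + (isPresent(name) ? "found" : "MISSING"));
        }
    }
}
```

Run it with the same classpath as your feed reader. Any line marked MISSING points to a jar that still has to be added.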

Using Rome to read a Syndication Feed

Assuming you have all the required jar files, we will start by reading an RSS feed. ROME represents syndication feeds (RSS and Atom) as instances of the com.sun.syndication.feed.synd.SyndFeed interface.

ROME includes parsers that process syndication feeds into SyndFeed instances. The SyndFeedInput class manages these parsers and uses the correct one for the feed being processed. The developer does not need to worry about selecting the right parser for a syndication feed; SyndFeedInput takes care of it by peeking at the structure of the feed. All it takes to read a syndication feed using ROME are the following two lines of code:

SyndFeedInput input = new SyndFeedInput();
SyndFeed feed = input.build(new XmlReader(feedUrl));

Now that you have the SyndFeed object, getting the details of the feed is simple.
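
To picture what "peeking at the feed structure" can mean, here is a simplified sketch that guesses the feed format from the root element of the XML document. It is not ROME's actual implementation; the class FeedFormatSniffer and its detect method are hypothetical names invented for this illustration, and the sketch uses only the XML parser bundled with the JDK.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of ROME: guesses the feed format
// by looking only at the root element of the document.
class FeedFormatSniffer {

    static final String ATOM_NS = "http://www.w3.org/2005/Atom";
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    static String detect(String xml) {
        Element root;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed to read namespace URIs
            root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Not a well-formed XML document", e);
        }

        String name = root.getLocalName();
        String namespace = root.getNamespaceURI();

        if ("rss".equals(name)) {
            // RSS 0.9x and 2.0 carry their version in an attribute
            return "RSS " + root.getAttribute("version");
        }
        if ("feed".equals(name) && ATOM_NS.equals(namespace)) {
            return "Atom 1.0";
        }
        if ("RDF".equals(name) && RDF_NS.equals(namespace)) {
            return "RDF-based RSS";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println(detect("<rss version=\"2.0\"><channel/></rss>"));
        System.out.println(detect("<feed xmlns=\"" + ATOM_NS + "\"/>"));
    }
}
```

With ROME you never write this kind of check yourself, since SyndFeedInput selects the parser for you.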

The sample code is as follows.

package com.infosys.hanumant.rome;

import java.net.URL;
import java.util.Iterator;

import com.sun.syndication.feed.synd.SyndEntry;
import com.sun.syndication.feed.synd.SyndFeed;
import com.sun.syndication.io.SyndFeedInput;
import com.sun.syndication.io.XmlReader;

/**
 * Reads a feed from a URL and prints its title and the title of each entry.
 *
 * @author Hanumant Shikhare
 */
public class Reader {

  public static void main(String[] args) throws Exception {

    URL url = new URL("http://viralpatel.net/blogs/feed");
    XmlReader reader = null;

    try {
      // XmlReader detects the character encoding of the feed
      reader = new XmlReader(url);
      // SyndFeedInput picks the right parser for the feed format
      SyndFeed feed = new SyndFeedInput().build(reader);
      System.out.println("Feed Title: " + feed.getTitle());

      // Print the title of every entry in the feed
      for (Iterator i = feed.getEntries().iterator(); i.hasNext();) {
        SyndEntry entry = (SyndEntry) i.next();
        System.out.println(entry.getTitle());
      }
    } finally {
      // Always release the underlying stream
      if (reader != null) {
        reader.close();
      }
    }
  }
}

Understanding the Program

Initialize the URL object with the URL of the RSS or Atom feed. Next, create an XmlReader object, which takes the URL object as its constructor argument. Finally, obtain the SyndFeed object by calling the build(reader) method of SyndFeedInput, which takes the XmlReader object as its argument. The feed title and the entry titles can then be read from the SyndFeed object.
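
XmlReader can also be constructed from a java.net.URLConnection instead of a URL. This is useful when a server rejects the default Java user agent or when you want explicit timeouts, two problems raised in the comments below. The following is a minimal sketch of preparing such a connection with JDK classes only. The class FeedConnections, its open method, the user agent string "MyFeedReader/1.0" and the 10-second timeout are all assumptions made for this illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper, not part of ROME: prepares a connection
// with a custom User-Agent header and explicit timeouts.
class FeedConnections {

    static URLConnection open(String feedUrl, String userAgent, int timeoutMillis) {
        try {
            // openConnection() only creates the object; nothing is sent yet
            URLConnection conn = new URL(feedUrl).openConnection();
            conn.setRequestProperty("User-Agent", userAgent);
            conn.setConnectTimeout(timeoutMillis); // limit for establishing the connection
            conn.setReadTimeout(timeoutMillis);    // limit for each blocking read
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        URLConnection conn = open("http://viralpatel.net/blogs/feed", "MyFeedReader/1.0", 10000);
        System.out.println(conn.getRequestProperty("User-Agent"));
        // With ROME on the classpath, the connection can be handed to the reader:
        // SyndFeed feed = new SyndFeedInput().build(new XmlReader(conn));
    }
}
```

Setting the timeouts on the connection itself applies them to this request only, whereas the sun.net.client.* system properties set JVM-wide defaults.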

References

https://rome.dev.java.net/
http://www.intertwingly.net/wiki/pie/Rss20AndAtom10Compared

http://www.rss-specifications.com



18 Comments

  • Venkatesan Padmanabhan 28 June, 2009, 15:03

    Thanks a lot for this… It really saved me after I had struggled for hours with other parsers.

  • Viral Patel 28 June, 2009, 16:45

    You're welcome, Venkatesan. :)

  • dinesh 13 July, 2009, 13:31

    Hi, I get an error when reading RSS feeds generated by Google Groups. I guess Google is blocking requests from applications other than browsers. Can you please help me with this?

  • Viral Patel 13 July, 2009, 14:15

    Hi Dinesh,
    You can set the User-Agent header of your HTTP request to a bot’s user agent so that Google treats it as a bot. To change the user agent of the request, use the XmlReader(java.net.URLConnection conn) constructor of the XmlReader class and pass it a conn object that has the user agent set to a proper value.
    conn.setRequestProperty("User-Agent", "whateveryouwant");

    Hope this works

  • dinesh 13 July, 2009, 19:15

    It worked. Thank you very much

  • bradford cross 5 September, 2009, 16:56

    I tried your example and got:

    Invalid XML: Error on line 10: The element type "META" must be terminated by the matching end-tag "</META>".
    [Thrown class com.sun.syndication.io.ParsingFeedException]

    • Viral Patel 5 September, 2009, 19:10

      Hi Bradford, I suggest you validate the XML before parsing it with ROME. Check the source RSS and see whether it contains any errors. There are online tools to validate RSS; search on Google and you will find plenty of such utilities.

  • Jnew 20 November, 2009, 22:17

    Why does it generate this error?

    Exception in thread "main" java.lang.NoClassDefFoundError: org/jdom/input/JDOMParseException
    at com.sun.syndication.io.SyndFeedInput.<init>(SyndFeedInput.java:58)
    at com.sun.syndication.io.SyndFeedInput.<init>(SyndFeedInput.java:48)
    at feedrss.Main.main(Main.java:27)
    Caused by: java.lang.ClassNotFoundException: org.jdom.input.JDOMParseException
    at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
    … 3 more
    Java Result: 1

  • anon 27 November, 2009, 5:10

    you may want to look at vtd-xml, the latest and most advanced xml processing api

    vtd-xml

  • Erik 18 December, 2009, 7:38

    Rome 1.0 doesn’t have a build method that takes an XmlReader. This tutorial needs to be updated.

  • Erik 18 December, 2009, 8:30

    Sorry ’bout that. The solution was to load the jdom lib; there is nothing wrong with the example.
    Jnew: that should solve your problem as well.

  • jaipuig 28 October, 2010, 13:53

    Thank you very much for your tutorial! Clear, concise and very good

  • Malhar 28 February, 2011, 12:34

    How do I parse an entry when its content is in HTML format, meaning syndEntry.getType is html?

    Any idea?

  • Manohar 22 March, 2011, 11:34

    Where can I download rome jar from? What is the official site?

    I tried http://java.net/projects/rome/downloads , but there is nothing to download there….

  • Osi 18 May, 2011, 1:59

    It looks like ROME was discontinued many months ago. Where’s the replacement?

  • David 9 October, 2012, 20:37

    Thanks, exactly what I was looking for!

  • Franky 16 January, 2013, 11:05

    I am getting an IOException (connection timeout) in spite of setting sun.net.client.defaultConnectTimeout and sun.net.client.defaultReadTimeout. Please help.

  • Rashed 7 November, 2013, 0:32

    How do you parse elements with namespaces not in the SyndFeed?
