Reading/Parsing RSS feed using ROME

ROME is an open source tool to parse, generate and publish RSS and Atom feeds. Using ROME you can parse RSS and Atom feeds without bothering about the format and version of the feed. The core library depends on the JDOM XML parser. RSS is a method to share and publish content, which may be anything from news to any small piece of information. Its main component is XML: using XML you can share your content on the web, and at the same time you are free to get what you like from others. Atom is another kind of feed along similar lines to RSS, but it differs in some aspects such as protocol and payloads.

Why use ROME instead of other available readers

The ROME project started with the motivation of ‘ESCAPE’, where each letter stands for:
E – Easy to use. Just give it a URL and forget about the feed's type and version; you get the output in the format you like.
S – Simple. The structure is simple, and the complications are all hidden from developers.
C – Complete. It handles all versions of RSS and Atom feeds.
A – Abstract. It provides an abstraction over the various syndication specifications.
P – Powerful. Don’t worry about the format; let ROME handle it.
E – Extensible. It uses a simple pluggable architecture to allow future extension of formats.

Dependency

ROME has a few dependencies: J2SE 1.4+, JDOM 1.0, and the following jar files: rome-0.8.jar, purl-org-content-0.3.jar, jdom.jar.

Using Rome to read a Syndication Feed

Assuming you have all the required jar files, we will start by reading an RSS feed. ROME represents syndication feeds (RSS and Atom) as instances of the com.sun.syndication.feed.synd.SyndFeed interface, and it includes parsers that process syndication feeds into SyndFeed instances. The SyndFeedInput class manages these parsers and picks the correct one for the feed being processed. The developer does not need to worry about selecting the right parser: SyndFeedInput takes care of it by peeking at the structure of the feed. All it takes to read a syndication feed using ROME are the following 2 lines of code:
SyndFeedInput input = new SyndFeedInput();
SyndFeed feed = input.build(new XmlReader(feedUrl));
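To make the idea of "peeking at the feed structure" more concrete, here is a minimal sketch that uses only the JDK's DOM parser. It is not ROME's actual detection code: the class FeedTypeSniffer, the method sniff and the returned labels are hypothetical names chosen for illustration, and the checks are a simplification (for example, older Atom 0.3 feeds would be reported as unknown).

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of ROME: guesses the feed type from the root element.
public class FeedTypeSniffer {

    private static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    /** Returns a label such as "rss_2.0" or "atom_1.0", or "unknown". */
    public static String sniff(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed to read the Atom namespace
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            String name = root.getLocalName();

            if ("rss".equals(name)) {
                // RSS 0.9x and 2.0 carry their version in an attribute of <rss>
                return "rss_" + root.getAttribute("version");
            }
            if ("feed".equals(name) && ATOM_NS.equals(root.getNamespaceURI())) {
                return "atom_1.0";
            }
            if ("RDF".equals(name)) {
                // RSS 1.0 feeds are RDF documents
                return "rss_1.0";
            }
            return "unknown";
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String rss = "<rss version=\"2.0\"><channel><title>Demo</title></channel></rss>";
        String atom = "<feed xmlns=\"http://www.w3.org/2005/Atom\"><title>Demo</title></feed>";
        System.out.println(sniff(rss));
        System.out.println(sniff(atom));
    }
}
```

With ROME you never write this kind of check yourself; the sketch only illustrates why a single entry point can serve several feed formats.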
Once you have the SyndFeed object, it is simple to get the details of the feed. The sample code is as follows.
package com.infosys.hanumant.rome;

import java.net.URL;
import java.util.Iterator;

import com.sun.syndication.feed.synd.SyndEntry;
import com.sun.syndication.feed.synd.SyndFeed;
import com.sun.syndication.io.SyndFeedInput;
import com.sun.syndication.io.XmlReader;

/**
 * @author Hanumant Shikhare
 */
public class Reader {

    public static void main(String[] args) throws Exception {

        URL url = new URL("https://www.viralpatel.net/feed");
        XmlReader reader = null;

        try {
            // XmlReader fetches the feed and works out its character encoding
            reader = new XmlReader(url);
            // SyndFeedInput picks the right parser for the feed type
            SyndFeed feed = new SyndFeedInput().build(reader);
            System.out.println("Feed Title: " + feed.getTitle());

            // Print the title of every entry in the feed
            for (Iterator i = feed.getEntries().iterator(); i.hasNext();) {
                SyndEntry entry = (SyndEntry) i.next();
                System.out.println(entry.getTitle());
            }
        } finally {
            if (reader != null)
                reader.close();
        }
    }
}

Understanding the Program

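Initialize the URL object with the URL of the RSS or Atom feed. Then create an XmlReader object, which takes the URL object as its constructor argument. Finally, obtain the SyndFeed object by calling the build(reader) method of SyndFeedInput, which takes the XmlReader object as an argument. The entries of the feed are then available from feed.getEntries() as SyndEntry objects, and the reader is closed in the finally block.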
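XmlReader can also be constructed from a java.net.URLConnection instead of a URL, which is useful when a server rejects requests without a suitable User-Agent header or when you want explicit timeouts. The following is a minimal sketch of preparing such a connection with the JDK alone. The class FeedConnection, the method open and the user-agent string are hypothetical names for illustration, and setConnectTimeout/setReadTimeout assume Java 5 or later.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper, not part of ROME: prepares a connection for a feed URL.
public class FeedConnection {

    /**
     * Creates a URLConnection with a User-Agent header and timeouts set.
     * No network traffic happens here; the connection is only opened
     * when something starts reading from it.
     */
    public static URLConnection open(String feedUrl, String userAgent, int timeoutMillis) {
        try {
            URLConnection conn = new URL(feedUrl).openConnection();
            conn.setRequestProperty("User-Agent", userAgent);
            conn.setConnectTimeout(timeoutMillis); // limit on establishing the connection
            conn.setReadTimeout(timeoutMillis);    // limit on waiting for data
            return conn;
        } catch (IOException e) {
            throw new IllegalArgumentException("Cannot open " + feedUrl, e);
        }
    }

    public static void main(String[] args) {
        URLConnection conn = open("https://www.viralpatel.net/feed", "my-feed-reader/1.0", 10000);
        System.out.println(conn.getRequestProperty("User-Agent"));
        // With ROME on the classpath, the prepared connection would then be
        // handed to XmlReader in place of the URL:
        // SyndFeed feed = new SyndFeedInput().build(new XmlReader(conn));
    }
}
```

Setting the timeouts on the connection itself keeps the configuration local to this one request, rather than relying on JVM-wide system properties.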

References

https://rome.dev.java.net/
http://www.intertwingly.net/wiki/pie/Rss20AndAtom10Compared
http://www.rss-specifications.com


20 Comments

  1. Thanks a lot for this… It really saved me after struggling for hours with other parsers.

  2. You're welcome, Venkatesan.. :)

  3. dinesh says:

    Hey, I get an error reading the RSS feeds generated by Google Groups. I guess Google is blocking requests from applications other than browsers. Can you please help me with this?

  4. Hi Dinesh,
    You can set the User-Agent of your HTTP request to a bot's user agent so that Google treats it as a bot. To change the user agent of the request, use the XmlReader(java.net.URLConnection conn) constructor of the XmlReader class and pass it a conn object whose User-Agent is set to the proper value:
    conn.setRequestProperty("User-Agent", "whateveryouwant");

    Hope this works

  5. dinesh says:

    It worked. Thank you very much

  6. bradford cross says:

    I tried your example and got:

    Invalid XML: Error on line 10: The element type "META" must be terminated by the matching end-tag "</META>".
    [Thrown class com.sun.syndication.io.ParsingFeedException]

    • Hi Bradford, I suggest you validate the XML before parsing it with ROME. Check the source RSS and see whether it contains any errors. There are online tools to validate RSS; search on Google and you will find many such utilities.

  7. Jnew says:

    Why does this generate the following error?

    Exception in thread "main" java.lang.NoClassDefFoundError: org/jdom/input/JDOMParseException
    at com.sun.syndication.io.SyndFeedInput.<init>(SyndFeedInput.java:58)
    at com.sun.syndication.io.SyndFeedInput.<init>(SyndFeedInput.java:48)
    at feedrss.Main.main(Main.java:27)
    Caused by: java.lang.ClassNotFoundException: org.jdom.input.JDOMParseException
    at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
    … 3 more
    Java Result: 1

  8. anon says:

    you may want to look at vtd-xml, the latest and most advanced xml processing api

    vtd-xml

  9. Erik says:

    ROME 1.0 doesn’t have a build method that takes an XmlReader. This tutorial needs to be updated.

  10. Erik says:

    Sorry ’bout that. The solution was to load the JDOM lib; nothing wrong with the example.
    Jnew: that should solve your problem as well.

  11. jaipuig says:

    Thank you very much for your tutorial! Clear, concise and very good

  12. Malhar says:

    How do I parse entries whose content is in HTML format, i.e. when syndEntry.getType is html?

    any idea?

  13. Manohar says:

    Where can I download rome jar from? What is the official site?

    I tried http://java.net/projects/rome/downloads , but there is nothing to download there….

  14. Osi says:

    It looks like ROME was discontinued many months ago. Where’s the replacement?

  15. David says:

    Thanks, exactly what I was looking for!

  16. Franky says:

    I am getting an IOException (connection timeout) in spite of setting sun.net.client.defaultConnectTimeout and sun.net.client.defaultReadTimeout. Please help.

  17. Rashed says:

    How do you parse elements with namespaces not in the SyndFeed?

  18. Vijay says:

    hey Viral.. thanks a lot for this tutorial….its helpful.

  19. sai says:

    Exception in thread "main" java.net.ConnectException: Connection timed out: connect
    at java.net.DualStackPlainSocketImpl.connect0(Native Method)
