Monday, February 12, 2007

Creating an RSS Reader: the Reader

In this article we are going to discuss how to create a PHP-based RSS reader. It helps if you know something about XML, but it is not strictly necessary. Each story in an RSS document is described by three main tags: title, link and description, and they do exactly what their names suggest. I will go into detail about these tags in my second article, which deals with building an RSS file. For now, we will focus only on reading one.
A downloadable file is available for this article.
As an extra I will introduce a database aspect of the reader. We will use the database to store and retrieve the latest stories. To follow along you will need PHP 4 or higher and, optionally, MySQL.

Below is an example text from an RSS document:

Start example text

<item>

<title>First example</title>

<link>http://www.mylink.com/someplace.html</link>

<description>Some description, blah,blah,blah
</description>


</item>

<item>

<title>Thousands set to attend today's celebration</title>

<link>http://www.mylink.com/someplace.html/NewsTopStories?m=318</link>


<description>blah,blah,blah </description>

</item>

End example text

Code

To create an RSS Reader in PHP, we need to:

  1. Create a function to handle the start tag (startTag).

  2. Create a function to handle the end tag (endTag).

  3. Create a function to handle the text between the tags (txtTag).


A typical RSS document will have the following structure:

<rss>

<channel>

<item>

</item>

</channel>

</rss>

A start tag is a tag without the “/” character, for example: <item>. An end tag is a tag with the “/” character, for example: </item>.

The start and end tag functions watch for the title, link and description tags found inside “<item></item>”; once they have found one, it is a simple matter of retrieving its text for display.


Now, PHP provides us with several XML-related functions, a few of which we will be using here:


xml_parser_create() – Creates a new XML parser and returns a handle to it. xml_parser_create() is a function, not a class; the handle it returns is what we pass to every other XML function.

To create a new parser:

$xmlParser = xml_parser_create();
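As a quick check, the sketch below creates a parser and looks at its case-folding option. Case folding is switched on by default, which is why the handler functions later in this article compare against upper-case names such as 'TITLE'. The variable $folding is only there for illustration.

```php
<?php
// Create a parser and keep the handle it returns.
$xmlParser = xml_parser_create();

// Case folding is enabled by default, so tag names reach the handlers in upper case.
$folding = xml_parser_get_option($xmlParser, XML_OPTION_CASE_FOLDING);
echo $folding ? "case folding on\n" : "case folding off\n";

// It can be switched off if you would rather match the names exactly as written.
xml_parser_set_option($xmlParser, XML_OPTION_CASE_FOLDING, 0);
$foldingAfter = xml_parser_get_option($xmlParser, XML_OPTION_CASE_FOLDING);
```

The rest of this article leaves the option at its default.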

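xml_set_element_handler() – Registers the functions that the parser calls when it meets a start tag or an end tag. It takes three arguments: the parser, the name of the start handler and the name of the end handler. The start handler itself receives the three parameters below (the end handler receives only the first two):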

  • The parser: references the parser that is calling the handler.

  • The tagname: contains the name of the element for which the handler is called.

  • The attributes: an array that contains the element's attributes.


The parameters are used later in this article.
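To see the handlers being called, here is a minimal sketch that records every start and end tag the parser reports for a short piece of XML. It uses anonymous functions, so it assumes a newer PHP than the PHP 4 this article targets, and the $events array is a made-up name for this illustration only.

```php
<?php
// Hypothetical log of parser events, used only for this illustration.
$events = array();

$parser = xml_parser_create();

xml_set_element_handler(
    $parser,
    // Start handler: receives the parser, the tag name and the attributes.
    function ($parser, $tagName, $attrs) use (&$events) {
        $events[] = "start:" . $tagName;
    },
    // End handler: receives the parser and the tag name only.
    function ($parser, $tagName) use (&$events) {
        $events[] = "end:" . $tagName;
    }
);

// The third argument tells the parser that this is the final piece of data.
xml_parse($parser, "<item><title>First example</title></item>", true);

echo implode(" ", $events), "\n";
// start:ITEM start:TITLE end:TITLE end:ITEM
```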


xml_set_character_data_handler() – Registers the function that handles the text between the tags. It takes two arguments, the parser and the name of the handler. The handler itself receives two parameters:




  • The parser: references the parser that is calling the handler.

  • The data: contains the character data as a string.
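One detail worth knowing is that the parser may deliver the text of a single element in several pieces, for example around an entity such as &amp;amp;. The sketch below, which assumes a PHP version with anonymous functions and uses made-up variable names, appends every piece and trims only once at the end, so that spaces next to an entity are preserved.

```php
<?php
// Hypothetical collectors for this illustration only.
$chunks = array();
$text = "";

$parser = xml_parser_create();

xml_set_character_data_handler(
    $parser,
    // The handler receives the parser and one piece of character data.
    function ($parser, $data) use (&$chunks, &$text) {
        $chunks[] = $data; // keep each piece to see how the text was split
        $text .= $data;    // append, because more pieces may follow
    }
);

xml_parse($parser, "<title> Tom &amp; Jerry </title>", true);

// How many pieces arrive depends on the underlying XML library.
echo count($chunks), " piece(s)\n";
echo trim($text), "\n";
```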


You can get more information about these and other XML functions at:


http://uk2.php.net/manual/en/ref.xml.php

The first thing we do is set the global variables that are going to be used by the functions.


$GLOBALS['titletag'] = false;

$GLOBALS['linktag'] = false;

$GLOBALS['descriptiontag'] = false;

$GLOBALS['thetitletxt'] = null;

$GLOBALS['thelinktxt'] = null;

$GLOBALS['thedesctxt'] = null;

The three flags record which tag the parser is currently inside, and the three text variables collect that tag's text as it is read from the RSS file.


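The function below deals with the starting element. It checks whether the tag that has just been opened is one of the three tags we discussed earlier and, if so, sets the matching flag. The tag names are compared in upper case because the parser converts them to upper case by default: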


function startTag( $parser, $tagName, $attrs ) {

switch( $tagName ) {



case 'TITLE':

$GLOBALS['titletag'] = true;

break;

case 'LINK':

$GLOBALS['linktag'] = true;

break;

case 'DESCRIPTION':

$GLOBALS['descriptiontag'] = true;

break;

}

}

This next function deals with the end tag:

function endTag( $parser, $tagName ) {

switch( $tagName ) {



case 'TITLE':

echo "<p><b>" . $GLOBALS['thetitletxt'] . "</b><br/>";

$GLOBALS['titletag'] = false;

$GLOBALS['thetitletxt'] = "";

break;

case 'LINK':

echo "Link: <a href=\"" . $GLOBALS['thelinktxt'] . "\">" .
$GLOBALS['thelinktxt'] . "</a><br/>";


$GLOBALS['linktag'] = false;

$GLOBALS['thelinktxt'] = "";

break;

case 'DESCRIPTION':

echo "Desc: " . $GLOBALS['thedesctxt'] . "</p>";

$GLOBALS['descriptiontag'] = false;

$GLOBALS['thedesctxt'] = "";

break;

}

}

This next function handles the text. It checks the flags to find out which tag the text belongs to and appends the text to the matching global variable:

function txtTag( $parser, $text ) {

if( $GLOBALS['titletag'] == true ) {

$GLOBALS['thetitletxt'] .= htmlspecialchars( trim( $text ) );




} else if( $GLOBALS['linktag'] == true ) {

$GLOBALS['thelinktxt'] .= trim( $text );

} else if( $GLOBALS['descriptiontag'] == true ) {

$GLOBALS['thedesctxt'] .= htmlspecialchars( trim( $text ) );


}

}


Now that we have created the required functions, let's continue with the meat of the code:


function parsefile($RSSfile){

// Create an xml parser

$xmlParser = xml_parser_create();

// Set up element handler

xml_set_element_handler( $xmlParser, "startTag", "endTag" );



// Set up character handler

xml_set_character_data_handler( $xmlParser, "txtTag" );

// Open connection to RSS XML file for parsing.

$fp = fopen( $RSSfile,"r" )

or die( "Cannot read RSS data file." );



// Parse XML data from RSS file.

while( $data = fread( $fp, 4096 ) ) {

xml_parse( $xmlParser, $data, feof( $fp ) )

or die( sprintf( "XML error: %s at line %d",
xml_error_string( xml_get_error_code( $xmlParser ) ),
xml_get_current_line_number( $xmlParser ) ) );


}



// Close file open handler

fclose( $fp );

// Free xml parser from memory

xml_parser_free( $xmlParser );

}

The above function hands the file to the parser piece by piece; the parser in turn calls the startTag, endTag and txtTag functions as it works through the XML, and they display the contents.
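If you would rather keep the stories than print them straight away, for example to store them in a database, the same three handlers can fill an array instead. The following is only a sketch under a few assumptions: it needs a PHP version with anonymous functions, it parses a string rather than a file, and collectItems() is a made-up name that is not part of xmlparser.php.

```php
<?php
// Hypothetical helper: parse an RSS string and return its items as arrays.
function collectItems($rssText) {
    $items = array();   // finished items
    $current = null;    // item being built, or null when outside <item>
    $field = null;      // name of the field currently being read

    $parser = xml_parser_create();

    xml_set_element_handler(
        $parser,
        function ($p, $tag, $attrs) use (&$current, &$field) {
            if ($tag === 'ITEM') {
                $current = array('title' => '', 'link' => '', 'description' => '');
            } elseif ($current !== null && in_array($tag, array('TITLE', 'LINK', 'DESCRIPTION'))) {
                $field = strtolower($tag);
            }
        },
        function ($p, $tag) use (&$items, &$current, &$field) {
            if ($tag === 'ITEM' && $current !== null) {
                $items[] = array_map('trim', $current); // trim once the text is complete
                $current = null;
            }
            $field = null;
        }
    );

    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$current, &$field) {
            if ($current !== null && $field !== null) {
                $current[$field] .= $data;
            }
        }
    );

    if (!xml_parse($parser, $rssText, true)) {
        throw new RuntimeException("XML error: " . xml_error_string(xml_get_error_code($parser)));
    }
    return $items;
}

$rss = '<rss><channel><title>Feed</title><item><title>First example</title>'
     . '<link>http://www.mylink.com/someplace.html</link>'
     . '<description>Some description</description></item></channel></rss>';

$items = collectItems($rss);
echo $items[0]['title'], "\n"; // First example
```

Because the sketch only records text while it is inside an item, the title of the channel itself is skipped.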




While it is good to have an RSS reader that can read any RSS document, it would be even better if you could store that information in a database and read it at your leisure when you are not connected to the Internet. It would also be good to be able to update your RSS file from the database. This is relatively easy to achieve, so let's create a table to which we will add our data:


CREATE TABLE `rss_tbl` (

`feed_id` int(5) NOT NULL auto_increment,

`title` varchar(200) NOT NULL default '',

`link` varchar(200) NOT NULL default '',

`description` text NOT NULL,

`the_date` date NOT NULL default '0000-00-00',

PRIMARY KEY (`feed_id`)

) ENGINE=MyISAM AUTO_INCREMENT=1 ;

The table will store the individual links as they are read in by the RSS reader. Fill this table with data, using the following format:

  • Title – the title of your story.

  • Link – The link to your story.

  • Description – A short description of your story.
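One possible way to fill the table from PHP is sketched below. It is not taken from xmlparser.php: the helper names itemToRow() and storeRow() are made up, and it uses the mysqli extension with a prepared statement rather than the older mysql_ functions used elsewhere in this article, so it assumes PHP 5.4 or later and a connection that you have opened yourself.

```php
<?php
// Hypothetical helper: map a parsed item onto the rss_tbl columns.
function itemToRow(array $item, $date) {
    return array(
        'title'       => substr(trim($item['title']), 0, 200), // column is varchar(200)
        'link'        => substr(trim($item['link']), 0, 200),  // column is varchar(200)
        'description' => trim($item['description']),
        'the_date'    => $date,                                 // 'YYYY-MM-DD'
    );
}

// Hypothetical helper: insert one row through an already-open mysqli connection.
function storeRow(mysqli $db, array $row) {
    $stmt = $db->prepare(
        "INSERT INTO rss_tbl (title, link, description, the_date) VALUES (?, ?, ?, ?)"
    );
    if (!$stmt) {
        return false;
    }
    $stmt->bind_param("ssss", $row['title'], $row['link'], $row['description'], $row['the_date']);
    $ok = $stmt->execute();
    $stmt->close();
    return $ok;
}

$row = itemToRow(
    array(
        'title' => ' First example ',
        'link' => 'http://www.mylink.com/someplace.html',
        'description' => 'Some description',
    ),
    '2007-02-12'
);
print_r($row);
// With a connection: storeRow(new mysqli($host, $user, $password, $database), $row);
```

The prepared statement keeps quotes in a title or description from breaking the query.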


You can then use this data to write your RSS file:

<?php

// Assumes a MySQL connection has already been opened.

$fp=fopen("myrssfile", "w+");

if (!$fp){

echo "error opening file";

exit;

}else{

$query1="Select *,DATE_FORMAT(the_date,'%W,%d %b %Y') as thedate
FROM rss_tbl WHERE the_date >= DATE_SUB(CURDATE(),INTERVAL 30 DAY)
ORDER BY the_date DESC LIMIT 10 ";


$result=mysql_query($query1);

while($row=mysql_fetch_assoc($result)){

fwrite($fp,$row['title']."\r\n");

fwrite($fp,$row['link']."\r\n");

fwrite($fp,$row['description']."\r\n");

fwrite($fp,$row['thedate']."\r\n");

fwrite($fp, " ");

}//endwhile

fclose($fp);

}//end else

This code does two things. First, it opens (or creates) a file called "myrssfile":


$fp=fopen("myrssfile", "w+");

The "w+" instructs PHP to create the file if it does not exist and to overwrite any contents that it might have. Then it checks to see if there are any problems opening the file:

if (!$fp){

echo "error opening file";

exit;

}



If there are problems, the program displays a message and stops execution. If everything is okay, a SQL query is run that retrieves up to ten articles from the database that are dated within the last thirty days:

$query1="Select *,DATE_FORMAT(the_date,'%W,%d %b %Y') as
thedate FROM rss_tbl WHERE the_date >= DATE_SUB(CURDATE(),INTERVAL 30 DAY)
ORDER BY the_date DESC LIMIT 10 ";


The DATE_FORMAT() function enables us to format the date column in whatever fashion we like. After this the code writes the database data to the file:

fwrite($fp,$row['title']."\r\n");

fwrite($fp,$row['link']."\r\n");

fwrite($fp,$row['description']."\r\n");

fwrite($fp,$row['thedate']."\r\n");

fwrite($fp, " ");
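To get a feel for what the DATE_FORMAT() pattern '%W,%d %b %Y' puts in the thedate column, the same layout can be reproduced on the PHP side. This is only an illustration with a hard-coded date; it assumes PHP 5.2 or later for the DateTime class.

```php
<?php
// MySQL pattern '%W,%d %b %Y' and a PHP pattern with the same meaning:
//   %W -> l (full weekday name), %d -> d (two-digit day),
//   %b -> M (short month name),  %Y -> Y (four-digit year)
$date = new DateTime('2007-02-12');
$formatted = $date->format('l,d M Y');
echo $formatted, "\n"; // Monday,12 Feb 2007
```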

That’s it. A file called "myrssfile" should now be available and contain up to ten articles from the database. With small changes to the table you can expand the database usage and create an RSS aggregator, which is like an online "newspaper" made up entirely of RSS feeds from different websites.
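The file written above holds plain lines of text. If you want the reader from the first half of this article to parse it, each row needs to be wrapped in the item, title, link and description tags. A minimal sketch of that step follows; rowsToItems() is a made-up helper, the rows are hard-coded instead of coming from the database, and the surrounding rss and channel tags are left out.

```php
<?php
// Hypothetical helper: turn database rows into <item> elements.
function rowsToItems(array $rows) {
    $xml = "";
    foreach ($rows as $row) {
        // htmlspecialchars() escapes &, < and > so the text stays valid XML.
        $xml .= "<item>\r\n"
              . "<title>" . htmlspecialchars($row['title']) . "</title>\r\n"
              . "<link>" . htmlspecialchars($row['link']) . "</link>\r\n"
              . "<description>" . htmlspecialchars($row['description']) . "</description>\r\n"
              . "</item>\r\n";
    }
    return $xml;
}

$rows = array(
    array(
        'title' => 'Tom & Jerry',
        'link' => 'http://www.mylink.com/someplace.html',
        'description' => 'Some description',
    ),
);
echo rowsToItems($rows);
```

The returned string can be passed to fwrite() in place of the separate lines.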


To actually enter the data into the database, you only need to create a form that will take the necessary input values and write them to the table. In one of the articles that I wrote about RSS, I discuss how to create and populate an RSS file through a form. Although in that particular article we transfer data from a form to a file, with some small changes you can transfer the data from a form to a database.


Conclusion

To use this code, make sure to include “xmlparser.php” in whatever page you are using. Then just call the parsefile("yourRSSfileLocation") function and your file data will be parsed. Also, you might have noticed that on some news sites the headlines scroll from right to left across the screen. You can achieve this with the <marquee> HTML tag; Google it to find out how to use it.


Download the xmlparser.php here. This is the same file we link to at the beginning of this article. Next, I will be discussing how to build an RSS file. Till then, have fun.




 






