Netscape Smart Browsing Explained

Netscape has added a new feature to its Communicator product called "Smart Browsing".  The purpose of this feature is to assist surfers in their quest for information.  One aspect of Smart Browsing is called What's Related.  You can enable or disable this feature in the Communicator preferences dialog under Edit->Preferences->Navigator->Smart Browsing.  To enable it, select the toggle box to the left of the "Enable 'What's Related'" option.  Make sure you select "After first use" for the load mode.  Once enabled, a new menu appears to the right of the URL entry box, labeled, originally enough, "What's Related".  When this menu is selected, a list of sites related to the current entry in the URL box is displayed.  For example, navigate to http://www.ibm.com.  Now select the What's Related menu.  At the time this article was written, this is what was displayed:

The first section gives a list of supposedly related sites that you may be interested in surfing to. In reality, this is a list maintained by Netscape and ordered by priority.  Speculation would have this priority based on monetary gains; however, this has not been verified by Netscape.

Now for an explanation of how Netscape pulls this off.  It really isn't that tricky.  Every time you select the What's Related menu, a hidden HTTP request is made to a Netscape webserver.  This request includes the exact contents of the URL text box.  The webserver then looks up the entries associated with that domain.  In our case, it is retrieving the URLs related to .ibm.com.  Since this is simply an HTTP request, we can send our own requests to this server and mimic this functionality in a manner we can analyze.  The hidden HTTP request looks like: http://www-rl.netscape.com/wtgn?url (where url is the text in the URL text box).  In our case, this is the actual HTTP request: http://www-rl.netscape.com/wtgn?www.ibm.com.  The actual results for this query are: (again, at the time this article was written)

<RDF:RDF>
<RelatedLinks>
  <child href="http://info.netscape.com/fwd/rlpaid/http://home.netscape.com/products" name="Learn about Netscape products..." type=244/>
  <child href="http://info.netscape.com/fwd/rlpaid/http://www.xsnet.com" name="Save on SGI Intergraph TDZ2000 Systems" type=233/>
  <child instanceOf="Separator1"/>
  <child href="http://www.ibm.net/" name="IBM Internet Connection Services " priority="7"/>
  <child href="http://www.tandem.com/" name="Tandem " priority="7"/>
  <child href="http://www.sun.com/" name="Sun Microsystems " priority="7"/>
  <child href="http://www.sgi.com/" name="Silicon Graphics Inc. " priority="7"/>
  <child href="http://www.mitsubishi.com/" name="Mitsubishi Electric " priority="7"/>
  <child href="http://www.hp.com/" name="Hewlett Packard " priority="7"/>
  <child href="http://www.dell.com/" name="Dell Computer " priority="7"/>
  <child href="http://www.att.com/" name="AT&T " priority="7"/>
  <child href="http://www.apple.com/" name="Apple Computer " priority="7"/>
  <child href="http://www.3com.com/" name="3Com " priority="7"/>
  <child href="http://editorial.alexa.com/netscape_editor" name="Suggest related links..."/>
  <child instanceOf="Separator1"/>
  <Topic name="Matching Open Directory categories">
    <child href="http://info.netscape.com/fwd/rlstatic/http://directory.netscape.com/Business/Business_Directories/1999_Fortune_500" name="Business: Business Directories: 1999 Fortune 500"/>
    <child href="http://info.netscape.com/fwd/rlstatic/http://directory.netscape.com/Computers/Hardware/Retailers/I" name="Computers: Hardware: Retailers: I"/>
    <child href="http://info.netscape.com/fwd/rlstatic/http://directory.netscape.com/Computers/Hardware/Systems/IBM" name="Computers: Hardware: Systems: IBM"/>
    <child href="http://info.netscape.com/fwd/rlstatic/http://directory.netscape.com/Computers/Vendors/Product_Support/IBM" name="Computers: Vendors: Product Support: IBM"/>
    <child instanceOf="Separator1"/>
    <child href="http://info.netscape.com/fwd/rlstatic/http://directory.netscape.com/add.html" name="Submit a site to the Open Directory..."/>
    <child href="http://info.netscape.com/fwd/rlstatic/http://directory.netscape.com/about.html" name="Become an Open Directory editor..."/>
  </Topic>
  <child instanceOf="Separator1"/>
  <child href="http://info.netscape.com/fwd/rlstatic/http://quote.netscape.com/quote/Quote.tibco?symbols=ibm&view=quote" name="Stock quote for IBM Corporation"/>
  <child href="http://info.netscape.com/fwd/rlstatic/http://financialnews.netscape.com/financialnews/Quote.tibco?symbols=ibm&view=news" name="News stories on IBM Corporation"/>
  <child instanceOf="Separator1"/>
  <Topic name="Site info for www.ibm.com">
    <child href="http://info.netscape.com/fwd/rlstatic/http://home.netscape.com/escapes/related/faq.html" name="Owner: IBM Corporation"/>
    <child href="http://info.netscape.com/fwd/rlstatic/http://home.netscape.com/escapes/related/faq.html" name="Date established: 19-Mar-86"/>
    <child href="http://info.netscape.com/fwd/rlstatic/http://home.netscape.com/escapes/related/faq.html" name="Popularity: in top 500 sites on web"/>
    <child href="http://info.netscape.com/fwd/rlstatic/http://home.netscape.com/escapes/related/faq.html" name="Number of pages on site: 808"/>
    <child href="http://info.netscape.com/fwd/rlstatic/http://home.netscape.com/escapes/related/faq.html" name="Number of links to site on web: 87516"/>
  </Topic>
  <child instanceOf="Separator1"/>
  <child href="http://info.netscape.com/fwd/rlstatic/http://directorysearch.netscape.com/cgi-bin/search?search=ibm" name="Search on this topic..."/>
  <child href="http://info.netscape.com/fwd/rlstatic/http://home.netscape.com/escapes/smart_browsing" name="Learn about Smart Browsing..."/>
</RelatedLinks>
</RDF:RDF>

As you can see, this is the exact contents of the What's Related menu, wrapped in RDF markup tags (not HTML).  The What's Related functionality knows how to parse this information and display it in a convenient menu.  You can, of course, mimic any other request by simply changing the string after the '?' in the HTTP request.
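To make the mechanics concrete, here is a minimal Python sketch of both halves: building the request URL and pulling the menu entries back out of a response. The function names are made up for illustration, and the regular expression is an assumption based only on the sample output in this article, not on any Netscape specification. A regular expression is used instead of an XML parser because the sample is not well-formed XML (note the unquoted type=244 and the bare & characters).

```python
import re

# Endpoint as given in this article; the text after '?' is the URL box contents.
RELATED_ENDPOINT = "http://www-rl.netscape.com/wtgn"

# Matches <child ...> entries that carry href followed by name, the order
# seen in the sample response. Separator entries have neither attribute
# and are therefore skipped.
CHILD_RE = re.compile(r'<child\s+href="([^"]*)"\s+name="([^"]*)"')


def build_related_query(url_box_text):
    """Return the request URL for a given URL-box string."""
    return RELATED_ENDPOINT + "?" + url_box_text


def extract_links(response_text):
    """Return (name, href) pairs for each linked menu entry."""
    return [(name.strip(), href)
            for href, name in CHILD_RE.findall(response_text)]


# A short excerpt of the response shown earlier in this article.
SAMPLE = (
    '<RDF:RDF> <RelatedLinks> '
    '<child href="http://www.ibm.net/" '
    'name="IBM Internet Connection Services " priority="7"/> '
    '<child instanceOf="Separator1"/> '
    '<child href="http://www.sun.com/" '
    'name="Sun Microsystems " priority="7"/> '
    '</RelatedLinks> </RDF:RDF>'
)

if __name__ == "__main__":
    print(build_related_query("www.ibm.com"))
    for name, href in extract_links(SAMPLE):
        print(name, "->", href)
```

The sketch only builds the URL and parses text you hand it; it does not contact the server, so you can paste in whatever response you captured yourself.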

If you are sitting behind a firewall, this may be a small security risk: not an immediate risk of intrusion, but a risk of information mining.  As this article points out, you are actually sending the URL you are viewing to Netscape.  Since the receiving end is a webserver, its logs will contain at least two important pieces of information: the site making the request (in our case, the firewall) and the URL you surfed to.  Whip up a quick Perl script to parse those logs, and it becomes possible to distill out every site that your employees surf to, both internal to your network and external.  Granted, this covers only the sites where your employees selected the What's Related option (unless "always" was selected as the load mode).  In a worst case scenario, consider a sudden large number of hits from IBM to a particular site such as Wallop, or to a new technology like Java.  If interpreted correctly, this could imply a possible buyout (e.g. Wallop) or a new endeavor (e.g. Java).  Before you consider joining Conspiracy Theorists 'R' Us, keep in mind these are long shots and are only mentioned from a data awareness standpoint.  At a minimum, you should add your company's domain suffix to the suffixes area of the Smart Browsing preferences dialog.  This will prevent Netscape from logging your internal nodes.
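To show how little work the log distilling would take, here is a rough sketch in Python rather than Perl. Everything about the log layout is an assumption: the article does not say what Netscape's logs look like, so the sketch assumes the Common Log Format, and the sample lines and host names are invented for illustration.

```python
import re
from collections import Counter

# Assumed Common Log Format; the real server's log layout is not known.
# Group 1 is the requesting host, group 2 is the text after "/wtgn?".
LOG_RE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]*\] "GET /wtgn\?(\S+) HTTP/[\d.]+"'
)


def distill(log_lines):
    """Count (requesting host, looked-up URL) pairs found in the log."""
    hits = Counter()
    for line in log_lines:
        match = LOG_RE.match(line)
        if match:
            hits[(match.group(1), match.group(2))] += 1
    return hits


# Hypothetical log lines, invented for illustration only.
SAMPLE_LOG = [
    'fw.example.com - - [01/Jan/1999:09:00:00 -0800] '
    '"GET /wtgn?www.ibm.com HTTP/1.0" 200 4096',
    'fw.example.com - - [01/Jan/1999:09:05:00 -0800] '
    '"GET /wtgn?intranet.example.com/payroll HTTP/1.0" 200 512',
    'fw.example.com - - [01/Jan/1999:09:07:00 -0800] '
    '"GET /wtgn?www.ibm.com HTTP/1.0" 200 4096',
    'other.example.org - - [01/Jan/1999:09:08:00 -0800] '
    '"GET /index.html HTTP/1.0" 200 1024',
]

if __name__ == "__main__":
    for (host, url), count in distill(SAMPLE_LOG).most_common():
        print(host, url, count)
```

Note that the second sample line is an internal host name leaking out through the firewall, which is exactly the case the suffix setting is meant to cover.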

I will point out that the "site info" section is slightly interesting, since it gives some statistics about size and ownership.  Here is the information returned about IBM.  Note that, as in the first example, Netscape appears to have a problem generating the proper list items, as evidenced by the ?'s:

The 500 line is supposed to say "in the top" websites. My guess is Netscape will have this problem resolved by the time you try this yourself.

Keep in mind that this process applies to all Smart Browsing use, including at home.

See you next technote.  Till then, watch your step; the dog's been out.