Creating an HTML "friends" page from a Google Reader subscription list


Google's Social Graph API crawls the web and extracts publicly available relationship data (edges) for people on various public pages marked up with XFN or FOAF metadata (nodes). Like many others, I have accounts on Twitter, Flickr, FriendFeed, etc., which are all great sources to crawl for social graph edge relationships, but a number of interesting relationships were still hidden inside my private Google Reader subscription list.

This short guide demonstrates how I extract that information and publish it publicly via plain HTML decorated with XFN markup. (See an example.)

...

The process uses Google Reader to manage your blogroll, in five steps: 1) give a common label to the people whose blogs you read, 2) name those subscriptions appropriately, 3) share those subscriptions publicly, 4) find your public user id, and finally, 5) write a script that reads the subscription list and republishes it as HTML on your own site.

Step One: Using the "Manage Subscriptions »" link at the bottom of the left-hand panel in Google Reader, create a new folder called "people". You can call this "friends", "blogroll", whatever -- the important thing is that this label is used to tag every personal blog you want in your public blogroll. Note that these are blogs written by individual people, not collections of people.

Step Two: Next, go back over each of these subscriptions and use the "Rename" button to set the name of the blog to be the name of the author. Some blog feeds are already published this way by their author. Others, such as John Gruber's Daring Fireball, set the blog title to the name of the site, not the person. In those cases, go back and rename the feed "John Gruber", or whatever.

Step Three: Using the "Settings > Tags" menu, find your new "people" label and toggle the "private" sharing status to "public". Doing so will reveal a number of new features, such as "view public page", "email a link", "add a clip to your site", and "add a blogroll to your site". These are all interesting features, and something you might want to explore more fully later, but in our case we're doing something a little different.

Step Four: On the same "Settings > Tags" screen, look at the URL for the "view public page" link. It should read something like:

http://www.google.com/reader/shared/user/16671002588179970970/label/people


That 20-digit number is your public id. Jot it down -- you'll need it for the next step.
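
If you would rather not copy the id by hand, it can be pulled out of that URL with a regular expression. This is only a sketch: the helper name parse_public_page_url is made up for illustration, the pattern assumes the URL shape shown above, and it is written for Python 3 rather than the Python 2 used elsewhere in this guide.

```python
import re

# Hypothetical helper; the pattern assumes the "view public page" URL
# looks like the example above (.../shared/user/<id>/label/<label>).
PUBLIC_PAGE_RE = re.compile(r'/reader/shared/user/(\d+)/label/([^/?#]+)')


def parse_public_page_url(url):
    """Return (user_id, label) from a public page URL, or None."""
    match = PUBLIC_PAGE_RE.search(url)
    if match is None:
        return None
    return match.group(1), match.group(2)


user_id, label = parse_public_page_url(
    'http://www.google.com/reader/shared/user/16671002588179970970/label/people')
print(user_id, label)
```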

Step Five: This is the hardest part, in that it involves a bit of programming (or the ability to cut, paste, and modify Python code), and a server that you can run scripts on.

In this example, a Python script takes the JSON-formatted output provided by Google Reader, uses the simplejson library to build a data structure composed of dicts and sequences, and then uses Django templates to render that data as HTML.

You could do the same with PHP (perhaps directly inside WordPress or the like), or Perl, or any other language, but this simple example does it as a stand-alone Python script.

...

The following Python code snippets (line breaks added for clarity) detail how to poll Google Reader and generate the XFN markup.

First, the Google Reader JSON output for your public subscription list is available at:

SUBSCRIPTION_URL = (
    'http://www.google.com/reader/public/javascript-sub/user/%s/label/%s')


Reading and parsing the JSON data is done with the following snippet:

  import urllib2
  import simplejson

  url = SUBSCRIPTION_URL % (user_id, label)
  response = urllib2.urlopen(url)  # file-like object holding the JSON
  data = simplejson.load(response)
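
To see what the parsed data looks like without touching the network, here is a rough Python 3 sketch that uses the standard library's json module in place of simplejson. The sample payload is an assumption: it contains only the fields the template below relies on (items, title, alternate.href), and the real feed's structure may differ.

```python
import json

SUBSCRIPTION_URL = (
    'http://www.google.com/reader/public/javascript-sub/user/%s/label/%s')

# Hypothetical sample payload, reduced to the fields the template uses.
SAMPLE = '''
{"items": [
  {"title": "John Gruber",
   "alternate": {"href": "http://daringfireball.net/"}}
]}
'''

# Fill in the public id and the label from steps one through four.
url = SUBSCRIPTION_URL % ('16671002588179970970', 'people')

# json.loads parses a string; json.load would read a file-like object.
data = json.loads(SAMPLE)
for item in data['items']:
    print(item['title'], item['alternate']['href'])
```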


The HTML is generated using Django templates. The template itself is simply a multiline string:

TEMPLATE = '''<!DOCTYPE html
    PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>Jane Smith's Friends</title>
    <link rel="me" href="http://example.com/"/>
  </head>
  <body>
    <p class="vcard">
      <a href="http://example.com/" class="url" rel="me">
        <span class="fn">Jane Smith</span>
      </a>\'s friends and reading list:</p>
    <ul>
      {% for item in items %}
        <li class="vcard">
          <a href="{{ item.alternate.href }}" class="url" rel="friend">
            <span class="fn">{{ item.title }}</span>
          </a>
        </li>
      {% endfor %}
    </ul>
  </body>
</html>
'''


That template is parsed and interpreted by Django using the following code:

  template = django.template.Template(TEMPLATE)
  context = django.template.Context(data)
  html = template.render(context)
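
If you don't have Django handy, the list portion of the page can be approximated with nothing but the standard library. This is a Python 3 sketch for comparison, not what my script does: render_items is a made-up helper, and the item shape (title, alternate.href) is assumed from the template above.

```python
from html import escape


def render_items(items):
    """Build the <li> entries that the template's for-loop would produce."""
    lines = []
    for item in items:
        # Escape both values so stray &, <, or quotes can't break the markup.
        href = escape(item['alternate']['href'], quote=True)
        name = escape(item['title'])
        lines.append(
            '<li class="vcard"><a href="%s" class="url" rel="friend">'
            '<span class="fn">%s</span></a></li>' % (href, name))
    return '\n'.join(lines)


print(render_items([
    {'title': 'John Gruber',
     'alternate': {'href': 'http://daringfireball.net/'}},
]))
```

The explicit escaping matters here because feed titles are someone else's text, and an unescaped ampersand is enough to make strict XHTML invalid.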


And then saved to disk with:

  out = open(output, mode='w')
  out.write(html)
  out.close()
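
Because the output file is served live and rewritten on a schedule, one optional refinement is to write to a temporary file and rename it into place, so a reader never sees a half-written page. A minimal Python 3 sketch follows; write_atomically is a hypothetical helper, not part of the original script, and the 0o644 permission is an assumption about what your web server needs.

```python
import os
import tempfile


def write_atomically(path, text):
    """Write text to a temp file beside path, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w', encoding='utf-8') as out:
            out.write(text)
        # mkstemp creates owner-only files; make the page world-readable.
        os.chmod(tmp_path, 0o644)
        # os.replace overwrites the destination in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise


# Demonstrate in a throwaway directory rather than a real web root.
with tempfile.TemporaryDirectory() as demo_dir:
    target = os.path.join(demo_dir, 'index.html')
    write_atomically(target, '<p>hello</p>')
    with open(target, encoding='utf-8') as result:
        print(result.read())
```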


...

Putting all of those snippets together, with the proper error handling, imports, main method, etc., you end up with a very short (96-line) script that can read your subscription list and write it out as XFN-decorated XHTML.

The full script that I run on unto.net is available at:

http://static.unto.net/reader-subscriptions-to-html.py


I saved this to my ~/bin/ directory and run it every 15 minutes via a cron job:

*/15 * * * * /home/dewitt/bin/reader-subscriptions-to-html.py 16671002588179970970 people --output /var/www/dewitt/friends/index.html
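
For the curious, a command line of that shape (two positional arguments plus an --output option) can be parsed along these lines. This is a Python 3 sketch using argparse, not a copy of the actual script: the parse_args helper and the help strings are made up, and treating --output as required is an assumption.

```python
import argparse


def parse_args(argv=None):
    """Parse: <user_id> <label> --output <path>."""
    parser = argparse.ArgumentParser(
        description='Republish a public Google Reader label as XFN HTML.')
    parser.add_argument('user_id', help='20-digit public Reader id')
    parser.add_argument('label', help='shared label, e.g. "people"')
    # Assumption: the output path is mandatory.
    parser.add_argument('--output', required=True, help='HTML file to write')
    return parser.parse_args(argv)


# Same arguments as the cron entry above.
args = parse_args(['16671002588179970970', 'people',
                   '--output', '/var/www/dewitt/friends/index.html'])
print(args.user_id, args.label, args.output)
```
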
...

So how does it look as HTML?

The page at dewitt.unto.net/friends is generated once every 15 minutes with the script. As you can see, it includes XFN markup that signals relationship edges between nodes that I control (dewitt.unto.net, blog.unto.net, twitter.com/dewitt, etc.) and the nodes that are controlled by people I subscribe to using Google Reader (daringfireball.net, brad.livejournal.com, etc.).
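
To get a feel for what those edges look like to a program, here is a small Python 3 sketch that collects every rel-annotated link from a chunk of HTML. It only illustrates the markup; it says nothing about how Google's crawler actually works, and XFNLinkParser is a name invented for this example.

```python
from html.parser import HTMLParser


class XFNLinkParser(HTMLParser):
    """Collect (href, [rel values]) for each <a> or <link> with a rel."""

    def __init__(self):
        super().__init__()
        self.edges = []

    def handle_starttag(self, tag, attrs):
        if tag not in ('a', 'link'):
            return
        attrs = dict(attrs)
        if attrs.get('rel') and attrs.get('href'):
            # rel may hold several space-separated XFN values.
            self.edges.append((attrs['href'], attrs['rel'].split()))


parser = XFNLinkParser()
parser.feed('<a href="http://example.com/" rel="me">Jane Smith</a>'
            '<a href="http://daringfireball.net/" rel="friend">John Gruber</a>')
print(parser.edges)
```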

And how does it look to the Social Graph API?

According to one of the Social Graph API sample applications that explores site connectivity, it looks pretty darn interesting.

...

Are you doing anything interesting with the Social Graph API? Have any ideas for other ways it should be crawling for public relationship data? Let me know in the comments, and be sure to join the mailing list.