Creating a HTML “friends” page from a Google Reader subscription list
March 23rd, 2008 by DeWitt Clinton

Google’s Social Graph API crawls the web and extracts publicly available relationship data (edges) for people on various public pages marked up with XFN or FOAF metadata (nodes). Like many others, I have accounts on Twitter, Flickr, FriendFeed, etc., which are all great sources to crawl for social graph edge relationships, but a number of interesting relationships were still hidden inside my private Google Reader subscription list.

This short guide demonstrates how I extract that information and publish it publicly via plain HTML decorated with XFN markup. (See an example.)

The process involves using Google Reader to manage your blogroll to 1) give a common label to the people whose blogs you read, 2) name those subscriptions appropriately, 3) share those subscriptions publicly, 4) find your public user id, and finally, 5) write a script that reads the subscription list and republishes it as HTML on your own site.

Step One: Using the “Manage Subscriptions »” link at the bottom of the left-hand panel in Google Reader, create a new folder called “people”. You can call this “friends”, “blogroll”, whatever — the important thing is that this label is used to tag every personal blog you want in your public blogroll. Note that these are blogs written by individual people, not collections of people.

Step Two: Next, go back over each of these subscriptions and use the “Rename” button to set the name of the blog to be the name of the author. Some blog feeds are already published this way by their author. Others, such as John Gruber’s Daring Fireball, set the blog title to the name of the site, not the person. In those cases, go back and rename the feed “John Gruber”, or whatever.

Step Three: Using the “Settings > Tags” menu, find your new “people” label and toggle the “private” sharing status to “public”. Doing so will reveal a number of new features, such as “view public page”, “email a link”, “add a clip to your site”, and “add a blogroll to your site”. These are all interesting features, and something you might want to explore more fully later, but in our case we’re doing something a little different.

Step Four: On the same “Settings > Tag” screen, look at the URL for the “view public page” link. It should read something like:

http://www.google.com/reader/shared/user/16671002588179970970/label/people

That 20 digit number is your public id. Jot this down — you’ll need this for the next step.

Step Five: This is the hardest part, in that it involves a bit of programming (or the ability to cut and paste and modify python code), and a server that you can run scripts on.

In this example, we’ll be using a python script to take the JSON-formatted output provided by Google Reader to build a data structure composed of dicts and sequences with the help of the simplejson library, and subsequently using Django templates to render this data as HTML.

You could do the same with PHP (perhaps directly inside Wordpress or the like), or Perl, or any language, but this simple example does it as a stand-alone Python script.

The following python code snippets (line breaks added for clarity) detail how to poll Google Reader and generate the XFN markup.

First, the Google Reader JSON output for your public subscription list is available at:

SUBSCRIPTION_URL = 
    'http://www.google.com/reader/public/javascript-sub/user/%s/label/%s'

Reading and parsing the JSON data is done with the following snippet:

  url = SUBSCRIPTION_URL % (user_id, label)
  json = urllib2.urlopen(url)
  data = simplejson.load(json)

The HTML is generated using Django templates. The template itself is simply a multiline string:

TEMPLATE = '''<!DOCTYPE html
     PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>Jane Smith's Friends</title>
    <link rel="me" href="http://example.com/"/>
  </head>
  <body>
    <p class="vcard">
      <a href="http://example.com/" class="url" rel="me">
	<span class="fn">Jane Smith</span>
      </a>\'s friends and reading list:</p>
    <ul>
      {% for item in items %}
        <li class="vcard">
          <a href="{{ item.alternate.href }}" class="url" rel="friend">
            <span class="fn">{{ item.title }}</span>
          </a>
        </li>
      {% endfor %}
    </ul>
  </body>
</html>
'''

That template is parsed and interpreted by Django using the following code:

  template = django.template.Template(TEMPLATE)
  context = django.template.Context(data)
  html = template.render(context)

And then saved to disk with:

  out = open(output, mode='w')
  out.write(html)
  out.close()

Putting all of those snippets together, with the proper error handling, imports, main method, etc., you end up with a very short (96 line) script that can read your subscription list and write it out as XFN decorated XHTML.

The full script that I run on unto.net is available at:

http://static.unto.net/reader-subscriptions-to-html.py

I saved this to my ~/bin/ directory and run it every 15 minutes with via a cron job:

*/15 * * * * /home/dewitt/bin/reader-subscriptions-to-html.py 
    16671002588179970970 people 
    --output /var/www/dewitt/friends/index.html

So how does it look as HTML?

The page at dewitt.unto.net/friends is generated once every 15 minutes with the script. As you can see, it includes XFN markup that signal relationship edges between nodes that I control (dewitt.unto.net, blog.unto.net, twitter.com/dewitt, etc.) and the nodes that are controlled by people I subscribe to using Google Reader (daringfireball.net, brad.livejournal.com, etc.).

And how does it look to the social graph api?

According to one of the social graph api sample applications that explores site connectivity, it looks pretty darn interesting.

Are you doing anything interesting with the social graph API? Have any ideas of more ways it should be crawling for public relationship data? Let me know in the comments, and be sure to join the mailing list.

Adding OpenSearch to FriendFeed
March 21st, 2008 by DeWitt Clinton

A quick 30 second post because it’s hard to add markup to FriendFeed comments…

In response to this guide on adding FriendFeed search to your browser, here’s the 30 second guide for the FriendFeed guys on how to do it on their own site (not that they need it).

First, download this file and serve it as http://friendfeed.com/opensearch.xml.

<?xml version="1.0" encoding="UTF-8"?>
<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
  <ShortName>FriendFeed</ShortName>
  <Description>Search FriendFeed.</Description>
  <Tags>friendfeed</Tags>
  <Contact>admin@friendfeed.com</Contact>
  <Url type="text/html"
       template="http://friendfeed.com/search?q={searchTerms}"/>
  <Url type="application/atom+xml"
       template="http://friendfeed.com/search?q={searchTerms}&amp;format=atom"/>
  <LongName>FriendFeed Search</LongName>
  <Image height="16" width="16" type="image/vnd.microsoft.icon">http://friendfeed.com/favicon.ico</Image>
  <Query role="example" searchTerms="friendfeed" />
  <Developer>FriendFeed</Developer>
  <Attribution>
    Copyright 2008, FriendFeed, Inc., All Rights Reserved
  </Attribution>
  <SyndicationRight>open</SyndicationRight>
  <AdultContent>false</AdultContent>
  <Language>en-us</Language>
  <OutputEncoding>UTF-8</OutputEncoding>
  <InputEncoding>UTF-8</InputEncoding>
</OpenSearchDescription>

Next, add this snippet to the <head/> section of all pages on FriendFeed:

<link rel="search"
           type="application/opensearchdescription+xml"
           href="http://friendfeed.com/opensearch.xml"
           title="FriendFeed Search" />

And you’re done. But let’s talk again once you syndicate search over Atom or RSS and we’ll add that, too.

Update: Added the <Url/> element for Atom based search results per Bret’s comment below.

Server migration
February 11th, 2008 by DeWitt Clinton

Users of either the OpenSearch interface to the Amazon catalog at aws.unto.net or the click-tracking bookmark manager at delancey.unto.net may experience a bit of inconsistent state while DNS changes propagate this morning. I moved those services from the virtual machine (hosted by the superstars at John Companies), to the newer instance that also hosts this blog (blog.unto.net), a wiki (wiki.unto.net), and a simple identity page (dewitt.unto.net). I’ll be moving the opensearch.org wiki server over to the new instance shortly as well. In the process I also upgraded the new machine to Fedora 7 via a somewhat risky in-place yum-based upgrade. Please let me know in the comments if you have any problems.

On the coding front I spent the weekend working on a new blog engine for unto.net, similar in spirit to Blosxom/PyBlosxom. (Oh the irony.) While I still appreciate the convenience that WordPress offers, I’m not a fan at all of the new admin console design in WP trunk, and would really prefer a backend that stores posts in a transparent version control system. Opaque CMS datastores, such as the MySQL backend for Wordpress, are my nemesis these days…

43 mini-reviews of drm-free and (mostly) independent albums
January 1st, 2008 by DeWitt Clinton

I purchased copies of, and unless the RIAA succeeds in their creative but wrongful re-imagining of copyright law, the right to fairly use, 43 music albums in 2007. Needless to say every single one was purchased on DRM-free media — high-bitrate digital MP3s when available, CD when not.

While buying 43 albums/year is well off my ten-year average of 2.44 albums a week, one of the things that caught my attention was that over the past year 84% of the albums were released artists on independent labels, i.e. labels that are not affiliated with the RIAA. Perhaps even the RIAA might start taking note when customers who spend hundreds (or thousands) of dollars a year on music start giving the vast majority of that to independent artists.

The following is a list of those 43 albums, complete with links to places you can buy them (DRM-free, of course) and links to some external reviews. (The Amazon links have my referral code; if that bothers you then please go to Amazon directly.)

I’ve also included my own one-sentence review of each album, as well as a restaurant review-style “star” rating. Note that these ratings are not done on a linear 1 to 5 scale. Rather here each star representing something an album needs to earn. To put it in perspective, I can only think of a few albums ever released that would get 5 stars, and since I previewed everything before I bought it, each should have at least 1 star (in theory).

An aside before I start: the state of music search is abysmal. I spent hours upon hours writing Python scripts to help automate the generation of this list. Amazon was the only site that provided a search API at all, and even that needs some work. In the end I had the best luck using Google search, even to find things on Emusic or Bleep or AllMusic.

If anyone ever wants to fix the situation with music search please drop me an email.

But a big thank you to my friends at RIAA Radar — keep up the good work!

And with that, here are the 43 albums I bought in 2007: