The most interesting part to me is what Microsoft is doing to extend the power of the OpenSearch description document format. They've defined the Location Definition file format, which at its core is a standard OpenSearch description document, augmented with all sorts of additional functionality.
...
To illustrate the difference between a traditional OpenSearch description document and Microsoft's extended Location Definition format, I've downloaded one of the examples that Search Server 2008 ships with: the Yahoo FLD.
As a baseline for comparison, here is a minimal version of an OpenSearch description document for querying the Yahoo web search API (line breaks added for clarity):
<?xml version="1.0" encoding="UTF-8"?>
<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
<ShortName>Yahoo</ShortName>
<Description>Returns results from Yahoo Search</Description>
<Url type="application/rss+xml"
template="http://api.search.yahoo.com/WebSearchService/rss/webSearch.xml
?appid=yahoosearchwebrss&amp;query={searchTerms}
&amp;adult_ok=0"/>
</OpenSearchDescription>
Compare that with the extended Location Definition format for the same query, where I've added line breaks, snipped out some of the longer portions, and converted their inline xmlns declarations to use the ldf prefix:
<?xml version="1.0"?>
<OpenSearchDescription
xmlns="http://a9.com/-/spec/opensearch/1.1/"
xmlns:ldf="http://schemas.microsoft.com/Search/2007/location">
<ShortName>Yahoo</ShortName>
<ldf:InternalName>Yahoo</ldf:InternalName>
<Description>Returns results from Yahoo Search</Description>
<Language/>
<ldf:LocationType>OpenSearch</ldf:LocationType>
<ldf:Version>1.0.0.0</ldf:Version>
<ldf:IsPrefixPattern>False</ldf:IsPrefixPattern>
<ldf:ConnectionUrlTemplate>
http://api.search.yahoo.com/WebSearchService/rss/webSearch.xml?
appid=yahoosearchwebrss&amp;query={searchTerms}&amp;adult_ok=0
</ldf:ConnectionUrlTemplate>
<ldf:MoreLinkTemplate>
http://search.yahoo.com/search?ei=gbk&amp;fr=fp-tab-web-ycn&amp;
pid=ysearch&amp;source=yahoo_yhp_0706_search_button&amp;
p={searchTerms}
</ldf:MoreLinkTemplate>
<ldf:CreationDate>10/30/2007 7:17:19 PM</ldf:CreationDate>
<ldf:LastModifiedDate>10/30/2007 11:02:27 PM</ldf:LastModifiedDate>
<ldf:ExpirationDate>1/1/0001 12:00:00 AM</ldf:ExpirationDate>
<ldf:Visualization name="summary">
<Xsl>
inlined XSL snipped here...
</Xsl>
<Properties>
<Columns>
<Column Name="title"/>
<Column Name="link"/>
<Column Name="description"/>
</Columns>
</Properties>
<SampleData>
<rss xmlns="" version="2.0">
<channel>
<title/>
<link>http://www.sample.com/</link>
<description/>
<item>
<title>Title of document or web page</title>
<link/>
<description>This is the summary of ....</description>
</item>
</channel>
</rss>
</SampleData>
</ldf:Visualization>
<ldf:Visualization name="topanswer">
<Xsl>
inlined XSL snipped here...
</Xsl>
<Properties>
<Columns>
<Column Name="title"/>
<Column Name="link"/>
<Column Name="description"/>
</Columns>
</Properties>
<SampleData>
<rss xmlns="" version="2.0">
<channel>
<title/>
<link>http://www.sample.com/</link>
<description/>
<item>
<title>Title of document or web page</title>
<link/>
<description>This is the summary of ...</description>
</item>
</channel>
</rss>
</SampleData>
</ldf:Visualization>
</OpenSearchDescription>
...
Okay, there's a lot of new stuff there. But the first thing to point out is that this is a perfectly valid OpenSearch description document. The required elements are all there, and the extension elements (allowed by the OpenSearch specification) are properly enclosed in namespaces. (Granted, this isn't directly useful to existing OpenSearch clients due to the missing Url element, but it is still technically valid.)
To see what has been added, let's take a look at the Location Definition File schema, which extends the OpenSearch description document:
<OpenSearchDescription
xmlns="http://a9.com/-/spec/opensearch/1.1/"
xmlns:ldf="http://schemas.microsoft.com/Search/2007/location">
<ShortName>string</ShortName>
<Description>string</Description>
<Language>LCID</Language>
<Url type="application/sharepoint+xml" template="{searchTerms}"/>
<ldf:InternalName>string</ldf:InternalName>
<ldf:LocationType>type</ldf:LocationType>
<ldf:Version>version number</ldf:Version>
<ldf:IsPrefixPattern>true|false</ldf:IsPrefixPattern>
<ldf:Trigger>string</ldf:Trigger>
<ldf:ConnectionUrlTemplate>url</ldf:ConnectionUrlTemplate>
<ldf:MoreLinkTemplate>url</ldf:MoreLinkTemplate>
<ldf:CreationDate>date</ldf:CreationDate>
<ldf:LastModifiedDate>date</ldf:LastModifiedDate>
<ldf:IsRestricted>true|false</ldf:IsRestricted>
<ldf:AllowedSiteCollectionGuids>string</ldf:AllowedSiteCollectionGuids>
<ldf:Visualization name="string">
<Xsl>...</Xsl>
<Properties>...</Properties>
<SampleData>string</SampleData>
</ldf:Visualization>
</OpenSearchDescription>
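Because every Microsoft addition lives in the ldf namespace, a client can separate the core OpenSearch elements from the extensions mechanically. Here is a minimal sketch in Python using the standard library's ElementTree parser. The SAMPLE document is a trimmed copy of the Yahoo example, and the helper names split_tag and partition_children are made up for illustration; nothing here comes from Microsoft's documentation.

```python
import xml.etree.ElementTree as ET

OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"

# A trimmed-down Location Definition file, based on the Yahoo example.
SAMPLE = """<?xml version="1.0"?>
<OpenSearchDescription
    xmlns="http://a9.com/-/spec/opensearch/1.1/"
    xmlns:ldf="http://schemas.microsoft.com/Search/2007/location">
  <ShortName>Yahoo</ShortName>
  <ldf:InternalName>Yahoo</ldf:InternalName>
  <Description>Returns results from Yahoo Search</Description>
  <ldf:LocationType>OpenSearch</ldf:LocationType>
  <ldf:Version>1.0.0.0</ldf:Version>
</OpenSearchDescription>"""


def split_tag(tag):
    """Split ElementTree's '{namespace}local' tag into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return "", tag


def partition_children(xml_text):
    """Group the root's direct children into core and extension names.

    An element counts as 'core' if it is in the OpenSearch 1.1 namespace,
    and as an 'extension' otherwise.
    """
    root = ET.fromstring(xml_text)
    core, extensions = [], []
    for child in root:
        namespace, local = split_tag(child.tag)
        if namespace == OPENSEARCH_NS:
            core.append(local)
        else:
            extensions.append(local)
    return core, extensions


core, extensions = partition_children(SAMPLE)
print("core:", core)
print("extensions:", extensions)
```

A plain OpenSearch client could use the same split to ignore whatever it doesn't recognize, which is exactly what the namespace mechanism is for.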
Here are the new elements in turn:
InternalName - "Specifies a unique name to identify the location." Seems to differ from the ShortName in that it is unique and limited to 60 chars.
LocationType - Either "OpenSearch" for remote search servers or "SharePoint" for local servers. Interestingly, you use OpenSearch for remote SharePoint servers.
Version - Used for versioning the OpenSearch description document/Location Definition file itself. This is a great extension, and something that should be added to the core spec.
CreationDate, LastModifiedDate, and ExpirationDate - More metadata about the OpenSearch description document/Location Definition file itself, and again something that should be added to the core spec. Very nice.
IsPrefixPattern - Used with the Trigger element.
Trigger - "Only search queries matching the specified pattern are forwarded to the federated search location." Very cool. This uses regexes, and while it could probably be better generalized, it is exciting nonetheless.
ConnectionUrlTemplate and MoreLinkTemplate - "Query template that specifies how to pass search queries to the OpenSearch location's URL" and "Link template that specifies the URL of the HTML page that displays results for the search query." Hmm. These seem to serve the same purpose as the standard Url element (of type="atom" or "rss" and type="html" respectively) and they use the same template grammar. I'm not sure why these are needed. My hypothesis is that the Location Definition format was originally created based on OpenSearch 1.0, which had a more primitive Url element than 1.1 does. It's a shame though, as this alone prevents Location Definition files from being used by standard OpenSearch clients.
IsRestricted - I believe this is used as a flag to indicate that auth is required. I'm excited about Microsoft's work around ACL models for federated search.
AllowedSiteCollectionGuids - "Specifies a string that contains the site GUIDs permitted to use this federated search location." Again presumably related to ACLs, though I'm not quite sure how it works yet. I believe this is based on prior work with SharePoint and other services.
Visualization - Very interesting. The Location Definition file can give the client explicit directions about how to display the search results by providing an XSLT. If I understand correctly, this maps the supplied search results into the native SharePoint format.
Xsl - Contains the XSLT used for the Visualization. Nice in that it uses the XSL namespace and doesn't need to be escaped.
Properties - "Specifies the list of properties retrieved from the federated search location." A mapping specification for the Visualization transformation. I don't quite follow it, but I think that's because I haven't really looked at SharePoint yet.
SampleData - "Represents a string containing sample search results XML from the federated search location." Part of the Visualization element, it contains an example response for the client to render. It is nice how this relates to the standard Query element: where Query provides values for search queries, this provides data for responses. Interesting approach....
First off, it's great to see Microsoft continue to embrace OpenSearch, and extend it in a good way. They've remained faithful to the specification and introduced a number of valuable improvements.
One important thing that is missing is a link to their licensing terms. (The links at the bottom of their documentation appear to refer to the community content, not this spec.) The original OpenSearch specification is made available under the Creative Commons Attribution Share-Alike license, which means that Microsoft (or anyone else) has every right to use and extend it this way, though they need to license their extensions in the same fashion.
I imagine that the license omission is an accidental hiccup, and that once it is cleared up others will consider their spec sufficiently open to be reused.
...
For my part, I'd like to add some of these extensions directly into the OpenSearch core spec. In particular, the Version and date elements would be useful to everyone and could be added to the core spec right away.
Regarding the search triggers, I think Microsoft is onto something here. I'd love to discuss this more with them.
The InternalName element makes me wish that I had done a better job creating a reusable Title element in the first place. My fault entirely.
I'm not crazy about inlining the result transformations, though I think I understand why they did it this way. That said, I have a potentially different way to address this one, but I'd want to bounce it off of the Search Server team first.
And I'd love to get a better understanding of why Microsoft needed to define new Url elements. I still suspect it has something to do with legacy OpenSearch 1.0 support. Note that they also support the 1.1 Url element.
All told I'm completely thrilled about this. I can't wait to talk with the developers behind it.
...
Microsoft's TechNet has a forum dedicated to Search Server 2008, including a forum for federation. I'll subscribe to the feeds and keep tabs on how things progress.