Sphere Blog

June 4, 2007

Sphere’s Document Genome

Filed under: Notable — sphere @ 4:14 am

Since launching our Sphere Related Content FeedFlare and WordPress.org plug-in, we’ve gotten a number of questions about how we magically make connections between content, so we thought it’d be fun to share some background. I still marvel at how well it works each time a new partner goes live with our Sphere Related Content plug-in.

Sphere’s contextual matching technology dynamically generates related content links without any information beyond the article’s text. Part of Sphere’s “magic” is that it does not rely on metadata, tags, links or a taxonomy to understand the article’s topic. In essence, the article is the metadata. This is a significant benefit to publisher partners and blog authors: partnering with Sphere requires no work on their part to provide hints, tag content or standardize metadata across properties. Sphere performs high-precision matching against multiple content collections by extracting a Document Genome™ from the page in real time and matching it against previously indexed Document Genomes™ from the same site and from other partner sites.
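For readers who like to see the general idea in code, here is a minimal sketch of content-only matching. It is not how the Document Genome™ works; it swaps in a plain TF-IDF weighting with cosine similarity, a common textbook technique, purely as an illustration. The function names and the tiny sample corpus are hypothetical and exist only for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def weigh(tokens, idf):
    """Build a sparse TF-IDF vector, skipping terms the index has never seen."""
    counts = Counter(tokens)
    return {term: count * idf[term] for term, count in counts.items() if term in idf}


def build_index(docs):
    """Turn {doc_id: text} into TF-IDF vectors and return them with the IDF table."""
    tokenized = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))
    n = len(docs)
    # Smoothed IDF: rarer terms weigh more, and no term gets a zero weight.
    idf = {term: math.log((1 + n) / (1 + df)) + 1 for term, df in doc_freq.items()}
    vectors = {doc_id: weigh(tokens, idf) for doc_id, tokens in tokenized.items()}
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def related(article_text, vectors, idf, top_k=3):
    """Rank indexed documents against an article using only the article's text."""
    query = weigh(tokenize(article_text), idf)
    scored = [(cosine(query, vec), doc_id) for doc_id, vec in vectors.items()]
    scored.sort(reverse=True)
    return [doc_id for score, doc_id in scored[:top_k] if score > 0]


# Hypothetical sample corpus: no tags, no taxonomy, just text.
SAMPLE_DOCS = {
    "blog-baseball": "The Yankees won the baseball game with a late home run",
    "news-election": "Voters head to the polls as the election campaign ends",
    "blog-cooking": "A simple recipe for tomato soup with fresh basil",
}

if __name__ == "__main__":
    vectors, idf = build_index(SAMPLE_DOCS)
    print(related("Baseball fans cheer as the home run decides the game", vectors, idf))
```

The point of the sketch is only that the ranking comes from the words in the article itself. Nothing in it needs tags or a shared taxonomy, which is the property described above.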

Sphere’s Document Genome™ lets us perform high-precision matching based on the article’s content alone. If metadata is available in the article, we can make use of it, although this usually isn’t required and may artificially constrain matching. We’ve done numerous deployments that relate content across multiple properties. The New York Times is a great example: we connect their articles to blogs, to other publishers’ content and to other New York Times articles, all within a few seconds of a reader clicking the Sphere icon. In addition to related content from their own sites, most of our partners show related links from blogs, drawn either from a defined list of sites or from the tens of millions of blogs that Sphere indexes.

If you have thoughts on, or experience with, alternative contextual matching technologies, don’t be shy about letting us know.
