
2 posts from April 2007

April 25, 2007

The Most Important Taxonomy for your Web Site

Taxonomies continue to be popular in companies, but I have to wonder if they are really that useful for the majority of organizations. I can’t tell you how many times I’ve had otherwise intelligent people tell me “We plan to implement a corporate enterprise search solution - as soon as we finish our taxonomy project”. When I hear this, I know search won’t be happening for at least two years, and in the meantime every visitor to the web site suffers. Usually I spend a few seconds feeling sorry for their users and/or employees, but then I realize that innovative companies with real work to do are moving ahead full speed with Enterprise Search 2.0 platforms and I feel better.

Taxonomies generally fall into one of two categories: subject-based taxonomies and content-based taxonomies.

Subject or "Domain" taxonomies attempt to completely describe all of the terms in a field, as well as the relationships between the terms. Typically these relationships are hierarchical, and these are the kind of taxonomies we use to classify knowledge - the kind of taxonomies your biology teacher would talk about. You need a real subject matter expert to create a useful subject-based taxonomy. And whatever you do, don't hire two (or more) subject experts, because they will never agree on the taxonomy.

Content-based taxonomies are organized using existing content. Organization charts, computer directory/folder structures, and social tagging content are typically 'content-based' taxonomies. These taxonomies are often built by humans - you do it yourself when you decide what folders to use on your computer. But they can also be built automatically with tools that many search and content management vendors sell.

Whether you go with a subject or a content taxonomy for your company, hooking it into your enterprise search technology will be a trick. This is the dirty little secret of the search software business: There are few, if any, commercial engines that can really take advantage of a complex taxonomy. What do you do with it, after all? Do you tag every document with the full taxonomy of terms in the hierarchy for every term in the document? Do you think that somehow the search engine will automatically know what to do with the taxonomy, and look up and down the taxonomy tree to find related terms? Verity had a great concept when they invented Topics in the late 80s, but since then even they have lost some of the taxonomy emphasis.
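To make the "look up and down the taxonomy tree" idea concrete, here is a minimal sketch in Python of what query expansion against a taxonomy might look like. Everything in it is an assumption made for illustration: the toy PARENT table, the term names, and the helper functions are hypothetical and do not describe how any commercial engine works.

```python
# Hypothetical toy taxonomy: each child term maps to its parent term.
PARENT = {
    "laptop": "computer",
    "desktop": "computer",
    "computer": "hardware",
    "printer": "hardware",
}


def ancestors(term):
    """Walk up the tree, collecting every broader term."""
    found = []
    while term in PARENT:
        term = PARENT[term]
        found.append(term)
    return found


def descendants(term):
    """Walk down the tree, collecting every narrower term."""
    found = []
    stack = [term]
    while stack:
        current = stack.pop()
        for child, parent in PARENT.items():
            if parent == current:
                found.append(child)
                stack.append(child)
    return found


def expand_query(term):
    """Return the term itself, then broader terms, then narrower terms."""
    return [term] + ancestors(term) + sorted(descendants(term))


print(expand_query("computer"))
```

Even in a toy like this, the hard questions remain open: whether broader terms should count as much as the original term, and how far up or down the tree to go before the results stop being relevant.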

We think there is a third kind of taxonomy that is even more important than the traditional subject and content taxonomies: we call it a Behavior-Based Taxonomy.

Really, the reason most companies want a taxonomy is to help people find content. You can probably keep several experts and a bunch of computers working for years to anticipate every possible term and every possible hierarchy that someone on your internet or intranet site may use. But we think the most important taxonomy on any web site is the list of search terms that people actually use when they search a site.

If your search engine can provide great results for the ‘top 100’ queries on your site, you have a lot of happy users. Why do you think search experts at trade shows have finally started talking about your search logs? You can't know what your behavior-based taxonomy (BBT) is unless you are monitoring your search activity at least quarterly. Verify that the ‘top 100’ queries are working fine - either with organic search results or with featured links (or best bets or result promotion, depending on your search vendor).
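As a rough illustration of pulling the 'top 100' out of a search log, here is a minimal Python sketch. It assumes you can already get the raw query strings out of your engine's logs as a list; the function names, the sample queries, and the has_good_result check are all hypothetical stand-ins for whatever your search vendor actually provides.

```python
from collections import Counter


def normalize(query):
    """Lowercase and collapse whitespace so trivial variants count together."""
    return " ".join(query.lower().split())


def top_queries(queries, n=100):
    """Return the n most frequent normalized queries with their counts."""
    counts = Counter(normalize(q) for q in queries if q.strip())
    return counts.most_common(n)


def needs_attention(top, has_good_result):
    """List the top queries for which the supplied check reports no good result."""
    return [query for query, _count in top if not has_good_result(query)]


# Hypothetical sample of raw queries pulled from a search log.
sample_log = [
    "Vacation policy",
    "vacation  policy",
    "expense report",
    "401k",
    "expense report ",
    "vacation policy",
]

top = top_queries(sample_log, n=2)
print(top)

# Stand-in check: pretend only "vacation policy" returns good results today.
print(needs_attention(top, lambda q: q == "vacation policy"))
```

The queries flagged by the second step are the candidates for featured links or best bets. Rerunning the report on a schedule, at least quarterly, is what keeps the list current.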

Keep your Behavior-Based Taxonomy up to date, and your search users will be satisfied!


April 04, 2007

FAST Acquires Convera's RetrievalWare

Big news this week from Norway. FAST Search and Transfer has announced two deals with Convera.

In the smaller deal, FAST announced that Convera will deploy FAST's AdMomentum - although the press release wasn't clear whether they would be a reseller of the product, or simply an end user.

The real whopper, however, was a one-paragraph press release announcing that FAST will acquire the RetrievalWare assets of Convera for $23M US. The acquisition of a smaller 'niche player' in the market (Gartner Magic Quadrant for Information Access Technology, 2006) might seem strange, until you consider that a huge proportion of RetrievalWare's installed base is in governments around the world - TechWeb's Intelligent Enterprise reports that some 70% of these government accounts are in the United States. This may mark a real opportunity for FAST to break into the US Government market in a big way.

So the consolidation continues. Autonomy bought a great list of qualified prospects when they picked up Verity two years ago. OK, they also picked up some technology that may be the basis for the long-needed interface around IDOL as well, if they can ever deploy it. FAST, on the other hand, gets a great installed base that doesn't change vendors very often, and got it for much less money. The G2 Enterprise Intelligence blog comments on how the announcement, along with deals with Comcast and Target, is helping FAST's stock price and, more importantly, its mind share. Soon you may decide to FAST yourself.

We say 'Let the games begin'!

What should be really interesting to watch is how the big American players do now. There are some government groups that want to buy from American software companies; so will this be the break that IBM, Oracle and Microsoft have been waiting for?