The Most Important Taxonomy for your Web Site
Taxonomies continue to be popular in companies, but I have to wonder if they are
really that useful for the majority of organizations. I can’t tell you how many
times I’ve had otherwise intelligent people tell me, “We plan to implement a
corporate enterprise search solution – as soon as we finish our taxonomy
project”. When I hear this, I know search won’t be happening for at least two
years, and in the meantime every visitor to the web site suffers. Usually I spend
a few seconds feeling sorry for their users and/or employees, but then I realize
that innovative companies with real work to do are moving ahead full speed with
Enterprise Search 2.0 platforms, and I feel better.
Taxonomies generally fall into one of two categories: subject-based taxonomies
and content-based taxonomies.
Subject or "Domain" taxonomies attempt to completely describe all of the
terms in a field, as well as the relationships between the terms. Typically
these relationships are hierarchical, and they are the kind of taxonomies we use
to classify knowledge - the kind of taxonomies your biology teacher would talk
about. You need a real subject matter expert to create a useful subject-based
taxonomy. And whatever you do, don't hire two (or more) subject experts,
because they will never agree on the taxonomy.
Content-based taxonomies are organized using existing content. Organization
charts, computer directory/folder structures, and social tagging content are
typical 'content-based' taxonomies. These taxonomies are often built by humans -
you do it yourself when you decide what folders to use on your computer. But they
can also be built automatically with tools that many search and content
management vendors sell.
Whether you go with a subject or a content taxonomy for your company, hooking it into
your enterprise search technology will be a trick. This is the dirty little secret
of the search software business: There are few, if any, commercial engines that
can really take advantage of a complex taxonomy. What do you do with it, after
all? Do you tag every document with the full taxonomy of terms in the hierarchy
for every term in the document? Do you think that somehow the search engine will
automatically know what to do with the taxonomy, and look up and down the
taxonomy tree to find related terms? Verity had a great concept when they
invented Topics in the late 80s, but since then even they have lost some of the
taxonomy emphasis.
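To make the first of those questions concrete, here is a minimal sketch of what
"tag every document with the full hierarchy" could mean in practice. The tiny
taxonomy, the PARENT mapping, and the function names are all hypothetical and
exist only for illustration; this is not how any particular search engine works.

```python
# Hypothetical child -> parent mapping for a tiny subject taxonomy.
PARENT = {
    "beagle": "dog",
    "dog": "mammal",
    "cat": "mammal",
    "mammal": "animal",
}


def ancestors(term):
    """Return the chain of broader terms above `term`, nearest first."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain


def expand_tags(document_terms):
    """Tag a document with each taxonomy term it contains plus all ancestors."""
    tags = set()
    for term in document_terms:
        # Only terms that appear somewhere in the taxonomy are tagged.
        if term in PARENT or term in PARENT.values():
            tags.add(term)
            tags.update(ancestors(term))
    return tags


print(sorted(expand_tags(["beagle", "report"])))
# ['animal', 'beagle', 'dog', 'mammal']
```

Even in this toy case a single matched term turns into four tags, which hints
at how quickly the tagging burden grows with a deep hierarchy.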
We think there is a third kind of taxonomy that is even more important than the
traditional subject and content taxonomies: we call it a Behavior-Based
Taxonomy.
Really, the reason most companies want a taxonomy is to help people find content.
You can probably keep several experts and a bunch of computers working for years
to anticipate every possible term and every possible hierarchy that someone on
your internet or intranet site may use. But we think the most important taxonomy
on any web site is the list of search terms that people actually use when they
search the site.
If your search engine can provide great results for the ‘top 100’ queries on your
site, you will have a lot of happy users. Why do you think search experts at
trade shows have finally started talking about your search logs? You can't know
what your behavior-based taxonomy (BBT) is unless you are monitoring your search
activity at least quarterly. Verify that the ‘top 100’ queries are working well,
either with organic search results or with featured links (also called best bets
or result promotion, depending on your search vendor).
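As a rough sketch of what that monitoring might look like, the Python below
counts the most frequent queries in a search log. The log format (one raw query
per line), the sample queries, and the names normalize and top_queries are
assumptions for illustration; every search engine logs queries in its own way,
so the parsing step would need to be adapted.

```python
from collections import Counter


def normalize(query):
    """Lowercase and collapse whitespace so trivially different queries match."""
    return " ".join(query.lower().split())


def top_queries(lines, n=100):
    """Return the n most frequent normalized queries as (query, count) pairs."""
    counts = Counter()
    for line in lines:
        query = normalize(line)
        if query:  # skip blank lines
            counts[query] += 1
    return counts.most_common(n)


# Hypothetical sample log: one raw query per line.
sample_log = [
    "Vacation policy",
    "vacation  policy",
    "expense report",
    "",
    "benefits",
    "expense report",
    "vacation policy",
]

for query, count in top_queries(sample_log, n=2):
    print(f"{count:4d}  {query}")
```

The resulting list is the behavior-based taxonomy in its simplest form: run each
entry through your search page and check that the results are what a user would
expect.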
Keep your Behavior-Based Taxonomy up to date, and your search users will be
satisfied!
Hi.
Well, the point is that neither Autonomy nor Fast nor Recommind has a solution for taxonomy. IBM (OmniFind) does not have a solution either. The only working solution on this planet is InfoCodex. InfoCodex comes with 3 million words and can actually do cross-language search, so if you search in English it will also find Spanish or Italian documents. Another nice feature is that you can find similar documents as well. There is no other software that can do that without 1 day of training. InfoCodex can do that.
Best
Zeno
Posted by: Zeno Davatz | April 30, 2007 at 08:15 AM