June 04, 2008

Tips For Building Drill Down Navigators

Taking a cue from tagging on social networking sites, we find that about six tags can identify most documents. Here are a couple of tips for building drill-down navigators:

Who
      Author
      Attendees
      Teams
      Group
      Project

Where
      City
      County
      Site
      Meeting Room

What
      White papers
      Specs
      Presentation
      Meeting Notes
      Audio
      Products

Why - Visitor's Goal
      Purchase Product
      Find a Store
      Service
      HR Transactions

When
      Date
      Event
      Revs

How
      Tips
      Best Practices
      Service Manual
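
To make the who/where/what/why/when/how idea concrete, here is a minimal sketch of a drill-down navigator in Python. The documents, facet values, and the function names `facet_counts` and `drill_down` are all made up for illustration; a real search engine would compute these counts from its index rather than from a list in memory.

```python
from collections import Counter

# Hypothetical documents, each tagged with about six facets.
DOCS = [
    {"title": "Widget spec v2", "who": "Platform Team", "where": "Austin",
     "what": "Specs", "why": "Service", "when": "2008-05", "how": "Service Manual"},
    {"title": "Launch kickoff slides", "who": "Marketing", "where": "Austin",
     "what": "Presentation", "why": "Purchase Product", "when": "2008-06", "how": "Tips"},
    {"title": "Store locator notes", "who": "Marketing", "where": "Denver",
     "what": "Meeting Notes", "why": "Find a Store", "when": "2008-06", "how": "Best Practices"},
    {"title": "Support deck", "who": "Platform Team", "where": "Denver",
     "what": "Presentation", "why": "Service", "when": "2008-04", "how": "Best Practices"},
]


def facet_counts(docs, facet):
    """Count how many documents carry each value of one facet."""
    return Counter(d[facet] for d in docs if facet in d)


def drill_down(docs, **selected):
    """Keep only the documents that match every selected facet value."""
    return [d for d in docs
            if all(d.get(facet) == value for facet, value in selected.items())]


if __name__ == "__main__":
    # Show the values (with counts) a navigator would offer for "who".
    print(facet_counts(DOCS, "who"))
    # After the visitor picks Marketing, recompute the "where" choices.
    narrowed = drill_down(DOCS, who="Marketing")
    print(facet_counts(narrowed, "where"))
```

Each click narrows the result list and the counts for the remaining facets are recomputed from the narrowed list, which is what keeps a navigator from offering dead-end choices.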

 

We have blogged about implicit versus explicit tagging, a big difference between enterprise and public web sites. Our article "5 Steps to Better Tagging" is online in the archives of our Enterprise Search newsletter.

 

October 29, 2007

Google Appliance Growing Up?

 

The newest version of the Google Search Appliance (GSA) is available, and it's starting to look like a pretty decent solution for more and more corporations.

Google released Version 5, which provides what they call "Universal Search", in October. The newest release for the entire GSA line (except the Mini) includes a number of excellent enterprise features: enhanced security; parametric search; Wiki KeyMatch, a social tagging feature for best bets; and an application called OneBox, a search federation tool.

GSA security now includes Windows Integrated Authentication (WIA) and a security API for handling special security needs. It handles security both at crawl time and at search time. It fully respects data store security from all sources, so users only see the documents, best bets, parametric results, and features that they have permission to view.

The parametric code in Universal Search is based on open source code available from Google (http://code.google.com/p/parametric/). In demos, it looks like most of the parametric demos we've seen, so we'll have more to say once we have a chance to drill down.

The odd feature in this release is Wiki KeyMatch. Essentially, it lets any employee tag a search result list by adding "best bet" suggestions to the top of the result list for a given query. It looks like anyone can suggest a related or better result for any query. Apparently this has worked well inside Google for a while, and Google folks say it's great. Administrators are notified when tags are added or updated, and the best bet does show who created the tag. As Jimmy Wales says about Wikipedia, anyone posting understands that a best bet that is not useful or appropriate is going to be removed; so any author who wants his or her best bet to survive had better make it good. I have to admit the corporate managers I've talked to are a bit skeptical, but where it works it can potentially put the 'wisdom of the crowd' to use to get better results.

OneBox is a search federation application that provides a way to combine results from a number of different corporate data sources, as well as from Google Apps. As one of the Google folks said recently, "One Box is a way of pulling in live data (such as employee info, salesforce.com data, business objects data) right into your search results."

Google has connectors for SharePoint, Documentum, Livelink, and FileNet, as well as for Google Apps. They provide an API so you can write your own, and we're sure third party developers are busy working on them now. The Google-provided connectors are free, but third party connectors may be priced depending on how each developer wants to market them.

Finally, Google also seems to have improved their existing "data biasing" to allow administrators to 'query tweak' using URL patterns and document recency.
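
We have not seen how the GSA applies its biasing internally, so the following is only a generic sketch of what tweaking a score by URL pattern and document recency can look like. The rule table `URL_BIAS`, the function `biased_score`, the multipliers, and the one-year half-life are all assumptions chosen for illustration, not Google settings.

```python
from datetime import date
from fnmatch import fnmatchcase

# Hypothetical bias rules: (URL glob pattern, score multiplier).
URL_BIAS = [
    ("*/specs/*", 1.5),    # promote current specifications
    ("*/archive/*", 0.5),  # demote archived material
]


def biased_score(base_score, url, modified, today, half_life_days=365):
    """Adjust a relevance score by URL pattern and by document age."""
    score = base_score
    # Apply every URL rule whose pattern matches.
    for pattern, multiplier in URL_BIAS:
        if fnmatchcase(url, pattern):
            score *= multiplier
    # Recency factor: 1.0 for a document modified today, halving every
    # half_life_days. Blended so old documents keep at least half their score.
    age_days = max((today - modified).days, 0)
    recency = 0.5 ** (age_days / half_life_days)
    return score * (0.5 + 0.5 * recency)


if __name__ == "__main__":
    today = date(2007, 10, 29)
    print(biased_score(1.0, "http://intranet/specs/widget.html", today, today))
    print(biased_score(1.0, "http://intranet/archive/old.html", date(2006, 10, 29), today))
```

The point of the sketch is that biasing only nudges the ranking the engine already produced; it does not replace relevance, which is why the adjustments are multipliers on a base score.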

The only bad news for small users and corporate departments is that the new upgrade and features are not (yet) available for the popular Google Mini.

If you looked at the Google offerings a while ago and they didn’t meet your needs, you may want to take a second look. It looks like they’ve started to come of age in the enterprise search market.

 

 

April 25, 2007

The Most Important Taxonomy for your Web Site

Taxonomies continue to be popular in companies, but I have to wonder if they are really that useful for the majority of organizations. I can't tell you how many times I've had otherwise intelligent people tell me "We plan to implement a corporate enterprise search solution – as soon as we finish our taxonomy project". When I hear this, I know search won't be happening for at least two years, and in the meantime every visitor to the web site suffers. Usually I spend a few seconds feeling sorry for their users and/or employees, but then I realize that innovative companies with real work to do are moving ahead full speed with Enterprise Search 2.0 platforms and I feel better.

Taxonomies generally fall into one of two categories: subject-based taxonomies and content-based taxonomies.

Subject or "Domain" taxonomies attempt to completely describe all of the terms in a field, as well as the relationships between the terms. Typically these relationships are hierarchical, and they are the kind of taxonomies we use to classify knowledge - the kind of taxonomies your biology teacher would talk about. You need a real subject matter expert to create a useful subject-based taxonomy. And whatever you do, don't hire two (or more) subject experts, because they will never agree on the taxonomy.

Content-based taxonomies are organized using existing content. Organization charts, computer directory/folder structures, and social tagging content are typical 'content-based' taxonomies. These taxonomies are often built by humans - you do it yourself when you decide what folders to use on your computer. But they can also be built automatically with tools that many search and content management vendors sell.

Whether you go with a subject or a content taxonomy for your company, hooking it into your enterprise search technology will be a trick. This is the dirty little secret of the search software business: There are few, if any, commercial engines that can really take advantage of a complex taxonomy. What do you do with it, after all? Do you tag every document with the full taxonomy of terms in the hierarchy for every term in the document? Do you think that somehow the search engine will automatically know what to do with the taxonomy, and look up and down the taxonomy tree to find related terms? Verity had a great concept when they invented Topics in the late 80s, but since then even they have lost some of the taxonomy emphasis.

We think there is a third kind of taxonomy that is even more important than the traditional subject and content taxonomies: we call it a Behavior-Based Taxonomy.

Really, the reason most companies want a taxonomy is to help people find content. You can probably keep several experts and a bunch of computers working for years to anticipate every possible term and every possible hierarchy that someone on your internet or intranet site may use. But we think the most important taxonomy on any web site is the list of search terms that people actually use when they search the site.

If your search engine can provide great results for the ‘top 100’ queries on your site, you have a lot of happy users. Why do you think search experts at trade shows have finally started talking about your search logs? You can't know what your behavior-based taxonomy (BBT) is unless you are monitoring your search activity at least quarterly. Verify that the ‘top 100’ queries are working fine - either with organic search results or with featured links (or best bets or result promotion, depending on your search vendor).
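
As a rough illustration of that quarterly check, the sketch below pulls the most frequent queries out of a search log and lists the ones that still need attention. It assumes the log has already been reduced to one query string per line; `top_queries`, `queries_needing_attention`, and the `has_good_results` callback are hypothetical names, and deciding what counts as a "good" organic result is left to you or your search vendor's reporting.

```python
from collections import Counter


def top_queries(log_lines, n=100):
    """Return the n most frequent queries as (query, count) pairs.

    Queries are lowercased and whitespace-normalized so that
    "Vacation Policy" and "vacation  policy" count as the same term.
    """
    counts = Counter(
        " ".join(line.lower().split()) for line in log_lines if line.strip()
    )
    return counts.most_common(n)


def queries_needing_attention(top, has_good_results, featured_links):
    """List top queries with neither good organic results nor a featured link."""
    return [
        query for query, _count in top
        if not has_good_results(query) and query not in featured_links
    ]


if __name__ == "__main__":
    # Hypothetical log excerpt: one query per line.
    log = [
        "Vacation Policy", "vacation  policy", "expense report",
        "401k", "vacation policy", "expense report",
    ]
    top = top_queries(log, n=2)
    print(top)
    # Stand-in check: pretend only "vacation policy" returns good results.
    print(queries_needing_attention(top, lambda q: q == "vacation policy", {}))
```

Whatever the second function returns is your to-do list: fix the content, tune the engine, or add a featured link for each query on it.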

Keep your Behavior-Based Taxonomy up to date, and your search users will be satisfied!

 
