August 27, 2007

Data Matters! Map Your Data to Search Engine Features (Part 1)

There have been a lot of discussions lately (some private) about whether folks should use taxonomies or faceted navigation, what type of clustering is best, what tool should be used to tag documents, how well automated clustering works, and so on.

[Image: Levels of organization diagram]

But first, a definition:

Navigators: Clickable things in search results that let users "drill down" into the results to narrow in on what they were really looking for. There are many types of navigators out there, and they all look the same to users (and to many corporate managers), but behind the scenes their implementations are very different and depend heavily on both the data and the vendor.
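To make the definition concrete, here is a minimal sketch of one kind of navigator: a facet built from a structured field. The sample documents, the field names (`brand`, `type`), and the helper names `facet_counts` and `drill_down` are all made up for illustration; a real search engine does this work inside its index, and each vendor does it differently.

```python
from collections import Counter

# Hypothetical search results; each document carries structured fields.
RESULTS = [
    {"title": "Laser printer X100", "brand": "Acme", "type": "printer"},
    {"title": "Inkjet printer J20", "brand": "Bolt", "type": "printer"},
    {"title": "Flatbed scanner S5", "brand": "Acme", "type": "scanner"},
]


def facet_counts(results, field):
    """Count how many results carry each value of the given field.

    These counts are what a navigator would display, e.g. "Acme (2)".
    """
    return Counter(doc[field] for doc in results if field in doc)


def drill_down(results, field, value):
    """Keep only the results whose field matches the value the user clicked."""
    return [doc for doc in results if doc.get(field) == value]


# Build the "brand" navigator, then simulate a click on "Acme".
brand_navigator = facet_counts(RESULTS, "brand")
acme_only = drill_down(RESULTS, "brand", "Acme")

# After the click, the remaining navigators are recomputed on the narrower set.
type_navigator = facet_counts(acme_only, "type")
```

Note that this only works because every document already has clean `brand` and `type` values, which is exactly the data dependency described above.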

While we're big fans of all this whizzy tech, before anybody runs out and grabs something, we strongly suggest they give some careful thought to what type of data they have, and map that to what type of tool they should get. There is huge confusion in the industry about all these buzzwords, but many of these fancy-sounding techniques are really targeted at fixing specific problems arising from specific types of problematic data. Better data can use simpler and much more reliable techniques.

The short answer is that, if your data is highly structured, you go to the head of the class! You can use really mature technology such as parametric search or faceted navigation. If your data is somewhat less structured, you can still probably create a taxonomy, or automatically "upgrade" your documents through additional tagging; data in this state will probably need some massaging. Folks with totally unstructured data are the ones who need the new fancy stuff, which may or may not work all that well.
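As a rough illustration of that mapping, the sketch below measures how many of the expected fields are actually filled in across a set of documents, then picks one of the three tiers. The function names and the 0.9 and 0.3 cutoffs are assumptions made up for this example, not measured values; treat them as placeholders to tune against your own data.

```python
def field_coverage(docs, fields):
    """Return the fraction of (document, field) slots holding a non-empty value."""
    if not docs or not fields:
        return 0.0
    filled = sum(1 for doc in docs for field in fields if doc.get(field))
    return filled / (len(docs) * len(fields))


def suggest_approach(coverage, high=0.9, low=0.3):
    """Map a coverage score to one of the three tiers described above.

    The `high` and `low` thresholds are illustrative assumptions only.
    """
    if coverage >= high:
        return "parametric search or faceted navigation"
    if coverage >= low:
        return "taxonomy or additional tagging"
    return "newer techniques such as automated clustering"


# Hypothetical sample: one complete record, one record missing its type.
sample = [
    {"brand": "Acme", "type": "printer"},
    {"brand": "Acme", "type": ""},
]
coverage = field_coverage(sample, ["brand", "type"])
suggestion = suggest_approach(coverage)
```

A single coverage number is a crude proxy, of course; the consistency and quality of the values matter just as much as whether the fields are filled in.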

Data, data, data! Know your data before you buy technology: that's our advice!

So which type of navigator should you use? And which types of data and navigators map to which tools? And how is this all done? That's the subject of some upcoming articles that we're working on; this post is just to get the ball rolling (basically, we're previewing a draft of the first diagram from that series). We promise to explain all of this to our faithful readers!