July 07, 2014

Data quality as critical to 'big data' as it is to search

For years, we've been preaching in the wilderness about the importance of 'data quality' - the new name for 'garbage in, garbage out'. Maybe it gets so little respect because it was first cited on April Fools' Day (back in 1963, according to Wikipedia). Bad content has caused enterprise search owners headaches for years - heck, one of our most popular posts, Sixty Guys named Sarah, is really about a data quality problem.

Last Monday, Fortune Magazine posted an article called Big data's dirty problem, telling us all that:

"Inaccuracies, misspellings, and obsolete information makes achieving the big data utopia a slog for businesses and researchers"

For those of us who have worked in search for a while, it comes as no surprise. (It's also a reason why you need great search along with a big data distro to succeed.)

So many companies approach enterprise search with what I call a 'fire and forget' mentality. Google on the public web makes it look so easy - how hard can it be?

At so many companies, we've seen this vicious cycle: Pick the search platform that looks best, install it, ignore it, and repeat in two to four years. No - really, ask yourself: how long have you used your current enterprise search platform?

Now ask yourself: "How often do we review top queries, top misspellings, and zero-hit query reports?" In my experience, there is a direct correlation between how long a search platform lasts and how much money is invested in monitoring and maintaining it. In search, big data, and life, you get what you pay for.
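If you have never pulled one of these reports, here is a rough sketch of what the review might look like. It assumes you can export your query log as (query, hit count) pairs; that format, the function name `summarize_query_log`, and the sample data are all made up for illustration, so adapt them to whatever your search platform actually logs.

```python
from collections import Counter


def summarize_query_log(entries, top_n=3):
    """Summarize a query log given as (query, hit_count) pairs.

    The pair format is an assumption for this sketch; real platforms
    log queries in their own formats.
    """
    all_queries = Counter()
    zero_hit = Counter()
    for query, hits in entries:
        # Normalize case and whitespace so trivial variants count together.
        normalized = " ".join(query.lower().split())
        if not normalized:
            continue
        all_queries[normalized] += 1
        if hits == 0:
            zero_hit[normalized] += 1
    total = sum(all_queries.values())
    return {
        "top_queries": all_queries.most_common(top_n),
        "top_zero_hit_queries": zero_hit.most_common(top_n),
        "zero_hit_rate": sum(zero_hit.values()) / total if total else 0.0,
    }


# Hypothetical sample log: misspellings tend to surface as zero-hit queries.
sample_log = [
    ("vacation policy", 12),
    ("Vacation  Policy", 9),
    ("vaccation policy", 0),
    ("expense report", 30),
    ("vaccation policy", 0),
    ("benifits", 0),
]

report = summarize_query_log(sample_log)
print(report["top_queries"])
print(report["top_zero_hit_queries"])
print(report["zero_hit_rate"])
```

The zero-hit list is usually the quickest place to spot misspellings and missing content. Run something like this monthly and you are already ahead of most 'fire and forget' installations.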

Looking to end the vicious cycle of 'roll out, replace, repeat' with your enterprise search or big data apps? You might start with a data audit - we can help.

Miles Kehoe/July 6, 2014
