
March 09, 2010

Enterprise search engines: They're *not* all the same

We're in the process of doing a search engine evaluation for a large customer. That, by itself, isn't news: we do these quite often for companies large and small. No, what makes this project particularly interesting is that we are doing side-by-side comparisons of three leading search technologies using industry-standard data sets.

Our assumption going in was that, for out-of-the-box simple searches, all three engines would return pretty much the same set of results: after all, if TF/IDF (term frequency/inverse document frequency) is at the core of these technologies, they should produce roughly the same result sets. Much to our surprise, when we look at the top 10 search results from each engine for a simple search, we get only about 15% overlap.
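
Just to illustrate where that expectation came from: the core of TF/IDF scoring fits in a few lines of Python. The corpus and query below are toy examples of my own, not from the evaluation data.

    import math
    from collections import Counter

    # Toy corpus -- the documents and the query are invented for illustration.
    docs = {
        "d1": "enterprise search engine evaluation",
        "d2": "web search engine relevance ranking",
        "d3": "document indexing and retrieval",
    }

    def tfidf_rank(query, docs):
        tokenized = {d: text.lower().split() for d, text in docs.items()}
        n = len(docs)
        df = Counter()                      # document frequency per term
        for terms in tokenized.values():
            df.update(set(terms))
        scores = {}
        for d, terms in tokenized.items():
            tf = Counter(terms)
            scores[d] = sum(
                (tf[t] / len(terms)) * math.log(n / df[t])   # tf * idf
                for t in query.lower().split()
                if t in tf
            )
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

    print(tfidf_rank("search engine", docs))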

Let me explain it this way: if we retrieve the top ten search results for a specific query from one engine, only about 3 of the twenty results returned by the other two engines - 15% - also appear in that list. In a typical list of 10 results, only about 3 show up in more than one engine. We were especially amazed because we are going out of our way to use default parameters as much as possible: no entity extraction, no search tuning, no special synonyms or thesaurus terms.
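
For what it's worth, the overlap measurement itself is simple - roughly something like the sketch below, with invented result IDs standing in for what the engines actually returned.

    from itertools import combinations

    # Made-up result IDs; in practice each engine's full top-10 list goes here.
    top10 = {
        "engine_a": {"doc01", "doc02", "doc03", "doc04", "doc05"},
        "engine_b": {"doc03", "doc06", "doc07", "doc08", "doc09"},
        "engine_c": {"doc03", "doc02", "doc10", "doc11", "doc12"},
    }

    # Pairwise overlap between the engines' top-N lists.
    for a, b in combinations(top10, 2):
        shared = top10[a] & top10[b]
        print(f"{a} vs {b}: {len(shared)} shared -> {sorted(shared)}")

    # Results that more than one engine returned.
    all_ids = [doc for ids in top10.values() for doc in ids]
    multi = {doc for doc in all_ids if all_ids.count(doc) > 1}
    print("returned by more than one engine:", sorted(multi))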

It's still too early in the process for us to understand what's behind this surprising situation: it's always possible the results are too tentative to support any judgments, or we may yet find an error in our methodology. We're working on it, and we'll report back with any findings we can share. If you have an explanation, leave a comment - we'd love to hear what you think.

/s/Miles


Comments

I think Rup's comparison of Google's internet search with the performance of its GSA for enterprise search was well presented. Even though the same pool of information is being searched, my guess would be that the software configurations vary between the three engines.
===
His comment back on March 11 was good; it's tough to get any two search engines to see exactly the same content. We once stood up three engines to compare platforms, and running against the same data we found wide variations in the total number of indexed documents. Crawling can be a real issue. "Trust, but verify."
/s/Miles

Hi Miles

Sometimes it's good to include something completely from left field in an evaluation. The results could be a revelation.

A text-search query will find documents containing a combination of the keywords in the query.

An item-search query using the same keywords will find entire documents that are relevant to the query.

Xyggy (www.xyggy.com) finds similar (relevant) items. There is a bigger story here about a search engine that learns and generalizes in much the same way that humans do. Happy to discuss offline.

Dinesh
www.xyggy.com

Hey Miles,

What you're expecting is generally what most search engine specialists expect to occur.

Many search engines are built around an inverted index, with relevance scoring typically based on the term frequency/inverse document frequency model you mention in your article.
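
Roughly speaking, an inverted index just maps each term to the documents that contain it. A toy sketch in Python (the documents are made up, and a real engine stores far more per entry):

    from collections import defaultdict

    # Toy documents; a real engine also stores positions, fields, frequencies, etc.
    docs = {"d1": "hiking in the mountains", "d2": "mountain search and rescue"}

    def build_index(docs):
        index = defaultdict(set)     # term -> set of document ids containing it
        for doc_id, text in docs.items():
            for term in text.lower().split():
                index[term].add(doc_id)
        return index

    index = build_index(docs)
    print(index["hiking"])           # {'d1'}
    print(index["search"])           # {'d2'}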

This is really just one part of the complexity that goes into ordering and scoring the relevancy of the documents that are returned.

For example, many search engines process each document as it is added to the engine and perform stemming/lemmatization and term expansion: e.g., if a document mentions "hiking", a related term like "backpacking" may be added to that document for search purposes.
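
As a rough illustration of that kind of index-time expansion - the suffix rules and synonym table here are invented, not how any particular engine does it:

    # Invented suffix rules and synonym table, just to show the idea.
    SYNONYMS = {"hiking": ["backpacking", "trekking"]}

    def naive_stem(term):
        for suffix in ("ing", "ed", "s"):
            if term.endswith(suffix) and len(term) > len(suffix) + 2:
                return term[: -len(suffix)]
        return term

    def expand(terms):
        expanded = set()
        for t in terms:
            expanded.add(t)
            expanded.add(naive_stem(t))           # crude stemming
            expanded.update(SYNONYMS.get(t, []))  # related-term expansion
        return expanded

    print(expand(["hiking"]))   # {'hiking', 'hik', 'backpacking', 'trekking'}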

Some search engines do this better than others, and some (mostly the older ones) don't do this at all.

Similarly, some engines allow you to control and configure this and many other processes that enhance the original content - for example, letting you add expansion for business terms unique to the client or department.

This is just one example; there are potentially hundreds of different ways a search engine can enhance the original document and/or the user's query.

When performing a vendor selection, you really need to understand the client's needs and then look at how configurable each engine is, so you can map those requirements to technical functionality.

I've been doing this for almost 4 years now and I'm still learning; the subject really is that complex, and it's still developing.
===
Agreed, Pete. Heck, I've been doing search engines since 1989 at Verity. Still, I was surprised to see such differences among three 'leading' commercial vendors. Thanks for your comment, and sorry it took so long to post. /s/Miles

One reason could be the difference in relevancy models across search engines. I have noticed changes in the top results after nothing more than a version upgrade, without touching any of the configuration.

I think determining why particular results came out on top for each engine individually, and then overlaying the findings for the three engines, would give a better picture.

For example, when indexing HTML content, the text extracted may not be the same across all engines. The weight given to the title and summary may be different from the weight given to the full-text content. When you analyze the top results per engine, you will see these patterns and can judge which engine better suits the purpose.
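
As a toy illustration of field weighting - the boost values below are arbitrary, not taken from any real engine:

    # Arbitrary boosts, just to show the idea of weighting fields differently.
    FIELD_BOOSTS = {"title": 3.0, "summary": 1.5, "body": 1.0}

    def weighted_score(query_terms, doc_fields):
        score = 0.0
        for field, text in doc_fields.items():
            terms = text.lower().split()
            hits = sum(terms.count(t) for t in query_terms)
            score += FIELD_BOOSTS.get(field, 1.0) * hits
        return score

    doc = {"title": "enterprise search engines", "body": "a comparison of three engines"}
    print(weighted_score(["search", "engines"], doc))   # title hits count 3x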

It would be interesting to know what questions/words you use to test: number of keywords, natural-language phrases...

On the other hand, take into account the expected relevance and don't worry about the % of overlap. Every search engine has its own recipe for matching and ranking results. Overlap does not mean relevance for the end user.

I think it would be more useful to compare different approaches to search (Bayesian, Boolean, semantic, thesaurus...) rather than leading search technologies. Look at the GSA: it's Google technology for the enterprise, but it does not work as well as Google on the internet. (The reason for that is another issue...)

Methodology: give the power to the users. Collect some real examples of how they search (or would search) for a concrete document: let them test, and let them re-search until the desired subject appears in the top 10. Give them freedom to use keywords, natural language... Take into account that the number of different ways to search for one subject is unlimited.
