78 posts categorized "Enterprise search"

March 09, 2010

Enterprise search engines: They're *not* all the same

We're in the process of doing a search engine evaluation for a large customer. That, by itself, isn't news: we do those quite a bit for companies large and small. No, what makes this project most interesting is that we are doing side-by-side comparisons of three leading search technologies using industry-standard data sets.

Our assumption going in was that, for out-of-box simple searches, all three engines would return pretty much the same set of results: after all, if TF/IDF (term frequency/inverse document frequency) was at the core of these technologies, they should be getting roughly the same result sets. Much to our surprise, if we look at the top 10 search results from each engine for a simple search, we get only about 15% overlap.

Let me explain it this way: if we retrieve ten search results for a specific query from one search engine, only 3 of the twenty results returned by the other two engines (15%) also appear in that list. In a typical list of 10 results, only 3 show up in more than one engine. We were especially amazed because we are going out of our way to use default parameters as much as possible: no entity extraction, no search tuning, no special synonyms or thesaurus terms.
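
One way to read the 15% figure, assuming it means the share of the other two engines' twenty results that also appear in the first engine's top ten, is as a simple set-overlap calculation. The sketch below is for illustration only: the function name and the document IDs are hypothetical, not the data or tooling from the evaluation.

```python
def overlap_fraction(reference, others):
    """Fraction of the other engines' results that also appear in `reference`.

    `reference` is one engine's result list; `others` is a list of result
    lists from the remaining engines. Returns 0.0 if there is nothing to compare.
    """
    ref = set(reference)
    total = sum(len(results) for results in others)
    if total == 0:
        return 0.0
    shared = sum(1 for results in others for doc in results if doc in ref)
    return shared / total


# Hypothetical top-10 lists, invented purely to illustrate the arithmetic.
engine_a = [f"doc{i}" for i in range(1, 11)]                 # doc1 .. doc10
engine_b = ["doc1", "doc2"] + [f"b{i}" for i in range(8)]    # 2 shared with A
engine_c = ["doc3"] + [f"c{i}" for i in range(9)]            # 1 shared with A

# 3 shared results out of 20 from the other two engines
print(overlap_fraction(engine_a, [engine_b, engine_c]))
```

Other overlap measures (for example, Jaccard similarity over the union of all three lists) would give different numbers for the same data, so it matters which definition a comparison uses.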

We're still too early in the process to understand what's behind this surprising situation: it's always possible the results are too tentative to make any judgments, or we could find an error in our methodology.  We're working on it, and we'll get back with any findings that we can share. If you have any explanations, leave a comment - we'd love to hear what you think.

/s/Miles

February 24, 2010

Enterprise Search Summit 2010 - DC

Even as we prepare for ESS East in New York (ESS NY from now on?), Information Today has issued its call for papers for the first ever ESS-DC to be held in Washington DC November 16-18 2010.

Follow this link to find background on what InfoToday is looking for; or jump right to the submissions page. Don't be shy: everyone who presents papers had, at one time, never done it before. What you know, someone else needs to know!

In our experience, the kind of content InfoToday likes is the information that can help an organization select or manage search and related technologies. Generally, real-world stories about how other companies and organizations have succeeded with search are the ones that attendees appreciate the most. 

We'll also be having a searchdev dinner at ESS DC this year. Details to come late in summer, but plan for it now!

Are you doing search now? Have you been successful getting it going on time and under budget? Tell your story. Submit your idea now!

February 10, 2010

Acquisition Wednesday

As we hinted here last week, Autonomy has announced that it has acquired MicroLink, its 2007 Partner of the Year.

MicroLink, a major player for Autonomy and for Microsoft in the federal space, has been a reseller and implementation partner for both for years. As recently as last year, MicroLink started development of a very cool social search product that helped blur the lines between enterprise search and social search on the SharePoint platform, and had architected its application to sit on FAST as well as IDOL. There were even hints that they were eying a Lucene platform.

I would have loved to be able to hear the negotiations between Microsoft and Autonomy concerning access to internals of the FAST search engine currently being integrated tightly into SharePoint. The story we've heard is that the Microsoft negotiations contributed to the delay in the announcement, since apparently folks in both companies have had the news since at least Christmas.

It will be interesting to see what happens now. We've always thought of MicroLink as a consulting firm, delivering implementation support. Mike Lynch, Autonomy's boss, has never had good things to say about consultants, and has certainly overseen the dwindling of the Verity consulting group he acquired a few years back. Either he's decided that independent consultants are bad but his own consultants are good, or he's hoping he can reduce his 'days outstanding' receivables by bringing the implementers in house. Let's hope it wasn't $55M spent just to gain access to the federal sales force.

/s/Miles

News front: Convera files for dissolution

Convera, one of the companies offering 'vertical search' to help publishers and other content owners monetize their content, has filed to dissolve and liquidate the business. It was de-listed from NASDAQ Monday afternoon.

Convera, not unlike SearchButton.com which NIE spun off in 1998, was a hosted search company - now called 'site search' or 'search as a service'. It was a great idea, but when things imploded in 2001, Convera went after the market of monetizing content. That led to Convera becoming a victim of the same problems faced by newspapers and publishers around the world, whom it counted as its market: how do you sell content that is freely available from companies like Google, Yahoo, and Bing; and from blogs world-wide?

Google has a pretty darned reasonable site search service, by the way.

RIP.

/s/Miles

January 27, 2010

A new acquisition?

I don't like talking about rumors: they are often wrong to start with, and the deals are as delicate as eggshells until the deal is complete. And when you predict one, you look silly when you are wrong.

Given all that, let me be as vague as I can...:)

Key folks at two different companies we work with have told me in the last few days that a well known search company is going to be acquiring a smaller consulting firm with deep connections in the US federal government market. The holdup seems to be with the legal team at another search company which the consulting firm represents: apparently the second search company isn't wild about a major competitor being part of its partner program.

The funny part is that the company rumored to be the acquiring company may be more interested in the sales channel the consulting firm has, rather than its broad expert consulting group or its interesting new product line.

Stay tuned. When (if?) it breaks, all will become clear. And if it drops through, you'll hear it here. I promise.

/s/Miles

(Just in case you're wondering, none of these parties is New Idea Engineering...)

January 20, 2010

Google I/O Open for registration!

Google has announced its Google I/O 2010 to be held in San Francisco May 19-20 at the Moscone Center.

I think this is their third such annual event, and it's always been a full two days of information. The good news is the price is $400 per person (until April 15), a bargain really. The bad news? You'll need to bring four or five people from your company to hit all of the sessions in each track!

This conference is VERY technical, VERY good. You get the most from it if you are a developer, you know Java, Ajax, Python, or the other technologies Google uses in its various products. You won't find much in the way of marketing fluff here: in our experience, most presenters are Google developers.

The conference is being held the same week that the Gilbane content management conference comes back to San Francisco. Bad timing for them, but good for you: you can probably walk to the nearby Westin at lunch and maybe catch the exhibits.

Last year, attendees received a free phone for development purposes on the Android OpSys; who knows what they might give away this year - besides the expected cool T-shirt!

Register at http://code.google.com/events/io/2010/.

December 02, 2009

Deep Web Sponsoring a federated search challenge

Abe and Sol Lederman over at Deep Web Technologies have announced the second annual contest to discover the best federated search methodology out there. The objective, from their FederatedSearchBlog web site:

Tell us about the most impressive federated search application you’ve ever seen, or about one you’ve dreamed up. How innovative can federated search be? What unique problems can it solve?

The first ten serious entries get an Amazon gift certificate or $25 via PayPal; and the top prizes are $1000, $500, and $250 respectively. The winner will be a panelist at the April 2010 Computers in Libraries conference; and Deep Web will pick up the travel costs for the winner.

Federated Search is a hot topic, partly because nearly every organization wants to search content they may not have rights to index. Deep Web Technologies has some great examples of federated search and query time facets and clustering. Check out their web site, then write up a submission, win a few bucks, and speak at the Computers in Libraries conference next Spring! Do it now!

/s/Miles

November 23, 2009

Webinar: Basics of Search and Relevancy with Solr

Lucid Imagination, the Lucene and Solr folks, are running a webinar featuring Mark Bennett, CTO of New Idea Engineering. The presentation is scheduled for Wednesday, December 2nd at 2:00PM Eastern/11AM Pacific time (1900 GMT if my calculations are correct). Read more about the event and register today!

The description of the sessions follows:

In this introductory technical presentation, renowned search expert Mark Bennett, CTO of search consultancy New Idea Engineering, will present practical tips and examples that web application developers can use to quickly get productive with Solr, including:

  • Working with the "web command line" to control your search
  • Understanding Solr's DISMAX parser
  • Using Solr's Explain output to tune your results relevance
  • Using Solr's Schema browser

Sign up today and get ready for some relevance.

/s/Miles

KMWorld/ESS West moves to Washington DC 2010

Andrew McAfee of MIT and Harvard fame presented a great keynote last week at what may turn out to be the last ESS West. InfoToday has announced that KMWorld, and probably the Enterprise Search Summit, will take place in Washington, DC, next Fall rather than in San Jose. ESS East is traditionally held in New York in May. While held at a smaller venue - the midtown Hilton Hotel - ESS East has always had a stronger feel to it, and apparently InfoToday will be looking for growth in the government sector.

Ironically, InfoToday recently acquired the Boston Search Engine Meeting from Infonortics, so they'll be running two shows in the Spring (Boston in April and New York in May), leaving the west coast high and dry in terms of search conferences. Maybe the west coast companies are more comfortable in the 'do it yourself' search using Lucene and Solr; or maybe west coast companies just don't want to spend the time schmoozing at shows, when the real work gets done one-on-one.

In any case, look to Washington for KMWorld November 16-19 2010 at the Renaissance Washington DC Hotel. Should we call it ESS DC?

/s/Miles

November 09, 2009

SearchDev Dinner in San Jose at ESS West

We've just put the final touches on the annual SearchDev dinner in conjunction with the Enterprise Search Summit West next week in San Jose, California. Anyone who attends the conference, or anyone in the Bay Area, is welcome to attend.

Lucid Imagination is sponsoring the dinner this year along with New Idea Engineering. It will be held on Wednesday night, November 18, at 6:30 PM in the San Jose Hilton, adjacent to the convention center.

Seats are limited, so if you think you will want to attend, please RSVP today to info(at)ideaeng.com with your name and names of the folks who will join you. Of course, replace the (at) with @...

Miles