January 27, 2010

A new acquisition?

I don't like talking about rumors: they are often wrong to start with, and deals are as delicate as eggshells until they are complete. And when you predict one and turn out to be wrong, you look silly.

Given all that, let me be as vague as I can... :)

Key folks at two different companies we work with have told me in the last few days that a well-known search company is going to be acquiring a smaller consulting firm with deep connections in the US federal government market. The holdup seems to be with the legal team at another search company which the consulting firm represents: apparently the second search company isn't wild about a major competitor being part of its partner program.

The funny part is that the company rumored to be the acquiring company may be more interested in the sales channel the consulting firm has, rather than its broad expert consulting group or its interesting new product line.

Stay tuned. When (if?) it breaks, all will become clear. And if it falls through, you'll hear it here. I promise.

/s/Miles

(Just in case you're wondering, none of these parties is New Idea Engineering...)

January 22, 2010

To Search

There is no time like the New Year to rethink everything and take a step back in an attempt to see the proverbial forest for the trees. Often this comes to me in the form of wondering where the words we use come from.

The verb "search" comes through the Old French cerchier from the Late Latin circare, meaning to "go about, wander, traverse," itself from the Latin circus, or circle - a very fitting description indeed. The term entered English in the early fourteenth century and would exactly describe the process of looking for something or someone. An individual actually had to go about and wander around looking for what they sought.

Contrast this with the expectations placed on search engines today. Users expect the engine to know immediately upon asking, often from a query of fewer than two words, exactly where the piece of content they are looking for is. If finding it does require a bit of wandering or traversing, the user seems to become frustrated immediately. The desire is that the document most relevant to them is returned in the top results every time. Very little wandering or going about is expected by the user.

In reality the user is not performing a search but instructing the search engine to do so - yet we say "I am going to search for X." We query, but the engine does all the "going about, wandering and traversing." This usage is very telling - the engine and the process of its searching have become an extension of the user. The expectation is that the engine, being a natural extension of the user, knows their every desire and what they consider important.

In light of this, we should not be surprised at users' constant complaints regarding their search experience. Yet the industry seems to keep churning out more and more algorithms that focus on natural language processing, semantic search and other content-focused approaches. Vendors seem to neglect, and purchasers of enterprise engines keep putting off deploying, any sort of relevance method that actually focuses on understanding the user; instead, purchasers fall for the newest vendor jargon year after year.

In this coming year I do not doubt we will see some very interesting technologies brought to market. They will undoubtedly allow us to find experts, tag results, star them and move them around, share them and socialize them - but will they seek to understand what is relevant to an individual searcher? Search profiles on an individual level do exist in some engines but usually remain fixed and static - ignoring context and behavior altogether.

I am putting in an early request - all I want for Christmas is my enterprise search engine to pay attention to me this year.

January 20, 2010

Google I/O Open for registration!

Google has announced that Google I/O 2010 will be held in San Francisco May 19-20 at the Moscone Center.

I think this is their third such annual event, and it's always been a full two days of information. The good news is the price is $400 per person (until April 15), a bargain really. The bad news? You'll need to bring four or five people from your company to hit all of the sessions in each track!

This conference is VERY technical, VERY good. You get the most from it if you are a developer and you know Java, Ajax, Python, or the other technologies Google uses in its various products. You won't find much in the way of marketing fluff here: in our experience, most presenters are Google developers.

The conference is being held the same week that the Gilbane content management conference comes back to San Francisco. Bad timing for them, but good for you: you can probably walk to the nearby Westin at lunch and maybe catch the exhibits.

Last year, attendees received a free phone for development purposes on the Android OpSys; who knows what they might give away this year - besides the expected cool T-shirt!

Register at http://code.google.com/events/io/2010/.

December 16, 2009

Google Quantum Search?

Google has recently announced the fruits of their research with D-Wave, a firm that claims they have built the first quantum computer. Hartmut Neven, Head of Google's Image Recognition team, announced they have been able to successfully sort 20,000 images into sets with and without automobiles faster than anything running in a Google data center currently.

The team adapted quantum adiabatic algorithms to the task and trained on a set of 20,000 human tagged images and video stills of street scenes with and without automobiles.

Perhaps we can look to the future of quantum computing to untangle the problem of relevance and high quality search.

Read more about the alleged Google Quantum Search

December 02, 2009

Deep Web Sponsoring a federated search challenge

Abe and Sol Lederman over at Deep Web Technologies have announced the second annual contest to discover the best federated search methodology out there. The objective, from their FederatedSearchBlog web site:

Tell us about the most impressive federated search application you’ve ever seen, or about one you’ve dreamed up. How innovative can federated search be? What unique problems can it solve?

The first ten serious entries get an Amazon gift certificate or $25 via PayPal, and the top prizes are $1000, $500, and $250 respectively. The winner will be a panelist at the April 2010 Computers in Libraries conference, and Deep Web will pick up the travel costs for the winner.

Federated Search is a hot topic, partly because nearly every organization wants to search content they may not have rights to index. Deep Web Technologies has some great examples of federated search and query time facets and clustering. Check out their web site, then write up a submission, win a few bucks, and speak at the Computers in Libraries conference next Spring! Do it now!

/s/Miles

November 23, 2009

Webinar: Basics of Search and Relevancy with Solr

Lucid Imagination, the Lucene and Solr folks, are running a webinar featuring Mark Bennett, CTO of New Idea Engineering. The presentation is scheduled for Wednesday, December 2nd at 2:00PM Eastern/11AM Pacific time (1900 GMT if my calculations are correct). Read more about the event and register today!

The description of the sessions follows:

In this introductory technical presentation, renowned search expert Mark Bennett, CTO of search consultancy New Idea Engineering, will present practical tips and examples that web application developers can use to quickly get productive with Solr, including:

  • Working with the "web command line" to control your search
  • Understanding Solr's DISMAX parser
  • Using Solr's Explain output to tune your results relevance
  • Using Solr's Schema browser

Sign up today and get ready for some relevance.
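
For readers who have not yet tried the "web command line," here is a minimal Python sketch of what a DISMAX request with Explain output might look like. The host, the /solr/select path, and the field names title and body (with a boost on title) are assumptions made up for illustration; only the parameter names q, defType, qf, fl, wt, and debugQuery come from Solr itself. This is not taken from the webinar material.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_dismax_url(base_url, user_query, query_fields, explain=True):
    """Build a Solr select URL that uses the DisMax parser.

    base_url and query_fields are hypothetical; adjust them to your own
    Solr install and schema.
    """
    params = {
        "q": user_query,                # the words the user typed
        "defType": "dismax",            # ask Solr to use the DisMax parser
        "qf": " ".join(query_fields),   # fields (and boosts) to search
        "fl": "*,score",                # return stored fields plus the score
        "wt": "json",                   # response format
    }
    if explain:
        # debugQuery adds the per-document score explanation to the response
        params["debugQuery"] = "true"
    return base_url + "?" + urlencode(params)


def decode_params(url):
    """Decode the query string again, handy for checking what gets sent."""
    return parse_qs(urlparse(url).query)


# Hypothetical local Solr instance and schema fields.
url = build_dismax_url(
    "http://localhost:8983/solr/select",
    "enterprise search",
    ["title^2.0", "body"],
)
print(url)
```

Pasting the printed URL into a browser pointed at your own Solr server is the quickest way to see how a change to qf moves documents up and down the result list.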

/s/Miles

KMWorld/ESS West moves to Washington DC 2010

Andrew McAfee of MIT and Harvard fame presented a great keynote last week at what may turn out to be the last ESS West. InfoToday has announced that KMWorld, and probably the Enterprise Search Summit, will take place in Washington, DC, next Fall rather than in San Jose. ESS East is traditionally held in New York in May. While held at a smaller venue - the midtown Hilton Hotel - ESS East has always had a stronger feel to it, and apparently InfoToday will be looking for growth in the government sector.

Ironically, InfoToday recently acquired the Boston Search Engine Meeting from Infonortics, so they'll be running two shows in the Spring (Boston in April and New York in May), which leaves the west coast high and dry in terms of search conferences. Maybe the west coast companies are more comfortable with 'do it yourself' search using Lucene and Solr; or maybe west coast companies just don't want to spend the time schmoozing at shows, when the real work gets done one-on-one.

In any case, look to Washington for KMWorld, November 16-19, 2010, at the Renaissance Washington DC Hotel. Should we call it ESS DC?

/s/Miles

November 18, 2009

SharePoint 2010 public beta now available

Microsoft has now released the beta of many (if not all) of the elements of SharePoint 2010 for testing. Don't be surprised that it's labeled 'Beta 2': this is the first public beta, although Beta 1 was available to select corporations and Microsoft MVP partners.

The release includes SharePoint, the 2010 version of Search Server, and most interestingly to us, the new release of FAST ESP in SharePoint. This latter beta includes what we believe is a full re-write of FAST ESP 5.3 to integrate it into SharePoint 2010, with a huge number of usability, management, and feature capabilities. Stay tuned for more on these differences and enhancements over the coming months.

Jie Li's GeekWorld site links to the Microsoft public download pages; but if your company is a Microsoft partner (MSPP) you'll find the downloads with other Microsoft code you can access.

You'll also find tips for getting everything working right on Jie Li's site as well as on Alex's SharePoint blog, where you'll find a pretty darned complete list of the 'gotchas' you should know about.

November 09, 2009

SearchDev Dinner in San Jose at ESS West

We've just put the final touches on the annual SearchDev dinner in conjunction with the Enterprise Search Summit West next week in San Jose, California. Anyone who attends the conference, or anyone in the Bay Area, is welcome to attend.

Lucid Imagination is sponsoring the dinner this year along with New Idea Engineering. It will be held on Wednesday night, November 18, at 6:30 PM in the San Jose Hilton, adjacent to the convention center.

Seats are limited, so if you think you will want to attend, please RSVP today to info(at)ideaeng.com with your name and names of the folks who will join you. Of course, replace the (at) with @...

Miles

November 06, 2009

Relevance by, for, and of the people...

Have you ever browsed a search result list, clicked on a result with a promising teaser, and been frustrated that the document didn't live up to its summary? Me too... you mutter 'this search sucks' to yourself, click the browser's Back button, and browse the result list again, hoping for a better result.

It seems the obsession with 'social search' has led a few of the best known search companies to tie click popularity back into the base relevance engine. Google recently announced Self-Learning Scorer as a new part of its latest Google Search Appliance update; and Microsoft announced similar interactive behavior ranking capability in both SharePoint and FAST ESP search - Behavioral Adaptation, one engineer called it.

Color us skeptical. We like the concept of click popularity, but we prefer to see it linked with a 'thumbs-up/thumbs-down' feedback mechanism. If people like the document they see, they won't bother telling you what a great job you did; but trust us, if it's not what they wanted, they will spend the extra few seconds to enter a negative vote. We've not been able to find out the details of the Google feature; Microsoft tells us that the recommendations have a 'time to live' of 30 days, so at least there's hope that crummy documents with great summaries won't fill the top spots of your search result lists.
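
To make the idea concrete, here is a rough Python sketch of the kind of scoring we have in mind: clicks count a little, explicit thumbs-up/thumbs-down votes count more, and anything older than 30 days is ignored. To be clear, this is our own illustration and not how Google's or Microsoft's features work; the event names, the weights, and the feedback_boost function are all assumptions. Only the 30-day 'time to live' figure comes from what Microsoft told us.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# 30 days is the 'time to live' Microsoft quoted; everything else is assumed.
FEEDBACK_TTL = timedelta(days=30)

# Hypothetical weights: an explicit vote counts three times as much as a click.
WEIGHTS = {"click": 1.0, "thumbs_up": 3.0, "thumbs_down": -3.0}


@dataclass
class FeedbackEvent:
    doc_id: str
    kind: str        # "click", "thumbs_up", or "thumbs_down"
    when: datetime


def feedback_boost(events, doc_id, now):
    """Sum the weighted feedback for one document, skipping expired events."""
    total = 0.0
    for event in events:
        if event.doc_id != doc_id:
            continue
        if now - event.when > FEEDBACK_TTL:
            continue  # older than the time to live, so it no longer counts
        total += WEIGHTS[event.kind]
    return total


now = datetime(2009, 11, 6)
events = [
    FeedbackEvent("doc-1", "click", now - timedelta(days=5)),
    FeedbackEvent("doc-1", "thumbs_down", now - timedelta(days=2)),
    FeedbackEvent("doc-1", "click", now - timedelta(days=40)),  # expired
    FeedbackEvent("doc-2", "thumbs_up", now - timedelta(days=1)),
]
print(feedback_boost(events, "doc-1", now))
print(feedback_boost(events, "doc-2", now))
```

In this sketch a single thumbs-down outweighs a recent click, so a document with a great summary and disappointing content drifts down rather than up. How such a boost would be blended with the engine's base relevance score is left open here.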

What do you think?