
5 posts from April 2009

April 23, 2009

SharePoint sure seems strong

By Miles Kehoe

Microsoft's acquisition of FAST has certainly got the attention of many other search technology companies. Based on promotions, ad campaigns, and new product announcements, it seems that most of the major FAST competitors are making the case for their support of SharePoint. That seems to indicate that either 1) they didn't think they could compete against Search Server, but can provide better search than FAST; or 2) the impending integration of FAST ESP with SharePoint has other companies worried about losing new and existing customers to Microsoft.

Nothing against Search Server and its free version, Search Server Express - but I think I'd have to go with option (2) above. Autonomy, Google and others have long had customers who used SharePoint, and they had 'pretty good solutions'. Now, a year after the acquisition was completed, both of these search giants and a number of smaller competitors are kicking off campaigns and launching newly improved products for integrating their products with SharePoint. And Autonomy, with their recently announced acquisition of Interwoven, even has their own CMS competitor to SharePoint.

In the end, the winner will be the SharePoint customer. FAST ESP is one of the top search platforms, and a tight integration will benefit SharePoint users. But sometimes companies have reason to use other vendors for search, and a tighter integration will benefit these same SharePoint users as well.

Let the games begin!

April 21, 2009

Web Search and Dates, again....

Miles likes to brag that our first argument with the Google founders goes back to 2000/2001, about dates, and he's blogged about date issues before.

I was reminded of this old argument today.  We had a power glitch in Cupertino a little while ago, so I went online to see if there was any news - after Twittering about it of course!

Google brought back a top result with a promising title... from 2004:

Whatever... the other search portals had the same or similarly old results, so I'm not gonna sit and bash Google.  In all fairness this is a tough problem to fix 100%, and all search engines have issues with bad dates.

AND, if no paper or blog talked about it, then there's nothing to "find" anyway.  Google can't find what isn't there.

Going to Twitter search, I found my own posting, and then some guy asking about driving times to Cupertino.

But freshness of content remains a problem.

I have a compromise suggestion for "the powers that be".  How about, when a spider makes a guess as to the proper date of a story, it also attaches a "confidence" to that guess?  For example, if the URL encodes a date, then I'd call that high confidence.  The same goes for a newspaper byline.  At the other end is a web server that gives the current date and time every time a page is fetched - clearly not connected to the content, so a lower confidence.  And then there'd be some default weight for the first time a spider encounters a piece of content.
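To make the idea a bit more concrete, here's a rough sketch in Python of how a spider might attach a confidence to its date guess. Everything in it is my own illustration, not how any real search engine does it: the names (DateGuess, guess_date), the URL pattern, and especially the confidence numbers are assumptions picked only to show the ordering described above. The 0.4 weight for a Last-Modified value that differs from the fetch time is also a guess on my part.

```python
import re
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional

# Hypothetical pattern: a /YYYY/MM/DD/ path segment, common in blog and news URLs.
URL_DATE = re.compile(r"/(\d{4})/(\d{1,2})/(\d{1,2})(?:/|$)")


@dataclass
class DateGuess:
    value: date
    confidence: float  # 0.0 (no idea) to 1.0 (certain); the scale is an assumption
    source: str


def guess_date(url: str,
               byline: Optional[date] = None,
               last_modified: Optional[datetime] = None,
               fetched_at: Optional[datetime] = None,
               first_seen: Optional[date] = None) -> Optional[DateGuess]:
    """Return the most confident date guess for a page, or None if there is no evidence."""
    candidates = []

    # A date encoded in the URL: high confidence.
    m = URL_DATE.search(url)
    if m:
        try:
            candidates.append(
                DateGuess(date(int(m[1]), int(m[2]), int(m[3])), 0.9, "url"))
        except ValueError:
            pass  # looked like a date but wasn't one, e.g. /2004/13/45/

    # A byline date parsed from the page: also high confidence.
    if byline is not None:
        candidates.append(DateGuess(byline, 0.8, "byline"))

    # A server timestamp that just echoes the fetch time tells us almost nothing.
    if last_modified is not None and fetched_at is not None:
        echoes_clock = abs((fetched_at - last_modified).total_seconds()) < 60
        conf = 0.1 if echoes_clock else 0.4
        candidates.append(DateGuess(last_modified.date(), conf, "last-modified"))

    # Default weight: the first time the spider saw this content.
    if first_seen is not None:
        candidates.append(DateGuess(first_seen, 0.3, "first-seen"))

    return max(candidates, key=lambda g: g.confidence, default=None)
```

A ranking function could then discount freshness by that confidence, so a page whose only "date" is the server clock wouldn't be treated as today's news.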

April 20, 2009

Attivio sponsoring SearchDev dinner at ESS NY

The Enterprise Search Summit in New York, one of the best of the enterprise search trade shows, is just three weeks away. We're happy to announce that Attivio is sponsoring the SearchDev.org dinner on Monday evening, May 11th.

As it was last year, the dinner will be at the Bice Restaurant at 7 E 54th Street, just a block from the Hilton. If you're involved in creating an enterprise search solution at your company and you plan on attending the show - or if you're just in town - feel free to attend. RSVP to searchdev@ideaeng.com to confirm your spot.

Attivio, which Gartner Group has called one of the 'cool vendors' in BI and performance management, has designed its search technology with a small footprint, incremental scalability, and real power to combine searches across structured and unstructured documents.

SearchDev.org is a technical and business discussion forum for people evaluating, selecting, and implementing enterprise search applications. It is managed by New Idea Engineering.

To attend the dinner, contact searchdev@ideaeng.com.

April 10, 2009

80Legs - The Mercenary Spider

By Carl Grimm, New Idea Engineering

80Legs, a Houston-based company that runs a spider-for-hire, launched its private beta yesterday.  80legs runs on a grid computing system provided by Plura Processing, a sister company.

80legs is slated to charge $2 per million pages crawled and $0.03 per CPU hour used for analysis. They will soon let you write custom code for page-specific processing, which runs by means of their pageProcess() function.  Simply upload a .jar file and their system will run your code on the retrieved data.
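For a feel of what those rates add up to, here is a back-of-the-envelope estimator in Python. It uses only the two announced prices; the function name and the assumption that billing is purely linear (no minimums, tiers, or other fees) are mine, not 80legs'.

```python
from decimal import Decimal

# Announced rates from the beta launch.
CRAWL_RATE_PER_MILLION = Decimal("2.00")  # dollars per million pages crawled
CPU_RATE_PER_HOUR = Decimal("0.03")       # dollars per CPU hour of analysis


def estimate_cost(pages: int, cpu_hours: int) -> Decimal:
    """Estimate a crawl bill in dollars, assuming simple linear pricing."""
    crawl = CRAWL_RATE_PER_MILLION * Decimal(pages) / Decimal(1_000_000)
    cpu = CPU_RATE_PER_HOUR * Decimal(cpu_hours)
    return (crawl + cpu).quantize(Decimal("0.01"))
```

Under those assumptions, crawling 5 million pages with 100 CPU hours of analysis comes to estimate_cost(5_000_000, 100), or $13.00.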

While this launch falls slightly in the shadow of Amazon’s recent announcement of the availability of Elastic MapReduce, it serves as another confirmation of the impact and looming changes grid and cloud computing will have on search, both Internet and enterprise.

In a world where the width of train tracks is the same as the wheel spacing on Imperial Roman war chariots, I cannot help but think how search technology, which some consider mature and others even consider dead, has grown up constrained by and shackled to the past.

I hope enterprise search vendors are taking note and rethinking their architecture and capabilities in the newfound land of massive computing power. I know that enterprise data security poses a daunting barrier to leveraging these capabilities. I am confident the tide will turn as sentiments change and new generations bring their attitudes into IT and the business world as a whole.

Learn more about 80legs at http://www.80legs.com/

April 06, 2009

Infonortics Search Engine Meeting in Boston April 27

Just a heads up that the Infonortics Search Engine Meeting - perhaps the grand-daddy of enterprise search conferences - will take place in Boston later this month, on April 27-28, 2009. There's also an additional Google tutorial by Steve Arnold on Sunday the 26th.

Mark Bennett and I will be speaking on search engine security. Among the speakers in addition to Steve Arnold are Bjorn Olstad of Microsoft (via FAST), Sid Probstein of Attivio, David Seuss of Northern Light, and Daniel Tunkelang of Endeca. A pretty powerful gathering of enterprise search experts!