Lucene: It's coming from inside the firewall!
We've done a number of projects helping large clients with search roadmap planning, including an audit of their various data sources. This is often an early step in implementing an enterprise search solution that will integrate diverse content across multiple sources.
On a number of our recent projects, an interesting thing has happened. As we've spoken with content owners, we've found an increasing number of Lucene implementations that no one knew about. This has often been a surprise to the people who brought us in, usually corporate IT. Much as PCs infiltrated corporations in the early days, Lucene appears to be making its way into companies under the radar, often hacked in by a creative employee who just wants to get a simple search capability working, and who has neither the time for a formal selection process nor the budget to purchase a commercial solution.
As we've written before, Lucene/Solr is getting to be a pretty decent search solution, although it's still a bit rough around the edges. This can't be a good sign for companies that market premium-priced search products.
Consider:
- IBM offers the free IBM/Yahoo! search for up to 500,000 documents
- Microsoft offers free Search Server Express as well as a higher-capacity Search Server
- Google Site Search and Google Custom Search are free and low-cost hosted solutions that provide search for your site - or a group of sites - without costing much money
Finally, as Microsoft subsidiary FAST moves into the mid-range price sector with high-end capabilities, the price of enterprise search is dropping for many companies that might otherwise have had to sign six-figure deals for licensing alone. Add to this the free and low-cost supporting technologies - consider clustering engine Carrot^2, for example - and you've got a movement.
To paraphrase our long-time friend Deep Search, you can spend a bunch of money on a commercial search product, then spend much more implementing it well; or you can find a free or low-cost solution and spend a bunch of money implementing it. Your call.