
June 06, 2009

Impressions of first Lucene/Solr SF Meetup

Kudos to Carl, our NIE Marketeer and de facto social director, for getting us to attend. It was well worth it, and conveniently coincided with Gilbane.

The Good:

  • VERY entertaining, very informative.  Lots of good info about upcoming versions of Lucene and Solr, including additional performance tweaks.
  • A friendly, supportive bunch of like-minded nerds, and I mean this in the best possible way.
  • Also discussions of other related Apache projects.  We're all gonna need a cheat sheet pretty soon to keep track of it all.
  • Lucene/Solr will soon have implemented many of the core features of Autonomy IDOL, Endeca, FAST, etc.  Those vendors really ought to be spying.  :-)

Personally I think Otis & co. might wanna fly out for the next one.  I also think Dieselpoint ought to attend and talk about Open Pipeline.  If we get up enough energy maybe we could even volunteer to do that next time; we're on the board after all, but this is really Chris's baby.

The Not-so-Good:

  • About 50 terms that clients would not understand.  Don't get me wrong, we love the Map/Reduce, Bayesian, K-Means, SVD stuff, but most corporate clients would be lost.
  • Not much for Enterprise Packaging.  Ironically it's the mundane aspects of search, from a non-developer standpoint, that are still not on the horizon.  Not a criticism of the developers; they have what they need.
  • Not much about Nutch.  Nutch 1.0 is out, along with rumors of a revised admin GUI, but not much coverage here.

Impressions of Lucid Imagination:

This event was sponsored by Lucid, a company that recently got funding for bringing commercial packaging and services to the open source search world, and their senior staff includes quite a few of the core committers.

  • A very sincere bunch of guys.
  • They haven't sold their souls to corporate America; I think their "geek cred" is still well intact.
  • Probably will not be filling in enterprise packaging potholes any time soon.
  • Do they understand the Enterprise Market?

Also a shout-out to LinkedIn and IBM for giving back to the open source community.

There was also an "open mic" segment, and I'd like to give a shout-out to Avi Rappaport - I agree 1,000%, "stop words bad!" (or at least the blind use of index-time stop words)


  • Not much of a threat to Google Appliance, due to packaging.  Yes, Google scales with their Map/Reduce and relevancy algorithms, and the open source guys have responded, but that's not the stuff that makes Google tick these days.
  • And despite the impressive and rapidly evolving core technologies, also not a real threat to the other Tier One vendors like FAST and Autonomy.  More on this seeming contradiction in a bit.
  • The Tier 2 vendors of the world, Attivio, Exalead, Dieselpoint, etc. DO need to pay attention.  There is a place for Tier 2 vendors, but they need to watch more carefully what the open source products do and do not provide.
  • It's really cool to see IBM willing to contribute so aggressively to the open source search engines, even though they sell several of their own.  A naive person might think they are competing with themselves, sabotaging their own sales guys, but they're a lot smarter than that.  They are not selling their commercial search products as pure search; those technologies are always part of a larger (and more expensive) grand business solution.  They know what they're doing!

For similar reasons, still not a huge threat to Autonomy, MS/FAST, Endeca, etc. on corporate services.  I said earlier that the Apache projects are implementing a lot of the "secret sauce" that launched Autonomy, Endeca, etc., so you'd think this represents "a clear and present danger", but Mike Lynch's secret algorithms are not why people buy IDOL anymore.  Things like giant reference accounts, professional services, and commercial grade spiders have a lot more to do with why big companies still pay six figures for search technology.

And speaking of surprises and Lucid Imagination, I wanna circle back to their PR a few months back when they got their funding and launched their company.  They talked about relevancy in their press releases!?  Wow... Yes, Lucene and Solr have some good traction there, but that specific competitive advantage has been used by almost every commercial search vendor in the past 15 years, including Verity, Autonomy and Google!

I would've expected them to say something like "we're gonna do for Lucene what Red Hat did for Linux" - this would have been a very clear business-oriented proposition, though to be fair lots of companies have used that business model as well.  It wouldn't be original, but it would be more business-centric.  Then again, I'm not in Marketing, and their VCs obviously liked their pitch, so what do I know!




A while ago I found an announcement about a company that actually used both (Google Appliance AND Lucene). Not only was there not much meat to the article, as far as I remember, but I have also lost the URL altogether. (And surprisingly, a Google search for it doesn't help.) I do wonder, though: does anybody know of an architecture pattern for using a combination of both? The approach of using a free platform with some commercial complements does sound appealing to me.

Could you give me specific ideas on what Lucid needs to do for the Enterprise market/packaging?

Probably a longer discussion than we can have in a comment. I think we've written about it before, here and in our newsletter (http://www.ideaeng.com/newsletter). High level answer: the big commercial engines come with 'enterprise packaging' - installers, user documentation, search analytics, consoles for the business user, etc. Lucene/Solr still come with 'developer packaging': download a zip file, read and follow questions on http://searchdev.org and other discussion boards, get a compiler, and make your own. When you finish the program, you're usually about 50% of the way to where you need to be to be successful in the enterprise market. They - and we - are working on it!

Great write-up, I seem to have missed it before.

The Meetup was heavily technical, I agree, but I think those are the people who saw the invitation, rather than a completely representative group of Lucene/Solr people. And some things have to be discussed in technical terms, like TrieRanges and Query Parser Frameworks -- though I'd be happy to explain any/all of them, if someone wants to pay me to do it!

Thanks Avi! It was good to see you there as well!

Agree about Autonomy: "Mike Lynch's secret algorithms are not why people buy IDOL anymore. Things like giant reference accounts, professional services, and commercial grade spiders have a lot more to do with why big companies still pay six figures for search technology." In fact, they haven't really been interested in search for about a decade, which is kinda sad. I hope the spider reference is to Ultraseek, because all those years of care and tweaking deserve to live on.

Avi Rappoport (please note spelling?)

Mark, sound observations about the balance between technical credibility and readiness for the enterprise market. As for the Meetup, I too observed more pizza/t-shirts than neckties -- a clear indicator of more geek, less sales call. Further discussions on deep "arcana," like updating the industry's approach to stopwords, are always welcome in Meetups.

But there is a trend here: just as they did with Linux, more and more enterprise customers are turning to open source as an alternative to proprietary packages. Enterprise support, whether Red Hat or Lucid Imagination, is a key success factor (heck, it's the business model we describe in our press releases). We're hearing from lots of customers who are realizing that the costs and constraints of closed source -- like paying per query or per document -- don't deliver enough advantage.

We'd sure be interested to hear whether you and your readers are seeing what we're seeing: in today's economy, the race will go to whoever solves the customer's problems faster, better, cheaper. Open Source Lucene/Solr is clearly on the short list.
