54 posts categorized "Solr"

October 26, 2012

Deep Solr in London and New York

Last week I had the pleasure of conducting a workshop at the Enterprise Search Summit on open source tools including Solr, Lucene, and some of the commercial products based on these tools. To a lesser extent we also covered ElasticSearch, SearchBlox, Alcove9, and a few other platforms, as well as a number of open source and commercial tools that support enterprise search.

One thing many of the attendees had in common was that they had been experimenting with Lucene/Solr for a while, but many were skeptical that they were ready to dive into a deep project on their own.

While that sentiment is no problem for me - after all, we provide services around implementing both open source and commercial search to our customers - I know many companies want to have expertise in-house.

For those of you who are looking for those skills, you might be interested in a post I just saw from LucidWorks. They are offering a developer course titled 'Everything you always wanted to know about Solr' in both New York City and London, England during November. If you've been experimenting with Solr in-house (or on your own) for a while now, and you're ready to move to the next level, you might give some thought to registering for one of these classes.

It will cover the usual Solr topics, but also replication, sharding, and all the things you need to know to really use Solr in production search. Take a look and see if it's right for you.


July 06, 2012

Search appliance 'blues'

Over the US Independence Day holiday many of us learned that Google is dropping its entry-level search box, the Google Mini. This comes as part of 'summer cleaning', the Mini being dropped with a number of other services and products that are just not hot enough to support the effort. (The one I'll really miss? iGoogle.) Google hasn't provided much information on how successful the GSA 'Blue' has been, but with a price point between $3K US and $10K US I imagine they moved a bunch of them to customers with simple search requirements.

I think it may have been Steve Arnold who said recently that Google's public web search and its advertising sales account for something like 96% of the company's revenue, so I don't think too many Googlers are upset about losing a small slice of a small slice of revenue. Heck, Mini profits probably don't even pay the fuel bills for a weekend flight to Europe for the Google 767.

The impact? Well, back when the Mini was new, it was big news. Heck, the bigger models were even better for not too much more money. Still, enterprise search was an expensive proposition then. Lucene was pretty new and quite rough around the edges; FAST, Exalead and Endeca were selling for upwards of $250K, and needed at least that amount of money to actually get them to work. Google Site Search was there, but not many other enterprise search products were around for that price.

A funny thing happened in the new century. Now enterprise customers are more demanding about search. The GSA - even the larger models - is generally well-received at first. At least as long as the 'Powered by Google' icon is visible. We had one customer tell us that just licensing the Google icon would solve most of his user complaints. And Verity's Andy Feit proved it statistically a year or two later. (Have a look at our post last year 'It's not Google unless it says it's Google'.)

But over time, even when content and user query activity remain about the same, people become increasingly frustrated using the GSA. But will Google abandon the color yellow too? Steve Arnold has wondered on LinkedIn whether the larger Google appliances are going to see the same fate soon.

The problem isn't that it's an appliance. It's the closed system that people are turning away from. In the enterprise, you can't use the cool techniques that Google uses to generate psychic results on the internet. In the enterprise, managers know what content to boost. Metadata? Fielded search? Boost based on content? Not in the blue (or yellow) world.
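
For contrast, an open platform such as Solr exposes these controls as ordinary query parameters. The sketch below is only an illustration: it builds a request URL using Solr's `edismax` query parser, and the host, core name, and field names (`title`, `body`, `doc_type`, `department`) are hypothetical placeholders rather than anything from a real deployment.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; substitute your own host and core name.
SOLR_SELECT = "http://localhost:8983/solr/intranet/select"


def build_query_url(user_query):
    """Build a Solr select URL with fielded search and boosts (field names are assumed)."""
    params = {
        "defType": "edismax",       # use the extended DisMax query parser
        "q": user_query,            # the user's keywords
        "qf": "title^5 body",       # search both fields, weighting title matches higher
        "bq": "doc_type:policy^3",  # boost documents a manager wants promoted
        "fq": "department:hr",      # restrict results using a metadata field
        "wt": "json",
    }
    return SOLR_SELECT + "?" + urlencode(params)


url = build_query_url("travel expenses")
print(url)
```

Whether boosts like these help depends entirely on the quality of the metadata behind them, which is why they need someone who knows the content to manage them.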

Still, I think Google and the GSA provide pretty darned good value for a certain part of the market. If your data is pretty decent; if you're serving highly interlinked web and PDF content; if your data needs are not too demanding - GSA may be the solution you want. But before you spend money blindly, do what you do with any product you buy - verify it works in your environment. And as with any enterprise search platform, allocate a budget to run it properly after roll-out.

Yes, search has changed. Really good low-cost options are available. Where? Well, in addition to Google's site search offerings, there's Lucid Imagination's cloud and on-premises solutions; and some other darned good offerings based on open source: Flax - SearchBlox - and more.

What do you think? Is the loss of the Mini giving you the blues? 

 /s/Miles

(With thanks to Karan!)

May 10, 2012

Lucene Revolution: MS talks of being more open

At yesterday’s kickoff of Lucene Revolution 2012, Lucid CEO Paul Doscher introduced Gianugo Rabellino, Microsoft's Director of Open Source Communities. Gianugo said little about search per se, but he did confess to having been a fan of Lucene and Solr for a while now. In his talk, he told the audience that Microsoft has changed with respect to open source, and he went on to tell everyone how they have become more involved in open standards like HTML5 and CSS3, and in hardware specifications like USB. He went so far as to say 'Microsoft's survival depends on open source software'.

News to me, and perhaps to others in the room, was the extent to which Microsoft is supporting a number of open source products and languages. Gianugo reported that Linux is now a 'first-class guest operating system' on Microsoft Hyper-V, and that Microsoft provides support for PHP, Ruby on Rails, node.js and other projects on Azure (and presumably for 'on premises' systems).

A number of folks from large commercial organizations seemed to appreciate the news about Microsoft's shift towards supporting open source; but a number of the open-source folks in the room felt this offered little new, and some even felt it was an unrelated 'sales pitch'. Even though we are Microsoft partners, I'm glad to see more support for open source products like PHP and Linux.

The funniest part of the talk came as Gianugo was describing how SharePoint data was easily accessible to other non-Microsoft search platforms. An attendee asked if he felt there was a role for other platforms to be used as the primary engine for search in SharePoint; as he paused to craft a reply, Paul Doscher (loudly) pronounced his belief that there was, much to the pleasure of the crowd.

There was not much else in the way of Microsoft news; but it was interesting to see how many people and how much effort Microsoft is putting into open source projects.


April 30, 2012

Is Microsoft joining the Lucene/Solr dance?

Lucene Revolution is only 10 days away, and if you're not already planning on being in Boston, today's a great time to register.

Why be at the 3rd annual Lucene Revolution, Lucid Imagination's open source conference? Several reasons:

  • Open source search is hot, and Lucene/Solr is better than ever;
  • Lucid Imagination is just introducing their LucidWorks Enterprise 2.1 release;
  • Paul Doscher, recently of Exalead, is the new CEO and keynote speaker; and
  • Microsoft's Gianugo Rabellino is speaking about Lucene, Azure, and OSS.

Yes, you saw it here. A Microsoft Azure guy is speaking right after Paul Doscher Wednesday morning at Lucene Revolution. Has Microsoft caught the drift of the market towards Lucene/Solr in search, big data, and the cloud? Even search pundit Steve Arnold posted a few days back about Microsoft and Linux. Strange bedfellows perhaps, but there it is.

So yes, if you can find any way to get to Boston in a week, I'd say do it. See you there!

 

March 29, 2012

Lucid positioning for success in open source search

Lucid Imagination is the Redwood Shores company whose charter is to market advanced products based on the open source Lucene/Solr project. With a large number of the Apache project committers in its employ, they have the technical wherewithal to succeed, but they never really screamed 'business success' - until recently.

In December of last year, Lucid's board hired Paul Doscher as CEO, presumably to make Lucid's premier product, LucidWorks Enterprise, a success in the marketplace. He seems to have been a good choice: he was the guy at Exalead who built a first-class organization in the US, and who was as responsible as anyone for making Exalead an attractive acquisition last summer for Dassault.

Now, just a few months later, Paul has hired Mike Moody, formerly of Spigit, to be Lucid's EVP of Development. I had the opportunity last year to work with Mike at Spigit, an up-and-coming product in its own right, and I suspect having Mike on-board will have a positive impact on Lucid's products and services in the coming years.

It's tough to break into the commercial search market, but it seems to me that Lucid is serious about being a leader in the space - soon.


March 28, 2012

The importance of context in enterprise search

For years we have talked about the importance of context when it comes to enterprise search. We blogged about it as long ago as 2007, and we stressed that the context of the user, the content, and the query all need to be considered between the time the user clicks 'Search' and the time the search platform gets the extended query. As an example, we've used things like Google's special treatment of 12-digit numbers that match the algorithm for FedEx tracking numbers.

Now it appears that Google plans to expand its use of context, as reported in the Wall Street Journal and called out in blog postings from Avalon's Joe Hilger and Mashable's Lance Ulanoff. Google's Amit Singhal spoke of the shift from keywords to meaning, a change not only at Google but, over time, in the enterprise search platforms most companies use internally every day.

As we talk about in a recent webinar, 'Secrets your Search Vendor Won't Tell You', search platform vendors have always trailed user requirements; sometimes you just need to write your own custom code to create a search experience users are happy with. You often need to add your own pre-search processing code to analyze the user query and create an expanded query using the vendor-specific search operators; make the most of standard platform capabilities; and post-process the search result list in order to give your users a great, meaningful, helpful set of results and actions.
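
As a rough illustration of the pre-search and post-search steps, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the synonym table, the `OR` operator syntax, and the stand-in `fake_search` function are hypothetical and don't correspond to any particular vendor's API. The 12-digit check looks only at the shape of the query; it does not validate a real FedEx check digit.

```python
import re

# Hypothetical synonym table; a real deployment would load its own vocabulary.
SYNONYMS = {"pto": ["vacation", "time off"]}

# Shape check only: exactly twelve digits (no carrier checksum validation).
TRACKING_RE = re.compile(r"\d{12}")


def preprocess(query):
    """Analyze the raw user query and return (intent, expanded_query)."""
    q = query.strip()
    if TRACKING_RE.fullmatch(q):
        return "tracking_number", q
    terms = []
    for word in q.lower().split():
        alternatives = [word] + SYNONYMS.get(word, [])
        # Quote multi-word alternatives so they are treated as phrases.
        quoted = ['"%s"' % a if " " in a else a for a in alternatives]
        terms.append("(" + " OR ".join(quoted) + ")" if len(quoted) > 1 else quoted[0])
    return "keyword", " ".join(terms)


def postprocess(results, user_department):
    """Move results from the user's own department to the top; order is otherwise kept."""
    return sorted(results, key=lambda r: r["department"] != user_department)


def fake_search(expanded_query):
    """Stand-in for the real search platform call (returns canned results)."""
    return [
        {"title": "Engineering PTO calendar", "department": "engineering"},
        {"title": "Vacation policy", "department": "hr"},
    ]


intent, expanded = preprocess("PTO policy")
results = postprocess(fake_search(expanded), user_department="hr")
print(intent, expanded)
print([r["title"] for r in results])
```

The point of the structure is that the user's context (here, just a department) and the query's context (does it look like a tracking number?) are handled in your own code, so the search platform itself only ever sees the expanded query.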

At ESS New York in May, we're doing a pre-conference workshop that will take a deep dive into this process. We'll talk about how you can do this extended processing, with representative examples of how to implement this type of contextual enhancement in several popular search platforms. If you're going to be in New York anyway, come to the workshop!

/s/Miles


March 06, 2012

Solr essentials Training: Lucid Imagination

Our friends over at Lucid Imagination are running a 5-hour instructor-led training class on Tuesday March 13 that looks like a good deal for anyone looking at Solr as a potential enterprise search platform. You'll see how to get Solr installed and how to index content and search. The class will also cover faceted search, relevance tuning and query analysis.

Note this class covers Solr, and does not dive into LucidWorks Enterprise, the enterprise-packaged version of Solr that you can license from Lucid Imagination. Nonetheless, if you have wondered whether open source can replace your existing search infrastructure, you should attend this class. Even though the class is on-line, space is limited so get registered today!


January 11, 2012

Webinar: What users want from enterprise search in 2012

If you ask the average enterprise user what he or she wants from their internal search platform, chances are good that they will tell you they want search 'just like Google'. After all, people are born with the ability to use Google; why should they need to learn how to use their internal search?

The problem is that web search works so well because, at the sheer scale of the internet, search can take advantage of methodologies that are not directly applicable to the intranet. Yet many of the things that make the public web experience so good can, in fact, be adapted in the enterprise. Our opinion is that, beyond a base level, the success of any enterprise search platform depends on how it is implemented and managed rather than on the core technology.

In this webinar we'll talk about what users want, and how you can address the specific challenges of enterprise content and still deliver a satisfying and successful enterprise search experience inside the firewall.

Register today for our first webinar of the new year, scheduled for January 25: What enterprise users want from search in 2012.


January 10, 2012

ISYS filters to be used for SAP Platforms

ISYS announced today that SAP has selected the popular ISYS Document Filters to replace software from both Autonomy and Oracle in their popular suite of analytical products.

ISYS, which has marketed an enterprise search product successfully for years, recognized the need for high-capability, low-cost document filters, and packaged their internally developed technology. Because of their capabilities, support and price, ISYS Document Filters have become the best choice for companies that need to extract content from hundreds of different formats.

We particularly like that the ISYS filters are lightweight, easy to implement, and priced such that any company can afford to use them in-house or bundled with a product. Large companies that use Lucene/Solr for search but insist on having supported, up-to-date filtering technology can solve the problem at a competitive price with ISYS.


November 30, 2011

Odd Google Translate Encoding issue with Japanese

Was translating a comment in the Japanese SEN tokenization library.

It seems like if your text includes the Unicode right arrow character, Google somehow gets confused about the encoding. Saw this on both Firefox and Safari. Not a big deal, but strangely comforting to see even the big guys trip up on character encodings.

OK: サセン
OK: チャセ
Not OK: サセンチャセ?
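
What actually goes wrong inside Google Translate isn't known here. A common way this kind of confusion arises in general, though, is that text gets encoded with one character encoding and decoded with another. The Python sketch below only illustrates that general failure mode; the UTF-8 / Latin-1 mismatch is chosen purely for demonstration, and placing the arrow between the two words is an assumption, not something the samples above confirm.

```python
# Katakana with U+2192 RIGHTWARDS ARROW between the words (placement is assumed).
text = "サセン\u2192チャセ"

utf8_bytes = text.encode("utf-8")

# Decoding with the same encoding round-trips cleanly.
good = utf8_bytes.decode("utf-8")

# Decoding the same bytes as Latin-1 never raises, but yields mojibake:
# every byte becomes its own (wrong) character.
bad = utf8_bytes.decode("latin-1")

print(good == text)
print(len(text), len(utf8_bytes), len(bad))
```

Each of these characters, the arrow included, takes three bytes in UTF-8, so a decoder that guesses the wrong encoding has plenty of opportunity to go astray.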
