51 posts categorized "Solr"

April 30, 2012

Is Microsoft joining the Lucene/Solr dance?

Lucene Revolution is only 10 days away, and if you're not already planning on being in Boston, today's a great time to register.

Why be at the 3rd annual Lucene Revolution, Lucid Imagination's open source conference? Several reasons:

  • Open source search is hot, and Lucene/Solr is better than ever;
  • Lucid Imagination is just introducing their LucidWorks Enterprise 2.1 release;
  • Paul Doscher, recently of Exalead, is the new CEO and keynote speaker; and
  • Microsoft's Gianugo Rabellino is speaking about Lucene, Azure, and OSS.

Yes, you saw it here. A Microsoft Azure guy is speaking right after Paul Doscher Wednesday morning at Lucene Revolution. Has Microsoft caught the drift of the market towards Lucene/Solr in search, big data, and the cloud? Even search pundit Stephen Arnold posted a few days back about Microsoft and Linux. Strange bedfellows perhaps, but there it is.

So yes, if you can find any way to get to Boston in a week, I'd say do it. See you there!

 

March 29, 2012

Lucid positioning for success in open source search

Lucid Imagination is the Redwood Shores company whose charter is to market advanced products based on the open source Lucene/Solr project. With a large number of the Apache project committers in its employ, the company has the technical wherewithal to succeed, but it never really screamed 'business success' - until recently.

In December of last year, Lucid's board hired Paul Doscher as CEO, presumably to make Lucid's premier product, LucidWorks Enterprise, a success in the marketplace. He seems to have been a good choice: he was the guy at Exalead who built a first-class organization in the US, and who was as responsible as anyone for making Exalead an attractive acquisition last summer for Dassault.

Now, just a few months later, Paul has hired Mike Moody, formerly of Spigit, to be Lucid's EVP of Development. I had the opportunity last year to work with Mike at Spigit, an up-and-coming product in its own right, and I suspect having Mike on board will have a positive impact on Lucid's products and services in the coming years.

It's tough to break into the commercial search market, but it seems to me that Lucid is serious about being a leader in the space - soon.

 

 

March 28, 2012

The importance of context in enterprise search

For years we have talked about the importance of context when it comes to enterprise search. We blogged about it as long ago as 2007, and we stressed that the context of the user, the content, and the query all need to be considered between the time the user clicks 'Search' and the time the search platform gets the extended query. As an example, we've used things like Google's special treatment of 12-digit numbers that match the algorithm for FedEx tracking numbers.

Now it appears that Google has started plans to expand their use of context as published in the Wall Street Journal and called out in blog postings from Avalon's Joe Hilger and Mashable's Lance Ulanoff. Google's Amit Singhal spoke of the shift from keywords to meaning, a change not only at Google but, over time, in the enterprise search platforms most companies use internally every day.

As we discussed in a recent webinar, 'Secrets your Search Vendor Won't Tell You', search platform vendors have always trailed user requirements; sometimes you just need to write your own custom code to create a search experience users are happy with. You often need to add your own pre-search processing code to analyze the user query and create an expanded query using the vendor-specific search operators; make the most of standard platform capabilities; and post-process the search result list in order to give your users a great, meaningful, helpful set of results and actions.
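The pre-search and post-search steps described above can be sketched roughly as follows. This is only an illustration, not any vendor's API: the `User` and `ExpandedQuery` types, the `department` field, and the "any bare 12-digit number is a tracking number" rule are all hypothetical simplifications (a real check would also validate the carrier's check digit, and the expanded query would be rendered in your platform's own operator syntax).

```python
import re
from dataclasses import dataclass, field

# Hypothetical rule: a bare 12-digit number *might* be a tracking number.
# A production version would also verify the carrier's check-digit algorithm.
TRACKING_RE = re.compile(r"\d{12}")


@dataclass
class User:
    department: str = ""  # hypothetical piece of user context


@dataclass
class ExpandedQuery:
    text: str
    intent: str                                 # "tracking_number" or "keyword"
    boosts: dict = field(default_factory=dict)  # context hints for the platform


def preprocess(query: str, user: User) -> ExpandedQuery:
    """Pre-search step: inspect the raw query and attach context."""
    text = " ".join(query.split())  # normalize whitespace
    if TRACKING_RE.fullmatch(text):
        return ExpandedQuery(text=text, intent="tracking_number")
    boosts = {"department": user.department} if user.department else {}
    return ExpandedQuery(text=text, intent="keyword", boosts=boosts)


def postprocess(results: list, expanded: ExpandedQuery) -> list:
    """Post-search step: move results from the user's department to the front."""
    dept = expanded.boosts.get("department")
    if not dept:
        return list(results)
    # sorted() is stable, so the platform's ranking is kept within each group.
    return sorted(results, key=lambda r: r.get("department") != dept)
```

For example, `preprocess("vacation   policy", User(department="legal"))` yields a keyword query carrying a department hint, which `postprocess` then uses to reorder whatever result list the search platform returns.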

At ESS New York in May, we're doing a pre-conference workshop that will take a deep dive into this process. We'll talk about how you can do this extended processing, with representative examples of how to implement this type of contextual enhancement in several popular search platforms. If you're going to be in New York anyway, come to the workshop!

s/Miles

 

 

March 06, 2012

Solr essentials Training: Lucid Imagination

Our friends over at Lucid Imagination are running a 5-hour instructor-led training class on Tuesday, March 13 that looks like a good deal for anyone looking at Solr as a potential enterprise search platform. You'll see how to get Solr installed and how to index content and search. The class will also cover faceted search, relevance tuning and query analysis.

Note that this class covers Solr, and does not dive into LucidWorks Enterprise, the enterprise-packaged version of Solr that you can license from Lucid Imagination. Nonetheless, if you have wondered whether open source can replace your existing search infrastructure, you should attend this class. Even though the class is online, space is limited, so get registered today!

 

 

 

January 11, 2012

Webinar: What users want from enterprise search in 2012

If you ask the average enterprise user what he or she wants from their internal search platform, chances are good that they will tell you they want search 'just like Google'. After all, people are born with the ability to use Google; why should they need to learn how to use their internal search?

The problem is that web search works so well because, at the sheer scale of the internet, search can take advantage of methodologies that are not directly applicable to the intranet. Yet many of the things that make the public web experience so good can, in fact, be adapted in the enterprise. Our opinion is that, beyond a base level, the success of any enterprise search platform depends on how it is implemented and managed rather than on the core technology.

In this webinar we'll talk about what users want, and how you can address the specific challenges of enterprise content and still deliver a satisfying and successful enterprise search experience inside the firewall.

Register today for our first webinar of the new year, scheduled for January 25: What enterprise users want from search in 2012.

 

 

 

 

 

 

January 10, 2012

ISYS filters to be used for SAP Platforms

ISYS announced today that SAP has selected the popular ISYS Document Filters to replace software from both Autonomy and Oracle in their popular suite of analytical products.

ISYS, which has marketed an enterprise search product successfully for years, recognized the need for high-capability, low-cost document filters, and packaged its internally developed technology. Because of their capabilities, support and price, ISYS Document Filters have become the best choice for companies that need to extract content from hundreds of different formats.

We particularly like that the ISYS filters are lightweight, easy to implement, and priced such that any company can afford to use them in-house or bundled with a product. Large companies that use Lucene/Solr for search but insist on having supported, up-to-date filtering technology can solve the problem at a competitive price with ISYS.

 

 

November 30, 2011

Odd Google Translate Encoding issue with Japanese

Was translating a comment in the Japanese SEN tokenization library.

It seems like if your text includes the Unicode right arrow character, Google somehow gets confused about the encoding.  Saw this on both Firefox and Safari.  Not a big deal, strangely comforting to see even the big guys trip up on character encodings.

OK: サセン
OK: チャセ
Not OK: サセンチャセ?

[Image: Google Translate encoding]

November 29, 2011

10 Handy Things to Know about the Lucene / Solr Source Code

It's funny how certain facts are "obvious" to some folks, stuff they've known a long time, but come as a pleasant surprise to others.  Chances are you know at least half of these, but no harm in double checking!

  1. Although Lucene and Solr are available in binary form, most serious users are eventually going to need some custom code.  If you post questions on the mailing lists, I think the assumption is you're comfortable with compilers, source code control and patches.  So it's a good habit to get into early on.
  2. Lucene and Solr source code were combined a while back (circa March 2010), so it's now one convenient checkout.
  3. You'll want to be using the Java 6 JDK to work with recent versions of Lucene / Solr.
  4. Lucene/Solr use the ant build tool by default.  But did you know that the build file can also generate project files for Eclipse, IntelliJ and Maven?  So you can use your favorite tool.  (See the README.txt file for info and links.)
  5. Lucene/Solr use the Subversion / SVN source code control system.  There are clients for Windows and plugins for Eclipse and IntelliJ. (Mac OS X has it built in)
  6. You're allowed to do read-only checkout without needing any sort of login - checkouts are completely open to the public.  This is news to folks who've used older or more secure systems.
  7. Although checking any changes back in would require a login, it's more common to post patches to the bug tracking system or mailing list, and then let the main committers review and check in the patch.  Even the read-only checkouts create enough information on your machine to generate patches from your local changes.
  8. Doing a checkout, either public or with a login, does not "lock" anything.  This is also a surprise to folks used to older systems.  This non-locking checkout is why anonymous users can be allowed to check out code - there's no need to coordinate checkouts.
  9. The read-only current source for the combined Lucene + Solr is at http://svn.apache.org/repos/asf/lucene/dev/trunk  Even though it's an http link, and can be browsed with a web browser, it's also a valid Subversion URL.
  10. The "contribute" wiki pages for Lucene and Solr have more info about the source code and patch process.
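
As a rough sketch of items 6 through 9, here is one way to script the anonymous checkout and patch generation. It assumes the standard `svn` command-line client is on your PATH; the directory name `lucene-trunk` and the patch file name are made up for illustration. Run as a script, it only prints the commands rather than executing them.

```python
import subprocess

# Trunk URL as given in item 9 above.
TRUNK_URL = "http://svn.apache.org/repos/asf/lucene/dev/trunk"


def checkout_command(target_dir="lucene-trunk"):
    """Anonymous, read-only checkout: no login required, nothing gets locked."""
    # "lucene-trunk" is a hypothetical local directory name.
    return ["svn", "checkout", TRUNK_URL, target_dir]


def diff_command():
    """'svn diff' prints local modifications in patch (unified diff) form."""
    return ["svn", "diff"]


def make_patch(working_copy, patch_path):
    """Write the working copy's local changes to a patch file."""
    with open(patch_path, "w") as out:
        subprocess.run(diff_command(), cwd=working_copy, stdout=out, check=True)


if __name__ == "__main__":
    # Dry run: show the equivalent shell commands.
    print(" ".join(checkout_command()))
    print(" ".join(diff_command()) + " > my-change.patch")
```

The resulting patch file is what you would attach to the bug tracking system or post to the mailing list for a committer to review.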

November 28, 2011

Solr Disk and Memory Size Estimator (Excel worksheet)

If you do a standard checkout of the Lucene/Solr codebase you also get a dev-tools directory.  One interesting tidbit in there is an Excel spreadsheet for estimating the RAM and disk requirements for a given set of data.  Be sure to notice the tabs along the bottom: tab 2 is for memory/RAM estimates, and tab 3 is for disk space.

Full URL: http://svn.apache.org/repos/asf/lucene/dev/trunk/dev-tools/size-estimator-lucene-solr.xls

November 08, 2011

Are you spending too much on enterprise search?

If your organization uses enterprise search, or if you are in the market for a new search platform, you may want to attend our webinar next week, "Are you spending too much for search?" The one-hour session will address:

  • What do users expect?
  • Why not just use Google?
  • How much search do you need?
  • Is an RFI a waste of time?   

Date: Wednesday, November 16, 2011

Time: 11AM Pacific Standard Time / 1900 UTC

Register today!