
11 posts from March 2012

March 29, 2012

Recommind looking to Predictive Coding to improve eDiscovery search

Recommind, the search vendor best known for its focus on eDiscovery, has recently blogged about predictive coding and its use of the technology in its product line. What, you may ask, is predictive coding? Recommind's blog post does a good job of providing background and some definition.

Craig Carpenter, the author of the post, makes it clear that predictive coding is an assist for human reviewers; it is technology that is useful in conjunction with people and workflow. It assists the process of identifying critical concepts and using the information - in this case, in a discovery matter.

Autonomy has blasted predictive coding as inferior to its 'meaning based coding'. Autonomy has advertised the strength of its 'meaning based search' for a while now, so we can't say whether meaning based coding is something new for them or a repackaging of the existing IDOL technology.

We can say that eDiscovery is a growing field, and we've seen Recommind and hosted eDiscovery vendors like Catalyst (based on Mark Logic) make some big inroads. To have an eDiscovery system up and running in days rather than months is a key advantage, and we think these two are among the strong contenders in the market.

How long before predictive coding finds its way into your search platform's marketing material? Give us a few months and we'll tell you.

Do you see an advantage or disadvantage for meaning based coding over predictive coding?  Let us know.




Lucid positioning for success in open source search

Lucid Imagination is the Redwood Shores company whose charter is to market advanced products based on the open source Lucene/Solr project. With a large number of the Apache project committers in its employ, it has the technical wherewithal to succeed, but it never really screamed 'business success' - until recently.

In December of last year, Lucid's board hired Paul Doscher as CEO, presumably to make Lucid's premier product, LucidWorks Enterprise, a success in the marketplace. He seems to have been a good choice: he was the guy at Exalead who built a first-class organization in the US, and who was as responsible as anyone for making Exalead an attractive acquisition for Dassault in 2010.

Now, just a few months later, Paul has hired Mike Moody, formerly of Spigit, to be Lucid's EVP of Development. I had the opportunity last year to work with Mike at Spigit, an up-and-coming product in its own right, and I suspect having Mike on board will have a positive impact on Lucid's products and services in the coming years.

It's tough to break into the commercial search market, but it seems to me that Lucid is serious about being a leader in the space - soon.



March 28, 2012

SharePoint 2010 Search Center Improvement

Our friends over at Arcovis are hosting a SharePoint Shoptalk webinar Thursday 3/29 that SP search admins and business line managers should attend. 

"5 Little things you can do to make a big impact on your SharePoint Search Center" will address steps that you can take to improve search in SharePoint. Paul Olenick is the speaker, and he certainly knows his stuff.

Register here.

The importance of context in enterprise search

For years we have talked about the importance of context when it comes to enterprise search. We blogged about it as long ago as 2007, and we stressed that the context of the user, the content, and the query all need to be considered between the time the user clicks 'Search' and the time the search platform gets the extended query. As an example, we've used things like Google's special treatment of 12-digit numbers that match the algorithm for FedEx tracking numbers.

Now it appears that Google plans to expand its use of context, as reported in the Wall Street Journal and called out in blog postings from Avalon's Joe Hilger and Mashable's Lance Ulanoff. Google's Amit Singhal spoke of the shift from keywords to meaning, a change not only at Google but, over time, in the enterprise search platforms most companies use internally every day.

As we discussed in a recent webinar, 'Secrets your Search Vendor Won't Tell You', search platform vendors have always trailed user requirements; sometimes you just need to write your own custom code to create a search experience users are happy with. You often need to add your own pre-search processing code to analyze the user query and create an expanded query using the vendor-specific search operators; make the most of standard platform capabilities; and post-process the search result list in order to give your users a great, meaningful, helpful set of results and actions.
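To make the idea concrete, here is a minimal sketch of that pre-search and post-search processing in Python. Everything in it is an assumption for illustration: the function names, the `dept:` boost operator, and the `Result` fields are invented and do not belong to any particular search platform. The tracking-number check only looks for a bare 12-digit number; it does not implement FedEx's actual validation algorithm.

```python
import re
from dataclasses import dataclass

# Assumption: any bare 12-digit query is treated as a possible tracking number.
# A real check would also validate the number against the carrier's algorithm.
TRACKING_NUMBER = re.compile(r"\d{12}")


@dataclass
class Result:
    title: str
    score: float
    department: str


def expand_query(raw_query: str, user_department: str) -> dict:
    """Pre-search step: use query and user context to build an extended query."""
    query = raw_query.strip()
    if TRACKING_NUMBER.fullmatch(query):
        # Query context: the shape of the query suggests a lookup, not a text search.
        return {"type": "tracking_lookup", "number": query}
    # User context: "dept:" is a hypothetical vendor-specific boost operator.
    return {"type": "search", "text": query, "boost": f"dept:{user_department}"}


def post_process(results: list, user_department: str) -> list:
    """Post-search step: list the user's own department first, then by score."""
    return sorted(results, key=lambda r: (r.department != user_department, -r.score))
```

In practice the extended query would be translated into whatever operators your platform supports, and the post-processing step would be where you add actions, such as a direct link to a package-tracking page, alongside the result list.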

At ESS New York in May, we're doing a pre-conference workshop that will take a deep dive into this process. We'll talk about how you can do this extended processing in several popular search platforms, with representative examples of how to implement this type of contextual enhancement in each. If you're going to be in New York anyway, come to the workshop!




March 23, 2012

Webinar: Is bad metadata costing you money?

We've planned a webinar that will help identify whether you have a metadata problem, what you can do to fix it, and how to justify the cost.

Despite what some vendors claim, enterprise search platforms rely on good metadata in order to deliver quality results. Yet few organizations have the resources to attack their metadata problems, so findability suffers and users lament "Why don't we use Google?"  Search managers know that even the Google Search Appliance, without quality metadata, can't deliver the internet search experience end users know, love, and trust. Yet it's hard to justify the time and effort to improve metadata in hopes of a better search experience.

In this webinar we will consider the issue of bad metadata, ways to address the problem, and some ideas on what the ROI can be. We will discuss:

  • Do you have a metadata problem?
  • How much is it costing you?
  • What is the risk of bad metadata?
  • What tools are available?
  • What will it cost to fix?  
  • What's the ROI of improved metadata?


We're hosting the webinar twice: on Wednesday, April 11th at 11AM Pacific time (GMT-7), and again on Thursday, April 12th at 8:30AM Pacific time. Click the link for the session you'd like to attend.

See you then!

Making sense of the Lexmark/ISYS acquisition

The dust is settling from the news announced last Monday that Lexmark was acquiring independent search and tools company ISYS. Just a couple of weeks before, HP's Mike Lynch, founder of recently acquired Autonomy, speculated to the press that HP may produce an Autonomy Search Appliance, and furthermore that HP might integrate Autonomy IDOL with its high-end printers. That seemed funny at the time; does the world really need printers that can search? Risk management makes for strange bedfellows, I suppose.

But now, barely two weeks later, Perceptive Software (a 'Lexmark Company') acquires ISYS. One wonders if Mike was aware of those negotiations.

Stranger still: Perceptive also announced it had acquired Massachusetts-based Nolij, a leading provider of document imaging and workflow solutions. While Nolij's focus has been on the education marketplace, presumably there is some synergy between document imaging, scanning, search - and hard-copy output devices. And you'd think that growing a market among enterprises is a possibility for a well-respected imaging company formerly focused only on higher education markets.

So imagine a system that can scan and OCR textual content and full-text index the content into your enterprise search platform. Seamless magic. Do you think anyone would want that?

I think we'll see more and more of this kind of software/hardware integration, and not just 'because we can'. Consider this: software from New Zealand-based Pingar performs entity recognition and extraction, and is now embedded in high-speed scanners to eliminate the need to manually enter data from standardized forms like insurance warranty claims. Disclosure: here at New Idea Engineering, we were so impressed with the value Pingar brings to enterprise search that we signed on as their first US-based partner.

Can you imagine a scanner that can scan, OCR, extract entities, and create a search index in real time as the high-speed scanner reads your forms? Federate that into your existing enterprise search platform with appropriate security, and some big problems for big companies are solved pretty much automatically. Seamless magic indeed!

What do you think? Would you find such a device useful at your organization?





March 22, 2012

The sorry state of FAST training

Suppose your company uses SharePoint 2010. Suppose your company uses FAST Search for SharePoint as well. Where would you go for training?

Today I decided to see where we could get training for one of our new guys. My first stop was the old FAST University, which now directs you to a page titled 'Microsoft Learning'.


There, you can read about all of the classes, except that when you click Register, you find a page with class schedules for January and February of... 2012. None after that. If you log in, you find a few classes for people with advanced Microsoft certifications, but nothing helpful.

I called the number where it suggested I could contact FAST University, and a very nice person directed me to the Microsoft Learning site. (Note: the title on the page I was looking at said 'Microsoft Learning', but apparently that is really the old FAST site at mzinga.)

On the real Microsoft Learning portal - microsoft.com/learning - I did a search for 'fast search' only to find:


I went back to search for simply 'search' and did find a single class with FAST Search in the title - sadly, one offered in August of 2011, more than 6 months ago.

When I called back the nice person on the other Microsoft Learning page at mzinga, she told me there are no more FAST classes (meaning FAST ESP, I guess), and that for FAST Search for SharePoint classes, I need to find a partner. Go back to the (real) Microsoft Learning site and search for 'class locator' to find a partner. Use care: if you click on the class description, you'll see the date first published, not the date of any classes; to get that, you need to click on 'instructor led'.

So yes, there were three training providers here in Silicon Valley. And one claims to have classroom training for next week! But when I called them (at 1:15PM), an answering service picked up and told me that because I called 'outside of normal business hours' he'd have to have them call me back. Nothing yet, the day after. Maybe an early Easter holiday?

Second training partner: went to their web site - no phone number, but a 'Live Chat'. "Sorry, no operators are available; you'll be connected to the next one free". An hour later, nada. I left.

Final partner - ONLC - picked up the phone and confirmed that they teach the class, but they will need a few more students to register before they can confirm it will happen. If it does happen, they teach it remotely. I can go to their center in San Jose, or even take it from here in our offices. Cool. But they can't find four students in the US to take it?

Kudos to ONLC; but it's a shame to see how far down the line training is for Microsoft with respect to FAST Search for SharePoint. Luckily, there are a number of good former FAST ESP partners - including us - who can help you with what you need, be it training, remote support, or even appdev.

What do you do for Microsoft search training?






March 19, 2012

ISYS Acquired by Lexmark

Independent search and tools provider ISYS announced today that it has been acquired by Lexmark, best known before today as a supplier of printers and hardcopy technology.

While ISYS search has been extremely successful in a number of verticals including government agencies and law enforcement, it's perhaps best known as being the sole independent supplier of commercial quality filters for applications that need to extract textual content from proprietary document formats. The recent re-write of those filters has made ISYS filters among the fastest on the market, and great for cross-platform environments.

A few weeks back, Autonomy founder and HP Vice President Mike Lynch talked about including IDOL in HP's printer family; it looks like there may be some competition there!



March 14, 2012

Vannevar Bush: His future, our present

In the last few weeks I've been reading 'Your Life, Uploaded' by Gordon Bell and Jim Gemmell. The book talks about 'MyLifeBits', one of the Microsoft Research projects. In it, the authors talk about the process of capturing your life digitally. It sounds a bit compulsive to me, but you never know: I used to think texting was something only kids did, and yet I now use text messages to communicate with other folks in our company nearly every day.

Anyway, the book mentioned Vannevar Bush, one of the little-known pioneers of the internet, who was born in 1890 and passed away in 1974. It might be a push, but it seems to me he was, to the internet, what Frederick Terman (one of Vannevar's students at MIT) was to Silicon Valley. He led a fascinating life, serving in the National Research Council in World War I, founding Raytheon, and serving as Director of the Office of Scientific Research and Development during World War II. In all, an amazing life.

What I find most interesting is an article he published in the Atlantic Monthly in 1945 before the end of the war. In it, he writes about a device he believes will help scientists and researchers index and cross-index their work; he called it a Memex. If you read the article, you'll see he is talking about the technology of today: the internet, highly capable indexing and retrieval technologies, and amazing storage capabilities available to the common man.

It turns out that the Memex was one of the inspirations for Gordon Bell in attempting to digitize his life. Like text messages, it may be new and on the fringe, but it sounds a lot like Apple's Knowledge Navigator from the late 1980s, cloud computing, and some entirely new things rolled into one. Just in time, too, since my memory seems to be less effective every year. Another good reason to write a blog!

What do you think? Was it a lucky guess, or was Bush a true visionary?


March 06, 2012

Solr Essentials Training: Lucid Imagination

Our friends over at Lucid Imagination are running a 5-hour instructor-led training class on Tuesday, March 13 that looks like a good deal for anyone looking at Solr as a potential enterprise search platform. You'll see how to get Solr installed and how to index content and search. The class will also cover faceted search, relevance tuning and query analysis.

Note that this class covers Solr, and does not dive into LucidWorks Enterprise, the enterprise-packaged version of Solr that you can license from Lucid Imagination. Nonetheless, if you have wondered whether open source can replace your existing search infrastructure, you should attend this class. Even though the class is on-line, space is limited, so get registered today!