53 posts categorized "Autonomy"

April 25, 2012

Vivisimo: Another one bites the dust

Earlier today, IBM announced that it was acquiring Vivisimo for an undisclosed sum. Now the tough question: what’s it all about? For the answer, let's take a quick trip to the early years of the decade.

Vivisimo was founded in 2000 out of Carnegie Mellon University. The first time we saw them, in 2004, they were marketing 'Clusty', a web clustering product that could examine huge numbers of web pages and then associate - or cluster - documents on specific terms. They also had some really strong federation capabilities built in. And the product was highly scalable. In fact, Vivisimo had great success in a number of huge government sites including the US Social Security site, FirstGov, the Defense Intelligence Agency, and commercial sites such as Eli Lilly. One thing all of these sites have in common? Lots of data. We have a term for that now: 'big data'.

IBM has made huge investments in open source search over the last 10 years, specifically in Lucene/Solr. Hadoop is the Apache answer for big data, and trust me, Hadoop is a hot topic this year.

What does Vivisimo bring IBM? Well... for one thing, clustering algorithms (and probably patents); a reputation for being able to handle huge data sets; and federation.

What should Vivisimo customers do now? Well, based on IBM's strong customer ethic, I think the answer is "don't panic": do nothing for now. Assuming Velocity is working for you, this acquisition should cause you no concern.

If you are evaluating Vivisimo, that's a bit more difficult. Some acquisitions, like Autonomy's acquisition of Verity, resulted in a wholesale replacement of the platform. Some customers made the switch early on and were happy; others fought to make IDOL work like K2, even with the 'compatibility mode', and never succeeded. You'll also remember that Microsoft, after acquiring FAST Search, dropped all non-Windows platforms a year later, which impacted upwards of 70% of the FAST installed base.

If you are willing to acquire a platform for a couple of years and see what happens, go for it. You may look back and discover you made the right choice. On the other hand, former President Reagan had a saying: "Trust, but verify". You might take a look around to see what platform is right for you now and into the future.

 

April 10, 2012

Autonomy 'King of the Cloud'

Years ago, my friend Jerry Gross in PR at HP related a funny story about how companies work with the press. He met with an editor of some electronics magazine to announce the new, improved memory chips that HP had created that actually provided 4K on a single chip! (This was a while ago!) 

Way back then this *was* news; but to Jerry and the reporter, it was yet another memory product to announce and the meeting was just specs and details. Jerry, a prankster at heart, decided to throw in a twist: he said that HP decided to use ROUND chips in this new product, rather than conventional rectangular ones. 

This piqued the reporter's interest: round chips? Yes, when you think about it, in rectangular chips, some of the bits are in a corner, so it takes longer for those bits to be accessed. By making the chip round, all bits were equidistant from the center so all could be accessed at the same speed!

The reporter was eating this up - something new and exciting! He wrote a quick paragraph for his publication before Jerry broke out laughing.  Luckily, their relationship was a good one and both had a great laugh about it. The round memory chip never made it to the world media.

Today I read an article in the London Business Weekly, reporting that Autonomy now has the "world’s largest private cloud", more than "50 petabytes of data including web content, video, email and multimedia data". Granted, Autonomy has a great service business in hosting and search-enabling all sorts of multimedia content. But ... I wonder if the reporter ever wondered out loud about some other rather large 'private clouds' - perhaps Google? Or Microsoft? Amazon?

Maybe none of these robust competitors are as big as Autonomy; maybe HP really became the cloud giant by acquiring Autonomy last year. Or maybe a round memory chip made it past a reporter today. What do you think?

 

 

March 29, 2012

Recommind looking to Predictive Coding to improve eDiscovery search

Recommind, the search vendor best known for its focus on eDiscovery, has recently blogged about predictive coding and their use of it in the product line.  What, you may ask, is predictive coding? The blog post above does a good job of providing background and some definition.

Craig Carpenter, the author of the post, makes it clear that predictive coding is an assist for human reviewers; it is technology that is useful in conjunction with people and workflow. It assists the process of identifying critical concepts and using the information - in this case, in a discovery matter.

Autonomy has blasted predictive coding as inferior to its 'meaning based coding'. Autonomy has advertised the strength of its 'meaning based search' for a while now, so whether meaning based coding is something new for them or a repackaging of the existing IDOL technology, we can't say.

We can say that eDiscovery is a growing field, and we've seen Recommind and hosted eDiscovery vendors like Catalyst (based on MarkLogic) make some big inroads. To have an eDiscovery system up and running in days rather than months is a key advantage, and we think these two are among the strong contenders in the market.

How long before predictive coding finds its way into your search platform's marketing material? Give us a few months and we'll tell you.

Do you see an advantage or disadvantage for meaning based coding over predictive coding?  Let us know.

 

 

 

March 28, 2012

The importance of context in enterprise search

For years we have talked about the importance of context when it comes to enterprise search. We blogged about it as long ago as 2007, and we stressed that the context of the user, the content, and the query all need to be considered between the time the user clicks 'Search' and the time the search platform gets the extended query. As an example, we've used things like Google's special treatment of 12-digit numbers that match the algorithm for FedEx tracking numbers.
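To make the tracking-number example concrete, here is a minimal sketch of that kind of query inspection. The function name and the "12 digits" rule are our own illustrative assumptions; a real implementation would also validate the carrier's check digit, which we leave out here.

```python
import re

# Assumption for illustration: any query made up of exactly 12 digits
# (spaces or hyphens allowed between them) is a *candidate* tracking
# number. Check-digit validation is deliberately omitted.
TRACKING_CANDIDATE = re.compile(r"\d{12}")


def classify_query(query: str) -> str:
    """Return a coarse query type used to choose special handling."""
    compact = re.sub(r"[\s-]", "", query)
    if TRACKING_CANDIDATE.fullmatch(compact):
        return "tracking_candidate"
    return "keyword"
```

A front end could call `classify_query` before the query ever reaches the search platform, and show a package-tracking link above the normal results when it returns `tracking_candidate`.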

Now it appears that Google has started plans to expand their use of context as published in the Wall Street Journal and called out in blog postings from Avalon's Joe Hilger and Mashable's Lance Ulanoff. Google's Amit Singhal spoke of the shift from keywords to meaning, a change not only at Google but, over time, in the enterprise search platforms most companies use internally every day.

As we talk about in a recent webinar 'Secrets your Search Vendor Won't Tell You', search platform vendors have always trailed user requirements; sometimes you just need to write your own custom code to create a search experience users are happy with. You often need to add your own pre-search processing code to analyze the user query and create an expanded query using the vendor-specific search operators; make the most of standard platform capabilities; and post-process the search result list in order to give your users a great, meaningful, helpful set of results and actions.
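The pre-search and post-search steps described above can be sketched roughly as follows. Everything here is hypothetical: the synonym table, the `department` field, the `Hit` structure, and the generic `AND`/`OR`/`field:value` syntax all stand in for whatever your own platform provides.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical synonym table; in practice this might come from a
# taxonomy or from query logs.
SYNONYMS = {"pto": ["vacation", "paid time off"]}


@dataclass
class Hit:
    url: str
    title: str
    score: float


def expand_query(query: str, department: str | None = None) -> str:
    """Pre-search step: add synonyms and user context to the query.

    The operator syntax is generic; substitute your platform's own.
    """
    clauses = []
    for term in query.lower().split():
        alternatives = [term] + SYNONYMS.get(term, [])
        # Quote multi-word alternatives so they are treated as phrases.
        quoted = [f'"{a}"' if " " in a else a for a in alternatives]
        if len(quoted) > 1:
            clauses.append("(" + " OR ".join(quoted) + ")")
        else:
            clauses.append(quoted[0])
    expanded = " AND ".join(clauses)
    if department:
        expanded += f" AND department:{department}"
    return expanded


def post_process(hits: list[Hit], limit: int = 10) -> list[Hit]:
    """Post-search step: keep the best-scoring hit per URL, best first."""
    best: dict[str, Hit] = {}
    for hit in hits:
        if hit.url not in best or hit.score > best[hit.url].score:
            best[hit.url] = hit
    return sorted(best.values(), key=lambda h: h.score, reverse=True)[:limit]
```

The point of keeping both steps outside the platform is that the context logic stays yours: you can change the synonym list or the de-duplication rule without waiting for a vendor release.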

At ESS New York in May, we're doing a pre-conference workshop that will take a deep dive into this process. We'll talk about how you can do this extended processing, with representative examples of this type of contextual enhancement for several popular search platforms. If you're going to be in New York anyway, come to the workshop!

/s/Miles

 

 

March 23, 2012

Making sense of the Lexmark/ISYS acquisition

The dust is settling from the news announced last Monday that Lexmark was acquiring independent search and tools company ISYS. Just a couple of weeks before, HP's Mike Lynch, founder of recently acquired Autonomy, speculated to the press that HP may produce an Autonomy Search Appliance, and furthermore that HP might integrate Autonomy IDOL with its high-end printers. That seemed funny at the time; does the world really need printers that can search? Risk management makes for strange bedfellows, I suppose.

But now, barely two weeks later, Lexmark's Perceptive Software company (a 'Lexmark Company') acquires ISYS. One wonders if Mike was aware of those negotiations.

Stranger still: Perceptive also announced it had acquired Massachusetts-based Nolij, a leading provider of document imaging and workflow solutions. While their focus has been in the education marketplace, presumably there is some synergy between document imaging, scanning, search - and hard-copy output devices. And you'd think that growing a market among enterprises is a possibility for a well-respected imaging company formerly focused only on higher education markets.

So imagine a system that can scan and OCR textual content and full-text index the content into your enterprise search platform. Seamless magic. Do you think anyone would want that?

I think we'll see more and more of this kind of software/hardware integration, and not just 'because we can'. Consider this: New Zealand-based Pingar uses its software to perform entity recognition and extraction, and it is now embedded in high speed scanners to eliminate the need to manually enter data from standardized forms like insurance warranty claims. Disclosure: here at New Idea Engineering, we were so impressed with the value Pingar brings to enterprise search that we signed on as their first US-based partner.

Can you imagine a scanner that can scan, OCR, extract entities, and create a search index in real time as it reads your forms? Federate that into your existing enterprise search platform with appropriate security, and some big problems for big companies are solved pretty much automatically. Seamless magic indeed!

What do you think? Would you find such a device useful at your organization?

 

 

 

 

February 26, 2012

How many gigabytes of memory on your printer?

I read an article originally tweeted by @nickpatience, newly of search firm Recommind. In the FT article, HP's Mike Lynch talks about plans to introduce printers with embedded Autonomy IDOL.

At first, I had to chuckle. We've seen big systems brought to their knees indexing content with IDOL, and I imagined steam coming out of my HP laser printer as I print a long contract. (Maybe it was smoke... you know, printers need smoke to make them work. No, really. Ever seen a printer work after smoke came out of it?)

Then I realized that hundreds of companies bundle copies of IDOL with their products, and most implementations are quite successful with a relatively small footprint. And honestly, in another recent engagement, IDOL did provide the best 'out of the box' relevance. This is probably because of the way IDOL breaks documents into smaller units for indexing, and then reassembles them in the result list for human consumption.
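To illustrate the general idea of indexing smaller units and rolling them back up to documents, here is a toy sketch. It is not IDOL's actual implementation; the fixed-size word windows and the term-overlap score are our own simplifying assumptions.

```python
def split_into_passages(doc_id, text, size=50):
    """Split a document into passages of at most `size` words."""
    words = text.split()
    return [(doc_id, " ".join(words[i:i + size]))
            for i in range(0, len(words), size)]


def search(passages, query):
    """Score passages by distinct query terms matched, then report
    each document once, represented by its best-scoring passage."""
    terms = set(query.lower().split())
    best = {}
    for doc_id, passage in passages:
        score = len(terms & set(passage.lower().split()))
        if score > 0 and score > best.get(doc_id, (0, ""))[0]:
            best[doc_id] = (score, passage)
    rows = [(doc_id, score, passage)
            for doc_id, (score, passage) in best.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

Because a document is ranked by its best passage rather than by the whole text, a long contract with one highly relevant clause can outrank a short memo that mentions the terms only in passing.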

But hang on for a minute. A printer with a search engine? I know IDOL is well known in eDiscovery applications; and I've also heard of cases where one team of lawyers will subpoena the disk drives from opposing client's printers. Correct me if I'm wrong, but if I'm printing a document, isn't there a good chance it exists on file servers that are already indexed with IDOL (or one of its competitors)? I'd think there is an audit trail back to the original document... no?

And what is the interface, do you suppose? Federated results from an index within the printer? Traffic from the printer back to IDOL central servers to index the document as it passes through the network? I can imagine a way to reconstruct the document from the IDOL index; but that seems a bit extreme.

Anyway - it may just be that I'm too old-fashioned to understand this sort of thing. It feels to me like a technology - pardon me - in search for a market. I'd just as soon keep IDOL on my servers where I can understand what it's up to - and where it does a pretty darned good job!

What do you think?

 

January 11, 2012

Webinar: What users want from enterprise search in 2012

If you ask the average enterprise user what he or she wants from their internal search platform, chances are good that they will tell you they want search 'just like Google'. After all, people are born with the ability to use Google; why should they need to learn how to use their internal search?

The problem is that web search works so well because, at the sheer scale of the internet, search can take advantage of methodologies that are not directly applicable to the intranet. Yet many of the things that make the public web experience so good can, in fact, be adapted in the enterprise. Our opinion is that, beyond a base level, the success of any enterprise search platform depends on how it is implemented and managed rather than on the core technology.

In this webinar we'll talk about what users want, and how you can address the specific challenges of enterprise content and still deliver a satisfying and successful enterprise search experience inside the firewall.

Register today for our first webinar of the new year, scheduled for January 25: What enterprise users want from search in 2012.

 

 

 

 

 

 

January 10, 2012

ISYS filters to be used for SAP Platforms

ISYS announced today that SAP has selected the popular ISYS Document Filters to replace software from both Autonomy and Oracle in their popular suite of analytical products.

ISYS, which has marketed an enterprise search product successfully for years, recognized the need for high-capability, low-cost document filters, and packaged their internally developed technology. Because of its capabilities, support and price, ISYS Document Filters have become the best choice for companies that need to extract content from hundreds of different formats.

We particularly like that the ISYS filters are lightweight, easy to implement, and priced such that any company can afford to use them in-house or bundled with product. Large companies that use Lucene/Solr for search but insist on having supported, up-to-date filtering technology can solve the problem at a competitive price with ISYS.

 

 

November 08, 2011

Are you spending too much on enterprise search?

If your organization uses enterprise search, or if you are in the market for a new search platform, you may want to attend our webinar next week "Are you spending too much for search?". The one hour session will address:

  • What do users expect?
  • Why not just use Google?
  • How much search do you need?
  • Is an RFI a waste of time?   

Date: Wednesday, November 16 2011

Time: 11AM Pacific Standard Time / 1900 UTC

Register today!

October 25, 2011

What search platform is best? Workshop at KMWorld

Next week in Washington DC, InfoToday runs their Fall enterprise search conferences - KM World, Enterprise Search Summit, SharePoint Symposium, and Taxonomy Boot Camp... whew! Monday - Halloween Day! - I am giving a workshop at the conferences with the somewhat vague title 'Enterprise Search Technologies'.

What I'll be talking about is an overview of the platform vendors, with some detail on strengths and weaknesses of the vendors; and a drill down into what you need to do before you call the vendors (if you value your time).

You can still sign up for the workshop for $295US or the entire conference for a bit more; see you in DC in a week!

/s/Miles