4 posts categorized "Recommind"

January 20, 2015

Your enterprise search is like your teenager

During a seminar a while back, I made this spontaneous claim. Recently, I made the comment again, and decided to back up my claim - which I’ll do here.

No, really – it’s true. Consider:

You can give your search platform detailed instructions, but it may or may not do things the way you meant:

Modern search platforms provide a console where you, as the one responsible for search, can enter all of the information needed to index content and serve up results. You tell it what repositories to index; what security applies to the various repositories; and how you want the results to look. But did it do what you asked? Does it give you a full report of what it did, what it was unable to do, and why?

You really have no idea what it’s doing – especially on weekends:

 Search platforms are notorious for the lack of operational information they provide.

Does your platform give you a useful report of which content was indexed successfully and which was not – and why? Some platforms also stop indexing a file once it reaches a certain size: do you know which content was not completely indexed?

When it does tell you, sometimes the information is incomplete: 

Your crawler tells you there were a bunch of ‘404’ errors because of a bad or missing URL; but will it tell you which page(s) had the bad link? Chances are it does not. 
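
For what it's worth, keeping track of the referring page is not hard. Here is a minimal sketch in Python, assuming a toy in-memory "site" (the `SITE` dictionary, its URLs, and the `find_broken_links` function are all made up for illustration and do not come from any real crawler). It records, for every link that leads nowhere, which pages contained it:

```python
from collections import defaultdict, deque

def find_broken_links(site, start):
    """Crawl `site` from `start`; map each missing URL to the pages that link to it.

    `site` is a hypothetical stand-in for real HTTP fetches: a dict of
    page URL -> list of links on that page. A link with no entry in the
    dict plays the role of a 404.
    """
    broken = defaultdict(set)
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for link in site[page]:
            if link not in site:
                # Remember the referrer, not just the dead URL.
                broken[link].add(page)
            elif link not in seen:
                seen.add(link)
                queue.append(link)
    return {url: sorted(referrers) for url, referrers in broken.items()}

# Illustrative data only.
SITE = {
    "/home": ["/products", "/about"],
    "/products": ["/home", "/old-pricing"],
    "/about": ["/old-pricing", "/team"],
}

report = find_broken_links(SITE, "/home")
for url, referrers in sorted(report.items()):
    print(f"404 {url} <- linked from {', '.join(referrers)}")
```

A real crawler would fetch pages over HTTP and parse the links out of them, but the bookkeeping is the same: store the page you came from alongside each failure.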

They can be moody, and malfunction without any notice:

You schedule a full update of your index every weekend, and it has always worked flawlessly – as far as you know. Then, usually on a 3-day weekend, it fails. Why? See above.

When you talk to others who have search, theirs always sounds much better than yours:

As a conscientious search manager, you read about search, you attend webinars and conferences, and you always want to learn more. But you wonder why other search managers seem to describe their platform in glowing terms, and never seem to have any of the behavioral issues you live with every day. It kind of makes you wonder what you’re doing wrong with yours.

It costs more to maintain than you thought and it always needs updates:

When you first got the platform you knew there were ongoing expenses you’d have to budget – support, training, updates, consulting. But just like your kid who needs books, a computer, soccer coaching, and tuition, it’s always more than you budgeted. Sometimes way more!

You can buy insurance, but it never seems to cover what you really need:

Bear with me here: you get insurance for your kids in case they get sick or cause an accident, and you buy support and maintenance for your search platform.  But in the same way that you end up surprised that orthodontics are not fully covered, you may find out that help tuning the search platform, or making it work better, isn’t covered by the plan you purchased – in fact, it wasn’t even offered. QED.

It speaks a different vocabulary:

You want to talk with your kid and understand what’s going on; you certainly don’t want to look uncool. But like your kid, your search platform has a vocabulary that only barely makes sense to you. You know rows and columns, and thought you understood ‘fields’; but the search platform uses words you know in ways that don’t seem to match the definitions you’ve known from databases or CMS systems.

It's hard for one person to manage, especially when it's new:

Many surveys show that most companies have one (or fewer) full-time staff responsible for running the search engine – while the same companies claim search is ‘critical’ to their mission. Search is hard to run, especially in the first few years when everything needs attention. You can always get outside help – not unlike day care and babysitters – but it just seems it would be so much better if you had a team to help manage and maintain search and make it behave better.

How it behaves reflects on you:

You’re the search manager and you’ve got the job to make search work “just like Google”.  You spent more than $250K to get this search engine, and the fact that it just doesn’t work well reflects badly on you and your career. You may be worried about a divorce.

It doesn’t behave like the last one:

People tend to be nostalgic, as are many search managers I know. They learned how to take care of the previous one, but this new one – well, it’s NOTHING like the earlier one. You need to learn its habits and behaviors, and often adjust your behavior to ensure peace at work.

You know if it messes up badly late at night, even on a weekend or a holiday, you’ll hear about it:

If customers or employees around the world use your search platform, there is no ‘down time’: when it’s having an issue, you’ll hear about it, and will be expected to solve the issue – NOW. You may even have IT staff monitoring the platform; but when it breaks in some odd and unanticipated way, you get the call. (And when does search ever fail in an expected way?)

 You may be legally responsible if it messes up:

Depending on what your search application is used for, you may find yourself legally responsible for a problem. Fortunately, the chances of you personally being at fault are slim, but if your company takes a hit for a problem that you hadn’t anticipated, you may have some ‘career risk’ of your own. Was secure content about the upcoming merger accidentally made public? Was content to be served only to your Swiss employees when they search from Switzerland exposed outside of the country? And you can’t even buy liability insurance for that kind of error.

When it’s good, you rarely hear about it; when it's bad, you’ll hear about it:

Seriously, how many of you have gotten a call from your CIO to tell you what a great experience he or she had on the new search platform? Do people want to take you to lunch because search works so well? If you answered ‘yes’ to either of these, I’d like to hear from you!

In my experience, people only go out of their way to give feedback on search when it’s not working well. It’s not “like Google”, even though Google has hundreds of people and ‘bots’ examining every search query to try to make the results better, and you have only yourself and an IT guy.

You’ll hear. 

The work of managing it is never done:

The wonderful southern writer Ferrol Sams wrote:

“He's a good boy… I just can't think of enough things to tell him not to do.” Sound like your search platform? It will misbehave (or fail outright) in ways you never considered, and your search vendor will tell you “We’ve never seen a problem like that before”. Who has to get it fixed? You have to ask?

Once it moves away, you sometimes feel nostalgic:

Either you toss it out, or a major upgrade from your vendor comes along and the old search platform gets replaced. Soon, you’re wishing for the “Good old days” when you knew how cute and quirky the old one was, and you find yourself feeling nostalgic for it and wishing that it didn’t have to move out.

Do you agree with my premise? What have I missed?

July 21, 2014

Gartner MQ 2014 for Search: Surprise!

Funny, just last week I tweeted about how late the Gartner Magic Quadrant for Enterprise Search is this year. Usually it's out in March, and here it is, July.

Well, it's out - and boy does it have some surprises! My first take:

Coveo, a great search platform that runs on Windows only, is in the Leaders quadrant, and best overall on "Completeness of Vision". Don't get me wrong, it's a great search platform; but I guess completeness of vision does not include completeness of platform. Linux your flavor? Sorry.

HP/Autonomy IDOL is in the upper right quadrant as well, back strong as the top in 'Ability to Execute' and in the top three on 'Completeness of Vision'. IDOL has always reminded me of the reliable old Douglas DC-3, described by aviation enthusiasts as 'a collection of parts flying in loose formation', but it really does offer everything enterprise search needs. And, because it loves big hardware, everything that HP loves to sell.

BA Insight surprised me with their Knowledge Integration Platform at the top of the Visionaries quadrant. It enhances Microsoft SharePoint Search, or runs with a stand-alone version of Lucene. It's very cool, yes. But I sure don't think of it as a search engine. Do you? More on this later.

Attivio comes in solid in the lower right 'Visionaries' quadrant. I'd really expected to see them further along on both measures, so I'm surprised.

I'm really quite disappointed that Gartner places my former employer Lucidworks solidly in the lower left 'Niche players' quadrant. I think Lucidworks has a very good vision of where they want to go, and I think most enterprises will find it compelling once they take a look. I don’t think I'm biased when I say that this may be Gartner's big miss this year. And OK, I understand that, like BA Insight's Knowledge product, Lucidworks needs a search engine to run, but it feels more like a true search platform.

Big surprise: IHS, which I have always thought of as a publisher, has made it to the Gartner Niche quadrant as a search platform. Odd.

Other surprises: IBM in the Niche market quadrant, based on 'Ability to Execute'. Back at Verity, then-CEO Philippe Courtot got the Gartner folks to admit that the big component of Ability to Execute was really about how long you could fund the project, and I have to confess I figured IBM (and Google) as the MQ companies with the best cash position.

If you're not a Gartner client, I'm sorry you won't get the report or the insights of Whit Andrews (@WhitAndrews _), a longtime search analyst who knows his stuff. You can still find the report from several vendors happy to let you download the Gartner MQ for Search from them. Search Google and find the link you most prefer, or call your vendor for a full copy.

/s/Miles

March 29, 2012

Recommind looking to Predictive Coding to improve eDiscovery search

Recommind, the search vendor best known for its focus on eDiscovery, has recently blogged about predictive coding and their use of it in the product line. What, you may ask, is predictive coding? Their blog post does a good job of providing background and some definition.

Craig Carpenter, the author of the post, makes it clear that predictive coding is an assist for human reviewers; it is technology that is useful in conjunction with people and workflow. It assists the process of identifying critical concepts and using the information - in this case, in a discovery matter.

Autonomy has blasted predictive coding as inferior to its 'meaning based coding'. Autonomy has advertised the strength of its 'meaning based search' for a while now, so whether meaning based coding is something new for them, or a repackaging of the existing IDOL technology, we can't say.

We can say that eDiscovery is a growing field, and we've seen Recommind and hosted eDiscovery vendors like Catalyst (based on Mark Logic) make some big inroads. To have an eDiscovery system up and running in days rather than months is a key advantage; and we think these two are among the strong contenders in the market.

How long before predictive coding finds its way into your search platform's marketing material? Give us a few months and we'll tell you.

Do you see an advantage or disadvantage for meaning based coding over predictive coding? Let us know.

February 26, 2012

How many gigabytes of memory on your printer?

I read an article originally tweeted by @nickpatience, newly of search firm Recommind. In the FT article, HP's Mike Lynch talks about plans to introduce printers with embedded Autonomy IDOL.

At first, I had to chuckle. We've seen big systems brought to their knees indexing content with IDOL, and I imagined steam coming out of my HP laser printer as I print a long contract. (Maybe it was smoke... you know, printers need smoke to make them work. No, really. Ever seen a printer work after smoke came out of it?)

Then I realized that hundreds of companies bundle copies of IDOL with their products, and most implementations are quite successful with a relatively small footprint. And honestly, in another recent engagement, IDOL did provide the best 'out of the box' relevance. This is probably because of the way IDOL breaks documents into smaller units for indexing, and then reassembles them in the result list for human consumption.
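
To show why indexing smaller units might help relevance, here is a rough Python sketch of passage-level scoring. This is emphatically not IDOL's actual implementation, which I have no inside knowledge of; the function names, the eight-word passage size, the simple term-density score, and the two sample documents are all assumptions chosen for illustration. Each document is split into short passages, each passage is scored against the query, and the document is ranked by its best passage:

```python
import re

def passages(text, size=8):
    """Split text into consecutive chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def density(text, terms):
    """Fraction of the words in `text` that are query terms."""
    words = re.findall(r"\w+", text.lower())
    return sum(w in terms for w in words) / len(words) if words else 0.0

def search(docs, query, size=8):
    """Rank documents by their best passage; return (doc_id, score, passage)."""
    terms = set(query.lower().split())
    results = []
    for doc_id, text in docs.items():
        best = max(passages(text, size), key=lambda p: density(p, terms), default="")
        results.append((doc_id, density(best, terms), best))
    return sorted(results, key=lambda r: r[1], reverse=True)

# Illustrative documents only.
DOCS = {
    "contract": (
        "This agreement covers parking catering travel and holidays. "
        "The indemnity clause limits liability for both parties. "
        "Later sections discuss office supplies at great length."
    ),
    "memo": "Liability came up briefly in our lunch chat.",
}

for doc_id, score, passage in search(DOCS, "indemnity liability"):
    print(f"{doc_id}: {score:.2f} | {passage}")
```

In this toy data, scoring whole documents by the same density measure would favor the short memo, because the long contract dilutes its one relevant clause. Scoring by passage puts the contract first and hands back the clause itself for the result list.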

But hang on for a minute. A printer with a search engine? I know IDOL is well known in eDiscovery applications; and I've also heard of cases where one team of lawyers will subpoena the disk drives from opposing client's printers. Correct me if I'm wrong, but if I'm printing a document, isn't there a good chance it exists on file servers that are already indexed with IDOL (or one of its competitors)? I'd think there is an audit trail back to the original document... no?

And what is the interface, do you suppose? Federated results from an index within the printer? Traffic from the printer back to IDOL central servers to index the document as it passes through the network? I can imagine a way to reconstruct the document from the IDOL index; but that seems a bit extreme.

Anyway - it may just be that I'm too old-fashioned to understand this sort of thing. It feels to me like a technology - pardon me - in search of a market. I'd just as soon keep IDOL on my servers where I can understand what it's up to - and where it does a pretty darned good job!

What do you think?