
June 28, 2017

Poor data quality gives search a bad rap

If you’re involved in managing the enterprise search instance at your company, there’s a good chance you’ve heard at least some users complain about the poor results they see.

The common lament search teams hear is “Why didn’t we just use Google?” Yet even at sites that implemented the Google Search Appliance (GSA) but don’t use the Google logo and look, we’ve seen the same complaints.

We're often asked to come in and recommend a solution. Sometimes the problem is simply the wrong search platform: not every platform handles every use case and requirement equally well. Occasionally the problem is a poorly configured search, or simply an instance that hasn’t been managed properly. Even the renowned Google public search engine doesn’t happen by itself; and even Google is a poor comparison, because in recent years its search has become less of a search platform and more of a big data analytics engine.

Over the years, we’ve been helping clients select, implement, and manage intranet search. In my opinion, the real problem with search usually lies elsewhere: poor data quality.

Enterprise data isn’t created with search in mind. There is little incentive for content authors to attach quality metadata in the properties fields of Adobe PDF Maker, Microsoft Office, and other document publishing tools. To make matters worse, there may be several versions of a given document as it goes through creation, editing, review, and updates; and often the early drafts, as well as the final version, sit in the same directory or file share. Public-facing web site content rarely has such issues.
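If you want to see how bad the clutter is before blaming the search engine, a quick audit helps. Here’s a minimal sketch - the file share path and the ‘draft marker’ patterns are made-up examples, not anything from a client engagement - that walks a share and flags files that look like multiple drafts of the same document:

```python
# Hypothetical audit of a file share for likely duplicate drafts.
# The path and draft markers are illustrative; adjust for your environment.
import os
import re
from collections import defaultdict

DRAFT_MARKERS = re.compile(r"(draft|old|copy|final|v\d+)", re.IGNORECASE)

def audit(share_root):
    groups = defaultdict(list)
    for dirpath, _, filenames in os.walk(share_root):
        for name in filenames:
            base, _ = os.path.splitext(name)
            # Strip version/draft markers so "report_v1" and "report_final"
            # collapse to the same key - probably the same document, twice.
            key = DRAFT_MARKERS.sub("", base).strip(" _-").lower()
            groups[(dirpath, key)].append(name)
    for (dirpath, key), names in sorted(groups.items()):
        if len(names) > 1:
            print("%s: %d likely drafts of '%s': %s"
                  % (dirpath, len(names), key, ", ".join(sorted(names))))

audit(r"\\fileserver\shared\projects")   # hypothetical UNC path
```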

Some content management systems make it easy to implement what is really ‘search engine optimization’, or SEO; but all too often that optimization is left for the enterprise search platform to work out.

We have an updated two-part series on data quality and search, starting here. We hope you find it helpful; let us know if you have any questions!

June 22, 2017

First Impressions on the new Forrester Wave

The new Forrester Wave™: Cognitive Search And Knowledge Discovery Solutions is out, and once again I think Forrester, along with Gartner and others, misses the mark on the real enterprise search market.

In the belief that a quick first impression will at least get a conversation going until I can write up a more complete analysis, here are my first thoughts.

First, I am not wild about the new buzzterms 'cognitive search' and 'insight engines'. Yes, enterprise search can be intelligent, but it's not cognitive, which Webster defines as "of, relating to, or involving conscious mental activities (such as thinking, understanding, learning, and remembering)". HAL 9000 was cognitive software; "Did you mean" and "You might also like" are not cognition. And enterprise search has always provided insights into content, so why the new 'insight engines'?

Moving on, I agree with Forrester that Attivio, Coveo, and Sinequa are among the leaders. Honestly, I wish Coveo were fully multi-platform, but they do have an outstanding cloud offering that, in my mind, addresses much of the issue.

However, unlike Forrester, I believe Lucidworks Fusion belongs right up there with the leaders. Fusion starts with a strong open source Solr-based core; an integrated administrative UI; a great search UI builder (via the recent acquisition of Twigkit); and multi-platform support. (Yep, I worked there a few years ago, but well before the current product was created.)

I count IDOL in with the 'Old Guard' along with Endeca, Vivisimo (‘Watson’), and perhaps others - former leaders still available, but offered by non-search companies, or removed from traditional enterprise search (Watson). And it will be interesting to see whether IDOL and its new parent, Micro Focus, survive the recent shotgun wedding.

Tier 2, great search but not quite “full” enterprise search, includes Elastic (which I believe is in the enviable position of being *the* platform for IoT), MarkLogic, and perhaps one or two more.

And there are several newer or perhaps less-well known search offerings like Algolia, Funnelback, Swiftype, Yippy and more. Don’t hold their size and/or youth against them; they’re quite good products.

No, I’d say the Forrester report is limited, and honestly a bit out of touch with the real enterprise search market. I know, I know: how do I really feel? Stay tuned, I've got more to say coming soon. What do you think? Leave a comment below!

November 16, 2016

What features do your search users really want?

What features and capabilities do corporate end-users need from their search platform? Here's a radical concept: ask stakeholders what they want - and what they need - and make a list. No surprise: you'll have too much to do.

Try this: meet with stakeholders from each functional area of the organization. During each interview, ask people to tell you what internet search sites they use for personal browsing, and what capabilities of those sites they like best. As they name the desired features, write them on a white board.

Repeat this with representatives from every department, whether marketing, IT, support, documentation, sales, finance, shipping or others - really every group that will use the platform for a substantial part of their days. 

Once you have the list, ask for a little more help. Tell your users they each have $100 in "Dev Dollars" to invest in new features, and ask them to spend whatever portion they want on each feature - but all they have is $100 DD.

Now the dynamics get interesting. The really important features get the big bucks; the outliers get a pittance -  if anything. Typically, the top two or three features requested get between 40DD and 50DD; and that quickly trails off. 

I know - it sounds odd. These Dev Dollars have no true value - but people give a great deal of thought to assigning relative value to a list of capabilities - and it gives you a feature list with real priorities.
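If you want to turn the whiteboard into a ranked list, the arithmetic is trivial. Here's a small sketch - the stakeholders, features, and allocations are invented for illustration - that tallies everyone's Dev Dollars and sorts the result:

```python
# Hypothetical Dev Dollar tally; names and amounts are examples only.
from collections import Counter

allocations = {
    "marketing": {"better relevance": 50, "did-you-mean": 30, "facets": 20},
    "support":   {"better relevance": 40, "facets": 40, "saved searches": 20},
    "finance":   {"security trimming": 60, "better relevance": 40},
}

totals = Counter()
for stakeholder, votes in allocations.items():
    assert sum(votes.values()) == 100, "each stakeholder gets exactly 100 DD"
    totals.update(votes)

# The features with the biggest totals are your real priorities.
for feature, dd in totals.most_common():
    print("%-20s %d DD" % (feature, dd))
```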

How do you discover what users really want?

May 31, 2016

The Findwise Enterprise Search and Findability Survey 2016 is open for business

Would you find it helpful to benchmark your enterprise search operations against hundreds of corporations, organizations, and government agencies worldwide? Before you answer: would you find that information useful enough that you’d spend a few minutes answering a survey about your enterprise search practices? It seems like a pretty good deal to me to get real-world data from people just like yourself worldwide.

This survey, whose results are useful, insightful, and actionable for search managers everywhere, provides visibility into many of the critical areas of search.

Findwise, the Swedish company with offices there and in Denmark, Norway, Poland, and London, is gathering data now for the 2016 edition of its annual Enterprise Search and Findability Survey at http://bit.ly/1sY9qiE.

What sorts of things will you learn?

Past surveys give insight into the difference between companies with happy search users and those whose employees prefer to avoid using internal search. One particularly interesting finding last year was that there are three levels of ‘search maturity’, identifiable by how search is implemented across content.

The least mature search organizations, roughly 25% of respondents, have search for specific repositories (silos), but they generally treat search as ‘fire and forget’: once it is installed, there is no ongoing oversight.

More mature search organizations, representing about 60% of respondents, have one search across all silos; but maintaining and improving the search technology gets very little staff attention.

The remaining 15% of organizations answering the survey invest in search technology and staff, and continuously attempt to improve search and findability. These organizations often have multiple search instances tailored for specific users and repositories.

One of my favorite findings a few years back was that a majority of enterprises have “one or less” full-time staff responsible for search; and yet a similar majority of employees reported that search just didn’t work. The good news? Subsequent surveys have shown that staffing search with as few as 2 FTEs improves overall search satisfaction, and 3 FTEs seem to improve it strongly. And even more good news: over the years, the trend shows that more and more organizations are taking enterprise search and findability seriously.

You can participate in the 2016 Findwise Enterprise Search and Findability Survey in just 10 or 15 minutes and you’ll be among the first to know what this year brings. Again, you’ll find the 2016 survey at http://bit.ly/1sY9qiE.

November 05, 2014

Search Owner's Dilemma

In my session today at Enterprise Search & Discovery, I finished up with a rendition of the old rhyme called "The Engineer's Dilemma", updated for the folks who manage enterprise search in large organizations. Folks seemed to like it, so I'll share it here for those who were unable to be at the conference. I call it "The Search Manager's Dilemma".

It's not my job to pick our search
The call's not up to me.
It's not my place to say how much
The cost of search should be.
It's not my place to tune the thing,
Not even do it well,
But let the damn thing miss a page
And see who catches hell!

Enjoy!

August 25, 2014

Is Elasticsearch really enterprise search?

Not too long ago, Gartner released its 2014 Magic Quadrant, which I’ve written about here and which has generated a lively discussion in the Enterprise Search Engine Professionals group over on LinkedIn.

Much of the discussion I’ve seen about this year's MQ deals with the omission of several platforms that most people think of as 'enterprise search’. Consider that MQ alumni Endeca, Exalead, Vivisimo, Microsoft FAST, and others don’t even appear this year. Over the last few years larger companies acquired most of these players, but in the MQ it's as if they simply ceased to exist.

The other name I've heard mentioned, besides these previous MQ alumni, is Elasticsearch, a relatively new start-up. Elasticsearch, based on Apache Lucene, recently had a huge round of investment from some A-list VCs. What's the deal, Gartner?

Before I share my opinion, I have to reiterate that, until recently, I was an employee of Lucidworks, which many people see as a competitor to Elasticsearch. I believe my opinions are valid here, and I believe I’m known for being vendor-neutral. I think the best search platform for a given environment is a function of the platform and the environment – what data sources, security, management, and budget apply for any given company or department. ‘Search engine mismatch’ is a real problem, and we’ve written about it for years.

Given that caveat, I believe I’m accurately describing the situation, and I encourage you to leave a comment if you think I've lost my objectivity!

OK, here goes. I don't believe Elasticsearch is in the enterprise search space. For that reason, if for no other, it doesn’t belong on the Gartner Magic Quadrant for search.

You heard it here. It's not that I don't think Elasticsearch is a powerful, cool, and valuable tool. It is all that, and more. As I mentioned, it’s based on Apache Lucene, a fantastic embedded search tool. In fact, it's the same tool Solr (and therefore Lucidworks' commercial products) is based on. But Lucene by itself is a tool more than a solution for enterprise search.

Let me start by addressing what I think Elasticsearch is great for: search-enabled data visualization. The first time I attended an Elasticsearch meet-up, they were showing the product in conjunction with two other open source projects: Logstash and Kibana. The total effect was great and made for a fantastic demo! I was fully and completely impressed, and saw the value immediately - search driving a visualization tool that was engaging, interactive, and exciting! 

Since then, Elasticsearch has apparently hired the creators of those two open source projects, and has now morphed into a log analytics company - more like Splunk with great presentation capability, and less like traditional enterprise search. Their product is ELK: Elasticsearch, Logstash, Kibana. You can download all of these from GitHub, by the way.
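To make the flow concrete, here's a tiny sketch of the kind of record this stack deals in - assuming a local Elasticsearch node on port 9200; the index name and log event are made up, and in a real deployment Logstash would be doing this shipping for you:

```python
# Hypothetical: push one parsed log event into a local Elasticsearch node,
# the same kind of record Logstash normally ships and Kibana charts.
import json
import urllib.request

event = {
    "@timestamp": "2014-08-25T10:15:00Z",
    "host": "web01",
    "status": 500,
    "message": "GET /checkout failed",
}

req = urllib.request.Request(
    # "_doc" is the newer endpoint style; older releases used an explicit type name.
    "http://localhost:9200/logs-2014.08.25/_doc",
    data=json.dumps(event).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(urllib.request.urlopen(req).read().decode("utf-8"))
```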

(Lucidworks has also seen the value of Kibana to enterprise search, and has released its own version of Logstash and Kibana integrated with Solr, called SiLK - Solr Integrated Logstash and Kibana.)

Now let me tell you why I do not think of Elasticsearch as an enterprise search solution. First, in my time at Lucid, I'm not aware of any enterprise opportunities that Lucidworks lost to ELK. I could be wrong, and maybe the Elastic guys know of many deals we never saw at Lucid. But with no crawler or other components I consider ‘required’ in an enterprise search product, I'm not sure they're interested - yet, at least.

Next, check the title of their home page: "Open Source Distributed Real Time Search". Doesn't scream 'Google Search Appliance replacement', does it? Read Elasticsearch founder Shay Banon on the GSA.

Finally, Wired Magazine has an even more interesting quote from Shay Banon on SharePoint: “We're not doing enterprise search in the traditional sense. We're not going to index SharePoint documents.”

Now, with the growth and the money Elasticsearch has, they may change their tune. But with over $100M in venture capital now, I think their investors are valuing Elasticsearch as a Splunk competitor, and perhaps a NoSQL search product for Hadoop - not a traditional enterprise search engine. 

So the real question is: which space are you in? Enterprise Search with SharePoint and other legacy data sources? Web content and file shares you need a crawler for? Is LDAP or Active Directory security important to you? Well - I won't say 'no way' - but I'd want to see it before I buy.

Do you use Elasticsearch for your enterprise search? Let me hear from you!

August 21, 2014

More on the Gartner MQ: Fact or fiction?

There is a lively discussion going on over in the LinkedIn ‘Enterprise Search Engine Professionals’ group about the recent Gartner Magic Quadrant report on Enterprise Search. Whit Andrews, a Gartner Research VP, has replied that the Gartner MQ is not 'pay to play'. I confess to being the one who brought the topic up in those threads, at least, and I certainly thank Whit for clarifying the misunderstanding directly.

That said, two of my colleagues who are true search experts have raised some questions I thought should be addressed.

Charlie Hull of UK-based Flax says he's “unconvinced of the value of the MQ to anyone wanting a comprehensive … view of the options available in the search market”. And Otis Gospodnetić of New York-based Sematext asks "why (would) anyone bother with Gartner's reports. We all know they don't necessarily match the reality". I want to try to address those two very good points.

First, I'm not sure Gartner claims to be a comprehensive overview of the search market. Perhaps there are more thorough lists - my friends and colleagues Avi Rappoport and Steve Arnold both have more complete coverage. Avi, now at Search Technologies, still maintains www.searchtools.com with a list that is as much a history of search as a list of vendors. And Steve Arnold has a great deal of free content on his site as well as high-quality technology overviews by subscription. Find links to both at arnoldit.com.

Nonetheless, Gartner does have published criteria, and being a paid subscriber is not one of them. Whit's fellow Gartner analyst French Caldwell calls that out on his blog. By the way, I have first-hand experience that Gartner is willing to cut some slack to companies that don't quite meet all of their guidelines for inclusion, and I think that adds credence to the claim that inclusion is not 'pay to play'.

A more interesting question is one that Otis raises: “why would anyone bother with Gartner's reports”?

To answer that, let me paraphrase a well-known quote from the early days of computers: "No one ever got fired for following Gartner's advice". They are well known for having good if not perfect advice - and I'd suspect that in the fine print, Gartner even acknowledges the fallibility of their recommendations. And all of us know that in real life, you can't select software as complex as an enterprise search platform without a proof of concept in your environment and on your content.

The industry is full of examples where the *best* technology loses pretty consistently to 'pretty good' stuff backed by a major firm/analyst/expert. Otis, I know you're an expert, and I'd take what you say as gospel. A VP at a big corporation who is not familiar with search (or his company's detailed search requirements) may not do so. And anyone on that VP's staff who picks a platform based solely on what someone like you or I say probably faces some amount of career risk. That said, I think I speak for Otis and Charlie and others when I say I am glad that a number of folks have listened to our advice and are still fully employed!

So - in summary, I think we're all right. Whit Andrews and Gartner provide advice that large organizations trust because of the overall methodology of their evaluation. Everyone does know it's not infallible, so a smart company will use the 'trust but verify' approach. And they continue to trust you and me, but more so when Gartner or Forrester or one of the large national consulting companies confirms our recommendation. And if not, we have to provide a compelling reason why something else is better for them. And the longer we're successful with our clients, the more credible we become.

July 29, 2014

Big data: Salvation for enterprise search?

Or just another data source?

With all the acquisitions we've seen in enterprise search in the last several years, it's no wonder that the field looks boring to the casual observer. Most companies have gone through two or more complex, costly implementations of a new search platform, users still complain, and in some quarters there seems to be 'quality of search fatigue'. I acknowledge I'm biased, but I think enterprise search, implemented and managed properly, provides incredible value to corporations, their employees, and their customers/consumers. That said, a lot of companies seem to treat search as 'fire and forget', and after it's installed, it never gets the resources to get off the ground in a quality way.

It's no surprise, then, that the recent hype bubble around 'Big Data' has the attention of enterprise search companies as they see a way to convince an entirely new group of technologists that search is the way.

It's certainly true that Hadoop's beginnings were related to search - as a repository for the web crawler Nutch, no doubt in preparation for highly scalable indexing in Lucene/Solr. Hadoop and its zoo* of related tools certainly are designed for nerds. At best, it's a framework that sits on top of your physical disks; at worst, it's a file system that supports authentication but not really security (in the LDAP/AD sense). And for a data scientist, it's a great tool for writing 'jobs' that manipulate content in interesting ways. How is your Java? Python? Clojure? Better brush up.
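For the curious, a Hadoop 'job' doesn't have to mean a pile of Java. Here's a minimal Hadoop Streaming mapper sketch in Python - the word-count example and the paths in the comments are purely illustrative:

```python
#!/usr/bin/env python
# Minimal Hadoop Streaming mapper: emits (word, 1) pairs for a word count.
# Hypothetical invocation (paths and jar name are placeholders):
#   hadoop jar hadoop-streaming.jar \
#     -input /data/raw -output /data/counts \
#     -mapper mapper.py -reducer reducer.py
import sys

def main():
    for line in sys.stdin:
        for word in line.strip().split():
            # Hadoop Streaming expects tab-separated key/value pairs on stdout.
            print("%s\t%d" % (word.lower(), 1))

if __name__ == "__main__":
    main()
```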

The enterprise search vendors of the world certainly see the tremendous interest in Hadoop and 'big data' as a great opportunity to grow their business. And for the right use cases, most enterprise search platforms can address the problem. But remember that, to enterprise search, the content you store in Hadoop is simply content in a different repository: a new data source on the menu.

But remember, big data apps come with all the same challenges enterprise search has faced for years, plus a few more. Users - even data scientists and researchers - think web-based Google is search; and even though, as a group, this demographic may be more intelligent than your average search users, they still expect your search to 'just know'. If you think babysitting your existing enterprise search solution is tough, wait until you see what billions of documents do for you.

And speaking of billions of records: how long does your current search platform take to index new content? How long does it take to do a full re-index of your existing content? Now extrapolate: how long will it take to index a few billion records? (Note: some vendors can provide a much faster 'pipe' for indexing massive content from Hadoop. Lucidworks and Cloudera are two of the companies I am familiar with; there may be others.)
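The back-of-the-envelope math is sobering. A quick sketch, using made-up numbers rather than anything from a real engagement, and assuming (generously) that indexing throughput scales linearly:

```python
# Back-of-the-envelope re-index projection; all figures below are invented examples.
current_docs = 10000000            # documents in today's index
full_index_hours = 5.0             # observed time for a full re-index today
docs_per_hour = current_docs / full_index_hours

hadoop_docs = 2000000000           # a "few billion" records landing in Hadoop
projected_hours = hadoop_docs / docs_per_hour
print("Projected full index: %.0f hours (about %.1f weeks)"
      % (projected_hours, projected_hours / (24 * 7)))
# -> 1000 hours, roughly six weeks - and that's if throughput really scales linearly.
```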

A failure in search? Well, it depends what you want. If you are going to treat Hadoop as a 'Ha-Dump' with all of your log files, all of your customer transactional data, hundreds of Twitter feeds for ever and ever, and add your existing enterprise data, you're going to have some time on your hands while the data gets indexed.

On the other hand, if you're smart about where your data goes, break it into 'data lakes' of related content, and use the right tool for each type of data, you won't be using your enterprise search platform for use cases better served with analytics tools that are part of the Apache Zoo; and you’ll still be doing pretty well. And in that universe, Hadoop is just another data source for search - and not the slow pipe through which all of your data has to flow.

Do you agree?

*If you get the joke, chances are you know a bit about the Apache project and open source software. If not, you may want to hold off and research before you download your first Hadoop VM.

July 21, 2014

Gartner MQ 2014 for Search: Surprise!

Funny, just last week I tweeted about how late the Gartner Magic Quadrant for Enterprise Search is this year. Usually it's out in March, and here it is, July.

Well, it's out - and boy does it have some surprises! My first take:

Coveo, a great search platform that runs on Windows only, is in the Leaders quadrant, and best overall in "Completeness of Vision". Don't get me wrong, it's a fine product; but I guess completeness of vision does not include completeness of platform. Linux your flavor? Sorry.

HP/Autonomy IDOL is in the upper right quadrant as well, back strong as the top in 'Ability to Execute' and in the top three on 'Completeness of Vision'. IDOL has always reminded me of the reliable old Douglas DC-3, described by aviation enthusiasts as 'a collection of parts flying in loose formation', but it really does offer everything enterprise search needs. And, because it loves big hardware, everything that HP loves to sell.

BA Insight surprised me with their Knowledge Integration Platform at the top of the Visionaries quadrant. It enhances Microsoft SharePoint Search, or runs with a stand-alone version of Lucene. It's very cool, yes. But I sure don't think of it as a search engine. Do you? More on this later.

Attivio comes in solid in the lower right 'Visionaries' quadrant. I'd really expected to see them further along on both measures, so I'm surprised.

I'm really quite disappointed that Gartner places my former employer Lucidworks solidly in the lower left 'Niche players' quadrant. I think Lucidworks has a very good vision of where they want to go, and I think most enterprises will find it compelling once they take a look. I don’t think I'm biased when I say that this may be Gartner's big miss this year. And OK, I understand that, like BA Insight's Knowledge product, Lucidworks needs a search engine to run, but it feels more like a true search platform.

Big surprise: IHS, which I have always thought of as a publisher, has made it to the Gartner Niche quadrant as a search platform. Odd.

Other surprises: IBM in the Niche players quadrant, based on 'Ability to Execute'. Back at Verity, then-CEO Philippe Courtot got the Gartner folks to admit that the big component of Ability to Execute was really about how long you could fund the project, and I have to confess I figured IBM (and Google) as the MQ companies with the best cash position.

If you're not a Gartner client, I'm sorry you won't get the report or the insights of Whit Andrews (@WhitAndrews_), a long-time search analyst who knows his stuff. You can still find the report from several vendors happy to let you download the Gartner MQ for Search from them. Search Google and find the link you most prefer, or call your vendor for a full copy.

/s/Miles

A new V for Big Data: Visitor

About an hour after I wrote my most recent post, What does it take to qualify as 'Big Data'?, about the multiple Vs of Big Data, I still had words that start with the letter 'V' bouncing around in my head. Then it hit me: another V word for Big Data is Visitor.

For years, I've been writing about the importance of context in search - basically data about the search user. Without context, we can show some basic full-text search results. With context, we try to get into the user's head and understand what the search terms mean to him or her. 

On the Internet, this usually means the physical location of the user's IP address, previous searches he or she may have done, or products the searcher browsed. (Aside: Just last week, I was on Amazon looking for an adapter that would let me convert my desk so I could work standing up rather than sitting down. A few days later, I searched Google for a news story I had seen. One of the ads that showed up on the results? An ad from Amazon for a stand-up desk adapter. Now that's what I mean by 'context'!)

Inside the organization, user context might include things like the user's department, physical location, job title, native language, or product specialty. And if you want to do search right, you need to start finding a way to use that context. 

And as I was thinking about V words... Visitor came to mind. Whether you call it context, signals, search and browse history, or just environment, all of this is critical to a successful implementation of enterprise search. This is true whether you are searching big data or your SharePoint repository. And as far as most search platforms go, you still have to do it yourself.
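For a rough idea of what 'doing it yourself' can look like, here's a sketch that folds user context into a query as optional boosts. The field names, boost values, and user profile are hypothetical, and the query shape simply follows the familiar Elasticsearch/Solr-style boolean DSL rather than any particular product's API:

```python
# Hypothetical sketch: fold user context into a search query as optional boosts.
# Field names, boost weights, and the user profile are illustrative only.
def build_contextual_query(terms, user):
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"body": terms}}           # the user's actual search terms
                ],
                "should": [                              # context: boosts, not filters
                    {"term": {"department": {"value": user["department"], "boost": 2.0}}},
                    {"term": {"location":   {"value": user["location"],   "boost": 1.5}}},
                    {"term": {"language":   {"value": user["language"],   "boost": 1.2}}},
                ],
            }
        }
    }

# Example: a finance user in Chicago searching for expense policy documents.
query = build_contextual_query(
    "expense report policy",
    {"department": "finance", "location": "Chicago", "language": "en"},
)
print(query)
```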