22 posts categorized "Search Analytics"

May 31, 2016

The Findwise Enterprise Search and Findability Survey 2016 is open for business

Would you find it helpful to benchmark your Enterprise Search operations against hundreds of corporations, organizations and government agencies worldwide? Before you answer, would you find that information useful enough that you’d spend a few minutes answering a survey about your enterprise search practices? It seems like a pretty good deal to me to have real-world data from people just like yourself worldwide.

This survey, the results of which are useful and actionable for search managers everywhere, provides insight into many of the critical areas of search.

Findwise, the Swedish company with offices there and in Denmark, Norway, Poland and London, is gathering data now for the 2016 version of their annual Enterprise Search and Findability Survey at http://bit.ly/1sY9qiE.

What sorts of things will you learn?

Past surveys give insight into the difference between companies with happy search users versus those whose employees prefer to avoid using internal search. One particularly interesting finding last year was that there are three levels of ‘search maturity’, identifiable by how search is implemented across content.

The least mature search organizations, roughly 25% of respondents, have search for specific repositories (silos), but they generally treat search as ‘fire and forget’: once it is installed, there is no ongoing oversight.

More mature search organizations, which represent about 60% of respondents, have one search for all silos, but maintaining and improving search technology gets very little staff attention.

The remaining 15% of organizations answering the survey invest in search technology and staff, and continuously attempt to improve search and findability. These organizations often have multiple search instances tailored for specific users and repositories.

One of my favorite findings a few years back was that a majority of enterprises have “one or less” full time staff responsible for search; and yet a similar majority of employees reported that search just didn’t work. The good news? Subsequent surveys have shown that staffing search with as few as 2 FTEs improves overall search satisfaction; and 3 FTEs seem to strongly improve overall satisfaction. And even more good news: Over the years, the trend in enterprise search shows that more and more organizations are taking search and findability seriously.

You can participate in the 2016 Findwise Enterprise Search and Findability Survey in just 10 or 15 minutes and you’ll be among the first to know what this year brings. Again, you’ll find the 2016 survey at http://bit.ly/1sY9qiE.

September 18, 2014

Lucidworks ships Fusion 1.0 - Pretty exciting next gen platform.

OK, I've known about this coming for a while, just didn't know when until this afternoon - so I stayed up late to get the download started after midnight.

Fusion is more than an updated release of Lucidworks Search. It is Solr based, but it's a re-write from top to bottom. And it's not a bare bones search API only a developer can love. Connectors? Check. Security? Check. Analytics? Check. Entity extraction? Check. All included. 

But what it adds is where the real capabilities and contributions are. Machine learning? Check. Admin console? Check. Log analytics? Check. A document pre-processing pipeline? Check. Deep signal processing (think 'automated context processing')? Check.

Even if you think these new unique capabilities are not your style, you can buy Solr support and still get licenses for connectors, entity extraction, and a handful of other formerly 'premium' products. Want it all? License the full product at a per-node price I always thought was underpriced. I'm sure you'll be hearing a lot more in the coming days and weeks, but go - download - try - and see what it does for your sites. Your developers will love it, your business owners will love it, your users will love it, and I bet even your CFO will love it.

Full disclosure: I am a former employee of Lucidworks; but I'd be just as excited even if I were not. Go download it for sure and try it on your content. But be sure to check out the 'search as killer app' video on Lucid's home page, www.lucidworks.com.

s/ Miles

 

 

September 09, 2014

Sometimes you're just wrong! (Maybe).

OK, this one falls into the 'eat your own words' category, so I have to come clean. Well, partly clean. Let me explain.

I was out of town last week, but just before I left I wrote an article asserting that Elasticsearch really isn't 'enterprise' search. The article drew a lot of attention and comments from both sides of the argument. I have to say I still think that's the case, but an announcement by Microsoft seems to differ, and ends up a net positive for Elasticsearch. Microsoft tells us that Elasticsearch is the platform under the covers of Microsoft's Azure search offering. It looks like you have a couple of options - as long as you're on Azure:

a) You can download and use the open source Elasticsearch platform available on GitHub; or

b) Use Microsoft's managed service 'Facetflow Elasticsearch' which incorporates (some of) the open source code in various places.

Microsoft calls this "a fully-managed real-time search and analytics service" while, according to ZDNet, it is for 'web and mobile application developers looking to incorporate full-text search into their applications'. 

Either way, it's certainly yet another step forward for Elasticsearch, and is a big step forward in visibility for the company. It's not clear what kind of revenue they will receive from the deal - Microsoft being relatively famous for being quite frugal. And after all, smart search folks like Kevin Green of Spantree Technology Group talk about its strengths and liabilities, saying it *is* fast ('wicked fast'); fault-tolerant; distributed and more. But it is not a crawler; a machine learner; a user-facing front end, and it is not secure. 

So I'll agree a partial 'mea culpa' is in order; adding capabilities to an open source project can make it more enterprise ready. But I think the jury may still be out on the rest of my piece. Stay tuned!

September 11, 2012

Are you Tracking MRR? - "Mean Reciprocal Rank" Trend Monitoring

MRR is a simple numerical technique to monitor the overall relevancy performance of search engines over time. It is based on click-throughs in the search results, where a click on the top document is scored as 100%, a click on the second document is 50%, 3rd document is 33%, etc. These numbers are collected and averaged over units of time.

The absolute value of MRR is not necessarily the important statistic because each site has different content, different classes of users, and different search technology. However, the trend of MRR over time can allow a site to spot changes quickly. It can also be used to score "A/B" testing.

There are certainly more elaborate algorithms that can be used, and MRR doesn’t account for whether a user liked the document once they opened it. But having even possibly imperfect performance data that can be trended over time is better than having nothing.
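As a minimal sketch of the scoring described above, the following Python averages reciprocal ranks per day so the trend can be watched over time. The click-log format (a day plus the 1-based position of the clicked result) and the function names are assumptions made for illustration, not taken from any particular search product. Searches that ended without a click are simply left out here; counting them as zero is another reasonable choice.

```python
from collections import defaultdict
from datetime import date


def reciprocal_rank(position):
    """Score a click by its 1-based result position: 1 -> 1.0, 2 -> 0.5, 3 -> 0.33..."""
    if position < 1:
        raise ValueError("position must be 1 or greater")
    return 1.0 / position


def mrr_by_day(clicks):
    """Average the reciprocal ranks of all clicks recorded for each day.

    `clicks` is an iterable of (day, position) pairs; this log format is hypothetical.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for day, position in clicks:
        totals[day] += reciprocal_rank(position)
        counts[day] += 1
    return {day: totals[day] / counts[day] for day in sorted(totals)}


# Hypothetical click log: two clicks near the top on one day, two lower down the next.
clicks = [
    (date(2012, 9, 10), 1),
    (date(2012, 9, 10), 2),
    (date(2012, 9, 11), 4),
    (date(2012, 9, 11), 4),
]

for day, score in mrr_by_day(clicks).items():
    print(day, round(score, 3))
```

In this made-up log the daily value drops from 0.75 to 0.25, which is the kind of change a trend chart would make obvious. The same function could score each arm of an "A/B" test by feeding it the clicks from each variant separately.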

Reference: http://en.wikipedia.org/wiki/Mean_reciprocal_rank

Walter Underwood (of Ultraseek, Netflix, MarkLogic fame) gave a presentation (in PPT/PowerPoint) on this topic a couple of years ago, covering Netflix's use of MRR.

January 11, 2012

Webinar: What users want from enterprise search in 2012

If you ask the average enterprise user what he or she wants from their internal search platform, chances are good that they will tell you they want search 'just like Google'. After all, people are born with the ability to use Google; why should they need to learn how to use their internal search?

The problem is that web search works so well because, at the sheer scale of the internet, search can take advantage of methodologies that are not directly applicable to the intranet. Yet many of the things that make the public web experience so good can, in fact, be adapted in the enterprise. Our opinion is that, beyond a base level, the success of any enterprise search platform depends on how it is implemented and managed rather than on the core technology.

In this webinar we'll talk about what users want, and how you can address the specific challenges of enterprise content and still deliver a satisfying and successful enterprise search experience inside the firewall.

Register today for our first webinar of the new year, scheduled for January 25: What enterprise users want from search in 2012.

 

 

 

 

 

 

December 12, 2011

New Phrase for determining Sentiment Analysis / Customer Interest

If you look up:

fedex "Package not due for delivery"

which is one of the status messages you can get when tracking a package, you'll see a lot of postings asking about it.

FYI: It means your new toy has arrived in the city you live in, but will NOT be delivered today, because they didn't promise to get it to you until tomorrow.  Whether this is to force customers into paying for express service, or simply a logistics issue, or a mix of the two, depends on your view of companies and I won't get into that here.

However, you'll notice a lot of the postings asking about it are from folks waiting for delivery of things they're very excited to get, often some big-ticket piece of shiny electronics.  They're dying for FedEx to deliver it - they're so anxious and upset about the delay that they're motivated enough to go online and search, and make ranting posts - all because their "toy" is delayed.

So we have a particular emotional response, often about an upscale product, with a reasonably distinct search phrase - cool!

Yes, yes, of course you could say that the customers are mad about the perceived injustice of it, the Occupy Wall Street spin, or that sometimes the package could be really important for other reasons, which are certainly valid points.  I'm not taking sides or passing judgement - and I discovered this today looking for a friend's overdue toy - that's not the point.  I'm just saying that I bet there's a good statistical correlation, and of course it wouldn't apply 100% of the time - which would actually be quite rare in such things.

November 08, 2011

Pingar and New Idea Engineering Partnership

I'm happy to announce that our company, New Idea Engineering, has formed a partnership with Pingar, a New Zealand-based company that provides tools to extend and enhance the capabilities of enterprise search. New Idea Engineering is Pingar's first North American reseller.

Pingar markets libraries that provide tools for entity extraction, document summarization, redaction for key documents, autocomplete and a number of other capabilities that organizations can use to improve the user search experience.

In the developer area, Pingar provides access to view the various capabilities in action. For example, you can paste in the text of a document and see the summarization or view the redaction or any of the other Pingar capabilities. Developers can download an API key to test the code themselves. Pingar supports both C# and Java.

We'll be writing more about Pingar in action over the coming months.

 

July 12, 2011

A really good book by Lou Rosenfeld

Search Analytics for Your Site: Conversations with Your Customers is out, and while I've only just started reading it, it's a keeper.

Lou, a long-time pro in search analytics, relates not only the problem, but the solutions as well.

Early in the book, there is a telling anecdote: major relevancy problems can be caused by the omission of a single configuration file; or even a single badly set option.

When you roll out a major new system, have two different sets of eyes check everything!

/s/Miles

 

March 21, 2011

Time After Time - Zero Search Results for AD CAMPAIGNS and Model Numbers

Print Ads -> Website -> Zero Results

I see a print advertisement for something and go to the company's web site.  No sign of their advertised product.  Do a search, zero results.  Oh, and no suggestion about who I might email, either for product info or for reporting site problems.

So, how much did that color print ad cost to run?  And how were the response rates, you say?  I wonder how the next staff meeting goes.  "Clearly the problem is with the print ad's font", or hey "maybe we just need to rename the product!?"

I've also seen this when a product gets a short writeup in the "what's new" section of an industry magazine.  Granted, a new product might take a little while to thread into the website, but print publications have a lead time as well.

Product -> Model Number -> Website -> Zero Results

Same thing for products. I'm holding a physical product in my hand, with a model number silk-screened onto the plastic.  Go to that company's site, type it in, verbatim, zero results.  In this case I feel sorry for them, maybe an issue with punctuation, so I'll try without dashes, maybe no spaces, try leaving off the end of the model number and put an asterisk.  No results.
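The variations I try by hand (no dashes, no spaces, different case) are the sort of thing a site could handle automatically. Here is a minimal sketch in Python, assuming a hypothetical catalog that maps model numbers to product names; the function names and the sample data are made up for illustration and don't describe any particular search engine's behavior.

```python
import re


def normalize_model_number(text):
    """Drop everything except letters and digits, then uppercase what is left."""
    return re.sub(r"[^A-Za-z0-9]", "", text).upper()


def build_index(catalog):
    """Map each normalized model number to its product record."""
    return {normalize_model_number(model): product for model, product in catalog.items()}


def lookup(index, query):
    """Find a product whether the query was typed with or without dashes and spaces."""
    return index.get(normalize_model_number(query))


# Hypothetical catalog entry, written the way it might be printed on the product.
catalog = {"WX-200 B": "Widget 200, black"}
index = build_index(catalog)

print(lookup(index, "wx200b"))
print(lookup(index, "WX 200-B"))
```

Both queries resolve to the same catalog entry because the punctuation and case differences are removed before matching. This only covers exact model numbers; partial matches (my "leave off the end and add an asterisk" trick) would need prefix matching on top of it.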

Causes and Solutions?

I've changed my mind on this over the years.

In both of these problems, when I used to dig deeper, or manage to engage a human, there'd be some "logical" explanation, "oh, that's the worldwide site, this was on the US site", or "consumer vs. corporate", or "oh yeah, we're having trouble with search".

Now I just get depressed and either give up or try Google's public search.

Site after site has multiple problems, and search is just one of them.

I'm sure the IT departments and webmasters get yelled at once in a while, or the search vendor, but there are bigger issues here....

Quality Starts at the Top

I've decided it's the CEO's fault at least to some extent, or in a larger company maybe the EVP of that division, for not noticing the patterns of annoying problems like this.

I no longer believe my experiences are isolated cases.  I'm possibly an atypical user, and more likely to actually mention the problem to the company, but trust me, there are usually many other problems on these sites.

Does the CEO or VP use the web site?  Do they talk to clients or prospects?

When a large print ad is proposed, does the VP go to their own website to see if they can find the damn thing, before signing off on a large campaign?

And maybe these are "details" in larger companies, but then, there will be failure after failure like this.  A pattern of mistakes should be noticed.  And if not, then the manager one level up should notice their direct report's failure to spot patterns of problems and address them.

You ever eat at a restaurant and get poor service, again and again, no matter the server or the day?  That's a MANAGEMENT problem, not a problem with the harried wait-staff.  Compare that with a restaurant where you routinely see the owner or manager going around.

Some megastores have poor service at all of their branches, coast to coast, thousands of miles apart.  This is a management problem.

Years ago an email was leaked from Bill Gates, blasting issues in Windows, and the reply from the VP was also leaked.  The reply was not "we don't seem to be spotting problems"; no, the response was to obsess about individual issues, with no SYSTEMIC analysis.

Ultimately these system problems ought to be noticed by management.  If the CEO or EVP doesn't have a need to visit the website on a regular basis, maybe the site sells industrial parts, and the CEO doesn't buy those himself, then he/she should become super sensitive to any feedback they get.  Maybe make friends with a few individuals at some big accounts.  Maybe talk to the young interns who are not used to things sucking and talk to them frequently, maybe have some pizza brought in on a regular basis, and make sure to attend, and listen carefully.

So.... CEOs and VPs, if you spot problems and report them to your subordinates, do they just fix ONLY those specific items?  Do they notice the bigger patterns?  Try holding back 50% of your specific observations, see if they get cleared up too.  Actually, your VPs should already be noticing these PATTERNS.

I propose that companies that have poor web sites year after year probably also have poor customer service, bad documentation, annoying sales people, and a host of other systemic problems.

The "Times 10" Factor in Complaints

If somebody actually manages to report a website or search problem to the right person, there's a tendency to think this is an isolated incident.  A very dangerous attitude!

I don't remember the exact number, or where I saw the statistic, but my rule of thumb is that at least 10 times as many people have noticed a problem, or that at least 10 other similar incidents have occurred.  And if it's difficult to report site problems, then it's likely higher.  And as customers or employees notice multiple problems and start to "give up", I actually think that ratio is much higher, possibly 100 or 1,000x.  That ratio skyrockets because most users simply abandon the system and go elsewhere.  Potential customers take their business to other sites, and employees abandon the "company portal" and just ask each other for info, or look it up on Google.

The nice thing about an abandoned website or portal?  Those complaints eventually go back down to zero, mission accomplished!  Seriously, this happens.  There are many rationales for only fixing things that people notice, or fixing only the top N problems, etc.  When complaints go back down, people who subscribe to these theories seldom ask themselves whether there's another possible explanation.  A very dangerous game.

January 31, 2011

Great new tool for Pharmaceutical researchers

Our partners over at Raritan Technologies Inc. have recently released a great tool they developed using the Lexalytics, Inc. Salience toolkit. The product, Topic Explorer, provides a way for users to dig through content and explore concepts from Raritan's extensive knowledgebase of medical terminology, augmented by the text analytics capabilities provided by Lexalytics. Many of you will remember Lexalytics as the company that provided great sentiment analysis in the original FAST ESP product prior to the acquisition by Microsoft.

Raritan co-founder Ted Sullivan gives a great video demo of the product you should see.

What's really great about Topic Explorer is that it isn't limited to just pharma. With the right taxonomy, it can be a great research tool for just about any vertical - risk management, eDiscovery, patent research, and more.

Topic Explorer is a search technology neutral product, so it will work with your current solution whether you're using Lucene/Solr or a popular commercial technology. Contact Raritan at 908-668-8181 Extension 110. Tell them you read it here!