January 14, 2020

Conversational Search

The magic in early instances of what we now call 'enterprise search' was being able to find content by typing in a few keywords. It wasn't as cool as the HAL 9000 computer featured in "2001: A Space Odyssey", but it was good enough to draw a large number of people - myself included - into the business.

Along the way, Google perfected a search platform based on the theory that, at scale, just about any query you could think of had already been used by thousands, if not millions, of other humans. All Google needed to do was keep track of which pages other humans viewed following a query and promote those pages to the top. Essentially, they created a 'crowd-sourced search'.
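To make the 'crowd-sourced' idea concrete, here is a toy sketch in Python. It is only an illustration of the general principle of promoting pages that earlier searchers viewed, and certainly not Google's actual algorithm. The click log, the page paths, and the function names are all made up for the example.

```python
from collections import Counter, defaultdict

# Hypothetical click log: (query, page viewed after that query).
click_log = [
    ("sales report", "/reports/q4-sales"),
    ("sales report", "/reports/q4-sales"),
    ("sales report", "/hr/sales-jobs"),
    ("vacation policy", "/hr/vacation"),
]


def build_click_counts(log):
    """Count, per normalized query, how often each page was viewed."""
    counts = defaultdict(Counter)
    for query, page in log:
        counts[query.strip().lower()][page] += 1
    return counts


def rerank(query, candidates, counts):
    """Move pages that earlier searchers viewed for this query to the top.

    sorted() is stable, so pages with equal click counts keep the
    order the underlying engine gave them.
    """
    clicks = counts.get(query.strip().lower(), Counter())
    return sorted(candidates, key=lambda page: -clicks[page])


counts = build_click_counts(click_log)
ranked = rerank(
    "Sales Report",
    ["/hr/sales-jobs", "/reports/q4-sales", "/misc/old"],
    counts,
)
print(ranked)
```

With only four log entries the 'learning' is trivial, which is rather the point: the approach only shines once there is a lot of query activity behind it.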

The bad news for those of us who work on search designed for use within the enterprise is that there just isn't sufficient content - or query activity - to deliver results as accurate as those we experience on the public web. Consider: Google marketed the Google Search Appliance for the enterprise. It didn't deliver the kinds of results public-facing Google does, and Google pulled the product from the market. For great search, size matters.

Nonetheless, some of the companies that market enterprise search products are now adding elements of machine learning to their products; and while perhaps not as accurate as web-based Google, they do deliver results that start out pretty well and get better with age, as the platforms learn what documents humans view following queries.

And if you've not noticed, some leading vendors are now integrating - and encouraging - what is known as 'conversational search'. Think about it: when you need to find a document in your organization, you may ask a colleague. But you don't simply say "sales". Chances are you'll ask "where is the new sales report?"

It's encouraging to see an increasing number of vendors delivering these capabilities in their commercial products. The most recent to announce conversational search is Algolia, although I have to say I'm quite disappointed in the Wikipedia write-up on them. In my spare time, should I ever find any, I should go do some edits, but this 'spare time' thing is rare for me.

Nonetheless, I'm happy to see an increasing number of commercial search vendors beginning to integrate these advanced capabilities into their products. Search in the enterprise has challenges, but hang in there: it's getting better!

Note: How has your experience been with machine learning and AI integrated with your enterprise search? I'd love to hear your experiences - even if under NDA!

 

January 06, 2020

It's a new year: Time for better metadata!

The new year is a time when most of us resolve to make changes in our personal lives: losing weight, exercising more, spending more time with a spouse and/or the kids. We start the year with great energy to meet our goals, but sadly many of us fall short through the year.

This often happens in the enterprise as well. Improving internal search is a common resolution at this time of the year. For eCommerce sites, January generally means fewer site visitors once the holiday rush is done, so making changes won’t have a great impact on sales. For corporations, it’s a time of new budgets and great expectations: and more than a few of the clients I’ve worked with over the years tell me how poorly their internal search performs compared to public search sites like Google, Bing, and DuckDuckGo. Why do these search platforms work so well? And why can’t your site search match their success? It’s a numbers game. By definition, public search platforms index millions of sites; and many of these contain similar if not identical content. This makes it easy to find what you’re looking for because thousands of sites have relevant results for just about any query you may try.

Intranet sites are different. Usually, there is only one page with the information you are looking for. But often, content authors, who have read about how to promote content on Google, will add keywords using Microsoft Word’s “Properties” field in an effort to promote their documents. This attempt to ‘game’ the internal search platform generally interferes with the platform’s relevance functions and results in poor result relevance. Even the Document Properties that Microsoft Word provides can interfere with search effectiveness.

Years ago, we were working with a client who was interested in knowing which employees were contributing to the intranet content. When the data was processed, it turned out that an Administrative Assistant in Marketing had authored more documents than anyone else in the corporation. After a quick review, we discovered why this one person was apparently more prolific than any other employee. That person had created all of the template forms used throughout the company, so the Word Document Properties listed that employee’s name as the author of virtually every standard template throughout the company.

So in the spirit of the new year, I’d suggest that you spend a day or two performing a data audit to discover where your content – or lack thereof – is negatively impacting your enterprise search results. And if you find any doozies – I’d love to hear about them!
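One easy place to start such an audit is the author field, which is what tripped up the client in the story above. Here is a minimal sketch in Python, assuming you have already exported document metadata into a list of records. The record layout, the sample paths and names, and the 50% threshold are all assumptions for illustration, not anything a particular search platform provides.

```python
from collections import Counter

# Hypothetical metadata export: one record per indexed document.
documents = [
    {"path": "/forms/expense.docx", "author": "J. Smith"},
    {"path": "/forms/travel.docx", "author": "J. Smith"},
    {"path": "/sales/q4-plan.docx", "author": "A. Lee"},
    {"path": "/hr/handbook.docx", "author": ""},
    {"path": "/forms/timesheet.docx", "author": "J. Smith"},
]


def author_share(docs):
    """Return (author, document count, share of all documents), largest first."""
    counts = Counter(d.get("author") or "(missing)" for d in docs)
    total = sum(counts.values())
    return [(author, n, n / total) for author, n in counts.most_common()]


def flag_suspicious(docs, threshold=0.5):
    """Flag authors credited with an implausibly large share of documents,
    plus the '(missing)' bucket for documents with no author at all."""
    return [
        author
        for author, _, share in author_share(docs)
        if share >= threshold or author == "(missing)"
    ]


for author, n, share in author_share(documents):
    print(f"{author:12} {n:3d} {share:6.1%}")
print("Worth a closer look:", flag_suspicious(documents))
```

A name that dominates the list is more likely a template creator than a remarkably prolific author, so it is a good candidate for cleanup before you rely on author as a search facet.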

 

 

December 10, 2019

A Working Vacation

The month of January is associated with the Roman god Janus who, with two heads, could look forward and back. That said, I find December a quiet time that provides the opportunity to review the current year and to plan the coming new year. As I tweeted yesterday at @miles_kehoe, this is the most stressful time of the year for most sites focused on eCommerce. Changes are generally 'off-limits' - even an hour offline can put a dent in sales.

But for those responsible for corporate internal and public-facing sites, this is the time to review content, identify potential changes, and even add new content. And if planned well, the holidays are often a great time to update intranet sites: from late November through the new year, activity tends to slow for most corporate sites. Both IT and content staff should be using this quiet time to make changes, from updates to current content - the new vacation schedule is just one that comes to mind - to minor restructuring. (Note: while the holidays are a great time to roll out major changes, these should have been in planning months ago: it's a holiday, not a sabbatical!)

For the search team, this is the time to review search activity: top queries, zero hits, misspellings, and synonyms come to mind as a minimum effort. It's also a good time to identify popular content, as well as content that was either never part of any search result or was included in result lists but never viewed.
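If your search platform can export its query log, a first pass at this review does not need much tooling. The following is a rough sketch in Python using only the standard library. The log format (query text plus number of hits), the sample queries, and the 0.8 similarity cutoff are assumptions for the example; your platform's logs will look different.

```python
import difflib
from collections import Counter

# Hypothetical query log: (query text, number of results returned).
query_log = [
    ("vacation schedule", 12),
    ("Vacation Schedule", 12),
    ("expense form", 4),
    ("vaction schedule", 0),
    ("benefits", 0),
    ("vaction schedule", 0),
]


def normalize(query):
    """Lower-case and collapse whitespace so variants count together."""
    return " ".join(query.lower().split())


def top_queries(log, n=10):
    """Most frequent queries overall."""
    return Counter(normalize(q) for q, _ in log).most_common(n)


def zero_hit_queries(log, n=10):
    """Most frequent queries that returned nothing."""
    return Counter(normalize(q) for q, hits in log if hits == 0).most_common(n)


def likely_misspellings(log, cutoff=0.8):
    """Map each zero-hit query to a similar query that did return results."""
    good = sorted({normalize(q) for q, hits in log if hits > 0})
    suggestions = {}
    for query, _ in zero_hit_queries(log):
        match = difflib.get_close_matches(query, good, n=1, cutoff=cutoff)
        if match:
            suggestions[query] = match[0]
    return suggestions


print(top_queries(query_log, 3))
print(zero_hit_queries(query_log))
print(likely_misspellings(query_log))
```

Zero-hit queries with a close match are likely misspellings and good candidates for a spelling or synonym list; the ones without a match (like 'benefits' here) more likely point to missing content.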


So - December is nearly half over: take advantage of what is normally a quiet time for intranets and make that site better!

Happy Holidays!

 

November 06, 2019

Big Money in Enterprise Search

VC firms seem attracted to the Enterprise Search space

Just today, it was announced that Canada-based Coveo closed a round of roughly $170M US, following Lucidworks’ recent $100M US round. Earlier this year we saw Algolia come in with $110M of funding, and of course there was the recent Elasticsearch IPO – it sure looks like 2019 will have been a good year for the leading technologies.

Stay tuned as we learn more about the trend!

November 14, 2018

Do you have big data? Or just a lot of it?

Of course, I’m a search nerd. I've been involved in enterprise search for over 20 years. I see search and big data as related technologies, but in most cases, I do not see them as synonymous.

And I'd also say that, while most enterprises have a lot of data, the term ‘big data’ is not applicable to most organizations.

Consider that Google (and others) define ‘big data’ as “extremely large data sets that may be analyzed computationally to reveal patterns, trends, and associations, especially relating to human behavior and interactions”.

Yes, the data that Amazon, Google, Facebook, and others collect qualifies as big data. These companies mine everything you do when you're using their sites. Amazon wants to be able to report “people like you bought …” to sell more product; Google wants to know what ‘people like you’ look at after a query so they can suggest it to the next person like you; Facebook... well, they want to know what to try to sell you as you chat with and about your friends. Is search involved? Maybe; but more often some strong machine learning and internal analytics are key.

Do consulting firms like Ernst & Young or PWC have big data? Well, my bet is they have a lot of information about their clients, business practices, accounting, etc., but is it ‘big data’? Probably not.

Solr, Elastic and other search technologies can search-enable huge sets of data, so often big data is indexed to be searchable by humans. And both Solr and Elastic come with some great analytical tools: Kibana on Elastic, and Banana, the port of Kibana for Solr-based engines.

But again, is that big data? Or just lots of it?

I’d vote the latter.