26 posts categorized "Search Analytics"

November 08, 2011

Pingar and New Idea Engineering Partnership

I'm happy to announce that our company, New Idea Engineering, has formed a partnership with Pingar, a New Zealand-based company that provides tools to extend and enhance the capabilities of enterprise search. New Idea Engineering is Pingar's first North American reseller.

Pingar markets libraries that provide tools for entity extraction, document summarization, redaction for key documents, autocomplete and a number of other capabilities that organizations can use to improve the user search experience.

In the developer area, Pingar provides access to view the various capabilities in action. For example, you can paste in the text of a document and see the summarization, the redaction, or any of the other Pingar capabilities. Developers can download an API key to test the code themselves. Pingar supports both C# and Java.

We'll be writing more about Pingar in action over the coming months.


July 12, 2011

A really good book by Lou Rosenfeld

Search Analytics for Your Site: Conversations with Your Customers is out, and while I've only just started reading it, it's a keeper.

Lou, a long-time pro in search analytics, relates not only the problem, but the solutions as well.

Early in the book, there is a telling anecdote: major relevancy problems can be caused by the omission of a single configuration file, or even a single badly set option.

When you roll out a major new system, have two different sets of eyes check everything!



March 21, 2011

Time After Time - Zero Search Results for AD CAMPAIGNS and Model Numbers

Print Ads -> Website -> Zero Results

I see a print advertisement for something and go to the company's web site.  No sign of their advertised product.  Do a search, zero results.  Oh, and no suggestion about who I might email, either for product info or for reporting site problems.

So, how much did that color print ad cost to run?  And how were the response rates, you say?  I wonder how the next staff meeting goes.  "Clearly the problem is with the print ad's font", or hey, "maybe we just need to rename the product!?"

I've also seen this when a product gets a short writeup in the "what's new" section of an industry magazine.  Granted, a new product might take a little while to thread into the website, but print publications have a lead time as well.

Product -> Model Number -> Website -> Zero Results

Same thing for products. I'm holding a physical product in my hand, with a model number silk-screened onto the plastic.  Go to that company's site, type it in, verbatim, zero results.  In this case I feel sorry for them, maybe an issue with punctuation, so I'll try without dashes, maybe no spaces, try leaving off the end of the model number and put an asterisk.  No results.
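If punctuation really is the culprit, the fix is not exotic. Here is a minimal sketch, assuming a made-up two-item catalog and hypothetical helper names (`normalize_model`, `find_product`, `find_by_prefix`); it is not taken from any particular search product. The idea is to strip dashes, spaces and case from both the indexed model numbers and the query, so that "XR-200 B", "xr200b" and "XR 200-B" all land on the same key:

```python
import re


def normalize_model(text: str) -> str:
    """Uppercase the text and drop everything except letters and digits."""
    return re.sub(r"[^A-Za-z0-9]", "", text).upper()


# Hypothetical catalog, for illustration only.
CATALOG = {
    "XR-200 B": "Widget XR-200 B",
    "AB 1234-C": "Gadget AB 1234-C",
}

# Index built once, keyed on the normalized model number.
INDEX = {normalize_model(model): name for model, name in CATALOG.items()}


def find_product(query: str):
    """Return the product name for an exact normalized match, else None."""
    return INDEX.get(normalize_model(query))


def find_by_prefix(query: str):
    """Return product names whose normalized model starts with the query."""
    key = normalize_model(query)
    if not key:
        return []  # an empty query should not match the whole catalog
    return sorted(name for model, name in INDEX.items() if model.startswith(key))


print(find_product("xr200b"))      # Widget XR-200 B
print(find_by_prefix("AB12*"))     # ['Gadget AB 1234-C']
```

A real engine would do this in its analyzer or tokenizer configuration rather than in application code, but the principle is the same: the customer should not have to guess how the model number was typed into the catalog.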

Causes and Solutions?

I've changed my mind on this over the years.

In both of these problems, when I used to dig deeper, or manage to engage a human, there'd be some "logical" explanation, "oh, that's the worldwide site, this was on the US site", or "consumer vs. corporate", or "oh yeah, we're having trouble with search".

Now I just get depressed and either give up or try Google's public search.

Site after site has multiple problems, and search is just one of them.

I'm sure the IT departments and webmasters get yelled at once in a while, or the search vendor does, but there are bigger issues here....

Quality Starts at the Top

I've decided it's the CEO's fault at least to some extent, or in a larger company maybe the EVP of that division, for not noticing the patterns of annoying problems like this.

I no longer believe my experiences are isolated cases.  I'm possibly an atypical user, and more likely to actually mention the problem to the company, but trust me, there are usually many other problems on these sites.

Does the CEO or VP use the web site?  Do they talk to clients or prospects?

When a large print ad is proposed, does the VP go to their own website to see if they can find the damn thing, before signing off on a large campaign?

And maybe these are "details" in larger companies, but then, there will be failure after failure like this.  A pattern of mistakes should be noticed.  And if not, then the manager one level up should notice their direct report's failure to spot patterns of problems and address them.

You ever eat at a restaurant and get poor service, again and again, no matter the server or the day?  That's a MANAGEMENT problem, not a problem with the harried wait-staff.  Vs. a restaurant where you routinely see the owner or manager going around.

Some megastores have poor service at all of their branches, coast to coast, thousands of miles apart; this is a management problem.

Years ago an email was leaked from Bill Gates, blasting issues in Windows, and the reply from the VP was also leaked.  The reply was not "we don't seem to be spotting problems"; no, the response was to obsess about individual issues, with no SYSTEMIC analysis.

Ultimately these systemic problems ought to be noticed by management.  If the CEO or EVP doesn't have a need to visit the website on a regular basis, maybe the site sells industrial parts, and the CEO doesn't buy those personally, then he/she should become super sensitive to any feedback they get.  Maybe make friends with a few individuals at some big accounts.  Maybe talk to the young interns who are not used to things sucking and talk to them frequently, maybe have some pizza brought in on a regular basis, and make sure to attend, and listen carefully.

So.... CEOs and VPs, if you spot problems and report them to your subordinates, do they just fix ONLY those specific items?  Do they notice the bigger patterns?  Try holding back 50% of your specific observations, see if they get cleared up too.  Actually, your VPs should already be noticing these PATTERNS.

I propose that companies that have poor web sites year after year probably also have poor customer service, bad documentation, annoying sales people, and a host of other systemic problems.

The "Times 10" Factor in Complaints

If somebody actually manages to report a website or search problem to the right person, there's a tendency to think this is an isolated incident.  A very dangerous attitude!

I don't remember the exact number, or where I saw the statistic, but my rule of thumb is that at least 10 times as many people have noticed a problem, or that at least 10 other similar incidents have occurred.  And if it's difficult to report site problems, then it's likely higher.  And as customers or employees notice multiple problems and start to "give up", I actually think that ratio is much higher, possibly 100x or 1,000x.  That ratio skyrockets because most users simply abandon the system and go elsewhere.  Potential customers take their business to other sites, and employees abandon the "company portal" and just ask each other for info, or look it up on Google.

The nice thing about an abandoned website or portal?  Those complaints eventually go back down to zero, mission accomplished!  Seriously, this happens.  There are many rationales for only fixing things that people notice, or fixing only the top N problems, etc.  When complaints go back down, people who subscribe to these theories seldom ask themselves whether there's another possible explanation.  A very dangerous game.

January 31, 2011

Great new tool for Pharmaceutical researchers

Our partners over at Raritan Technologies Inc. have recently released a great tool they developed using the Lexalytics, Inc. Salience toolkit. The product, Topic Explorer, provides a way for users to dig through content and explore concepts from Raritan's extensive knowledgebase of medical terminology, augmented by the text analytics capabilities provided by Lexalytics. Many of you will remember Lexalytics as the company that provided great sentiment analysis in the original FAST ESP product prior to the acquisition by Microsoft.

Raritan co-founder Ted Sullivan gives a great video demo of the product you should see.

What's really great about Topic Explorer is that it isn't limited to just pharma. With the right taxonomy, it can be a great research tool for just about any vertical - risk management, eDiscovery, patent research, and more.

Topic Explorer is a search-technology-neutral product, so it will work with your current solution whether you're using Lucene/Solr or a popular commercial technology. Contact Raritan at 908-668-8181 Extension 110. Tell them you read it here!

December 05, 2010

Share your successes at ESS East next May

Our friends over at InfoToday who run the successful Enterprise Search Summit conferences have asked us to announce that the date for submitting papers to their Spring show in New York in May 2011 has been extended until Wednesday, December 8. You can find out what they are looking for and how to submit your proposal online at http://www.enterprisesearchsummit.com/Spring2011/CallForSpeakers.aspx.

Michelle Manafy, who runs the program again next May, really likes to have speakers who have found creative and successful ways to select, deploy, or manage ongoing enterprise search operations. We've co-presented with several of our customers in the past, and trust me, it's great fun and not bad for your career! And - no promises - the weather at ESS East has been great just about every year - and we've been there for nearly 6 years now!

A friend told me something years ago that I've always found helpful; I hope you'll take it to heart: 'Everything you know, someone else needs to know'. Don't worry if your search project isn't perfect, or that someone will find fault with what you've done. Trust me: there are many organizations newer to enterprise search than you are, and anything you found helpful will surely be valuable for them as well. And you get to attend all of the sessions, so you might learn more as well! A 'win-win' situation if I've ever seen one!

See you in New York!




August 27, 2010

There's an Ant on your Southwest Leg!

The WSJ has an interesting article on how language affects how we think.  I particularly liked the example of an indigenous language where anything you discuss involves absolute cardinal directions (north, south, east, west etc.). You literally can't say "There is an ant on one of your legs". Instead you say something like "There's an ant on your southwest leg." To say hello you'd ask "Where are you going?", and an appropriate response might be, "A long way to the south-southwest. How about you?" If you don't know which way is which, you literally can't get past hello.

Dr. Kevin Lim reviewed Search Engine Society, a book which explores the effect search engines have on politics, culture and economics. He is not your typical reviewer, since he is also mentioned in the book for recording a large part of his life using cameras (one he wears, another at his desk that points at him) while a GPS device tracks his movements.

Google throws its weight behind Voice Search by Stephen Lawson discusses how voice search is based on statistical models of what sequences of words are most likely to occur, and how Google trains a new language model. Another example of that would be Midomi, a web site where you search for music by singing a fragment of the song.
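To make "statistical models of what sequences of words are most likely to occur" a little more concrete, here is a toy sketch. It is emphatically not Google's method: the three-sentence corpus, the function names and the add-one smoothing are all assumptions chosen for illustration. It counts word pairs (bigrams) in the corpus and uses them to pick the more plausible of two similar-sounding transcriptions:

```python
from collections import Counter


def train_bigrams(sentences):
    """Count single words and adjacent word pairs in a tiny corpus."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split()  # <s> marks sentence start
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams


def score(candidate, unigrams, bigrams):
    """Product of add-one smoothed bigram probabilities for the candidate."""
    words = ["<s>"] + candidate.lower().split()
    vocab_size = len(unigrams)
    probability = 1.0
    for prev, word in zip(words, words[1:]):
        probability *= (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
    return probability


# Hypothetical training text, for illustration only.
CORPUS = [
    "recognize speech with a phone",
    "voice search can recognize speech",
    "a nice beach in the summer",
]

unigrams, bigrams = train_bigrams(CORPUS)
candidates = ["recognize speech", "wreck a nice beach"]
best = max(candidates, key=lambda c: score(c, unigrams, bigrams))
print(best)  # recognize speech
```

Note that a raw product of probabilities also favors shorter candidates, so a real system would normalize for length and combine this score with an acoustic model; the sketch only shows the word-sequence part.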

Multilingual Search Engine Breaks Language Barriers discusses how the UNL Society uses the pivot language UNL to return a precise answer in the language in which the question was formulated. This still seems to be a research project, with some related projects such as LACE trying to extract data from parallel corpora as a cheaper way to populate a lexical database.

XBRL Across The Language Divide by Jennifer Zaino discusses how XBRL (eXtensible Business Reporting Language) may be one of the few areas that benefits from the Monnet project, which attempts to "provide a semantics-based solution for accessing information across language barriers". It tries to "build software that breaks the link between conceptual information and linguistic expressions (the labels that point back to concepts in ontologies) for each language." When that works, it makes it easier and quicker to perform analytics across multiple languages.

The Cross-Language Evaluation Forum (CLEF) is working on infrastructure for testing, tuning and evaluation of systems that retrieve information in European languages, and benchmarks to help test it. One of its papers, for example, compares lexical and algorithmic stemming in 9 languages using Hummingbird SearchServer.

May 25, 2010

Google and TV: "prevening" on the Big Bang Theory

When a popular TV show mentions an odd word, there's a tendency for people to look it up online and/or blog about it.

Our staff likes "The Big Bang Theory". One of the characters mentions the term "prevening" referring to the time between mid afternoon and the early evening.

When I first heard the term:

  • Mon 5/24/2010, 10:49pm PDT
  • Google shows 7,000 hits
  • including an Urban Dictionary entry from 2008

When watching a rerun of a different episode this evening, I remembered this post and went back to check:

  • Mon 7/19/10 11:04pm PDT
  • Google shows 96,000 hits
  • more than a 10x increase, pretty cool

Some years later, reviewing old blog posts, I checked again:

  • Thu 5/23/2013 17:13 PDT
  • Google shows 95,800 hits
  • Within about 0.2% of the reading 3 years ago, so I'll call it quiescent
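For anyone who wants to check my arithmetic, here is a small sketch using the three hit counts from the lists above. The function and variable names are just mine, for illustration:

```python
def growth_factor(old: int, new: int) -> float:
    """How many times larger the new count is than the old one."""
    return new / old


def percent_change(old: int, new: int) -> float:
    """Signed change from old to new, as a percentage of old."""
    return (new - old) / old * 100.0


hits_may_2010 = 7_000
hits_jul_2010 = 96_000
hits_may_2013 = 95_800

print(f"{growth_factor(hits_may_2010, hits_jul_2010):.1f}x")   # 13.7x
print(f"{percent_change(hits_jul_2010, hits_may_2013):.2f}%")  # -0.21%
```

Keep in mind that Google's hit counts are rough estimates, so the second number says "basically flat" rather than "shrank by 200 pages".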

I had another, more colorful entry from the Jay Leno show, but apparently it was too off-color for our blog.  :-(

May 10, 2010

Google's Opt-Out Option for Behavioral Targeting

Last month Google announced that they would provide a browser plug-in to allow users to opt out of Google Analytics tracking.  Joseph Stanhope's post explained why it was highly doubtful that this would do substantial harm to Google Analytics and its customers. Several posts, such as one by Felipe Miyata, suggested that this was an “insurance” move to silence opposition from privacy supporters, perhaps in preparation for doing more web analytics within the U.S. Federal Government.

Anil Batra's post has a quite different explanation - he suggests that it is really an attempt by Google to make more money by taking another step towards behavioral targeting.

February 24, 2010

Enterprise Search Summit 2010 - DC

Even as we prepare for ESS East in New York (ESS NY from now on?), Information Today has issued its call for papers for the first ever ESS-DC, to be held in Washington, DC, November 16-18, 2010.

Follow this link to find background on what InfoToday is looking for; or jump right to the submissions page. Don't be shy: everyone who presents papers had, at one time, never done it before. What you know, someone else needs to know!

In our experience, the kind of content InfoToday likes is the information that can help an organization select or manage search and related technologies. Generally, real-world stories about how other companies and organizations have succeeded with search are the ones that attendees appreciate the most. 

We'll also be having a searchdev dinner at ESS DC this year. Details to come late in summer, but plan for it now!

Are you doing search now? Have you been successful getting it going on time and under budget? Tell your story. Submit your idea now!

February 10, 2010

News front: Convera files for dissolution

Convera, one of the companies offering 'vertical search' to help publishers and other content owners monetize their content, has filed to dissolve and liquidate the business. It was de-listed from NASDAQ Monday afternoon.

Convera, not unlike SearchButton.com which NIE spun off in 1998, was a hosted search company - now called 'site search' or 'search as a service'. It was a great idea, but when things imploded in 2001, Convera went after the market of monetizing content. That led to Convera becoming a victim of the same problems faced by newspapers and publishers around the world, whom it counted as its market: how do you sell content that is freely available from companies like Google, Yahoo, and Bing, and from blogs world-wide?

Google has a pretty darned reasonable site search service, by the way.