August 20, 2012

What "Totally Automatic!" Really Means: AI / NLP / Machine Learning Considerations in Search Technology

Many advanced technologies rely on statistical machine learning, other numerical methods, or related techniques that come close to delivering on the claims AI software companies made in the past. While progress is being made, there are several points to consider when evaluating these systems:

  • Can you override the system’s default behavior? Some vendors’ claims of “completely automatic” may actually mean it operates as a black box, with few diagnostic tools or adjustments. Such systems may not be suitable to put directly in front of customers, or at least not for driving the central content on a page.
  • Detect vs. Judge – Does the system simply detect trends and changes, or does it also make value judgments about those changes? Statistical methods have made much more progress on the former than on the latter. For high-value customer experiences, it’s better for the computer to prioritize things for human operators to look at, and perhaps offer operators a set of convenient actions to choose from.
  • Supervised vs. Unsupervised – Although there’s a technical definition for this distinction, it really boils down to whether you can train the system and/or give it predefined categories, or whether the system is totally automatic. Although totally automatic sounds like less work, it’s less likely to give impressive results for primary customer-facing activities.
  • Pretty numbers and graphs – Some software companies tend to bring forth grids of floating-point numbers, out to 6 or 8 decimal places, as proof that their software works well. Or they may claim things like “our software improves relevancy by 30%!” A good proof of concept (POC) or A/B test is a much better “proof” of software efficacy.
  • Machine generated graphics are a mixed bag. Graphs that simply reinforce arbitrary relevancy improvements aren’t really more useful than the numeric claims.
    However, graphics that help to visualize large amounts of data in innovative ways, spotting trends and differences, can be very helpful, especially if they are interactive and can “drill down” into particular areas. Of course, they still require decent input, and reasonably trained people to interpret the graphs.
    Also, simple “cluster” graphs, while potentially useful, are no longer novel tools by themselves, and have rarely been exposed directly to end users, despite the suggestive demos of search vendors. To really leverage data visualization tools, companies need to staff and train at a higher level than for just “running a search appliance”.
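
To make the A/B testing point concrete, here is a minimal sketch of how a claimed "30% improvement" might be checked against real traffic. It is an illustration under stated assumptions, not any vendor's method: the metric (click-through on search results), the function name `two_proportion_z_test`, and the traffic figures are all hypothetical, and the two-proportion z-test is only one of several reasonable ways to compare two variants.

```python
import math


def two_proportion_z_test(clicks_a, n_a, clicks_b, n_b):
    """Compare click-through rates of control A and variant B.

    Returns (relative_lift, z, two_sided_p_value).
    All inputs are hypothetical counts: clicks and total searches per variant.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("each variant needs at least one observation")

    p_a = clicks_a / n_a
    p_b = clicks_b / n_b

    # Pooled rate under the null hypothesis that A and B perform the same.
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0 or p_a == 0:
        raise ValueError("not enough variation in the data to run the test")

    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    relative_lift = (p_b - p_a) / p_a
    return relative_lift, z, p_value


# Made-up numbers: the same 30% relative lift at two different sample sizes.
small = two_proportion_z_test(clicks_a=20, n_a=200, clicks_b=26, n_b=200)
large = two_proportion_z_test(clicks_a=2000, n_a=20000, clicks_b=2600, n_b=20000)

for label, (lift, z, p) in (("small test", small), ("large test", large)):
    print(f"{label}: lift={lift:.0%}  z={z:.2f}  p={p:.4g}")
```

With these invented counts, both runs show the same 30% relative lift, but only the larger one gives strong statistical evidence that the lift is real. That is the gap between a headline number and an actual test: the percentage alone says nothing about how much traffic stands behind it.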
