October 08, 2008

Gartner Magic Quadrant 2008 Now Available

If you have not seen it, the new Gartner Magic Quadrant for Information Access - their name for intranet and customer-facing search - has been published and is available for viewing on the Gartner web site, thanks to a pointer from Microsoft's Analyst Relations page.

The big story, one which must have them fuming in England, is that Autonomy has dropped down a bit, and the combined Microsoft-FAST offerings have moved up a bit. This puts Autonomy a bit higher up on the 'Completeness of Vision' scale - by a few pixels - but a decent quarter-inch below Microsoft on the 'Ability to Execute' scale. Endeca, IBM, ZyLAB and Vivisimo squeaked into the upper right quadrant, while Google moved right up to the line splitting the 'Challengers' from the 'Leaders' - so close that one could say the Google dot is on the line. It's odd that Google is not higher on the 'Ability to Execute' scale, since that usually reflects how well funded the company is. Perhaps they are looking at the budget/sales for only the Google appliance; but even then, Steve Arnold's numbers put them above the others on the scale.

Some excellent search products fell off the list this year, as Gartner has changed their methodology. The products we feel still qualify for the report include Dieselpoint, SLI Systems, and X1 Technologies, as well as newcomer Attivio. The article has more details. And as the con artist Fagin said in the play based on Dickens's Oliver Twist, '...if you happen to pass the Tower of London, have a look at the Crown Jewels'.

July 18, 2008

Microsoft Terms for SQL Server Search Components

I found a nice article about Microsoft Language Packs and MS SQL Server, including some info on Japanese and CJK handling, but another tidbit of info they had was how Microsoft refers to certain parts of their search engine:

  • What most vendors refer to as "indexing", Microsoft refers to as "population" (into an index).
  • What most vendors call a "collection", Microsoft calls a "catalog" - we've seen other vendors use that term in the past.
  • And what most vendors call "tokenization" or "tokenizers", Microsoft calls "word breakers", which is actually a bit more descriptive to a non-programmer.

I actually wrote an article a few years ago comparing traditional relational databases to full-text search engines, which included a table of equivalent terms and concepts (near the end of the article).  If you're already familiar with databases, this will get you up to speed much faster!

June 18, 2008

Search Quality: You Can't Improve What You Don't Measure

In our latest survey of new newsletter subscribers we found that 29% had no formal metrics for measuring quality of search results.  Search metrics allow you to keep search on the right track and can be a powerful tool for managing your systems.  They are a wonderful source for insights and trends.  We thought we would share a couple that we think work well. Many of these are covered in greater depth in Interpreting Your Search Activity Reports in the Enterprise Search newsletter.

  • Count the number of people who use search  
  • Count the total number of searches  
  • Count the number of zero search results  
  • User feedback on top 100 searches  
  • Track email complaints about search  
  • Measure number of clicks on navigators (navigation menu items)  
  • Business goals
    • Reduce call volume (normalized for growth in customer base) by enabling self-service from search: results are good enough to reduce calls.
    • Reduce e-mail volume (again adjusted for growth in customer base) by enabling self-service from search: results are good enough to reduce e-mails.
    • Revenue
    • Add-on revenue
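
Several of the counts above can be pulled straight out of a query log. The following is a minimal Python sketch, not a feature of any particular search product: the `SearchEvent` record, its field names, and the sample data are all hypothetical, and a real log would need its own parsing step.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class SearchEvent:
    """One logged search (hypothetical record layout)."""
    user: str          # assumed user identifier from the log
    query: str         # query text as typed
    result_count: int  # number of hits the engine returned


def summarize(events):
    """Compute basic search-quality counts from a list of SearchEvent."""
    total = len(events)
    zero_hits = sum(1 for e in events if e.result_count == 0)
    # Normalize case and whitespace so variants of a query count together.
    queries = Counter(e.query.strip().lower() for e in events)
    return {
        "unique_users": len({e.user for e in events}),
        "total_searches": total,
        "zero_result_searches": zero_hits,
        "zero_result_rate": zero_hits / total if total else 0.0,
        "top_queries": queries.most_common(100),
    }


def normalized_call_volume(calls, customers):
    """Calls per customer, so growth in the customer base does not hide a trend."""
    return calls / customers if customers else 0.0


# Made-up sample data for illustration only.
events = [
    SearchEvent("ann", "vacation policy", 12),
    SearchEvent("bob", "Vacation Policy ", 8),
    SearchEvent("ann", "expense form", 0),
    SearchEvent("cy", "401k", 3),
]
summary = summarize(events)
```

Tracking the same numbers week over week matters more than any single value; the top-queries list is also a natural starting point for the user feedback review mentioned above.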

May 30, 2008

Some interesting Enterprise search events the week of June 2nd

There are two events happening next week that might be of interest.

First, Leslie Owens of Forrester is presenting the Forrester Wave Enterprise Search platform webinar on Monday morning, June 2 at 8AM. There is a nominal fee, but I think you will find it worthwhile.

Then, Leslie and several other speakers will be at a free one-day seminar hosted by FAST on Wednesday the 4th in Redwood Shores, California at the Sofitel Hotel. In addition to Leslie Owens' presentation on 'Technology Populism', speakers will include Jeff Spataro of Microsoft; Hadley Reynolds of FAST; and senior IT managers from Cisco and National Instruments. Hadley, by the way, speaks and writes on Search Centers of Excellence and other innovations in the application of enterprise search. Be sure to register for the free FAST Search event.

May 08, 2008

A proposed standard for enterprise search

Dieselpoint has announced support for a technology it calls OpenPipeline, which aims to improve a task virtually every enterprise search technology has to perform: getting documents into the search index. They will be showing the pipeline at the upcoming Enterprise Search Summit on May 20-21, integrated with their new Dieselpoint Search 4.0, which will also be on display.

The Dieselpoint press release claims:

OpenPipeline provides a common architecture for connectors to data sources, file filters, text analyzers and modules to distribute documents across a network. It is fully functional out of the box and includes an installer, a job scheduler, file scanner and crawlers, doc filters, and point and click interface with drag and drop module installation.

OpenPipeline is compatible with IBM's UIMA (Unstructured Information Management Architecture), and is designed to connect UIMA annotators to other systems.

Document processing can be centralized or parallelized as needed. The transport mechanism is simple, web-services XML over HTTP. RSS/Atom feeds are also possible.

The development philosophy behind OpenPipeline stresses simple, elegant design, and massive scalability. Minimal external dependencies and straightforward plug-in implementation ensure that the learning curve is low.

OpenPipeline can be downloaded without charge from http://www.OpenPipeline.org. It's available under the Apache License.


Making this technology open source makes sense. The core technology for an enterprise search company, their 'secret sauce', is optimizing the index and making search great, not creating new code to parse the latest version of Microsoft Office or of Documentum. With OpenPipeline available, presumably we will start to see pipeline stages created by a number of smaller companies and individuals, easing the burden on enterprise search companies. And companies that provide possible sources of data, like Content Management Systems, can create a single pipeline stage for their product that could work for every search technology, and be done with it.

To create a searchable index, all search technologies need to create a stream of text. If the source document is a binary file - Microsoft Word, for example - search vendors need to provide some way to read the format and convert it to text. The same is true of content stored in a relational database: each row represents a virtual document which needs to be extracted from the database and turned into a stream of text. This conversion is typically done as one stage of a pipeline. Other stages may include adding metadata, performing entity or sentiment extraction, or even enhanced language processing.
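
To make the idea of stages concrete, here is a minimal Python sketch of a document pipeline. It is purely illustrative and is not OpenPipeline's API: the `Document` class, the stage functions, and `run_pipeline` are all hypothetical names, and the "conversion" stage only decodes bytes rather than parsing a real binary format.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """A unit of content moving through the pipeline (hypothetical)."""
    source_id: str
    raw: bytes = b""
    text: str = ""
    metadata: Dict[str, str] = field(default_factory=dict)


# A stage takes a document and returns the (possibly modified) document.
Stage = Callable[[Document], Document]


def extract_text(doc: Document) -> Document:
    """Stand-in for a file filter: turn raw bytes into a stream of text."""
    doc.text = doc.raw.decode("utf-8", errors="replace")
    return doc


def add_word_count(doc: Document) -> Document:
    """Example enrichment stage: attach simple metadata."""
    doc.metadata["word_count"] = str(len(doc.text.split()))
    return doc


def row_to_document(table: str, key: str, row: Dict[str, object]) -> Document:
    """Treat one database row as a virtual document."""
    text = " ".join(f"{column}: {value}" for column, value in row.items())
    return Document(source_id=f"{table}:{row[key]}", text=text)


def run_pipeline(doc: Document, stages: List[Stage]) -> Document:
    """Pass the document through each stage in order."""
    for stage in stages:
        doc = stage(doc)
    return doc


file_doc = run_pipeline(
    Document("file-1", raw=b"Hello search world"),
    [extract_text, add_word_count],
)
row_doc = run_pipeline(
    row_to_document("orders", "id", {"id": 7, "item": "widget"}),
    [add_word_count],
)
```

Because every stage shares one signature, a new format parser or an entity extractor can be added to the list without touching the others, which is the property that makes a shared pipeline standard attractive.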

The concept of a 'pipeline' applies directly to many existing search technologies, each with a proprietary method of accessing content. On top of that, no search technology companies have cooperated with competitors to create standards. In the relational database world, standards have made life much better: consider ODBC and JDBC. Because of these standards, developers can write code that can connect to just about any relational database. Not so in search. Maybe this effort will help break the ice. Stay tuned...

As enterprise search users, are you glad to see an open source solution for part of the search puzzle?

April 27, 2008

FAST and Microsoft tie the knot: It's official

On Friday, April 25, it became official: FAST is now a fully owned subsidiary of Microsoft.

As Hadley Reynolds writes, FAST is part of the now-somewhat-larger Microsoft Enterprise Search Group (MESG), and is officially "FAST, A Microsoft® Subsidiary". Both Hadley and Microsoft's Kirk Koenigsbauer stress that FAST employees and customers have little to worry about: Microsoft intends to continue to support FAST products on Windows, Linux and Unix as they do today.

As we've speculated, this is just another step in the continuing consolidation in enterprise search. The curious question now is which suitor is big enough that Autonomy's Mike Lynch and his board would be willing to be acquired anytime soon: SAP? Oracle? Google? And will Endeca decide it needs to be part of a larger player? SAP has already invested in them, and their technology seems to fit nicely with Oracle. Stay tuned.


 

March 03, 2008

Deep Web proposes federation resource site

Sol Lederman of Deep Web Technologies wants to create a one-stop demo center for federation technology and has invited all of the major vendors to participate.

Federated search is becoming increasingly popular as more corporate customers look for ways to deliver results from multiple enterprise search installations, often from many different vendors. Sometimes the issue is technical, sometimes political, but nearly all companies have three or more search vendor technologies running somewhere behind the firewall.

The one thing we'd like to have seen in Sol's challenge is security, since that's what we think separates the winners from the also-rans in federation. It's not always easy, but it is 'real world' in companies. Nonetheless, a demo site where users can compare vendor solutions 'apples to apples' on the same data sources would be nice.

By the way, we've seen some confusion among our customers and prospects on the subject, so we've taken a shot at defining 'federated search' in our Enterprise Search newsletter. We hope that helps some.

February 18, 2008

Flashbacks to an earlier user group meeting

As I have been listening to some of the speakers here at FASTForward08, I realized there is an interesting coincidence here. Just about two years ago, I was here in Orlando at a different search technology user group meeting for a different company that had just been acquired by another player in the space. Was this history repeating itself?

In 2006, it was the very last Verity User Group meeting, held at one of the Disney properties here in Orlando, and Autonomy was present to tell us all about the future. Why, just weeks before, Mike Lynch had announced that the conversion tools from K2 to IDOL were complete, and all the Verity users had to do was run the tools and everything was rosy. One of our customers had a one-on-one with the Autonomy guys, after which he told me "The conversion is really very easy: just send money". Maybe they have just not sent in enough cash yet, or maybe the exchange rate has dinged them, but I don't think they have finished the process yet, two years later. Your mileage may vary.

So is this a repeat of 2006? Could this be the last FASTForward?

Well, Microsoft is here at FF in numbers - engineers, mostly, maybe a project manager here or there. A lot of folks from Seattle, but some from Europe as well. New owners checking out the property? Maybe. But I think Microsoft is not thinking of FAST as a tear-down, to be replaced by one of their engines. And I don't think it's looking at FAST for SharePoint search only. I think the MS guys are looking to get into enterprise search in a big way, and I think FAST will be in the enterprise search space for a long time. Microsoft did not buy FAST for its customer list - rather, it wants the technology. Consider FAST's heritage: AlltheWeb.com, one of the earliest web search engines. Desktop search, intranet search (hosted as well as installed), web search, all together? Would Microsoft be interested in that?

Stay tuned.

The known, the unknown and searching

This week we're at FASTForward08 in Orlando, so you'll see many more FAST-related postings than usual. By the way, the FAST blog, fastforwardblog.com, has a number of other items and links that may be of interest.

One of the first banners that greets attendees as they walk into the main foyer says "There are things known and things unknown. The rest we search for". I started thinking about that from the enterprise search point of view - usually we think of search as a way to find the things we need - the unknown. In reality, the things we search for are things we individually may not know, but which we expect are known somewhere within the company. Do we really search our intranets for things that we don't expect to find at all?

Sorry if this sounds like a Donald Rumsfeld speech, but I suppose there are always opportunities to discover the things we didn't know were unknown. On the public web, I suppose StumbleUpon helps us find sites of interest based on our behaviors and profiles, but I'm looking forward to how FAST addresses this message this week. Stay tuned.

February 06, 2008

Is there life after FAST?

It looks like there may be, and it doesn't involve Microsoft in this case.

Ali Riaz, former COO of FAST Search & Transfer, has announced a new enterprise search company called Attivio. Their secret sauce is what they are calling "AIE", which apparently stands for "Active Intelligence Engine". Attivio, which was started by Riaz and a number of former FAST engineers, has started filing patent applications and recently closed a funding round of $6.2M.

The Attivio web site reports that they intend to provide a tool that can combine both structured and unstructured content, and make the Java-based software available via an API or via SaaS.

InfoToday reports that Attivio has used open-source search code from Lucene as well as other open source and commercial technologies, and will be contributing some of their work back to the Apache project. Gartner reported last year that Lucene wouldn't be accepted in the corporate market unless a big player got behind it, and we differed in our October 2007 posting. This is yet another data point showing the Lucene engine to be quite satisfactory as a foundation for enterprise search that corporations can use.
