
7 posts from February 2007

February 21, 2007

Search Engine or Search Site?

<rant state=on>
So I just need to go 'on record' about something I see every day that bugs me: the term 'search engine'.  Back in the day, a search engine was the core piece of software that let you search one or more sites - software like Verity, Fulcrum, Convera, Excalibur, and the granddaddy of all web search technologies, WAIS.

Sites like AltaVista, Yahoo, and Google were search sites, not search engines.  In fact, for a while there the search engine that powered Yahoo was Google.

The good news is that search engine companies like FAST Search, Autonomy, IBM OmniFind, Lucene, and even the Google Search Appliance (and others) have evolved and adopted a new name for what they offer - 'enterprise search', or even 'search platforms'. It is these products - what I'd call search engines if I were as old fashioned as my age might suggest - that are the site search engines of today.

There, I've said it.

February 18, 2007

Some notes on FAST ESP and Virtual Machines

Needless to say, running FAST ESP on virtual machines such as VMware is TOTALLY UNSUPPORTED by FAST; this should be considered for internal testing only.

Finding 1: Had trouble with Microsoft Virtual PC on some machines.  Found fixes on Google that did NOT fix the problem; gave up and went with VMware.

Finding 2: The absolute minimum RAM for a VMware guest that will run a "clean" FAST install without complaining: 1.25 GB (1,280 MB).  I wanted the smallest workable figure so that I could run more virtual machines on one physical host.  This was Windows XP Pro and ESP 5.0.7.  Of course the actual documented requirement is 4 GB on a real machine, running Windows Server 2000 or 2003.

Finding 3: How to "clone" or "reinstall" ESP.  My idea was to have a base virtual machine with ESP 5.0.7 installed and ready to go: just clone the virtual machine, change the settings, and be ready to create FAST collections, etc.

This did NOT work due to host name issues.  After cloning a machine, you would typically rename it, so that it doesn't conflict on the network with other cloned machines.  This renaming is done in the "My Computer" properties, on the "Computer Name" tab; it is NOT enough to just click on the "My Computer" icon and rename it.

The problem was that within FAST the host name appears in zillions of places.  I did actually try to track them all down, but after quite a bit of tweaking it was still broken.  Whatever the "correct" procedure is, it's certainly longer than just reinstalling FAST.
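For anyone who wants to see how widespread the old host name is before giving up, here is a minimal sketch of the kind of scan I mean. It is plain Python and knows nothing about ESP itself: the install directory and host name in the usage lines are hypothetical placeholders, and it only looks at text in files, so anything stored elsewhere (the Windows registry, for instance) would not show up.

```python
import os


def find_hostname_references(root_dir, old_hostname):
    """Return (path, line_number, line) for each line under root_dir that mentions old_hostname."""
    needle = old_hostname.lower()
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # errors="ignore" lets us skim binary files without crashing
                with open(path, "r", encoding="utf-8", errors="ignore") as handle:
                    for line_number, line in enumerate(handle, start=1):
                        if needle in line.lower():
                            hits.append((path, line_number, line.strip()))
            except OSError:
                # Locked or unreadable file: skip it and keep scanning
                continue
    return hits


if __name__ == "__main__":
    # Hypothetical install directory and pre-clone host name; substitute your own.
    for path, line_number, line in find_hostname_references(r"C:\esp", "BASE-VM"):
        print(f"{path}:{line_number}: {line}")
```

A long list of hits is a decent hint that reinstalling will be quicker than editing.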

So from now on I'll start with a baseline virtual machine image, clone it, rename the clone, THEN install FAST.  In hindsight this also allows for a much smaller VM (2 GB vs. 6 GB) and is more flexible - after all, who knows how long ESP 5.0.7 will be the latest and greatest.


February 15, 2007

FAST Forward '07: ESP Studio: a rich UI for managing corporate Linguistic Assets

Vocabulary: Linguistic Assets are things like your company's lists of standard abbreviations and synonyms, your product names, etc.
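To make the idea concrete, here is a toy sketch of how a synonym and abbreviation list might be applied to a query. It is purely illustrative: the dictionary contents and the `expand_query` function are made up for this post and are not FAST's format or API.

```python
def expand_query(query, synonyms):
    """Return each query term followed by any synonyms listed for it."""
    expanded = []
    for term in query.lower().split():
        expanded.append(term)
        for alternative in synonyms.get(term, []):
            if alternative not in expanded:
                expanded.append(alternative)
    return expanded


# Hypothetical company vocabulary: one abbreviation and one synonym
SYNONYMS = {
    "hr": ["human resources"],
    "laptop": ["notebook"],
}

print(expand_query("HR laptop policy", SYNONYMS))
# → ['hr', 'human resources', 'laptop', 'notebook', 'policy']
```

A real engine does far more than this (phrases, stemming, per-language rules), but the asset itself is essentially a curated list like `SYNONYMS`.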

FAST already offers simple linguistics tools in their Search Business Center, a web based admin console.

But on Friday (2/9/07) Johannes Stiehler from FAST gave a preview of a new upcoming tool called ESP Studio.  Though I missed the rundown of all its features, what I did see was a rich Windows based tool that let you have extreme control over abbreviations, thesaurus terms, etc.  You can work locally, adjusting each vocabulary nuance just the way you like it, then re-connect to your FAST search engine farm and "deploy" the changes.

Note that this is not a web based UI - it's an actual compiled application (Windows-only for now).  Johannes explained (I'm paraphrasing here) that to have maximum control over the UI's functionality, a stand-alone application is still the way to go.  This also has the benefit of letting a linguistics expert work on their machine locally, even if disconnected from the network, such as on an airplane or in meetings.  Simple linguistic changes can still be made via the existing Search Business Center web UI.  FF'07 Johan's talk

Looking forward to when this ships.

FAST Forward '07: ESPDeploy: rapid multi-node deployment and change control

Like other highly scalable engines, FAST can take advantage of dozens or even hundreds of machines in a single installation.  While such scalability is a powerful tool, it also implies lots of setup time for large installations.

However, on Friday (2/9/07) Steve Bower from FAST gave a presentation on ESPDeploy, an impressive bit of add-on code from FAST that gracefully manages large deployments.  With a relatively simple configuration file you can specify a large FAST server farm.  The tool includes software, file and patch distribution, global control of all processes by class, and even the ability to auto-reconstruct a downed node.  The tool is notable for handling tasks by category of service (AKA software "role") rather than by machine, so you can do things like "shut down all the document processors" instead of "shut down processes 2, 7 and 9 on machines 13, 14 and 15".  Steve's talk at FF'07
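I don't have the ESPDeploy docs yet, so the following is only my own sketch of the role-based idea, not the tool's actual configuration format or commands. The node names, role names and the `plan_stop` function are all invented for illustration.

```python
# Hypothetical farm description: which roles run on which node
FARM = {
    "node13": ["docproc", "indexer"],
    "node14": ["docproc"],
    "node15": ["docproc", "qrserver"],
    "node16": ["indexer"],
}


def nodes_for_role(farm, role):
    """Return the sorted list of nodes that run the given role."""
    return sorted(node for node, roles in farm.items() if role in roles)


def plan_stop(farm, role):
    """Return one 'stop' instruction per node running the role, without touching anything."""
    return [f"stop {role} on {node}" for node in nodes_for_role(farm, role)]


for step in plan_stop(FARM, "docproc"):
    print(step)
# → stop docproc on node13
# → stop docproc on node14
# → stop docproc on node15
```

The point is that the operator names the role once and the tool works out which machines are involved, rather than the operator keeping a list of process and machine numbers in their head.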

I need to look around for the doc on this; it looked VERY helpful.  Hope to make some more notes here.

February 14, 2007

Fast Forward 07

We've finally recovered from last week at FASTForward07 in San Diego, a show that a number of smart folks are talking about. David Weinberger is just one of the many well known bloggers, editors and academics who attended and/or spoke at the conference. The show is really becoming quite the place to go to catch up on what's happening in Enterprise 2.0 (E2.0) and especially in what we call Enterprise Search 2.0. While many user groups are simply product-fests and opportunities for sales folks to meet their prospects, FASTForward is rapidly becoming the place to go to learn what trends are beginning to evolve in the search space.

One blogger who has not been recognized by the folks in Norway is Bruce, who writes a candid analysis of this show on his blog, bruceandmo. He's one of our customers, one of the sharpest guys we've ever worked with, and probably the only Luddite we know with his own web site. It must be the thin air of Colorado.

February 12, 2007

What is Enterprise Search?

So what is 'enterprise search', and how is it different from the so-called 'search engines' we use every day - Google, Yahoo, Ask, and dozens (if not millions) of others?

Enterprise search is the name generally applied to search technology implemented within corporations, government and educational institutions. It provides the ability to search-enable content specific to a given web site - whether that site is on the public web with little or no security, or whether the site is populated with highly sensitive information intended only for internal users.

Why the distinction? After all, you can limit search to a specific site on Google. But what you typically cannot find on Google - or other web search engines - is content intended for internal use within the company or organization. The term also generally applies to 'site search', the ability to search a specific web site for content. You can search the HP site for drivers; but when you do so from the HP site itself, you are using their internal site search engine.

What makes site search so different from web search?

Continue reading "What is Enterprise Search?" »

February 01, 2007


Miles Kehoe, New Idea Engineering, Inc.

Welcome to the Enterprise Search blog! Mark Bennett and I will be the initial posters here because it lets us do something we love to do - talk about enterprise search. Our postings will follow the technical and business issues in and around the enterprise search marketplace - products and services from companies like FAST Search & Transfer, Autonomy, IBM OmniFind and the IBM Yahoo! Edition, the Google Search Appliance, Lucene and many others.

Enterprise search is no longer just a search box on a web page - it is a platform on which companies are building advanced applications. Search is the driving factor, and making search work is critical in keeping customers and employees satisfied, in meeting ROI objectives, and even in meeting compliance and regulatory requirements.

Continue reading "Welcome" »