81 posts categorized "FAST Search & Transfer"

June 28, 2017

Poor data quality gives search a bad rap

If you’re involved in managing the enterprise search instance at your company, there’s a good chance that you’ve heard at least some users complain about the poor results they see.

The common lament search teams hear is “Why didn’t we use Google?” In fact, we’ve seen the same complaints at sites that implemented the GSA but don’t use the Google logo and look.

We're often asked to come in and recommend a solution. Sometimes the problem is simply using the wrong search platform: not every platform handles every use case and requirement equally well. Occasionally, the problem is a poorly configured or misconfigured search, or simply an instance that hasn’t been managed properly. Even the renowned Google public search engine doesn’t happen by itself, and even that is a poor example: in recent years, Google search has become less of a search platform and more of a big data analytics engine.

Over the years, we’ve helped clients select, implement, and manage intranet search. In my opinion, the real problem with search usually lies elsewhere: poor data quality.

Enterprise data isn’t created with search in mind. There is little incentive for content authors to attach quality metadata in the properties fields of Adobe PDF Maker, Microsoft Office, and other document publishing tools. To make matters worse, there may be several versions of a given document as it goes through creation, editing, reviews, and updates. And often the early drafts, as well as the final version, are in the same directory or file share. Content on a public-facing web site very rarely has such issues.
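To make the "several versions in the same file share" problem concrete, here is a minimal sketch of how you might flag files that look like drafts or revisions of one document before they reach the index. The suffix patterns and the function names (`base_name`, `group_versions`) are assumptions chosen for illustration; real naming habits vary from company to company, so treat this as a starting point rather than a rule.

```python
import re
from collections import defaultdict

# Hypothetical suffixes that often mark drafts or revisions, e.g.
# "_v2", " final", "-draft", " (2)". A separator is required so that
# ordinary words ending in "draft" or "copy" are left alone.
VERSION_SUFFIX = re.compile(
    r"[\s_\-]+(?:\(\d+\)|v\d+(?:\.\d+)*|draft|final|rev\d*|copy)$",
    re.IGNORECASE,
)


def base_name(filename: str) -> str:
    """Strip the extension and any trailing version-like suffixes."""
    stem = filename.rsplit(".", 1)[0]
    previous = None
    # Repeat until nothing changes, so "plan_draft_v3" reduces to "plan".
    while previous != stem:
        previous = stem
        stem = VERSION_SUFFIX.sub("", stem)
    return stem.lower()


def group_versions(filenames):
    """Return only the groups that contain more than one candidate version."""
    groups = defaultdict(list)
    for name in filenames:
        groups[base_name(name)].append(name)
    return {base: names for base, names in groups.items() if len(names) > 1}


if __name__ == "__main__":
    files = ["Policy_v2.docx", "policy_draft.docx", "Policy (2).docx", "handbook.pdf"]
    for base, names in group_versions(files).items():
        print(base, "->", names)
```

A report like this will not decide which version is authoritative, but it does give content owners a short list to review instead of leaving the search engine to guess.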

Sometimes content management systems make it easy to implement what is really ‘search engine optimization’ or SEO; but it seems all too often that the optimization is left to the enterprise search platform to work out.

We have an updated two-part series on data quality and search, starting here. We hope you find it helpful; let us know if you have any questions!

March 15, 2013

Open Source Search Myth 2: Potentially Expensive Customizations

This is part of a series addressing the misconception that open source search is too risky for companies to use. You can find the introduction to the series here; this is Part 2 of the series; for Part 3 click Skills Required In House.

Part 2: Potentially Expensive Customization

Which is more expensive: open source or proprietary search platforms?

Commercial enterprise search vendors often quote man-years of effort to create and deploy what, in many cases, should be relatively straightforward site search. Sure, there are tough issues: unusual security; the need to mark up content as part of indexing; multi-language issues; and vaguely defined user requirements.

Not to single them out, but Autonomy implementations were legendary for taking years. Granted, this was usually eDiscovery search, so the sponsor - often a Chief Risk Officer - had no worries about budget. Anything that would keep the CRO and his/her fellow executives out of jail was reasonable. But even easier tasks, such as search-enabling an intranet site, took more time and effort than they needed to because no one scoped out the work. This is one reason so many IDOL projects hire large numbers of IDOL contractors for such long projects.

FAST was also famous for lengthy engagements.

FAST once quoted a company we later worked with a one-year, $500K project to assist in moving from ESP Version 4.x to ESP Version 5.x. These were two versions that had, for all practical purposes, the same user interface, the same API, and the same command line tools. Really? One year?

True story: I joked with one of the sales guys that FAST even wanted 6 months to roll out a web search for a small intranet; I thought two weeks was more like it. He put me on the spot a year later and challenged me to help one of his customers, and sure enough, we took almost a month to bring up search! But we had a constraint: the new FAST search had to be callable from the existing custom CMS, which had hard-coded calls to Verity K2 - the customer did not have time to re-write the CMS.

Thus, part of our SOW was to write a front-end that would accept search requests made through the Verity K2 DLL, intercept the call, and perform the search in FAST ESP. Then, by intercepting the K2 results list processing calls, it would deliver the FAST results to a CMS that thought it was talking with Verity. And we did it in less than 20% of the time FAST wanted to index a generic HTML-based web site.
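For readers who haven't built this kind of shim, the shape of it can be sketched in a few lines. This is an illustration of the general adapter idea only, not the code we delivered: the real work was done at the DLL level, and every name below (`LegacySearchFacade`, `legacy_search`, `legacy_get_result`, `fake_new_engine`) is hypothetical and does not correspond to the actual Verity K2 or FAST ESP APIs.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Hit:
    """A result as the (hypothetical) new engine returns it."""
    title: str
    url: str
    score: float  # assumed to be in the range 0.0 to 1.0


class LegacySearchFacade:
    """Exposes the call shape the old caller expects, backed by a new engine."""

    def __init__(self, new_engine_search: Callable[[str, int], List[Hit]]):
        self._search = new_engine_search
        self._results: List[Hit] = []

    def legacy_search(self, query_text: str, max_docs: int = 10) -> int:
        """Intercept the legacy search call, run it on the new engine,
        and return a hit count the way the old caller expects."""
        self._results = self._search(query_text, max_docs)
        return len(self._results)

    def legacy_get_result(self, index: int) -> dict:
        """Intercept result-list access and reshape a new-engine hit
        into the field names and score scale the old caller understands."""
        hit = self._results[index]
        return {"TITLE": hit.title, "URL": hit.url, "SCORE": round(hit.score * 100)}


def fake_new_engine(query: str, limit: int) -> List[Hit]:
    """Stand-in for the new search engine, for demonstration only."""
    corpus = [
        Hit("Travel policy", "http://intranet.example/travel", 0.91),
        Hit("Expense policy", "http://intranet.example/expense", 0.75),
        Hit("Cafeteria menu", "http://intranet.example/menu", 0.40),
    ]
    words = query.lower().split()
    matches = [h for h in corpus if any(w in h.title.lower() for w in words)]
    return matches[:limit]


if __name__ == "__main__":
    facade = LegacySearchFacade(fake_new_engine)
    count = facade.legacy_search("policy")
    for i in range(count):
        print(facade.legacy_get_result(i))
```

The point of the design is that the caller never changes: it keeps making the calls it was written to make, and only the facade knows a different engine is answering.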

On the other hand, at LucidWorks we frequently have 5-day engagements to set up Solr and LucidWorks Search; index the user's content; and integrate results in the end user application. I think for most engagements, other Solr and open source implementations are comparable.

Let me ask: which was the more "expensive" implementation?

February 14, 2013

A paradigm shift in enterprise search

I've been involved in enterprise search since before the 'earthquake World Series' between the Giants and the A's in 1989. While our former company became part of LucidWorks last December, we still keep abreast of the market. But being a LucidWorks employee has brought me to a new realization: commercial enterprise search is pretty much dead.

Think back a few years: FAST ESP, Autonomy IDOL (including the then-recently acquired Verity), Exalead, and Endeca were the market. Now, every one of those companies has become part of a larger business. Some of the FAST technology lives on, buried in SharePoint 2013; Autonomy has suffered as part of HP because - well, because HP isn't what it was when Bill and Dave ran it. Current management doesn't know what they have in IDOL, and the awful deal they cut was probably based on optimistic sales numbers that may or may not have existed. Exalead, the engine I hoped would take the place of FAST ESP in the search market, is now part of Dassault and is rarely heard of in search. And Endeca, the gem of a search platform optimized for the lucrative eCommerce market, has become one of three or four search-related companies in the Oracle stable.

Microsoft is finally taking advantage of the technology acquired in the FAST acquisition for SharePoint 2013, but as long as it's tied to SharePoint - even with the ability to index external content - it's not going to be an enterprise-wide solution, or a 'big data' solution. SharePoint Hadoop? As long as you bring SQL Server. Mahout? Pig? I don't think so. There are too many companies that want or need Linux for their servers rather than Windows.

Then there is Google, the ultimate closed-box solution. As long as you use the Google search button/icon, users are happy – at least at first. If you have sixty guys named Sarah? Maybe not.

So what do we have? A few good options generally from small companies that tend to focus on hosted eCommerce - SLI Systems and Dieselpoint; and there’s Coveo, a strong Windows platform offering.

Solr is the enterprise search market now. My employer, LucidWorks, was the first, and remains the primary, commercial driver of the open source Apache project. What's interesting is the number of commercial products based on Solr and its underlying platform, Lucene.

Years ago, commercial search software was the 'safe choice'. Now I think things have changed: open source search is the safe choice for companies where search is mission critical. Do you agree?

I'll be writing more about why I believe this to be the case over the coming weeks and months: stay tuned.

/s/Miles

 

December 18, 2012

Last call for submitting papers to ESS NY

This Friday, December 21, is the last day for submitting papers and workshops to ESS in NY, which runs May 21-22. See the information site at the Enterprise Search Summit Call for Speakers page.

If you work with enterprise search technologies (or supporting technologies), chances are the things you've learned would be valuable to other folks. If you have an in-depth topic, write it up as a 3-hour workshop; if you have a success story or lessons learned that you can share, submit a talk for a 30-45 minute session.

I have to say, this conference has enjoyed a multi-year run of quality talks and excellent spring weather. See you in May?

 

 

August 21, 2012

Mind the gap

A few weeks ago, a former client asked me about the 'lay of the land' in enterprise search - which companies were the ones to be considered for evaluation. It's something I'm frequently asked, and one big reason why I strive to stay current with all of the leading commercial and open source vendors in the market.

As I pulled together the list, it occurred to me that recent consolidation has led to an odd situation: there is no longer a 'mid-market' in enterprise search.

Under $25,000 (US), there are a number of free and low-cost open source options (SearchBlox and my employer LucidWorks come to mind).

Google has discontinued its low-cost (blue) search appliance, and raised the cost of its regular (yellow) one to what is apparently well above $25K.

We also have the old-school major commercial vendors - like FAST (now Microsoft SharePoint Search); Autonomy (now HP); Endeca (now Oracle), and finally Vivisimo (now IBM). Trend or not, these enterprise search products command high initial outlay, often significant implementation costs, and high ongoing 'support' once you've rolled it out. Looks like the mid-market is gone.

So now the question is: What do you get for the difference in price? I'd suggest not much in the way of capability; nothing in terms of scalability; and very, very little in the way of flexibility. I guess it's 'caveat emptor' - buyer beware!

What about some products/projects I haven't mentioned? Well, the focus of my article here is on enterprise search. Great candidates like Coveo are 'Windows only', which disqualifies them from my list. I suppose you could consider the GSA as not enterprise ready, but I think appliances make the OS issue irrelevant. I've also omitted mentioning other projects because they have not yet shipped a 'Version 1.0' release - that's testware, no matter who it's from. And I'm sure there are open source projects where a single person is making all the calls - I don't consider that enterprise ready either.

I’ll be looking for the day when the big guys start value pricing their software licenses and help bring the market into line with today’s reality.

If you think I've unfairly represented the market, let me know - I'm not shy about posting comments that differ with my viewpoint.

 

/s/Miles

 

May 10, 2012

Lucene Revolution: MS talks of being more open

At yesterday’s kickoff of Lucene Revolution 2012, Lucid CEO Paul Doscher introduced Gianugo Rabellino, Microsoft's Director of Open Source Communities. Gianugo said little about search per se, but he did confess to having been a fan of Lucene and Solr for a while now. In his talk, he told the audience that Microsoft has changed with respect to open source, and he went on to tell everyone how they have become more involved in open standards like HTML5 and CSS3, and in hardware specifications like USB. He went so far as to say 'Microsoft's survival depends on open source software'.

News to me, and perhaps to others in the room, was the extent to which Microsoft is supporting a number of open source products and languages. Gianugo reported that Linux is now a 'first-class guest operating system' on Microsoft Hyper-V, and that Microsoft provides support for PHP, Ruby on Rails, node.js and other projects on Azure (and presumably for 'on premises' systems).

A number of folks from large commercial organizations seemed to appreciate the news about Microsoft's shift towards supporting open source; but a number of the open-source folks in the room felt this offered little new, and some even felt it was an unrelated 'sales pitch'. Even though we are Microsoft partners, I'm glad to see more support for open source products like PHP and Linux.

The funniest part of the talk came as Gianugo was describing how SharePoint data was easily accessible to other, non-Microsoft search platforms. An attendee asked if he felt there was a role for other platforms to be used as the primary engine for search in SharePoint; as he paused to craft a reply, Paul Doscher (loudly) pronounced his belief that there was, much to the pleasure of the crowd.

There was not much else in the way of Microsoft news; but it was interesting to see how many people and how much effort Microsoft is putting into open source projects.

 

 

April 25, 2012

Vivisimo: Another one bites the dust

Earlier today, IBM announced that it was acquiring Vivisimo for an undisclosed sum. Now the tough question: what’s it all about? For the answer, let's take a quick trip to the early years of the decade.

Vivisimo was founded in 2000 out of Carnegie Mellon University. The first time we saw them, in 2004, they were marketing 'Clusty', a web clustering product that could examine huge numbers of web pages and then associate - or cluster - documents on specific terms. They also had some really strong federation capabilities built in. And the product was highly scalable. In fact, Vivisimo had great success in a number of huge government sites including the US Social Security site, FirstGov, the Defense Intelligence Agency, and commercial sites such as Eli Lilly. One thing all of these sites have in common? Lots of data. We have a term for that now: 'big data'.

IBM has made huge investments in open source search over the last 10 years, specifically in Lucene/Solr. Hadoop is the Apache answer for big data, and trust me, Hadoop is a hot topic this year.

What does Vivisimo bring IBM? Well... for one thing, clustering algorithms (and probably patents); a reputation for being able to handle huge data sets; and federation.

What should Vivisimo customers do now? Well, based on IBM's strong customer ethic, I think the answer is "don't panic": do nothing for now. Assuming Velocity is working for you, this acquisition should cause you no concern.

If you are evaluating Vivisimo, that's a bit more difficult. Some acquisitions, like Verity's acquisition by Autonomy, resulted in a wholesale replacement of the platform. Some customers made the switch early on and were happy; others fought to make IDOL work like K2, even with the 'compatibility mode', and never succeeded. You'll also remember that Microsoft, after acquiring FAST Search, dropped all of the non-Windows platforms a year later, which impacted upwards of 70% of the FAST installed base.

If you are willing to acquire a platform for a couple of years and see what happens, go for it. You may look back and discover you made the right choice. On the other hand, former President Reagan had a saying: "Trust, but verify". You might take a look around to see what platform is right for you now and into the future.

 

March 22, 2012

The sorry state of FAST training

Suppose your company uses SharePoint 2010. Suppose your company uses FAST Search for SharePoint as well. Where would you go for training?

Today I decided to see where we could get training for one of our new guys. My first stop was the old FAST University, which now directs you to Microsoft Learning at

www.microsoft.com/learning/en/us/training/fast-university.aspx

There, you can read about all of the classes, except that when you click Register, you find a page with class schedules for January and February of... 2012. None after that. If you log in, you find a few classes for people with advanced Microsoft certifications, but nothing helpful.

I called the number where it suggested I could contact FAST University; and a very nice person directed me to the Microsoft Learning site (note: the title on the page I was looking at said 'Microsoft Learning' but apparently that is really the old FAST site at mzinga.)

On the real Microsoft Learning portal - microsoft.com/learning - I did a search for 'fast search' only to find:

No hits

I went back to search for simply 'search' and did find a single class with FAST Search in the title - sadly, one offered in August of 2011, 6 months ago.

After a call back to the nice person on the other Microsoft Learning page at mzinga, she told me there are no more FAST classes (meaning FAST ESP, I guess); and that for FAST Search for SharePoint classes, I need to find a partner. Go back to the (real) Microsoft Learning site, search for 'class locator' to find a partner. Use care: if you click on the class description, you'll see the date first published, not the date of any classes: to get that, you need to click on 'instructor lead'.

So yes, there were three training providers here in Silicon Valley. And one claims to have classroom training for next week! But when I called them (at 1:15PM) an answering service picked up and told me that because I called 'outside of normal business hours' he'd have to have them call me back. Nothing yet, the day after. Maybe an early Easter holiday?

Second training partner: went to their web site - no phone number, but a 'Live Chat'. "Sorry, no operators are available; you'll be connected to the next one free." An hour later, nada. I left.

Final partner - ONLC - picked up the phone and confirmed that they teach the class, but they will need a few more students to register before they can confirm it will happen. If it does happen, they teach it remotely. I can go to their center in San Jose, or even take it from here in our offices. Cool. But they can't find four students in the US to take it?

Kudos to ONLC; but it's a shame to see how far down the list training is for Microsoft with respect to FAST Search for SharePoint. Luckily, there are a number of good former FAST ESP partners - including us - who can help you with what you need, be it training, remote support, or even appdev.

What do you do for Microsoft search training?


January 11, 2012

Webinar: What users want from enterprise search in 2012

If you ask the average enterprise user what he or she wants from their internal search platform, chances are good that they will tell you they want search 'just like Google'. After all, people are born with the ability to use Google; why should they need to learn how to use their internal search?

The problem is that web search works so well because, at the sheer scale of the internet, search can take advantage of methodologies that are not directly applicable to the intranet. Yet many of the things that make the public web experience so good can, in fact, be adapted in the enterprise. Our opinion is that, beyond a base level, the success of any enterprise search platform depends on how it is implemented and managed rather than on the core technology.

In this webinar we'll talk about what users want, and how you can address the specific challenges of enterprise content and still deliver a satisfying and successful enterprise search experience inside the firewall.

Register today for our first webinar of the new year, scheduled for January 25: What enterprise users want from search in 2012.


November 22, 2011

Webinar: Improving SharePoint search with the FAST indexing pipeline

For those of you still at your desks this short Thanksgiving week, you might be interested in a webinar we'll be doing with our partner SurfRay early next month.

"Everyone knows that great metadata is key to a great user search experience, but what can you do if your existing content falls short? The FAST Search for SharePoint pipeline provides a way to enhance document metadata during the indexing process so your content has better metadata and users will experience better search results.

During the webinar we’ll talk about what the pipeline is, give examples of how it can improve your metadata, and describe some real-world scenarios where having access to the pipeline resulted in better search quality and happier users."

How can the indexing pipeline improve search quality? You'll have to come to the webinar to hear our take, but a hint: you can add metadata to a document, and improve what is already there, during the indexing process - which means better search.
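As a rough illustration of the idea, here is a minimal sketch of a pipeline that enriches a document's metadata on its way to the index. It is generic Python and makes no attempt to mirror the actual FAST Search for SharePoint pipeline interfaces; the stage names (`fill_missing_title`, `tag_document_type`), the field names, and the keyword rules are all assumptions made up for the example.

```python
import re
from typing import Callable, Dict, List

Document = Dict[str, str]
Stage = Callable[[Document], Document]


def fill_missing_title(doc: Document) -> Document:
    """If the author left the title empty, fall back to the first
    non-empty line of the body."""
    if not doc.get("title", "").strip():
        for line in doc.get("body", "").splitlines():
            if line.strip():
                return {**doc, "title": line.strip()}
    return doc


def tag_document_type(doc: Document) -> Document:
    """Add a coarse 'doctype' field based on simple keyword rules
    (hypothetical rules; a real deployment would use its own)."""
    body = doc.get("body", "").lower()
    if re.search(r"\binvoice\b", body):
        doctype = "invoice"
    elif re.search(r"\bpolicy\b", body):
        doctype = "policy"
    else:
        doctype = "general"
    return {**doc, "doctype": doctype}


def run_pipeline(doc: Document, stages: List[Stage]) -> Document:
    """Pass the document through each stage in order; each stage
    returns a new dict, so the original is left untouched."""
    for stage in stages:
        doc = stage(doc)
    return doc


if __name__ == "__main__":
    raw = {"title": "", "body": "\nTravel Policy 2012\nEmployees must book travel in advance."}
    enriched = run_pipeline(raw, [fill_missing_title, tag_document_type])
    print(enriched["title"], "|", enriched["doctype"])
```

Each stage does one small job, so you can add, remove, or reorder stages as you learn where your content falls short.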

The webinar is planned for Friday, December 9 at 2PM Eastern/11AM Pacific.  You can register for the event now.