30 posts categorized "SharePoint"

May 19, 2011

Content owners don't care about metadata

Or do they?

Our recent post about Booz & Company's 'men named Sarah' highlights just how important good metadata can be in order to provide a great search experience for employees and customers.

One of our customers who spoke at the recent ESS 2011 in New York provided some great insights into the problems organizations have getting employee content creators to include good metadata with their documents.

During the ESS talk, they reported that content owners don't really seem motivated when asked to help improve the overall intranet site by improving document metadata. However - and this is a big one - when a sub-site owner sees poor results on their own site, they are willing to invest the time to provide really good metadata.

[A bit of background: This customer provides a way for individual site owners within the organization to add search to their 'sub site' pretty much automatically - sort of a 'search as a service' within the enterprise.]

So if you've been thinking of adding the ability to search-enable sub-sites within your organization, but solving the relevance problem is your first task, you might reconsider your priorities!

/s/Miles

May 16, 2011

Sixty guys named Sarah

We're always on the lookout for anecdotes to use at trade shows, with our customers and prospects, and of course here in the blog, so I have to report that we heard a great one last week at Enterprise Search Summit in New York.

The folks from Booz & Company, a spinoff from Booz Allen Hamilton, did a presentation on their experience comparing two well respected mainstream search products. They report that, at one point, one of the presenters was looking for a woman she knew named Sarah - but she was having trouble remembering Sarah's last name. The presenter told of searching one of the engines under evaluation and finding that most of the top 60 people returned from the search were... men. None were named 'Sue'; and apparently none were named Sarah either. The other engine returned records for a number of women named Sarah; and, as it turns out, for a few men as well.

After some frustration, they finally got to the root of the problem. It turns out that all of the Booz & Company employees have their resumes indexed as part of their profiles. Would you like to guess the name of the person who authored the original resume template? Yep - Sarah.

One of the search platforms ranks document metadata very high, without much ability to tune the weighting algorithms. The other provides a way to tune the relevance; but it also tends to rank people relevance a bit differently - probably stressing documents about people less than the individual people profiles. The presentation was a bit vague about whether any actual tuning that might impact these differences had been done on either platform.

The fact that one of the engines did well, and one did not, is not the big story here - although it is something for you to consider if you're evaluating enterprise search platforms. The real lesson here is that poor metadata makes even the best of search platforms perform poorly in some - if not most - cases.
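To make the weighting point concrete, here is a minimal sketch of field-weighted scoring. It is not how either product actually ranks; the field names, the weights and the sample profiles are all made up for illustration. The idea is simply that when every resume carries the template author's name in its metadata, a metadata-heavy weighting lets every profile match 'Sarah', and the one real Sarah barely stands out.

```python
# Hypothetical field-weighted scoring; field names and weights are assumptions.
def score(profile, term, weights):
    """Add up the weight of every field that contains the term as a whole word."""
    term = term.lower()
    return sum(
        weight
        for field, weight in weights.items()
        if term in profile.get(field, "").lower().split()
    )

# Every resume was built from a template whose author metadata says "Sarah".
profiles = [
    {"name": "Sarah Lee", "resume_author": "Sarah"},
    {"name": "John Smith", "resume_author": "Sarah"},
    {"name": "Mike Brown", "resume_author": "Sarah"},
]

METADATA_HEAVY = {"name": 1, "resume_author": 5}  # metadata dominates
NAME_HEAVY = {"name": 5, "resume_author": 1}      # the person's name dominates

def ranked(term, weights):
    """Return (name, score) pairs, highest score first."""
    scored = [(p["name"], score(p, term, weights)) for p in profiles]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With `METADATA_HEAVY`, the real Sarah scores 6 against 5 for everyone else, a margin that other ranking signals could easily swamp. With `NAME_HEAVY` the gap is 6 against 1. Cleaning up the author metadata would fix the problem under either weighting.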

 

May 05, 2011

FAST for SharePoint Seminar in NY During ESS

Our friends over at Arcovis are hosting a talk, "Automating the Top 5 FAST Search for SharePoint Customizations", Wednesday evening, May 11. Brent Groom, a Senior Engineer at Microsoft with deep experience in enterprise search, is doing the presentation.

The registration site seems to be down right now, but the link to register is http://events.linkedin.com/events/521196/clickthru. You can find information on the seminar/webinar on LinkedIn as well.

You can also attend in person. The session will be held at the Microsoft offices:

Microsoft Corporation
1290 Avenue Of The Americas
New York, NY 10104 US

If you're in New York for the Enterprise Search Summit, and you are in town Wednesday evening, this is only a few blocks from the hotel; show up in person!

 

 

March 24, 2011

Entity Extraction in FAST Search for SharePoint: Great article!

I just discovered a great article on a great blog about FAST Search for SharePoint (FS4SP) by Trond Øivind Eriksen of Comperio in Norway. Comperio is a FAST partner, and has been involved in a number of innovative projects involving FAST ESP and now FS4SP.

The article that originally caught my attention is about 'entity extraction' - what Microsoft now calls 'property extraction' in FS4SP. He addresses 'black list' and 'white list' terms that you want to include in the facets/properties you display in results lists; and, even cooler, he provides the example the way God intended things to be run - via scripting (in this case, PowerShell).

Actually I found his blog most helpful; I'm certainly adding it to my 'must read' list. You may find it helpful as well!

 

/s/Miles

February 02, 2011

Make your search engine seem psychic

People tell us that Google just seems to know what they want - it's almost psychic sometimes. If only every search engine could be like Google. Well, maybe it can.

Over the years, the functions performed by the actual 'search engine' have grown. At first, it was simply a search for an exact match - probably using punch card input. Then, over time, new and expanded capabilities were added, including stemming... synonyms... expanded query languages... weighting based on fields and metadata... and more. But no matter what the search technology provided, really demanding search consumers pushed the technology, often by wrapping extra processing around it both at index time and at query time. This let the most innovative search-driven organizations stay ahead of the competition. Two great examples today: LexisNexis and Factiva.

In fact, the magic that makes public Google search so good - and so much better than even the Google Search Appliance - is the armies of specialists analyzing query activity and adding specialized actions 'above' the search engine. 

One example of this many of us know well: enter a 12 digit number. If the format of the number matches the algorithm used by FedEx in creating tracking numbers, Google will offer to let you track that package directly from FedEx. For example, search for 796579057470 and you see a delivery record; change the last digit, and you get no hits. How do they know?

The folks at Google must have noticed lots of 12 digit numbers as queries; and being smart, they realized that many were FedEx tracking numbers. I imagine, working in conjunction with FedEx, Google implemented the algorithm - what makes a valid FedEx tracking number - and boosted that as a 'best bet'.
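Here's a rough sketch of what that kind of 'best bet' rule might look like sitting in front of your own search engine. To be clear about what's assumed: the checksum below is a stand-in I made up (it is not FedEx's actual algorithm), the tracking URL is a placeholder, and `best_bet` is just a hypothetical name. The point is the shape: match a pattern, validate it, and only then offer the shortcut.

```python
import re
from typing import Optional

def looks_valid(number: str) -> bool:
    """Stand-in checksum (NOT the real FedEx rule): the last digit must
    equal the sum of the preceding digits modulo 10."""
    digits = [int(c) for c in number]
    return sum(digits[:-1]) % 10 == digits[-1]

# Hypothetical rule table: (pattern, validator, link template).
# The URL is a placeholder, not a real tracking endpoint.
RULES = [
    (re.compile(r"\d{12}"), looks_valid, "https://tracking.example.com/?id={q}"),
]

def best_bet(query: str) -> Optional[str]:
    """Return a shortcut link if the whole query matches a known pattern
    and passes that pattern's validator; otherwise return None."""
    q = query.strip()
    for pattern, validator, template in RULES:
        if pattern.fullmatch(q) and validator(q):
            return template.format(q=q)
    return None
```

A query that passes the check gets the link; change one digit and the validator rejects it, so the user falls through to ordinary results. Each new pattern you spot in your query logs becomes one more row in the table.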

Why is this important to you? Well, first it shows that Google.com is great in part because of the army of humans who review search activity, likely on a daily basis. Oh, sure, they have automated tools to help them out - with maybe 100 million queries every day, you'd need to automate too. They look for interesting trends and search behavior that lets them provide better answers.

Secondly, you can do the same sort of thing at your organization. Autonomy, Exalead, Microsoft, Lucene, and even the Google Search Appliance, can all be improved with some custom code after the user query but before the results show up. Did the user type what looks like a name? Check the employee directory and suggest a phone number or an email address. Is the query a product name? Suggest the product page. You can make your search psychic.
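As a minimal sketch of that 'custom code after the user query', assume you have some way to look up employees and products. The two dictionaries below are made-up stand-ins for a real employee directory and product catalog, and `suggest` is a hypothetical name, not part of any of the products mentioned.

```python
from typing import Optional

# Hypothetical lookup tables; a real deployment would query the employee
# directory and the product catalog instead of hard-coded dictionaries.
EMPLOYEES = {
    "sarah jones": {"phone": "555-0100", "email": "sarah.jones@example.com"},
}
PRODUCTS = {
    "widget pro": "/products/widget-pro",
}

def suggest(query: str) -> Optional[dict]:
    """Return a suggestion to show above the normal results, or None."""
    # Normalize case and collapse runs of whitespace before looking up.
    key = " ".join(query.lower().split())
    if key in EMPLOYEES:
        return {"type": "person", **EMPLOYEES[key]}
    if key in PRODUCTS:
        return {"type": "product", "url": PRODUCTS[key]}
    return None
```

The regular search still runs either way; the suggestion is simply displayed first when there is one.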

Finally, does the query return no hits? You can tell what form the user was on when the search was submitted; use that to offer something more helpful than a generic 'No Hits' page. Was the query more than a single term? Look for any of the words, rather than all; make a guess at what the user wanted, based on the search form, previous searches, or whatever context you can find.
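The 'any of the words, rather than all' fallback can be sketched in a few lines. This toy version searches a made-up in-memory set of documents with plain word matching; your engine's own query syntax would replace the matching logic, and the names here are illustrative only.

```python
# Toy document set standing in for a real index (illustrative only).
DOCS = {
    1: "expense report form for travel",
    2: "travel policy and booking guide",
    3: "vacation request form",
}

def search(query, docs=DOCS):
    """Try an all-terms match first; if that finds nothing and the query
    has more than one term, relax to any-term. Returns (mode, doc_ids)."""
    terms = query.lower().split()
    if not terms:
        return ("none", [])

    def matches(require_all):
        combine = all if require_all else any
        return [
            doc_id
            for doc_id, text in docs.items()
            if combine(term in text.split() for term in terms)
        ]

    hits = matches(require_all=True)
    if hits:
        return ("all", hits)
    if len(terms) > 1:
        hits = matches(require_all=False)
        if hits:
            return ("any", hits)
    return ("none", [])
```

Returning the mode alongside the hits lets the results page say something like 'no documents matched all of your words, so here are documents matching some of them' instead of silently changing the query.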

So how do you make your search engine seem psychic? Learn about query tuning and result list pre-processing; we've written a number of articles about query tuning in our newsletter alone.

But most importantly: mimic Google: work hard at it every day.

/s/Miles

 

 

 

 

December 10, 2010

SPTech February 7-9 2011

December is sometimes a tough month to get much business done unless you're an eCommerce company. Nonetheless, 2011 will be here soon, and a hectic January may keep you from noticing a really great SharePoint conference in February: SPTechCon. It's the largest independent SharePoint conference, the kind where the Kool-Aid is just a refreshing beverage.

The early-bird registration that can save a few bucks from your professional development budget ends next Friday, December 17. It's not an easy three days: sessions start at 8:30AM and end at - or after - 5PM. There is time to meet with vendors and with other attendees, but it's certainly a conference you attend to work. The program (which, yes, includes yours truly as a speaker) lists more than 100 workshops and classes, and you'll surely find them educational and professionally valuable.

Since you asked, my session is 'Which SharePoint search is Right for You'. With Microsoft and SharePoint, you have four or five choices in search technology to use just from Microsoft. Throw in a couple of other search products that work well with SharePoint and you've got the potential for some serious confusion. Come by the event, tell me that 'Dr Search sent me' and let's talk about your concerns one-on-one after the Wednesday 8:30AM (sunrise) session.

So while you're enjoying some quiet time leading up to the holidays, get out and register today! See you in February!

/s/Miles

 

 

Microsoft Search Partners blog is a little behind the times

You can't talk yourself out of something you behaved yourself into.

It's a simple truth that applies to kids, employees, bosses and vendors - even pets (you animal lovers will understand).

So when I finally had time to browse blogs I like to follow, one of the ones I opened was the Microsoft Enterprise Search Partners blog. When they acquired FAST, Microsoft jumped into enterprise search with both feet. They finally had a real solution to the problem, and they were happy to have a cadre of skilled partners who knew search. Hey, they even started a blog!

I know it's tough to keep up-to-date when you write a blog, especially for a small company. Trust me, I know: keeping up to date on writing blogs, tweeting, running a company, writing a book AND managing a couple of intense projects is tough. I'm always in awe of big companies like Google and Microsoft with large staffs to handle the important corporate communications.

So when I finally went back to the Microsoft Enterprise Search Partners blog, imagine my surprise when I discovered that the most recent entry there was from April 16 2010 - nearly 9 months ago! The article - 'Calling all Partners!' - talks about the 'terrific momentum' in the program. Working overtime - right up to the layoffs.

So I have to ask: which is it? Are search partners (and enterprise search) important to Microsoft? Or was it an urgent problem in SharePoint that, now solved, can be pushed to the back burner? Microsoft, come on: restart the blog... or let it die.

/s/Miles

 

 

 

November 08, 2010

Enterprise Search Summit DC November 15-18

The new home for the Fall ESS show is the Renaissance Hotel in downtown Washington, DC... so much for ESS-West! The new locale should bring a large number of new attendees and visitors, and a new co-located conference: SharePoint Symposium. InfoToday knows a trend when they see one!

In addition to the usual sessions provided to show sponsors, there are some interesting sessions by Tom Reamy of KAPS Group; Martin White of Intranet Focus; and eDiscovery expert Oz Benamram, CKO of White and Case LLP. Tony Byrne of Real Story Group will also be there, moderating the session I'll be participating in: Stump the Search Consultant, on Wednesday afternoon, November 17th.

I really expect the show to have a large number of government folks in attendance, given how hard it's been for these good folks to travel to previous ESS conferences in New York and San Jose. InfoToday reports higher pre-registration this year than in the past; and I'll be happy to find out I'm wrong about most of the attendees being government or government-related folks.

Come by the session Wednesday afternoon at 3PM; or leave a comment here if you want to get together.

 

 

October 06, 2010

SharePoint and FAST ESP Search Book SHIPS!

Wow, pent up demand never hurts!

We knew that there were a bunch of people waiting for the Advanced Microsoft Search Book, but we were kind of surprised by today's Amazon results. The book shipped Monday - we just got our 'official' copies yesterday; but Amazon reports our book ended up in the top 10, at least in one category:

Amazon Bestsellers Rank: #6,609 in Books (See Top 100 in Books)
   #7 in  Books > Computers & Internet > Microsoft > Networking
   #24 in  Books > Computers & Internet > Software > Business
   #28 in  Books > Computers & Internet > Programming > Software Design, Testing & Engineering > Software Development 

Well, I'm impressed. Thank you to those who are buying the book! We all hope it is useful for you. And we all want to know what you liked and didn't like, and where you need more content. Here's the official Wrox site for the book (for errata and updates); but we've also kicked off a blog just for the book where all four of us will be adding things like PowerShell scripts and taking questions.


September 30, 2010

What we did over our summer vacation

We've been very fortunate that the last several months have been unusually busy for us - in fact, almost too busy. We've been active in a number of interesting and demanding customer projects, and have invented some very powerful and useful technology along the way which we'll have more to say about over the next month or two.

However, it's time for us to formally come clean about the thing that really took up our evenings and weekends for most of the last year: 'the book'. Since early last year, we've been working on a book project with Jeff Friend (Microsoft) and Natalya Voskresenskaya (Arcovis). The book, Advanced Microsoft Search: FAST Search, SharePoint Search, and Search Server, is finally available. The ISBN is 0470584661 and you can order it from Amazon today.

A few good friends have known of the book for a few months now, and while we all felt we finished the book in record time, we're sure it seemed like geologic time to our editors and some potential readers. And I'd be remiss if I did not give special thanks to John Kane (HP), Carl Grimm (Avanade), and Jason Noble (Neudesic) for their roles as technical reviewers.

We cover the entire Microsoft search family with technical coverage on both the SharePoint and the ESP product families. We also have a few chapters on the business side of search: Centers of excellence, operations, selection, and so forth. We even have chapters that deal with ESP under Linux, possibly a first for a Microsoft search book!

We invite you to have a look at our work; and let us know how we did. It took me 20 years to forget what a time sink it is to write a book; and yet now that it's done, we're already talking of a fourth. I guess my memory isn't what it used to be!

Enjoy!

/s/Miles