
March 19, 2008

Advanced Duplicate Detection (also related to spam detection and clustering)

We need to do a dedicated article about this area, but I wanted to share some material here that we have written about it, and that will likely re-appear in a future article.

In our recent newsletter article, we covered the problem of generic duplicate detection in search, and then duplicate detection in federated search.

In a SearchDev posting, Mark talked more about why checksums aren't always enough for duplicate detection (see messages 485 and 490).
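Those messages are not reproduced here, so what follows is only a rough sketch of one commonly cited limitation, not necessarily the argument made in the posting: a checksum changes completely when a document changes even slightly, so two copies of a page that differ only by a footer or timestamp look unrelated. The sketch compares an exact checksum against word-shingle overlap (Jaccard similarity). The function names, the sample documents, and the shingle size of 3 are all assumptions chosen for illustration.

```python
import hashlib


def checksum(text: str) -> str:
    """Exact fingerprint: any change in the text yields a different digest."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def shingles(text: str, k: int = 3) -> set:
    """Set of overlapping k-word sequences (k=3 is an arbitrary choice)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Share of shingles the two sets have in common, from 0.0 to 1.0."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical documents: the second is the first plus a trailing print stamp.
doc_a = "Enterprise search engines often index the same page more than once."
doc_b = doc_a + " Printed 2008-03-19"

same_checksum = checksum(doc_a) == checksum(doc_b)
similarity = jaccard(shingles(doc_a), shingles(doc_b))

print("checksums match:", same_checksum)
print("shingle similarity:", round(similarity, 2))
```

Under these assumptions the checksums differ while the shingle overlap stays high, which is the kind of near-duplicate an exact-match test misses. Any cutoff for treating two documents as duplicates would have to be tuned against real content.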

