A while back, I posted an idea for measuring how much two differently named memes overlap in content. Looking back, what I was really talking about was tuning a folksonomy. What we want is a way to see how much overlap there is between two tags so that we can merge or split them when appropriate.
It now looks like somebody has taken the first step in this direction. A group of folks (including Lilia Efimova of Mathemagenic fame) is building an application called Blogtrace that, among other things, compares two web pages to see the degree to which their key terms (i.e., ontologies) overlap. You can read a brief description and even see a screen grab on Anjo Anjewierden’s blog.
It seems to me that if you extend this model a bit, you could compare two folksonomy tags by spidering the pages filed under each tag and performing the same textual analysis on them. I think that would be very useful. Going a bit further, it would be interesting if, after performing the analysis, the tool could spit out a list of the URLs under each tag that have a low degree of overlap with the other tag. This could be helpful in understanding why the overlap is not perfect and even how the tags could be refactored to make them more useful.
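To make the idea a little more concrete, here is a minimal Python sketch of the tag comparison. It is my own illustration, not how Blogtrace works: I am assuming that the pages under each tag have already been spidered into a dictionary of URL to text, that "key terms" can be approximated by the most frequent non-stopword words, and that overlap can be scored with Jaccard similarity. The function names, the tiny stopword list, the threshold, and the example.com URLs are all made up for the example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; a real run would use a fuller one).
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is",
             "for", "on", "with", "it", "that", "this"}


def key_terms(text, top_n=10):
    """Return the top_n most frequent non-stopword terms in text, as a set."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return {term for term, _ in counts.most_common(top_n)}


def jaccard(a, b):
    """Jaccard similarity of two sets: shared terms / all terms (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def tag_terms(pages, top_n=10):
    """Key terms for a whole tag; pages maps URL -> already-fetched page text."""
    return key_terms(" ".join(pages.values()), top_n)


def tag_overlap(pages_a, pages_b, top_n=10):
    """Score how much the key terms of two tags overlap (0.0 to 1.0)."""
    return jaccard(tag_terms(pages_a, top_n), tag_terms(pages_b, top_n))


def low_overlap_urls(pages, other_pages, threshold=0.1, top_n=10):
    """URLs under one tag whose key terms barely overlap the other tag's terms."""
    other = tag_terms(other_pages, top_n)
    return sorted(url for url, text in pages.items()
                  if jaccard(key_terms(text, top_n), other) < threshold)


# Hypothetical spidered content for two tags.
pages_a = {
    "http://example.com/a1": "tags folksonomy tagging users tags bookmarks",
    "http://example.com/a2": "recipe chocolate cake baking chocolate",
}
pages_b = {
    "http://example.com/b1": "tagging bookmarks users tags folksonomy tagging",
}

overlap = tag_overlap(pages_a, pages_b)
outliers = low_overlap_urls(pages_a, pages_b)
print(overlap, outliers)
```

In this toy data the second page under the first tag is about baking, so it is the one flagged as an outlier. That is exactly the kind of URL you would want to look at before deciding whether to merge the two tags or split one of them.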
Anyway, Blogtrace looks interesting. If you’re curious, Lilia has posted a schematic of how the application works.