If we are to incorporate a folksonomy into our online library catalogs in a meaningful way that minimizes chaos, then we must find a way to evaluate how well a tag cloud describes a work.
This raises two questions:
- How are we to evaluate tags when all tags have meaning?
- How are we to evaluate a folksonomy without turning it into a taxonomy?
Any editorial decision made about tags is immediately problematic. First, if all tags have meaning, then you cannot delete or edit them. Second, editorial decisions imply a type of rule making that would destroy a folksonomy by turning it into a taxonomy. Therefore, the only way to evaluate a tag cloud is to do so quantitatively. A quantitative metric removes the judgment and rule making about which tags are acceptable and which are not. Further, such a metric can be applied non-invasively, so that the folksonomy continues to grow undisturbed, and it can be recalculated iteratively to adjust as tag clouds change. The table below summarizes the advantages and disadvantages of the various metrics we considered for evaluating the descriptive quality of a tag cloud.
We decided that the best metric to use is one based on the h index created by J. E. Hirsch, which evaluates an academic’s citation rates (An index to quantify an individual’s scientific research output). For our purposes, the h index is defined as follows: a tag cloud has an index of h if h of its N unique tags have each been repeated at least h times, and the other (N-h) tags have each been repeated no more than h times.
The h index tends to increase with the number of unique tags and with how often those tags are repeated, both of which lead to better description.
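To make the definition concrete, the following is a minimal sketch of how a tag cloud's h index might be computed. It assumes that "repeated h times" means a tag has been applied h times in total, and that tags are compared exactly as entered, with no normalization, so that no editorial judgment is introduced. The function name `tag_cloud_h_index` and the sample tags are hypothetical and used only for illustration.

```python
from collections import Counter


def tag_cloud_h_index(tags):
    """Return the largest h such that h unique tags each occur at least h times.

    `tags` is an iterable of tag strings, one entry per time a tag was applied.
    Tags are compared as-is (assumption: no case folding or stemming).
    """
    # Count how often each unique tag occurs, then rank counts from high to low.
    counts = sorted(Counter(tags).values(), reverse=True)

    h = 0
    for rank, count in enumerate(counts, start=1):
        if count >= rank:
            h = rank  # the top `rank` tags all occur at least `rank` times
        else:
            break  # counts only get smaller from here on
    return h


# Hypothetical tag cloud for a single work: tag frequencies are 5, 4, 3, 2, 1.
sample_tags = (
    ["fantasy"] * 5 + ["magic"] * 4 + ["wizards"] * 3 + ["school"] * 2 + ["owls"]
)
print(tag_cloud_h_index(sample_tags))  # prints 3
```

In this sample, three tags occur at least three times each, and the remaining two occur no more than three times, so the index is 3. Because the calculation only reads tag counts, it can be rerun whenever the cloud changes without altering the tags themselves.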