by Laena McCarthy & David Conners (December 2006)

Creating an Image and Idea Index

In an attempt to meet patrons' needs in our library, we embarked on a six-week user-tagging project. After discussing what it means to evaluate the tag cloud of a work, we settled on the need for a quantitative, non-editorial evaluation. We propose the index h, defined as the number of tags that have been repeated at least h times, as a metric for quantitatively evaluating tag clouds. We generated h indexes for a random sample of works to see what range is normal for h. Lastly, we propose using the h index as a way to constructively bring a folksonomy into an online catalog and to encourage a conversation with the existing taxonomy.

Introduction

The scenario: if you stop by Pratt Manhattan Library between the hours of 4pm and 8pm on any given evening, you will find the 4th aisle filled with students browsing the shelves or slumped on the floor with an eclectic pile of Art & Design books.

The problem: Art & Design students cannot find specific images and ideas in books using traditional library cataloging and classification. We wanted to help them.

Why tags: tags allow design students to label their books with the idea or image they are looking for, creating a trail through the books that can lead other students down the same path. Incorporating tags makes searching specific: where once students searched for books, now they search the metadata.

Why LibraryThing: this online service allows people to catalog their books. You can access the catalog from anywhere—even on your mobile phone. Because everyone catalogs together, LibraryThing also connects people with the same books, comes up with suggestions for what to read next, and creates a community of ideas out of a book. Social data makes it easier to find what you are looking for.

We needed to test our theory, so we decided to try user tagging with the Art & Design students in the Communication Design Program at Pratt Institute.

Folksonomies and Tags

Tags are user-generated keywords supplied to describe content. A body of tags serving as metadata for a digital system is often called a folksonomy. Thomas Vander Wal coined the term folksonomy in 2004 by combining the words folk and taxonomy (Folksonomy: A Game of High-tech (and High-stakes) Tag). Using tags to create a folksonomy is to create a social classification system. A work is described by user-submitted tags, which are grouped together in what is called a tag cloud. The relative weight of repeated tags is shown either by a parenthetical number following each tag or by an increase in font size.
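
As a concrete illustration, here is a minimal Python sketch of our own (the tags are invented; this is not any tagging site's actual code). A tag cloud is simply the set of unique tags paired with their repeat counts:

    from collections import Counter

    # Invented user-submitted tags, purely for illustration.
    submitted_tags = [
        "logos", "branding", "logos", "typography",
        "logos", "branding", "black and white",
    ]

    # A tag cloud pairs each unique tag with its repeat count.
    tag_cloud = Counter(submitted_tags)

    # Show each tag with a parenthetical weight, most frequent first.
    for tag, count in tag_cloud.most_common():
        print(f"{tag} ({count})")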

A relatively stable tag cloud emerges quickly when multiple users freely contribute tags. On the popular URL-tagging site del.icio.us, a stable tag pattern emerges after about 100 users have bookmarked a given site (Folksonomy). In one study, Adam Mathes showed that the distribution of tags in a tag cloud follows a power law: a few tags are used by very many users, a larger number of tags are used by only a few users, and a very large number of tags are used by just one or two users (Folksonomies: Tidying up Tags?). Marieke Guy and Emma Tonkin surveyed two tagging sites, Flickr and del.icio.us, and found similar results. Two power curves from that study, one plotted on a linear scale and one on a logarithmic scale, published in Folksonomies: Tidying up Tags?, are reproduced below:

[Figures: tag frequency distributions on linear and logarithmic scales, from Folksonomies: Tidying up Tags?]

Tagging, as a means of classification, improves with scale. As more users contribute, tags with shared meaning are repeated, while tags with less shared meaning are relegated to the “long tail” of the curve. For our library patrons, we were interested in both popular tags and tags found in the long tail. Popular tags in a tag cloud capture the shared “aboutness” of a work, while less frequent tags capture its idiosyncrasies. For example, a patron once needed images of drug use, and the catalog returned no hits on that subject. A book in our collection might contain only one image of drug use; if so, no Library of Congress Subject Heading, a formal taxonomy, would be assigned for it. This information could be recorded in the folksonomy, in the long tail of the power curve of tags describing that work.

Folksonomies have both advantages and disadvantages compared with a more traditional hierarchical taxonomy. Folksonomies are flexible and self-moderating. Unlike a taxonomy, a folksonomy is inclusive, letting users participate, and is often cheaper. Detractors are quick to note that folksonomies are chaotic and imprecise: ambiguous and inexact tags can hinder searching, while overly personalized and misspelled tags can make it even harder. The challenge, then, is to maximize the potential of folksonomies while minimizing their disadvantages.

User Tagging

Methodology
We asked six Communication Design students to participate in a tagging project called the Creative Image and Idea Index.

For the next six weeks, every time they checked out a book in the course of their regular study, research, or reading, a form was attached with two simple questions:
  • What types of images or ideas were you looking for? (Examples: Nike logo, Corporate brands)
  • What keywords (aka "tags") should we use to describe the images and ideas you found in this book? (Examples: Nike, marketing, trademarks, logos)

Results
After six weeks, the six Communication Design students had tagged 19 books using 95 tags. The most frequently used tags were "graphic design", "black and white", and "logos". The tags from our study conformed to the power law described earlier: most of the tags were used only once, and no book was tagged by more than one person.

Problems
Time and size of study: six weeks and six people were not enough to garner the kind of data we had hoped for. During observations in the library, we witnessed the same books being used repeatedly, but our study was not large or long enough to reflect this usage.

Given the number of tags we collected, we concluded that there was no way to implement them as meaningful descriptors without editing them and turning the folksonomy into a taxonomy. To avoid this problem in the future, we would need to repeat the study with more students over a longer period. This problem, though, led us to wonder whether there was a way to evaluate tags without destroying a folksonomy.

Evaluating Tag Clouds

All tags have some meaning. Even the most obscure tag has meaning to the user who submitted it. Clay Shirky argues that in a folksonomy there is no such thing as a synonym, because each user generates a particular tag for a reason (Folksonomies). In fact, this is one of the greatest strengths of a folksonomy: it allows for the coexistence of terms that a thesaurus-based taxonomy would treat as synonyms. While a taxonomy may state that “movies” and “cinema” are equivalent terms, many film critics would disagree. This strength can turn into a weakness when users submit misspelled tags or otherwise stray from tagging best practices. Poor-quality tags, though embedded with meaning, are what lead to folksonomies being labeled chaotic systems.

If we are to incorporate a folksonomy into our online library catalogs in a meaningful way that minimizes chaos, then we must find a way to evaluate how well a tag cloud describes a work.
This raises two questions:
  • How are we to evaluate tags when all tags have meaning?
  • How are we to evaluate a folksonomy without turning it into a taxonomy?
Any editorial decision about tags is immediately problematic. First, if all tags have meaning, then we cannot delete or edit them. Second, editorial decisions imply a kind of rule making that would destroy a folksonomy by turning it into a taxonomy. Therefore, the only way to evaluate a tag cloud is to do so quantitatively. A quantitative metric removes the judgment and rule making about which tags are acceptable and which are not. Further, such a metric can be applied non-invasively, so that the folksonomy continues to grow undisturbed, and iteratively, so that it adjusts as tag clouds change. The table below summarizes the advantages and disadvantages of the various metrics we considered for evaluating the descriptive quality of a tag cloud.

[Table: advantages and disadvantages of the candidate metrics]

We decided that the best metric to use is one based on the h index created by J. E. Hirsch, which evaluates an academic’s citation rates (An index to quantify an individual’s scientific research output). For our purposes the h index is defined as follows:

A tag cloud has an index of h if h of its N unique tags have been repeated at least h times, and the other (N-h) tags have been repeated no more than h times each.

The h index rises with both the number of unique tags and the frequency with which they are repeated, and both lead to better description.
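
The calculation itself is simple. Here is a minimal sketch in Python, our own illustration rather than anyone's production code: sort the tag frequencies in descending order and find the largest rank whose frequency is at least that rank.

    def h_index(tag_counts):
        """Compute the h index of a tag cloud, where tag_counts maps
        each unique tag to the number of times it was submitted."""
        freqs = sorted(tag_counts.values(), reverse=True)
        h = 0
        for rank, freq in enumerate(freqs, start=1):  # ranks are 1-based
            if freq >= rank:
                h = rank  # at least `rank` tags occur `rank` or more times
            else:
                break  # frequencies only decrease from here on
        return h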

H-index Examples

Here is a simple example of an h index:

The work Elements of Chemistry by Antoine Laurent Lavoisier is tagged in LibraryThing. Listing its tags from most frequent to least frequent gives:

  1. Science (8)
  2. Chemistry (4)
  3. Nonfiction (4)
  4. Great books (2)
  5. Attic-V (1)
  6. Britannica Great Books (1)
  7. Chemical Analysis (1)
By the definition, this tag cloud has an h index of 3, because exactly three tags have been used at least three times: science, chemistry, and nonfiction.
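
Running the sketch above on this tag cloud reproduces the result (the dictionary simply restates the LibraryThing counts listed above):

    lavoisier = {
        "Science": 8,
        "Chemistry": 4,
        "Nonfiction": 4,
        "Great books": 2,
        "Attic-V": 1,
        "Britannica Great Books": 1,
        "Chemical Analysis": 1,
    }
    print(h_index(lavoisier))  # prints 3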


Two works with the same h index are comparable in terms of their tag clouds' overall descriptive impact, even if their total numbers of unique or repeated tags are very different.

For two works with a similar number of unique tags, the work with the higher h index has the more fully descriptive tag cloud.

[Figures: del.icio.us tag clouds for the two works compared below]

Here we ask the old cataloger question: would a searcher who typed in "people, portfolio, fotos" be happy with the result of David Shrigley's URL? And would a user who searched for "design, plugins, collaboration" be interested in Structured Blogging? We think the answer is more likely yes in the latter case.

To get a sense of what a common h index is, we took random samples of 100 works from both del.icio.us and LibraryThing and calculated the h index of each tag cloud. In both cases, most of the h indexes fell within the 3-7 range. More testing is needed to see whether this pattern holds.
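
The bookkeeping behind this exercise is modest. Here is a hedged sketch, reusing h_index from above and assuming the sampled tag clouds have already been exported into a list of tag-to-count dictionaries (how we pulled them from each site is not shown):

    import random

    def sample_h_indexes(tag_clouds, n=100, seed=1):
        """Compute the h index for each of n randomly sampled tag clouds."""
        sample = random.Random(seed).sample(tag_clouds, n)
        return sorted(h_index(cloud) for cloud in sample)

    # Hypothetical usage, once the tag clouds have been collected:
    #   indexes = sample_h_indexes(librarything_clouds)
    #   print(indexes[0], indexes[-1])  # in our samples, mostly 3 to 7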


Folksonomies in the OPAC

Can folksonomies provide the edge that online catalogs need to be competitive in a world dominated by Google and Amazon?

Folksonomies are flexible and self-moderating, and when combined with a more traditional hierarchical taxonomy they may be able to offer the personal, interactive, and productive search experience that people are looking for in an online catalog. Next-generation interfaces from vendors such as Innovative Interfaces are betting on just this: their new discovery interface, Encore, allows users to search the OPAC and add tags to it.

Once we have integrated a folksonomy into the online catalog, an exciting opportunity arises: works would be described both by the taxonomic Library of Congress Subject Headings and by the folksonomic tag cloud. If we could benchmark the h index to determine at what value a tag cloud sufficiently describes a work, then we could use it to create a conversation between the folksonomy and the taxonomy.

Conclusions

Imagine a catalog where users can freely assign tags to any work in the collection. The integrated library system that manages the online catalog recalculates the h index of a work's tag cloud after each new tag is added; this happens dynamically, hidden from the users. Also imagine that by this point studies have shown that a tag cloud with an h index of 15 or higher describes its work very well. The integrated library system can alert the librarians every time a work first reaches an h index of 15. The librarians can then review the tag cloud, without editing it, and see whether any additional Library of Congress Subject Headings should be added.
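
As a sketch only, again reusing h_index from above, the trigger could look something like the following. The function and its arguments are hypothetical (no integrated library system we know of exposes such an interface); the benchmark of 15 comes from the scenario above.

    H_BENCHMARK = 15  # the assumed "very descriptive" threshold from the scenario

    def on_tag_added(work_id, tag, tag_counts, alerted, notify):
        """Update a work's tag cloud, then alert librarians the first
        time its h index reaches the benchmark. tag_counts maps tags
        to counts for this work; alerted is the set of flagged works."""
        tag_counts[tag] = tag_counts.get(tag, 0) + 1
        if work_id not in alerted and h_index(tag_counts) >= H_BENCHMARK:
            alerted.add(work_id)  # flag once; the cloud itself is never edited
            notify(f"Work {work_id} reached an h index of {H_BENCHMARK}: "
                   "review its tag cloud for possible new subject headings.")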

In this scenario, the taxonomy and the folksonomy are in conversation with each other, a proxy for a librarian-user dialog with the h index as moderator. Here we get the best of both worlds: the flexible, adaptable, and cheap folksonomy can continue to track the changing meaning of a work, while the orderly taxonomy can prevent chaos from overrunning the online catalog. Benchmarking such a useful metric as the h index will require a larger sample of user-generated tags, possibly broken down by discipline, genre, and form.

Acknowledgements

We would like to thank Professor David Cohen of Swarthmore College's Physics & Astronomy Department for introducing us to the h index and for some mathematical assistance. Our thanks go out to fellow MLS student Peter Cherches for his help with formatting Blogger after class. Lastly, thank you to Pratt Librarian Jean Hines for letting us experiment in her library.