The journal of Paul M. Watson.

Wednesday, September 14, 2005

Tagging tags

I want to tag my tags. I want my tags to have meaning. I want tags to have types, to be distinct in what they are: a location, a person, a relationship, an animal, a plant. I want to add one tag which, by already having been tagged, adds further meaning to the item without my having to type in superfluous tags.

When I view a jaguar tag I want to see cars, not cats.

Flickr helps in this regard by offering tag clusters. Through some powerful data-mining they differentiate the tag jaguar for cars from the tag jaguar for cats.

The problem there is that it is a computer system doing the data-mining, doing the figuring out. Sure, it is using the tag data you and your friends entered, but the clustering logic is not yours. A big point of folksonomies is the folk bit. You entered the keywords. You made it work the way you wanted it to work. The keywords you entered are the keywords you are comfortable with, the ones you remember and are familiar with. Tag clustering works by including other people's tags, which brings along their methods of tagging, not yours.

A good deal of the time Flickr tag clustering works and works well. But when you want to be specific, when you want an unusual tag cluster, it fails. Try the orange cluster. Where is the fruit? I want to see photos of an orange, the fruit. I assume the reason for this is that in proportion to other uses of orange (colour, sunset, flower, M&Ms), fruit is not used much. If you try the oranges cluster though you do get fruit. Just a plural difference results in success or failure.

orange <= fruit, or "tag the tag 'orange' with the tag 'fruit'". Anytime I want items tagged with orange and I mean fruit, I can get them. orange <= colour. Anytime I want items tagged with orange and I mean the colour, I can get them.

fruit <= plant. Now we can infer that orange is a plant. Or can we? We can, but inference would also tell us that orange is a colour. That is a problem caused by the cloud-like, non-unique nature of tags. In a hierarchical system the orange in plant => fruit => orange is unique and separate to the orange in colour => orange. In tag clouds, orange is just orange.
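
To make the inference concrete, here is a minimal Python sketch. It assumes tagged tags are stored as a plain mapping from a tag to the set of tags applied to it; the names tag_tags, tag_the_tag and inferred are made up for illustration and do not come from Flickr or any real tagging system.

```python
from collections import defaultdict

# Hypothetical store: maps a tag to the set of tags applied to it.
tag_tags = defaultdict(set)

def tag_the_tag(tag, meta_tag):
    """Record `tag <= meta_tag`, e.g. orange <= fruit."""
    tag_tags[tag].add(meta_tag)

def inferred(tag):
    """Return every tag reachable from `tag` by following <= links."""
    seen = set()
    stack = [tag]
    while stack:
        current = stack.pop()
        # .get() avoids creating empty entries in the defaultdict
        for parent in tag_tags.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

tag_the_tag("orange", "fruit")
tag_the_tag("orange", "colour")
tag_the_tag("fruit", "plant")

# Contains fruit, colour and plant: the single "orange" tag picks up both senses.
orange_meanings = inferred("orange")
```

Because there is only one orange node, the result mixes plant (via fruit) with colour, which is exactly the ambiguity described above.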

If tags are tagged, though, then at least through a suitable interface we can refine the list implicitly. Currently we can't; refining is done by data-mining and aggregating other people's methods of tagging.

You might notice that a pseudo-hierarchical system comes into being when tags are tagged. That is fine; it is not against the nature of tags because, while we can infer a hierarchy out of tagged tags, we don't hit the shortcomings of proper hierarchies. Remember, the orange tag can still be a colour, a fruit or a word that does not rhyme, all without creating copies of orange as we would have to do in a hierarchical system.

Naturally we don't want to make tagging complicated. Tagging is working in large part due to how easy it is to tag. In most systems you have a free-text field that you just go wild in. Enter as many or as few or as strange tags as you wish.

So any tagged-tag system would have to retain the freeform nature and remain easy.

Here are some simple tagged-tags I can think of:
Bob Geldof <= musician
Jimmy White <= friend
Cape Town <= city
Edna <= bride
Morgan <= groom

Those are simple relationships. How about adding some notation to imply other relationships?

Location: Cape Town :in: South Africa
Part-of: cog * watch
Synonym: T.V. ~ television
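
A free-text field could still accept this notation. The sketch below is one possible parser, not a specification: the operator symbols are the ones used above, while the relationship names ("tagged-with", "in", "part-of", "synonym") and the function parse_relation are assumptions made for the example.

```python
# Operators from the post, mapped to hypothetical relationship names.
OPERATORS = [
    (":in:", "in"),
    ("<=", "tagged-with"),
    ("*", "part-of"),
    ("~", "synonym"),
]

def parse_relation(text):
    """Split 'left OP right' into (left, relation, right).

    Returns None when no operator with text on both sides is found,
    which means the input is just an ordinary tag.
    """
    for symbol, relation in OPERATORS:
        left, found, right = text.partition(symbol)
        if found and left.strip() and right.strip():
            return left.strip(), relation, right.strip()
    return None

location = parse_relation("Cape Town :in: South Africa")
part = parse_relation("cog * watch")
synonym = parse_relation("T.V. ~ television")
plain = parse_relation("Carolyn")
```

A plain tag such as Carolyn passes through untouched (the parser returns None), so the freeform field keeps working. One limitation of this sketch: a tag that itself contains * or ~ would be misread as a relationship.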

Synonyms are a good example of where tagging can fall down. When tagging a television show do you use TV, T.V. or television? Half the folks use one, some use the other and the rest use something else. Even a single person will vary, one day using bicycle, the next day using bike and then back to bicycle. When they go back to find all items tagged with bike they miss the bicycle items. I do it all the time. I try not to, and it is simple to remember T.V. vs. television, but there are plenty of situations where it is not so simple to remember. "Do I use web or www to mark items like this?"
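
If synonyms were recorded as tagged tags, a search for bike could also return the bicycle items. Here is a rough Python sketch of that lookup. It assumes each variant points at one preferred spelling and that tags are compared case-insensitively; the table, the function names and the photo file names are all hypothetical.

```python
# Hypothetical synonym table: each variant maps to one preferred tag.
canonical = {}

def add_synonym(variant, preferred):
    """Record `variant ~ preferred`, e.g. bike ~ bicycle."""
    canonical[variant.lower()] = preferred.lower()

def normalise(tag):
    """Follow synonym links until a tag with no further synonym is reached."""
    tag = tag.lower()
    seen = set()  # guards against accidental loops such as a ~ b, b ~ a
    while tag in canonical and tag not in seen:
        seen.add(tag)
        tag = canonical[tag]
    return tag

def find_items(items, wanted):
    """Return names of items carrying `wanted` or any synonym of it."""
    target = normalise(wanted)
    return [name for name, tags in items.items()
            if any(normalise(t) == target for t in tags)]

add_synonym("bike", "bicycle")
add_synonym("T.V.", "television")
add_synonym("TV", "television")

# Made-up photos for illustration.
photos = {
    "commute.jpg": ["bicycle", "road"],
    "race.jpg": ["bike"],
    "lounge.jpg": ["TV"],
}

bike_photos = find_items(photos, "bike")       # both cycling photos
tv_photos = find_items(photos, "television")   # the photo tagged TV
```

The person tagging still types whichever word comes to mind; the synonym link only matters at search time.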

Recently in Flickr I stopped tagging all my photos with South Africa. I realised it was polluting the system to have photos tagged with that when the photo was of my New Balance shoes. I felt that I should only use South Africa on photos that were distinctively South African e.g. of Cape Town city or of a Zulu hut. But 5 years from now I may ask "Where did I take that photo of my New Balance shoes?" and without a South Africa tag I won't remember.

I haven't quite figured it out though; how I can tag my shoes photo with South Africa and not pollute actual photos of South Africa also tagged with South Africa. I think tagging tags can help there, but the implementation details are bedeviling.

Still, tagging of tags can offer us more meaning without sacrificing the good qualities of tagging. For now I have to tag this photo with woman, Carolyn and friend, when all I should have to tag it with is Carolyn, which is in turn tagged with woman and friend.
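
As a small illustration of that last point, the Python sketch below expands the tags typed on a photo with whatever those tags are themselves tagged with. The data, the file names and the function names are invented for the example; it is a sketch of the idea, not how any existing photo site works.

```python
# Hypothetical data: tags applied to tags, and tags typed on photos.
TAGGED_TAGS = {"Carolyn": {"woman", "friend"}}
PHOTO_TAGS = {
    "portrait.jpg": {"Carolyn"},
    "shoes.jpg": {"New Balance"},
}

def effective_tags(tags):
    """Return the typed tags plus everything they are tagged with, transitively."""
    result = set(tags)
    pending = list(tags)
    while pending:
        for meta in TAGGED_TAGS.get(pending.pop(), ()):
            if meta not in result:
                result.add(meta)
                pending.append(meta)
    return result

def photos_tagged(wanted):
    """Return photos whose typed or inherited tags include `wanted`."""
    return sorted(name for name, tags in PHOTO_TAGS.items()
                  if wanted in effective_tags(tags))

friend_photos = photos_tagged("friend")  # finds portrait.jpg via Carolyn
```

Only Carolyn was typed on the photo, yet a search for friend or woman still finds it.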


Blogger Allolex said...

Hi Paul, what you're talking about amounts to a simple ontology (as opposed to a complex one). This is an approach a linguist might use to solve tagging ambiguity problems where polysemy and homography cause confusion. By taking other tags for the same image (for example) and seeing whether any of those cluster in the same sub-branch of the ontological tree structure, you can disambiguate them.

The problem is often a case of the tagging system not being able to guess at what the user wants. In my fantasy system, you would enter something like "jaguar" and the software would then ask whether you meant the CAT, or the CAR, or the COMPANY. Or whatever...

And all of this requires a pre-existing ontology. WordNet might help with this. :)

11:39 AM

