For some reason or another (lots of travel, several hats at home and work) I've had trouble finalizing this post. Earlier today, though, I read Paul Miller's latest post on ZDNet. There seems to be some discussion about whether or not data is a commodity. I think there most definitely IS data that is a commodity.
Taxonomies are a valuable raw material in the management of information. A taxonomy is a file that can be bought and sold and used to improve services. Taxonomies can be generated by humans, machines, or even better: humans working with machines. Many taxonomies are a dime a dozen, with little to differentiate between versions of the same data. Some are like Kopi Luwak coffee - rare and extremely valuable. The word "taxonomy" is itself suffering from a kind of genericide. Classical definitions still apply: taxonomies have become commoditized.
The complexity of the controlled vocabulary will determine its value to a degree. A simple pick list should be easy and cheap to acquire - a list of countries, for example. Or colors, seasons, months - you get the idea. What is the value of a list of industries? Or companies? Maintenance is the primary cost factor - frequent changes require frequent updates, but an authority file in and of itself is not that complex. A broad and deep poly-hierarchical taxonomy I would expect to have more value. A poly-hierarchical taxonomy is one where a term in the taxonomy can have more than one parent term. Managing these relationships takes more time. An ontology - well, those aren't quite commodities yet, because they still require a great deal of thought and effort. But they will get there.
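To make the poly-hierarchy idea concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular taxonomy tool works: the `Taxonomy` class, its method names, and the sample terms are all made up for this post. Each term keeps a set of parents, so "Coffee" can sit under two broader terms at once.

```python
from collections import defaultdict


class Taxonomy:
    """Toy poly-hierarchical taxonomy (hypothetical, for illustration only)."""

    def __init__(self):
        # term -> set of direct parent (broader) terms
        self.parents = defaultdict(set)

    def add(self, term, parent=None):
        """Register a term, optionally linking it to one more parent."""
        self.parents[term]  # make sure the term exists even without a parent
        if parent is not None:
            self.parents[parent]  # make sure the parent exists too
            self.parents[term].add(parent)

    def ancestors(self, term):
        """Return every broader term reachable from `term`."""
        seen = set()
        stack = list(self.parents[term])
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.parents[current])
        return seen


taxonomy = Taxonomy()
taxonomy.add("Beverages")
taxonomy.add("Agricultural products")
taxonomy.add("Coffee", parent="Beverages")
taxonomy.add("Coffee", parent="Agricultural products")  # second parent
taxonomy.add("Kopi Luwak", parent="Coffee")

print(sorted(taxonomy.ancestors("Kopi Luwak")))
# ['Agricultural products', 'Beverages', 'Coffee']
```

Even in this toy version, every extra parent is another relationship somebody has to review whenever a term moves or is retired, which is where the maintenance time goes.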
The source of the data will also help determine its value. Data from trusted sources - for whom integrity is paramount - should be valued higher. Is the data accurate? Is it maintained? Is it in a usable format? Does it have high availability? (Many quality vendors can be found at TaxonomyWarehouse.com.)
The uniqueness of the taxonomy will drive its value. As in our coffee example above, a taxonomy as ubiquitous as Starbucks will not be as valuable as, say, a pharmaceutical research vocabulary. Given the, uh, processes needed to produce Kopi Luwak, it is rare and therefore fetches a higher price, as would our R&D taxonomy.
Information security concerns also impact value. Our pharmaceutical company, or a financial services provider, is not about to release its vocabulary into the wild. It is a significant intellectual asset that merits a substantial IT effort to protect.
I actually like the fact that taxonomies have become commoditized. Why? Competition drives improvement - in quality, in focus, in security and in usability. These are areas that the semantic web community needs to focus on - in my experience, security and usability need attention NOW. Good fences make good neighbors, and when we've got good fences, we can make more links and learn to trust. Icing on the cake!
Flickr image by INeedCoffee