The Incredible Chunk
Hierarchies are the ultimate chunking machine. They chunk items into containers of items (or folders). Then they chunk containers into more containers...such that in the end, you could have a tree of hundreds, thousands, millions, billions, infinite containers all contained in a single container. What a feat.
But what would happen if you ever tried to open such a tree? The guts would pour out first at a slow trickle, but with each successive turn of the caret, more and more containers would flow out until you ended up with a...really big tree, a tree so big, it defied grokking. You actually wouldn't need such a big tree to challenge human comprehension. Try wrapping your head around this tree:
Given what you can see about the branches of the tree that are open, can you predict what will appear when you open up the closed branches: Assistants, Application Support, Address Book Plug-Ins?
* hierarchy_clean.png:
In other words, Hierarchies are manufacturers of "fake chunking". They reduce the amount of data to grok through a cheating mechanism of "hiding" most of the data such that you start out with only the tip of the iceberg in view. But in order to actually get at any of the data, you have to open the tree, and once you open the tree, you're immediately lost in an undifferentiated mass of folders within folders within folders within folders.
You're damned if you do. You're damned if you don't.
The problem with generic relationships
Hierarchies are constructed out of generic Parent-child relationships. The fact that they are generic is great, because they can be used to model all kinds of relationships, the only limitation being that the relationship has to be some variant of the generic Parent-child prototype.
But you lose something by painting in such broad, all-encompassing strokes. You lose data. You lose semantics. You lose context. Generic and broad brushes dumb down everything they touch. They turn brightly colored, multi-dimensional objects into homogeneous goop, and generic designs can quickly start to feel as oppressive as systems that are too specific and too small.
In the end, however, generic Parent-child relationships fail on every level. Not only are they incapable of expressing the subtle nuances of different flavors of Parent-child relationships, they also fail to account for the whole universe of relationships that are neither Parent-child nor Sibling. Parent-child and Sibling are simultaneously too general and too specific.
The real world is full of specifics, rich with semantics and embedded in context. Unlike the virtual world of computers, where things can float in a conceptual vacuum of nothing, things in the physical world always have a place, a location, a time, a textured and nuanced environment.
- If I say the paper is inside the drawer, it means that the paper is physically inside the drawer.
- If I say that metaphysics is a branch of philosophy, it means that it is conceptually contained inside of philosophy.
- If I say that purple is a type of color, it means that purple is conceptually contained inside of color because it is an example of a color.
- If I say economics is only one of the areas you study as a sociologist, it means that economics is a tool used by sociologists.
- If I say this subset of the population in Wisconsin all have blue eyes, it means that blue eyes are a characteristic of a subset of people in Wisconsin.
When you say "inside" to a human being, they take this very generic notion of parent-child and parse it into different flavors of "inside", effectively chunking it down, and based on the context in which it is used, they deduce the appropriate meaning of the word.
All of these things, however, are represented (inaccurately) in Hierarchies as generic Parent-child relationships. Another way of saying it is that generic Parent-child relationships in Hierarchies fail to express the subtle, yet significant distinctions between different types of relationships.
- The paper is truly physically inside the drawer; there is no part of that piece of paper that is outside of the drawer.
- On the other hand, economics is not even conceptually inside sociology. It is a peer of sociology and should really be a Sibling of sociology in the hierarchy. The problem is that economics overlaps sociology, but in a hierarchy, that can't happen. In order to express that overlap, you must put economics inside of sociology and sociology inside of economics...which is not only inaccurate but confusing as well.
- Similarly, blue eyes is not really wholly contained inside of these people in Wisconsin. Blue eyes is a trait that cuts across all people, even some animal species. Unlike economics and sociology, blue eyes is not a peer of Wisconsinites.
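To make the distinction concrete, here is a minimal sketch of what it might look like to store the flavor of each relationship instead of a bare Parent-child link. The `RelationType` names and the `Relation` class are hypothetical, invented for illustration; they simply mirror the five examples above.

```python
from dataclasses import dataclass
from enum import Enum


class RelationType(Enum):
    # Hypothetical relationship types, one per example in the text.
    PHYSICALLY_INSIDE = "is physically inside"
    BRANCH_OF = "is a branch of"
    TYPE_OF = "is a type of"
    TOOL_OF = "is a tool used by"
    TRAIT_OF = "is a trait of some members of"


@dataclass(frozen=True)
class Relation:
    source: str
    kind: RelationType
    target: str

    def describe(self) -> str:
        """Spell the relationship out in words."""
        return f"{self.source} {self.kind.value} {self.target}"


relations = [
    Relation("paper", RelationType.PHYSICALLY_INSIDE, "drawer"),
    Relation("metaphysics", RelationType.BRANCH_OF, "philosophy"),
    Relation("purple", RelationType.TYPE_OF, "color"),
    Relation("economics", RelationType.TOOL_OF, "sociology"),
    Relation("blue eyes", RelationType.TRAIT_OF, "people in Wisconsin"),
]

# A plain hierarchy keeps only (child, parent) and throws the kind away.
flattened = [(r.source, r.target) for r in relations]


def distinct_kinds(rels):
    """Return the set of relationship types present in rels."""
    return {r.kind for r in rels}


for r in relations:
    print(r.describe())
```

Once flattened, all five pairs look identical in shape, which is exactly the information loss described above.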
The Incredible Chunk fails to chunk
In spite of their over-muscled chunking abilities, Hierarchies fail to chunk down the potentially infinite number of relationships between groupings in the hierarchy. For every additional level created in a particular branch of the hierarchy, there is an nC(n-1) + n increase in the number of relationships created, where n is the number of folders in the newly created level.
- hierarchy_relationships.png:
When there are only 2 kinds of relationships and potentially thousands of item-container and container-container relationships, dividing thousands of relationships by 2 still leaves you with a lot of Parent-child relationships and a lot of Sibling relationships ripe for richer, semantically meaningful parsing and typing.
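As a rough illustration of how quickly these relationships pile up, the sketch below counts what a single new level adds. It rests on an assumption the text does not spell out: that the nC(n-1) + n formula is meant to count one Parent-child link per new folder plus one Sibling link per unordered pair of new folders, i.e. C(n, 2) + n. (Read literally, C(n, n-1) is just n.) The function names are made up for this example.

```python
from math import comb


def parent_child_links(n: int) -> int:
    """Each of the n new folders has exactly one link to its parent."""
    return n


def sibling_pairs(n: int) -> int:
    """Unordered pairs among n folders on the same level: C(n, 2)."""
    return comb(n, 2)


def new_relationships(n: int) -> int:
    """Relationships added by one new level of n folders (assumed reading)."""
    return parent_child_links(n) + sibling_pairs(n)


for n in (2, 5, 10, 100):
    print(n, parent_child_links(n), sibling_pairs(n), new_relationships(n))
```

Under this reading, a level of 10 folders adds 55 relationships and a level of 100 adds 5050, every one of them typed as either Parent-child or Sibling.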
The problem of generic categories
This same problem applies to the generic nature of the containers themselves in Hierarchies. Since all containers are created equal, there is no opportunity to express differences between
- a container that groups data together based on some shared characteristic (e.g. Digidesign plugins) versus
- a container that groups data together based on a different shared characteristic (e.g. VST plugins) versus
- meta-containers that are used to group not the plugins themselves but containers of plugins grouped by Application
In this sense you can think of the container Plug-ins as the Attribute name. And Digidesign and VST as the Attribute value.
- Plug-in: Digidesign
- Plug-in: VST
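Here is a small sketch of that idea: the folder name becomes an attribute name, and the subfolder names become its values. The item names (`reverb`, `delay`, `compressor`) and the `group_by` helper are hypothetical, chosen only to illustrate the Attribute name / Attribute value reading.

```python
from collections import defaultdict

# Hypothetical items; each carries "Plug-in" as an attribute name
# with Digidesign or VST as the attribute value.
items = [
    {"name": "reverb", "Plug-in": "Digidesign"},
    {"name": "delay", "Plug-in": "VST"},
    {"name": "compressor", "Plug-in": "VST"},
]


def group_by(records, attribute):
    """Group item names by the value they hold for one attribute."""
    groups = defaultdict(list)
    for record in records:
        groups[record[attribute]].append(record["name"])
    return dict(groups)


by_format = group_by(items, "Plug-in")
print(by_format)
```

The grouping that a hierarchy bakes into folder placement is here just one of many possible views, and the basis for the grouping is stored with the data instead of being left for the reader to infer.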
And then there's the issue of the Components folder, which isn't a grouping based on Application, but instead describes things that cross Applications.
The point we're beating to death here is that the subtle, contextual differences between Plug-in Components, Digidesign Plug-ins and VST Plug-ins are nowhere stored or expressed in the data system. The hierarchy is dumb and it is up to the human to sort out the contextual nuances. This of course means that the human must individually parse each folder in the hierarchy, one at a time. And given what we know about the strengths and weaknesses of how people understand information, such a system imposes formidable barriers to human comprehension of the data set as a whole.
- hierarchy_containertypes.png:
Most of the time, however, people don't even attempt to classify their containers. They simply build trees of characteristics or attribute values or tags and don't bother with typing the containers. This is fine in most cases, especially if you are a single user managing personal information. However, like most things having to do with organization, such undisciplined systems start to fall apart as soon as you try to share them with someone who is not already familiar with the data and did not take part in constructing the hierarchy.
If I were to share with you a bunch of folders labeled: 1/21, 2/21, 3/21, 4/21...would you be able to guess the contents of those folders? Does 1/21 refer to a date? Jan 21st? Or is this the first of twenty-one folders? And if it is a date, is it a folder of things created on Jan 21st? received on Jan 21st? due on Jan 21st?
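One way to picture the missing information is to write out what a typed container for "1/21" would have to record. The classes below are hypothetical, and the year is an assumption, since the label itself does not give one.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class DateRole(Enum):
    # Hypothetical roles a date-labelled folder could play.
    CREATED = "created on"
    RECEIVED = "received on"
    DUE = "due on"


@dataclass(frozen=True)
class DateContainer:
    role: DateRole
    day: date

    def describe(self) -> str:
        return f"items {self.role.value} {self.day.isoformat()}"


@dataclass(frozen=True)
class SequenceContainer:
    index: int
    total: int

    def describe(self) -> str:
        return f"folder {self.index} of {self.total}"


# Four different meanings that all collapse to the same bare label "1/21".
# The year 2024 is an assumption; the label does not say.
candidates = [
    DateContainer(DateRole.CREATED, date(2024, 1, 21)),
    DateContainer(DateRole.RECEIVED, date(2024, 1, 21)),
    DateContainer(DateRole.DUE, date(2024, 1, 21)),
    SequenceContainer(1, 21),
]

for candidate in candidates:
    print(candidate.describe())
```

An untyped folder stores only the string; the type and role that would settle the question live in the head of whoever made the folder.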
This inability to type or classify containers means, yet again, losing data, losing semantics and losing context to the generic dumbness of hierarchies.