ClassificationPaperCondensed - r6 - 12 Jul 2007 - 10:43:05 - MimiYin

Why do people organize

When it comes to software (Footnote: I make the distinction primarily because the nature of information in the virtual world is so different from that of physical objects in the real world: so much more malleable, ephemeral, metamorphic, numerous and variegated. In your house there may be a few dozen truly fundamentally different kinds of objects: clothing, appliances, furniture, fixtures, et cetera. On your computer there are more like hundreds or thousands of different kinds of information: [give examples].), organization really amounts to 2 things:

  1. Putting things somewhere so you can find them later
  2. Putting things together so you can see them together

The former amounts to large sets of data that share some common characteristic (e.g. a folder of everyone's RSVPs for a dinner party), one that you think you'll remember about them when it comes time to look for them (as opposed to, say, all items that happen to have the integer 3 in them). You don't necessarily ever look at the data set for the sake of looking at it as a group. The items themselves don't really cohere to form some kind of bigger picture in your mind. It's simply a means for targeted search and retrieval at some later date.

The latter usually results in small to medium groupings of data that oftentimes don't share a readily available common characteristic (e.g. a folder of emails you need to review before your one-on-one with your manager).

As search improves and tag- and label-based systems like Gmail, delicious and flickr spread, fewer and fewer people will feel the need to resort to actively organizing items into folders in order to accomplish #1. Groupings of kind #2 will always be around, but they may become ephemeral, going away once the task at hand has been done.

Does this mean that we will eventually live in a world where organization is irrelevant? Where the gathering of data into compartments will become unnecessary? Where the construction of those compartments into some kind of structure of groupings will become simply a relic of a more primitive era? Is Google really the end of the road?

What's missing from this picture is the guts of why people actually organize: To wrap their head around the data. To turn that data into some form of knowledge that wasn't there previously. To make sense of it all. To impose coherence, shape and scope where there was only a blob of stuff.

[INSERT DDC EXAMPLE]

This is why some people will try to bring items together in groupings and constructions of groupings no matter how advanced search technology gets. It's a basic human instinct, our tireless search to recognize patterns and extract meaning from everything we interact with. Another way of saying it is that we don't like to be overwhelmed, confused, or disoriented.

The principles of grok

People can't wrap their heads around more than 5-6 items at a time. 10 is stretching it. Any more than that is a sign that you need to chunk things down. It's not to say you can't display more than 5-6 pieces of data at any given time. No, that oversimplification has spawned too many vapid artsy websites that are generous on style and stingy with content. [NYT example]

It's just that the data needs to be grouped.

Principle #1 The more chunking the better. The fewer chunks the better.

2 and 1/2 approaches to organization

  • there are semantics to chunking in facets

The nature of hierarchies

Hierarchies are the ultimate chunking machine. They chunk items into containers of items (or folders). Then they chunk containers into more containers...such that in the end, you could have a tree of hundreds, thousands, millions, billions, infinite containers all contained in a single container. What a feat.

But God help you if you ever tried to open such a tree. The guts will pour out at first in a slow trickle, but with each successive turn of the caret, like Mother Ginger in the Nutcracker, more and more containers will flow out until you have a...really big tree.

In other words, hierarchies are manufacturers of "fake chunking". They reduce the amount of data to grok through a cheating mechanism of "hiding" most of the data such that you only see the tip of the iceberg. But in order to actually get at any of the data, you have to open the tree, and once you open the tree, you're immediately lost in an undifferentiated mass of folders within folders within folders within folders.
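The gap between what a collapsed tree shows and what it actually holds can be made concrete with a small sketch. The `Folder` class, the helper names and the uniform tree (4 levels, 5 sub-folders each) are assumptions made up for illustration; they are not taken from any particular file system or application.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Folder:
    """A container that holds nothing but more containers (hypothetical)."""
    name: str
    children: List["Folder"] = field(default_factory=list)


def visible_when_collapsed(root: Folder) -> int:
    """Only the root's immediate children are on screen."""
    return len(root.children)


def visible_when_expanded(root: Folder) -> int:
    """Every descendant is on screen once every caret has been turned."""
    return sum(1 + visible_when_expanded(child) for child in root.children)


def build_tree(depth: int, fanout: int, name: str = "root") -> Folder:
    """Build a uniform tree: each folder holds `fanout` sub-folders."""
    if depth == 0:
        return Folder(name)
    return Folder(
        name,
        [build_tree(depth - 1, fanout, f"{name}/{i}") for i in range(fanout)],
    )


tree = build_tree(depth=4, fanout=5)
print(visible_when_collapsed(tree))  # 5
print(visible_when_expanded(tree))   # 5 + 25 + 125 + 625 = 780
```

Collapsed, the tree looks like a comfortable 5 chunks. Fully opened, the same tree is 780 folders with nothing to tell them apart but their depth.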

GENERIC relationships

  • 100s of generic parent-child relationships
  • notion of parent-child is too generic...it's why the term parent-child is used, it's a geek term
    • people are used to specifics and content in the real world
    • if I say, the paper is inside the drawer, it means that the paper is physically inside the drawer
    • if I say that metaphysics is a branch of philosophy, it means that it is conceptually contained inside of philosophy
    • if I say that purple is a type of color, it means that purple is conceptually contained inside of color because it is an example of a color
    • if I say, economics is only one of the areas you study as a sociologist, it means that economics is a tool used by sociologists.
    • if I say, this subset of the population in Wisconsin all have blue eyes, it means that blue is a characteristic of a subset of people in Wisconsin
    • basically when you say "inside" to a human being, they take this very generic notion of parent-child and parse it into different flavors of "inside", effectively chunking it down, and based on the context in which it is used, they deduce the appropriate meaning of the word.
  • all of these things, however, would be represented in the same old generic parent-child paradigm of hierarchies, and they would be represented inaccurately
    • the paper is truly physically inside the drawer, there is no part of that piece of paper that is outside of the drawer.
    • but economics is not even conceptually inside sociology. it is a peer of sociology and should really be a "sibling" of sociology in the hierarchy. the only problem is that economics overlaps sociology, and in a hierarchy that can't happen. so in order to depict that overlap, you must put economics inside of sociology and sociology inside of economics...which is inaccurate and confusing because overlap is conceptually very different from being wholly contained inside something else, which is what parent-child connotes.
    • similarly, blue eyes is not really wholly contained inside of these people in Wisconsin. blue eyes is a trait that cuts across all people, even some animal species. and unlike economics and sociology, blue eyes is not a peer of Wisconsinites.
    • all three of these things are very different, yet in a hierarchy they are all represented as the same thing, which is why big hierarchies are overwhelming.
  • doesn't tell you what kind of relationship it is
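One way to picture what gets lost is to write the examples above as labeled relationships and then strip the labels, which is roughly what a plain folder tree does. The `Kind` names and the two helper functions are hypothetical, chosen only to mirror the flavors of "inside" listed above.

```python
from enum import Enum


class Kind(Enum):
    """Flavors of 'inside' from the examples above (names are made up)."""
    PHYSICALLY_INSIDE = "is physically inside"
    BRANCH_OF = "is a branch of"
    TYPE_OF = "is a type of"
    OVERLAPS = "overlaps with"
    TRAIT_OF = "is a trait found among"


# (child, kind, parent) triples, one per example in the list above.
relations = [
    ("paper", Kind.PHYSICALLY_INSIDE, "drawer"),
    ("metaphysics", Kind.BRANCH_OF, "philosophy"),
    ("purple", Kind.TYPE_OF, "color"),
    ("economics", Kind.OVERLAPS, "sociology"),
    ("blue eyes", Kind.TRAIT_OF, "people in Wisconsin"),
]


def as_generic_hierarchy(rels):
    """What a plain tree keeps: the (child, parent) pair, minus the kind."""
    return [(child, parent) for child, _kind, parent in rels]


def describe(rels):
    """What a person hears: the pair plus the flavor of 'inside'."""
    return [f"{child} {kind.value} {parent}" for child, kind, parent in rels]


for line in describe(relations):
    print(line)
print(as_generic_hierarchy(relations))
```

In the generic form, ("economics", "sociology") and ("paper", "drawer") are indistinguishable in shape, even though one is an overlap between peers and the other is physical containment.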

THERE'S ALSO THE PROBLEM OF GENERIC CATEGORIES

  • doesn't chunk things into category types

Faceted system

  • DIFFERENTIATES BETWEEN RELATIONSHIP TYPES
  • a faceted system treats everything either like economics and sociology: siblings, 2 things that are similar and might overlap, OR
  • as facets, independent traits that cut across each other, just as blue eyes cuts across all people. Wisconsinites have all kinds of eye colors.

  • ALSO DIFFERENTIATES BETWEEN CATEGORY TYPES

  • BUT ONLY ALLOWS FOR 1 KIND OF PARENT-CHILD RELATIONSHIP: Attributes and Attribute values (Color: Purple). The other kinds of P-C relationships, Physical and Conceptual, are not represented.

The result is more chunking. Easier to wrap your head around.
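A minimal sketch of the faceted idea, assuming items are plain records of Attribute: Value pairs. The people, the facet names and the `select` / `facet_values` helpers are all invented for illustration.

```python
# Each item carries independent facets (Attribute: Value pairs).
# The records below are made-up sample data.
people = [
    {"name": "Ann", "state": "Wisconsin", "eye_color": "blue"},
    {"name": "Bo", "state": "Wisconsin", "eye_color": "brown"},
    {"name": "Cy", "state": "Ohio", "eye_color": "blue"},
]


def select(items, **facets):
    """Keep the items that match every requested facet value."""
    return [
        item for item in items
        if all(item.get(key) == value for key, value in facets.items())
    ]


def facet_values(items, facet):
    """List the distinct values one facet takes across the items."""
    return sorted({item[facet] for item in items})


# Blue eyes cuts across states; it is not "inside" Wisconsin.
print([p["name"] for p in select(people, eye_color="blue")])
# Facets combine freely, in any order.
print([p["name"] for p in select(people, state="Wisconsin", eye_color="blue")])
# Each facet is its own small chunk of values.
print(facet_values(people, "eye_color"))
```

Neither facet is the parent of the other, so there is no question of whether eye color belongs under state or state under eye color.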

SO WHY WOULD YOU EVER USE A HIERARCHY?

Well, you could fake a lot of facets in the hierarchy

  • so you could fake it...
  • encode each level of the hierarchy with a facet
  • leave room for sub-folders
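Encoding each level of the hierarchy with a facet might look something like the sketch below. The wardrobe records, the facet names and the two helpers are assumptions for illustration only.

```python
from collections import defaultdict

# Made-up sample items, each described by a few facets.
wardrobe = [
    {"name": "oxford", "type": "shirt", "mood": "work"},
    {"name": "band tee", "type": "shirt", "mood": "weekend"},
    {"name": "chinos", "type": "pants", "mood": "work"},
]


def folder_path(item, facet_order):
    """One folder level per facet, in a fixed order."""
    return "/".join(str(item[facet]) for facet in facet_order)


def build_folders(items, facet_order):
    """File every item under the path its facet values spell out."""
    folders = defaultdict(list)
    for item in items:
        folders[folder_path(item, facet_order)].append(item["name"])
    return dict(folders)


print(build_folders(wardrobe, ["type", "mood"]))
# Choosing a different facet order means re-filing everything.
print(build_folders(wardrobe, ["mood", "type"]))
```

The facet order is baked into the folder paths, so looking at the same items mood-first instead of type-first requires rebuilding the whole tree.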

  • 4 types of containers: chunking
  • Differentiates between parent-child types, siblings and facets
  • Differentiates between category types

But you also get:

  • guided navigation
  • Ordering of the world

But the fixed structure is a limitation

  • can't change structure for different things
  • can't bubble up favorites
  • lose semantics

People try to do this anyway

  • they go back to the messy hierarchy we started out with
  • undifferentiated category types
  • undifferentiated relationship types


Let's go back to facets

  • solves the changing structure problem
  • solves the semantics problem

But you lose something too

  • you lose the ability to represent the other P-C types
  • you lose the guided navigation
  • you lose the ordering of things
  • amounts to asking people to come up with their own optimized sorting algorithms


Changing data: The element of time

  • both hierarchies and facets have a problem with this
  • hard to evolve the structure as you go along
  • what if you get a new shirt that doesn't fit into any of your moods
  • only slowly over time do you realize that you're developing a new taste for dockers and boat shoes
  • where do you put those things in the mean time?
  • in PIM land, this happens at a high speed in a thousand different categories constantly
  • takes time to see patterns emerge from our data
  • by then it's too late, it's either in a big pile in our Inbox or it's misfiled, shoehorned into an ill-fitting hierarchy
  • but it's still a shirt
  • so you create a shirt category at the top of the tree...mucking up your tree

  • Facets are more flexible...
    • there's no structure to muck up, you simply find it under shirts
    • the shirt pile at the top of the tree is not different from the shirt piles at the bottom of the tree
    • at least that's better
    • There almost needs to be a waiting pile for stuff you're not sure where to put. But sometimes we don't realize that something belongs to a new facet, and we cram it into old ones...and it's not until there is a lot of repetition, a symptom that the structure is wrong, that we notice
    • there needs to be a waiting pile


Tags
  • great for no commitment
  • but can't stand on their own
  • really no chunking
  • no category types
  • no category relationship types either
  • no visualization
    • like a CLI
    • can't tell where you are
    • can't say: i've done all the stuff on the left (e.g. hierarchy or faceted browsers)
    • items multiply like rabbits
    • can't look at neighbors


The problem of visualization:

Semantically meaningful visualization: why do you need it? Because you need to be able to chunk down the elements in the visualization into 4 quadrants and say why something is in the upper right versus the lower left. Calendars do this. Upper right is early in the month, late in the week. Lower left is late in the month, early in the week.

  • x-axis is the day of the week
  • y-axis is the week of the month
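
The calendar case can be sketched with Python's standard `calendar` module: a date's position in the month grid is fully determined by its week of the month (row) and its day of the week (column). The `grid_position` helper is hypothetical, and a Sunday-first week is assumed.

```python
import calendar
from datetime import date


def grid_position(day: date, first_weekday: int = calendar.SUNDAY):
    """Return (row, column) of `day` in its month grid.

    Row 0 is the first week of the month; column 0 is the first
    day of the week (Sunday by default, which is an assumption).
    """
    cal = calendar.Calendar(firstweekday=first_weekday)
    weeks = cal.monthdayscalendar(day.year, day.month)
    for row, week in enumerate(weeks):
        if day.day in week:
            return row, week.index(day.day)
    raise ValueError("day not found in its own month")


# Early in the month, late in the week: top row, last column.
print(grid_position(date(2007, 7, 7)))   # (0, 6)
# Late in the month, early in the week: bottom row, first column.
print(grid_position(date(2007, 7, 29)))  # (4, 0)
```

Because each axis means something, a position in the grid can be read without looking at the label in the cell.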
Open Source Applications Foundation
Except where otherwise noted, this site and its content are licensed by OSAF under a Creative Commons License, Attribution Only 3.0.
See list of page contributors for attributions.