How to grok such unruly information?
Definition of Grok: What we're aiming for
From en.wikipedia.org/wiki/Grok
Grok is a verb roughly meaning "to understand completely" or, more formally, "intuitive understanding". The term originated in Robert Heinlein's novel Stranger in a Strange Land, where it is part of the fictional Martian language and is introduced to English speakers by a man raised by Martians.
The Principles of Grok
The magic number 7
People can hold 7 (+/- 2) units of information in short-term memory, and that's if they're trying.
This little nugget of wisdom has resulted in a lot of minimalist designs that are stingy with content to the point of being unusable.
- Hierarchies chunk down data by hiding data in opaque folders.
- Wizards chunk down data by hiding data behind opaque tabs.
- Microsoft Bob anyone?
"People can hold 7 (+/- 2) units of information in short-term memory" ≠ "People can only handle 7 (+/- 2) units of information."
If human cognition were really as limited as some designs assume when they take the 7 (+/- 2) figure literally, AI wouldn't be the insurmountable wall it remains today.
A more nuanced statement might be: Any quantity of information is digestible so long as it is chunked down to 7 (+/- 2) chunks of information and within each chunk, the information is further chunked down to 7 (+/- 2) chunks of information and so on and so forth.
In fact, information is infinitely more digestible if the process of chunking it down doesn't end up hiding the information inside opaque containers (i.e. folders or tabs in a wizard). Instead, as much information as possible should remain exposed in order to fully express the texture and individuality of the chunk.
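The nested chunking described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the function names are made up for this page, and the sketch groups items purely by position, whereas a real chunk has to be formed around meaning. What it does show is that the result stays fully exposed (nested lists, not opaque containers) and that no level ever holds more than 7 entries.

```python
def chunk_down(items, limit=7):
    """Nest items so that no level holds more than `limit` entries.

    Hypothetical helper: it chunks by position only, which is a
    stand-in for the conceptual grouping the text actually calls for.
    """
    if limit < 2:
        raise ValueError("limit must be at least 2")
    items = list(items)
    while len(items) > limit:
        # Group neighbours into chunks of up to `limit` entries, then
        # treat each chunk as a single entry one level further up.
        items = [items[i:i + limit] for i in range(0, len(items), limit)]
    return items


def flatten(tree):
    """Walk the nested chunks and yield the original items in order."""
    for node in tree:
        if isinstance(node, list):
            yield from flatten(node)
        else:
            yield node


tree = chunk_down(range(100))
print(len(tree), "top-level chunks")
print(len(list(flatten(tree))), "items still reachable")
```

Nothing is thrown away or hidden by the chunking: flattening the tree gives back every original item in its original order.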
Case study: NYT
There is a lot of information on the NYT homepage, but it's digestible because it is well-chunked and the chunks are further prioritized by size and placement, effectively ordering or chunking the flow of data to the reader with respect to time.
- NYT_analysis.png:
How much less comprehensible is the NYT homepage when the chunks are opaque?
- NYT_analysis_opaque.png:
Oftentimes, people will start to group items before they can articulate the name of the group, implying that it's the contents of the grouping that matter, not the name of the grouping. Folder names are often inaccurate representations of what's inside, either because they're way too generic or because it's way too hard to articulate all of the nuances of what's inside the folder. (For example, I have a Reviews folder which contains all emails about HR performance reviews, but the name could just as easily refer to product reviews or concert reviews. Only I understand the context in which the folder was created, so only I understand what the folder name refers to.)
This has serious implications for well-established workflows for grouping items, where you first create and name a folder and then add items to it, or where you find items by browsing opaque folders and the only clue you have is the name of the folder. These workflows work for personal information (because most of the context is stored in our heads), but fail in bottom-up collaborative environments (e.g. wikis) where individuals are creating ad-hoc, unregulated structure.
Case study: DDC
Below are the 0-100 categories of the Dewey Decimal System. Notice how the parts of the listing that are most comprehensible are the parts that have a lot of repetition, in other words, the parts that are chunked down into groupings: General encyclopaedic works, General serials & indexes, General organization and museology, etc. [Use this example again for Chunking through time]
- generalities.png:
A morsel for a monarch... is one that lets you Get and Forget
The process of "chunking" down data to the requisite 7 (+/- 2) morsels is complex and filled with pitfalls. The making of a morsel is an art form.
- A well-formed morsel can take on an infinite number of items and still be easily digested.
- An ill-formed morsel could have just 2 items in it and still be incomprehensible.
Ultimately, the true measure of worth for any morsel of information is how easy it is to Get and Forget the chunk.
A group of items constitutes a grokkable chunk when the individual items coalesce around an indivisible concept or a spectrum concept.
An indivisible concept group usually means a relatively small set of data, fewer than 10 items. Chances are, if you have any more than 10 items, you have surpassed the limits of human memory and the chunk can probably be further sub-divided into smaller conceptual chunks.
A spectrum group is a set of items of uniform size that fill out a conceptual spectrum from end-to-end with no gaps and no areas of overlap. Spectrum groups can accommodate a potentially infinite set of items simply because they allow you to Get some governing principle for how items end up in the group and Forget about needing to parse the individual items in the group. By getting the underlying mechanism by which the group is formed, you can predict its membership without actually having to examine the contents of the group.
Example of a mal-formed chunk:
- South America
- New York
- Starbucks on 81st and 2nd Avenue
- Yunnan Province
- 40°29'40"N to 45°0'42"N 71°47'25"W to 79°45'54"W
- My bedroom
- Pacific Northwest
- Conference room A/1742
- 231 Main Street
- Apt 4A
- The Bowery
Example of a well-formed indivisible concept group: Locations
- 231 Main Street
- 15 23rd Street
- 1 5th Avenue
- 82 5th Avenue
- 1992 Lafayette Street
Example of a well-formed spectrum concept group: Months of the Year
- January
- February
- March
- April
- May
- June
- July
- August
- September
- October
- November
- December
Example of a well-formed spectrum concept group: Integers: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50...
Example of a harder to grok spectrum concept group: Fibonacci sequence: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89...
The goal is to construct conceptual groupings that allow people to instantaneously understand the size, shape and nature of the data set, either because it is so small that they can hold the entire data set in short-term memory or because it is such a homogeneous spectrum that someone could look at just a handful of items in the set and predict the rest of the spectrum. Get and Forget.
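One way to picture Get and Forget is to treat a spectrum group as a rule rather than a list. The Python sketch below is an illustration only, with function names invented for this page. Each group is a generator plus a membership test: once you have the rule, you can predict membership without ever storing or scanning the items.

```python
from itertools import islice


def positive_integers():
    """The Integers spectrum: 1, 2, 3, ... with no end."""
    n = 1
    while True:
        yield n
        n += 1


def fibonacci():
    """The Fibonacci spectrum: 0, 1, 1, 2, 3, 5, ..."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


def is_positive_integer(x):
    # The rule is so simple that membership is a single comparison.
    return isinstance(x, int) and x >= 1


def is_fibonacci(x):
    # The rule is still complete, but checking it means walking the
    # sequence until we reach or pass x.
    for value in fibonacci():
        if value >= x:
            return value == x


print(list(islice(fibonacci(), 12)))
print(is_positive_integer(1_000_000), is_fibonacci(144), is_fibonacci(100))
```

Both groups are infinite and both are fully determined by their rule. The difference in how much work the membership test takes is one way to read why the Fibonacci example is the harder one to grok.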
A well-formed spectrum concept group is filled out end-to-end and has no areas of overlap
Filled out end-to-end means that you never have to wonder whether you're missing out on some cache of misfiled information.
- This week's status
- Monday
- Tuesday
- Wednesday
- Friday
- Where's Thursday?
People often deduce the right answer by eliminating all of the obviously wrong answers, thereby whittling the list of choices down to the one that sounds the least wrong (SATs, anyone?). As a result, not being able to see all of the possible answers can sometimes be paralyzing.
No overlap diminishes confusion and ambiguity about where to look to find information.
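Both properties, no gaps and no overlap, can be checked mechanically once the spectrum is written down. The Python sketch below is a rough illustration: the helper names are invented, the Monday-to-Friday work week is an assumption read into the status example above, and the overlapping "Spring" folder is a made-up case.

```python
def find_gaps(expected, actual):
    """Return the members of the expected spectrum that are missing."""
    present = set(actual)
    return [item for item in expected if item not in present]


def find_overlaps(ranges):
    """Return pairs of group names whose [start, end) ranges overlap."""
    ordered = sorted(ranges.items(), key=lambda pair: pair[1])
    overlaps = []
    for i, (name_a, (_, end_a)) in enumerate(ordered):
        for name_b, (start_b, _) in ordered[i + 1:]:
            if start_b >= end_a:
                break  # sorted by start, so nothing later overlaps a
            overlaps.append((name_a, name_b))
    return overlaps


# Assumed spectrum: a Monday-to-Friday work week.
work_week = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
status = ["Monday", "Tuesday", "Wednesday", "Friday"]
print(find_gaps(work_week, status))

# Hypothetical folders covering month numbers as [start, end) ranges.
folders = {"Q1": (1, 4), "Q2": (4, 7), "Spring": (3, 6)}
print(find_overlaps(folders))
```

A gap tells the reader that something may be misfiled; an overlap tells them there are two places to look. Either one undermines the ability to Get and Forget.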
Case study: Apple menus
The Mac's menu design illustrates the importance of filled-out spectrums to people's ability to deduce the right answer. Menu items on the Mac never change, even when they are meaningless in the current view. For example, the Mail.app Mailbox menu always offers both Go Online and Go Offline as options, even if the user is already Online or Offline.
The user might not know a priori that the Go Offline menu item will make the little whirly-gig animation stop going round and round. The greyed-out Go Online menu item helps: it tickles their memory and reminds them how they got the app to "Start Syncing" in the first place.
- mailboxes_menu.png:
Maintain stable context
- Related data should accrete as overlays or additions to the view rather than replace one another in mutually exclusive views (see TheNatureOfFacetedSystems for an example of the latter).
- Data in physical proximity is easier to grok than data in temporal proximity.
- In other words, it's hard to understand relationships between two sets of data if you can't see them together (e.g. What is the relationship between Urgency and Priority?)
First the Priority list...
- Study for mid-terms (I'll fail if I don't do well. Lose my scholarship and get booted out of housing.)
- Call the plumber (Toilet's been a pain in the...)
- Get a free Nano on eBay deal (Sweet!)
- Get more aspirin (I'm
- Pay Billy back (Whatever, not my money.)
Then the Urgency list... (Don't cheat and look at both lists at the same time.)
- Get a free Nano on eBay deal (Deal expires in 5 seconds!)
- Call the plumber (The toilet overflowed and I have to use it now! but I could probably hold it for 5 seconds.)
- Pay Billy back (Billy's my plumber, but he'll do it if I beg!)
- Get more aspirin (I've got a headache now, but I'll live if I can't find it.)
- Feed the cat (Incessant meowing is driving me nuts, but she'll live if I can't feed her just now.)
- Study for mid-terms (I've got a week!)
Compare the lists above to the graph below. How does the overall picture change when you can see Priority AND Urgency metadata overlaid on top of each other? This is the kind of synthetic thinking we do in our minds that allows us to have a gut intuition about the order in which we should proceed with our tasks. But when we're dealing with:
- Large sets of data (hundreds rather than dozens of tasks) and/or
- Unfamiliar sets of data (team tasks rather than personal tasks)
the ability to do all of this processing in our heads becomes challenging, if not impossible. This is often when we start to say things like:
"I feel overwhelmed." OR
"There's so much to do, I don't know where to start."
- PriorityVersusUrgency.png:
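The same overlay can be sketched in code. The Python snippet below is a minimal illustration, not the method used to produce the graph: it takes the two lists exactly as written above, treats position in each list as rank (an assumption), and puts both ranks side by side in one row per task. The `overlay` function name is invented for this page. "Feed the cat" only appears in the Urgency list, so its priority shows up as `None`.

```python
priority = [
    "Study for mid-terms",
    "Call the plumber",
    "Get a free Nano on eBay deal",
    "Get more aspirin",
    "Pay Billy back",
]
urgency = [
    "Get a free Nano on eBay deal",
    "Call the plumber",
    "Pay Billy back",
    "Get more aspirin",
    "Feed the cat",
    "Study for mid-terms",
]


def overlay(priority, urgency):
    """Return (task, priority rank, urgency rank) rows, ranks 1-based."""
    p_rank = {task: i for i, task in enumerate(priority, start=1)}
    u_rank = {task: i for i, task in enumerate(urgency, start=1)}
    # dict.fromkeys keeps first-seen order and drops duplicates.
    tasks = list(dict.fromkeys(priority + urgency))
    return [(task, p_rank.get(task), u_rank.get(task)) for task in tasks]


for task, p, u in overlay(priority, urgency):
    print(f"{task:<30} priority={p} urgency={u}")
```

With both ranks in one row, a task like studying for mid-terms (top priority, least urgent) stands out immediately, which is the relationship that is hard to see when the two lists are read one after the other.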
Keep hierarchies shallow
- 2 levels of chunking is the maximum most people can grok (see NYT example above)
- Check out this treemap which tries to show you all the levels of the hierarchy at once. You can see which branches are the most deeply nested better than you can see in a traditional hierarchical view (primarily because the boundaries of the containers are clearer), but that's about all you can see. http://netscan.research.microsoft.com/treemap/
- See ChunkingOverTime for a more nuanced discussion of this issue
- See how the boundaries of containers are harder to discern in traditional tree views than in the treemap
- See how the relative sizes of containers are harder to discern in traditional tree views than in the treemap
- Later on in this paper, we will see more sophisticated data presentation techniques that can get around this cognitive limitation.
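As a rough aid for applying the 2-level guideline, the Python sketch below measures how deeply a hierarchy is nested and lists the paths that go past the limit. Everything here is hypothetical: the hierarchy is modelled as a dict of dicts, and the section names are made up rather than taken from the NYT example.

```python
def depth(tree):
    """Count the levels of nesting in a dict-of-dicts hierarchy."""
    if not isinstance(tree, dict) or not tree:
        return 0
    return 1 + max(depth(child) for child in tree.values())


def paths_deeper_than(tree, limit=2, path=()):
    """Yield the path of every entry nested more than `limit` levels."""
    for name, child in tree.items():
        here = path + (name,)
        if len(here) > limit:
            yield "/".join(here)
        if isinstance(child, dict):
            yield from paths_deeper_than(child, limit, here)


# Hypothetical site structure; an empty dict marks a leaf.
site = {
    "News": {"World": {"Europe": {}}, "Business": {}},
    "Opinion": {},
}

print(depth(site))
print(list(paths_deeper_than(site)))
```

The paths it reports are candidates for flattening, or for one of the presentation techniques that keep deeper levels visible instead of buried.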