In a state of nature...
In a state of nature, things exist clothed in the rich finery of who, what and how they are. The metadata of real-life stuff, insofar as it can be perceived by our six senses, is on full display. Colors, shapes and textures for the eyes. Smells and scents for the nose. Frequency, amplitude, harmony, sonority, and aural texture for the ears. Heat, cold, and tactile textures for our sense of touch. General attitude, sense of style, body language and tone of voice for our gut-sense.
Taking in a room of eight people, your senses have a wealth of information to feast on, but it's never overwhelming, even though if you were to count the actual number of discrete units of data, it would quickly start to feel like trying to count the stars.
- Brand of sneakers
- Color of shirt
- Cut of skirts
- Hair styles
- Eyeliner above the eye
- Eyeliner above AND below the eye
- Pierced ears
- Hip-hugger jeans
- Gender
- Ethnicity
- Mood
- Watches on the left wrist
- Watches on the right wrist
- No watches at all
- Casual chic
- Hipster
- Young
- Middle-aged
- Geriatric
- Infant
- Baseball hats
- Wedding bands
- Charm bracelets
...and the list goes on
This enormous amount of heterogeneous data would break most data management systems, but it doesn't break our brains because our brains are designed to grok just this kind of messy data. The converse way of putting it is that things in nature obey the PrinciplesOfGrok.
Data is presented as itself, not represented through some intermediate medium which encodes the data into some generic system of symbols (ie. alphabet) that you must then put effort into decoding.
The data presentation is static. The fundamental unit of data is a person and that never changes.
What can we learn from things as they exist in a state of nature?
- How can we simulate the way things exist in the physical world in the way we present intangible data to users?
- How do we maximize the amount of information people can absorb?
- How do we maximize the amount of knowledge people can extract from their data?
- How do we make the intangible more tangible?
Be item-centric, not group-centric.
- In the example cited above of eight diverse people in a room, the system is item-centric. The individual is the fixed point around which metadata groupings emerge. The individual exhibits their metadata (ie. height, girth, posture, hair color, eye color, clothing, fashion sense, gadgets, etc) and the viewer extracts from the display groupings based on that metadata (ie. tall, wide, erect, blondes, blue-eyed, skirts, mod, geeks, etc).
- This is as opposed to a system that is group-centric, where the fixed point is the group and the items are moved around in order to fit within the boundaries of the group (ie. folder hierarchies, Venn diagrams).
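As a rough illustration of the item-centric idea, here is a minimal Python sketch. The people, attribute names and the `group_by` helper are all hypothetical, invented for this example. Each item carries its own metadata, and groupings are derived from it on demand rather than by filing the item into a container.

```python
from collections import defaultdict

# Hypothetical items: each person carries their own metadata.
people = [
    {"name": "Ana", "hair": "blonde", "height": "tall", "watch": "left"},
    {"name": "Ben", "hair": "brown", "height": "tall", "watch": "none"},
    {"name": "Cy", "hair": "blonde", "height": "short", "watch": "right"},
]

def group_by(items, attribute):
    """Derive groupings from item metadata; the items themselves never move."""
    groups = defaultdict(list)
    for item in items:
        groups[item[attribute]].append(item["name"])
    return dict(groups)

by_hair = group_by(people, "hair")
by_height = group_by(people, "height")
```

The same person shows up in a hair-color grouping and a height grouping at once. In a folder hierarchy, that person would have to be filed in one place or copied.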
Metadata itself needs to be chunked to create a flexible framework that "normalizes" heterogeneous data
- This has to do with grouping attributes into Who, What, When, Where, Status and Value categories
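One way to picture this chunking is a lookup that files each attribute under one of the six categories. The sketch below is only an assumption about how that might look: the attribute names, the `CATEGORY_OF` table and the `chunk_metadata` helper are hypothetical and not part of any existing system.

```python
# Hypothetical assignment of attributes to the six categories named above.
CATEGORY_OF = {
    "author": "Who",
    "assignee": "Who",
    "kind": "What",
    "due_date": "When",
    "city": "Where",
    "triage": "Status",
    "priority": "Value",
}

def chunk_metadata(item):
    """Regroup a flat attribute dict under Who/What/When/Where/Status/Value."""
    chunked = {}
    for attribute, value in item.items():
        category = CATEGORY_OF.get(attribute, "Uncategorized")
        chunked.setdefault(category, {})[attribute] = value
    return chunked

# Two different kinds of item end up in the same six-slot framework.
email = {"author": "Ana", "kind": "email", "due_date": "2005-06-01", "triage": "Later"}
task = {"assignee": "Ben", "kind": "task", "city": "London", "priority": 2}
```

Heterogeneous items with different attributes can then be compared category by category, which is the sense in which the framework "normalizes" them.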
Use metadata to visually differentiate items in a consistent manner to facilitate pattern recognition and grouping.
- Groupings based on Date and Time should look different from groupings based on Location.
- Today's items should look different from Yesterday's items. London items should look different from Bangalore items.
Represent different metadata appropriately with semantically meaningful visual cues.
- Represent Dates and Times by laying items out on a Calendar or Timeline view
- Represent Locations by laying items out on a Map
- Represent things like Popularity, Ratings or Importance with the Size of the item (a la Flickr and Delicious)
- Where ordering matters: Attributes that are continuous, linear spectrums should be represented as axes along which the data is plotted. ie. Dates, relative priorities, relative urgency, geophysical locations, relative sizes.
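The pairing of attribute and visual cue described above could be written down as a simple table, with Size computed from a score. This is a sketch under assumed names: `VISUAL_CUE`, `font_size`, the point-size range and the 0 to 100 popularity scale are all illustrative choices.

```python
# Hypothetical lookup from attribute kind to visual cue, following the bullets above.
VISUAL_CUE = {
    "date": "timeline",
    "location": "map",
    "popularity": "size",
    "priority": "axis",
}

def font_size(popularity, lo=10, hi=32, max_popularity=100):
    """Scale a popularity score linearly into a point size between lo and hi."""
    clamped = max(0, min(popularity, max_popularity))
    return lo + (hi - lo) * clamped / max_popularity
```

With these defaults a score of 0 is drawn at 10 points, a score of 100 at 32 points, and anything above the maximum is clamped to 32.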
Piggyback on culturally understood visual cues.
- Shapes and glyphs. Stop signs, Men and Women bathroom signs, question marks, information "i", etc.
- Piggyback on cultural understanding of color: Red for Junk or Delete, Green for Todo, and Yellow for Deferred
- Don't use color if you're going to have more than 5 values, lest you get incidental groupings. (See TheUseOfColor)
- Represent Density or Depth with Saturation and Brightness
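The color guidelines above can be expressed as two small checks. The sketch below assumes a hypothetical triage palette and helper names. It uses Python's standard `colorsys` module, and it varies saturation only, with brightness held fixed for simplicity.

```python
import colorsys

# Hypothetical palette built from the color conventions listed above.
TRIAGE_COLORS = {"Junk": "red", "Delete": "red", "Todo": "green", "Deferred": "yellow"}
MAX_COLOR_VALUES = 5  # the limit suggested above

def can_color_code(values):
    """Return True if the attribute has few enough distinct values for color."""
    return len(set(values)) <= MAX_COLOR_VALUES

def density_to_rgb(density, hue=0.6):
    """Map a density in [0, 1] to an RGB triple by varying saturation only."""
    saturation = max(0.0, min(float(density), 1.0))
    return colorsys.hsv_to_rgb(hue, saturation, 1.0)
```

A density of 0 comes out white and a density of 1 comes out as the fully saturated hue, so denser regions read as more intensely colored.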
Case study: Maps
- Taken from the California map by Raven Maps
- california-popup.jpg:
Here is an enormous amount of data, consistently and meaningfully visually chunked and therefore eminently grokkable.
- Landscape is typed into: Desert, Verdant, Water and Man-made
- Man-made is further sub-typed into Municipalities and Roads
- Municipalities can be further sub-typed into: Capitals, Cities, Towns, Villages and Hamlets
- Roads can be further sub-typed into: Interstate highways, State highways, Highways, Roads, Streets and Dirt-roads.
Now, what would it be like to navigate the data in this map as a tree?
The "mapping" of the Cartesian plane to longitude and latitude effectively creates "Get and Forget" spectrum chunks:
- Things to the right are east
- Things to the left are west
- Things towards the top are north
- Things towards the bottom are south
- 5 general areas: Northeast, Southeast, Southwest, Northwest, Central
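The spectrum chunks listed above amount to a simple rule for turning a position on a north-up map into a compass label. Here is a minimal sketch. The function name, the coordinate convention (x grows to the east, y grows to the north) and the size of the "Central" zone are assumptions made for illustration.

```python
def compass_area(x, y, width, height, central=0.2):
    """Classify a point on a north-up map into one of five general areas.

    x grows to the east (right) and y grows to the north (up).
    `central` is an assumed fraction of each half-extent treated as Central.
    """
    dx = x - width / 2
    dy = y - height / 2
    if abs(dx) <= central * width / 2 and abs(dy) <= central * height / 2:
        return "Central"
    north_south = "North" if dy >= 0 else "South"
    east_west = "east" if dx >= 0 else "west"
    return north_south + east_west
```

For a 100 by 100 map, `compass_area(90, 90, 100, 100)` falls in the Northeast and `compass_area(50, 50, 100, 100)` in the Central area.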
Case study: Chandler Sticky Planning section
At OSAF, we've been dogfooding our own paper prototype of the ultimate data presentation and visualization tool. We've found it incredibly useful to plan out our releases on a "sticky" board, or rather several boards of stickies. And independent of the writing of this paper, Sheila, Lisa and Katie ended up with a sticky board very much in keeping with the spirit of PrinciplesOfGrok:
- The data is chunked in two ways: By release and By workflow tenet
- The chunks are transparent
- The chunks are presented by adorning the data items (individual stickies) with semantically meaningful visual cues
- Color represents high level workflow tenets: Scheduling (Calendar), Communications (Email), Collaboration (Sharing), Triage (Dashboard), Extensibility (Developer Platform), Overall usability (Visual and Interaction Polish) etc...
- Each board represents a release: 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 2.0
- The boards are arranged chronologically from left to right
Sticky board planning allows us to plan at a relatively micro-level (each sticky represents approximately 1 month's worth of work for a developer), but the overall effect of the micro-level planning is aggregated into a visual presentation that paints a big-picture view of the plan and facilitates top-down decision making.
- Are resources being evenly distributed across workflow tenets? Will all workflows converge and be usable by 1.0?
- Are the releases well-balanced? Given how many stickies we were able to deal with in past releases (0.4, 0.5), are we being realistic about future releases (0.7 and beyond)?
Some limitations of the paper-prototype:
- It's hard to represent more than 2 facets of information at a time. The ones we've chosen are release number and Workflow tenet.
- It would be nice to be able to show which layer of the app each sticky belonged to as well: UI layer, Services, Repository, Dev Platform
- It would be nice to be able to show which developer each sticky belonged to
- This would help with load balancing with respect to resources:
- Which teams are overloaded? Underloaded?
- Which developers are overloaded? Underloaded?
- It would be nice to be able to change on the fly, which 2 axes of information were displayed:
- Dot release versus Workflow track
- Dot release versus Layer of the App
- Dot release versus Developer
- Layer of the App versus Developer
- Workflow track versus Layer of the App, etc...
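Switching axes on the fly is easy to sketch once each sticky carries all of its facets as metadata. The example below is hypothetical: the sticky records, the field names and the `pivot` helper are invented here to show the idea, and they do not describe an existing Chandler feature.

```python
from collections import Counter

# Hypothetical stickies; the field names mirror the facets discussed above.
stickies = [
    {"release": "0.6", "tenet": "Scheduling", "layer": "UI", "developer": "dev-a"},
    {"release": "0.6", "tenet": "Collaboration", "layer": "Services", "developer": "dev-b"},
    {"release": "0.7", "tenet": "Scheduling", "layer": "UI", "developer": "dev-a"},
    {"release": "0.7", "tenet": "Scheduling", "layer": "Repository", "developer": "dev-b"},
]

def pivot(items, row_facet, column_facet):
    """Count items in each (row, column) cell for any two facets."""
    return Counter((item[row_facet], item[column_facet]) for item in items)

by_release_and_tenet = pivot(stickies, "release", "tenet")
by_layer_and_developer = pivot(stickies, "layer", "developer")
```

Because the facets live on the items, changing the pair of axes is just another call to `pivot`. The cell counts also give a quick read on which releases or developers are overloaded.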
http://wiki.osafoundation.org/pub/Projects/ChandlerHome/roadmap.html
[INSERT PICTURE OF STICKY BOARD]
Counter Case study: Boundary-based groupings: Trees, Treemaps and Venn diagrams
The methodology and examples presented above assume an item-centric approach to chunking and grouping data.
Most organizational paradigms emphasize the group over the item. They present you with opaque groupings of items contained within some bounded container (ie. a folder, a circle).
So instead of dressing items in a wealth of visual cues to represent metadata-based groupings (ie. all green items are unblocked tasks, all red items are blocked tasks), the system groups items by physically drawing them together into a bounded space, be it a folder in a hierarchy or a bubble in a Venn diagram.
However, the group-centric approach to presenting data has serious limitations.
- It violates the principle of grok which states that groupings must be transparent because people gain most of their understanding about a grouping from its member items, not from the name assigned to the group.
- It's hard to visually distinguish between different kinds of groups in a semantically meaningful way: Which groupings are based on the same attribute? (ie. All timeframe-based groupings).
- Color isn't always semantically meaningful. What color should date-based groupings be?
- Shape isn't always semantically meaningful. What shape should date-based groupings be?
- Texture fill and border style are hard to understand beyond 2-3 variations and have the same problem as Color and Shape. What border style should date-based groupings be?
- This is because Color, shape and texture are all things better suited to representing attribute values (ie. Now, Later, Done) than attributes (ie. Triage Status).
- Attributes are better represented by different types of visual cues: Colors for Triage status, Size for Popularity, Position along the Y-axis for Importance, Position along the X-axis for Urgency.
- It's almost impossible to decipher relationships between groups. (ie. There was a lot more infrastructure work done in the earlier releases.)
- It's impossible to lay out spectrum chunks in an order that makes sense (ie. Date-based groupings, rankings, ordering.)
For an example of a group-centric visualization of data:
http://netscan.research.microsoft.com/treemap/
- treemap.png: