
Meeting Notes -- 30 May 2003

When & Where

  • Friday 30 May 2003 -- 10:15 - 2:00
  • OSAF office in Belmont

Who attended

Agenda going in

Highlights of what we talked about

  • parcel programmers vs. "baseball fans"
    • One of the big themes of today's meeting was the idea that there are two distinct types of people who use schema information:
      • parcel programmers == These are people like the OSAF engineers, who are writing parcels like the OSAF calendar parcel and the OSAF e-mail parcel, as well as third-party programmers who are writing their own parcels.
      • "baseball fans" == These are end-users, but not the users of task-specific parcels like the calendar parcel. Rather, they are users of the general-purpose info management tools, like the SuperWidget.
    • Similar needs:
      • parcel programmers create schemas to represent domain-specific info, like calendar appointments, tasks, etc.
      • "baseball fans" create schemas to represent domain-specific info, like baseball teams, baseball players, baseball games, etc.
      • both types of people are happy to think about their domains in terms of items, and attributes, and kinds of item
    • Markedly different needs:
      • parcel programmers need fixed schemas, strong typing, and guaranteed enforcement of schema restrictions
      • "baseball fans" need flexible schemas, weak typing, and the ability to easily make ad-hoc changes
    • Different backgrounds:
      • parcel programmers are used to thinking about data modeling features from object-oriented programming languages -- inheritance, pointers, etc.
      • "baseball fans" are used to keeping their records in tools like Excel, or Filemaker Pro, or Access -- which offer slightly different data modeling features

  • schema changes
    • parcel programmers may make changes to the schema, but the changes will be carefully planned, and the new parcel code will be written so that it (a) can do data conversion on existing items, and (b) can deal with running into old-format data
    • "baseball fans" will also change the schema information, by doing things like adding attributes to a kind, or deleting attributes, or changing their names. Those changes will impact items that already exist. But baseball fans won't write data conversion routines, so Chandler will have to deal gracefully with that.
    • In both types of use case, there will be some thorny issues. Once we have a data model design that describes what schemas look like, we need to cycle back and think about what types of changes we will allow people to make to their schemas, and what we need to do to support those schema changes.
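For the parcel-programmer case, points (a) and (b) above might look roughly like the following Python sketch. Nothing here was decided in the meeting: the dict-based item, the "version" key, and the rename from "time" to "startTime" are all hypothetical, chosen only to illustrate a planned schema change.

```python
CURRENT_VERSION = 2  # hypothetical schema version for an appointment parcel


def upgrade(item):
    """(a) Convert an old-format item in place; a no-op for current items."""
    if item.get("version", 1) < CURRENT_VERSION:
        # Hypothetical change: version 2 renamed "time" to "startTime".
        if "time" in item:
            item["startTime"] = item.pop("time")
        item["version"] = CURRENT_VERSION
    return item


def start_time(item):
    """(b) Read the value even if the item has not been converted yet."""
    return item.get("startTime", item.get("time"))


old_item = {"title": "Lunch with Pat", "time": "12:00"}
value_before_upgrade = start_time(old_item)
upgraded_item = upgrade(old_item)
```

The "baseball fan" case is harder precisely because no one writes the equivalent of `upgrade`, so Chandler itself would need some default behavior for it.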

  • attribute inheritance via value inheritance
    • In yesterday's meeting we talked about "attribute inheritance" vs. "value inheritance". Today Andi pointed out that value inheritance could actually be used to implement attribute inheritance, by having Kind instances value-inherit their "Attribute Definition" values from across the super-kind Item-Ref.
    • A further optimization would be to cache, or "copy down" the Attribute Definitions in the Kind instance that inherits them.
    • Brian suggested that if you view value inheritance as a type of derivation rule, then the "copy down" optimization is an example of caching derived values, which may be generally important for indexing.
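A minimal Python sketch of the idea, assuming a Kind simply holds a list of Attribute Definition names and a plain object reference stands in for the super-kind Item-Ref. The class layout, the method names, and the "Event" super-kind are illustrative assumptions, not the design discussed in the meeting.

```python
class Kind:
    """Hypothetical Kind whose Attribute Definitions are value-inherited."""

    def __init__(self, name, super_kind=None, attribute_definitions=()):
        self.name = name
        self.super_kind = super_kind      # stands in for the super-kind Item-Ref
        self.sub_kinds = []
        self._local = list(attribute_definitions)
        self._copied_down = None          # "copy down" cache of the derived value
        if super_kind is not None:
            super_kind.sub_kinds.append(self)

    def attribute_definitions(self):
        """Local definitions plus those inherited across the super-kind reference."""
        if self._copied_down is None:
            inherited = self.super_kind.attribute_definitions() if self.super_kind else []
            self._copied_down = inherited + [a for a in self._local if a not in inherited]
        return list(self._copied_down)

    def add_attribute_definition(self, definition):
        self._local.append(definition)
        self._invalidate()

    def _invalidate(self):
        # A cached derived value must be dropped here and in every sub-kind.
        self._copied_down = None
        for sub in self.sub_kinds:
            sub._invalidate()


event = Kind("Event", attribute_definitions=["startTime", "endTime"])
appointment = Kind("Calendar Appointment", super_kind=event,
                   attribute_definitions=["location"])
```

The invalidation step is the cost of the optimization: any cached derived value has to be refreshed when the value it was derived from changes.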

  • lifecycle events
    • item instances go through lifecycle events, like instance creation, instance deletion, and instance cloning
    • Item-Refs can have definitions that dictate how to handle these lifecycle events -- when to do deep copies and when to do shallow copies -- whether to include sub-items in a delete operation
    • parcel programmers will think through these issues carefully, and may make schemas with carefully chosen Item-Ref settings
    • "baseball fans" will want to easily create new schemas, using simple default Item-Ref definitions that behave in simple, predictable ways
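One way to picture Item-Ref definitions that drive lifecycle handling is the Python sketch below. The field names `copy_policy` and `cascade_delete`, and the "attendees" and "notes" references, are hypothetical. The defaults (shallow copy, no cascading delete) are only a guess at what "simple, predictable" could mean, and the sketch assumes references do not form cycles.

```python
from dataclasses import dataclass, field


@dataclass
class ItemRefDefinition:
    """Hypothetical lifecycle settings attached to a kind of reference."""
    name: str
    copy_policy: str = "shallow"   # "shallow" shares the target, "deep" clones it
    cascade_delete: bool = False   # whether deleting the owner deletes the target


@dataclass
class Item:
    name: str
    refs: dict = field(default_factory=dict)  # ref name -> (definition, targets)
    deleted: bool = False

    def link(self, definition, target):
        self.refs.setdefault(definition.name, (definition, []))[1].append(target)

    def clone(self):
        copy = Item(self.name)
        for definition, targets in self.refs.values():
            for target in targets:
                deep = definition.copy_policy == "deep"
                copy.link(definition, target.clone() if deep else target)
        return copy

    def delete(self):
        if self.deleted:
            return
        self.deleted = True
        for definition, targets in self.refs.values():
            if definition.cascade_delete:
                for target in targets:
                    target.delete()


attendees = ItemRefDefinition("attendees")  # default settings
notes = ItemRefDefinition("notes", copy_policy="deep", cascade_delete=True)

lunch = Item("Lunch with Pat")
pat = Item("Pat")
note = Item("Bring the slides")
lunch.link(attendees, pat)
lunch.link(notes, note)

lunch_copy = lunch.clone()
lunch.delete()
```

Here the copy shares Pat but gets its own note, and deleting the original appointment takes its note with it while leaving Pat alone.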

  • "domain attributes" vs. "house-keeping attributes"
    • any given item will have both domain attributes and house-keeping attributes
    • domain attributes are things that the end-user cares about, like a baseball player's "name", or "age", or "batting average"
    • house-keeping attributes are things that the Chandler infrastructure code cares about, like "last-modified" time, or "version" number, or a "logically deleted" flag
    • domain attributes should always be visible to the user
    • house-keeping attributes may frequently be invisible to the user, although in some cases the user might want to be able to look at them (e.g. "creation date")
    • probably users should never be able to edit house-keeping attributes directly
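As a rough Python illustration of the split, the sketch below keeps the two groups of values apart and rejects user edits to house-keeping attributes. The identifier names in `HOUSEKEEPING_NAMES` and the `user_set` / `visible_attributes` methods are assumptions made for this example.

```python
import time

# Hypothetical identifier names for the house-keeping attributes mentioned above.
HOUSEKEEPING_NAMES = {"creationDate", "lastModified", "version", "logicallyDeleted"}


class Item:
    def __init__(self, **domain_values):
        now = time.time()
        self._domain = dict(domain_values)
        self._housekeeping = {
            "creationDate": now,
            "lastModified": now,
            "version": 1,
            "logicallyDeleted": False,
        }

    def user_set(self, name, value):
        """Edit path for end-users: domain attributes only."""
        if name in HOUSEKEEPING_NAMES:
            raise PermissionError(f"{name} is maintained by the infrastructure")
        self._domain[name] = value
        # The infrastructure, not the user, updates the house-keeping values.
        self._housekeeping["lastModified"] = time.time()
        self._housekeeping["version"] += 1

    def visible_attributes(self, show_housekeeping=False):
        """Domain attributes are always shown; house-keeping ones only on request."""
        visible = dict(self._domain)
        if show_housekeeping:
            visible.update(self._housekeeping)
        return visible


player = Item(name="Pat", battingAverage=0.301)
player.user_set("age", 27)
```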

  • "display names" vs. "identifier names"
    • an attribute definition can have a display name, which is what the end-user sees -- e.g. "Start Time"
    • an attribute definition can have an identifier name, which is something that might appear in Python code -- e.g. "startTime"

  • terminology
    • we settled on the following terminology:
      • reserved words:
        • "Item" -- a bunch of attribute values -- pretty much everything is an item -- e.g. "Lunch with Pat"
        • "Attribute Definition" -- e.g. "Start Time"
        • "Kind" -- a category of items -- a Kind has a set of Attribute Definitions -- e.g. "Calendar Appointment"
        • "Item-Ref" -- a reference from one Item to another -- e.g. "Employees<-->Department"
        • "Domain Schema" -- a set of Kinds and global Attribute Definitions -- e.g. the "Baseball Schema" or the "Chandler PIM Schema"
      • non-reserved words:
        • "thing" -- no special meaning in Chandler -- just another fuzzy English language word
        • "schema" -- no special meaning in Chandler -- just another fuzzy English language word
    • Andi raised the point that we're using the term "Item" both down in the "Building Blocks" layer and up in the higher level layers (actually all the way up to end-user terminology). Are the Building Block "Items" the same thing as higher level "Items"? If not, then we should probably have different terms to distinguish them.

  • "global attributes" vs. "local attributes"
    • we resolved to support "global attributes", shared between Kinds
    • we resolved to also support "local attributes", specific to a single Kind

  • "sub-attributes"
    • we talked about RDF's idea of sub-attributes
    • we resolved not to think too hard about this right now, on the theory that it should be something that could be added after 1.0 without breaking anything (although we did note that it might be difficult to write database code that would be efficient when processing queries on a super-attribute)

  • diagram
    • We settled on a diagram showing how the schema info is organized. I don't have a good way to reproduce the diagram here, but here's what it shows:
      • A Domain Schema item has a collection of Kind items
      • A Domain Schema item has a collection of Attribute Definition items, representing global attributes
      • A particular data item (e.g. "Lunch with Pat") has a defining Kind item
      • A Kind item has a collection of Attribute Definition items
        • some of those Attribute Definition items may be local to this Kind item
        • some of those Attribute Definition items may be global attributes that are in the collection of Attribute Definition items pointed to by the Domain Schema item that is pointed to by the Kind item
        • some of those Attribute Definition items may be "imported" global attributes that are in the collection of Attribute Definition items pointed to by some unrelated Domain Schema item
      • An Attribute Definition item may be pointed to by more than one Kind item
      • An Attribute Definition item may be pointed to by at most one Domain Schema item
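In the absence of the diagram, the relationships it shows can be approximated in Python. This is a sketch only: the class and method names are assumptions, ordinary object references stand in for Item-Refs, and the "team" and "location" attributes are made up for the example.

```python
class AttributeDefinition:
    def __init__(self, identifier_name):
        self.identifier_name = identifier_name
        self.domain_schema = None  # set only for global attributes; at most one
        self.kinds = []            # may be pointed to by more than one Kind


class Kind:
    def __init__(self, name, domain_schema):
        self.name = name
        self.domain_schema = domain_schema
        self.attribute_definitions = []

    def add_attribute(self, definition):
        self.attribute_definitions.append(definition)
        definition.kinds.append(self)

    def classify(self, definition):
        """Report which of the three cases in the diagram a definition falls into."""
        if definition.domain_schema is None:
            return "local"
        if definition.domain_schema is self.domain_schema:
            return "global"
        return "imported"


class DomainSchema:
    def __init__(self, name):
        self.name = name
        self.kinds = []
        self.global_attributes = []

    def add_kind(self, name):
        kind = Kind(name, self)
        self.kinds.append(kind)
        return kind

    def add_global_attribute(self, definition):
        if definition.domain_schema is not None:
            raise ValueError("an Attribute Definition belongs to at most one Domain Schema")
        definition.domain_schema = self
        self.global_attributes.append(definition)
        return definition


class Item:
    def __init__(self, name, kind):
        self.name = name
        self.kind = kind           # the defining Kind item


pim = DomainSchema("Chandler PIM Schema")
baseball = DomainSchema("Baseball Schema")
start_time = pim.add_global_attribute(AttributeDefinition("startTime"))
team = baseball.add_global_attribute(AttributeDefinition("team"))
location = AttributeDefinition("location")

appointment = pim.add_kind("Calendar Appointment")
for definition in (start_time, team, location):
    appointment.add_attribute(definition)

lunch = Item("Lunch with Pat", appointment)
```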

  • Attribute Definitions vs. Attribute Bindings
    • When a Kind item includes an Attribute Definition item, the Kind item uses all the general information defined in the Attribute Definition item, which is shared by all the Kind items that use the Attribute Definition item
    • In addition, there may be some specific information particular to the use of the Attribute Definition item in this specific Kind item -- information that's not associated with the Attribute Definition item itself, but with the binding of the Attribute Definition item to the Kind item
    • Here's a breakdown of what we decided about what information should be associated with the Attribute Definition item and what should be associated with the binding
      • Attribute Definition info
        • "type"
          • could be something like int, float, string, date...
          • could be a specific sort of Item-Ref
          • could be "Any", meaning any of the above
        • "one vs. many"
          • this is really "cardinality" info -- but we want to be clear that we're only offering the two choices, "one" and "many", rather than more complicated things like "4" or "6 to 8"
          • defaults to "many" when a "baseball fan" creates a new Attribute Definition
        • "identifier name"
          • used as a Python token -- e.g. "startTime"
        • "display name"
          • appears in the UI -- e.g. "Start Time"
          • can be a simple ASCII string, or a Unicode string, or a "Polyglot string" (meaning a dictionary of localized string translations, keyed by language)
      • Attribute Binding info
        • "required"
          • a boolean value -- means the same as "not null" -- the attribute must be included in every instance, and must always have a value
    • There are a few different options for storing Attribute Binding info:
      • We could have separate Attribute Binding items between a Kind item and an Attribute Definition item
      • We could somehow associate it with the Item-Ref that relates a Kind item to an Attribute Definition item
      • We could somehow associate it with the (attribute of the Kind item) that points to the Item-Ref that points to the Attribute Definition item
        • We might be able to do this just by using our "Compound Attribute" idea
      • We didn't pick which option we want -- we'll cross that bridge when we come to it
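To make the split concrete, here is a Python sketch of the first storage option only (separate Attribute Binding items). It is not a choice made in the meeting. The field names, the `missing_required` and `display_name` helpers, the "Task" kind, and the French translation are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class AttributeDefinition:
    """Information shared by every Kind that uses the definition."""
    identifier_name: str       # Python token, e.g. "startTime"
    display_name: Any          # plain string, or dict keyed by language ("Polyglot string")
    type_name: str = "Any"     # e.g. "int", "date", a sort of Item-Ref, or "Any"
    many: bool = True          # "one vs. many"; defaults to "many"


@dataclass
class AttributeBinding:
    """Information specific to one Kind's use of a definition."""
    definition: AttributeDefinition
    required: bool = False     # same meaning as "not null"


@dataclass
class Kind:
    name: str
    bindings: list = field(default_factory=list)

    def bind(self, definition, required=False):
        self.bindings.append(AttributeBinding(definition, required))

    def missing_required(self, values):
        """Identifier names of required attributes that have no value."""
        return [b.definition.identifier_name for b in self.bindings
                if b.required and values.get(b.definition.identifier_name) is None]


def display_name(definition, language="en"):
    name = definition.display_name
    if isinstance(name, dict):
        return name.get(language, definition.identifier_name)
    return name


start_time = AttributeDefinition(
    "startTime", {"en": "Start Time", "fr": "Heure de début"},
    type_name="date", many=False)

appointment = Kind("Calendar Appointment")
appointment.bind(start_time, required=True)
task = Kind("Task")
task.bind(start_time)          # same definition, different binding info
```

The same `start_time` definition is required in one Kind and optional in another, which is exactly the information that cannot live on the shared definition.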

  • "Emergent" typing for kinds vs. Declarative typing for kinds
    • resolved to pick declarative typing for kinds
    • we won't provide any direct support for "emergent" typing, although a third-party parcel developer could write a parcel that had this feature

Game plan coming out

  • John, Andi, Katie, Brian
    • follow-up meetings next week to keep deciding what features should be provided by the Data Model


-- BrianDouglasSkinner - 02 Jun 2003
