Engineering Proposal for Exportable addresses (2-layer addressing)
Introduction
The Chandler repository currently has its own layer of addresses -- repository paths and UIDs. These internal addresses aren't intended to have any semantic meaning or relationship to other features.
When Chandler makes its content browsable, it will have to expose external addresses for certain items. These external addresses could be defined only for one protocol/purpose (e.g. for WebDAV support, or for FTP support) and thus be special-purpose designed for each problem that needs to be solved. Or, rather than define special-purpose addresses, we could have a friendly item address scheme for broad usage. We prefer to attempt a single exportable addressing scheme, which uses friendly and flexible paths for parcels and applications to refer to items, and as a basis for external addresses for a number of protocols.
This proposal might add a new layer without affecting repository paths, or it might replace repository paths with a more flexible layer used both by parcel developers and export processes. However, this proposal is not intended to replace UUIDs, either for parcel developer usage or to permanently identify items.
We expect that exportable addresses will include, as a subset, user-facing addresses. Naive users might rarely see these addresses, but sophisticated users blur into programmers and would see these addresses occasionally. User-facing design requirements are being developed by the design team. See
DesignRequirmentsNamesAndAddresses.
Overview
These are the proposed characteristics of an exportable address:
| Characteristic | Explanation |
| Parent/child and slash-delimited | An exportable address is slash-delimited (like file system and HTTP addresses) to indicate a parent/child relationship, because WebDAV requires that. An item is a member of the collection identified by the address after removing the item's path segment (the last segment of the path). It is a WebDAV requirement that a child URL be dependent on that of its parent. |
| Not a completely flat address space | Exportable addresses must not simply be UUIDs (or any other 1-layer structure) because that offers no efficient way to browse a remote repository. Instead we must take advantage of parent collections, so that remote clients can ask for the list of children of the parent collection. That means at least a 2-layer structure, if not n-layer. |
| Independent of storage location | The exportable address has no relationship to how the items are stored -- the parent/child relationship is primarily there for purposes of remote browsing, not for storage. |
| Generated on creation | The exportable address is automatically generated, path-segment by path-segment, as a collection, view or item is created. A path-segment MAY be generated from an item's display name, but need not be. |
| More than one 'binding' | A given exportable address is not necessarily the only address for an item. There may be multiple bindings to the same item. This allows users to share multiple collections, with the same resource linked into each collection, with a consistent collection-derived address space. It's also been my experience in developing several address-based repositories that bindings are eventually required but much easier if planned from the start. This is one of the ways in which exportable addresses are clearly different from repository paths, because in the repository implementation, each item has only one repository path. See 'bindings' below. |
| Not normally visible in GUI | An exportable address tree is not the normal way users find local items in the GUI, in fact there may be very little visible use of exportable addresses in the GUI. The user will start from the sidebar and jump around following links. |
| Rarely change | Exportable addresses rarely change -- they are permanent unless the user chooses to make an exportable address invalid. Exportable addresses may be derived from strings intended for display to the user, but if the display strings change we don't plan to change the exportable address. The address also must not change if the item's content changes, or even if a new version is created (HTTP has ETags to see if the resource located via a URL has changed). |
| [OI] ACL inheritance follows parent/child | The parent/child relationship MAY have implications for ACL inheritance. If users drag an item to a "sharing" address namespace this could result in making that item publicly or partially readable. |
| [OI] Search may follow parent/child | The parent/child relationship MAY have useful implications for searching. If a user links a set of items to a knitting-stuff namespace then a search on that namespace should potentially match on all items linked into that namespace. |
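The parent/child rule in the table (an item is a member of the collection found by removing the last path segment) can be sketched in a few lines of Python. The function name `parent_address` is hypothetical and not part of any Chandler API; treating the root as its own parent is an assumption made only to keep the sketch total.

```python
import posixpath


def parent_address(address: str) -> str:
    """Return the address of the collection that contains `address`.

    Hypothetical helper: drops the last path segment, as the table describes.
    """
    trimmed = address.rstrip("/")
    if not trimmed:
        return "/"  # assumption: the root is treated as its own parent
    return posixpath.dirname(trimmed) or "/"


print(parent_address("/Calendar/Lunch with Tug"))
print(parent_address("/Calendar/"))
```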
Engineering Advantages
- Exportable addresses are a portable way of sharing Content Items and collections. The same exportable address can be used by other Chandler clients and by any other HTTP or WebDAV client. Thus, to share a View, the sharing user would cause the exportable address to be sent out to others. The sharing user doesn't need to know ahead of time if the sharees are Chandler users or not.
- Exportable addresses allow remote clients to browse the repository. The root of the exportable address namespace affords a starting point, from which to find readable collections and the Content Items inside those collections.
- Exportable addresses allow bindings to be transparent to the remote browsing/synchronizing client.
- Two-layer addressing means that the repository has more power to make arbitrary internal naming decisions (structure, semantics and actual strings chosen) and to change those decisions from release to release. The only thing the repository has to preserve is the exportable address.
- Users may change exportable addresses if they explicitly wish to, but this will only change a binding, not an internal address (easier to implement).
- Alternatively, users may create a new exportable address (a new binding) to an existing item, without changing its existing addresses (fewer Not Found responses)
- How hidden can our internal repository paths be? Is any developer forced to recognize them? John would like to make them as hidden as possible (and removed from as much existing code as possible).
Requirements or goals
- Exportable addresses must be human-readable; even parcel developers will find it easier to work with friendly addresses.
- Exportable addresses should be compatible with hierarchical naming addresses, e.g. HTTP, FTP, IMAP.
- Although a content item starts with only one exportable address, it can easily be given other bindings (thus multiple URLs to same thing) -- I've always found this to be an eventual requirement in the past and one that's easier to deal with earlier.
- Not all repository items have exportable addresses.
- Exportable addresses should not depend on the repository addressing scheme. When an item is given a new exportable address (from the user's or parcel developer's point of view, this may be a "move" or "rename" operation) this does not affect the repository address.
- Exportable addresses should not depend on a parcel name, because some data (like email) might be managed by a parcel that is then replaced by another parcel, yet the exportable address does not need to change.
- The exportable address space should be able to address parcels, kinds etc as well as content items.
- Exportable addresses should have very little relationship to kinds or types. E.g. just because an item is of kind "Contact", that doesn't mean it needs to be in the exportable namespace called "Contacts". To look at it another way, a user or developer can't tell from the address "/foo/bar/Contacts/JoeBlow" whether an item is a Contact or some other type of thing -- the application code needs to query the item at that address to find out what kind it is. This goal implies that exportable addresses do not have file extensions, which bears further discussion.
- Views should have exportable addresses. Thus, any item that appears in a view has an extra address (binding) defined by that view.
- Even things that we don't yet know that we want to export -- like CPIA Blocks -- should be included, at least theoretically, in this plan. To put it another way, we shouldn't leave anything out of the addressing framework that we might want to export later. (Not all these things would be findable by users very easily -- some parts of the namespace can be protected from user changes or hidden from users finding them in normal conditions)
Use cases and scenarios:
- A new appointment is created. The code that creates it must assign at least one exportable address to the item (to avoid more work, this could replace the current step of assigning an internal repository path). A good exportable address would be
/Calendar/Lunch with Tug. Another way to say this is that the code that creates the new appointment chooses to give it the binding name Lunch with Tug in the /Calendar collection. After the item is created, this parcel or any other parcel can retrieve the same item via this exportable address.
- A calendar is published to a Web server as a set of WebDAV resources in a WebDAV collection. The publishing code can easily choose reasonable HTTP addresses for the calendar and its appointments based on the exportable address:
http://lisamachine.example.com/Chandler/Calendar/.
- IMAP folders are synchronized to the local repository with the same address structure the IMAP folders have on the server. Since multiple accounts may exist, these replicated IMAP folders (locally, they're collections) need to be put inside a collection named after the account. So the exportable address for one of these collections could be
/Archives for ISP account/personal/lists/knitting-group (or it could be flatter). The point of this exercise is that a user can directly share their IMAP boxes from Chandler, for use by other devices or other users.
- The default OSAF calendaring parcel, called "osafcal", is replaced with a new shareware calendaring parcel with more features called "sharecal". The new "sharecal" parcel knows how to interpret the original calendar data and takes control of that data. However the exportable name remains
/Calendar/, and more importantly, external references to this data (e.g. an HTTP address) do not change. This is an important use case because it shows that the exportable path shouldn't contain the parcel name. There needs to be another way to find out what parcel "owns" what content besides looking at the exportable address.
- A sophisticated user decides to move some emails out of the main
/Archives for ISP account hierarchy, because the user doesn't want those emails to show up in searches of the Mail hierarchy or be synchronized to the IMAP server. Instead, those emails are placed into a collection called "/Old-job-stuff/Mail". Another side effect is that this user can publish (or synchronize) all their mail except the "old-job-stuff" mail. This use case is important because it shows that email content items won't necessarily all appear in the exportable address hierarchy where they were originally created. Note that internally this function is a feature of collections, which is why for external use we feel addresses ought to have some structural relationship to collections.
Implementation
Certain Kinds will have exportable addresses. These Kinds will have attributes for either parents, or children or both. Ted is looking into this as it's a data model issue. We need to figure out if there's a different Kind for things that can have children and for things that can't. In this context "children" are used in the sense of addressing foremost, which means that a parent item can have a relationship to child items whose URLs begin with the URL of the parent item.
Every path segment in an exportable address MUST map to a collection or a view, except possibly the last path segment, which may point to a Content Item or another item that must be mapped as a Web resource because it needs its own body.
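The rule that every segment except possibly the last must map to a collection or view can be illustrated with a small resolver. This is only a sketch: `Item`, `Collection`, and `resolve` are hypothetical names, and the in-memory tree stands in for whatever the repository would actually use.

```python
class Item:
    """Hypothetical leaf: a Content Item with no children."""

    def __init__(self, name):
        self.name = name


class Collection(Item):
    """Hypothetical collection (or view): maps path segments to members."""

    def __init__(self, name):
        super().__init__(name)
        self.children = {}

    def bind(self, segment, item):
        self.children[segment] = item
        return item


def resolve(root, address):
    """Walk `address` segment by segment, starting at `root`."""
    node = root
    for segment in (s for s in address.split("/") if s):
        # Only a collection may have further segments beneath it, so a
        # non-collection is acceptable only as the final segment.
        if not isinstance(node, Collection):
            raise LookupError(f"{node.name!r} is not a collection")
        if segment not in node.children:
            raise LookupError(f"no binding named {segment!r}")
        node = node.children[segment]
    return node
```

With this shape, `/Calendar/Lunch with Tug` resolves only if `/Calendar` is a collection, while the final segment may be any item.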
Exportable addresses will be used everywhere above the repository -- only the repository would use internal storage paths or UIDs very frequently. Here's how that would work, using the example of a new mail.
- A new mail arrives via IMAP download. The IMAP client code needs to pick a default exportable address for the new email when it submits it to the repository for creation. We've decided that the path "/Inbox for ISP account/" (to be exact, it would be
/Inbox%20for%20ISP%20Account/) is the default place to store email arriving in this account. The parcel also needs to pick a unique name within Inbox, and it chooses a name based on the subject, but with a uniquifying number, thus /Inbox for ISP account/Re:shawl kit order[3] is the final exportable address.
- Under the covers, the repository chooses an internal path somewhere arbitrary under UserData. The internal repository address selection can rely on anything, such as the date the item was created or the name of the parcel that created it, to pick a name.
- Thereafter, when the contents of the Inbox of the ISP account are listed, the repository will list what's in
/Inbox for ISP account and the new mail will be returned.
- The user can make the new email appear in other collections as well, thanks to bindings. The user can add the email to their
knitting-group collection, which may be shared to other knitters. A new binding is created for the same email, probably /knitting-group/Re:shawl kit order[3]. This does not result in "not found" responses for the old address because the old address is also maintained as a binding. Both addresses work for the same email (remote Chandler repositories can tell if they are actually the same resource by checking some ID).
- If the user decides to remove the item from their ISP account inbox, they can consciously decide to have that URL unbound. The email itself is not actually deleted, nor is the URL in the 'knitting-group' collection affected.
- When a WebDAV request comes in from the outside for this kind of item, the WebDAV service module can look up the email in the repository using only its exportable address -- in other words, the exportable address is completely sufficient to look up an item. The WebDAV service module might then need help figuring out exactly how to export the email in an interoperable format (the mail parcel might help with that) but that can be determined by looking at the item's kind.
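The walkthrough above (create with one address, add a second binding, unbind the first, look up by address alone) could be modelled roughly as follows. `BindingTable` and its methods are hypothetical names for illustration; the real repository interface is not specified in this proposal, and the generated UUID here merely stands in for "some ID" used to compare bindings.

```python
import uuid


class BindingTable:
    """Hypothetical sketch: many exportable addresses may bind to one item."""

    def __init__(self):
        self._by_address = {}  # exportable address -> item id
        self._items = {}       # item id -> item content

    def create(self, address, content):
        if address in self._by_address:
            raise ValueError(f"address already bound: {address}")
        item_id = uuid.uuid4()  # stand-in for the repository's own UUID
        self._items[item_id] = content
        self._by_address[address] = item_id
        return item_id

    def bind(self, new_address, existing_address):
        """Add another address for the item; the old address keeps working."""
        if new_address in self._by_address:
            raise ValueError(f"address already bound: {new_address}")
        self._by_address[new_address] = self._by_address[existing_address]

    def unbind(self, address):
        """Remove one address; the item and its other bindings are untouched."""
        del self._by_address[address]

    def lookup(self, address):
        """The exportable address alone is sufficient to find the item."""
        return self._items[self._by_address[address]]

    def same_item(self, a, b):
        return self._by_address[a] == self._by_address[b]
```

In this sketch an item with no remaining bindings is still stored, which matches the statement that unbinding a URL does not delete the email.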
We should encourage parcel developers to provide only a fairly basic naming scheme, that is to say not to try to do anything too complicated when assigning names at item creation. If items are too carefully organized semantically (e.g. tasks could be arranged in collections by category and then by priority and then by recurring/non-recurring) this is too brittle. On the other hand, broad and unspecific organization will result in address namespaces containing very large numbers of items (e.g. an HTTP/WebDAV folder that contains 10,000 emails). The user may also wish to reorganize their exportable address namespace if the user cares deeply about how exportable addresses appear. There still needs to be a default organization that doesn't require the user to think about how their exportable addresses are organized.
Exportable paths and Views
All named Views have exportable paths, so that the exportable path serves as one address to get to either the view or things inside the view. For example, if I have a named "unread mail" view which contains mail from Katie, and I also have a specific "mail from katie" view, then at least two exportable addresses would exist for a new mail from Katie:
/Unread mail/Hi Lisa and
/from katie/Hi Lisa (two bindings to the exact same mail).
Note that not all exportable path segments would have to be views. For example,
/core/schemas/parcels/mailParcel wouldn't have to be a view. Or
/Inbox for ISP Account/ need not be a view, it might simply be a collection.
Exportable addresses and ephemeral views
Chandler is designed around the idea of having a large number of ad-hoc views. A user can group emails together by dragging one on top of the other, creating an ad-hoc view "centered around" the drag target. A user can click on a link on a contact to see all events including the contact, or all mails from the contact, or all mails to the contact. These are all viewable -- they're all views -- so as I understand it, the user could choose at any time to share one of these views.
So are all these views ephemeral or permanent, or are there two kinds? It seems completely reasonable to have ephemeral views when they're simply viewed once, or even several times, by the local user, on demand. However, once the user decides to share an ad-hoc view it needs to be given a permanent address, or at least one that won't change for unrelated reasons. So perhaps the best compromise here is to leave ad-hoc views as ephemeral objects, with no exportable address assigned, until the user chooses to share.
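The compromise suggested here, where an ad-hoc view has no exportable address until it is shared and keeps that address afterwards, might look like this. `AdHocView` and `share` are hypothetical names, and placing shared views under `/Shared Views` is an assumption borrowed from the skeleton discussed later in this proposal.

```python
class AdHocView:
    """Hypothetical ad-hoc view: ephemeral until the user shares it."""

    def __init__(self, display_name):
        self.display_name = display_name
        self.address = None  # no exportable address while ephemeral

    def share(self, root="/Shared Views"):
        # Assign the address once, at first share. Later display-name
        # changes do not alter it, so the address does not change for
        # unrelated reasons.
        if self.address is None:
            self.address = f"{root}/{self.display_name}"
        return self.address
```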
Exportable paths and UUIDs
Sometimes it helps to have a permanent address. Particularly, when a user shares a specific single Content Item such as an event, the user wants that item to continue to be shared regardless of where it's organized. We can have an address namespace of permanent addresses, based on UUIDs, alongside the collection-name address namespace. This area of UUID addresses might not be browsable, but it could certainly still be useful for direct sharing with a permanent address. E.g. if we have the address
/__uuids/DA259054-D93B-498C-8C10-DEBD83EF1357
then a GET request to that specific address could certainly return the exact item uniquely and permanently identified with that UUID (provided it's available and readable on that repository). However, a PROPFIND request to the
/__uuids collection would
not return all the children of that collection, so this namespace isn't useful for browsing.
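As a rough illustration of the `/__uuids` namespace behaviour, the sketch below builds and parses UUID-based addresses and refuses to enumerate the namespace. The function names and the `children_of` callback are hypothetical; only Python's standard `uuid` module is real.

```python
import uuid

UUID_ROOT = "/__uuids"


def uuid_address(item_id):
    """Build the permanent address for an item from its UUID."""
    return f"{UUID_ROOT}/{str(item_id).upper()}"


def parse_uuid_address(address):
    """Return the UUID named by a /__uuids address, or None if it is not one."""
    prefix = UUID_ROOT + "/"
    if not address.startswith(prefix):
        return None
    try:
        return uuid.UUID(address[len(prefix):])
    except ValueError:
        return None


def list_children(address, children_of):
    """Listing (as for PROPFIND) works everywhere except the UUID namespace."""
    if address.rstrip("/") == UUID_ROOT:
        return []  # direct GET by UUID works, but the namespace is not browsable
    return children_of(address)
```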
Default addresses
If the exportable addressing hierarchy has more than one address to the same thing (multiple bindings), then there will sometimes need to be one default. A default is required when there isn't enough context to choose between multiple addresses, yet one must be chosen (e.g. to send a change notification, or to log activity, or to share something). Whenever possible, we can try to have enough context to choose the best address. For example, when another Chandler repository registers for change notifications, the source repository can remember what address it used initially and use that same address for change notifications related to the item.
The default address can be chosen situationally, however. When a user shares a single Content Item, we should choose a permanent address (the UUID-based address described above).
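One possible reading of the default-address rules is sketched below: use the UUID-based address when sharing a single item, otherwise prefer the address the other party already used, and fall back to the first binding. The function name, parameters, and the fallback order are all assumptions for illustration.

```python
def choose_address(bindings, uuid_addr, remembered=None, single_item_share=False):
    """Pick one address for an item that may have several bindings.

    bindings          -- list of collection-based addresses for the item
    uuid_addr         -- the item's permanent UUID-based address
    remembered        -- address a remote party used earlier, if known
    single_item_share -- True when the user shares exactly this item
    """
    if single_item_share:
        return uuid_addr          # a permanent address survives reorganization
    if remembered in bindings:
        return remembered         # enough context: reuse the known address
    if bindings:
        return bindings[0]        # assumption: first binding is the default
    return uuid_addr
```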
Namespace Skeleton
When we deliver Chandler we'll want to have at least a skeleton hierarchy of namespaces for use by parcels. What does this skeleton look like? We're starting to think it will be mostly a two-layer namespace.
First, note that the namespace ought to have at least one way of being roughly segregated by Kind, even though we don't want to design our UI around this concept ("silos"). There are two reasons for keeping silos in the addressing even if we avoid them in our UI design:
- WebDAV (and other tools) perform better when doing operations on objects with similar properties. WebDAV can list all the start and end times for all the events in a calendar, more efficiently than for listing events that are mixed in with items that don't have start and end times.
- Tools that are used outside Chandler may only be interested in one type of thing, as in a cell phone synch tool that just wants contacts. Segregating by Kind makes it easy for outside tools to find the data they're interested in.
So, the top layer will look something like this at initialization:
/__uuids
/Archive for ISP account
/Calendar
/Contacts
/Inbox for ISP account
/Named Views #also contains ad-hoc views -- any persisted views
/Notes
/Sent items for ISP account
/Shared views
/Tasks
This skeleton assumes a single email account -- more top-level collections would be added if the user has multiple email accounts. This skeleton has no relationship to the sidebar -- it doesn't dictate what does or does not appear in the sidebar.
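To show how the skeleton could grow with additional email accounts, here is a small sketch that generates the top layer from a list of account names. The constant and function names are hypothetical, and sorting the result case-insensitively is only a presentation choice that happens to reproduce the listing above.

```python
# Collections that exist once, regardless of accounts (taken from the listing).
FIXED_COLLECTIONS = [
    "__uuids", "Calendar", "Contacts", "Named Views",
    "Notes", "Shared views", "Tasks",
]

# Collections repeated per email account (taken from the listing).
PER_ACCOUNT_TEMPLATES = [
    "Archive for {account}",
    "Inbox for {account}",
    "Sent items for {account}",
]


def skeleton(accounts):
    """Return the top-level addresses for the given email account names."""
    names = list(FIXED_COLLECTIONS)
    for account in accounts:
        names.extend(t.format(account=account) for t in PER_ACCOUNT_TEMPLATES)
    return ["/" + name for name in sorted(names, key=str.lower)]


for address in skeleton(["ISP account"]):
    print(address)
```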
The calendar, contacts, notes and tasks collections contain all Content Items of a certain Kind. For example, the calendar contains all Content Items that are events:
/Calendar/Lunch with Tug.ics
/Calendar/Symphony concert.ics
/Calendar/Trip to Whidbey Island.ics
If a Calendar event is also an email (let's say the Symphony concert started out as an email, and was stamped as an event) then the same item appears in two places:
/Calendar/Symphony concert.ics
/Archive for ISP Account/Symphony concert
The
Shared Views collection contains any other collection that the user decides to share. For example the user could have an ad-hoc collection for all home events, and a separate collection for all work events (based on the value of the "@context" attribute). In addition, the user could create ad-hoc collections of mixed types, such as a collection of emails and tasks relating to a project. These would all be "created" under the
/Shared Views collection:
/Shared views/Home Calendar
/Shared views/Work Calendar
/Shared views/Schema Project
Shared views could also be created as top-level collections, of course.
New Content Items, when developed by 3rd party parcel developers or invented by the user, should be encouraged to create new top level folders to hold their new Content Items. For example, the
ZaoBao feed items could be created in /ZaoBao or sub-folders.
/Zaobao/Feeds
/Zaobao/Entries
Content Items are typically linked from views inside the Named Views and Shared Views folders, but those are more likely to have mixed content and be slower to synchronize if that's the only place that user data is found. To improve the ability to synchronize and have outside tools work with a set of Content Items, make sure they appear in their own collection roughly sorted by Kind.
Character restrictions, i18n
The
URL specification should be used to guide what characters we can allow, and which ones we have to escape, in exportable addresses. Collection names and path segments can have spaces and question marks and other odd characters internally, but for putting in URLs these characters need to be escaped. Python libraries can probably handle this for us quite nicely.
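Python's standard `urllib.parse` module already provides percent-encoding, so escaping could be done per path segment along these lines. The helper names `to_url_path` and `from_url_path` are hypothetical, and the sketch assumes a path segment never contains a literal slash.

```python
from urllib.parse import quote, unquote


def to_url_path(address):
    """Percent-encode each path segment, leaving the slash delimiters alone."""
    return "/".join(quote(segment, safe="") for segment in address.split("/"))


def from_url_path(path):
    """Reverse of to_url_path: decode each segment back to its stored name."""
    return "/".join(unquote(segment) for segment in path.split("/"))


print(to_url_path("/Inbox for ISP Account/"))
print(to_url_path("/Inbox for ISP account/Re:shawl kit order[3]"))
```

Encoding segment by segment (with `safe=""`) means characters such as `?`, `:`, `[` and `]` inside a name are escaped, while the slashes that separate segments are preserved. `quote` encodes non-ASCII characters as UTF-8 by default, which covers translated collection names.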
Do collection names change when the user's language changes? Does a French user care that their "Sent Items for ISP Account" is called "Courriel envoyes..."? Note that because of the existence of the "Shared Views" folder, the user can still give distinctive translated names for shared folders:
/Shared Views/Calendrier personnel
/Shared Views/Projet Bagatelle
Since URLs are not intended to be displayed in the Chandler UI, we hope this is sufficient. Note that individual content items are shared with paths like
/__uuids/DA259054-D93B-498C-8C10-DEBD83EF1357, which is even less friendly to a French user, or any other user, than a path like "Shared Views". Clearly, paths are not meant for display, wherever it's possible to use something more friendly.
API
Queries will be able to use exportable address namespaces to limit scope. For example, I could search every address beginning with
/Shared Views/Schema Project by searching "in" that namespace. Searches can be depth-infinity or depth 1. I can search every item to a depth of infinity in
/Shared Views by providing that address. Note that queries can also limit scope by type -- e.g. "every item of type Mail" -- but that's a different kind of search.
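Scoping a query by address namespace amounts to a prefix test plus a depth check. The sketch below is one way to express that; `in_scope` is a hypothetical name, and the `"1"` / `"infinity"` depth values are borrowed from the WebDAV Depth header as an assumption about how the API might spell them.

```python
def in_scope(address, scope, depth="infinity"):
    """Return True if `address` lies inside the `scope` namespace.

    depth="1" matches only direct children; depth="infinity" matches
    descendants at any level. The scope collection itself is not a match.
    """
    prefix = scope.rstrip("/") + "/"
    if not address.startswith(prefix):
        return False
    remainder = address[len(prefix):].strip("/")
    if not remainder:
        return False
    if depth == "infinity":
        return True
    return "/" not in remainder


addresses = [
    "/Shared Views/Schema Project",
    "/Shared Views/Schema Project/Recap meeting Agenda[2]",
    "/Calendar/Lunch with Tug",
]
print([a for a in addresses if in_scope(a, "/Shared Views", depth="1")])
```

Appending the slash before the prefix test keeps a sibling such as `/Shared Views Extra` from matching a search "in" `/Shared Views`.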
Does the repository expose a way to pick a unique name for a new item inside an existing namespace, or is the application providing the new item responsible for suggesting that name? If the application suggests the name then the name could have some relationship to the semantics of the item. For example, an email parcel could create new emails using a sequential ID ("Msg336") or by using the Subject of the mail with a uniquifying number (e.g.
Re: IRC and wing[2]).
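Whichever side ends up responsible, the uniquifying step itself is simple. The following sketch assumes the application suggests a name and the caller supplies the set of names already bound in the target collection; `unique_name` is a hypothetical helper, and starting the counter at 2 is an assumption.

```python
def unique_name(suggested, existing):
    """Return `suggested`, or `suggested[n]` if that name is already taken.

    existing -- the set of binding names already used in the collection
    """
    if suggested not in existing:
        return suggested
    n = 2  # assumption: the first duplicate is numbered [2]
    while f"{suggested}[{n}]" in existing:
        n += 1
    return f"{suggested}[{n}]"


taken = {"Re: IRC and wing"}
print(unique_name("Re: IRC and wing", taken))
```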
Issues
One of the main open questions for where the exportable addressing should be layered is what layer the repository sharing will work at. Mike and Andi seem to be working under the assumption that two synchronized repositories (or in the sharing use case too) will use the identical UID (and path?) to store and refer to a synchronized resource. I'm not sure that's a good idea, particularly if the two repositories are at different versions. If the two repositories have the same content for
/Shared Views/Schema Project/Recap meeting Agenda[2], it shouldn't matter if the two repositories have the same UID or internal storage path, and that could be a good thing as it's doing more to keep the two repositories at arm's length, independent of each other's versions/quirks. If the item that address refers to is changed, that's fine -- that's a content update and the remote repository downloads the new content. A sharing repository would make its items' UUIDs readable, but those would be used for reference and comparison by the remote Chandler, not as the remote repository's native UUID.
Do schema items/kinds have internal addresses, whereas Content items have external addresses? Does every item have an exportable address?
- One possibility is simply to answer "yes" and require the parcels to assign an exportable address for everything
- Another possibility is to answer "yes, but". Every item would have an address, but sub-items (those that parcels "attach" to other items) would be assigned addresses in the "jungle" or "dump".
- Another possibility is to answer "no". Things that don't need to be navigated to from the outside, but can be referred by another item (e.g. an email attachment) can have internal addresses only.
Does the repository need to expose its internal addresses at all? The internal address or the UID could equally be used to determine if two exportable addresses point to the same item.
--
LisaDusseault - 01 Apr 2004
Comments from MikeT
Two questions that came to mind as I've been trying to assimilate the concept:
1. You mention in the Issues section above that the remote repository would just see a content update and would download the new content - but how would it know? It's possible in the addressing scheme you mention that the item would have a different "address" than in the source repository. Without some sort of UID how would it be able to make that decision?
2. In your skeleton example you have usernames as the primary distinct item. Doesn't that preclude being able to have multiple personas that have views into different collections? It's true that I have multiple email accounts but I want the ability to place emails (mailing lists and the like) from one account into a common collection and be able to group or view them distinct from the account that received the item. (boy I hope that made sense)
--
MikeT - 01 Apr 2004
I think I addressed these comments in the latest proposal version...
--
LisaDusseault - 19 May 2004