Chandler and LDAP interoperability
1) LDAP Overview and use cases
LDAP directories are widely used for a range of applications in larger
organizations, including academic institutions. Chandler will therefore
need to interoperate with LDAP to some degree to be successful in these
environments.
The question of how Chandler should best interoperate with LDAP
directories is complicated by the fact that such directories are
used for a range of purposes, and are used both by end-users and
by applications. Thus we need to first identify interesting use
cases of LDAP directories and decide which of these we want to
target.
The following list is not expected to be comprehensive; the cases are
referred to below by their position in the list (1 through 7)...
- On-line "phone book" (i.e. directory of contact info for end-users)
- Group directory (list members of organizational groups, mailing lists, etc.)
- Email routing information
- Login/service account information
- Authentication data (public keys)
- Authorization data (unix groups, ACLs)
- Directories of system information for distributed system management
Although anyone who uses LDAP is likely to use it for at least
some of these cases, things are complicated by the fact that they
are not necessarily doing so in a compatible fashion... for example
one organization might store the members of a group using the
uniqueMember attribute of the GroupOfUniqueNames object class,
another might use the member attribute of the GroupOfNames class,
yet another might forgo both of these and reconstruct groups from
the list of UIDs (the memberUid attribute) of the PosixGroup object
class. Or, to make things worse, some organization might not think
any of these approaches adequate for their needs and define a whole
new object class for essentially the same purpose.
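To make the incompatibility concrete, here is a minimal sketch of reading members from the three standard representations named above. It assumes entries arrive as plain dicts of attribute name to list of values; that shape and the function name group_members are illustrative assumptions, not part of any Chandler design.

```python
# Entries are modelled as plain dicts (attribute name -> list of values);
# this shape and the function name are assumptions for illustration.

def group_members(entry):
    """Return (kind, members): kind is 'dn' or 'uid' depending on the schema."""
    classes = {name.lower() for name in entry.get("objectClass", [])}
    if "groupofuniquenames" in classes:
        return "dn", list(entry.get("uniqueMember", []))
    if "groupofnames" in classes:
        return "dn", list(entry.get("member", []))
    if "posixgroup" in classes:
        # posixGroup stores login names in memberUid, not DNs
        return "uid", list(entry.get("memberUid", []))
    raise ValueError("unrecognised group representation")


by_unique = {"objectClass": ["groupOfUniqueNames"],
             "uniqueMember": ["uid=ann,ou=people,dc=example,dc=edu"]}
by_posix = {"objectClass": ["posixGroup"], "memberUid": ["ann", "bob"]}
```

Note that the first two classes yield DNs while the third yields login names, so a caller still has to resolve the two kinds differently. A locally defined object class would fall through to the ValueError, which is the case that calls for organization-specific handling.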
Ignoring these kinds of issues for a moment, cases 1 and 2 are
relatively simple to support as a "client", i.e. an application that
looks up the information in the directory on behalf of the user...
in this case the application (Chandler) could directly use a certain
set of well-known attributes and present the rest "raw". This is
the approach taken by Netscape.
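As a rough sketch of that client approach: split an entry into a few well-known attributes that get friendly labels and a remainder that is shown unmodified. The choice of attributes, the labels, and the name split_entry are assumptions made for the example.

```python
# Hypothetical table of well-known attributes; keys are lower-cased because
# LDAP attribute names are case-insensitive.
WELL_KNOWN = {"cn": "Name", "mail": "Email", "telephonenumber": "Phone"}


def split_entry(entry):
    """Separate attributes the UI understands from those shown raw."""
    known, raw = {}, {}
    for attribute, values in entry.items():
        label = WELL_KNOWN.get(attribute.lower())
        if label is not None:
            known[label] = values
        else:
            raw[attribute] = values
    return known, raw


entry = {"cn": ["Ann Lee"], "mail": ["ann@example.edu"], "roomNumber": ["B-12"]}
known, raw = split_entry(entry)
```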
Case 3 is of relevance to Chandler email, but it is expected that
the existing mail transport infrastructure handle this case, so
Chandler will not need to support it directly.
Case 4 will be discussed in the document on Chandler security.
Case 5 has at least one well-standardized instance (X.509 certs)
which is outside of the scope of this document.
Case 6 is a difficult one which Chandler will probably need to
support in some way... I believe that the approach we'll describe
here may work with some restrictions.
Case 7 is not relevant to Chandler... we can restrict ourselves to
thinking of LDAP directories as containing information about people.
2) Chandler and data sources
It is part of Chandler's intrinsic architecture to be able to support
multiple data sources... although the principal data source is the
Chandler repository, we intend to also support, e.g., IMAP as a
data source for email mailboxes. Since Chandler data is "richer"
than typical email stores, the plan is to support objects which
have attributes coming from multiple data sources... e.g. a mail
message might be stored in IMAP, but associated markup would be
stored in the Chandler repository.
The approach for LDAP we'll take here fits into this model. Chandler
"contact list" information is similar to the data stored in LDAP
directories... so if a Chandler instance is running in a context where
an LDAP directory is available, the information from that directory
should be merged with information from the Chandler repository in real
time.
Note that this means that both data sources are authoritative.
Chandler may cache LDAP information in the repository, but in contexts
where freshness is important it will still have to do a lookup to both
data sources every time data is needed. Since there is usually a
local Chandler repository on a machine, if the machine is off-line,
it may be able to use data cached from a prior LDAP lookup... this
is a simple case of synchronization.
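The lookup-and-merge behaviour described here could look roughly like the following. Since both sources are authoritative, this sketch keeps every value together with its source instead of picking a winner; the function names, the dict-based cache, and the use of ConnectionError to signal an unreachable directory are all assumptions.

```python
def merge_sources(repo_attrs, ldap_attrs):
    """Keep every value, tagged with the data source it came from."""
    merged = {}
    for source, attrs in (("repository", repo_attrs), ("ldap", ldap_attrs)):
        for name, value in attrs.items():
            merged.setdefault(name, []).append((source, value))
    return merged


def lookup_contact(repo_attrs, fetch_ldap, cache):
    """Query LDAP when reachable; otherwise fall back to the cached copy."""
    try:
        ldap_attrs = fetch_ldap()
        cache.clear()
        cache.update(ldap_attrs)      # remember the result for off-line use
    except ConnectionError:
        ldap_attrs = dict(cache)      # off-line: reuse the prior lookup
    return merge_sources(repo_attrs, ldap_attrs)


def unreachable():
    raise ConnectionError("directory not reachable")


cache = {}
repo = {"nickname": "Annie", "mail": "ann@home.example"}
online = lookup_contact(repo, lambda: {"mail": "ann@example.edu"}, cache)
offline = lookup_contact(repo, unreachable, cache)
```

Here the off-line result equals the on-line one because the cache was filled by the first call. How conflicting values such as the two mail addresses are presented is left open.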
2.1) Caching policies
Chandler will need to cache data from external data sources in order
to be able to function in disconnected mode. Most LDAP data is
highly cachable as it doesn't change very often and the freshness
of data isn't very important in most use cases. However in some
use cases freshness may be absolutely critical, and it may be that
while object freshness is not critical, specific attributes must
be fresh (e.g. an attribute that grants some form of authorization).
To accommodate this we need a way of specifying the caching
policy. This could be done in one of two ways...
- In LDAP by extending the LDAP schema with an attribute that Chandler can interpret as caching policy
- In Chandler by specifying patterns that define caching policy for sets of matching LDAP objects and attributes
The former will be preferable in an organizational context as it
puts the configuration with the already established processes for
LDAP directories, but has the disadvantage that it requires some
change to the existing infrastructure.
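The second option, patterns held in Chandler, might be sketched as an ordered rule table where the first matching rule wins. The rule format, the shell-style wildcards, the example DNs and the convention that a maximum age of 0 means "never serve from cache" are all assumptions for illustration.

```python
from fnmatch import fnmatchcase

# (DN pattern, attribute pattern, maximum cache age in seconds).
# First match wins; 0 means the value must always be fetched fresh.
POLICIES = [
    ("*,ou=groups,dc=example,dc=edu", "*member*", 0),
    ("*", "mail", 86400),
    ("*", "*", 604800),
]


def max_age(dn, attribute):
    """Return the maximum acceptable cache age for one attribute of one object."""
    dn, attribute = dn.lower(), attribute.lower()
    for dn_pattern, attr_pattern, age in POLICIES:
        if fnmatchcase(dn, dn_pattern.lower()) and fnmatchcase(attribute, attr_pattern.lower()):
            return age
    return 0  # no rule matched: be conservative and require a fresh lookup


def cache_usable(dn, attribute, age_seconds):
    """True if a cached value of the given age may still be used."""
    return age_seconds < max_age(dn, attribute)
```

With these rules, cache_usable("uid=ann,ou=people,dc=example,dc=edu", "mail", 3600) is true, while any membership attribute under ou=groups is never served from cache, which covers the case of a critical attribute on an otherwise cacheable object.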
3) Schema transformation
Chandler data is inherently extensible. LDAP directories are also
extensible and have flexible schemas, and as we pointed out before
these are often extended in incompatible fashion. We will therefore
need a flexible mapping of an LDAP schema to Chandler data... for
example, since we will want to use LDAP information about groups as
mailing lists or aliases, we'll need to map the various ways of
representing groups of people in LDAP to a Chandler group.
The best way to do this is to define a filter API that lets us write
plug-ins to accomplish the transforms. Chandler can then ship with
a set of filters for common LDAP use cases, and organizations can
modify these or write their own if needed.
The filter will need to accomplish several things...
- Map LDAP DNs to a set of identifying attributes in the Chandler repository so we know which LDAP object matches which Chandler object.
- Map LDAP attributes to Chandler triples
- Potentially merge multiple LDAP objects into a single Chandler object or vice-versa
This is complex, but made somewhat simpler by the fact that we can
probably get away with doing the transforms only in one direction.
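One possible shape for such a filter plug-in, covering the three tasks listed above in the LDAP-to-Chandler direction only, is sketched below. The class and method names, the choice of mail as the identifying attribute, and the Chandler attribute names are hypothetical; no filter API has been defined yet.

```python
class LdapFilter:
    """Hypothetical plug-in interface; the method names are illustrative."""

    def identity(self, dn, entry):
        """Return the attributes that identify the matching Chandler object."""
        raise NotImplementedError

    def triples(self, dn, entry):
        """Return (subject, attribute, value) triples for one LDAP entry."""
        raise NotImplementedError


class PersonFilter(LdapFilter):
    # LDAP attribute -> assumed Chandler attribute name
    ATTRIBUTE_MAP = {"cn": "fullName", "mail": "emailAddress"}

    def identity(self, dn, entry):
        return {"emailAddress": entry["mail"][0]}

    def triples(self, dn, entry):
        subject = entry["mail"][0]
        return [(subject, chandler_attr, value)
                for ldap_attr, chandler_attr in self.ATTRIBUTE_MAP.items()
                for value in entry.get(ldap_attr, [])]


def import_entries(ldap_filter, entries):
    """Entries with the same identity are merged into one Chandler object."""
    objects = {}
    for dn, entry in entries:
        key = tuple(sorted(ldap_filter.identity(dn, entry).items()))
        objects.setdefault(key, []).extend(ldap_filter.triples(dn, entry))
    return objects


entries = [
    ("uid=ann,ou=people,dc=example,dc=edu",
     {"cn": ["Ann Lee"], "mail": ["ann@example.edu"]}),
    ("uid=alee,ou=staff,dc=example,dc=edu",
     {"cn": ["A. Lee"], "mail": ["ann@example.edu"]}),
]
objects = import_entries(PersonFilter(), entries)
```

In this example the two LDAP objects share a mail address and so end up as a single Chandler object carrying both names. An organization with a different schema would supply its own subclass.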
4) Use cases
4.1) Address book / Contacts
This simplest use case is easy to support given the above assumptions.
It includes things like dynamic lookup of email addresses, the address
book, and expansion of groups for email or other contact methods. For
this case "freshness" is probably not critical and we can cache or
synchronize all LDAP access to the Chandler repository (through the
filter of course). The principal difficulty here is one of data source
precedence... i.e. when to go to the LDAP server for data vs. using
data from the local repository in cases where the two might overlap or
conflict. This is mainly a UI issue... the UI must ensure that all
the relevant choices/conflicts are presented to the user and/or that the
default resolution mechanisms represent "least surprise".
4.2) Group directory
There are two principal use cases for groups... one is a user convenience,
i.e. sending an email to a group rather than listing all the recipients
individually or inviting a group to a meeting in the calendar. The other
is the use of group membership as authorization data, i.e. someone may
have access to data or may have certain rights because they are a member
of a specific group. The former case is identical to 4.1, but the latter
case is more difficult, and is discussed in the next section.
Groups are somewhat problematic in LDAP because there are so many ways
they can be represented... explicit membership lists, implicit through
attributes on the members, dynamic through a search, etc. All these can
be mapped to explicit groups using the schema transform mechanism... we
will store all groups explicitly in Chandler and all cached LDAP groups
will be explicit, even if they were the result of an expansion of a
dynamic or implicit group in LDAP.
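A minimal sketch of that expansion, assuming a small in-memory set of people and a hypothetical group descriptor with a "kind" field; in practice the implicit and dynamic cases would be LDAP searches run by the schema transform.

```python
def explicit_members(group, people):
    """Expand any of the three group styles into a sorted explicit member list."""
    kind = group["kind"]
    if kind == "explicit":
        return sorted(group["members"])
    if kind == "implicit":
        # membership is implied by an attribute value on each member
        return sorted(dn for dn, entry in people.items()
                      if group["value"] in entry.get(group["attribute"], []))
    if kind == "dynamic":
        # membership is whatever a search predicate matches right now
        return sorted(dn for dn, entry in people.items() if group["matches"](entry))
    raise ValueError("unknown group kind: %r" % kind)


PEOPLE = {
    "uid=ann,ou=people,dc=example,dc=edu": {"ou": ["physics"], "title": ["professor"]},
    "uid=bob,ou=people,dc=example,dc=edu": {"ou": ["physics"], "title": ["student"]},
    "uid=cy,ou=people,dc=example,dc=edu": {"ou": ["history"], "title": ["professor"]},
}
physics = explicit_members(
    {"kind": "implicit", "attribute": "ou", "value": "physics"}, PEOPLE)
professors = explicit_members(
    {"kind": "dynamic", "matches": lambda e: "professor" in e.get("title", [])}, PEOPLE)
```

The cached result is a snapshot: a dynamic group expanded this way is only as fresh as the moment of expansion, which matters for the authorization case.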
4.3) Authorization and access control
A common case for this is authorization by group membership, i.e. a
user is authorized to access something if they are a member of a certain
group. It is likely that supporting some of these cases would be
extremely valuable to organizational users... since Chandler can
share data, limiting access by mechanisms already in use in an
organization may even be a requirement.
There are a couple of difficulties with supporting this in
Chandler using this scheme...
- Data freshness is very important
- The transforms must be unambiguous
These should not be insurmountable. We can use an attribute
to indicate that cached data may not be used for authorization, so
the data must either be looked up in the LDAP directory again, or
authorization must fail with an appropriate exception. See also
the section on caching policy above.
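The look-up-again-or-fail rule could be sketched as follows. The flag name usable_for_authorization, the exception class, and the use of ConnectionError for an unreachable directory are assumptions made for the example.

```python
class AuthorizationUnavailable(Exception):
    """Raised when fresh data is required but the directory cannot be reached."""


def is_authorized(user, group, cached_groups, fetch_members):
    """Decide membership, refusing to trust cache entries marked as unusable."""
    cached = cached_groups.get(group)
    if cached is not None and cached["usable_for_authorization"]:
        return user in cached["members"]
    try:
        members = fetch_members(group)   # fresh lookup in the LDAP directory
    except ConnectionError as exc:
        raise AuthorizationUnavailable(group) from exc
    return user in members


def offline(group):
    raise ConnectionError("directory unreachable")


cache = {"staff": {"members": {"ann"}, "usable_for_authorization": False}}
```

With this cache, is_authorized("ann", "staff", cache, offline) raises AuthorizationUnavailable even though "ann" is in the cached member list, so a stale membership never grants access.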
There are other authorization cases as well, such as authorization
by existence of a particular attribute. I believe that all these
cases can in fact be mapped to an instance of authorization by
groups internal to Chandler, so they could all be handled by
appropriate customization of the filters.
5) Miscellaneous considerations
5.1) LDAP binding and data access
Not all data that a Chandler user might want from an LDAP directory
is likely to be public, and in the higher-education context regulations
like FERPA place requirements of limited access on such directories.
We can assume that access control is already implemented at the
LDAP level, so to function in such environments, Chandler will
have to authenticate as a user who has some level of privileges
on that LDAP server.
There is a potential catch-22 here... in order to authenticate to
the LDAP server a user (or a client program acting on their behalf)
will need to know their LDAP DN, but they may not be able to
discover this DN without doing an LDAP search, which in turn they
can't do without being authenticated. Resolving this catch-22
is outside of the scope of this document, but Chandler must have
the flexibility to implement whatever approaches or compromises
organizations adopt as their local policy.
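One way to keep that flexibility is to treat DN discovery as a list of pluggable strategies tried in order. The sketch below shows two common compromises, a fixed DN template and an anonymous search on a limited attribute; the function names, the template, and the stand-in search function are assumptions, and neither strategy is proposed here as the resolution.

```python
import re


def dn_from_template(login, template="uid={login},ou=people,dc=example,dc=edu"):
    """Build the DN from a site-wide pattern; no directory access needed."""
    if not re.fullmatch(r"[A-Za-z0-9._-]+", login):
        raise ValueError("login contains characters that would need DN escaping")
    return template.format(login=login)


def dn_from_search(login, anonymous_search):
    """Use an unauthenticated search, where local policy allows one."""
    matches = anonymous_search("uid", login)
    if len(matches) != 1:
        raise LookupError("expected exactly one entry for %r" % login)
    return matches[0]


def resolve_dn(login, strategies):
    """Try each configured strategy in turn until one yields a DN."""
    for strategy in strategies:
        try:
            return strategy(login)
        except (LookupError, ValueError):
            continue
    raise LookupError("no strategy could resolve a DN for %r" % login)


# Stand-in for a real anonymous LDAP search, for demonstration only.
DIRECTORY = {"ann": ["uid=ann,ou=staff,dc=example,dc=edu"]}


def anonymous_search(attribute, value):
    return DIRECTORY.get(value, []) if attribute == "uid" else []


strategies = [lambda login: dn_from_search(login, anonymous_search), dn_from_template]
```

The resolved DN would then be used for the authenticated bind; which strategies are enabled, and in what order, would be a matter of local configuration.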
--
JurgenBotz - 07 Apr 2003