Cosmo 0.7 Backend Change Proposals
The following list is ordered by importance to 0.7, with the items at the end really being post-0.7 nice-to-have features.
Time Range Index/Query Refactoring
Currently Cosmo expands recurring events and indexes each occurrence. The advantage is that time-range searches are very fast, because the expansion is done only once, at creation time. The drawbacks are that it can result in a large amount of indexed data (and therefore slow event creation/update times) and that Cosmo can't handle infinite recurring events well. The current hack for recurring events with no end date is to index instances only up to the end of 2008. This means that if you use the Cosmo UI and fast forward a couple of years to view your event in 2009, you are out of luck.
Can we solve this by clever indexing of recurring events so that we don't have to expand them? Probably not, because Cosmo currently supports arbitrary recurrence and exception rules (as defined in the iCalendar spec). But we can come up with a hybrid approach.
We determined that we can't really index each recurrence because there can be infinite recurrences. Instead, the index for an event can look like:
| index column || purpose |
| dtstart || Indexes the start date/time of the event (DTSTART) in either UTC or floating date/time |
| dtend || Indexes the end date/time of the event (DTEND) in either UTC or floating date/time. For recurring events, this will be the end date/time of the last occurrence. If it is an infinite recurring event, then the value will be a constant date/time that represents infinity (something like 30300101) |
| isFloating || true/false if the event is floating or not |
| isRecurring || true/false if the event is recurring or not |
This means there will be a single index row per event.
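As a rough illustration of the table above, here is a minimal sketch of how the single index row might be derived from an event. The names `TimeRangeIndexRow`, `build_index_row`, and `INFINITE_END` are hypothetical and not part of Cosmo; the sentinel value simply follows the "something like 30300101" suggestion.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Hypothetical sentinel for "infinity"; the proposal only suggests
# "something like 30300101".
INFINITE_END = datetime(3030, 1, 1)


@dataclass(frozen=True)
class TimeRangeIndexRow:
    """One index row per event, mirroring the four columns in the table."""
    dtstart: datetime
    dtend: datetime
    is_floating: bool
    is_recurring: bool


def build_index_row(dtstart: datetime,
                    dtend: datetime,
                    is_floating: bool,
                    is_recurring: bool = False,
                    last_occurrence_end: Optional[datetime] = None) -> TimeRangeIndexRow:
    """Build the index row for an event.

    For a recurring event, dtend is the end of the last occurrence, or the
    sentinel when the recurrence never ends (last_occurrence_end is None).
    """
    if is_recurring:
        end = last_occurrence_end if last_occurrence_end is not None else INFINITE_END
    else:
        end = dtend
    return TimeRangeIndexRow(dtstart, end, is_floating, is_recurring)
```

Computing `last_occurrence_end` for a finite recurrence still requires looking at the rule once at creation/update time, but only one row is written regardless of how many occurrences there are.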
The algorithm for a time-range query would look something like:
- Use time range index to retrieve all events that may occur in the specified range. This is a first pass, and we can't return this list because there may be recurring events that may or may not occur.
- For each event returned by the initial query, prune recurring events that do not occur in the specified time range. That is, for each recurring event in the returned set, expand the recurrence rules, exception rules, etc., and if the event turns out not to occur in the specified time range, remove it from the list.
- Return this pruned list.
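The three steps above can be sketched as follows. This is only an illustration under simplifying assumptions: the hypothetical `Event` class models recurrence as a fixed interval with an optional count, whereas real iCalendar rules and exceptions are arbitrary, and an in-memory list stands in for the database index.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterator, List, Optional, Tuple

INFINITE_END = datetime(3030, 1, 1)  # hypothetical "infinity" sentinel


@dataclass
class Event:
    name: str
    dtstart: datetime
    duration: timedelta
    interval: Optional[timedelta] = None  # None means the event does not recur
    count: Optional[int] = None           # None with an interval means infinite

    @property
    def is_recurring(self) -> bool:
        return self.interval is not None

    @property
    def index_dtend(self) -> datetime:
        """Value that would be stored in the dtend index column."""
        if not self.is_recurring:
            return self.dtstart + self.duration
        if self.count is None:
            return INFINITE_END
        return self.dtstart + self.interval * (self.count - 1) + self.duration

    def occurrences_until(self, limit: datetime) -> Iterator[Tuple[datetime, datetime]]:
        """Yield (start, end) for each occurrence that starts before limit."""
        n = 0
        start = self.dtstart
        while start < limit and (self.count is None or n < self.count):
            yield start, start + self.duration
            if self.interval is None:
                return
            n += 1
            start = self.dtstart + self.interval * n


def query_time_range(events: List[Event],
                     range_start: datetime,
                     range_end: datetime) -> List[Event]:
    # Step 1: index lookup; returns every event that *may* occur in the range.
    candidates = [e for e in events
                  if e.dtstart < range_end and e.index_dtend > range_start]
    # Step 2: expand recurring candidates on the fly and prune the misses.
    result = []
    for e in candidates:
        if not e.is_recurring:
            result.append(e)
        elif any(end > range_start for _, end in e.occurrences_until(range_end)):
            result.append(e)
    # Step 3: return the pruned list.
    return result
```

Because the expansion stops at the end of the queried range, an infinite recurrence costs only as many iterations as there are occurrences between its start and the end of the range.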
The first thing to notice is that this algorithm will be slower than the current time-range query algorithm because the expansion has to be done on the fly. However, creation/update times will be faster, and we may be able to improve query times with clever caching.
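One possible form of such caching, sketched here purely as an assumption, is to memoize the "does this event occur in this range" answer, keyed by the event and a version number that changes on every update. The function `occurs_in_range`, its parameters, and the fixed-interval recurrence model are all hypothetical simplifications.

```python
from datetime import datetime, timedelta
from functools import lru_cache


@lru_cache(maxsize=1024)
def occurs_in_range(uid: str,
                    version: int,
                    dtstart: datetime,
                    interval: timedelta,
                    duration: timedelta,
                    range_start: datetime,
                    range_end: datetime) -> bool:
    """Expand a fixed-interval, never-ending recurrence up to range_end.

    uid and version are not used in the computation; they are part of the
    cache key so that updating an event (which bumps its version) makes the
    old cached answers unreachable.
    """
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    start = dtstart
    while start < range_end:
        if start + duration > range_start:
            return True
        start += interval
    return False
```

Repeated queries for the same range, such as several clients viewing the same week, would then skip the expansion, at the cost of holding up to `maxsize` answers in memory.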
Performance and Scalability
Due to all the recent model changes, we need to spend some time examining server performance. This means first defining a set of metrics to measure performance, then testing to identify bottlenecks and other problems, and finally allowing time to fix them.
Tweak Item/Collection Security
Currently, Item authorization is based on whether you are the owner of the item or hold a shared ticket that includes the target item or collection. With items in multiple collections, we have run into problems with this model. What happens when I copy an item with access X to a collection with access Y? Currently, the item inherits both sets of permissions based on the ticket used. This results in an interesting security hole that needs to be worked on. Ideas have ranged from per-Item access to item soups segregated by user.
Event Persistence Re-factoring
We are currently persisting events by storing the .ics. This has the advantage of easily supporting arbitrary event data as long as it conforms to RFC 2445. This works great for CalDAV, but because of Chandler's data model, Cosmo ends up persisting event properties multiple times. For example, item.displayName is the same as the event SUMMARY, and note.body is the same as DESCRIPTION. So every time the note body or item displayName is changed, the note has to be checked for an EventStamp, and if one exists, the .ics has to be updated. It becomes a bigger problem with displayAlarm, which is stored in the .ics too. This means we can't really store a displayAlarm for non-events (currently we use note.reminderTime).
For 0.7 we may want to change the way events are persisted to be closer to Chandler's model. That is, event properties would be stored as item properties. The downside, of course, is that it becomes more difficult to support full iCalendar (properties that Cosmo/Chandler don't use, x-props, x-params, etc.).
Currently, all application logging is done at the request level by Tomcat and stuffed in log files. It would be nice to log all sorts of stuff at the service level, into a database table. That way, the data is centralized and can be manipulated easily with SQL queries. The next step would then be to add in some reporting functionality in Cosmo.
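To make the idea concrete, here is a minimal sketch of service-level logging into a database table, using SQLite only for illustration. The table name `service_log`, its columns, and the helper functions are assumptions, not an existing Cosmo schema.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Dict


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open the log database and create the (hypothetical) log table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS service_log (
               id        INTEGER PRIMARY KEY AUTOINCREMENT,
               logged_at TEXT NOT NULL,
               username  TEXT NOT NULL,
               operation TEXT NOT NULL,
               target    TEXT NOT NULL
           )"""
    )
    return conn


def log_event(conn: sqlite3.Connection, username: str, operation: str, target: str) -> None:
    """Record one service-level operation with a UTC timestamp."""
    conn.execute(
        "INSERT INTO service_log (logged_at, username, operation, target) VALUES (?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), username, operation, target),
    )
    conn.commit()


def operations_per_user(conn: sqlite3.Connection) -> Dict[str, int]:
    """Example report: number of logged operations per user."""
    rows = conn.execute(
        "SELECT username, COUNT(*) FROM service_log GROUP BY username ORDER BY username"
    )
    return dict(rows.fetchall())
```

A report such as `operations_per_user` is then a single SQL query instead of a pass over scattered log files.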
Cosmo can provide more management functionality using JMX. For example, create an MBean that shows meaningful statistics, or provides useful management functionality (need to define what this is).
Another useful feature would be system-wide notifications. For example, every time a collection or event is created/updated, a notification could be broadcast. There could be a simple plug-in architecture for notification listeners so that notifications could trigger things like "send me email when X is updated". We could utilize a messaging standard like JMS.
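A simple in-process version of such a plug-in architecture might look like the sketch below. The `NotificationBus` class and the event-type strings are hypothetical; a real implementation could sit on top of JMS rather than direct function calls.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A listener receives the event type and the uid of the affected item.
Listener = Callable[[str, str], None]


class NotificationBus:
    """Minimal publish/subscribe registry for notification listeners."""

    def __init__(self) -> None:
        self._listeners: DefaultDict[str, List[Listener]] = defaultdict(list)

    def subscribe(self, event_type: str, listener: Listener) -> None:
        """Register a listener for one event type, e.g. "event.updated"."""
        self._listeners[event_type].append(listener)

    def publish(self, event_type: str, item_uid: str) -> None:
        """Broadcast a notification to every listener of that type."""
        # Copy the list so listeners may subscribe others while being called.
        for listener in list(self._listeners.get(event_type, ())):
            listener(event_type, item_uid)
```

A "send me email when X is updated" plug-in would then be one listener subscribed to the update event type that filters on the item uid.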
Currently, you have to create an account on Cosmo to authenticate. It would be nice to allow other types of authentication, including SSO (single sign-on). A Cosmo user would still have to be created and linked to items/collections, but this user would be created automatically upon successful authentication with another system.
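The automatic user creation described above could be sketched roughly as follows. `CosmoUser`, `UserDirectory`, and `on_external_login` are hypothetical names, and the sketch assumes the external system has already authenticated the user and supplied a stable identifier.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class CosmoUser:
    username: str
    auto_provisioned: bool = False


class UserDirectory:
    """In-memory stand-in for the Cosmo user store."""

    def __init__(self) -> None:
        self._users: Dict[str, CosmoUser] = {}

    def on_external_login(self, external_id: str) -> CosmoUser:
        """Return the Cosmo user linked to an externally authenticated id.

        The user is created on first login so that items and collections
        always have a local owner to link to.
        """
        user = self._users.get(external_id)
        if user is None:
            user = CosmoUser(username=external_id, auto_provisioned=True)
            self._users[external_id] = user
        return user
```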
- 27 Mar 2007