Status:
Andi: Skiing
Ted: Agreement w/ John on QueryItem, will try to do a little code this week
Heikki: good progress on PKI
0.4 Planning:
How much of sharing/access control can be done? Andi thought that if 0.4 is going to be longer than 0.3, two major features could be done (sharing and access control each count as one).
Data model issues need to be resolved; Ted has the ball to hold a meeting.
Ted raised the issue that we need to spend some time on stability and footprint work for 0.4.
PyCon:
Review of current status. One more person signed up on the Wiki, plus hazmat from IRC will be going and splitting time between Chandler and distutils.
Mitch is arriving Tuesday PM and leaving Thursday afternoon. We need to try to get the BOF scheduled accordingly.
ItemCollections/PlayLists/Etc:
Mitch is trying to rationalize/unify these concepts and wanted to know some more about the implementation of item collections.
Question: How would you model "all e-mails from Freada"?
Answer:
Assume Freada is an item
The Freada item has a bi-directional ref collection attribute linking it to the e-mails sent by Freada.
The contents of the ref collection are the answer.
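The modelling described above can be sketched roughly as follows. This is only an illustration: the `Item` class, the `sent_emails` and `sender` attributes, and the `link_sender` helper are hypothetical names, not the actual repository API.

```python
class Item:
    """Hypothetical stand-in for a repository item (not the real API)."""

    def __init__(self, name):
        self.name = name
        self.sent_emails = []  # ref collection end: person -> e-mails
        self.sender = None     # opposite end of the ref: e-mail -> person


def link_sender(person, email):
    """Set both ends of the bi-directional reference together."""
    email.sender = person
    person.sent_emails.append(email)


freada = Item("Freada")
m1 = Item("Lunch?")
m2 = Item("Budget")
link_sender(freada, m1)
link_sender(freada, m2)

# "All e-mails from Freada" is simply the contents of the ref collection.
answer = [email.name for email in freada.sent_emails]
```

No search is needed here: the answer is read straight off the Freada item, and each e-mail can reach its sender through the other end of the reference.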
Question: Okay, what about e-mails sent from Freada between Jan 2003 and Feb 2003?
Answer: Use the ref collection as the starting point for the query, and then use range query on dates to locate the right messages.
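A minimal sketch of that two-step approach, assuming the ref collection is available as a date-sorted list of (date, subject) pairs and that "between Jan 2003 and Feb 2003" means 1 Jan 2003 up to but not including 1 Mar 2003. The sample data and the `emails_in_range` helper are made up for illustration.

```python
from bisect import bisect_left
from datetime import date

# Hypothetical contents of Freada's ref collection, kept sorted by date.
freada_emails = sorted([
    (date(2002, 12, 30), "Year-end"),
    (date(2003, 1, 15), "Lunch?"),
    (date(2003, 2, 20), "Budget"),
    (date(2003, 3, 5), "Offsite"),
])


def emails_in_range(emails, start, end):
    """Return subjects with start <= date < end via binary search on the dates."""
    dates = [d for d, _ in emails]
    lo = bisect_left(dates, start)  # first e-mail on or after start
    hi = bisect_left(dates, end)    # first e-mail on or after end
    return [subject for _, subject in emails[lo:hi]]


result = emails_in_range(freada_emails, date(2003, 1, 1), date(2003, 3, 1))
```

The range query only ever looks at Freada's e-mails, never at the whole mail store. An ordered structure such as a BTree would support the same range lookup without first copying the dates into a list, as this sketch does.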
Question: What about indexes?
Answer: ref-collections are implemented using BerkeleyDB BTrees. Indices are typically implemented using BTrees, so one way of looking at the use of ref-collections is that you are creating indices in the schema. This means that the schema design is important.
Question: What about ad-hoc queries over all e-mail?
Answer: If you've never asked the query and there are no indices (ref collections or extra indices), then the best we can probably do is an exhaustive search. An advanced query optimizer might be able to piece things together and do better, but that would definitely be out of scope for 0.4.
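For comparison, an exhaustive search amounts to something like the sketch below. The record layout and the `exhaustive_search` helper are hypothetical and only meant to show the cost model.

```python
# Hypothetical e-mail records; the field names are illustrative only.
all_emails = [
    {"sender": "Freada", "subject": "Lunch?"},
    {"sender": "Mitch", "subject": "BOF schedule"},
    {"sender": "Freada", "subject": "Budget review"},
]


def exhaustive_search(emails, predicate):
    """Test every item against the predicate; nothing is skipped."""
    return [email for email in emails if predicate(email)]


# An ad-hoc query that no ref collection or index was designed for.
hits = exhaustive_search(
    all_emails, lambda email: "budget" in email["subject"].lower()
)
```

Every e-mail is examined once, so the cost grows with the size of the whole collection rather than with the size of the answer.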
Question: What's the relationship between ref-collections/schema/modelling and queries?
Answer: ref-collections are a form of explicit index. All query systems work by taking stock of extant indices (BTrees, hash tables, etc.) and trying to figure out how best to put them to use, so the relationship is symbiotic. Where a query is executed frequently, the schema will probably be designed with it in mind (the appropriate ref-collections will exist), and the query will know to make use of them. For more ad-hoc queries, the query system will do its best to make use of any extant indices as it evaluates the query.
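The "take stock of extant indices, otherwise scan" behaviour could look roughly like this toy sketch. The `indices` registry and the `run_query` function are invented for illustration and do not reflect how the query system is or will be implemented.

```python
all_emails = [
    {"sender": "Freada", "subject": "Lunch?"},
    {"sender": "Mitch", "subject": "BOF schedule"},
    {"sender": "Freada", "subject": "Budget review"},
]

# Hypothetical registry of existing indices: attribute -> {value: [items]}.
# Only "sender" is indexed, playing the role of a ref collection in the schema.
indices = {"sender": {}}
for email in all_emails:
    indices["sender"].setdefault(email["sender"], []).append(email)


def run_query(attribute, value):
    """Use an index on the attribute if one exists, otherwise scan everything."""
    index = indices.get(attribute)
    if index is not None:
        return "index", index.get(value, [])
    return "scan", [e for e in all_emails if e.get(attribute) == value]


plan_a, by_sender = run_query("sender", "Freada")    # served by the index
plan_b, by_subject = run_query("subject", "Lunch?")  # falls back to a scan
```

A frequently asked question such as "e-mails from Freada" is answered directly from the structure the schema already provides, while the unanticipated subject query has to visit every item.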
--
TedLeung - 09 Mar 2004