Barry Warsaw Conversation Notes
* The feed parser checked in to /Lib/email on the Python head is a great improvement over previous Python mail message parsers. It handles badly formatted messages better and processes incoming data line by line. The parser is still under development and will be a major addition to the email 3.0 release scheduled for the end of the summer.
* antiword is an open source Word document reader which can be used to pull text from a Word document attachment for searching.
* ReportLab may have an open source Adobe PDF viewer which could be used to pull text from a PDF document attachment for searching. More investigation is needed.
* Better Unicode support will be a key feature of email 3.0: the parser will attempt to detect the appropriate encoding (utf-7, utf-8, etc.) if none is specified.
* A desirable feature of the parser would be event callbacks for inline spam probing with a product such as spambayes.
* The IMAP libraries in Python have been fairly stable for some time, and no major development is currently planned. Some issues were raised regarding the libraries' robustness (quirky IMAP servers, pipelining).
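
The incremental parsing described in the feed parser note can be sketched roughly as follows. This assumes the parser is exposed as `email.feedparser.FeedParser`, as it is in current Python releases; the version discussed in these notes was still under development and may have differed. The helper name `parse_incrementally`, the addresses, and the message text are made up for illustration.

```python
from email.feedparser import FeedParser


def parse_incrementally(chunks):
    """Feed raw message text to the parser piece by piece.

    `chunks` is any iterable of strings, e.g. data read from a socket.
    Chunks do not have to end on line boundaries; the parser buffers
    partial lines internally until the rest of the line arrives.
    """
    parser = FeedParser()
    for chunk in chunks:
        parser.feed(chunk)
    # close() finishes parsing and returns the root message object.
    return parser.close()


# Hypothetical message, deliberately split in the middle of a header line.
raw_chunks = [
    "From: alice@example.com\r\nSub",
    "ject: feed parser test\r\n",
    "\r\nBody line one\r\n",
    "Body line two\r\n",
]

msg = parse_incrementally(raw_chunks)
print(msg["Subject"])
print(msg.get_payload())
```

Because the data is pushed in with `feed()`, the caller never needs the whole message in memory before parsing starts, which suits reading mail directly off a network connection.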
--
BrianKirsch - 20 Apr 2004