External Email Libraries (OSAF)
Note: this page is perhaps historically interesting, but the choice was made and the libraries integrated: Chandler uses Twisted's libraries.
This area is for discussing various external libraries that could perhaps be used for email.
While the various protocols look simple at first, they get complicated very fast, so it would save us a lot of time if we could use someone else's code.
Here is a preliminary evaluation designed only to answer the question, "Does OSAF need to develop our own code for POP, SMTP, IMAP, and message parsing, or can we leverage someone else's work?"
The answer is, surprisingly, not completely clear, and depends in part on what kind of license OSAF can tolerate. (See Main/ImplicationsofOtherLicenses.) Some of the libraries are GPL, which would eliminate the possibility of selling commercial licenses. Some of the libraries are LGPL or similar licenses, which wouldn't eliminate selling commercial licenses, but it would make the license agreement a bit more complicated.
IMAP and MIME are much more difficult than POP and SMTP. SMTP and POP are mature enough and simple enough that there aren't any significant implementation differences. MIME is tricky mostly because there are lots of email programs in the wild that get it wrong. IMAP is tricky partly because it is just more complex, but also because it is fundamentally asynchronous. Because two clients can be accessing an account simultaneously, the client must be able to accept messages from the server at any time.
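To make the asynchrony point concrete: in IMAP, server lines that begin with "*" are untagged data that can arrive whether or not the client asked for them, lines that begin with "+" ask the client to continue, and any other line starts with the tag of a command the client issued. The following is only a minimal sketch of that distinction in Python; the function names `classify_response` and `untagged_data` are hypothetical and do not come from any of the libraries evaluated here.

```python
def classify_response(line: str) -> str:
    """Classify one IMAP server line by its leading token."""
    if line.startswith("* "):
        return "untagged"      # server data, possibly unsolicited
    if line.startswith("+"):
        return "continuation"  # server is waiting for more client input
    return "tagged"            # completion of a command the client sent


def untagged_data(lines):
    """Return the untagged lines, which a client must be ready to handle at any time."""
    return [line for line in lines if classify_response(line) == "untagged"]


# Example: another client added a message and expunged one while we sent NOOP.
session = ["* 23 EXISTS", "* 3 EXPUNGE", "A001 OK NOOP completed"]
changes = untagged_data(session)
```

A synchronous library that only reads from the socket after sending a command will see such untagged lines late, which is the practical cost of the synchronous designs discussed below.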
While it isn't particularly risky for OSAF to implement POP and SMTP themselves, there is a significant risk in writing MIME and IMAP handlers. In the comments below, items labeled "Private comment" are usually KDS' interpretation of what somebody said in email or IRC.
BSD-only
If OSAF is restricted to BSD-style licenses:
- IMAP: There is no good BSD-style IMAP library, but there are three suboptimal alternatives.
- The c-client library was written by the author of the IMAP spec, so it is pretty much guaranteed to be IMAP-compliant and fully functional. However, many people complain about the code being opaque and synchronous. (Note that c-client also has POP, SMTP, and message-parsing components.)
- The Python distribution's imaplib.py is also synchronous and incomplete. It might be possible to contract out improvement to that library; there are competent people who would love to take a month and fix it up.
- An external reviewer found the LibEtPan IMAP code to be well thought through but poorly documented. There has been essentially no activity around LibEtPan for a very long time, so OSAF shouldn't count on a lot of support. It is in C. It might be a good start, but it would certainly need work. (Note that libEtPan has POP/SMTP/message parsing components.)
- Message parsing: The Python distribution's email module seems to be fine. It's pretty robust and many other pieces of code use it. It doesn't handle non-compliant messages perfectly, but Anthony Baxter is actively working on improving it (as part of the very active spambayes project). In addition, Anthony Baxter and Barry Warsaw (the original author) have been very responsive.
- libEtPan and c-client have message-parsing components, but the same concerns apply.
- POP/SMTP: The Python distribution's poplib and smtplib are serviceable but synchronous. OSAF would need to write a wrapper around them.
- Note that apparently Twisted, which is listed under LGPL, sells other licenses as well.
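One way to picture the wrapper that the synchronous poplib and smtplib would need is to push each blocking call onto a worker thread and deliver the result through a callback, so the UI thread never waits on the network. This is a sketch under assumptions, not a design OSAF adopted: it uses modern Python 3's `concurrent.futures`, and the names `run_in_background` and `count_messages` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
import poplib

_executor = ThreadPoolExecutor(max_workers=2)


def run_in_background(blocking_call, *args, on_done):
    """Run a blocking call on a worker thread and pass its result to on_done.

    Note: on_done normally runs on the worker thread, so a GUI application
    would still need to hand the result back to its own event loop.
    """
    future = _executor.submit(blocking_call, *args)
    future.add_done_callback(lambda f: on_done(f.result()))
    return future


def count_messages(host, user, password):
    """Blocking helper: log in to a POP3 server and return the message count."""
    conn = poplib.POP3(host)
    try:
        conn.user(user)
        conn.pass_(password)
        count, _mailbox_size = conn.stat()
        return count
    finally:
        conn.quit()
```

A caller would write something like `run_in_background(count_messages, host, user, password, on_done=show_count)`, where `show_count` is whatever the application uses to update its display. Error handling is deliberately left out of the sketch.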
LGPL
If OSAF makes the business decision that LGPL libraries are acceptable, then more options arise.
- IMAP: Mozilla and Twisted
- The Mozilla code is clearly robust. There are also lots of people around OSAF who are familiar with Mozilla code. The build process has traditionally been scary, and we don't yet know if Mozilla depends upon any libraries that are themselves GPL'd.
- The Twisted IMAP code is a little hairy but quite readable, asynchronous, and in what appears to be a well-thought-out if complex-and-difficult-to-grok framework. It is very new -- and so probably isn't totally mature yet -- but there is a lot of activity and excitement around Twisted. If OSAF takes the IMAP code, then OSAF pretty much has to take all of the framework code as well to make it work -- Twisted is itself an application that treats other code like plug-ins. If OSAF takes the framework, they might as well take some of the other protocols as well. Twisted is in Python (vs. C++ for Mozilla), which might be an advantage or a disadvantage.
- POP/SMTP/Message parsing: If OSAF takes IMAP from either Mozilla or Twisted, it would seem reasonable to take the other libraries from there as well.
- Message parsing: some people reportedly prefer the Mailutils libmailbox utilities.
Next steps:
- OSAF needs to decide -- from a business perspective -- what type of license OSAF is willing to accept.
- A comprehensive testing of candidate libraries would be nice, but has a non-zero cost. In particular, to do a rigorous test of robustness and feature comprehensiveness would take, well, basically, building an email client and testing it against a number of different servers.
Detailed information
Below is a table that summarizes the various libraries, with columns for the name (linked to the source code), the language, the license, and a very brief comment (linked to detailed comments where applicable). (If the code is part of a suite, the comments aren't always repeated for message parsing or POP or SMTP.)
Summary Table
| IMAP |
| Library | Language | License | Comments |
| IMAP from c-client | C | UWashington's Free-Fork License | Widely used, widely derided. |
| imaplib.py | Python | PSFL | Used some, derided some. |
| Twisted IMAP4 lib | Python | LGPL | Asynch, hairy but nice framework, very new. |
| libimap | C++ | LGPL | Skeletal, orphaned |
| LibEtPan IMAP | C | BSD | Well thought through but immature and inactive. |
| Mozilla IMAP | C++ | NPL | Robust, scary build |
| Evolution IMAP code | C | GPL | Unix-only, GPL |
| KDE IMAP | C++ | GPL | Unix-only, GPL |
| Sylpheed-Claws IMAP | C/GTK+ | GPL | GPL |
| JavaMail API | Java | Sun Community Source License | wrong language |
| Mail-IMAPClient | perl | Artistic License | good rep; wrong language, c-client based |
| Fetchmail IMAP | C | GPL | Unix-only, GPL |
| Message Parsing |
| Library | Language | License | Comments |
| email package | Python | PSFL | probably fine |
| Mozilla MIME,S/MIME | C++ | NPL | Hairy. |
| rfc822.py | Python | PSFL | Deprecated. |
| GMIME | C | GPL | GPL, used by gnome? |
| Mailutils mailbox | C | LGPL | ? |
| libEtPan | C | BSD | Immature orphan |
| Evolution MIME | C | GPL | Unix-only, GPL |
| KDE MIME | C++ | GPL | Unix-only, GPL |
| Fetchmail parsing | C | GPL | Unix-only, GPL |
| SMTP |
| Library | Language | License | Comments |
| smtplib.py source | Python | Python Software Foundation License (PSFL) | well used, synchronous |
| libEtPan! SMTP | C | BSD License | immature orphan |
| Evolution SMTP "Camel" | C | GPL | Unix-only, GPL. |
| KDE SMTP | C++ | GPL | Unix-only, GPL |
| Sylpheed-Claws SMTP | C/GTK+ | GPL | GPL |
| Fetchmail SMTP | C | GPL | Unix-only, GPL |
| libESMTP | C? | LGPL | Unix-only |
| POP |
| Library | Language | License | Comments |
| poplib.py source | Python | PSFL | well-used, synchronous |
| pop3proxy.py | Python | PSFL | not suitable |
| Libpostal POP libraries | ? | ? | ? |
| ZOE SZPOP.java | Java | Creative Commons | wrong language |
| Fetchmail POP | C | GPL | Unix-only, GPL |
| Twisted pop3.py | Python | Lesser GPL | Maybe hard to understand |
| libEtPan POP | C | BSD | immature, orphan |
| Evolution POP libraries | C | GPL | Unix-only, GPL |
| KDE POP | C++ | GPL | Unix-only, GPL |
| Sylpheed-Claws POP | C/GTK+ | GPL | GPL |
Comments
IMAP libraries
- University of Washington IMAP Toolkit (C, University of Washington's Free-Fork License) by Mark Crispin of University of Washington
- Private comment: Part of c-client. UW lib is the industry standard for compatibility, but needs some pretty substantial modifications to get it to do interruptability properly. (It has much more than IMAP, but there are some knocks on its POP support.)
- Used by Mahogany -- and UI freezes every time POP is polled.
- Main.DuckySherwood: The rumors have it that it's very powerful but hard to understand. See Greg Noel's critique, Terry Grey's rebuttal, and Greg Noel's rebuttal rebuttal
- KDS: C-client is pretty much guaranteed to be correct almost by definition (since the IMAP spec author wrote it) but I've heard complaints about the coding style; it also allegedly depends heavily on global variables, so is dangerous to use from more than one thread.
- imaplib.py source and docs (Python, PSFL) in standard Python distribution. Used by: WorldPilot, UsableEmail, Zope-Messages, OfflineIMAP.
- Private comment: imaplib.py totally sucks, it's not even close to complete
- Main.DuckySherwood: Seems to be missing login referral, mailbox referral, IDLE, UIDPLUS, ID, MULTIAPPEND. Only has CRAM_MD5 authentication, not KERBEROS_V4, GSSAPI, or SKEY. (There might, however, be a way to hook in different authentication mechanisms that are coded externally.)
- Twisted IMAP4 lib (Python/LGPL) This project is sponsored in part by Divmod.com.
- KDS: Spotty comments -- sometimes exceptional, sometimes sparse, but overall pretty good. (NOT like the Twisted POP code.) Nice short methods, variable names and method names are usually descriptive. Brand-new; probably not robust yet. Asynch. Not so easy to understand -- you have to grok quite a bit of the framework in order to understand the IMAP lib. To use it, you have to include quite a bit of the framework as well.
- Private eval: Twisted is itself an application that treats other code like plug-ins. Quote from Twisted docs: "Although there are other ways for Twisted to call your code, all Twisted projects should start as a plug-in of some kind."
- KDS: what Luther and Ducky have learned about how Twisted works
- KDS: API docs
- libimap (C++, GPL lib or LGPL) by Chris Read
- Main.DuckySherwood: missing a lot of features, including AUTHENTICATE command. Also doesn't seem to be under active development, and has an unsettling CVS comment of "segfaults on semaphores in the reader thread"... not ready for prime time. Chris Read says that he'd be happy to turn this code over to someone else.
- LibEtPan IMAP (C, BSD) by DINH V. Hoa
- Main.DuckySherwood: Dinh said that he wanted to make an email library with an abstraction layer above the messages access method, be it IMAP, POP, an MH filesystem, an mbox-format file, or NNTP. He thinks it's done enough for someone to start using, but acknowledges that it is imperfect. Not under active development.
- Private eval: Well thought-through, missing some functionality, poorly commented. Immature, but good start.
- Mozilla IMAP (C++, Netscape Public License)
- KDS: The build process has traditionally been scary, but might be less scary when Minotaur is fully independent. For now, the build instructions for Minotaur are to pull and build the entire Mozilla source tree.
- KDS: While the Mozilla license is LGPL-ish, there's always the possibility that it might depend upon libraries that are either GPL or extensive enough that Chandler would collapse under their weight. A Mozilla team member tells me that it does depend upon Mozilla's network libraries. Further investigation required: stay tuned to ThunderbirdBuildNotes.
- Note: because the Mozilla bug database is open, that will help us be aware of/iron out interoperability issues if we start with immature (or non-existent) code.
- KDS: I did a fairly extensive search through the Mozilla code base and found a few additional licenses in it.
- Timo Sirainen: Mozilla's behaviour seems to be mostly good, except it also doesn't handle shared mailboxes well.
- Evolution IMAP code (C, GPL) used by Evolution
- KDS: seemed extremely hairy and complex to me. Nice idea to superclass everything in "camel" but hard for me to follow. Unix-only; would need to be ported to Windows.
- Fetchmail (C, mostly GPL2) by Eric Raymond
- KDS: Fetchmail is designed for Unix systems -- it downloads messages from POP or IMAP, then feeds them to localhost port 25 using SMTP. It's not exactly an email program, but it does have much of the same code that you'd want for an email program. Great comments documenting corner cases or oddnesses of the spec/implementations. However, it has a fair number of static variables, so it might be a bit tricky to make thread-safe.
- KDE IMAP (C++/GPL) used by KMail
- KDS: reasonable variable names, sometimes long methods, minimal commenting. Currently Unix-only.
- Mail-IMAPClient (perl/Artistic License)
- Private comment: good reputation, very complete package (including test cases)
- Sylpheed-Claws IMAP (C/GTK+, GPL) by Hiroyuki Yamamoto
- KDS: Looks extensive, though has lots of static variables.
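On the question raised above of whether imaplib.py can take authentication mechanisms that are coded externally: the imaplib in current Python 3 exposes `IMAP4.authenticate(mechanism, authobject)`, where `authobject` is a callable that receives the server challenge as bytes and returns the response bytes (imaplib does the base64 encoding). The sketch below plugs in SASL PLAIN that way. The helper names `make_plain_authobject` and `login_plain` are hypothetical, and whether a given server accepts PLAIN is an assumption.

```python
import imaplib


def make_plain_authobject(user: str, password: str):
    """Build a callable usable as imaplib's authobject for SASL PLAIN."""
    def authobject(challenge: bytes) -> bytes:
        # PLAIN ignores the challenge: empty authzid, NUL, user, NUL, password.
        return b"\0" + user.encode("utf-8") + b"\0" + password.encode("utf-8")
    return authobject


def login_plain(host: str, user: str, password: str) -> imaplib.IMAP4_SSL:
    """Connect over TLS and authenticate with the externally coded mechanism."""
    conn = imaplib.IMAP4_SSL(host)
    conn.authenticate("PLAIN", make_plain_authobject(user, password))
    return conn
```

Mechanisms such as GSSAPI need a multi-step exchange, so their `authobject` would have to keep state between calls; that is beyond this sketch.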
Message parsing (RFC2822, MIME, S/MIME)
- email package and docs (Python, PSFL) in standard Python distribution
- Used by: Zope, UsableEmail, Mailman, TDMA, SpamBayes.
- Barry Warsaw, the author of the email module, says: The email package is all about manipulating email messages. It very specifically avoids any notion of transmission. The key idea behind the email package is to separate the object model from parsing and generating, the benefit being that you can write custom parsers and generators and still interoperate. Goals for the email package include full RFC compliance and replacing modules such as rfc822.py, mimify.py, etc. Some RFCs may not be supported directly, but are not hard to implement (e.g. RFC 2557). I'm open to contributions for any standards or other features that folks might think are missing.
- Barry continues: While I believe the parser is quite robust and correct, it could probably use some optimization. Anthony Baxter is working on that, and fortunately there are lots of unit tests for the trickier aspects. Note too that the standard parser has both a strict and a lax mode. The latter does pretty well with some of the more, er, interesting messages you see in the wild but you still get messages that are too broken to parse. The vast majority of such messages are spam and viruses anyway. There's a HeaderParser class that can be used as the ultimate fallback parser and that should be able to handle just about any arbitrary stream of characters. :)
- Anthony Baxter (in email) said: I'm using it in production for a voice messaging server - this provides a phone interface to people's mailboxes. This gets pretty heavy use. The major problem with the current email package (in my opinion) is the parser isn't particularly robust in the presence of malformed messages. [...] I'm currently re-writing the Parser module to take a completely different approach. [...] My goal is to have the parser have a "give it your best shot" mode, where it will do the best that it can with a message (currently, stuff that's too broken for the parser results in getting nothing at all). [...] The current email parser isn't overly zippy - if in the rewrite I can make it faster, that's well and good, but my main concern is handling terminally broken messages as well as is conceivably possible.
- Anthony continues: Lest this all seem too negative, it's only the Parser that's needing this level of work. The rest of the code (for instance, the stuff for building messages) is fine, and a pleasure to work with. And for 99.9+% of messages, the current stuff works fine. It's [received messages that are] seriously broken MIME that [are] a problem. [...] As far as fixability, Barry's always more than happy to accept fixes.
- Private comment: As far as S/MIME libraries go, the best thing I've seen are hooks into OpenSSL. KDS: That will do the encryption for you, but not the certificate management.
- rfc822.py
- Used by smtplib, UsableMail
- Deprecated.
- Mozilla MIME and Mozilla S/MIME
- GMIME independent c library used by gnome (C/GPL)
- Mailutils mailbox (C/LGPL) libmailbox
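The separation Barry Warsaw describes for the email package (an object model that is independent of parsing and generating, plus HeaderParser as a fallback) can be illustrated with a short sketch. It uses the email package as it ships in current Python 3, which may differ in detail from the 2003 version discussed here; the sample message and the function names `parse_full` and `parse_headers_only` are made up for illustration.

```python
import email
from email.parser import HeaderParser

RAW = (
    "From: ducky@example.org\n"
    "To: kds@example.org\n"
    "Subject: library evaluation\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=us-ascii\n"
    "\n"
    "Twisted looks promising.\n"
)


def parse_full(raw: str):
    """Parse headers and body into a Message object (the object model)."""
    return email.message_from_string(raw)


def parse_headers_only(raw: str):
    """Fallback: parse only the headers and leave the body as an opaque string."""
    return HeaderParser().parsestr(raw)


msg = parse_full(RAW)
subject = msg["Subject"]           # header access goes through the model
body = msg.get_payload()           # body of a non-multipart message
regenerated = msg.as_string()      # generating is a separate step from parsing
problems = msg.defects             # parse problems the parser recorded, if any
fallback = parse_headers_only(RAW)
```

Because the model sits between the parser and the generator, a custom parser for badly broken mail could produce the same Message objects without touching the code that writes messages back out.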
SMTP
- smtplib.py source and docs (Python, Python Software Foundation License (PSFL)) in standard Python distribution.
- Used by: Emailux, WorldPilot, UsableEmail.
- Barry Warsaw says: I have some experience with the *lib.py client-side modules in the standard library. smtplib.py seems fine...
- Main.DuckySherwood: well-commented, only a few terse variable names. Synchronous. Imports rfc822.py, which has been deprecated in favor of the email module. Takes message as a string.
- libEtPan! SMTP (C, BSD License, DINH Viêt Hoà)
- Sylpheed-Claws SMTP (C/GTK+, GPL) by Hiroyuki Yamamoto. Used by the Sylpheed-Claws email client.
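Since smtplib takes the message as a string, the email package and smtplib fit together by building a message object and flattening it just before sending. The following is a sketch using current Python 3 APIs (`EmailMessage`, `smtplib.SMTP.sendmail`); the function names, the addresses, and the `localhost` default are placeholders, not anything Chandler used.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Build a simple text message with the email package."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send(msg: EmailMessage, host: str = "localhost") -> None:
    """Blocking send: smtplib receives the flattened message as a string."""
    with smtplib.SMTP(host) as server:
        server.sendmail(msg["From"], [msg["To"]], msg.as_string())


draft = build_message("ducky@example.org", "kds@example.org", "hello", "body text")
wire_text = draft.as_string()
```

Note that `send` blocks until the server has accepted the message, which is the synchronous behavior the evaluation flags.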
POP
- poplib.py source and docs (Python, PSFL) in standard Python distribution
- Used by UsableMail, mailgate.
- Main.DuckySherwood: I find it a little bit hard to read because the variable names are old-school terse -- names like "o", "c", "pswd", "retval", and "a". Medium-grade comments.
- Synchronous.
- Returns a list of lines (strings), which isn't what email.Parser uses or what smtplib.py uses.
- From the poplib.py source: "POP is a line-based protocol, which means large mail messages consume lots of python cycles reading them line-by-line."
- pop3proxy.py (Python, Python Software Foundation License) by the SpamBayes project
- Main.DuckySherwood: looks very clean, well commented, easy to read, but isn't really appropriate for us. It relays commands from a client to a server; I had hoped it would act more like a client (storing messages) and a server (serving those stored messages), but it's just a pass-through. It might be useful to look at for ideas, but it's probably not useful as a stand-alone.
- POP libraries from Libpostal. Libpostal is an ongoing project to make various email management tasks more workable. Our present subgoal is to create an abstraction library for different access methods (POP, IMAP, MAPI, mbox, Maildir, mh folders, etc etc etc)
- ZOE SZPOP.java (Java, Creative Commons)
- Fetchmail - perhaps this could be used by Chandler for POP/IMAP mail retrieval?
- Twisted pop3.py (Python, Lesser GPL)
- Main.DuckySherwood: A. Nice architecture, in that it inherits much of its behavior from an abstract superclass. Right way to go. However, it has NO (zero) comments in it! What's up with that? It is also a bit hard to understand: Luther and I spent a long time trying and failing to figure out how it worked. It turned out that POPClient is supposed to be subclassed and that POPClient commands return data one line (or block) at a time and that there are funky "reactors" which do the timesharing.
- libEtPan POP (C, BSD)
- Evolution POP libraries (C, GPL)
- KDS: I found the Camel libs really hard to make sense of. That might be me not having read C in a long time, but it seemed amazingly complicated for POP....
- KDE POP (C++/minimal, though possibly tainted by GPL?)
- Sylpheed-Claws POP (C/GTK+, GPL) by Hiroyuki Yamamoto
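The mismatch noted above for poplib.py, where a retrieved message comes back as a list of lines rather than the single string or byte sequence the email parser and smtplib expect, is bridged by rejoining the lines. A minimal sketch, assuming current Python 3 (where `POP3.retr()` returns the lines as bytes without line terminators); the function name `message_from_pop_lines` is hypothetical.

```python
import email


def message_from_pop_lines(lines):
    """Join the line list returned by poplib's retr() and parse the result."""
    raw = b"\r\n".join(lines) + b"\r\n"
    return email.message_from_bytes(raw)


# With a live connection this would be: _resp, lines, _octets = conn.retr(1)
sample_lines = [
    b"From: ducky@example.org",
    b"Subject: hi",
    b"",
    b"body",
]
parsed = message_from_pop_lines(sample_lines)
```

The join copies the whole message once more in memory, which adds to the line-by-line cost the poplib.py source itself warns about for large messages.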
MAPI
- KDS: MAPI is proprietary, and I'm not finding any libraries that speak it except for that which comes with IDEs.
Security
Import Tool
Filtering
Interoperation with Microsoft Exchange
Other libraries related to email
Spell Checker
- Aspell (C++, GNU Lesser General Public License) by Kevin Atkinson
- Python Spelling Construction Project (Python, GNU Lesser General Public License) by Mike C. Fletcher. Early stage project to implement Aspell-style engine in Python.
- Python MySpell Bindings (Python & C, GPL): Python bindings for MySpell (used by the OpenOffice speller); comes with multiple language dictionaries.
Other email clients
These clients might have code that we could use:
- Mozilla mail / Minotaur (C++/XUL, Mozilla Public License)
- Pygmy -- open source python/gnome email client. Tested on Linux and FreeBSD. Uses fetchmail.
- UsableEmail -- python IMAP client. Includes a spellchecker (GNU General Public License, Nick Edens).
- World Pilot -- python web-based email system (WorldPilot Public License, Neuberger & Hughes GmbH and Ryan Hughes)
- Emailux -- python SMTP_AUTH client
- Mahogany (The Mahogany "Artistic License", The Mahogany Development Team)-- full GUI email client, mostly C++/wxWindows with some python glue
- Main.DuckySherwood: Mahogany uses c-client, with no asynch wrapper -- which means that the UI freezes whenever downloading messages.
- Sylpheed-Claws (C/ GTK+, GNU General Public License, Hiroyuki Yamamoto).
- Python Mail System (Python, "Public Domain") -- toolkit for building python email clients; has POP but not IMAP code
- Zope Messages built on imaplib.py and smtplib.py
- email_marcro (Python/?)
- Main.DuckySherwood: POP-only (despite claims), poorly commented
- IMP (PHP, GPL) - A set of PHP scripts that provides IMAP access. Used by: University of Michigan.
- Evolution (C/GPL) seems to have rolled its own libraries. Nice abstractions, by-and-large well-commented. See a README about the mail libraries, which, for no apparent reason, are called "Camel".
- An "obscenely out of date" discussion says that the Camel libraries are not asynch but are threaded.
- Zoë (Java/?) uses Javamail
Which libraries?
We are interested in libraries (i.e. code that somebody else wrote, regardless of how it is packaged) for:
- protocols: POP, SMTP, and IMAP. (See EmailStandards.)
- parsing email messages (including all the MIME stuff)
- security (including protocol authorization and message encryption)
- related services (e.g. spell checking)
Evaluation criteria
- Licensing. OSAF's LicensingPlan is not straightforward (or finalized).
- Main.DuckySherwood: My understanding is basically that we can't use anything that has a license that is more restrictive than "give us credit and then do whatever you want"... like the BSD license, the perl Artistic License, or the Python license.
- Quality of code. Is the code clean, tight, and easy to read?
- Fixability. Code is much less useful to us if we can't get into the originating tree. While it would be great if we could count on the original owner to fix bugs that we find, at a minimum, we need to feel comfortable that the owner will accept and incorporate fixes reliably.
- Synchronicity. Is it synchronous or asynchronous?
- Threading.
- Robustness. How thoroughly has the code been exercised? Which other projects use this code?
- Control flow: who owns the event loop? Is it client-pull (polling), server push?
- Performance: What is the throughput for high volumes of messages? For large messages? What is the round trip for short transactions?
- Size: What is its memory footprint? How many lines of code?
- Internationalization: does the code handle non-English text adequately? (Note: DuckySherwood is, at this point, completely unsure how the code should handle international text.)
--
DuckySherwood - 05 Jun 2003
Comments:
For rfc822 parsing, the rfc822.py module is deprecated (and not up to date). The Spambayes project uses the email package from the standard Python distribution. This package has some issues with malformed MIME messages. The author is Barry Warsaw (Zope):
http://www.wooz.org/users/barry/software/index.html
--
FrancoisGranger - 06 Jun 2003
Barry Warsaw added (on the Dev list):
I have less direct experience with poplib.py and imaplib.py. Server side modules are harder to put in the standard library because these are typically written to fit in some kind of server framework. [...] I'm very interested in the protocols in Twisted. At Pycon the divmod.org folks mentioned something about a killer imap server implementation, but looking at their site now, I don't see it.
Private comment:
A comment on development methodology -- you might just start implementing with the default libraries that come with Python, and then branch out from there if you need to. A formal evaluation seems like it's overkill -- get in there and write some code ;).
David Bienvenue (mozilla mail) has offered some fairly detailed comments which I have placed on-line.
--
MichaelToy - 18 Aug 2003