[This is the README file. You can get the whole package, or browse the source; this is a preliminary release for mailman-developers inspection.]
The Mail Archiver Problem:
I've been following some discussion on the Mailman dev list about web
archiving tools. Personally, I've never much liked any of the options.
Each archiver package has its own strengths and weaknesses (some more
than others) making it inconvenient for certain kinds of archival
searching or browsing. And getting them installed, keeping them current,
and making room for all the archival data (if you have a large server)
gets to be a bother for something that doesn't even really render mail
the way I want to read it.
Part of the problem, to me, is that I don't like to read archive mail on the web. I don't use a web browser to read my personal mail, and I can't stand any webmail application I've ever used. I *like* my mail reader. It knows mail, it presents mail in a way that I'm comfortable with, and it doesn't change for every server I interact with. For client/server access to personal mail, we use application-focused protocols to intelligently provide data to user tools that already know how to use that data. We don't try to wrench that information into an environment that doesn't already know the application or have much sense of how to deal with it.
The web is not really an e-mail application. Even if it dresses up as one from time to time, and although it works fine for some people, the web will never be a good e-mail application for everyone. People have different needs. People have their own preferred e-mail applications.
To make an HTML archiver faithfully represent an e-mail message, it needs not only to fully understand the mail itself, but also to know something reasonable to do with content of varying types: it needs to file messages and attachments smartly, make them downloadable, etc. If messages are PGP- or S/MIME-signed, the archiver should be able to validate the signature. And then you have user preferences: how does the individual reader want to see things displayed? Threads? By date or by topic? Focus on particular topics or posters? Everyone has preferences, and I haven't seen an HTML archiver yet that is usable for more than the most generic overviews and glances. On multiple occasions I've started writing my own out of frustration, but I always wind up facing the other side of that frustration: it's a lot of work to write a mail search and presentation program that respects its users individually.
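To give a feel for how much mail knowledge that implies, here is a minimal sketch of just the first step: walking a message's MIME structure and deciding which parts are attachments. It uses Python's standard email package; the sample message and the describe_parts helper are made up for illustration and are not part of any archiver discussed here. Signature validation, threading and display preferences would all come on top of this.

```python
from email import message_from_string, policy

# A made-up list post with one text part and one attachment.
RAW = """\
From: alice@example.org
To: list@example.org
Subject: patch attached
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="XX"

--XX
Content-Type: text/plain; charset="us-ascii"

See the attached patch.
--XX
Content-Type: text/x-diff; charset="us-ascii"
Content-Disposition: attachment; filename="fix.diff"

+ fixed line
--XX--
"""


def describe_parts(raw):
    """Return (content_type, filename_or_None, is_attachment) for each leaf part."""
    msg = message_from_string(raw, policy=policy.default)
    parts = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no content of their own
        parts.append(
            (part.get_content_type(), part.get_filename(), part.is_attachment())
        )
    return parts


if __name__ == "__main__":
    for ctype, name, attached in describe_parts(RAW):
        print(ctype, name, attached)
```

Any mail reader already does this, and a good deal more, before it shows you a single message.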
Then there's the difficulty of pairing together list servers and HTML archivers. How hard should it be to hook together application X with application Y, cross-platform, without dozens of external dependencies, when there's no protocol specifying the linkage?
One approach to this problem would be a CGI that parses an RFC 822 message and displays it to a web user, on the fly. No static storage. User account, preferences, the whole caboodle. That seems reasonable at first. It might even work for multiple mailing list applications. But then you'd be implementing most of the functionality of a local mail reader in CGI, and suddenly this seems strikingly redundant. How many times have people tried to do that? How many times have they done well?
People already meet these needs in e-mail applications, using protocols that work very well for e-mail. Why duplicate that effort to provide mail over HTTP?
The best approach to archives is to treat the canonical source of the archival data (in our case, the mbox files) as source data to something that actually knows a thing or two about mail: IMAP. All the things that HTML archivers lack have already been done a thousand times for mail clients, and users have a variety to choose from. All you need is to connect the archive to the user via IMAP instead of the web.
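As a rough illustration of treating an mbox file as source data, the sketch below opens an archive with Python's standard mailbox module and numbers the messages from 1, the way IMAP sequence numbers run. This is only an assumption about how such a backend might start out, not a description of how this package is implemented; list_messages and build_demo_mbox are hypothetical names, and the demo archive is generated on the spot.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage


def build_demo_mbox(path):
    """Write a tiny two-message mbox to stand in for a list archive."""
    box = mailbox.mbox(path)
    for n in (1, 2):
        msg = EmailMessage()
        msg["From"] = f"poster{n}@example.org"
        msg["Subject"] = f"test message {n}"
        msg.set_content("hello\n")
        box.add(msg)
    box.close()  # close() flushes pending writes


def list_messages(path):
    """Return [(sequence_number, subject, sender), ...] in mbox order."""
    box = mailbox.mbox(path, create=False)
    try:
        return [
            (seq, box[key]["Subject"], box[key]["From"])
            for seq, key in enumerate(box.iterkeys(), start=1)
        ]
    finally:
        box.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "list.mbox")
        build_demo_mbox(archive)
        for row in list_messages(archive):
            print(row)
```

Everything past this point (fetching bodies, searching, flags) is what the IMAP protocol and the user's own mail client already know how to negotiate.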