public inbox for gnats-devel@sourceware.org
* MIME mail filter for GNATS
@ 2001-04-03  9:28 Yngve Svendsen
From: Yngve Svendsen @ 2001-04-03  9:28 UTC
  To: gnats-devel

This is a followup to the little Perl snippet I posted a couple of months
ago to translate quoted-printable sequences (those ugly =F9, =E8 and so on
which appear in messages that use international character sets). The script
I posted had several grave shortcomings and bugs. For instance, it
transpired that the script itself broke recognition of PR numbers in
Subject lines containing international characters, so that a new PR was
created each time a reply to a PR with such a Subject line was received.

After spending some time studying the relevant RFCs (especially 822 and 
2047), I have now come up with a much more watertight and robust script. I 
have done quite a bit of testing, and it seems to work very well. PR number 
recognition now seems to work exactly as it should.
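
To illustrate what the header decoding amounts to, here is a minimal
standalone sketch, separate from the attached script. The helper names
decode_word and decode_header_line, the sample Subject line and the PR
number 1234 are made up for the example; only decode_base64 and decode_qp
come from the MIME modules.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);
use MIME::QuotedPrint qw(decode_qp);

# Hypothetical helper: decode the text part of one RFC 2047 encoded-word.
sub decode_word {
    my ($encoding, $txt) = @_;
    return decode_base64($txt) if lc $encoding eq 'b';
    # In the Q encoding "_" stands for a space. Rewrite it as "=20" first,
    # so that an encoded "=5F" still comes out as a literal underscore.
    $txt =~ s/_/=20/g;
    return decode_qp($txt);
}

# Hypothetical helper: decode every encoded-word found in one header line.
sub decode_header_line {
    my ($line) = @_;
    $line =~ s/=\?[^?]+\?([bBqQ])\?([^?]*)\?=/decode_word($1, $2)/ge;
    return $line;
}

# Made-up reply to PR 1234 with a Norwegian synopsis in ISO-8859-1.
my $subject = 'Subject: Re: =?iso-8859-1?Q?support/1234:_S=F8k_virker_ikke?=';
print decode_header_line($subject), "\n";
```

After decoding, the "support/1234:" reference appears in the Subject line
as plain text, which is what the PR number recognition needs to see.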

Warning: Using this filter causes PRs to be stored in the GNATS database as
decoded 8-bit text, which might not be supported properly by all UNIXes.
Outgoing mail from GNATS will also be unencoded, possibly resulting in
problems when mail is transferred over the Internet. Some mail clients may
also have problems with it. In short: your mileage may vary. Anyone want to
start looking into proper internationalization of GNATS itself, allowing us
to avoid kludges such as this script?

Script attached below. It will also be posted to
http://sources.redhat.com/gnats/ . The Perl MIME tools module is required;
you can get it from http://search.cpan.org/search?dist=MIME-tools
(the script itself only calls MIME::Base64 and MIME::QuotedPrint).

In order to use the script, set up the mail alias which receives bug
reports so that messages are piped through this script before they are
piped into queue-pr. Like this:

| /path-to-script/script.pl | /usr/local/libexec/gnats/queue-pr -q


#!/usr/bin/perl
# Script to translate quoted-printables in MIME-encoded mail messages
# Fully decodes header fields according to RFC2047
# Merges multi-line header fields into single lines

undef $/; # We want to treat everything read from STDIN as one line
$input = <>;
($headers, $body) = split (/\n\n/, $input, 2);

# Process the headers:
$headers =~ s/\?=[ \t]*\n?[ \t]+(?==\?)/?=/g; # Whitespace between adjacent
                                # encoded-words is ignored (RFC 2047).
$headers =~ s/\n(?=[ \t])//g; # Merge multi-line headers into a single line,
                                # keeping the folding whitespace (RFC 822).
$transheaders = '';

foreach $line (split(/\n/, $headers))
{
   while ($line =~ m/=\?[^?]+\?(.)\?([^?]*)\?=/)
   {
     $encoding   = $1;
     $txt        = $2;
     $str_before = $`;
     $str_after  = $';

# Base64
     if ($encoding =~ /b/i)
     {
       require MIME::Base64;
       MIME::Base64->import(decode_base64);
       $txt = decode_base64($txt);
     }

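# QP (the "Q" encoding of RFC 2047)
     elsif ($encoding =~ /q/i)
     {
       require MIME::QuotedPrint;
       MIME::QuotedPrint->import(decode_qp);
       # In the Q encoding "_" stands for a space. Translate it before
       # decoding, and only inside the encoded-word, so that underscores
       # elsewhere in the header (and an encoded "=5F") are left alone.
       $txt =~ s/_/=20/g;
       $txt = decode_qp($txt);
     }

     $line = $str_before . $txt . $str_after;
   }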
   $transheaders .= $line . "\n";
}

# Process the body:
require MIME::QuotedPrint; # Not yet loaded if no header used the Q encoding
$transbody = MIME::QuotedPrint::decode_qp($body);

# Output the combined results. We got a free \n from
# the transheaders concatenation.
print $transheaders . "\n" . $transbody;
# Script ends here.
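
The header merging step can also be tried on its own. The following is a
small sketch of RFC 822 unfolding that keeps the folding whitespace and
drops whitespace between adjacent encoded-words, as RFC 2047 asks for. The
helper name unfold_headers and the sample headers are made up for the
example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: undo header folding in a block of header lines.
sub unfold_headers {
    my ($headers) = @_;
    # Whitespace (including a fold) between two encoded-words is dropped.
    $headers =~ s/\?=[ \t]*\n?[ \t]+(?==\?)/?=/g;
    # Anywhere else, a fold is just the line break: remove it and keep
    # the whitespace that follows.
    $headers =~ s/\n(?=[ \t])//g;
    return $headers;
}

# Made-up folded Subject followed by an ordinary header line.
my $folded = "Subject: Re: support/1234: a rather long\n synopsis\nTo: bugs\n";
print unfold_headers($folded);
```

Each header field ends up on a single line, so the later per-line loop
sees the whole Subject at once.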


Yngve Svendsen
IS Engineer
Clustra AS, Trondheim, Norway
yngve.svendsen@clustra.com
