public inbox for overseers@sourceware.org
* Stability of pipermail ml archive URLs
@ 2020-05-06 14:11 Jakub Jelinek
  2020-05-06 14:44 ` Frank Ch. Eigler
  2020-05-06 15:22 ` Christopher Faylor
  0 siblings, 2 replies; 18+ messages in thread
From: Jakub Jelinek @ 2020-05-06 14:11 UTC (permalink / raw)
  To: GCC Development; +Cc: Overseers mailing list

Hi!

Last week, after sending status report mails to the gcc mailing list,
I opened the web archive and copied the URLs of those status reports
https://gcc.gnu.org/pipermail/gcc/2020-April/232267.html
https://gcc.gnu.org/pipermail/gcc/2020-April/232268.html
and checked them into gcc-wwwdocs git
c3162d9e711d3e32935c17d1451c63839d702019 revision.
But today people are complaining that those links don't work anymore
and those mails have
https://gcc.gnu.org/pipermail/gcc/2020-April/000504.html
https://gcc.gnu.org/pipermail/gcc/2020-April/000505.html
URLs instead.
Martin Jambor also said he has posted a URL into the archive
https://gcc.gnu.org/pipermail/gcc/2020-February/231851.html
which is now instead
https://gcc.gnu.org/pipermail/gcc/2020-February/232205.html
Looking around, the last two months of gcc now have very small
numbers, but e.g. on gcc-patches the mails have very high numbers like
545238.html.  Can pipermail provide stable URLs at all?  We really
need those, we reference those in commit messages, other mails, bugzilla
etc.

	Jakub



* Re: Stability of pipermail ml archive URLs
  2020-05-06 14:11 Stability of pipermail ml archive URLs Jakub Jelinek
@ 2020-05-06 14:44 ` Frank Ch. Eigler
  2020-05-06 14:54   ` Arseny Solokha
                     ` (2 more replies)
  2020-05-06 15:22 ` Christopher Faylor
  1 sibling, 3 replies; 18+ messages in thread
From: Frank Ch. Eigler @ 2020-05-06 14:44 UTC (permalink / raw)
  To: Jakub Jelinek, Overseers mailing list; +Cc: GCC Development

Hi -

> https://gcc.gnu.org/pipermail/gcc/2020-February/232205.html
> Looking around, the last two months of gcc now have very small
> numbers, but e.g. on gcc-patches the mails have very high numbers like
> 545238.html.  Can pipermail provide stable URLs at all?  We really
> need those, we reference those in commit messages, other mails, bugzilla
> etc.

Argh, that is a problem, sorry.  We get mailman to regenerate web
archives for example in the case of spam that has gone through.  Our
recipe has been to delete the spam from the appropriate .mbox, but this
does renumber things.

The big vs. little numbers are probably an accidental function of
whether the email .mbox files were processed chronologically or not.
I'll tweak the mrefresh script to make sure it's chronological; that
should avoid gross jumps like that.  I believe gcc-patches just wasn't
regenerated for spam removal whereas others have.  There should not be
gross jumps in the future, except we'll have to regenerate everything
one more time. :-(
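
To illustrate the ordering point, here is a minimal Python sketch. It
assumes the per-month files are named like `2020-April.mbox`, a
hypothetical naming borrowed from the month directories in the archive
URLs; the real mrefresh script is not shown in this thread. A plain
alphabetical sort puts April before February, so the sort key has to
parse the month name.

```python
import os

MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]


def chronological_key(filename):
    """Map a name like '2020-April.mbox' to a sortable (year, month) pair."""
    stem, _ext = os.path.splitext(os.path.basename(filename))
    year, month = stem.split("-", 1)
    return int(year), MONTHS.index(month)


# Hypothetical file names, for illustration only.
names = ["2020-April.mbox", "2019-December.mbox", "2020-February.mbox"]

alphabetical = sorted(names)                          # April lands before February
chronological = sorted(names, key=chronological_key)  # true calendar order
```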

Small jumps though --- darn, we'd have to do something else with spam
in the mbox, maybe replace it somehow in situ with something else.  Or
catch it so quickly that subsequent URLs aren't archived anywhere
important.
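
A simplified model of the two recipes, assuming that a rebuild numbers
messages purely by their position in the mbox (this is a toy model, not
pipermail's actual code, and the Message-IDs are made up). Deleting the
spam shifts every later message; replacing it in situ leaves the
numbers alone.

```python
def number_messages(message_ids, start=0):
    """Assign sequence numbers by position, as a from-scratch rebuild would."""
    return {mid: start + i for i, mid in enumerate(message_ids)}


# Made-up archive contents; the second entry is the spam.
mbox = ["a@example", "spam@example", "b@example", "c@example"]
before = number_messages(mbox)

# Recipe 1: delete the spam, then rebuild.
kept = [m for m in mbox if m != "spam@example"]
after_delete = number_messages(kept)

# Recipe 2: replace the spam with a placeholder, then rebuild.
blanked = ["placeholder@example" if m == "spam@example" else m for m in mbox]
after_blank = number_messages(blanked)

# Which legitimate messages changed number (and hence URL)?
moved_by_delete = [m for m in kept if before[m] != after_delete[m]]
moved_by_blank = [m for m in kept if before[m] != after_blank[m]]
```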

It would be good to have another way of making permanent URLs for
individual messages in mailing list archives.

- FChE


* Re: Stability of pipermail ml archive URLs
  2020-05-06 14:44 ` Frank Ch. Eigler
@ 2020-05-06 14:54   ` Arseny Solokha
  2020-05-06 15:24     ` Christopher Faylor
  2020-05-06 14:55   ` Arseny Solokha
  2020-05-07  9:48   ` Thomas Schwinge
  2 siblings, 1 reply; 18+ messages in thread
From: Arseny Solokha @ 2020-05-06 14:54 UTC (permalink / raw)
  To: Frank Ch. Eigler, Jakub Jelinek, Overseers mailing list; +Cc: GCC Development

Hi,


>> https://gcc.gnu.org/pipermail/gcc/2020-February/232205.html
>> Looking around, the last two months of gcc now have very small
>> numbers, but e.g. on gcc-patches the mails have very high numbers like
>> 545238.html.  Can pipermail provide stable URLs at all?  We really
>> need those, we reference those in commit messages, other mails, bugzilla
>> etc.
>
> Argh, that is a problem, sorry.  We get mailman to regenerate web
> archives for example in the case of spam that has gone through.  Our
> recipe has been to delete the spam from the appropriate .mbox, but this
> does renumber things.
>
> The big vs. little numbers are probably an accidental function of
> whether the email .mbox files were processed chronologically or not.
> I'll tweak the mrefresh script to make sure it's chronological; that
> should avoid gross jumps like that.  I believe gcc-patches just wasn't
> regenerated for spam removal whereas others have.  There should not be
> gross jumps in the future, except we'll have to regenerate everything
> one more time. :-(
>
> Small jumps though --- darn, we'd have to do something else with spam
> in the mbox, maybe replace it somehow in situ with something else.  Or
> catch it so quickly that subsequent URLs aren't archived anywhere
> important.
>
> It would be good to have another way of making permanent URLs for
> individual messages in mailing list archives.

may I also chime in with a related (to some extent), even though a separate
issue? It seems URL rewriting rules designed to replace old-style

  https://gcc.gnu.org/ml/<list name>/current

URLs pointing to monthly digests with current ones

  https://gcc.gnu.org/pipermail/<list name>/<year-month>/date.html#end

broke with onset of May. I mean, if I type

  https://gcc.gnu.org/ml/gcc/current

I still get

  https://gcc.gnu.org/pipermail/gcc/2020-April/date.html#end

(note 2020-April) instead.
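
For clarity, the expected mapping can be sketched in Python as below.
The function name is hypothetical and the actual rewrite rule on the
server is not shown in this thread; only the URL shape is taken from
the examples above.

```python
import datetime

MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]


def current_archive_url(list_name, today):
    """Build the by-date index URL for the month containing `today`."""
    month_dir = f"{today.year}-{MONTHS[today.month - 1]}"
    return f"https://gcc.gnu.org/pipermail/{list_name}/{month_dir}/date.html#end"


# What /ml/gcc/current should resolve to on the day of this report,
# versus what it would have resolved to at the end of April.
may_target = current_archive_url("gcc", datetime.date(2020, 5, 6))
april_target = current_archive_url("gcc", datetime.date(2020, 4, 30))
```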


* Re: Stability of pipermail ml archive URLs
  2020-05-06 14:44 ` Frank Ch. Eigler
  2020-05-06 14:54   ` Arseny Solokha
@ 2020-05-06 14:55   ` Arseny Solokha
  2020-05-07  9:48   ` Thomas Schwinge
  2 siblings, 0 replies; 18+ messages in thread
From: Arseny Solokha @ 2020-05-06 14:55 UTC (permalink / raw)
  To: Frank Ch. Eigler, Jakub Jelinek, Overseers mailing list; +Cc: GCC Development

Hi,


>> https://gcc.gnu.org/pipermail/gcc/2020-February/232205.html
>> Looking around, the last two months of gcc now have very small
>> numbers, but e.g. on gcc-patches the mails have very high numbers like
>> 545238.html.  Can pipermail provide stable URLs at all?  We really
>> need those, we reference those in commit messages, other mails, bugzilla
>> etc.
>
> Argh, that is a problem, sorry.  We get mailman to regenerate web
> archives for example in the case of spam that has gone through.  Our
> recipe has been to delete the spam from the appropriate .mbox, but this
> does renumber things.
>
> The big vs. little numbers are probably an accidental function of
> whether the email .mbox files were processed chronologically or not.
> I'll tweak the mrefresh script to make sure it's chronological; that
> should avoid gross jumps like that.  I believe gcc-patches just wasn't
> regenerated for spam removal whereas others have.  There should not be
> gross jumps in the future, except we'll have to regenerate everything
> one more time. :-(
>
> Small jumps though --- darn, we'd have to do something else with spam
> in the mbox, maybe replace it somehow in situ with something else.  Or
> catch it so quickly that subsequent URLs aren't archived anywhere
> important.
>
> It would be good to have another way of making permanent URLs for
> individual messages in mailing list archives.

may I also chime in with a related (to some extent), even though a separate
issue? It seems URL rewriting rules designed to replace old-style

  https://gcc.gnu.org/ml/<list name>/current

URLs pointing to monthly digests with current ones

  https://gcc.gnu.org/pipermail/<list name>/<year-month>/date.html#end

broke with onset of May. I mean, if I type

  https://gcc.gnu.org/ml/gcc/current

I still get

  https://gcc.gnu.org/pipermail/gcc/2020-April/date.html#end

(note 2020-April) instead.


* Re: Stability of pipermail ml archive URLs
  2020-05-06 14:11 Stability of pipermail ml archive URLs Jakub Jelinek
  2020-05-06 14:44 ` Frank Ch. Eigler
@ 2020-05-06 15:22 ` Christopher Faylor
  2020-05-06 16:04   ` Per Bothner
  1 sibling, 1 reply; 18+ messages in thread
From: Christopher Faylor @ 2020-05-06 15:22 UTC (permalink / raw)
  To: Jakub Jelinek, Overseers mailing list; +Cc: GCC Development

On Wed, May 06, 2020 at 04:11:39PM +0200, Jakub Jelinek wrote:
>Hi!
>
>Last week after sending status report mails to gcc mailing list,
>I've opened the web archive and copied the URLs of those status reports
>https://gcc.gnu.org/pipermail/gcc/2020-April/232267.html
>https://gcc.gnu.org/pipermail/gcc/2020-April/232268.html
>and checked them into gcc-wwwdocs git
>c3162d9e711d3e32935c17d1451c63839d702019 revision.
>But today people are complaining that those links don't work anymore
>and those mails have
>https://gcc.gnu.org/pipermail/gcc/2020-April/000504.html
>https://gcc.gnu.org/pipermail/gcc/2020-April/000505.html
>URLs instead.
>Martin Jambor also said he has posted a URL into the archive
>https://gcc.gnu.org/pipermail/gcc/2020-February/231851.html
>which is now instead
>https://gcc.gnu.org/pipermail/gcc/2020-February/232205.html
>Looking around, the last two months of gcc now have very small
>numbers, but e.g. on gcc-patches the mails have very high numbers like
>545238.html.  Can pipermail provide stable URLs at all?  We really
>need those, we reference those in commit messages, other mails, bugzilla
>etc.

I'll bet this is due to rebuilding the archive after removing spam.

Maybe we need to revisit how that's done.

cgf



* Re: Stability of pipermail ml archive URLs
  2020-05-06 14:54   ` Arseny Solokha
@ 2020-05-06 15:24     ` Christopher Faylor
  0 siblings, 0 replies; 18+ messages in thread
From: Christopher Faylor @ 2020-05-06 15:24 UTC (permalink / raw)
  To: Arseny Solokha
  Cc: Frank Ch. Eigler, Jakub Jelinek, Overseers mailing list, GCC Development

On Wed, May 06, 2020 at 09:54:06PM +0700, Arseny Solokha wrote:
>may I also chime in with a related (to some extent), even though a separate
>issue? It seems URL rewriting rules designed to replace old-style
>
>  https://gcc.gnu.org/ml/<list name>/current
>
>URLs pointing to monthly digests with current ones
>
>  https://gcc.gnu.org/pipermail/<list name>/<year-month>/date.html#end
>
>broke with onset of May. I mean, if I type
>
>  https://gcc.gnu.org/ml/gcc/current
>
>I still get
>
>  https://gcc.gnu.org/pipermail/gcc/2020-April/date.html#end
>
>(note 2020-April) instead.

This is not related in any way.

I'll fix this.  Apparently my cron job didn't fire.

cgf



* Re: Stability of pipermail ml archive URLs
  2020-05-06 15:22 ` Christopher Faylor
@ 2020-05-06 16:04   ` Per Bothner
  2020-05-06 20:55     ` Christopher Faylor
  0 siblings, 1 reply; 18+ messages in thread
From: Per Bothner @ 2020-05-06 16:04 UTC (permalink / raw)
  To: overseers

On 5/6/20 8:22 AM, Christopher Faylor via Overseers wrote:
> I'll bet this is due to rebuilding the archive after removing spam.
> 
> Maybe we need to revisit how that's done.

The Correct Way to Handle This is for each message to contain an
Archived-At header (https://tools.ietf.org/html/rfc5064) generated by Mailman.
Pipermail should use that header when building the archive.
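
As a rough sketch of what that could look like (not something Mailman
does here; `ARCHIVE_BASE` and `add_archived_at` are hypothetical names,
and deriving the URL from the Message-ID is just one possible choice),
the header is added before the message goes out. RFC 5064 puts the URI
in angle brackets.

```python
from email.message import Message
from urllib.parse import quote

# Hypothetical archive location, for illustration only.
ARCHIVE_BASE = "https://archive.example.org/mid/"


def add_archived_at(msg):
    """Add an RFC 5064 Archived-At header derived from the Message-ID."""
    message_id = msg["Message-ID"].strip().strip("<>")
    url = ARCHIVE_BASE + quote(message_id, safe="@")
    msg["Archived-At"] = f"<{url}>"
    return url


msg = Message()
msg["Message-ID"] = "<20200506141139.GJ2375@tucnak>"
msg["Subject"] = "Stability of pipermail ml archive URLs"
msg.set_payload("Hi!\n")

url = add_archived_at(msg)
archived_at = msg["Archived-At"]
```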

Having each message contain a stable clean unique URL for that message
is essential for any kind of web/email integration (by which I mean allowing
people to comment/reply directly on the web-site).

Ideally, the archive should be updated right after the message has been
munged and when it is ready to go, but before the message is mailed to subscribers.
However, if the archiver is less integrated with the mailer, it is acceptable for
there to be a short delay while the URL is "dangling".
-- 
	--Per Bothner
per@bothner.com   http://per.bothner.com/


* Re: Stability of pipermail ml archive URLs
  2020-05-06 16:04   ` Per Bothner
@ 2020-05-06 20:55     ` Christopher Faylor
  2020-05-06 21:06       ` Per Bothner
  0 siblings, 1 reply; 18+ messages in thread
From: Christopher Faylor @ 2020-05-06 20:55 UTC (permalink / raw)
  To: Per Bothner; +Cc: overseers

On Wed, May 06, 2020 at 09:04:55AM -0700, Per Bothner wrote:
>On 5/6/20 8:22 AM, Christopher Faylor via Overseers wrote:
>>I'll bet this is due to rebuilding the archive after removing spam.
>>
>>Maybe we need to revisit how that's done.
>
>The Correct Way to Handle This is for each message to contain an
>Archived-At header (https://tools.ietf.org/html/rfc5064) generated by Mailman.
>Pipermail should use that header when building the archive.

AFAIK, mailman doesn't support this out of the box.



* Re: Stability of pipermail ml archive URLs
  2020-05-06 20:55     ` Christopher Faylor
@ 2020-05-06 21:06       ` Per Bothner
  2020-05-07  3:55         ` Christopher Faylor
  0 siblings, 1 reply; 18+ messages in thread
From: Per Bothner @ 2020-05-06 21:06 UTC (permalink / raw)
  To: overseers

On 5/6/20 1:55 PM, Christopher Faylor wrote:
> On Wed, May 06, 2020 at 09:04:55AM -0700, Per Bothner wrote:
>> On 5/6/20 8:22 AM, Christopher Faylor via Overseers wrote:
>>> I'll bet this is due to rebuilding the archive after removing spam.
>>>
>>> Maybe we need to revisit how that's done.
>>
>> The Correct Way to Handle This is for each message to contain an
>> Archived-At header (https://tools.ietf.org/html/rfc5064) generated by Mailman.
>> Pipermail should use that header when building the archive.
> 
> AFAIK, mailman doesn't support this out of the box.

That is my impression, too - and I'm astounded it still hasn't been fixed.
It's a major missing feature, IMO.
-- 
	--Per Bothner
per@bothner.com   http://per.bothner.com/


* Re: Stability of pipermail ml archive URLs
  2020-05-06 21:06       ` Per Bothner
@ 2020-05-07  3:55         ` Christopher Faylor
  0 siblings, 0 replies; 18+ messages in thread
From: Christopher Faylor @ 2020-05-07  3:55 UTC (permalink / raw)
  To: Per Bothner; +Cc: overseers

On Wed, May 06, 2020 at 02:06:25PM -0700, Per Bothner wrote:
>On 5/6/20 1:55 PM, Christopher Faylor wrote:
>>On Wed, May 06, 2020 at 09:04:55AM -0700, Per Bothner wrote:
>>>On 5/6/20 8:22 AM, Christopher Faylor via Overseers wrote:
>>>>I'll bet this is due to rebuilding the archive after removing spam.
>>>>
>>>>Maybe we need to revisit how that's done.
>>>
>>>The Correct Way to Handle This is for each message to contain an
>>>Archived-At header (https://tools.ietf.org/html/rfc5064) generated by
>>>Mailman.  Pipermail should use that header when building the archive.
>>
>>AFAIK, mailman doesn't support this out of the box.
>
>That is my impression, too - and I'm astounded it still hasn't been
>fixed.  It's a major missing feature, IMO.

AFAICT, it's either a feature or a proposed feature in Mailman 3.

Hard to believe it's 2020 and this isn't already a standard feature.

cgf



* Re: Stability of pipermail ml archive URLs
  2020-05-06 14:44 ` Frank Ch. Eigler
  2020-05-06 14:54   ` Arseny Solokha
  2020-05-06 14:55   ` Arseny Solokha
@ 2020-05-07  9:48   ` Thomas Schwinge
  2020-05-07 10:14     ` Frank Ch. Eigler
                       ` (2 more replies)
  2 siblings, 3 replies; 18+ messages in thread
From: Thomas Schwinge @ 2020-05-07  9:48 UTC (permalink / raw)
  To: Frank Ch. Eigler, Jakub Jelinek, overseers; +Cc: gcc

Hi!

On 2020-05-06T10:44:46-0400, "Frank Ch. Eigler via Gcc" <gcc@gcc.gnu.org> wrote:
>> Can pipermail provide stable URLs at all?  We really
>> need those, we reference those in commit messages, other mails, bugzilla
>> etc.

> It would be good to have another way of making permanent URLs for
> individual messages in mailing list archives.

Look up by Message-ID?
<http://mid.mail-archive.com/20200506141139.GJ2375@tucnak>, for example.
See <https://en.wikipedia.org/wiki/Message-ID>, etc.  The idea is that
for all practical purposes, Message-IDs are "sufficiently unique".
(Compare conceptually to the Git SHA-1 hashes.)

Such a service is not currently available on sourceware, but it'd be
possible to implement: as messages come in, you'd build a database
mapping from the Message-ID header to "current Mailman's Pipermail URL".
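
A minimal sketch of such a mapping, using SQLite from the Python
standard library. The table layout, function names, Message-ID and
URLs are all made up for illustration; nothing like this exists on
sourceware today. The point is that a regeneration only has to update
the stored URL, while the Message-ID lookup key stays the same.

```python
import sqlite3


def open_index(path=":memory:"):
    """Open (or create) the Message-ID to URL index."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS mid_url ("
        "message_id TEXT PRIMARY KEY, url TEXT NOT NULL)"
    )
    return db


def record(db, message_id, url):
    """Store or refresh the current archive URL for a message."""
    with db:  # commits on success
        db.execute(
            "INSERT OR REPLACE INTO mid_url VALUES (?, ?)",
            (message_id.strip("<>"), url),
        )


def lookup(db, message_id):
    """Return the current URL for a Message-ID, or None if unknown."""
    row = db.execute(
        "SELECT url FROM mid_url WHERE message_id = ?",
        (message_id.strip("<>"),),
    ).fetchone()
    return row[0] if row else None


db = open_index()
record(db, "<report@example.org>", "https://lists.example.org/2020-April/232267.html")
# After a regeneration the same message gets a new number.
record(db, "<report@example.org>", "https://lists.example.org/2020-April/000504.html")

current = lookup(db, "report@example.org")
missing = lookup(db, "<unknown@example.org>")
```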

(That's one reason why when posting such links I used to use Gmane's
Message-ID lookup, now using The Mail Archive's.  The other reason is
that compared to Mailman's Pipermail these services don't artificially
break discussion threads at month boundaries.)


By the way, the public-inbox software
(<https://public-inbox.org/README.html>), as recently mentioned in a
different thread discussing deficiencies of Mailman's Pipermail, also
does support this:
<https://public-inbox.org/libc-alpha/129c8494-bfd0-87f0-ddb5-e56f6d4a6e0c@gotplt.org>
(random example).  (I have not yet really looked into that software
myself, but from the little I read about it, it seems conceptually
simple, "easy", good.)

If there's sufficient interest (users) and commitment (overseers), we
could install this on sourceware, in addition to what we've currently
got?


Grüße
 Thomas
-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter


* Re: Stability of pipermail ml archive URLs
  2020-05-07  9:48   ` Thomas Schwinge
@ 2020-05-07 10:14     ` Frank Ch. Eigler
  2020-05-07 15:54       ` Christopher Faylor
  2020-05-07 15:48     ` Christopher Faylor
  2020-05-07 19:23     ` Segher Boessenkool
  2 siblings, 1 reply; 18+ messages in thread
From: Frank Ch. Eigler @ 2020-05-07 10:14 UTC (permalink / raw)
  To: Thomas Schwinge; +Cc: Jakub Jelinek, overseers, gcc

Hi -

> Such a service is not currently available on sourceware, but it'd be
> possible to implement: as messages come in, you'd build a database
> mapping from the Message-ID header to "current Mailman's Pipermail URL".

I was thinking we might be able to trick pipermail (the web archiver
component) to simply name the message web urls after some function of
the message-id instead of the sequence number.  Will give this a try
very shortly.
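
One candidate for "some function of the message-id", sketched in
Python. This is only an illustration of the idea, not a pipermail
patch; the choice of SHA-1 and the 16-character prefix are assumptions.
Note that two messages carrying the same Message-ID would map to the
same page name.

```python
import hashlib


def stable_page_name(message_id, length=16):
    """Derive an archive page name from the Message-ID alone."""
    canonical = message_id.strip().strip("<>")
    digest = hashlib.sha1(canonical.encode("utf-8")).hexdigest()
    return digest[:length] + ".html"


# The name does not depend on the message's position in the mbox,
# so deleting other messages cannot change it.
name_a = stable_page_name("<20200506141139.GJ2375@tucnak>")
name_a_again = stable_page_name("20200506141139.GJ2375@tucnak")
name_b = stable_page_name("<other@example.org>")
```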

- FChE


* Re: Stability of pipermail ml archive URLs
  2020-05-07  9:48   ` Thomas Schwinge
  2020-05-07 10:14     ` Frank Ch. Eigler
@ 2020-05-07 15:48     ` Christopher Faylor
  2020-05-07 19:23     ` Segher Boessenkool
  2 siblings, 0 replies; 18+ messages in thread
From: Christopher Faylor @ 2020-05-07 15:48 UTC (permalink / raw)
  To: gcc, overseers

On Thu, May 07, 2020 at 11:48:18AM +0200, Thomas Schwinge wrote:
>On 2020-05-06T10:44:46-0400, "Frank Ch.  Eigler via Gcc"
><gcc@gcc.gnu.org> wrote:
>>>Can pipermail provide stable URLs at all?  We really need those, we
>>>reference those in commit messages, other mails, bugzilla etc.
>
>>It would be good to have another way of making permanent URLs for
>>individual messages in mailing list archives.
>
>Look up by Message-ID?
><http://mid.mail-archive.com/20200506141139.GJ2375@tucnak>, for
>example.  See <https://en.wikipedia.org/wiki/Message-ID>, etc.  The
>idea is that for all practical purposes, Message-IDs are "sufficiently
>unique".  (Compare conceptually to the Git SHA-1 hashes.)

IMO, we're making way too big a deal out of this.  The message archives
are changing because we are resequencing them.  Mailman doesn't, AFAIK,
take it upon itself to randomly renumber them.  fche and cgf have been
renumbering them when we remove spam.

If we stopped doing that there would be no issue.

When we were using ezmlm, I was careful not to remove message files when
dealing with spam.  We haven't been that careful with mailman and, so,
we're seeing problems.

If we just changed the way that we deal with spam to keep the message
around but blank it out, we wouldn't have this problem.

In addition, when I was migrating the mail archives from ezmlm to mailman
I came across a number of cases where the same message-id was used in
two messages.  Possibly it was someone just bouncing email or maybe
it was something else.
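
For anyone who wants to check how common that is, a short Python
sketch (the messages below are made up; with a real archive the same
function could be fed from `mailbox.mbox(path)`, which yields message
objects when iterated):

```python
from collections import Counter
from email import message_from_string


def duplicate_message_ids(messages):
    """Return the Message-IDs that appear on more than one message."""
    counts = Counter(
        msg["Message-ID"].strip() for msg in messages if msg["Message-ID"]
    )
    return sorted(mid for mid, n in counts.items() if n > 1)


# Made-up messages; the third reuses the first one's Message-ID.
raw = [
    "Message-ID: <one@example.org>\nSubject: first\n\nbody\n",
    "Message-ID: <two@example.org>\nSubject: second\n\nbody\n",
    "Message-ID: <one@example.org>\nSubject: bounced copy\n\nbody\n",
]
dupes = duplicate_message_ids(message_from_string(r) for r in raw)
```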

Maybe it's a corner case but we wouldn't have to worry about this at all
if we just used mailman's current numbering and didn't ever take it upon
ourselves to rescan the archives.

cgf



* Re: Stability of pipermail ml archive URLs
  2020-05-07 10:14     ` Frank Ch. Eigler
@ 2020-05-07 15:54       ` Christopher Faylor
  2020-05-07 17:56         ` Frank Ch. Eigler
  0 siblings, 1 reply; 18+ messages in thread
From: Christopher Faylor @ 2020-05-07 15:54 UTC (permalink / raw)
  To: overseers, gcc

On Thu, May 07, 2020 at 06:14:55AM -0400, Frank Ch. Eigler wrote:
>>Such a service is not currently available on sourceware, but it'd be
>>possible to implement: as messages come in, you'd build a database
>>mapping from the Message-ID header to "current Mailman's Pipermail
>>URL".
>
>I was thinking we might be able to trick pipermail (the web archiver
>component) to simply name the message web urls after some function of
>the message-id instead of the sequence number.  Will give this a try
>very shortly.

I just want to go on record as saying that I think this is a bad idea.
We can fix this problem simply without redesigning pipermail.  The
problem that we're seeing is caused by a script that I wrote to migrate
ezmlm to mailman.  The fix for the problem is "Don't run that script".

But, if we are going to make this level of change to pipermail we might
as well go wild and just implement all of the other things that people
want and forget about our supposed desire to use "supported" software.
Changing pipermail to use message-id's rather than sequence numbers
negates the argument that we want to be standard since we likely won't
be able to get this change in upstream.  I doubt that mailman2
developers will want to consider so major a change in a product that
is supposedly close to EOL.

cgf



* Re: Stability of pipermail ml archive URLs
  2020-05-07 15:54       ` Christopher Faylor
@ 2020-05-07 17:56         ` Frank Ch. Eigler
  2020-05-07 18:27           ` Christopher Faylor
  0 siblings, 1 reply; 18+ messages in thread
From: Frank Ch. Eigler @ 2020-05-07 17:56 UTC (permalink / raw)
  To: overseers, gcc

Hi -

> >I was thinking we might be able to trick pipermail (the web archiver
> >component) to simply name the message web urls after some function of
> >the message-id instead of the sequence number.  Will give this a try
> >very shortly.
> 
> I just want to go on record as saying that I think this is a bad idea.
> We can fix this problem simply without redesigning pipermail.

If the idea requires more than a dozenish lines of code, then I agree
it's not worth doing.  "redesigning" - indeed no thanks.


> The problem that we're seeing is caused by a script that I wrote to
> migrate ezmlm to mailman.  The fix for the problem is "Don't run
> that script".

Yeah, but that is the official mailman2 method for this.  Spam/malware
that gets through can sit in multiple locations unless we clean it out
in the proper thorough manner, through the entire pipeline (which
starts with the mbox files).  Not super keen on building much
complexity that operates on all the intermediate results and html
files.


- FChE


* Re: Stability of pipermail ml archive URLs
  2020-05-07 17:56         ` Frank Ch. Eigler
@ 2020-05-07 18:27           ` Christopher Faylor
  0 siblings, 0 replies; 18+ messages in thread
From: Christopher Faylor @ 2020-05-07 18:27 UTC (permalink / raw)
  To: overseers, gcc

On Thu, May 07, 2020 at 01:56:04PM -0400, Frank Ch. Eigler wrote:
>>>I was thinking we might be able to trick pipermail (the web archiver
>>>component) to simply name the message web urls after some function of
>>>the message-id instead of the sequence number.  Will give this a try
>>>very shortly.
>>
>>I just want to go on record as saying that I think this is a bad idea.
>>We can fix this problem simply without redesigning pipermail.
>
>If the idea requires more than a dozenish lines of code, then I agree
>it's not worth doing.  "redesigning" - indeed no thanks.

I'd call a major change to the way that mailman archives files a
"redesign".

>>The problem that we're seeing is caused by a script that I wrote to
>>migrate ezmlm to mailman.  The fix for the problem is "Don't run that
>>script".
>
>Yeah, but that is the official mailman2 method for this.

One recommended method is to edit the mbox file and leave the message
around but blank and then regenerate the archive.  But, that could cause
renumbering issues.

They also mention what I'm suggesting - edit the mbox and html files and
leave the content blank.  You'd have to be careful not to step on incoming
email in that scenario, of course.
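
The mbox half of that could look roughly like the Python sketch below.
It is not the "mlzap" script and it does not touch the html files;
`blank_message` is a hypothetical helper, and the lock it takes is
cooperative, so it only helps if whatever delivers into the file
honours the same locking (an assumption that would need checking for
mailman). The demo runs on a throwaway mbox with made-up messages.

```python
import mailbox
import os
import tempfile


def blank_message(mbox_path, message_id, notice="[message removed]\n"):
    """Replace one message with a stub at the same position in the mbox."""
    box = mailbox.mbox(mbox_path)
    box.lock()  # cooperative lock, see the caveat above
    replaced = 0
    try:
        for key, msg in box.items():
            if msg["Message-ID"] != message_id:
                continue
            stub = mailbox.mboxMessage()
            stub.set_from(msg.get_from())
            for name in ("Date", "Message-ID"):
                if msg[name] is not None:
                    stub[name] = msg[name]
            stub["Subject"] = "[removed]"
            stub.set_payload(notice)
            box[key] = stub  # same key, so the slot in the file is kept
            replaced += 1
    finally:
        box.close()  # flushes the changes and releases the lock
    return replaced


# Demo: three made-up messages, the middle one is the spam.
path = os.path.join(tempfile.mkdtemp(), "demo.mbox")
box = mailbox.mbox(path)
for mid in ("<a@example.org>", "<spam@example.org>", "<b@example.org>"):
    m = mailbox.mboxMessage()
    m["Message-ID"] = mid
    m["Subject"] = "demo"
    m.set_payload("original body\n")
    box.add(m)
box.close()

replaced = blank_message(path, "<spam@example.org>")

box = mailbox.mbox(path)
ids_after = [m["Message-ID"] for m in box]
bodies_after = [m.get_payload() for m in box]
box.close()
```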

https://wiki.list.org/DOC/How%20can%20I%20remove%20a%20post%20from%20the%20list%20archive%20or%20remove%20an%20entire%20archive%3F

The above mentions that the message would be in three places which are
easily editable.  There are also prev and next links which apparently
live in a database but there are scripts available to fix that too.

Spam used to be in multiple places when we were running ezmlm.  It never
occurred to me that we needed to modify ezmlm to deal with the issue.  I
used to get rid of viruses using a "mlzap" script that hit the right
files.  That technique should work here too.

OTOH, maybe we should just give up on mailman2 and move to something
more modern even if we can't use dnf to install it on RHEL.  I'm surely
not a fan of mailman2.  If we have to do head-standing to get it to work
the way we want then maybe we should just move on and forget that we
said we wanted to use something "stable".



* Re: Stability of pipermail ml archive URLs
  2020-05-07  9:48   ` Thomas Schwinge
  2020-05-07 10:14     ` Frank Ch. Eigler
  2020-05-07 15:48     ` Christopher Faylor
@ 2020-05-07 19:23     ` Segher Boessenkool
  2020-05-07 20:28       ` Christopher Faylor
  2 siblings, 1 reply; 18+ messages in thread
From: Segher Boessenkool @ 2020-05-07 19:23 UTC (permalink / raw)
  To: Thomas Schwinge; +Cc: Frank Ch. Eigler, Jakub Jelinek, overseers, gcc

Hi!

On Thu, May 07, 2020 at 11:48:18AM +0200, Thomas Schwinge wrote:
> By the way, the public-inbox software
> (<https://public-inbox.org/README.html>), as recently mentioned in a
> different thread discussing deficiencies of Mailman's Pipermail, also
> does support this:
> <https://public-inbox.org/libc-alpha/129c8494-bfd0-87f0-ddb5-e56f6d4a6e0c@gotplt.org>
> (random example).  (I have not yet really looked into that software
> myself, but from the little I read about it, it seems conceptually
> simple, "easy", good.)
> 
> If there's sufficient interest (users) and commitment (overseers), we
> could install this on sourceware, in addition to what we've currently
> got?

I would very much like this.  *All* of the problems with the current
mail archive, as well as all of the problems with the one we had before,
do not exist with public-inbox.

(It probably has problems all of its own, of course ;-) )


Segher


* Re: Stability of pipermail ml archive URLs
  2020-05-07 19:23     ` Segher Boessenkool
@ 2020-05-07 20:28       ` Christopher Faylor
  0 siblings, 0 replies; 18+ messages in thread
From: Christopher Faylor @ 2020-05-07 20:28 UTC (permalink / raw)
  To: gcc, overseers

On Thu, May 07, 2020 at 02:23:30PM -0500, Segher Boessenkool wrote:
>On Thu, May 07, 2020 at 11:48:18AM +0200, Thomas Schwinge wrote:
>>By the way, the public-inbox software
>>(<https://public-inbox.org/README.html>), as recently mentioned in a
>>different thread discussing deficiencies of Mailman's Pipermail, also
>>does support this:
>><https://public-inbox.org/libc-alpha/129c8494-bfd0-87f0-ddb5-e56f6d4a6e0c@gotplt.org>
>>(random example).  (I have not yet really looked into that software
>>myself, but from the little I read about it, it seems conceptually
>>simple, "easy", good.)
>>
>>If there's sufficient interest (users) and commitment (overseers), we
>>could install this on sourceware, in addition to what we've currently
>>got?
>
>I would very much like this.  *All* of the problems with the current
>mail archive, as well as all of the problems with the one we had
>before, do not exist with public-inbox.
>
>(It probably has problems all of its own, of course ;-) )

It's been suggested many times both before we rolled out the new
sourceware and after.

I'm not a real fan of the interface but at least it's being supported.
It's just not supported in RHEL 8 right now, as far as I know.

To reiterate our current philosophy: We're trying to use supported
software on sourceware and not have to roll our own and worry about
keeping track of upstream fixes and security issues.



end of thread, other threads:[~2020-05-07 20:28 UTC | newest]

Thread overview: 18+ messages
2020-05-06 14:11 Stability of pipermail ml archive URLs Jakub Jelinek
2020-05-06 14:44 ` Frank Ch. Eigler
2020-05-06 14:54   ` Arseny Solokha
2020-05-06 15:24     ` Christopher Faylor
2020-05-06 14:55   ` Arseny Solokha
2020-05-07  9:48   ` Thomas Schwinge
2020-05-07 10:14     ` Frank Ch. Eigler
2020-05-07 15:54       ` Christopher Faylor
2020-05-07 17:56         ` Frank Ch. Eigler
2020-05-07 18:27           ` Christopher Faylor
2020-05-07 15:48     ` Christopher Faylor
2020-05-07 19:23     ` Segher Boessenkool
2020-05-07 20:28       ` Christopher Faylor
2020-05-06 15:22 ` Christopher Faylor
2020-05-06 16:04   ` Per Bothner
2020-05-06 20:55     ` Christopher Faylor
2020-05-06 21:06       ` Per Bothner
2020-05-07  3:55         ` Christopher Faylor
