public inbox for overseers@sourceware.org
* Mailing list archives
@ 2012-03-29 11:44 Diego Novillo
  2012-03-29 16:02 ` Jeff Law
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Diego Novillo @ 2012-03-29 11:44 UTC (permalink / raw)
  To: overseers


I may be misremembering, but I think we used to offer mailing list 
archives as a tarball or in some other archive format.

I'm trying to do some offline processing on messages and all I can seem 
to do is run wget -r, which results in individual messages in HTML format.

Are there any other options?


Thanks.  Diego.


* Re: Mailing list archives
  2012-03-29 11:44 Mailing list archives Diego Novillo
@ 2012-03-29 16:02 ` Jeff Law
  2012-03-29 16:07 ` Jonathan Larmour
  2012-03-29 17:08 ` Ian Lance Taylor
  2 siblings, 0 replies; 8+ messages in thread
From: Jeff Law @ 2012-03-29 16:02 UTC (permalink / raw)
  To: Diego Novillo; +Cc: overseers

On 03/29/2012 05:44 AM, Diego Novillo wrote:
>
> I may be misremembering, but I think we used to offer mailing list
> archives in a tar ball or some other archiving format.
>
> I'm trying to do some offline processing on messages and all I can seem
> to do is run wget -r, which results in individual messages in html format.
>
> Are there any other options?
I thought we had a monthly mbox, but I don't see it anymore.

It's been a long long time since I looked at this stuff, but can you 
issue an ezmlm-get via the web or -request mail addresses?  If so, you 
might be able to request them yourself.

jeff


* Re: Mailing list archives
  2012-03-29 11:44 Mailing list archives Diego Novillo
  2012-03-29 16:02 ` Jeff Law
@ 2012-03-29 16:07 ` Jonathan Larmour
  2012-03-29 17:43   ` Christopher Faylor
  2012-03-29 17:08 ` Ian Lance Taylor
  2 siblings, 1 reply; 8+ messages in thread
From: Jonathan Larmour @ 2012-03-29 16:07 UTC (permalink / raw)
  To: Diego Novillo; +Cc: overseers

On 29/03/12 12:44, Diego Novillo wrote:
> 
> I may be misremembering, but I think we used to offer mailing list
> archives in a tar ball or some other archiving format.
> 
> I'm trying to do some offline processing on messages and all I can seem to
> do is run wget -r, which results in individual messages in html format.
> 
> Are there any other options?

I don't think there are any tarballs made automatically. But if you fetch
the "txt" subdirectory of the mailing list archive, you should get the raw
text versions of the messages, if that helps.

Also, you can get them with rsync, which may be more convenient and faster
than wget. e.g. with the rsync module "gcc-ml-archive" for the gcc mailing
list for March:

rsync -a sourceware.org::gcc-ml-archive/2012-03/txt .
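
To pull several months in one go, the command above can be wrapped in a
small script. This is only a minimal sketch: it assumes that every month
follows the same YYYY-MM/txt layout as the March example, and the helper
names (rsync_command, fetch) and the per-month destination directories
are made up for illustration. Only the host, the module name and the
"-a" flag come from the command above.

```python
import subprocess
from pathlib import Path

HOST = "sourceware.org"  # rsync daemon host, as in the command above


def rsync_command(module, month, dest):
    """Build the rsync argument list for one month of one list.

    The YYYY-MM/txt layout is assumed to hold for every month, which has
    only been shown for 2012-03.
    """
    source = f"{HOST}::{module}/{month}/txt"
    # The trailing slash makes rsync place "txt" inside the month directory,
    # so different months do not overwrite each other.
    return ["rsync", "-a", source, f"{dest}/{month}/"]


def fetch(module, months, dest, dry_run=True):
    """Return the commands for all months; run them unless dry_run is set."""
    commands = [rsync_command(module, month, dest) for month in months]
    for cmd in commands:
        if not dry_run:
            Path(cmd[-1]).mkdir(parents=True, exist_ok=True)
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Dry run: only print what would be executed.
    for cmd in fetch("gcc-ml-archive", ["2012-02", "2012-03"], "archive"):
        print(" ".join(cmd))
```

Calling fetch(..., dry_run=False) would actually invoke rsync, so it is
worth checking the printed commands first.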

Jifl
-- 
--["No sense being pessimistic, it wouldn't work anyway"]-- Opinions==mine


* Re: Mailing list archives
  2012-03-29 11:44 Mailing list archives Diego Novillo
  2012-03-29 16:02 ` Jeff Law
  2012-03-29 16:07 ` Jonathan Larmour
@ 2012-03-29 17:08 ` Ian Lance Taylor
  2 siblings, 0 replies; 8+ messages in thread
From: Ian Lance Taylor @ 2012-03-29 17:08 UTC (permalink / raw)
  To: Diego Novillo; +Cc: overseers

Diego Novillo <dnovillo@google.com> writes:

> I may be misremembering, but I think we used to offer mailing list
> archives in a tar ball or some other archiving format.

We stopped doing that.  We had enough copies of the mailing lists.

Ian


* Re: Mailing list archives
  2012-03-29 16:07 ` Jonathan Larmour
@ 2012-03-29 17:43   ` Christopher Faylor
  2012-03-29 18:05     ` Diego Novillo
  2012-03-29 20:31     ` Jonathan Larmour
  0 siblings, 2 replies; 8+ messages in thread
From: Christopher Faylor @ 2012-03-29 17:43 UTC (permalink / raw)
  To: overseers, Diego Novillo, Jonathan Larmour

On Thu, Mar 29, 2012 at 05:06:49PM +0100, Jonathan Larmour wrote:
>On 29/03/12 12:44, Diego Novillo wrote:
>> 
>> I may be misremembering, but I think we used to offer mailing list
>> archives in a tar ball or some other archiving format.
>> 
>> I'm trying to do some offline processing on messages and all I can seem to
>> do is run wget -r, which results in individual messages in html format.
>> 
>> Are there any other options?
>
>I don't think there are any tarballs made automatically. But you should
>find that if you get the "txt" subdirectory of the mailing list instead,
>you get the raw text format versions instead, if that helps.
>
>Also, you can get them with rsync, which may be more convenient and faster
>than wget. e.g. with the rsync module "gcc-ml-archive" for the gcc mailing
>list for March:
>
>rsync -a sourceware.org::gcc-ml-archive/2012-03/txt .

That's a great idea.  Maybe we should put that in a FAQ somewhere.

Remember that if you want to use these files as real mail you'll have to
unobfuscate the mail addresses in the headers with s/ dot /./; s/ at /@/.
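
As a rough illustration, the two substitutions could be applied to a
downloaded message like this. It is a sketch under assumptions, not a
description of how the archive is generated: the function name, the list
of address headers and the sample message are invented for the example,
and unlike the plain s/// expressions it replaces every occurrence on a
line (so multi-label domains are handled) and leaves other headers such
as Subject untouched.

```python
# Headers assumed to carry addresses; extend as needed.
ADDRESS_HEADERS = ("from:", "to:", "cc:", "reply-to:", "sender:")


def deobfuscate_headers(raw):
    """Undo ' at ' / ' dot ' obfuscation in the address headers of a message.

    Everything after the first blank line is treated as the body and is
    returned unchanged.
    """
    head, sep, body = raw.partition("\n\n")
    fixed = []
    in_address_header = False
    for line in head.split("\n"):
        # Lines starting with whitespace continue the previous header.
        if not line.startswith((" ", "\t")):
            in_address_header = line.lower().startswith(ADDRESS_HEADERS)
        if in_address_header:
            line = line.replace(" dot ", ".").replace(" at ", "@")
        fixed.append(line)
    return "\n".join(fixed) + sep + body


if __name__ == "__main__":
    sample = (
        "From: Diego Novillo <dnovillo at google dot com>\n"
        "Subject: look at this dot file\n"
        "\n"
        "body text stays at rest\n"
    )
    print(deobfuscate_headers(sample))
```

A display name that itself contains " at " or " dot " would also be
rewritten, so the result is worth a quick look before feeding it to a
mail tool.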

cgf


* Re: Mailing list archives
  2012-03-29 17:43   ` Christopher Faylor
@ 2012-03-29 18:05     ` Diego Novillo
  2012-03-29 20:31     ` Jonathan Larmour
  1 sibling, 0 replies; 8+ messages in thread
From: Diego Novillo @ 2012-03-29 18:05 UTC (permalink / raw)
  To: overseers, Jonathan Larmour, Jeff Law

Thanks for all the suggestions folks.

I went with Jonathan's rsync suggestion.  I've got the data I need now.


Diego.


* Re: Mailing list archives
  2012-03-29 17:43   ` Christopher Faylor
  2012-03-29 18:05     ` Diego Novillo
@ 2012-03-29 20:31     ` Jonathan Larmour
  2012-03-29 20:41       ` Christopher Faylor
  1 sibling, 1 reply; 8+ messages in thread
From: Jonathan Larmour @ 2012-03-29 20:31 UTC (permalink / raw)
  To: overseers

On 29/03/12 18:43, Christopher Faylor wrote:
> On Thu, Mar 29, 2012 at 05:06:49PM +0100, Jonathan Larmour wrote:
>> Also, you can get them with rsync, which may be more convenient and faster
>> than wget. e.g. with the rsync module "gcc-ml-archive" for the gcc mailing
>> list for March:
>>
>> rsync -a sourceware.org::gcc-ml-archive/2012-03/txt .
> 
> That's a great idea.  Maybe we should put that in a FAQ somewhere.

I can add it to http://sourceware.org/ml/ if you like, but I'd personally
be a bit hesitant about making it too obvious, in case we are just making
things easier for spammers, since, as you say...

> Remember that if you want to use these files as real mail you'll have to
> unobfuscate the mail addresses to s/ dot /./; s/ at /@/ in the header.

... all the email addresses are easily deobfuscated.

Jifl


* Re: Mailing list archives
  2012-03-29 20:31     ` Jonathan Larmour
@ 2012-03-29 20:41       ` Christopher Faylor
  0 siblings, 0 replies; 8+ messages in thread
From: Christopher Faylor @ 2012-03-29 20:41 UTC (permalink / raw)
  To: overseers, Jonathan Larmour

On Thu, Mar 29, 2012 at 09:31:01PM +0100, Jonathan Larmour wrote:
>On 29/03/12 18:43, Christopher Faylor wrote:
>> On Thu, Mar 29, 2012 at 05:06:49PM +0100, Jonathan Larmour wrote:
>>> Also, you can get them with rsync, which may be more convenient and faster
>>> than wget. e.g. with the rsync module "gcc-ml-archive" for the gcc mailing
>>> list for March:
>>>
>>> rsync -a sourceware.org::gcc-ml-archive/2012-03/txt .
>> 
>> That's a great idea.  Maybe we should put that in a FAQ somewhere.
>
>I can add it to http://sourceware.org/ml/ if you like, but I'd personally
>be a bit hesitant making it too obvious, in case we are just making things
>easier for spammers, since, as you say...
>
>> Remember that if you want to use these files as real mail you'll have to
>> unobfuscate the mail addresses to s/ dot /./; s/ at /@/ in the header.
>
>... all the email addresses are easily deobfuscated.

Ok, point taken.  We didn't really advertise the ftp archives either, so
we should be able to get by on a need-to-know basis for this too.

Nonetheless, I really do like the idea.  I don't recall anyone
mentioning this in the past and it seems like a good alternative to the
ftp archives.

cgf
