public inbox for gcc@gcc.gnu.org
* Good news, bad news on the repository conversion
@ 2018-07-09  0:28 Eric S. Raymond
  2018-07-09  2:09 ` Jason Merrill
                   ` (4 more replies)
  0 siblings, 5 replies; 23+ messages in thread
From: Eric S. Raymond @ 2018-07-09  0:28 UTC (permalink / raw)
  To: Mailing List; +Cc: Mark Atwood

There is good news and bad news on the GCC repository conversion.

The good news is that I have solved the only known remaining technical
problem in reposurgeon blocking the conversion.  I've fixed the bug
that prevented execute permissions from being carried by branch
copies.

The bad news is that my last test run overran the memory capacity of
the 64GB Great Beast.  I shall have to find some way of reducing the
working set, as 128GB DDR4 memory is hideously expensive.

The only remaining software issue is that I need to figure out what
should be done with your mid-branch deletes. When they're followed by
a branch copy the combination is probably best expressed as a merge
to the target branch.  I need to audit to see if there are other
cases.

Alas, this continues to be a slow and grindingly difficult job.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

Gun Control: The theory that a woman found dead in an alley, raped and
strangled with her panty hose, is somehow morally superior to a
woman explaining to police how her attacker got that fatal bullet wound.
	-- L. Neil Smith

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Good news, bad news on the repository conversion
  2018-07-09  0:28 Good news, bad news on the repository conversion Eric S. Raymond
@ 2018-07-09  2:09 ` Jason Merrill
  2018-07-09  7:18 ` Janus Weil
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 23+ messages in thread
From: Jason Merrill @ 2018-07-09  2:09 UTC (permalink / raw)
  To: Eric Raymond; +Cc: gcc Mailing List, Mark Atwood

Thanks for the update.

On Mon, Jul 9, 2018, 10:28 AM Eric S. Raymond <esr@thyrsus.com> wrote:

> There is good news and bad news on the GCC repository conversion.
>
> The good news is that I have solved the only known remaining technical
> problem in reposurgeon blocking the conversion.  I've fixed the bug
> that prevented execute permissions from being carried by branch
> copies.
>
> The bad news is that my last test run overran the memory capacity of
> the 64GB Great Beast.  I shall have to find some way of reducing the
> working set, as 128GB DDR4 memory is hideously expensive.
>
> The only remaining software issue is that I need to figure out what
> should be done with your mid-branch deletes. When they're followed by
> a branch copy the combination is probably best expressed as a merge
> to the target branch.  I need to audit to see if there are other
> cases.
>
> Alas, this continues to be a slow and grindingly difficult job.
> --
>                 <a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
>
> Gun Control: The theory that a woman found dead in an alley, raped and
> strangled with her panty hose, is somehow morally superior to a
> woman explaining to police how her attacker got that fatal bullet wound.
>         -- L. Neil Smith
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Good news, bad news on the repository conversion
  2018-07-09  0:28 Good news, bad news on the repository conversion Eric S. Raymond
  2018-07-09  2:09 ` Jason Merrill
@ 2018-07-09  7:18 ` Janus Weil
  2018-07-09 10:16   ` Eric S. Raymond
  2018-07-09  8:40 ` Martin Liška
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 23+ messages in thread
From: Janus Weil @ 2018-07-09  7:18 UTC (permalink / raw)
  To: Eric S. Raymond; +Cc: Mailing List, Mark Atwood

2018-07-09 2:27 GMT+02:00 Eric S. Raymond <esr@thyrsus.com>:
> There is good news and bad news on the GCC repository conversion.
>
> The good news is that I have solved the only known remaining technical
> problem in reposurgeon blocking the conversion.  I've fixed the bug
> that prevented execute permissions from being carried by branch
> copies.

Great to hear that there is progress on that front!


> The bad news is that my last test run overran the memory capacity of
> the 64GB Great Beast.  I shall have to find some way of reducing the
> working set, as 128GB DDR4 memory is hideously expensive.

Or maybe you could use a machine from the GCC compile farm?

According to https://gcc.gnu.org/wiki/CompileFarm, there are three
machines with at least 128GB available (gcc111, gcc112, gcc119).

Cheers,
Janus

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Good news, bad news on the repository conversion
  2018-07-09  0:28 Good news, bad news on the repository conversion Eric S. Raymond
  2018-07-09  2:09 ` Jason Merrill
  2018-07-09  7:18 ` Janus Weil
@ 2018-07-09  8:40 ` Martin Liška
  2018-07-09 19:51 ` Florian Weimer
  2018-07-20 21:36 ` Joseph Myers
  4 siblings, 0 replies; 23+ messages in thread
From: Martin Liška @ 2018-07-09  8:40 UTC (permalink / raw)
  To: Eric S. Raymond, Mailing List; +Cc: Mark Atwood

On 07/09/2018 02:27 AM, Eric S. Raymond wrote:
> The bad news is that my last test run overran the memory capacity of
> the 64GB Great Beast.  I shall have to find some way of reducing the
> working set, as 128GB DDR4 memory is hideously expensive.

Hello.

I can help by running a conversion on our server machine. Feel free
to contact me privately on my mail.

Martin

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Good news, bad news on the repository conversion
  2018-07-09  7:18 ` Janus Weil
@ 2018-07-09 10:16   ` Eric S. Raymond
  2018-07-09 13:59     ` David Edelsohn
  2018-07-09 18:16     ` David Malcolm
  0 siblings, 2 replies; 23+ messages in thread
From: Eric S. Raymond @ 2018-07-09 10:16 UTC (permalink / raw)
  To: Janus Weil; +Cc: Mailing List, Mark Atwood

Janus Weil <janus@gcc.gnu.org>:
> > The bad news is that my last test run overran the memory capacity of
> > the 64GB Great Beast.  I shall have to find some way of reducing the
> > working set, as 128GB DDR4 memory is hideously expensive.
> 
> Or maybe you could use a machine from the GCC compile farm?
> 
> According to https://gcc.gnu.org/wiki/CompileFarm, there are three
> machines with at least 128GB available (gcc111, gcc112, gcc119).

The Great Beast is a semi-custom PC optimized for doing graph theory
on working sets gigabytes wide - its design emphasis is on the best
possible memory caching. If I dropped back to a conventional machine
the test times would go up by 50% (benchmarked, that's not a guess),
and they're already bad enough to make test cycles very painful.
I just saw elapsed time 8h30m36.292s for the current test - I had it
down to 6h at one point but the runtimes scale badly with increasing
repo size; there is intrinsically O(n**2) stuff going on.
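
(Back-of-the-envelope illustration of what quadratic scaling does to
test cycles - the commit counts below are invented, only the shape of
the curve matters:)

    # Toy O(n**2) extrapolation: if runtime grows with the square of the
    # commit count, modest repo growth means painful test-cycle growth.
    def projected_hours(base_hours, base_commits, new_commits):
        return base_hours * (float(new_commits) / base_commits) ** 2

    # hypothetical commit counts, not measurements
    print(projected_hours(6.0, 100000, 119000))   # ~8.5 hours
    print(projected_hours(6.0, 100000, 140000))   # ~11.8 hours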

My first evasive maneuver is therefore to run tests with my browser
shut down.  That's working.  I used to do that before I switched from
C-Python to PyPy, which runs faster and has a lower per-object
footprint.  Now it's mandatory again.  Tells me I need to get the
conversion finished before the number of commits gets much higher.

More memory would avoid OOM but not run the tests faster.  More cores
wouldn't help due to Python's GIL problem - many of reposurgeon's
central algorithms are intrinsically serial, anyway.  Higher
single-processor speed could help a lot, but there plain isn't
anything in COTS hardware that beats a Xeon 3 cranking 3.5GHz by
much. (The hardware wizard who built the Beast thinks he might be able
to crank me up to 3.7GHz later this year but that hardware hasn't
shipped yet.)
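
(If you want to see the GIL effect for yourself, here is a toy demo -
nothing reposurgeon-specific, just a CPU-bound loop that gains almost
nothing from a second thread:)

    # Toy GIL demo: two threads doing pure-Python CPU work take roughly
    # as long as doing the same work serially, because only one thread
    # can hold the interpreter lock at a time.
    import threading, time

    def burn(n):
        total = 0
        for i in range(n):
            total += i * i
        return total

    N = 5 * 10**6

    start = time.time()
    burn(N); burn(N)
    print("serial:    %.2fs" % (time.time() - start))

    start = time.time()
    threads = [threading.Thread(target=burn, args=(N,)) for _ in range(2)]
    for t in threads: t.start()
    for t in threads: t.join()
    print("2 threads: %.2fs" % (time.time() - start))

(Multiprocessing would dodge the lock, but as I said the central
algorithms are serial anyway, so it buys nothing here.)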

The one technical change that might help is moving reposurgeon from
Python to Go - I might hope for as much as a 10x drop in runtimes from
that and a somewhat smaller decrease in working set. Unfortunately
while the move is theoretically possible (I've scoped the job) that
too would be very hard and take a long time.  It's 14KLOC of the most
algorithmically dense Python you are ever likely to encounter, with
dependencies on Python libraries sans Go equivalents that might
double the LOC; only the fact that I built a *really good* regression-
and unit-test suite in self-defense keeps it anywhere near to
practical.

(Before you ask, at the time I started reposurgeon in 2010 there
wasn't any really production-ready language that might have been a
better fit than Python. I did look. OO languages with GC and compiled
speed are still pretty thin on the ground.)

The truth is we're near the bleeding edge of what conventional tools
and hardware can handle gracefully.  Most jobs with working sets as
big as this one's do only comparatively dumb operations that can be
parallelized and thrown on a GPU or supercomputer.  Most jobs with
the algorithmic complexity of repository surgery have *much* smaller
working sets.  The combination of both extrema is hard.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Good news, bad news on the repository conversion
  2018-07-09 10:16   ` Eric S. Raymond
@ 2018-07-09 13:59     ` David Edelsohn
  2018-07-09 16:35       ` Eric S. Raymond
  2018-07-09 18:16     ` David Malcolm
  1 sibling, 1 reply; 23+ messages in thread
From: David Edelsohn @ 2018-07-09 13:59 UTC (permalink / raw)
  To: Eric Raymond; +Cc: Janus Weil, GCC Development, fallenpegasus

On Mon, Jul 9, 2018 at 6:16 AM Eric S. Raymond <esr@thyrsus.com> wrote:
>
> Janus Weil <janus@gcc.gnu.org>:
> > > The bad news is that my last test run overran the memory capacity of
> > > the 64GB Great Beast.  I shall have to find some way of reducing the
> > > working set, as 128GB DDR4 memory is hideously expensive.
> >
> > Or maybe you could use a machine from the GCC compile farm?
> >
> > According to https://gcc.gnu.org/wiki/CompileFarm, there are three
> > machines with at least 128GB available (gcc111, gcc112, gcc119).
>
> The Great Beast is a semi-custom PC optimized for doing graph theory
> on working sets gigabytes wide - its design emphasis is on the best
> possible memory caching. If I dropped back to a conventional machine
> the test times would go up by 50% (benchmarked, that's not a guess),
> and they're already bad enough to make test cycles very painful.
> I just saw elapsed time 8h30m36.292s for the current test - I had it
> down to 6h at one point but the runtimes scale badly with increasing
> repo size; there is intrinsically O(n**2) stuff going on.
>
> My first evasive maneuver is therefore to run tests with my browser
> shut down.  That's working.  I used to do that before I switched from
> C-Python to PyPy, which runs faster and has a lower per-object
> footprint.  Now it's mandatory again.  Tells me I need to get the
> conversion finished before the number of commits gets much higher.
>
> More memory would avoid OOM but not run the tests faster.  More cores
> wouldn't help due to Python's GIL problem - many of reposurgeon's
> central algorithms are intrinsically serial, anyway.  Higher
> single-processor speed could help a lot, but there plain isn't
> anything in COTS hardware that beats a Xeon 3 cranking 3.5GHz by
> much. (The hardware wizard who built the Beast thinks he might be able
> to crank me up to 3.7GHz later this year but that hardware hasn't
> shipped yet.)
>
> The one technical change that might help is moving reposurgeon from
> Python to Go - I might hope for as much as a 10x drop in runtimes from
> that and a somewhat smaller decrease in working set. Unfortunately
> while the move is theoretically possible (I've scoped the job) that
> too would be very hard and take a long time.  It's 14KLOC of the most
> algorithmically dense Python you are ever likely to encounter, with
> dependencies on Python libraries sans Go equivalents that might
> double the LOC; only the fact that I built a *really good* regression-
> and unit-test suite in self-defense keeps it anywhere near to
> practical.
>
> (Before you ask, at the time I started reposurgeon in 2010 there
> wasn't any really production-ready language that might have been a
> better fit than Python. I did look. OO languages with GC and compiled
> speed are still pretty thin on the ground.)
>
> The truth is we're near the bleeding edge of what conventional tools
> and hardware can handle gracefully.  Most jobs with working sets as
> big as this one's do only comparatively dumb operations that can be
> parallelized and thrown on a GPU or supercomputer.  Most jobs with
> the algorithmic complexity of repository surgery have *much* smaller
> working sets.  The combination of both extrema is hard.

If you come to the conclusion that the GCC Community could help with
resources, such as the GNU Compile Farm or paying for more RAM, let us
know.

Thanks, David

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Good news, bad news on the repository conversion
  2018-07-09 13:59     ` David Edelsohn
@ 2018-07-09 16:35       ` Eric S. Raymond
  2018-07-09 16:53         ` Janus Weil
  2018-07-09 17:18         ` David Edelsohn
  0 siblings, 2 replies; 23+ messages in thread
From: Eric S. Raymond @ 2018-07-09 16:35 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Janus Weil, GCC Development, fallenpegasus

David Edelsohn <dje.gcc@gmail.com>:
> > The truth is we're near the bleeding edge of what conventional tools
> > and hardware can handle gracefully.  Most jobs with working sets as
> > big as this one's do only comparatively dumb operations that can be
> > parallelized and thrown on a GPU or supercomputer.  Most jobs with
> > the algorithmic complexity of repository surgery have *much* smaller
> > working sets.  The combination of both extrema is hard.
> 
> If you come to the conclusion that the GCC Community could help with
> resources, such as the GNU Compile Farm or paying for more RAM, let us
> know.

128GB of DDR4 registered RAM would allow me to run conversions with my
browser up, but be eye-wateringly expensive.  Thanks, but I'm not
going to yell for that help unless the working set gets so large that
it blows out 64GB even with nothing but i3 and some xterms running.

Unfortunately that is a contingency that no longer seems impossible.

(If you're not familiar, i3 is a minimalist tiling window manager with
a really small working set. I like it and would use it even if I
didn't have a memory-crowding problem.  Since I do, it is extra helpful.)
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Good news, bad news on the repository conversion
  2018-07-09 16:35       ` Eric S. Raymond
@ 2018-07-09 16:53         ` Janus Weil
  2018-07-09 16:57           ` Jeff Law
  2018-07-09 17:18         ` David Edelsohn
  1 sibling, 1 reply; 23+ messages in thread
From: Janus Weil @ 2018-07-09 16:53 UTC (permalink / raw)
  To: Eric S. Raymond; +Cc: David Edelsohn, GCC Development, fallenpegasus

2018-07-09 18:35 GMT+02:00 Eric S. Raymond <esr@thyrsus.com>:
> David Edelsohn <dje.gcc@gmail.com>:
>> > The truth is we're near the bleeding edge of what conventional tools
>> > and hardware can handle gracefully.  Most jobs with working sets as
>> > big as this one's do only comparatively dumb operations that can be
>> > parallelized and thrown on a GPU or supercomputer.  Most jobs with
>> > the algorithmic complexity of repository surgery have *much* smaller
>> > working sets.  The combination of both extrema is hard.
>>
>> If you come to the conclusion that the GCC Community could help with
>> resources, such as the GNU Compile Farm or paying for more RAM, let us
>> know.
>
> 128GB of DDR4 registered RAM would allow me to run conversions with my
> browser up, but be eye-wateringly expensive.  Thanks, but I'm not
> going to yell for that help

I for one would certainly be happy to donate some spare bucks towards
beastie RAM if it helps to get the GCC repo converted to git in a
timely manner, and I'm sure there are other GCC
developers/users/sympathizers who'd be willing to join in. So, where
do we throw those bucks?

Cheers,
Janus

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Good news, bad news on the repository conversion
  2018-07-09 16:53         ` Janus Weil
@ 2018-07-09 16:57           ` Jeff Law
  2018-07-09 18:05             ` Paul Smith
  2018-07-10  9:14             ` Aldy Hernandez
  0 siblings, 2 replies; 23+ messages in thread
From: Jeff Law @ 2018-07-09 16:57 UTC (permalink / raw)
  To: Janus Weil, Eric S. Raymond
  Cc: David Edelsohn, GCC Development, fallenpegasus

On 07/09/2018 10:53 AM, Janus Weil wrote:
> 2018-07-09 18:35 GMT+02:00 Eric S. Raymond <esr@thyrsus.com>:
>> David Edelsohn <dje.gcc@gmail.com>:
>>>> The truth is we're near the bleeding edge of what conventional tools
>>>> and hardware can handle gracefully.  Most jobs with working sets as
>>>> big as this one's do only comparatively dumb operations that can be
>>>> parallelized and thrown on a GPU or supercomputer.  Most jobs with
>>>> the algorithmic complexity of repository surgery have *much* smaller
>>>> working sets.  The combination of both extrema is hard.
>>>
>>> If you come to the conclusion that the GCC Community could help with
>>> resources, such as the GNU Compile Farm or paying for more RAM, let us
>>> know.
>>
>> 128GB of DDR4 registered RAM would allow me to run conversions with my
>> browser up, but be eye-wateringly expensive.  Thanks, but I'm not
>> going to yell for that help
> 
> I for one would certainly be happy to donate some spare bucks towards
> beastie RAM if it helps to get the GCC repo converted to git in a
> timely manner, and I'm sure there are other GCC
> developers/users/sympathizers who'd be willing to join in. So, where
> do we throw those bucks?
I'd be willing to throw some $$$ at this as well.
Jeff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Good news, bad news on the repository conversion
  2018-07-09 16:35       ` Eric S. Raymond
  2018-07-09 16:53         ` Janus Weil
@ 2018-07-09 17:18         ` David Edelsohn
  1 sibling, 0 replies; 23+ messages in thread
From: David Edelsohn @ 2018-07-09 17:18 UTC (permalink / raw)
  To: Eric Raymond; +Cc: Janus Weil, GCC Development, fallenpegasus

On Mon, Jul 9, 2018 at 12:35 PM Eric S. Raymond <esr@thyrsus.com> wrote:
>
> David Edelsohn <dje.gcc@gmail.com>:
> > > The truth is we're near the bleeding edge of what conventional tools
> > > and hardware can handle gracefully.  Most jobs with working sets as
> > > big as this one's do only comparatively dumb operations that can be
> > > parallelized and thrown on a GPU or supercomputer.  Most jobs with
> > > the algorithmic complexity of repository surgery have *much* smaller
> > > working sets.  The combination of both extrema is hard.
> >
> > If you come to the conclusion that the GCC Community could help with
> > resources, such as the GNU Compile Farm or paying for more RAM, let us
> > know.
>
> 128GB of DDR4 registered RAM would allow me to run conversions with my
> browser up, but be eye-wateringly expensive.  Thanks, but I'm not
> going to yell for that help unless the working set gets so large that
> it blows out 64GB even with nothing but i3 and some xterms running.

Funds in the FSF GNU Toolchain Fund probably can be allocated to
purchase additional RAM, if that proves necessary.

Also, IBM Power Systems have excellent memory subsystems.  The ones in
the GNU Compile Farm have more than 128GB of memory available.

Thanks, David

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Good news, bad news on the repository conversion
  2018-07-09 16:57           ` Jeff Law
@ 2018-07-09 18:05             ` Paul Smith
  2018-07-10  8:10               ` Jonathan Wakely
  2018-07-10  9:14             ` Aldy Hernandez
  1 sibling, 1 reply; 23+ messages in thread
From: Paul Smith @ 2018-07-09 18:05 UTC (permalink / raw)
  To: Jeff Law, Janus Weil, Eric S. Raymond
  Cc: David Edelsohn, GCC Development, fallenpegasus

On Mon, 2018-07-09 at 10:57 -0600, Jeff Law wrote:
> On 07/09/2018 10:53 AM, Janus Weil wrote:
> > 2018-07-09 18:35 GMT+02:00 Eric S. Raymond <esr@thyrsus.com>:
> > > David Edelsohn <dje.gcc@gmail.com>:
> > > > > The truth is we're near the bleeding edge of what conventional tools
> > > > > and hardware can handle gracefully.  Most jobs with working sets as
> > > > > big as this one's do only comparatively dumb operations that can be
> > > > > parallelized and thrown on a GPU or supercomputer.  Most jobs with
> > > > > the algorithmic complexity of repository surgery have *much* smaller
> > > > > working sets.  The combination of both extrema is hard.
> > > > 
> > > > If you come to the conclusion that the GCC Community could help with
> > > > resources, such as the GNU Compile Farm or paying for more RAM, let us
> > > > know.
> > > 
> > > 128GB of DDR4 registered RAM would allow me to run conversions with my
> > > browser up, but be eye-wateringly expensive.  Thanks, but I'm not
> > > going to yell for that help
> > 
> > I for one would certainly be happy to donate some spare bucks towards
> > beastie RAM if it helps to get the GCC repo converted to git in a
> > timely manner, and I'm sure there are other GCC
> > developers/users/sympathizers who'd be willing to join in. So, where
> > do we throw those bucks?
> 
> I'd be willing to throw some $$$ at this as well.

I may be misreading between the lines but I suspect Eric is more hoping
to get everyone to focus on moving this through before the GCC commit
count gets even more out of control, than he is asking for a hardware
handout :).

Maybe the question should rather be, what does the dev community need
to do to help push this conversion through soonest?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Good news, bad news on the repository conversion
  2018-07-09 10:16   ` Eric S. Raymond
  2018-07-09 13:59     ` David Edelsohn
@ 2018-07-09 18:16     ` David Malcolm
  1 sibling, 0 replies; 23+ messages in thread
From: David Malcolm @ 2018-07-09 18:16 UTC (permalink / raw)
  To: esr, Janus Weil; +Cc: Mailing List, Mark Atwood

On Mon, 2018-07-09 at 06:16 -0400, Eric S. Raymond wrote:
> Janus Weil <janus@gcc.gnu.org>:
> > > The bad news is that my last test run overran the memory
> > > capacity of
> > > the 64GB Great Beast.  I shall have to find some way of reducing
> > > the
> > > working set, as 128GB DDR4 memory is hideously expensive.
> > 
> > Or maybe you could use a machine from the GCC compile farm?
> > 
> > According to https://gcc.gnu.org/wiki/CompileFarm, there are three
> > machines with at least 128GB available (gcc111, gcc112, gcc119).
> 
> The Great Beast is a semi-custom PC optimized for doing graph theory
> on working sets gigabytes wide - its design emphasis is on the best
> possible memory caching. If I dropped back to a conventional machine
> the test times would go up by 50% (benchmarked, that's not a guess),
> and they're already bad enough to make test cycles very painful.
> I just saw elapsed time 8h30m36.292s for the current test - I had it
> down to 6h at one point but the runtimes scale badly with increasing
> repo size; there is intrinsically O(n**2) stuff going on.
> 
> My first evasive maneuver is therefore to run tests with my browser
> shut down.  That's working.  I used to do that before I switched from
> C-Python to PyPy, which runs faster and has a lower per-object
> footprint.  Now it's mandatory again.  Tells me I need to get the
> conversion finished before the number of commits gets much higher.

I wonder if one approach would be to tune PyPy for the problem?

I was going to check that you've read:
  https://pypy.org/performance.html
but I see you've already contributed text to it :)

For CPU, does PyPy's JIT get a chance to kick in and turn the hot loops
into machine code, or is it stuck interpreting bytecode for the most
part?

For RAM, is there a way to make PyPy make more efficient use of the RAM
to store the objects?  (PyPy already has a number of tricks it uses to
store things more efficiently, and it's possible, though hard, to teach
it new ones)
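
(To make "per-object footprint" concrete, here is the kind of overhead
I mean.  These are CPython-style numbers measured with sys.getsizeof,
and PyPy's instance maps already shave much of this off, so treat it
purely as an illustration; the attribute names are arbitrary:)

    # Rough per-instance cost of a plain class vs. a __slots__ class.
    import sys

    class Plain(object):
        def __init__(self):
            self.mark = "a"
            self.comment = "b"

    class Slotted(object):
        __slots__ = ("mark", "comment")
        def __init__(self):
            self.mark = "a"
            self.comment = "b"

    p, s = Plain(), Slotted()
    print(sys.getsizeof(p) + sys.getsizeof(p.__dict__))  # object plus its dict
    print(sys.getsizeof(s))                              # no per-instance dict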

This is possibly self-serving, as I vaguely know them from my days in
the Python community, but note that the PyPy lead developers have a
consulting gig where they offer paid consulting on dealing with Python
and PyPy performance issues:
  https://baroquesoftware.com/
(though I don't know who would pay for that for the GCC repo
conversion)

Hope this is constructive.
Dave



> More memory would avoid OOM but not run the tests faster.  More cores
> wouldn't help due to Python's GIL problem - many of reposurgeon's
> central algorithms are intrinsically serial, anyway.  Higher
> single-processor speed could help a lot, but there plain isn't
> anything in COTS hardware that beats a Xeon 3 cranking 3.5GHz by
> much. (The hardware wizard who built the Beast thinks he might be
> able
> to crank me up to 3.7GHz later this year but that hardware hasn't
> shipped yet.)

> The one technical change that might help is moving reposurgeon from
> Python to Go - I might hope for as much as a 10x drop in runtimes
> from
> that and a somewhat smaller decrease in working set. Unfortunately
> while the move is theoretically possible (I've scoped the job) that
> too would be very hard and take a long time.  It's 14KLOC of the most
> algorithmically dense Python you are ever likely to encounter, with
> dependencies on Python libraries sans Go equivalents that might
> double the LOC; only the fact that I built a *really good*
> regression-
> and unit-test suite in self-defense keeps it anywhere near to
> practical.
> 
> (Before you ask, at the time I started reposurgeon in 2010 there
> wasn't any really production-ready language that might have been a
> better fit than Python. I did look. OO languages with GC and compiled
> speed are still pretty thin on the ground.)
> 
> The truth is we're near the bleeding edge of what conventional tools
> and hardware can handle gracefully.  Most jobs with working sets as
> big as this one's do only comparatively dumb operations that can be
> parallelized and thrown on a GPU or supercomputer.  Most jobs with
> the algorithmic complexity of repository surgery have *much* smaller
> working sets.  The combination of both extrema is hard.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Good news, bad news on the repository conversion
  2018-07-09  0:28 Good news, bad news on the repository conversion Eric S. Raymond
                   ` (2 preceding siblings ...)
  2018-07-09  8:40 ` Martin Liška
@ 2018-07-09 19:51 ` Florian Weimer
  2018-07-09 20:04   ` Eric S. Raymond
  2018-07-20 21:36 ` Joseph Myers
  4 siblings, 1 reply; 23+ messages in thread
From: Florian Weimer @ 2018-07-09 19:51 UTC (permalink / raw)
  To: Eric S. Raymond; +Cc: Mailing List, Mark Atwood

* Eric S. Raymond:

> The bad news is that my last test run overran the memory capacity of
> the 64GB Great Beast.  I shall have to find some way of reducing the
> working set, as 128GB DDR4 memory is hideously expensive.

Do you need interactive access to the machine, or can we run the job
for you?

If your application is not NUMA-aware, we probably need something that
has 128 GiB per NUMA node, which might be a bit harder to find, but I'm
sure many of us have suitable lab machines which could be temporarily
allocated for that purpose.
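
(A quick way to check what a given Linux box offers per node - a small
sketch that assumes the usual /sys/devices/system/node layout:)

    # Report MemTotal per NUMA node by reading Linux sysfs.
    import glob, re

    for path in sorted(glob.glob("/sys/devices/system/node/node*/meminfo")):
        with open(path) as f:
            for line in f:
                m = re.search(r"Node\s+(\d+)\s+MemTotal:\s+(\d+)\s+kB", line)
                if m:
                    node, kb = m.group(1), int(m.group(2))
                    print("node %s: %.1f GiB" % (node, kb / 1048576.0))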

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Good news, bad news on the repository conversion
  2018-07-09 19:51 ` Florian Weimer
@ 2018-07-09 20:04   ` Eric S. Raymond
  2018-07-10  8:10     ` Alec Teal
  0 siblings, 1 reply; 23+ messages in thread
From: Eric S. Raymond @ 2018-07-09 20:04 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Mailing List, Mark Atwood

Florian Weimer <fw@deneb.enyo.de>:
> * Eric S. Raymond:
> 
> > The bad news is that my last test run overran the memory capacity of
> > the 64GB Great Beast.  I shall have to find some way of reducing the
> > working set, as 128GB DDR4 memory is hideously expensive.
> 
> Do you need interactive access to the machine, or can we run the job
> for you?
> 
> If your application is not NUMA-aware, we probably need something that
> has 128 GiB per NUMA node, which might be a bit harder to find, but I'm
> sure many of us have suitable lab machines which could be temporarily
> allocated for that purpose.

I would need interactive access.

But that's now one level away from the principal problem; there is
some kind of recent metadata damage - or maybe some "correct" but
weird and undocumented stream semantics that reposurgeon doesn't know
how to emulate - that is blocking correct conversion.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Good news, bad news on the repository conversion
  2018-07-09 20:04   ` Eric S. Raymond
@ 2018-07-10  8:10     ` Alec Teal
  2018-07-10  8:11       ` Jonathan Wakely
  0 siblings, 1 reply; 23+ messages in thread
From: Alec Teal @ 2018-07-10  8:10 UTC (permalink / raw)
  To: esr, Florian Weimer; +Cc: Mailing List, Mark Atwood, alec

Is this still an issue? (I missed the convo due to an overzealous spam 
filter; this is the only message I have)


I often use AWS Spot instances (bidding on instances other people 
provisioned but put up for auction as it's not always needed) to get
results extremely quickly without hearing a fan or to test changes on a 
"large" system.

What do you need and how long (roughly, e.g. days, hours...)?

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/general-purpose-instances.html 


https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/memory-optimized-instances.html

Take your pick. m4.16xlarge is 64 cores and 256 GiB of RAM, x1e.16xlarge
is 64 cores and just shy of 2 TB of RAM, and x1e.32xlarge is 128 cores
and 3.9 TB of RAM.
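
(For what it's worth, grabbing one of those as a spot instance is only a
few lines of boto3 - the AMI, key name, price and region below are
placeholders, not recommendations:)

    # Minimal boto3 sketch for requesting a single spot instance.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")   # placeholder region
    resp = ec2.request_spot_instances(
        SpotPrice="2.00",                    # max hourly bid in USD (placeholder)
        InstanceCount=1,
        Type="one-time",
        LaunchSpecification={
            "ImageId": "ami-00000000",       # placeholder AMI
            "InstanceType": "x1e.16xlarge",
            "KeyName": "my-key",             # placeholder key pair
        },
    )
    print(resp["SpotInstanceRequests"][0]["SpotInstanceRequestId"])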

Alec


PS: Migrating what to what? Wasn't the git migration done years ago? 
Remember I only have the quoted message!


On 09/07/18 21:03, Eric S. Raymond wrote:
> Florian Weimer <fw@deneb.enyo.de>:
>> * Eric S. Raymond:
>>
>>> The bad news is that my last test run overran the memory capacity of
>>> the 64GB Great Beast.  I shall have to find some way of reducing the
>>> working set, as 128GB DDR4 memory is hideously expensive.
>> Do you need interactive access to the machine, or can we run the job
>> for you?
>>
>> If your application is not NUMA-aware, we probably need something that
> > has 128 GiB per NUMA node, which might be a bit harder to find, but I'm
>> sure many of us have suitable lab machines which could be temporarily
>> allocated for that purpose.
> I would need interactive access.
>
> But that's now one level away from the principal problem; there is
> some kind of recent metadata damage - or maybe some "correct" but
> weird and undocumented stream semantics that reposurgeon doesn't know
> how to emulate - that is blocking correct conversion.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Good news, bad news on the repository conversion
  2018-07-09 18:05             ` Paul Smith
@ 2018-07-10  8:10               ` Jonathan Wakely
  0 siblings, 0 replies; 23+ messages in thread
From: Jonathan Wakely @ 2018-07-10  8:10 UTC (permalink / raw)
  To: Paul Smith
  Cc: Jeff Law, Janus Weil, Eric Raymond, David Edelsohn, gcc, fallenpegasus

On Mon, 9 Jul 2018, 19:05 Paul Smith, <paul@mad-scientist.net> wrote:
>
> On Mon, 2018-07-09 at 10:57 -0600, Jeff Law wrote:
> > On 07/09/2018 10:53 AM, Janus Weil wrote:
> > > 2018-07-09 18:35 GMT+02:00 Eric S. Raymond <esr@thyrsus.com>:
> > > > David Edelsohn <dje.gcc@gmail.com>:
> > > > > > The truth is we're near the bleeding edge of what conventional tools
> > > > > > and hardware can handle gracefully.  Most jobs with working sets as
> > > > > > big as this one's do only comparatively dumb operations that can be
> > > > > > parallelized and thrown on a GPU or supercomputer.  Most jobs with
> > > > > > the algorithmic complexity of repository surgery have *much* smaller
> > > > > > working sets.  The combination of both extrema is hard.
> > > > >
> > > > > If you come to the conclusion that the GCC Community could help with
> > > > > resources, such as the GNU Compile Farm or paying for more RAM, let us
> > > > > know.
> > > >
> > > > 128GB of DDR4 registered RAM would allow me to run conversions with my
> > > > browser up, but be eye-wateringly expensive.  Thanks, but I'm not
> > > > going to yell for that help
> > >
> > > I for one would certainly be happy to donate some spare bucks towards
> > > beastie RAM if it helps to get the GCC repo converted to git in a
> > > timely manner, and I'm sure there are other GCC
> > > developers/users/sympathizers who'd be willing to join in. So, where
> > > do we throw those bucks?
> >
> > I'd be willing to throw some $$$ at this as well.
>
> I may be misreading between the lines but I suspect Eric is more hoping
> to get everyone to focus on moving this through before the GCC commit
> count gets even more out of control, than he is asking for a hardware
> handout :).
>
> Maybe the question should rather be, what does the dev community need
> to do to help push this conversion through soonest?


Apart from making the repository read-only (so the commit count
doesn't grow), I don't see what the dev community can do here. Eric is
not waiting on us.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Good news, bad news on the repository conversion
  2018-07-10  8:10     ` Alec Teal
@ 2018-07-10  8:11       ` Jonathan Wakely
  0 siblings, 0 replies; 23+ messages in thread
From: Jonathan Wakely @ 2018-07-10  8:11 UTC (permalink / raw)
  To: Alec Teal; +Cc: Eric Raymond, Florian Weimer, gcc, fallenpegasus, alec

On Tue, 10 Jul 2018 at 09:10, Alec Teal wrote:
> PS: Migrating what to what?

Git.

> Wasn't the git migration done years ago?

No.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Good news, bad news on the repository conversion
  2018-07-09 16:57           ` Jeff Law
  2018-07-09 18:05             ` Paul Smith
@ 2018-07-10  9:14             ` Aldy Hernandez
  2018-07-11  3:31               ` Mark Atwood
  1 sibling, 1 reply; 23+ messages in thread
From: Aldy Hernandez @ 2018-07-10  9:14 UTC (permalink / raw)
  To: Jeff Law; +Cc: janus, Eric S. Raymond, dje.gcc, GCC Mailing List, fallenpegasus

Wait, there's a pot of money for making SVN go away?  Sign me up!
While we're at it, let's start one for TCL and dejagnu!
On Mon, Jul 9, 2018 at 6:58 PM Jeff Law <law@redhat.com> wrote:
>
> On 07/09/2018 10:53 AM, Janus Weil wrote:
> > 2018-07-09 18:35 GMT+02:00 Eric S. Raymond <esr@thyrsus.com>:
> >> David Edelsohn <dje.gcc@gmail.com>:
> >>>> The truth is we're near the bleeding edge of what conventional tools
> >>>> and hardware can handle gracefully.  Most jobs with working sets as
> >>>> big as this one's do only comparatively dumb operations that can be
> >>>> parallelized and thrown on a GPU or supercomputer.  Most jobs with
> >>>> the algorithmic complexity of repository surgery have *much* smaller
> >>>> working sets.  The combination of both extrema is hard.
> >>>
> >>> If you come to the conclusion that the GCC Community could help with
> >>> resources, such as the GNU Compile Farm or paying for more RAM, let us
> >>> know.
> >>
> >> 128GB of DDR4 registered RAM would allow me to run conversions with my
> >> browser up, but be eye-wateringly expensive.  Thanks, but I'm not
> >> going to yell for that help
> >
> > I for one would certainly be happy to donate some spare bucks towards
> > beastie RAM if it helps to get the GCC repo converted to git in a
> > timely manner, and I'm sure there are other GCC
> > developers/users/sympathizers who'd be willing to join in. So, where
> > do we throw those bucks?
> I'd be willing to throw some $$$ at this as well.
> Jeff
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Good news, bad news on the repository conversion
  2018-07-10  9:14             ` Aldy Hernandez
@ 2018-07-11  3:31               ` Mark Atwood
  2018-07-11  4:29                 ` Eric S. Raymond
  0 siblings, 1 reply; 23+ messages in thread
From: Mark Atwood @ 2018-07-11  3:31 UTC (permalink / raw)
  To: Aldy Hernandez
  Cc: Jeff Law, janus, Eric S. Raymond, dje.gcc, GCC Mailing List

ESR, how much for the memory expansion?  It sounds like we have some
volunteers to solve this problem with some money.

..m

On Tue, Jul 10, 2018 at 2:14 AM Aldy Hernandez <aldyh@redhat.com> wrote:

> Wait, there's a pot of money for making SVN go away?  Sign me up!
> While we're at it, let's start one for TCL and dejagnu!
> On Mon, Jul 9, 2018 at 6:58 PM Jeff Law <law@redhat.com> wrote:
> >
> > On 07/09/2018 10:53 AM, Janus Weil wrote:
> > > 2018-07-09 18:35 GMT+02:00 Eric S. Raymond <esr@thyrsus.com>:
> > >> David Edelsohn <dje.gcc@gmail.com>:
> > >>>> The truth is we're near the bleeding edge of what conventional tools
> > >>>> and hardware can handle gracefully.  Most jobs with working sets as
> > >>>> big as this one's do only comparatively dumb operations that can be
> > >>>> parallelized and thrown on a GPU or supercomputer.  Most jobs with
> > >>>> the algorithmic complexity of repository surgery have *much* smaller
> > >>>> working sets.  The combination of both extrema is hard.
> > >>>
> > >>> If you come to the conclusion that the GCC Community could help with
> > >>> resources, such as the GNU Compile Farm or paying for more RAM, let
> us
> > >>> know.
> > >>
> > >> 128GB of DDR4 registered RAM would allow me to run conversions with my
> > >> browser up, but be eye-wateringly expensive.  Thanks, but I'm not
> > >> going to yell for that help
> > >
> > > I for one would certainly be happy to donate some spare bucks towards
> > > beastie RAM if it helps to get the GCC repo converted to git in a
> > > timely manner, and I'm sure there are other GCC
> > > developers/users/sympathizers who'd be willing to join in. So, where
> > > do we throw those bucks?
> > I'd be willing to throw some $$$ at this as well.
> > Jeff
> >
>
-- 

Mark Atwood
http://about.me/markatwood
+1-206-604-2198

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Good news, bad news on the repository conversion
  2018-07-11  3:31               ` Mark Atwood
@ 2018-07-11  4:29                 ` Eric S. Raymond
  2018-07-11  8:27                   ` Alec Teal
  0 siblings, 1 reply; 23+ messages in thread
From: Eric S. Raymond @ 2018-07-11  4:29 UTC (permalink / raw)
  To: Mark Atwood; +Cc: Aldy Hernandez, Jeff Law, janus, dje.gcc, GCC Mailing List

Mark Atwood <fallenpegasus@gmail.com>:
> ESR, how much for the memory expansion?  It sounds like we have some
> volunteers to solve this problem with some money.

That's now the second problem out.  There's a malformation that has
turned up in the repo that may sink the conversion entirely.  I want to be
reasonably sure I can solve that before I go asking for money.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Good news, bad news on the repository conversion
  2018-07-11  4:29                 ` Eric S. Raymond
@ 2018-07-11  8:27                   ` Alec Teal
  0 siblings, 0 replies; 23+ messages in thread
From: Alec Teal @ 2018-07-11  8:27 UTC (permalink / raw)
  To: esr, Mark Atwood
  Cc: Aldy Hernandez, Jeff Law, janus, dje.gcc, GCC Mailing List, alec

I have no idea what order messages are in now because I wasn't CCed into 
this (so was it before?) but it may not be much money. It depends how 
long you need it for.

Presumably someone's mentioned swap space too...

Anyway do let me know, I don't check the mailing lists as often as I'd 
like and the junk mail filter is very eager.

Alec


On 11/07/18 05:29, Eric S. Raymond wrote:
> Mark Atwood <fallenpegasus@gmail.com>:
>> ESR, how much for the memory expansion?  It sounds like we have some
>> volunteers to solve this problem with some money.
> That's now the second problem out.  There's a malformation that has
> turned up in the repo that may sink the conversion entirely.  I want to be
> reasonably sure I can solve that before I go asking for money.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Good news, bad news on the repository conversion
  2018-07-09  0:28 Good news, bad news on the repository conversion Eric S. Raymond
                   ` (3 preceding siblings ...)
  2018-07-09 19:51 ` Florian Weimer
@ 2018-07-20 21:36 ` Joseph Myers
  2018-07-20 23:47   ` Eric S. Raymond
  4 siblings, 1 reply; 23+ messages in thread
From: Joseph Myers @ 2018-07-20 21:36 UTC (permalink / raw)
  To: Eric S. Raymond; +Cc: Mailing List, Mark Atwood

I don't see any commits at 
git://thyrsus.com/repositories/gcc-conversion.git since January.  Are 
there further changes that haven't been pushed there?  (For example, I 
sent a few additions to the author map on 13 Feb.)

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Good news, bad news on the repository conversion
  2018-07-20 21:36 ` Joseph Myers
@ 2018-07-20 23:47   ` Eric S. Raymond
  0 siblings, 0 replies; 23+ messages in thread
From: Eric S. Raymond @ 2018-07-20 23:47 UTC (permalink / raw)
  To: Joseph Myers; +Cc: Mailing List, Mark Atwood

Joseph Myers <joseph@codesourcery.com>:
> I don't see any commits at 
> git://thyrsus.com/repositories/gcc-conversion.git since January.  Are 
> there further changes that haven't been pushed there?  (For example, I 
> sent a few additions to the author map on 13 Feb.)

Yes, that copy is rather stale. I need to do some annoying sysadmin stuff
on the downstairs machine to get it live again.

Anything you sent me by email is merged into the live repo here on the Beast.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2018-07-20 23:46 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-09  0:28 Good news, bad news on the repository conversion Eric S. Raymond
2018-07-09  2:09 ` Jason Merrill
2018-07-09  7:18 ` Janus Weil
2018-07-09 10:16   ` Eric S. Raymond
2018-07-09 13:59     ` David Edelsohn
2018-07-09 16:35       ` Eric S. Raymond
2018-07-09 16:53         ` Janus Weil
2018-07-09 16:57           ` Jeff Law
2018-07-09 18:05             ` Paul Smith
2018-07-10  8:10               ` Jonathan Wakely
2018-07-10  9:14             ` Aldy Hernandez
2018-07-11  3:31               ` Mark Atwood
2018-07-11  4:29                 ` Eric S. Raymond
2018-07-11  8:27                   ` Alec Teal
2018-07-09 17:18         ` David Edelsohn
2018-07-09 18:16     ` David Malcolm
2018-07-09  8:40 ` Martin Liška
2018-07-09 19:51 ` Florian Weimer
2018-07-09 20:04   ` Eric S. Raymond
2018-07-10  8:10     ` Alec Teal
2018-07-10  8:11       ` Jonathan Wakely
2018-07-20 21:36 ` Joseph Myers
2018-07-20 23:47   ` Eric S. Raymond
