public inbox for gcc@gcc.gnu.org
* 64-bit integer typedef's and -fpic lead to infinite loop and growing memory use in port to x86-32
@ 2021-05-28 18:45 Barnes, Richard
  2021-05-28 19:34 ` H.J. Lu
  0 siblings, 1 reply; 6+ messages in thread
From: Barnes, Richard @ 2021-05-28 18:45 UTC (permalink / raw)
  To: gcc

[-- Attachment #1: Type: text/plain, Size: 2153 bytes --]

We are porting gcc-10.2.0 to VOS, a proprietary OS with a POSIX API that runs on x86-32. We are using a prior port of gcc-3.4.6 to build the new port natively. When the build reaches the point where it compiles libgcc2.c with the gcc-10.2.0 compiler, the compiler goes into an infinite loop and eventually runs out of virtual memory. We analyzed the failure by building libgcc2.c with -da and found that the build fails while compiling __mulvsi3.c. The build fails whether we run the reload pass or the LRA pass after the IRA pass.

What contributes to the problem is that we must compile libgcc2.c with -fpic, because the VOS runtime is position-independent with a different GOT for each thread. The VOS ABI requires that %edx contain the GOTP when a function is entered and at each call point, all of which is very different from the Unix ABI. Furthermore, GNU code requires that %ebx contain the GOTP wherever an @PLT or @GOT relocation is used. Since %edx also holds the upper half of the result of the multiply, that adds to the register pressure.

I am attaching mulvsi3.c, which exhibits the problem.

During the failed builds, it looks like spilling keeps failing while trying to color or coalesce pseudo registers. I have built my own Chaitin register allocator that we use with our native compilers, and I know that we do not color the registers until no constrained registers are left after pruning the interference graph. If pruning fails to eliminate the constrained registers, we spill, rebuild the interference graph, and try again. While spilling does take time, this should not take too many passes to converge. The behavior that the reload and LRA passes exhibit concerns me because it violates this rule. Also, I note that comments in the assign_by_spills() function show that the author was concerned about the possibility of multi-register reload-pseudos when the hard regs pool is fragmented, something that could easily happen on x86-32.
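
For readers unfamiliar with the scheme described above, here is a minimal sketch of a Chaitin-style prune/spill/select loop. It is an illustration only, not the VOS allocator and not GCC's IRA/LRA code; graph_t, live_degree, chaitin_color and MAX_NODES are hypothetical names. As a simplification, a spilled pseudo is just dropped from the graph, whereas a real allocator would insert spill code and rebuild the interference graph before trying again.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 16   /* hypothetical limit for this sketch */

typedef struct {
  int n;                              /* number of pseudo registers */
  bool adj[MAX_NODES][MAX_NODES];     /* adj[i][j]: i and j interfere */
} graph_t;

/* Degree of V, counting only neighbours still present in the graph.  */
static int live_degree (const graph_t *g, const bool *removed, int v)
{
  int d = 0;
  for (int u = 0; u < g->n; u++)
    if (u != v && !removed[u] && g->adj[v][u])
      d++;
  return d;
}

/* Color G with K hard registers.  color[v] receives 0..K-1, or -1 if
   pseudo V was spilled.  Returns the number of spilled pseudos.  */
int chaitin_color (const graph_t *g, int k, int *color)
{
  bool removed[MAX_NODES] = { false };
  int stack[MAX_NODES];
  int sp = 0, spills = 0;

  assert (g->n >= 0 && g->n <= MAX_NODES && k > 0);
  for (int v = 0; v < g->n; v++)
    color[v] = -2;                    /* -2: not yet decided */

  for (int remaining = g->n; remaining > 0; remaining--)
    {
      int pick = -1;

      /* Prune: remove any unconstrained node (degree < K).  */
      for (int v = 0; v < g->n && pick < 0; v++)
        if (!removed[v] && live_degree (g, removed, v) < k)
          pick = v;

      if (pick >= 0)
        stack[sp++] = pick;           /* will be colored later */
      else
        {
          /* Only constrained nodes remain: spill the one with the
             highest degree (a simple heuristic chosen for this sketch).  */
          int best = -1;
          for (int v = 0; v < g->n; v++)
            if (!removed[v] && live_degree (g, removed, v) > best)
              {
                best = live_degree (g, removed, v);
                pick = v;
              }
          color[pick] = -1;
          spills++;
        }
      removed[pick] = true;
    }

  /* Select: pop nodes and give each the lowest color that no already
     colored neighbour uses.  Pruning guarantees one exists below K.  */
  while (sp > 0)
    {
      int v = stack[--sp];
      bool used[MAX_NODES] = { false };
      int c = 0;

      for (int u = 0; u < g->n; u++)
        if (u != v && g->adj[v][u] && color[u] >= 0)
          used[color[u]] = true;
      while (used[c])
        c++;
      color[v] = c;
    }
  return spills;
}
```

For a triangle of three mutually interfering pseudos and k = 2, the sketch spills one pseudo and colors the other two with different registers. Each spill removes a node, so the loop runs at most n times, which is the termination property the paragraph above appeals to.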

I would appreciate any advice that would help us deal with this problem.

Thanks,
Richard Barnes

Stratus Technologies


[-- Attachment #2: mulvsi3.c --]
[-- Type: text/plain, Size: 318 bytes --]

extern void abort (void)  __attribute__ ((noreturn));
typedef int SItype __attribute__ ((mode (SI)));
typedef int DItype __attribute__ ((mode (DI)));
 
SItype
__mulvsi3 (SItype a, SItype b)
{
  const DItype w = (DItype) a * (DItype) b;
 
  if ((SItype) (w >> 32) != (SItype) (w) >> 31)
    abort ();
 
  return w;
}
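
The check in the attachment can be restated as a small standalone sketch. mul32_overflows is a hypothetical helper, not part of libgcc; it returns a flag instead of calling abort(). Like the original, it assumes GCC's behavior for narrowing conversions (wrapping) and for right shifts of negative values (arithmetic shift).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: multiply A by B in 64 bits, store the low 32 bits
   in *RESULT, and report whether the product does not fit in 32 bits.  */
bool mul32_overflows (int32_t a, int32_t b, int32_t *result)
{
  const int64_t w = (int64_t) a * (int64_t) b;
  const int32_t high = (int32_t) (w >> 32);   /* upper half of the product */
  const int32_t low = (int32_t) w;            /* lower half of the product */

  *result = low;
  /* The product fits exactly when the upper half is nothing more than
     the sign extension of the lower half (all zeros or all ones).  */
  return high != (low >> 31);
}
```

On x86-32 the one-operand multiply instructions leave the upper half of the product in %edx and the lower half in %eax, so a compiler that uses them here needs %edx for the value called high above.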
 


* Re: 64-bit integer typedef's and -fpic lead to infinite loop and growing memory use in port to x86-32
  2021-05-28 18:45 64-bit integer typedef's and -fpic lead to infinite loop and growing memory use in port to x86-32 Barnes, Richard
@ 2021-05-28 19:34 ` H.J. Lu
  2021-05-28 19:41   ` [EXTERNAL] " Barnes, Richard
  0 siblings, 1 reply; 6+ messages in thread
From: H.J. Lu @ 2021-05-28 19:34 UTC (permalink / raw)
  To: Barnes, Richard; +Cc: gcc

On Fri, May 28, 2021 at 12:10 PM Barnes, Richard
<Richard.Barnes@stratus.com> wrote:
>
> We are porting gcc-10.2.0 to a proprietary OS called VOS with a POSIX API that runs on x86-32. We are using a prior port of gcc-3.4.6 to build the port natively. When the build gets to the point where it compiles libgcc2.c with the gcc-10.2.0 compiler, it goes into an infinite loop and eventually runs out of virtual memory. We analyzed the failure by building libgcc2.c with -da and found that the build fails while compiling __mulvsi3.c. This build fails whether our build is using the reload pass or the LRA pass to run after the IRA pass.

I have no comment on your issue.  I am just curious why you don't build your
32-bit x86 OS with x86-64.


-- 
H.J.


* Re: [EXTERNAL] Re: 64-bit integer typedef's and -fpic lead to infinite loop and growing memory use in port to x86-32
  2021-05-28 19:34 ` H.J. Lu
@ 2021-05-28 19:41   ` Barnes, Richard
  2021-05-28 19:52     ` H.J. Lu
  0 siblings, 1 reply; 6+ messages in thread
From: Barnes, Richard @ 2021-05-28 19:41 UTC (permalink / raw)
  To: H.J. Lu; +Cc: gcc

Unfortunately, our OS is only a 32-bit OS. Its ABI is only a 32-bit ABI. As you imply, if we had a 64-bit OS, we would have more registers and more memory and would probably avoid this problem. Also, libgcc2.c is supposed to be built natively by the gcc-10.2.0 compiler you have just created.

Richard Barnes

Stratus Technologies

* Re: [EXTERNAL] Re: 64-bit integer typedef's and -fpic lead to infinite loop and growing memory use in port to x86-32
  2021-05-28 19:41   ` [EXTERNAL] " Barnes, Richard
@ 2021-05-28 19:52     ` H.J. Lu
  2021-05-28 19:59       ` Barnes, Richard
  0 siblings, 1 reply; 6+ messages in thread
From: H.J. Lu @ 2021-05-28 19:52 UTC (permalink / raw)
  To: Barnes, Richard; +Cc: gcc

On Fri, May 28, 2021 at 12:42 PM Barnes, Richard
<Richard.Barnes@stratus.com> wrote:
>
> Unfortunately, our OS is only a 32-bit OS. Its ABI is only a 32-bit ABI. As you imply, if we had a 64-bit OS, we would have more registers and more memory and would probably avoid this problem. Also, libgcc2.c is supposed to be built natively by the gcc-10.2.0 compiler you have just created.
>

Are you aware that you can build a 32-bit OS with x86-64?  You can try
-mx32 with GCC on Ubuntu.  You will get more registers as well as
IP-relative addressing.
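
A quick way to see which ABI a given set of flags selects is to test the compiler's predefined macros. The sketch below assumes GCC or a compatible compiler, which defines __x86_64__ together with __ILP32__ under -mx32; target_abi_name and pointer_bits are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical helper: name the x86 ABI this translation unit was
   compiled for, based on GCC's predefined macros.  */
const char *target_abi_name (void)
{
#if defined(__x86_64__) && defined(__ILP32__)
  return "x32";        /* 64-bit instruction set, 32-bit pointers */
#elif defined(__x86_64__)
  return "x86-64";     /* 64-bit instruction set, 64-bit pointers */
#elif defined(__i386__)
  return "i386";       /* classic x86-32 */
#else
  return "other";
#endif
}

/* Pointer width in bits for the selected ABI.  */
size_t pointer_bits (void)
{
  return sizeof (void *) * 8;
}
```

Compiled with -mx32, the first function should return "x32" and the second 32. Building and running x32 binaries also needs x32 libraries and kernel support, which this sketch does not check.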


-- 
H.J.


* Re: [EXTERNAL] Re: 64-bit integer typedef's and -fpic lead to infinite loop and growing memory use in port to x86-32
  2021-05-28 19:52     ` H.J. Lu
@ 2021-05-28 19:59       ` Barnes, Richard
  2021-06-04 18:54         ` Barnes, Richard
  0 siblings, 1 reply; 6+ messages in thread
From: Barnes, Richard @ 2021-05-28 19:59 UTC (permalink / raw)
  To: H.J. Lu; +Cc: gcc

Our OS is not built with gcc. It is built with native compilers and linkers. It sounds like you are talking about cross compiling, which is something we have considered but hope to avoid.

Richard Barnes

* Re: [EXTERNAL] Re: 64-bit integer typedef's and -fpic lead to infinite loop and growing memory use in port to x86-32
  2021-05-28 19:59       ` Barnes, Richard
@ 2021-06-04 18:54         ` Barnes, Richard
  0 siblings, 0 replies; 6+ messages in thread
From: Barnes, Richard @ 2021-06-04 18:54 UTC (permalink / raw)
  To: H.J. Lu; +Cc: gcc

I found the problem, and it was a mistake I made elsewhere that resulted in %edx being busy everywhere. I have fixed it and consider the issue resolved.

Richard Barnes

