public inbox for gcc-bugs@sourceware.org
* [Bug fortran/100799] New: Stackoverflow in optimized code on PPC
@ 2021-05-27 11:20 alexander.grund@tu-dresden.de
  2021-05-28 16:42 ` [Bug target/100799] " alexander.grund@tu-dresden.de
                   ` (33 more replies)
  0 siblings, 34 replies; 35+ messages in thread
From: alexander.grund@tu-dresden.de @ 2021-05-27 11:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

            Bug ID: 100799
           Summary: Stackoverflow in optimized code on PPC
           Product: gcc
           Version: 10.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: alexander.grund@tu-dresden.de
  Target Milestone: ---

Created attachment 50879
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50879&action=edit
Disassembly of dbgebal_ in debug and release modes

Quick summary of the use case: When using FlexiBLAS with OpenBLAS I noticed
corruption of the parameters passed to OpenBLAS functions. FlexiBLAS basically
provides a BLAS interface where each function is a stub that forwards the
arguments to a real BLAS lib, like OpenBLAS.

Example:
void FC_GLOBAL(dgebal,DGEBAL)(char* job, blasint* n, double* a, blasint* lda,
                              blasint* ilo, blasint* ihi, double* scale,
                              blasint* info)
{
        void (*fn) (void* job, void* n, void* a, void* lda, void* ilo,
                    void* ihi, void* scale, void* info);

        fn = current_backend->lapack.dgebal.f77_blas_function;

        fn((void*) job, (void*) n, (void*) a, (void*) lda, (void*) ilo,
           (void*) ihi, (void*) scale, (void*) info);

        return;
}
void dgebal(char* job, blasint* n, double* a, blasint* lda, blasint* ilo,
            blasint* ihi, double* scale, blasint* info)
        __attribute__((alias(MTS(FC_GLOBAL(dgebal,DGEBAL)))));

Due to the alias, and because the real BLAS lib is loaded after FlexiBLAS,
calls from one OpenBLAS function to another OpenBLAS function also get routed
through FlexiBLAS.

Now I noticed that the parameter "N" at
https://github.com/xianyi/OpenBLAS/blob/v0.3.15/lapack-netlib/SRC/dgeev.f#L369
gets messed up during the call at
https://github.com/xianyi/OpenBLAS/blob/v0.3.15/lapack-netlib/SRC/dgeev.f#L363
which I traced to FlexiBLAS saving the register that holds it on the stack,
calling the OpenBLAS DGEBAL and restoring it afterwards, while DGEBAL changes
the stack entry it is restored from.

So the actual bug here is that GCC generates code for DGEBAL which writes
outside of the allocated stack frame.

The disassembly of the dgebal_ function shows "stdu    r1,-368(r1)" in the
prologue and "std     r25,440(r1)" later, which is the instruction that
overwrites the saved register from the calling function.
As far as I can tell, an offset of 440 from r1, which is bigger than the 368
bytes "allocated" by the stdu, is invalid.
The line reported by GDB for the overwriting instruction is
https://github.com/xianyi/OpenBLAS/blob/v0.3.15/lapack-netlib/SRC/dgebal.f#L328

The command used to compile the file is: gfortran -fno-math-errno -Wall
-frecursive -fno-optimize-sibling-calls -m64 -fopenmp -fPIC -O2 -fno-fast-math
-mcpu=power9 -mtune=power9  -DUSE_OPENMP -fopenmp -fno-optimize-sibling-calls
-g  -c -o dgebal.o dgebal.f

Replacing the "O2" by "Og" changes the prologue to "stdu    r1,-336(r1)" and
the max offset used for std on r1 is 328. Using this works with FlexiBLAS,
hence I suspect an optimization issue which leads to more spills but doesn't
update the stack size.
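
The comparison made above can be written down as a small check. This is only
a sketch: the helper name is hypothetical, and it assumes that after the stdu
r1 points at the lowest address of the new frame, so that any non-negative
offset at or beyond the frame size lands in a caller's frame. The numbers are
the ones quoted from the two disassemblies.

```python
def stores_beyond_frame(frame_size, store_offsets):
    """Return the r1-relative store offsets that fall outside the frame.

    Assumption: r1 points at the lowest address of the current frame, so
    offsets in [0, frame_size) belong to the function's own frame.
    """
    return [off for off in store_offsets if off >= frame_size]


# -O2 build: "stdu r1,-368(r1)" and a later "std r25,440(r1)"
print(stores_beyond_frame(368, [440]))

# -Og build: "stdu r1,-336(r1)", maximum std offset on r1 is 328
print(stores_beyond_frame(336, [328]))
```

Under that assumption the -O2 store is flagged and the -Og one is not, which
matches the observed behaviour.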

Reproduced with GCC 10.2.0, 10.3.0, 11.1.0

From: alexander.grund@tu-dresden.de @ 2021-05-28 16:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #1 from Alexander Grund <alexander.grund@tu-dresden.de> ---
Confirmed to also break with GCC 7.3, 8.2, 8.3 but works with 6.3, 6.4, 6.5

From: bergner at gcc dot gnu.org @ 2021-06-01 19:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

Peter Bergner <bergner at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2021-06-01

--- Comment #2 from Peter Bergner <bergner at gcc dot gnu.org> ---
(In reply to Alexander Grund from comment #1)
> Confirmed to also break with GCC 7.3, 8.2, 8.3 but works with 6.3, 6.4, 6.5

The failure with GCC 7 and later coincides with the PPC port starting to
default to LRA instead of reload.  If I look at the debug dumps compiling
dgebal.f, the 440 offset to the stack is created by an LRA spill.  No problem
there that I can see.  The problem seems to come later when we generate the
prologue/epilogue and we only update the stack pointer by the smaller 368 byte
offset.

Either LRA isn't telling us it needs that extra stack space or the ppc backend
didn't notice.  I'll keep digging.

From: segher at gcc dot gnu.org @ 2021-06-01 21:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #3 from Segher Boessenkool <segher at gcc dot gnu.org> ---
Hi Alexander,

You do not say what the actual target you used is?  powerpc-linux,
powerpc64-linux, powerpc64le-linux, something else entirely?

From: amodra at gmail dot com @ 2021-06-02  0:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

Alan Modra <amodra at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|powerpc                     |powerpc64le
                 CC|                            |amodra at gmail dot com

--- Comment #4 from Alan Modra <amodra at gmail dot com> ---
The disassembly says this is powerpc64le.  Possibly interesting fact: the
offsets used above the stack frame are 400, 432, 440, which all correspond to
the parameter save area.  I don't see any reason that DGEBAL should have a
parameter save area though since all parameters can be passed in regs.
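
To make the correspondence concrete, here is a rough sketch that maps an
r1-relative offset back to the argument register whose parameter save slot it
addresses. The function name and constants are illustrative only. It assumes
the 368-byte frame from the original report, a 32-byte frame header in front
of the caller's parameter save area, and one 8-byte slot per register for r3
to r10.

```python
HEADER_SIZE = 32    # assumed size of the frame header before the save area
SLOT_SIZE = 8       # one doubleword per general-purpose argument register
FIRST_ARG_REG = 3   # r3 carries the first integer/pointer argument
LAST_ARG_REG = 10   # r10 carries the eighth


def param_save_slot(frame_size, r1_offset):
    """Return the register number (3..10) whose save slot sits at
    r1 + r1_offset, or None if the offset is not such a slot."""
    rel = r1_offset - frame_size - HEADER_SIZE
    if rel < 0 or rel % SLOT_SIZE != 0:
        return None
    reg = FIRST_ARG_REG + rel // SLOT_SIZE
    return reg if reg <= LAST_ARG_REG else None


for off in (400, 432, 440):
    print(off, param_save_slot(368, off))
```

With these assumptions the three offsets land on the slots for r3, r7 and r8.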

From: bergner at gcc dot gnu.org @ 2021-10-05 22:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #5 from Peter Bergner <bergner at gcc dot gnu.org> ---
So I took dgebal.f and ran delta on it to try and reduce it to something
manageable (I wish creduce worked on fortran files!) and got the following
which still shows us accessing above the stack.

      SUBROUTINE DGEBAL( JOB, N, A, LDA, ILO, IHI, SCALE, INFO )
      CHARACTER          JOB
      DOUBLE PRECISION   A( LDA, * ), SCALE( * )
      LOGICAL            NOCONV
  140 CONTINUE
      DO 200 I = K, L
         C = DNRM2( L-K+1, A( K, I ), 1 )
         R = DNRM2( L-K+1, A( I, K ), LDA )
         ICA = IDAMAX( L, A( 1, I ), 1 )
         CA = ABS( A( ICA, I ) )
         IF( C.EQ.ZERO .OR. R.EQ.ZERO )
     $      GO TO 200
         IF( G.LT.R .OR. MAX( R, RA ).GE.SFMAX2 .OR.
     $       MIN( F, C, G, CA ).LE.SFMIN2 )GO TO 190
         F = F / SCLFAC
         G = G / SCLFAC
  190    CONTINUE
         IF( ( C+R ).GE.FACTOR*S )
     $      GO TO 200
         IF( F.LT.ONE .AND. SCALE( I ).LT.ONE ) THEN
         END IF
         CALL DSCAL( N-K+1, G, A( I, K ), LDA )
  200 CONTINUE
      IF( NOCONV )
     $   GO TO 140
      END

This isn't related to some strange fortran parameter passing rules (ie, all
params are passed by reference), is it?


dgebal_:
.LFB0:
        .cfi_startproc
.LCF0:
0:      addis 2,12,.TOC.-.LCF0@ha
        addi 2,2,.TOC.-.LCF0@l
        .localentry     dgebal_,.-dgebal_
        std 24,-88(1)
        .cfi_offset 24, -88
        lwa 24,0(6)
        mflr 0
        mfcr 11,8
        std 20,-120(1)
        std 15,-160(1)
        std 16,-152(1)
        std 17,-144(1)
        std 19,-128(1)
        std 21,-112(1)
        std 22,-104(1)
        std 23,-96(1)
        std 25,-80(1)
        std 27,-64(1)
        std 28,-56(1)
        stw 11,8(1)
        li 9,0
        .cfi_register 65, 0
        .cfi_offset 20, -120
        .cfi_offset 15, -160
        .cfi_offset 16, -152
        .cfi_offset 17, -144
        .cfi_offset 19, -128
        .cfi_offset 21, -112
        .cfi_offset 22, -104
        .cfi_offset 23, -96
        .cfi_offset 25, -80
        .cfi_offset 27, -64
        .cfi_offset 28, -56
        .cfi_offset 72, 8
        addis 27,2,.LANCHOR0@toc@ha
        stfd 29,-24(1)
        stfd 30,-16(1)
        stfd 31,-8(1)
        std 14,-168(1)
        std 18,-136(1)
        std 26,-72(1)
        std 29,-48(1)
        cmpdi 0,24,0
        std 0,16(1)
        std 30,-40(1)
        std 31,-32(1)
        stdu 1,-224(1)
        .cfi_def_cfa_offset 224
        .cfi_offset 61, -24
        .cfi_offset 62, -16
        .cfi_offset 63, -8
        .cfi_offset 14, -168
        .cfi_offset 18, -136
        .cfi_offset 26, -72
        .cfi_offset 29, -48
        .cfi_offset 65, 16
        .cfi_offset 30, -40
        .cfi_offset 31, -32
        addi 27,27,.LANCHOR0@toc@l
        li 21,-8
        mr 25,6
        isel 24,0,24,0
        mr 16,4
        cmpwi 4,9,0
        addi 28,1,32
        addi 22,1,36
        addi 15,1,40
        std 5,272(1)       # 272 is bigger than 224!
...

From: kenneth.hoste at ugent dot be @ 2022-01-09 11:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #6 from Kenneth Hoste <kenneth.hoste at ugent dot be> ---
(In reply to Segher Boessenkool from comment #3)
> Hi Alexander,
> 
> You do not say what the actual target you used is?  powerpc-linux,
> powerpc64-linux, powerpc64le-linux, something else entirely?

We're definitely seeing this on ppc64le, see also
https://github.com/mpimd-csc/flexiblas/issues/17 and
https://github.com/easybuilders/easybuild-easyconfigs/issues/12968 for
additional context.

From: alexander.grund@tu-dresden.de @ 2022-07-08 10:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #7 from Alexander Grund <alexander.grund@tu-dresden.de> ---
Hi,
it's more than 1 year later now. Peter seemingly has a simple reproducer.
Is there anything new on this? Any patch to fix that or at least anything to
try or a workaround like disabling a specific optimization causing this?

Best Regards

From: bergner at gcc dot gnu.org @ 2022-07-08 16:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

Peter Bergner <bergner at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |bergner at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #8 from Peter Bergner <bergner at gcc dot gnu.org> ---
(In reply to Alexander Grund from comment #7)
> Hi, it's more than 1 year later now. Peter seemingly has a simple reproducer.
> Is there anything new on this? Any patch to fix that or at least anything to
> try or a workaround like disabling a specific optimization causing this?

I'm sorry, this is still on my TODO to debug.  I have worked on this, but got
sidetracked on other things.  I'll try and refresh myself with where I was at
and continue working on this.

From: bergner at gcc dot gnu.org @ 2022-07-14 20:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

Peter Bergner <bergner at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|bergner at gcc dot gnu.org         |jskumari at gcc dot gnu.org

--- Comment #9 from Peter Bergner <bergner at gcc dot gnu.org> ---
(In reply to Peter Bergner from comment #8)
> I'm sorry, this is still on my TODO to debug.  I have worked on this, but
> got sidetracked on other things.  I'll try and refresh myself with where I
> was at and continue working on this.

Actually, Surya from my team will take over looking at this.  Reassigning the
bug to her.

From: alexander.grund@tu-dresden.de @ 2022-07-20 11:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #10 from Alexander Grund <alexander.grund@tu-dresden.de> ---
(In reply to Peter Bergner from comment #2)
> The failure with GCC 7 and later coincides with the PPC port starting to
> default to LRA instead of reload.

Is there a compiler flag that can switch the default back as a workaround?

From: alexander.grund@tu-dresden.de @ 2022-07-20 14:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #11 from Alexander Grund <alexander.grund@tu-dresden.de> ---
Some more experiments with GCC 10.3, OpenBLAS 0.3.15 and FlexiBLAS 3.0.4:

Baseline: Broken at -O1, working at -Og

I got it to break with "-Og -fmove-loop-invariants".
Then it worked again by adding "-fstack-protector-all". But that is seemingly
not advisable:
https://developers.redhat.com/blog/2020/05/22/stack-clash-mitigation-in-gcc-part-3

Hence the current workaround is to use "-O2 -fno-move-loop-invariants"

From: segher at gcc dot gnu.org @ 2022-07-20 17:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #12 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to Alexander Grund from comment #10)
> (In reply to Peter Bergner from comment #2)
> > The failure with GCC 7 and later coincides with the PPC port starting to
> > default to LRA instead of reload.
> 
> Is there a compiler flag that can switch the default back as a workaround?

No, the PowerPC GCC port only supports LRA since g:7a5cbf29beb2 (from 2017).

From: segher at gcc dot gnu.org @ 2022-07-20 17:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #13 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to Alexander Grund from comment #11)
> Some more experiments with GCC 10.3, OpenBLAS 0.3.15 and FlexiBLAS 3.0.4:
> 
> Baseline: Broken at -O1, working at -Og
> 
> I got it to break with "-Og -fmove-loop-invariants".
> Then it worked again by adding "-fstack-protector-all".

Both are great info!

> But that is
> seemingly not advisable:
> https://developers.redhat.com/blog/2020/05/22/stack-clash-mitigation-in-gcc-
> part-3

-fstack-protector-strong is cheap enough that you can (and perhaps should)
enable it almost always.  Some distributions do this even?

-fstack-check= is an Ada thing.  -fstack-clash-protection is a different thing
as well (that's what that article is about).

Enabling ssp is not a great workaround of course, it is much too roundabout;
and I suspect the only reason it works is because it changes the stack layout.
Still, useful info, thanks :-)

From: segher at gcc dot gnu.org @ 2022-09-13 19:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #14 from Segher Boessenkool <segher at gcc dot gnu.org> ---
What is the exact command line (and relevant configuration!) required to
reproduce this?

From: jskumari at gcc dot gnu.org @ 2022-09-19  5:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #15 from Surya Kumari Jangala <jskumari at gcc dot gnu.org> ---
(In reply to Segher Boessenkool from comment #14)
> What is the exact command line (and relevant configuration!) required to
> reproduce this?

The reduced testcase is:

      SUBROUTINE DGEBAL( JOB, N, ARRAY, LDA, ILO, IHI, SCALE, INFO )
      CHARACTER          JOB
      DOUBLE PRECISION   ARRAY( LDA, * ), SCALE( * )
      LOGICAL            NOCONV
  140 CONTINUE
      DO 200 I = K, L
         C = DNRM2( L-K+1, ARRAY( K, I ), 1 )
         R = DNRM2( L-K+1, ARRAY( I, K ), LDA )
         ICA = IDAMAX( L, ARRAY( 1, I ), 1 )
         CA = ABS( ARRAY( ICA, I ) )
         IF( C.EQ.ZERO .OR. R.EQ.ZERO )
     $      GO TO 200
         IF( G.LT.R .OR. MAX( R, RA ).GE.SFMAX2 .OR.
     $       MIN( F, C, G, CA ).LE.SFMIN2 )GO TO 190
         F = F / SCLFAC
         G = G / SCLFAC
  190    CONTINUE
         CALL DSCAL( N-K+1, G, ARRAY( I, K ), LDA )
  200 CONTINUE
      IF( NOCONV )
     $   GO TO 140
      END


The options to use to reproduce: -mcpu=power8 -O2 -fPIC

From: segher at gcc dot gnu.org @ 2022-09-20 22:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #16 from Segher Boessenkool <segher at gcc dot gnu.org> ---
It cannot be -mcpu=power8, that cannot generate isel.  -mcpu=power9 comes
closer, but I still do not see exactly the same output, and crucially not
the strange store either.

What the what.

From: jskumari at gcc dot gnu.org @ 2022-10-17  8:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #17 from Surya Kumari Jangala <jskumari at gcc dot gnu.org> ---
I analysed the reduced test case specified in comment 15. In the .s file, the
callee decrements r1 by 224, ie, callee’s frame size is 224. But there is an
instruction in the callee that accesses into the caller’s frame at (r1+272).
At first glance this looks odd, even incorrect, but after further analysis, I
am not sure if this is incorrect.
If we look at the RTL dumps, the offset 272 is introduced in ‘reload’. ‘Insn 4’
stores into (r1+272). 

‘Insn 4’ after vregs:

(insn 4 3 5 2 (set (reg/v/f:DI 177 [ arrayD.2714 ])
        (reg:DI 5 5 [ arrayD.2714 ])) "bug.f":1:23 675 {*movdi_internal64}
     (expr_list:REG_EQUIV (mem/f/c:DI (plus:DI (reg/f:DI 99 ap)
                (const_int 48 [0x30])) [3 arrayD.2714+0 S8 A64])
        (nil)))


‘Insn 4’ after IRA:

(insn 4 214 237 2 (set (reg/v/f:DI 177 [ arrayD.2714 ])
        (reg:DI 262)) "bug.f":1:23 675 {*movdi_internal64}
     (expr_list:REG_DEAD (reg:DI 262)
        (expr_list:REG_EQUIV (mem/f/c:DI (plus:DI (reg/f:DI 99 ap)
                    (const_int 48 [0x30])) [3 arrayD.2714+0 S8 A64])
            (nil))))

‘Insn 4’ after reload:

(insn 4 214 19 2 (set (mem/f/c:DI (plus:DI (reg/f:DI 1 1)
                (const_int 272 [0x110])) [3 arrayD.2714+0 S8 A64])
        (reg:DI 5 5 [262])) "bug.f":1:23 675 {*movdi_internal64}
     (expr_list:REG_EQUIV (mem/f/c:DI (plus:DI (reg/f:DI 99 ap)
                (const_int 48 [0x30])) [3 arrayD.2714+0 S8 A64])
        (nil)))


As we can see, during vregs phase, we are moving r5 to r177 and r177 is equiv
to (ap+48). ‘ap’ (r99) is the base register for access to arguments of the
function.

In the gcc code:
#define ARG_POINTER_REGNUM 99

During vregs phase, not just r5, but all registers from r3-r10 are moved to
pseudo registers and these pseudo regs are equivalent to (ap+’offset’) with
‘offset’ starting from 32 for r3 and going on till 88 for r10. Note that ap
points to the beginning of the callee frame, hence to access the parameter save
area of the caller’s frame, 32 needs to be added to ap.

During LRA, in curr_insn_transform(), we make equivalence substitution and
change r177 to r1+272. (272 because r177 is equivalent to ap+48, and ap equals
r1+224, so ap+48 = r1+272). 
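
The arithmetic behind this substitution can be sketched as below. The function
names are made up for illustration and this is not how LRA itself computes
anything; it only restates the relations given above (ap = r1 + frame size,
and the ap offsets run from 32 for r3 to 88 for r10 in 8-byte steps).

```python
def ap_offset_for_arg_reg(reg):
    """ap-relative offset of the save slot for argument register r3..r10."""
    if not 3 <= reg <= 10:
        raise ValueError("only r3..r10 have a parameter save slot here")
    return 32 + 8 * (reg - 3)


def ap_to_r1_offset(frame_size, ap_offset):
    """Rewrite (ap + ap_offset) as (r1 + result), given ap = r1 + frame_size."""
    return frame_size + ap_offset


# Reduced test case: frame size 224, r5 is equivalent to ap+48.
print(ap_offset_for_arg_reg(5))
print(ap_to_r1_offset(224, ap_offset_for_arg_reg(5)))
```

For r5 this gives ap+48 and, with the 224-byte frame, r1+272, the offset seen
in insn 4 after reload.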

The argument registers r3-r10 are saved as they need to be reused to pass
parameters to functions called from the callee. But not all parameter registers
are spilled to the stack. For example, r6 is saved in r24. We can see this
after the “final” phase:

(insn 5 289 19 (set (reg/v/f:DI 24 %r24 [orig:178 ldaD.2715 ] [178])
        (reg:DI 6 %r6 [263])) "bug.f":1:23 675 {*movdi_internal64}
     (expr_list:REG_EQUIV (mem/f/c:DI (plus:DI (reg/f:DI 99 ap)
                (const_int 56 [0x38])) [6 ldaD.2715+0 S8 A64])
        (nil)))

I guess r5 had to be spilled to stack because there were no free registers.

Also, note that there is a load from (r1+272) in the reduced test case. This
shows that the value in r5 is needed, and hence it has to be saved somewhere.

I ran the test case with the options: -mcpu=power8 -O2 -fPIC

If -fPIC option is removed, we do not see any access to the caller’s frame in
the generated assembly. But it does have instructions that save the parameter
registers into other registers. I suppose the parameter registers did not have
to be saved on stack (ie, in the caller’s parameter save area) because there
were enough registers available. That is, perhaps there is less register
pressure without -fPIC.

After vregs:
(insn 4 3 5 2 (set (reg/v/f:DI 177 [ arrayD.2714 ])
        (reg:DI 5 %r5 [ arrayD.2714 ])) "bug.f":1:23 675 {*movdi_internal64}
     (expr_list:REG_EQUIV (mem/f/c:DI (plus:DI (reg/f:DI 99 ap)
                (const_int 48 [0x30])) [3 arrayD.2714+0 S8 A64])

After reload:
(insn 4 214 19 2 (set (reg/v/f:DI 17 %r17 [orig:177 arrayD.2714 ] [177])
        (reg:DI 5 %r5 [262])) "bug.f":1:23 675 {*movdi_internal64}
     (expr_list:REG_EQUIV (mem/f/c:DI (plus:DI (reg/f:DI 99 ap)
                (const_int 48 [0x30])) [3 arrayD.2714+0 S8 A64])
        (nil)))


To summarise, the reduced testcase seems to be correctly compiled. So I shifted
my focus to the original Fortran file dgebal.f in the OpenBLAS library.


In dgebal.f too we have some instructions accessing the caller’s parameter save
area. These are the interesting snippets of instructions from the assembly
code: 

   // The original contents of r23 are spilled.
std %r23,-192(%r1)
   // r3 is saved in r23
mr %r23,%r3
   // frame is allocated
stdu %r1,-400(%r1)

   // Restore r3 contents before making a call to lsame_. There are several
   // calls to lsame_ and each time, r3 is restored.
mr %r3,%r23
bl lsame_

   // Save r23 to the stack because we are running out of registers and we
   // need a free reg. Note that we are saving to the caller’s frame, into the
   // parameter save area, at (400+32), which is the location where r3 would
   // have been spilled. This is correct because r23 holds the contents of r3.
std %r23,432(%r1)
   // Use r23
li %r23,1
cmpwi %cr0,%r23,2

   // Load back r23 as we need to pass parameter to lsame_
ld %r23,432(%r1)
mr %r3,%r23
bl lsame_

   // Epilogue: restore r1 and the original contents of r23.
addi %r1,%r1,400
ld %r23,-192(%r1)
blr

The snippets of assembly code above are for r3 being saved in r23. Other
parameter registers are saved in the same way; for example, r10 is copied to
r30, which is later spilled into the caller’s parameter save area at
(r1+488). 488 = 400+32+56 = 400+32+8*7, and this is the location for r10.

In the rtl dump, after vregs phase, we can see registers r3 to r10 being saved
to pseudo registers.

After vregs phase:
(r3 saved to pseudo r303 which is equiv to ap+32)

(insn 2 43 3 2 (set (reg/v/f:DI 303 [ jobD.2712 ])
        (reg:DI 3 %r3 [ jobD.2712 ])) "dgebal.f":2:23 675 {*movdi_internal64}
     (expr_list:REG_EQUIV (mem/f/c:DI (plus:DI (reg/f:DI 99 ap)
                (const_int 32 [0x20])) [4 jobD.2712+0 S8 A64])

After reload: 
(r303 is assigned to r23 and it is spilled at r1+432).

(insn 1620 931 1627 22 (set (mem/f/c:DI (plus:DI (reg/f:DI 1 %r1)
                (const_int 432 [0x1b0])) [4 jobD.2712+0 S8 A64])
        (reg/v/f:DI 23 %r23 [orig:303 jobD.2712 ] [303])) 675
{*movdi_internal64}
     (nil))


From the description of the bug
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799#c0), the issue occurs when
FlexiBLAS is used with OpenBLAS: when the Fortran routine DGEEV calls another
Fortran routine, DGEBAL, the second parameter (’N’) is corrupted by the time
control returns to DGEEV. (DGEBAL and DGEEV are both routines in OpenBLAS.)
When FlexiBLAS is used, any call from one OpenBLAS routine to another goes
through FlexiBLAS, so the call to DGEBAL goes through FlexiBLAS first.
FlexiBLAS is a wrapper library written in C, while OpenBLAS is a Fortran
library, and FlexiBLAS contains a wrapper for DGEBAL which reroutes the call to
DGEBAL in OpenBLAS.

My suspicion is that the wrapper routine written in C does not allocate the
optional parameter save area. I tried compiling the wrapper routine for dgebal
with -O2 -fPIC, and with these options the frame size is only 32; the parameter
save area is not being allocated. I think this results in corruption of DGEEV’s
stack contents when the Fortran routine DGEBAL writes into the caller’s
parameter save area. I am not sure which options FlexiBLAS is built with, but I
suspect we do not allocate a parameter save area irrespective of the options
used.

I wonder if saving the parameter registers r3-r10 to the parameter save area of
the caller’s frame is specific to Fortran. In C, it looks like these registers
are saved in the callee's frame itself.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Bug target/100799] Stackoverflow in optimized code on PPC
  2021-05-27 11:20 [Bug fortran/100799] New: Stackoverflow in optimized code on PPC alexander.grund@tu-dresden.de
                   ` (16 preceding siblings ...)
  2022-10-17  8:17 ` jskumari at gcc dot gnu.org
@ 2022-10-17  9:42 ` jskumari at gcc dot gnu.org
  2022-10-17 17:10 ` jskumari at gcc dot gnu.org
                   ` (15 subsequent siblings)
  33 siblings, 0 replies; 35+ messages in thread
From: jskumari at gcc dot gnu.org @ 2022-10-17  9:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #18 from Surya Kumari Jangala <jskumari at gcc dot gnu.org> ---
I git cloned and built flexiblas to see what the frame size is and what
assembly code is generated for the flexiblas C wrapper routine for dgebal.

The important assembly code snippets for dgebal.c :

// r23-r31 are saved in the callee frame
   std     r23,-72(r1)
   std     r24,-64(r1)
   ...
   ...
   std     r31,-8(r1)

// allocate the stack frame
   stdu    r1,-112(r1)

// save the parameter registers r3-r10 into r23-r30
   mr      r30,r3
   ...
   mr      r23,r10

// some of the param regs are used as temps
   ld      r3,0(r31)
   lwz     r11,16(r3)

// populate the param registers appropriately
   mr      r3,r30
   ...
   mr      r10,r23

// make the call to the fortran dgebal routine
   bctrl

// restore r1
   addi    r1,r1,112

// restore r23-r31
   ld      r23,-72(r1)
   ...
   ld      r31,-8(r1)

// return
   blr

As we can see, the frame size allocated is only 112, of which 32 is for things
like LR, TOC etc. and 72 is needed to save r23-r31 (the remaining 8 bytes are
presumably alignment padding). So clearly, the wrapper routine is not
allocating any parameter save area in its frame.
Now, the dgebal Fortran routine writes into the caller's frame, thereby
corrupting the saved value of a callee-saved register (one of r23-r31). So when
control returns from the wrapper routine to the Fortran routine dgeev, we see a
corrupted value.
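
To make the overlap concrete, here is a small sketch that maps each slot of the
assumed parameter save area onto the wrapper's 112-byte frame. The helper names
are made up, and the layout is an assumption taken from the snippets above:
r23-r31 stored at -72..-8 from the old r1, and a 32-byte frame header. It only
shows which save slots would coincide, not which one was actually hit in the
failing run.

```c
#include <assert.h>

enum { FRAME_HEADER = 32, DWORD = 8 };

/* Offset inside the wrapper's frame of the save slot for callee-saved
   register `reg`, which the prologue stored at -(8 * (32 - reg)) from the
   old r1 before allocating `frame_size` bytes. */
long saved_reg_offset(long frame_size, int reg)
{
    assert(reg >= 23 && reg <= 31);
    return frame_size - DWORD * (32 - reg);
}

/* Offset inside the caller's frame where a callee that assumes a parameter
   save area would spill the argument that arrived in r3 + arg_index. */
long param_slot_offset(int arg_index)
{
    assert(arg_index >= 0 && arg_index < 8);
    return FRAME_HEADER + DWORD * arg_index;
}

/* Returns the callee-saved register whose save slot coincides with the
   parameter slot for argument `arg_index`, or 0 if there is none. */
int clobbered_saved_reg(long frame_size, int arg_index)
{
    long slot = param_slot_offset(arg_index);
    for (int reg = 23; reg <= 31; reg++)
        if (saved_reg_offset(frame_size, reg) == slot)
            return reg;
    return 0;
}
```

Under these assumptions the slot for the first argument (offset 32) hits
nothing, while the slots for the second to eighth arguments land on the saved
copies of r23 to r29.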

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Bug target/100799] Stackoverflow in optimized code on PPC
  2021-05-27 11:20 [Bug fortran/100799] New: Stackoverflow in optimized code on PPC alexander.grund@tu-dresden.de
                   ` (17 preceding siblings ...)
  2022-10-17  9:42 ` jskumari at gcc dot gnu.org
@ 2022-10-17 17:10 ` jskumari at gcc dot gnu.org
  2022-10-31  3:00 ` linkw at gcc dot gnu.org
                   ` (14 subsequent siblings)
  33 siblings, 0 replies; 35+ messages in thread
From: jskumari at gcc dot gnu.org @ 2022-10-17 17:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #19 from Surya Kumari Jangala <jskumari at gcc dot gnu.org> ---
Fortran has an attribute called BIND(C) which can be specified on a procedure
to make it interoperable with C.
I tried this attribute on the DGEBAL Fortran routine, which is part of the
OpenBLAS library, and it worked! I did not see any REG_EQUIV notes after the
expand pass, and the final assembly did not have accesses to the caller's frame.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Bug target/100799] Stackoverflow in optimized code on PPC
  2021-05-27 11:20 [Bug fortran/100799] New: Stackoverflow in optimized code on PPC alexander.grund@tu-dresden.de
                   ` (18 preceding siblings ...)
  2022-10-17 17:10 ` jskumari at gcc dot gnu.org
@ 2022-10-31  3:00 ` linkw at gcc dot gnu.org
  2022-11-09 16:43 ` jskumari at gcc dot gnu.org
                   ` (13 subsequent siblings)
  33 siblings, 0 replies; 35+ messages in thread
From: linkw at gcc dot gnu.org @ 2022-10-31  3:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

Kewen Lin <linkw at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |linkw at gcc dot gnu.org

--- Comment #20 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Alan Modra from comment #4)
> The disassembly says this is powerpc64le.  Possibly interesting fact: the
> offsets used above the stack frame are 400, 432, 440, which all correspond
> to the parameter save area.  I don't see any reason that DGEBAL should have
> a parameter save area though since all parameters can be passed in regs.

This also confused me, since the function prototype

  SUBROUTINE DGEBAL( JOB, N, A, LDA, ILO, IHI, SCALE, INFO )

only has eight parameters. Looking into it, the reason is that the first
parameter

  "CHARACTER JOB"

has an associated hidden length argument.

"For arguments of CHARACTER type, the character length is passed as a hidden
argument at the end of the argument list." as said in [1], so this function
actually has nine doubleword arguments, more than the eight that can be passed
in registers, and so it does need a parameter save area.

[1] https://gcc.gnu.org/onlinedocs/gfortran/Argument-passing-conventions.html

Surya's analysis looks reasonable to me: the current stub scheme, with a
function pointer call in C, doesn't match the Fortran side.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Bug target/100799] Stackoverflow in optimized code on PPC
  2021-05-27 11:20 [Bug fortran/100799] New: Stackoverflow in optimized code on PPC alexander.grund@tu-dresden.de
                   ` (19 preceding siblings ...)
  2022-10-31  3:00 ` linkw at gcc dot gnu.org
@ 2022-11-09 16:43 ` jskumari at gcc dot gnu.org
  2023-06-19 20:25 ` bergner at gcc dot gnu.org
                   ` (12 subsequent siblings)
  33 siblings, 0 replies; 35+ messages in thread
From: jskumari at gcc dot gnu.org @ 2022-11-09 16:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

Surya Kumari Jangala <jskumari at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |WAITING

--- Comment #21 from Surya Kumari Jangala <jskumari at gcc dot gnu.org> ---
There are two options to resolve the issue:

1. Use the BIND(C) directive on the fortran callee (DGEBAL) to make it
interoperable with the caller which is written in C. As described in comment
19, using this directive removed accesses to the caller's frame.

2. As described in
(https://gcc.gnu.org/onlinedocs/gfortran/Argument-passing-conventions.html),
since the first parameter to DGEBAL is of type CHARACTER, there is an extra
hidden argument. Change the call to DGEBAL from dgebal (the flexiBLAS wrapper
routine) to take an extra argument. This causes the compiler to allocate a
parameter save area in dgebal's frame, as there are now 9 parameters but only 8
parameter registers.
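
A minimal sketch of option 2, not the actual FlexiBLAS code: all names here
(wrapper_dgebal, backend_dgebal, fake_backend_dgebal) are hypothetical, blasint
is assumed to be a 32-bit int, and the hidden length is assumed to have type
size_t, which is what gfortran uses from GCC 8 on.

```c
#include <assert.h>
#include <stddef.h>

typedef int blasint;  /* assumption: 32-bit BLAS integers */

/* DGEBAL as gfortran compiles it: eight visible arguments, then the hidden
   length of the CHARACTER argument JOB, passed by value at the end.
   Assumption: the hidden length has type size_t (GCC 8 and later). */
typedef void (*dgebal_fn)(char *job, blasint *n, double *a, blasint *lda,
                          blasint *ilo, blasint *ihi, double *scale,
                          blasint *info, size_t job_len);

/* Hypothetical stand-in for the real backend routine; it only records the
   hidden length it received so that the forwarding can be checked. */
static size_t seen_job_len;

static void fake_backend_dgebal(char *job, blasint *n, double *a,
                                blasint *lda, blasint *ilo, blasint *ihi,
                                double *scale, blasint *info, size_t job_len)
{
    (void)job; (void)n; (void)a; (void)lda;
    (void)ilo; (void)ihi; (void)scale;
    seen_job_len = job_len;
    *info = 0;
}

/* Stands in for current_backend->lapack.dgebal.f77_blas_function. */
static dgebal_fn backend_dgebal = fake_backend_dgebal;

/* Wrapper with the hidden length made explicit. Because the call below
   passes nine doubleword arguments, the callee's assumption that a
   parameter save area exists is met. */
void wrapper_dgebal(char *job, blasint *n, double *a, blasint *lda,
                    blasint *ilo, blasint *ihi, double *scale,
                    blasint *info, size_t job_len)
{
    assert(backend_dgebal != NULL);
    backend_dgebal(job, n, a, lda, ilo, ihi, scale, info, job_len);
}
```

The wrapper takes the length itself and forwards it rather than inventing a
value, so a Fortran caller and the Fortran callee agree on all nine arguments.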

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Bug target/100799] Stackoverflow in optimized code on PPC
  2021-05-27 11:20 [Bug fortran/100799] New: Stackoverflow in optimized code on PPC alexander.grund@tu-dresden.de
                   ` (20 preceding siblings ...)
  2022-11-09 16:43 ` jskumari at gcc dot gnu.org
@ 2023-06-19 20:25 ` bergner at gcc dot gnu.org
  2024-02-21  7:38 ` jakub at gcc dot gnu.org
                   ` (11 subsequent siblings)
  33 siblings, 0 replies; 35+ messages in thread
From: bergner at gcc dot gnu.org @ 2023-06-19 20:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

Peter Bergner <bergner at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|WAITING                     |RESOLVED

--- Comment #22 from Peter Bergner <bergner at gcc dot gnu.org> ---
I'm closing this as NOT A BUG in GCC; it is a bug in the source code being
compiled, which is not cognizant of the rules for calling between Fortran and
C.  Surya listed two solutions which can be used in Comment #21 above.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Bug target/100799] Stackoverflow in optimized code on PPC
  2021-05-27 11:20 [Bug fortran/100799] New: Stackoverflow in optimized code on PPC alexander.grund@tu-dresden.de
                   ` (21 preceding siblings ...)
  2023-06-19 20:25 ` bergner at gcc dot gnu.org
@ 2024-02-21  7:38 ` jakub at gcc dot gnu.org
  2024-02-22  2:51 ` bergner at gcc dot gnu.org
                   ` (10 subsequent siblings)
  33 siblings, 0 replies; 35+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-02-21  7:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #23 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Note, given that in PR90329 a workaround has been introduced for such buggy
cases (that time to disallow functions with the DECL_HIDDEN_STRING_LENGTH
arguments from making certain tail-calls and call them normally instead), if
the PowerPC backend maintainers wanted, there could be a similar workaround on
the rs6000 backend side: in the decisions whether the callee can use the
parameter save area or not, ignore counting DECL_HIDDEN_STRING_LENGTH
PARM_DECLs, so if e.g. 9 arguments are passed but one of them is
DECL_HIDDEN_STRING_LENGTH, assume parameter save area is not there.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Bug target/100799] Stackoverflow in optimized code on PPC
  2021-05-27 11:20 [Bug fortran/100799] New: Stackoverflow in optimized code on PPC alexander.grund@tu-dresden.de
                   ` (22 preceding siblings ...)
  2024-02-21  7:38 ` jakub at gcc dot gnu.org
@ 2024-02-22  2:51 ` bergner at gcc dot gnu.org
  2024-02-22 14:44 ` bergner at gcc dot gnu.org
                   ` (9 subsequent siblings)
  33 siblings, 0 replies; 35+ messages in thread
From: bergner at gcc dot gnu.org @ 2024-02-22  2:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #24 from Peter Bergner <bergner at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #23)
> if the PowerPC backend maintainers wanted, there could be a similar workaround
> on the rs6000 backend side, in the decisions whether the callee can use
> the parameter save area or not ignore counting DECL_HIDDEN_STRING_LENGTH
> PARM_DECLs, so if e.g. 9 arguments are passed but one of them is
> DECL_HIDDEN_STRING_LENGTH, assume parameter save area is not there.

If the callee has 9 arguments, even if one is a hidden str len arg, then there
MUST be a parameter save area, since that is where the callee is supposed to
load the 9th argument from.  There is simply no other location that 9th
argument exists at.

I think the only viable rs6000 workaround is for the caller to allocate a
parameter save area in some cases where it doesn't think it needs one.  Ie, the
caller is calling a function which it thinks has 8 parameters and there might
be a hidden one (maybe one param is a string or whatever the Fortran CHARACTER
with len greater than 1 maps to) because the callee might be a Fortran routine.
That would solve the problem of the callee scribbling data into the caller's
frame, but wouldn't solve the issue that the caller didn't actually place a
valid value for the missing hidden parameter.  Thoughts on that?

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Bug target/100799] Stackoverflow in optimized code on PPC
  2021-05-27 11:20 [Bug fortran/100799] New: Stackoverflow in optimized code on PPC alexander.grund@tu-dresden.de
                   ` (23 preceding siblings ...)
  2024-02-22  2:51 ` bergner at gcc dot gnu.org
@ 2024-02-22 14:44 ` bergner at gcc dot gnu.org
  2024-02-22 14:59 ` jakub at gcc dot gnu.org
                   ` (8 subsequent siblings)
  33 siblings, 0 replies; 35+ messages in thread
From: bergner at gcc dot gnu.org @ 2024-02-22 14:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

Peter Bergner <bergner at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dje at gcc dot gnu.org,
                   |                            |meissner at gcc dot gnu.org

--- Comment #25 from Peter Bergner <bergner at gcc dot gnu.org> ---
CCing Mike and David for possible comments about the possible workarounds
mentioned in Comment 23 and Comment 24.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Bug target/100799] Stackoverflow in optimized code on PPC
  2021-05-27 11:20 [Bug fortran/100799] New: Stackoverflow in optimized code on PPC alexander.grund@tu-dresden.de
                   ` (24 preceding siblings ...)
  2024-02-22 14:44 ` bergner at gcc dot gnu.org
@ 2024-02-22 14:59 ` jakub at gcc dot gnu.org
  2024-02-25  0:39 ` bergner at gcc dot gnu.org
                   ` (7 subsequent siblings)
  33 siblings, 0 replies; 35+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-02-22 14:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #26 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Peter Bergner from comment #25)
> CCing Mike and David for possible comments about the possible workarounds
> mentioned in Comment 23 and Comment 24.

Doing the workaround on the caller side is impossible, this is for calls from
C/C++ to Fortran code, directly or indirectly called and there is nothing the
compiler could use to guess that it actually calls Fortran code with hidden
Fortran character arguments.
But I still think the workaround is possible on the callee side.
Sure, if the DECL_HIDDEN_STRING_LENGTH argument(s) is(are) used in the
function, then there is no easy way but expect the parameter save area (ok,
sure, it could just load from the assumed parameter location and don't assume
the rest is there, nor allow storing to the slots it loaded them from).
But that is actually not what BLAS etc. suffers from.
If you have something like
subroutine foo (a, b, c, d, e, f, g, h)
  character a
  integer b, c, d, e, f, g, h
  call bar (a, b, c, d, e, f, g, h)
end subroutine foo
then the DECL_HIDDEN_STRING_LENGTH argument isn't used at all, on the callee
side the user said that one should treat it as if the length of a is 1, so
whatever the caller passes is unimportant and when passing to further calls it
will just use 1:
void foo (character(kind=1)[1:1] & restrict a, integer(kind=4) & restrict b,
integer(kind=4) & restrict c, integer(kind=4) & restrict d, integer(kind=4) &
restrict e, integer(kind=4) & restrict f, integer(kind=4) & restrict g,
integer(kind=4) & restrict h, integer(kind=8) _a)
{
  <bb 2> :
  bar (a_2(D), b_3(D), c_4(D), d_5(D), e_6(D), f_7(D), g_8(D), h_9(D), 1);
  return;

}
It would seem that the _a argument is useless, but as explained in PR90329 it
has to be there because in Fortran you can call
foo ("foo", 1, 2, 3, 4, 5, 6, 7) without interfaces etc., and the first
argument could be character, character(len=1), character(len=3) or
character(len=*) etc.  Only in the last case is the argument actually needed;
in the other cases it is ignored.

So, the workaround could be for the case of unused DECL_HIDDEN_STRING_LENGTH
arguments at the end of PARM_DECLs don't try to load those at all and don't
assume there is parameter save area unless the non-DECL_HIDDEN_STRING_LENGTH or
used DECL_HIDDEN_STRING_LENGTH arguments actually require it.
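
As a rough model of that decision (not GCC code): the struct and function names
below are invented for illustration, and only GP_ARG_NUM_REG corresponds to
something in the rs6000 backend. Trailing hidden string-length parameters with
no uses are dropped before comparing against the number of GP argument
registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { GP_ARG_NUM_REG = 8 };  /* GP argument registers r3..r10 */

/* Hypothetical stand-in for a PARM_DECL; not a GCC data structure. */
struct parm {
    bool hidden_string_length;  /* models DECL_HIDDEN_STRING_LENGTH */
    bool used;                  /* models "has at least one use" */
};

/* Number of parameters left after dropping unused hidden string-length
   parameters from the end of the list. */
size_t effective_parm_count(const struct parm *parms, size_t n)
{
    assert(parms != NULL || n == 0);
    while (n > 0 && parms[n - 1].hidden_string_length && !parms[n - 1].used)
        n--;
    return n;
}

/* True if the callee may assume the caller allocated a parameter save area. */
bool may_assume_param_save_area(const struct parm *parms, size_t n)
{
    return effective_parm_count(parms, n) > GP_ARG_NUM_REG;
}
```

For a DGEBAL-like signature (eight normal parameters plus one unused hidden
length) the model answers "no save area"; as soon as the hidden length is
used, it answers "save area required", since that is the only place the ninth
argument can be loaded from.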

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Bug target/100799] Stackoverflow in optimized code on PPC
  2021-05-27 11:20 [Bug fortran/100799] New: Stackoverflow in optimized code on PPC alexander.grund@tu-dresden.de
                   ` (25 preceding siblings ...)
  2024-02-22 14:59 ` jakub at gcc dot gnu.org
@ 2024-02-25  0:39 ` bergner at gcc dot gnu.org
  2024-02-26  9:58 ` jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  33 siblings, 0 replies; 35+ messages in thread
From: bergner at gcc dot gnu.org @ 2024-02-25  0:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #27 from Peter Bergner <bergner at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #26)
> But I still think the workaround is possible on the callee side.
> Sure, if the DECL_HIDDEN_STRING_LENGTH argument(s) is(are) used in the
> function, then there is no easy way but expect the parameter save area (ok,
> sure, it could just load from the assumed parameter location and don't
> assume the rest is there, nor allow storing to the slots it loaded them
> from).
> But that is actually not what BLAS etc. suffers from.
[snip]
> So, the workaround could be for the case of unused DECL_HIDDEN_STRING_LENGTH
> arguments at the end of PARM_DECLs don't try to load those at all and don't
> assume there is parameter save area unless the non-DECL_HIDDEN_STRING_LENGTH
> or used DECL_HIDDEN_STRING_LENGTH arguments actually require it.
So I looked closer at what the failure mode was in this PR (versus the one
you're seeing with flexiblas).  As in your case, there is a mismatch in the
number of parameters the C caller thinks there are (8 args, so no param save
area needed) versus what the Fortran callee thinks there are (9 params which
include the one hidden arg, so there is a param save area).  The Fortran
function doesn't actually access the hidden argument in our test case above, in
fact the character argument is never used either.  What I see in the rtl dumps
is that *all* incoming args have a REG_EQUIV generated that points to the param
save area (this doesn't happen when there are 8 or fewer formal params), even
for the first 8 args that are passed in registers:

(insn 2 12 3 2 (set (reg/v/f:DI 117 [ r3 ])
        (reg:DI 3 3 [ r3 ])) "callee-3.c":6:1 685 {*movdi_internal64}
     (expr_list:REG_EQUIV (mem/f/c:DI (plus:DI (reg/f:DI 99 ap)
                (const_int 32 [0x20])) [1 r3+0 S8 A64])
        (nil)))
(insn 3 2 4 2 (set (reg/v:DI 118 [ r4 ])
        (reg:DI 4 4 [ r4 ])) "callee-3.c":6:1 685 {*movdi_internal64}
     (expr_list:REG_EQUIV (mem/c:DI (plus:DI (reg/f:DI 99 ap)
                (const_int 40 [0x28])) [2 r4+0 S8 A64])
        (nil)))
...

We then get to RA and we end up spilling one of the pseudos associated with one
of the other parameters (not the character param JOB).  LRA then uses that
REG_EQUIV note and rather than allocating a new stack slot to spill to, it uses
the parameter save memory location for that parameter for the spill slot.  When
we store to that memory location and the C caller has not allocated the param
save area, we end up clobbering an important part of the C caller's stack,
causing a crash.

If we were to try and do a callee workaround, we would need to disable setting
those REG_EQUIV notes for the parameters... if that's even possible.  Since
Fortran uses call-by-reference parameter passing, isn't the updated param value
from the callee returned in the parameter save area itself???


> Doing the workaround on the caller side is impossible, this is for calls
> from C/C++ to Fortran code, directly or indirectly called and there is
> nothing the compiler could use to guess that it actually calls Fortran code
> with hidden Fortran character arguments.
As a HUGE hammer, every caller could always allocate a param save area.  That
would "fix" the problem from this bug, but would that also fix the bug you're
seeing in flexiblas?

I'm not advocating this though.  I was thinking maybe making callers (under an
option?) conservatively assume the callee is a Fortran function and for those C
arguments that could map to a Fortran parameter with a hidden argument, bump
the number of counted args by 1.  For example, a C function with 2 char/char *
args and 6 int args would think there are 8 normal args and 2 hidden args, so
it needs to allocate a param save area.  Is that not feasible?  ...or does that
not even address the issue you're seeing in your bug?

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Bug target/100799] Stackoverflow in optimized code on PPC
  2021-05-27 11:20 [Bug fortran/100799] New: Stackoverflow in optimized code on PPC alexander.grund@tu-dresden.de
                   ` (26 preceding siblings ...)
  2024-02-25  0:39 ` bergner at gcc dot gnu.org
@ 2024-02-26  9:58 ` jakub at gcc dot gnu.org
  2024-02-27  0:45 ` bergner at gcc dot gnu.org
                   ` (5 subsequent siblings)
  33 siblings, 0 replies; 35+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-02-26  9:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #28 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Peter Bergner from comment #27)
> So I looked closer at what the failure mode was in this PR (versus the one
> you're seeing with flexiblas).  As in your case, there is a mismatch in the
> number of parameters the C caller thinks there are (8 args, so no param save
> area needed) versus what the Fortran callee thinks there are (9 params which
> include the one hidden arg, so there is a param save area).  The Fortran
> function doesn't actually access the hidden argument in our test case above,
> in fact the character argument is never used either.  What I see in the rtl
> dumps is that *all* incoming args have a REG_EQUIV generated that points to
> the param save area (this doesn't happen when there are 8 or fewer formal
> params), even for the first 8 args that are passed in registers:

Yes, so it is the backend that told function.cc that there is a parameter save
area and it should be adding REG_EQUIV notes.  So, the idea would be that for
the case we talk about (<= 8 normal arguments, then only unused
DECL_HIDDEN_STRING_LENGTH ones) that the backend would also say that there is
no parameter save area, basically pretend there are <= 8 arguments.

> > Doing the workaround on the caller side is impossible, this is for calls
> > from C/C++ to Fortran code, directly or indirectly called and there is
> > nothing the compiler could use to guess that it actually calls Fortran code
> > with hidden Fortran character arguments.
> As a HUGE hammer, every caller could always allocate a param save area. 
> That would "fix" the problem from this bug, but would that also fix the bug
> you're seeing in flexiblas?

Most likely yes.  Though of course that is way too high a price to pay, even
with some non-default option.  If we can't work around it in the backend just
on the callee side of calls which have the unused hidden string length
arguments, then better no changes on the GCC side.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Bug target/100799] Stackoverflow in optimized code on PPC
  2021-05-27 11:20 [Bug fortran/100799] New: Stackoverflow in optimized code on PPC alexander.grund@tu-dresden.de
                   ` (27 preceding siblings ...)
  2024-02-26  9:58 ` jakub at gcc dot gnu.org
@ 2024-02-27  0:45 ` bergner at gcc dot gnu.org
  2024-02-27  7:26 ` jakub at gcc dot gnu.org
                   ` (4 subsequent siblings)
  33 siblings, 0 replies; 35+ messages in thread
From: bergner at gcc dot gnu.org @ 2024-02-27  0:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #29 from Peter Bergner <bergner at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #28)
> Yes, so it is the backend that told function.cc that there is a parameter
> save area and it should be adding REG_EQUIV notes.  So, the idea would be
> that for the case we talk about (<= 8 normal arguments, then only unused
> DECL_HIDDEN_STRING_LENGTH ones) that the backend would also say that there
> is no parameter save area, basically pretend there are <= 8 arguments.

How can we know there are no uses of the hidden arg(s)?  That backend function
is being called at expand time, so we haven't yet run any RTL dataflow
information to tell us.  Is there some tree attribute for the arg that can tell
us whether it's used or not?  ...or is there some SSA data for that arg that
can show it has no use?  ...and if so, would that still work for -O0 compiles?

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Bug target/100799] Stackoverflow in optimized code on PPC
  2021-05-27 11:20 [Bug fortran/100799] New: Stackoverflow in optimized code on PPC alexander.grund@tu-dresden.de
                   ` (28 preceding siblings ...)
  2024-02-27  0:45 ` bergner at gcc dot gnu.org
@ 2024-02-27  7:26 ` jakub at gcc dot gnu.org
  2024-02-27 15:30 ` bergner at gcc dot gnu.org
                   ` (3 subsequent siblings)
  33 siblings, 0 replies; 35+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-02-27  7:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #30 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Either tree parmdef = ssa_default_def (cfun, parm) is NULL, or has_zero_uses
(parmdef).
Not sure if has_zero_uses will work properly after some bbs are converted from
GIMPLE to RTL, but maybe it will, I think the expansion generally doesn't
gsi_remove statements it expands nor calls update_stmt on them.  One could
always also just compute in generic code at the start of expansion the number
of unused DECL_HIDDEN_STRING_LENGTH PARM_DECLs at the end of the argument list,
save that as a flag in struct function or where and let the backends use it
from there.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Bug target/100799] Stackoverflow in optimized code on PPC
  2021-05-27 11:20 [Bug fortran/100799] New: Stackoverflow in optimized code on PPC alexander.grund@tu-dresden.de
                   ` (29 preceding siblings ...)
  2024-02-27  7:26 ` jakub at gcc dot gnu.org
@ 2024-02-27 15:30 ` bergner at gcc dot gnu.org
  2024-03-01 15:25 ` bergner at gcc dot gnu.org
                   ` (2 subsequent siblings)
  33 siblings, 0 replies; 35+ messages in thread
From: bergner at gcc dot gnu.org @ 2024-02-27 15:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #31 from Peter Bergner <bergner at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #30)
> Either tree parmdef = ssa_default_def (cfun, parm) is NULL, or has_zero_uses
> (parmdef).
> Not sure if has_zero_uses will work properly after some bbs are converted
> from GIMPLE to RTL, but maybe it will, I think the expansion generally
> doesn't gsi_remove statements it expands nor calls update_stmt on them.  One
> could always also just compute in generic code at the start of expansion the
> number of unused DECL_HIDDEN_STRING_LENGTH PARM_DECLs at the end of the
> argument list, save that as a flag in struct function or where and let the
> backends use it from there.

Ok, I think that gives us some idea what needs to be done.  I'll look for
someone in the team to have a look at implementing this workaround.  Thanks.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Bug target/100799] Stackoverflow in optimized code on PPC
  2021-05-27 11:20 [Bug fortran/100799] New: Stackoverflow in optimized code on PPC alexander.grund@tu-dresden.de
                   ` (30 preceding siblings ...)
  2024-02-27 15:30 ` bergner at gcc dot gnu.org
@ 2024-03-01 15:25 ` bergner at gcc dot gnu.org
  2024-03-22  7:44 ` aagarwa at gcc dot gnu.org
  2024-03-22  7:45 ` aagarwa at gcc dot gnu.org
  33 siblings, 0 replies; 35+ messages in thread
From: bergner at gcc dot gnu.org @ 2024-03-01 15:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

Peter Bergner <bergner at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aagarwa at gcc dot gnu.org

--- Comment #32 from Peter Bergner <bergner at gcc dot gnu.org> ---
(In reply to Peter Bergner from comment #31)
> Ok, I think that gives us some idea what needs to be done.  I'll look for
> someone in the team to have a look at implementing this workaround.  Thanks.

Ajit has agreed to try and implement the workaround.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Bug target/100799] Stackoverflow in optimized code on PPC
  2021-05-27 11:20 [Bug fortran/100799] New: Stackoverflow in optimized code on PPC alexander.grund@tu-dresden.de
                   ` (31 preceding siblings ...)
  2024-03-01 15:25 ` bergner at gcc dot gnu.org
@ 2024-03-22  7:44 ` aagarwa at gcc dot gnu.org
  2024-03-22  7:45 ` aagarwa at gcc dot gnu.org
  33 siblings, 0 replies; 35+ messages in thread
From: aagarwa at gcc dot gnu.org @ 2024-03-22  7:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #33 from Ajit Kumar Agarwal <aagarwa at gcc dot gnu.org> ---
Sent the patch for review.

Here is the patch:
[PATCH] rs6000: Stackoverflow in optimized code on PPC (PR100799)

When using FlexiBLAS with OpenBLAS we noticed corruption of
the parameters passed to OpenBLAS functions. FlexiBLAS
basically provides a BLAS interface where each function
is a stub that forwards the arguments to a real BLAS lib,
like OpenBLAS.

Fixes the corruption of the caller's frame by checking whether the
number of arguments is less than or equal to GP_ARG_NUM_REG (8),
excluding unused hidden DECLs.

2024-03-22  Ajit Kumar Agarwal  <aagarwa1@linux.ibm.com>

gcc/ChangeLog:

        PR rtl-optimization/100799
        * config/rs6000/rs6000-calls.cc (rs6000_function_arg): Don't
        generate parameter save area if number of arguments passed
        is less than or equal to GP_ARG_NUM_REG (8) excluding hidden
        parameter.
        * function.cc (assign_parms_initialize_all): Check for hidden
        parameter in fortran code and set the flag hidden_string_length
        and actual parameter passed excluding hidden unused DECLS.
        * function.h: Add new field hidden_string_length and
        actual_parm_length in function structure.
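
For readers following along, the check described above can be modelled
roughly as below. This is only a standalone sketch, not code from the
patch: the type parm_model and the helpers count_actual_parms and
needs_parm_save_area are hypothetical names made up for illustration,
and the real logic lives in GCC's internal data structures. The one
value taken from the description is GP_ARG_NUM_REG (8).

```c
#include <stdbool.h>

/* Number of general-purpose registers used for argument passing,
   as stated in the patch description. */
#define GP_ARG_NUM_REG 8

/* Hypothetical model of one incoming parameter (not a GCC type). */
struct parm_model {
  bool hidden_string_length; /* compiler-added Fortran string length */
  bool used;                 /* referenced in the function body */
};

/* Count parameters, skipping hidden string lengths that are unused. */
static int count_actual_parms(const struct parm_model *parms, int n)
{
  int count = 0;
  for (int i = 0; i < n; i++)
    if (!(parms[i].hidden_string_length && !parms[i].used))
      count++;
  return count;
}

/* In this model, the callee may assume a parameter save area only
   when the counted parameters do not all fit in registers. */
static bool needs_parm_save_area(const struct parm_model *parms, int n)
{
  return count_actual_parms(parms, n) > GP_ARG_NUM_REG;
}
```

Taking dgebal from the original report as the example: it has 8 visible
arguments plus one hidden length for the character argument job. If
that hidden length is unused, count_actual_parms gives 8 and
needs_parm_save_area returns false, so the callee would not touch a
save area that the C stub never allocated.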

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Bug target/100799] Stackoverflow in optimized code on PPC
  2021-05-27 11:20 [Bug fortran/100799] New: Stackoverflow in optimized code on PPC alexander.grund@tu-dresden.de
                   ` (32 preceding siblings ...)
  2024-03-22  7:44 ` aagarwa at gcc dot gnu.org
@ 2024-03-22  7:45 ` aagarwa at gcc dot gnu.org
  33 siblings, 0 replies; 35+ messages in thread
From: aagarwa at gcc dot gnu.org @ 2024-03-22  7:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #34 from Ajit Kumar Agarwal <aagarwa at gcc dot gnu.org> ---
Sent the patch for review.

Here is the patch:
[PATCH] rs6000: Stackoverflow in optimized code on PPC (PR100799)

When using FlexiBLAS with OpenBLAS we noticed corruption of
the parameters passed to OpenBLAS functions. FlexiBLAS
basically provides a BLAS interface where each function
is a stub that forwards the arguments to a real BLAS lib,
like OpenBLAS.

Fix the corruption of the caller's frame by checking whether the
number of arguments is less than or equal to GP_ARG_NUM_REG (8),
excluding hidden unused DECLs.

2024-03-22  Ajit Kumar Agarwal  <aagarwa1@linux.ibm.com>

gcc/ChangeLog:

        PR rtl-optimization/100799
        * config/rs6000/rs6000-call.cc (rs6000_function_arg): Don't
        generate parameter save area if the number of arguments passed
        is less than or equal to GP_ARG_NUM_REG (8), excluding hidden
        parameters.
        * function.cc (assign_parms_initialize_all): Check for hidden
        parameters in Fortran code and set the flag hidden_string_length
        and the number of actual parameters passed, excluding hidden
        unused DECLs.
        * function.h: Add new fields hidden_string_length and
        actual_parm_length to the function structure.

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2024-03-22  7:45 UTC | newest]

Thread overview: 35+ messages
2021-05-27 11:20 [Bug fortran/100799] New: Stackoverflow in optimized code on PPC alexander.grund@tu-dresden.de
2021-05-28 16:42 ` [Bug target/100799] " alexander.grund@tu-dresden.de
2021-06-01 19:08 ` bergner at gcc dot gnu.org
2021-06-01 21:09 ` segher at gcc dot gnu.org
2021-06-02  0:31 ` amodra at gmail dot com
2021-10-05 22:45 ` bergner at gcc dot gnu.org
2022-01-09 11:13 ` kenneth.hoste at ugent dot be
2022-07-08 10:53 ` alexander.grund@tu-dresden.de
2022-07-08 16:38 ` bergner at gcc dot gnu.org
2022-07-14 20:10 ` bergner at gcc dot gnu.org
2022-07-20 11:45 ` alexander.grund@tu-dresden.de
2022-07-20 14:14 ` alexander.grund@tu-dresden.de
2022-07-20 17:42 ` segher at gcc dot gnu.org
2022-07-20 17:59 ` segher at gcc dot gnu.org
2022-09-13 19:29 ` segher at gcc dot gnu.org
2022-09-19  5:46 ` jskumari at gcc dot gnu.org
2022-09-20 22:45 ` segher at gcc dot gnu.org
2022-10-17  8:17 ` jskumari at gcc dot gnu.org
2022-10-17  9:42 ` jskumari at gcc dot gnu.org
2022-10-17 17:10 ` jskumari at gcc dot gnu.org
2022-10-31  3:00 ` linkw at gcc dot gnu.org
2022-11-09 16:43 ` jskumari at gcc dot gnu.org
2023-06-19 20:25 ` bergner at gcc dot gnu.org
2024-02-21  7:38 ` jakub at gcc dot gnu.org
2024-02-22  2:51 ` bergner at gcc dot gnu.org
2024-02-22 14:44 ` bergner at gcc dot gnu.org
2024-02-22 14:59 ` jakub at gcc dot gnu.org
2024-02-25  0:39 ` bergner at gcc dot gnu.org
2024-02-26  9:58 ` jakub at gcc dot gnu.org
2024-02-27  0:45 ` bergner at gcc dot gnu.org
2024-02-27  7:26 ` jakub at gcc dot gnu.org
2024-02-27 15:30 ` bergner at gcc dot gnu.org
2024-03-01 15:25 ` bergner at gcc dot gnu.org
2024-03-22  7:44 ` aagarwa at gcc dot gnu.org
2024-03-22  7:45 ` aagarwa at gcc dot gnu.org
