[Bug target/109632] New: Inefficient codegen when complex numbers are emulated with structs

public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed

* [Bug target/109632] New: Inefficient codegen when complex numbers are emulated with structs
@ 2023-04-26 13:03 tnfchris at gcc dot gnu.org
  2023-04-26 14:14 ` [Bug target/109632] " rguenth at gcc dot gnu.org
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2023-04-26 13:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109632

            Bug ID: 109632
           Summary: Inefficient codegen when complex numbers are emulated
                    with structs
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tnfchris at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64*

The following two cases are the same

struct complx_t {
    float re;
    float im;
};

complx_t
add(const complx_t &a, const complx_t &b) {
  return {a.re + b.re, a.im + b.im};
}

_Complex float
add(const _Complex float *a, const _Complex float *b) {
  return {__real__ *a + __real__ *b, __imag__ *a + __imag__ *b};
}

But we generate much different code (looking at -O2),  For the first one we do:

        ldr     d1, [x1]
        ldr     d0, [x0]
        fadd    v0.2s, v0.2s, v1.2s
        fmov    x0, d0
        lsr     x1, x0, 32
        lsr     w0, w0, 0
        fmov    s1, w1
        fmov    s0, w0
        ret

which is bad for obvious reasons, but also also never needed to go through the
genreg for such a reversal. we could have used many other NEON instructions.

For the second one we generate the good instructions:

add(float _Complex const*, float _Complex const*):
        ldp     s3, s2, [x0]
        ldp     s0, s1, [x1]
        fadd    s1, s2, s1
        fadd    s0, s3, s0
        ret

The difference being that in the second one we have decomposed the initial
structure by loading the elements:

  <bb 2> [local count: 1073741824]:
  _1 = REALPART_EXPR <*a_8(D)>;
  _2 = REALPART_EXPR <*b_9(D)>;
  _3 = _1 + _2;
  _4 = IMAGPART_EXPR <*a_8(D)>;
  _5 = IMAGPART_EXPR <*b_9(D)>;
  _6 = _4 + _5;
  _10 = COMPLEX_EXPR <_3, _6>;
  return _10;

In the first one we've kept them as vectors:

  <bb 2> [local count: 1073741824]:
  vect__1.6_13 = MEM <const vector(2) float> [(float *)a_8(D)];
  vect__2.9_15 = MEM <const vector(2) float> [(float *)b_9(D)];
  vect__3.10_16 = vect__1.6_13 + vect__2.9_15;
  MEM <vector(2) float> [(float *)&D.4435] = vect__3.10_16;
  return D.4435;

This part is probably a costing issue, we SLP them even though it's not
profitable because for the APCS we have to return them in separate registers.

Using -fno-tree-vectorize gets the gimple code right:

  <bb 2> [local count: 1073741824]:
  _1 = a_8(D)->re;
  _2 = b_9(D)->re;
  _3 = _1 + _2;
  D.4435.re = _3;
  _4 = a_8(D)->im;
  _5 = b_9(D)->im;
  _6 = _4 + _5;
  D.4435.im = _6;
  return D.4435;

But we generate worse code:

        ldp     s1, s0, [x0]
        mov     x2, 0
        ldp     s3, s2, [x1]
        fadd    s1, s1, s3
        fadd    s0, s0, s2
        fmov    w1, s1
        fmov    w0, s0
        bfi     x2, x1, 0, 32
        bfi     x2, x0, 32, 32
        lsr     x0, x2, 32
        lsr     w2, w2, 0
        fmov    s1, w0
        fmov    s0, w2

where we again use genreg as a very complicated way to do a no-op.

So there are two bugs here:

1. a costing, we shouldn't SLP
2. an expansion, the code out of expand is bad to begin with.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/109632] Inefficient codegen when complex numbers are emulated with structs
  2023-04-26 13:03 [Bug target/109632] New: Inefficient codegen when complex numbers are emulated with structs tnfchris at gcc dot gnu.org
@ 2023-04-26 14:14 ` rguenth at gcc dot gnu.org
  2023-04-26 14:40 ` tnfchris at gcc dot gnu.org
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-04-26 14:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109632

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Well, the usual unknown ABI boundary at function entry/exit.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/109632] Inefficient codegen when complex numbers are emulated with structs
  2023-04-26 13:03 [Bug target/109632] New: Inefficient codegen when complex numbers are emulated with structs tnfchris at gcc dot gnu.org
  2023-04-26 14:14 ` [Bug target/109632] " rguenth at gcc dot gnu.org
@ 2023-04-26 14:40 ` tnfchris at gcc dot gnu.org
  2023-04-26 15:23 ` tnfchris at gcc dot gnu.org
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2023-04-26 14:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109632

--- Comment #2 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #1)
> Well, the usual unknown ABI boundary at function entry/exit.

Yes but LLVM gets it right, so should be a solve able computer science problem.
:)

Note that this was reduced from a bigger routine but end result the same, the
thing shouldn't have been vectorized.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/109632] Inefficient codegen when complex numbers are emulated with structs
  2023-04-26 13:03 [Bug target/109632] New: Inefficient codegen when complex numbers are emulated with structs tnfchris at gcc dot gnu.org
  2023-04-26 14:14 ` [Bug target/109632] " rguenth at gcc dot gnu.org
  2023-04-26 14:40 ` tnfchris at gcc dot gnu.org
@ 2023-04-26 15:23 ` tnfchris at gcc dot gnu.org
  2023-04-27  7:52 ` rsandifo at gcc dot gnu.org
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2023-04-26 15:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109632

--- Comment #3 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
note that even if we can't stop SLP, we should be able to generate as efficient
code by being creative about the instruction selection, that's why I marked it
as a target bug :)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/109632] Inefficient codegen when complex numbers are emulated with structs
  2023-04-26 13:03 [Bug target/109632] New: Inefficient codegen when complex numbers are emulated with structs tnfchris at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2023-04-26 15:23 ` tnfchris at gcc dot gnu.org
@ 2023-04-27  7:52 ` rsandifo at gcc dot gnu.org
  2023-04-27 11:17 ` rsandifo at gcc dot gnu.org
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2023-04-27  7:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109632

rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsandifo at gcc dot gnu.org

--- Comment #4 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
Maybe worth noting that if the complex arguments are passed
by value, to give:

struct complx_t {
    float re;
    float im;
};

complx_t
add(const complx_t a, const complx_t b) {
  return {a.re + b.re, a.im + b.im};
}

and SLP is disabled, we get:

        fmov    w4, s1
        fmov    w3, s3
        fmov    x0, d0
        fmov    x1, d2
        mov     x2, 0
        bfi     x0, x4, 32, 32
        bfi     x1, x3, 32, 32
        fmov    d0, x0
        fmov    d1, x1
        sbfx    x3, x0, 0, 32
        sbfx    x0, x1, 0, 32
        ushr    d1, d1, 32
        fmov    d3, x0
        fmov    d2, x3
        ushr    d0, d0, 32
        fadd    s2, s2, s3
        fadd    s0, s0, s1
        fmov    w1, s2
        fmov    w0, s0
        bfi     x2, x1, 0, 32
        bfi     x2, x0, 32, 32
        lsr     x0, x2, 32
        lsr     w2, w2, 0
        fmov    s1, w0
        fmov    s0, w2
        ret

which is almost impressive, in its way.

I think we need a way in gimple of “SRA-ing” the arguments
and return value, in cases where that's forced by the ABI.
I.e. provide separate incoming values of a.re and a.im,
and store them to “a” on entry.  Then similarly make the
return stmt return RETURN_DECL.re and RETURN_DECL.im
separately.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/109632] Inefficient codegen when complex numbers are emulated with structs
  2023-04-26 13:03 [Bug target/109632] New: Inefficient codegen when complex numbers are emulated with structs tnfchris at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2023-04-27  7:52 ` rsandifo at gcc dot gnu.org
@ 2023-04-27 11:17 ` rsandifo at gcc dot gnu.org
  2023-04-27 11:24 ` tnfchris at gcc dot gnu.org
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2023-04-27 11:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109632

--- Comment #5 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
Created attachment 54941
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54941&action=edit
hacky proof-of-concept patch

This is a very hacky proof of concept patch.  Don't try it on
anything serious, and certainly don't try to bootstrap with it --
it'll fall over in the slightest breeze.

But it does produce:

        ldp     s3, s2, [x0]
        ldp     s0, s1, [x1]
        fadd    s1, s2, s1
        fadd    s0, s3, s0
        ret

for the original testcase.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/109632] Inefficient codegen when complex numbers are emulated with structs
  2023-04-26 13:03 [Bug target/109632] New: Inefficient codegen when complex numbers are emulated with structs tnfchris at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2023-04-27 11:17 ` rsandifo at gcc dot gnu.org
@ 2023-04-27 11:24 ` tnfchris at gcc dot gnu.org
  2023-04-27 13:33 ` rsandifo at gcc dot gnu.org
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2023-04-27 11:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109632

--- Comment #6 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
That's an interesting approach, I think it would also fix
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109391 would it not? Since the
int16x8x3_t return would be "scalarized" avoiding the bad expansion?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/109632] Inefficient codegen when complex numbers are emulated with structs
  2023-04-26 13:03 [Bug target/109632] New: Inefficient codegen when complex numbers are emulated with structs tnfchris at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2023-04-27 11:24 ` tnfchris at gcc dot gnu.org
@ 2023-04-27 13:33 ` rsandifo at gcc dot gnu.org
  2023-04-27 17:50 ` rsandifo at gcc dot gnu.org
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2023-04-27 13:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109632

--- Comment #7 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
Thinking more about it, it would probably be better to defer the
split until around lower_complex time, after IPA (especially inlining),
NRV and tail-recursion.  Doing it there should also make it easier
to split arguments.

(In reply to Tamar Christina from comment #6)
> That's an interesting approach, I think it would also fix
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109391 would it not? Since the
> int16x8x3_t return would be "scalarized" avoiding the bad expansion?
I don't think it will help with that, since the returned value
there is a natural V3x8HI (rather than something that the ABI splits
apart).  Splitting in that case might pessimise cases where the
return value is loaded as a whole, rather than assigned to
individually.

But it might be worth giving SRA the option of splitting even
in that case, as a follow-on optimisation, if it fits naturally
with the definitions.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/109632] Inefficient codegen when complex numbers are emulated with structs
  2023-04-26 13:03 [Bug target/109632] New: Inefficient codegen when complex numbers are emulated with structs tnfchris at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2023-04-27 13:33 ` rsandifo at gcc dot gnu.org
@ 2023-04-27 17:50 ` rsandifo at gcc dot gnu.org
  2023-04-27 19:11 ` tnfchris at gcc dot gnu.org
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2023-04-27 17:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109632

rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2023-04-27
     Ever confirmed|0                           |1
           Assignee|unassigned at gcc dot gnu.org      |rsandifo at gcc dot gnu.org

--- Comment #8 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
Have a (still hacky) patch that also fixes the example in
comment 4, giving:

        fadd    s1, s1, s3
        fadd    s0, s0, s2
        ret

Will work on it a bit more before sending an RFC.  Can imagine
the approach will be somewhat controversial!

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/109632] Inefficient codegen when complex numbers are emulated with structs
  2023-04-26 13:03 [Bug target/109632] New: Inefficient codegen when complex numbers are emulated with structs tnfchris at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2023-04-27 17:50 ` rsandifo at gcc dot gnu.org
@ 2023-04-27 19:11 ` tnfchris at gcc dot gnu.org
  2023-05-02 15:00 ` rsandifo at gcc dot gnu.org
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2023-04-27 19:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109632

--- Comment #9 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
Thank you!

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/109632] Inefficient codegen when complex numbers are emulated with structs
  2023-04-26 13:03 [Bug target/109632] New: Inefficient codegen when complex numbers are emulated with structs tnfchris at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2023-04-27 19:11 ` tnfchris at gcc dot gnu.org
@ 2023-05-02 15:00 ` rsandifo at gcc dot gnu.org
  2023-05-23 10:34 ` cvs-commit at gcc dot gnu.org
  2023-05-23 19:16 ` rsandifo at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2023-05-02 15:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109632

--- Comment #10 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
After prototyping this further, I no longer think that lowering
at the gimple level is the best answer.  (I should have listened
to Richi.)  Although it works, its major drawback is that
it's one-sided: it allows the current function's PARM_DECLs
and returns to be lowered to individual scalars, but it does
nothing for calls to other functions.  Being one-sided means
(a) that lowering only solves half the problem and (b) that tail
calls cannot be handled easily after lowering.

One thing that does seem to work is to force the structure to have
V2SF (and fix the inevitable ABI fallout).  That could only be done
conditionally, based on a target hook.  But it seems to fix both
test cases: the pass-by-reference one and the pass-by-value one.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/109632] Inefficient codegen when complex numbers are emulated with structs
  2023-04-26 13:03 [Bug target/109632] New: Inefficient codegen when complex numbers are emulated with structs tnfchris at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2023-05-02 15:00 ` rsandifo at gcc dot gnu.org
@ 2023-05-23 10:34 ` cvs-commit at gcc dot gnu.org
  2023-05-23 19:16 ` rsandifo at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-05-23 10:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109632

--- Comment #11 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The trunk branch has been updated by Richard Sandiford <rsandifo@gcc.gnu.org>:

https://gcc.gnu.org/g:b096a6ebe9d9f9fed4c105f6555f724eb32af95c

commit r14-1131-gb096a6ebe9d9f9fed4c105f6555f724eb32af95c
Author: Richard Sandiford <richard.sandiford@arm.com>
Date:   Tue May 23 11:34:42 2023 +0100

    aarch64: Provide FPR alternatives for some bit insertions [PR109632]

    At -O2, and so with SLP vectorisation enabled:

        struct complx_t { float re, im; };
        complx_t add(complx_t a, complx_t b) {
          return {a.re + b.re, a.im + b.im};
        }

    generates:

            fmov    w3, s1
            fmov    x0, d0
            fmov    x1, d2
            fmov    w2, s3
            bfi     x0, x3, 32, 32
            fmov    d31, x0
            bfi     x1, x2, 32, 32
            fmov    d30, x1
            fadd    v31.2s, v31.2s, v30.2s
            fmov    x1, d31
            lsr     x0, x1, 32
            fmov    s1, w0
            lsr     w0, w1, 0
            fmov    s0, w0
            ret

    This is because complx_t is passed and returned in FPRs, but GCC gives
    it DImode.  We therefore âneedâ to assemble a DImode pseudo from the
    two individual floats, bitcast it to a vector, do the arithmetic,
    bitcast it back to a DImode pseudo, then extract the individual floats.

    There are many problems here.  The most basic is that we shouldn't
    use SLP for such a trivial example.  But SLP should in principle be
    beneficial for more complicated examples, so preventing SLP for the
    example above just changes the reproducer needed.  A more fundamental
    problem is that it doesn't make sense to use single DImode pseudos in a
    testcase like this.  I have a WIP patch to allow re and im to be stored
    in individual SFmode pseudos instead, but it's quite an invasive change
    and might end up going nowhere.

    A simpler problem to tackle is that we allow DImode pseudos to be stored
    in FPRs, but we don't provide any patterns for inserting values into
    them, even though INS makes that easy for element-like insertions.
    This patch adds some patterns for that.

    Doing that showed that aarch64_modes_tieable_p was too strict:
    it didn't allow SFmode and DImode values to be tied, even though
    both of them occupy a single GPR and FPR, and even though we allow
    both classes to change between the modes.

    The *aarch64_bfidi<ALLX:mode>_subreg_<SUBDI_BITS> pattern is
    especially ugly, but it's not clear what target-independent
    code ought to simplify it to, if it was going to simplify it.

    We should probably do the same thing for extractions, but that's left
    as future work.

    After the patch we generate:

            ins     v0.s[1], v1.s[0]
            ins     v2.s[1], v3.s[0]
            fadd    v0.2s, v0.2s, v2.2s
            fmov    x0, d0
            ushr    d1, d0, 32
            lsr     w0, w0, 0
            fmov    s0, w0
            ret

    which seems like a step in the right direction.

    All in all, there's nothing elegant about this patchh.  It just
    seems like the least worst option.

    gcc/
            PR target/109632
            * config/aarch64/aarch64.cc (aarch64_modes_tieable_p): Allow
            subregs between any scalars that are 64 bits or smaller.
            * config/aarch64/iterators.md (SUBDI_BITS): New int iterator.
            (bits_etype): New int attribute.
            * config/aarch64/aarch64.md (*insv_reg<mode>_<SUBDI_BITS>)
            (*aarch64_bfi<GPI:mode><ALLX:mode>_<SUBDI_BITS>): New patterns.
            (*aarch64_bfidi<ALLX:mode>_subreg_<SUBDI_BITS>): Likewise.

    gcc/testsuite/
            * gcc.target/aarch64/ins_bitfield_1.c: New test.
            * gcc.target/aarch64/ins_bitfield_2.c: Likewise.
            * gcc.target/aarch64/ins_bitfield_3.c: Likewise.
            * gcc.target/aarch64/ins_bitfield_4.c: Likewise.
            * gcc.target/aarch64/ins_bitfield_5.c: Likewise.
            * gcc.target/aarch64/ins_bitfield_6.c: Likewise.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/109632] Inefficient codegen when complex numbers are emulated with structs
  2023-04-26 13:03 [Bug target/109632] New: Inefficient codegen when complex numbers are emulated with structs tnfchris at gcc dot gnu.org
                   ` (10 preceding siblings ...)
  2023-05-23 10:34 ` cvs-commit at gcc dot gnu.org
@ 2023-05-23 19:16 ` rsandifo at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2023-05-23 19:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109632

--- Comment #12 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
The patch in comment 11 is just a related spot improvement.
The PR itself is still unfixed.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2023-05-23 19:16 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-26 13:03 [Bug target/109632] New: Inefficient codegen when complex numbers are emulated with structs tnfchris at gcc dot gnu.org
2023-04-26 14:14 ` [Bug target/109632] " rguenth at gcc dot gnu.org
2023-04-26 14:40 ` tnfchris at gcc dot gnu.org
2023-04-26 15:23 ` tnfchris at gcc dot gnu.org
2023-04-27  7:52 ` rsandifo at gcc dot gnu.org
2023-04-27 11:17 ` rsandifo at gcc dot gnu.org
2023-04-27 11:24 ` tnfchris at gcc dot gnu.org
2023-04-27 13:33 ` rsandifo at gcc dot gnu.org
2023-04-27 17:50 ` rsandifo at gcc dot gnu.org
2023-04-27 19:11 ` tnfchris at gcc dot gnu.org
2023-05-02 15:00 ` rsandifo at gcc dot gnu.org
2023-05-23 10:34 ` cvs-commit at gcc dot gnu.org
2023-05-23 19:16 ` rsandifo at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).