public inbox for gcc-bugs@sourceware.org
* [Bug target/114350] New: missing support for SVE widening floating point conversion
@ 2024-03-15  7:56 tnfchris at gcc dot gnu.org
  2024-03-15  8:02 ` [Bug target/114350] " pinskia at gcc dot gnu.org
  0 siblings, 1 reply; 2+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2024-03-15  7:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114350

            Bug ID: 114350
           Summary: missing support for SVE widening floating point
                    conversion
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tnfchris at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64*

The following example:

#include <arm_sve.h>
svfloat64_t widening (svint32_t a)
{
    svbool_t pred = svptrue_b32 ();
    svint64_t cvt = svreinterpret_s64_s32 (a);
    svint64_t ext = svextw_s64_x (pred, cvt);
    svfloat64_t res = svcvt_f64_s64_x (pred, ext);
    return res;
}

compiled with -Ofast -march=armv9-a generates:

widening(__SVInt32_t):
        ptrue   p3.b, all
        sxtw    z0.d, p3/m, z0.d
        scvtf   z0.d, p3/m, z0.d
        ret

However, SVE has widening and narrowing floating-point conversions, so this
should instead generate:

widening(__SVInt32_t):
        ptrue   p3.b, all
        scvtf   z0.d, p3/m, z0.s
        ret
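
As a rough illustration of why the two sequences should be interchangeable,
here is a scalar model of a single 64-bit lane. It assumes that the .d <- .s
form of scvtf reads only the low 32 bits of each 64-bit container, which is
what the sxtw + scvtf pair computes as well. The function names are
hypothetical and are not part of GCC or ACLE; this is a sketch of the
semantics, not of the compiler change.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one 64-bit SVE lane. All names here are hypothetical.
   The unsigned-to-signed casts rely on two's-complement wrapping, which
   GCC and Clang provide (implementation-defined in ISO C).  */

/* Current codegen: sxtw z.d, p/m, z.d followed by scvtf z.d, p/m, z.d.  */
static double convert_two_step(uint64_t lane)
{
    int64_t ext = (int64_t)(int32_t)(uint32_t)lane; /* sxtw: sign-extend low 32 bits */
    return (double)ext;                             /* scvtf .d <- .d */
}

/* Requested codegen: scvtf z.d, p/m, z.s (assumed to read the low 32 bits).  */
static double convert_widening(uint64_t lane)
{
    int32_t low = (int32_t)(uint32_t)lane;          /* low half of the container */
    return (double)low;                             /* scvtf .d <- .s */
}

/* Returns 1 when both models produce the same double for this lane.  */
static int lanes_agree(uint64_t lane)
{
    return convert_two_step(lane) == convert_widening(lane);
}

/* A few lanes whose upper 32 bits are garbage, as after svreinterpret.  */
static void check_examples(void)
{
    assert(lanes_agree(0x0000000000000000ULL));
    assert(lanes_agree(0xFFFFFFFF00000001ULL));
    assert(lanes_agree(0x1234567880000000ULL));
    assert(lanes_agree(0x00000000FFFFFFFFULL));
}
```

Every int32 value is exactly representable as a double, so neither path rounds
and the upper half of the lane never affects the result.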

The autovec equivalent is:


void f(int n, double *data) {
    for (int i=0;i<n;i++) {
        data[i] = i;
    }
}

which generates:

.L5:
        mov     z29.d, z31.d
        sxtw    z29.d, p7/m, z29.d
        scvtf   z29.d, p7/m, z29.d
        st1d    z29.d, p6, [x1, x2, lsl 3]
        add     z31.s, z31.s, z30.s
        incd    x2
        whilelo p6.d, w2, w0
        b.any   .L5

Note that the scalar code has a widening variant as well (which we do use), and
we account for it in vectorizer costing:

(double) i_14 1 times scalar_stmt costs 1 in epilogue
_4 1 times scalar_store costs 1 in epilogue

Adv. SIMD costing is right:

(double) i_14 2 times vec_promote_demote costs 4 in body

but SVE costing is wrong:

(double) i_14 2 times vector_stmt costs 2 in body

which makes SVE seem as expensive as Adv. SIMD.
Note that we can also use the widening instruction for Adv. SIMD on an
SVE-capable system.


* [Bug target/114350] missing support for SVE widening floating point conversion
  2024-03-15  7:56 [Bug target/114350] New: missing support for SVE widening floating point conversion tnfchris at gcc dot gnu.org
@ 2024-03-15  8:02 ` pinskia at gcc dot gnu.org
  0 siblings, 0 replies; 2+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-03-15  8:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114350

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
