public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Thomas Schwinge <tschwinge@baylibre.com>
To: Andrew Stubbs <ams@baylibre.com>
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [committed] amdgcn: Prefer V32 on RDNA devices
Date: Thu, 28 Mar 2024 08:00:50 +0100	[thread overview]
Message-ID: <87bk6yali5.fsf@dem-tschwing-1.schwinge.ddns.net> (raw)
In-Reply-To: <20240322155449.747518-1-ams@baylibre.com>

Hi Andrew!

On 2024-03-22T15:54:48+0000, Andrew Stubbs <ams@baylibre.com> wrote:
> This patch alters the default (preferred) vector size to 32 on RDNA devices to
> better match the actual hardware.  64-lane vectors will continue to be
> used where they are hard-coded (such as function prologues).
>
> We run these devices in wavefrontsize64 for compatibility, but they actually
> only have 32-lane vectors, natively.  If the upper part of a V64 is masked
> off (as it is in V32) then RDNA devices will skip execution of the upper part
> for most operations, so this adjustment shouldn't leave too much performance on
> the table.  One exception is memory instructions, so full wavefrontsize32
> support would be better.
>
> The advantage is that we avoid the missing V64 operations (such as permute and
> vec_extract).
>
> Committed to mainline.

In my GCN target '-march=gfx1100' testing, this commit
"amdgcn: Prefer V32 on RDNA devices" does resolve (or, make latent?) a
number of execution test FAILs (that is, regressions compared to earlier
'-march=gfx90a' etc. testing).

This commit also resolves (for my '-march=gfx1100' testing) one
pre-existing FAIL (that is, already seen in '-march=gfx90a' earlier
etc. testing):

    PASS: gcc.dg/tree-ssa/scev-14.c (test for excess errors)
    [-FAIL:-]{+PASS:+} gcc.dg/tree-ssa/scev-14.c scan-tree-dump ivopts "Overflowness wrto loop niter:\tNo-overflow"

That means, this test case specifically (or, just its 'scan-tree-dump'?)
needs to be adjusted for GCN V64 testing?

This commit, as you'd also mentioned elsewhere, however also causes a
number of regressions in 'gcc.target/gcn/gcn.exp', see list below.

Those can be "fixed" with 'dg-additional-options -march=gfx90a' (or
similar) in the affected test cases (let me know if you'd like me to
'git push' that), but I suppose something more elaborate may be in order?
(Conditionalize those on 'target { ! gcn_rdna }', and add respective
scanning for 'target gcn_rdna'?  I can help with effective-target
'gcn_rdna' (or similar), if you'd like me to.)

And/or, have a '-mpreferred-simd-mode=v64' (or similar) to be used for
such test cases, to override 'if (TARGET_RDNA2_PLUS)' etc. in
'gcn_vectorize_preferred_simd_mode'?

Best, probably, both these things, to properly test both V32 and V64?

    PASS: gcc.target/gcn/cond_fmaxnm_1.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_1.c scan-assembler-times smaxv64df3_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_1.c scan-assembler-times smaxv64sf3_exec 3
    PASS: gcc.target/gcn/cond_fmaxnm_1_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fmaxnm_1_run.c execution test

    PASS: gcc.target/gcn/cond_fmaxnm_2.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_2.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_2.c scan-assembler-times smaxv64df3_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_2.c scan-assembler-times smaxv64sf3_exec 3
    PASS: gcc.target/gcn/cond_fmaxnm_2_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fmaxnm_2_run.c execution test

    PASS: gcc.target/gcn/cond_fmaxnm_3.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times movv64df_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times movv64sf_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times smaxv64sf3 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times smaxv64sf3 3
    PASS: gcc.target/gcn/cond_fmaxnm_3_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fmaxnm_3_run.c execution test

    PASS: gcc.target/gcn/cond_fmaxnm_4.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-times movv64df_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-times movv64sf_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-times smaxv64sf3 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-times smaxv64sf3 3
    PASS: gcc.target/gcn/cond_fmaxnm_4_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fmaxnm_4_run.c execution test

    PASS: gcc.target/gcn/cond_fmaxnm_5.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_5.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_5.c scan-assembler-times smaxv64df3_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_5.c scan-assembler-times smaxv64sf3_exec 3
    PASS: gcc.target/gcn/cond_fmaxnm_5_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fmaxnm_5_run.c execution test

    PASS: gcc.target/gcn/cond_fmaxnm_6.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_6.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_6.c scan-assembler-times smaxv64df3_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_6.c scan-assembler-times smaxv64sf3_exec 3
    PASS: gcc.target/gcn/cond_fmaxnm_6_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fmaxnm_6_run.c execution test

    PASS: gcc.target/gcn/cond_fmaxnm_7.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_7.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_7.c scan-assembler-times smaxv64df3_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_7.c scan-assembler-times smaxv64sf3_exec 3
    PASS: gcc.target/gcn/cond_fmaxnm_7_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fmaxnm_7_run.c execution test

    PASS: gcc.target/gcn/cond_fmaxnm_8.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_8.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_8.c scan-assembler-times smaxv64df3_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_8.c scan-assembler-times smaxv64sf3_exec 3
    PASS: gcc.target/gcn/cond_fmaxnm_8_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fmaxnm_8_run.c execution test

    PASS: gcc.target/gcn/cond_fminnm_1.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_1.c scan-assembler-times sminv64df3_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_1.c scan-assembler-times sminv64sf3_exec 3
    PASS: gcc.target/gcn/cond_fminnm_1_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fminnm_1_run.c execution test

    PASS: gcc.target/gcn/cond_fminnm_2.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_2.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_2.c scan-assembler-times sminv64df3_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_2.c scan-assembler-times sminv64sf3_exec 3
    PASS: gcc.target/gcn/cond_fminnm_2_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fminnm_2_run.c execution test

    PASS: gcc.target/gcn/cond_fminnm_3.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_3.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_3.c scan-assembler-times movv64df_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_3.c scan-assembler-times movv64sf_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_3.c scan-assembler-times sminv64sf3 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_3.c scan-assembler-times sminv64sf3 3
    PASS: gcc.target/gcn/cond_fminnm_3_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fminnm_3_run.c execution test

    PASS: gcc.target/gcn/cond_fminnm_4.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_4.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_4.c scan-assembler-times movv64df_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_4.c scan-assembler-times movv64sf_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_4.c scan-assembler-times sminv64sf3 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_4.c scan-assembler-times sminv64sf3 3
    PASS: gcc.target/gcn/cond_fminnm_4_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fminnm_4_run.c execution test

    PASS: gcc.target/gcn/cond_fminnm_5.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_5.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_5.c scan-assembler-times sminv64df3_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_5.c scan-assembler-times sminv64sf3_exec 3
    PASS: gcc.target/gcn/cond_fminnm_5_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fminnm_5_run.c execution test

    PASS: gcc.target/gcn/cond_fminnm_6.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_6.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_6.c scan-assembler-times sminv64df3_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_6.c scan-assembler-times sminv64sf3_exec 3
    PASS: gcc.target/gcn/cond_fminnm_6_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fminnm_6_run.c execution test

    PASS: gcc.target/gcn/cond_fminnm_7.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_7.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_7.c scan-assembler-times sminv64df3_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_7.c scan-assembler-times sminv64sf3_exec 3
    PASS: gcc.target/gcn/cond_fminnm_7_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fminnm_7_run.c execution test

    PASS: gcc.target/gcn/cond_fminnm_8.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_8.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_8.c scan-assembler-times sminv64df3_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_8.c scan-assembler-times sminv64sf3_exec 3
    PASS: gcc.target/gcn/cond_fminnm_8_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fminnm_8_run.c execution test

    @@ -124634,12 +124634,12 @@ PASS: gcc.target/gcn/cond_shift_3.c scan-assembler-not movv64di_exec/2
    PASS: gcc.target/gcn/cond_shift_3.c scan-assembler-not v_cndmask_b32
    PASS: gcc.target/gcn/cond_shift_3.c scan-assembler-times \\tv_ashrrev_i32\\tv[0-9]+, 3, v[0-9]+ 1
    PASS: gcc.target/gcn/cond_shift_3.c scan-assembler-times \\tv_lshlrev_b32\\tv[0-9]+, 3, v[0-9]+ 10
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times vashlv64di3_exec 2
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times vashlv64si3_exec 18
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times vashrv64di3_exec 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times vashrv64si3_exec 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times vlshrv64di3_exec 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times vlshrv64si3_exec 1
    PASS: gcc.target/gcn/cond_shift_3_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_shift_3_run.c execution test

    PASS: gcc.target/gcn/cond_shift_4.c (test for excess errors)
    @@ -124647,77 +124647,77 @@ PASS: gcc.target/gcn/cond_shift_4.c scan-assembler-not movv64di_exec/2
    PASS: gcc.target/gcn/cond_shift_4.c scan-assembler-not v_cndmask_b32
    PASS: gcc.target/gcn/cond_shift_4.c scan-assembler-times \\tv_ashrrev_i32\\tv[0-9]+, 3, v[0-9]+ 1
    PASS: gcc.target/gcn/cond_shift_4.c scan-assembler-times \\tv_lshlrev_b32\\tv[0-9]+, 3, v[0-9]+ 10
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times vashlv64di3_exec 2
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times vashlv64si3_exec 18
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times vashrv64di3_exec 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times vashrv64si3_exec 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times vlshrv64di3_exec 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times vlshrv64si3_exec 1
    PASS: gcc.target/gcn/cond_shift_4_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_shift_4_run.c execution test

    PASS: gcc.target/gcn/cond_shift_8.c (test for excess errors)
    PASS: gcc.target/gcn/cond_shift_8.c scan-assembler-not movv64di_exec/0
    PASS: gcc.target/gcn/cond_shift_8.c scan-assembler-not movv64si_exec/0
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times vashlv64di3_exec 2
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times vashlv64si3_exec 18
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times vashrv64di3_exec 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times vashrv64si3_exec 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times vlshrv64di3_exec 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times vlshrv64si3_exec 1
    PASS: gcc.target/gcn/cond_shift_8_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_shift_8_run.c execution test

    PASS: gcc.target/gcn/cond_shift_9.c (test for excess errors)
    PASS: gcc.target/gcn/cond_shift_9.c scan-assembler-not movv64di_exec/1
    PASS: gcc.target/gcn/cond_shift_9.c scan-assembler-not movv64si_exec/2
    PASS: gcc.target/gcn/cond_shift_9.c scan-assembler-not v_cndmask_b32
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times vashlv64di3_exec 2
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times vashlv64si3_exec 18
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times vashrv64di3_exec 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times vashrv64si3_exec 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times vlshrv64di3_exec 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times vlshrv64si3_exec 1
    PASS: gcc.target/gcn/cond_shift_9_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_shift_9_run.c execution test

    PASS: gcc.target/gcn/cond_smax_1.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smax_1.c scan-assembler-not \\ts_cmpk_lg_u32\\tvcc_lo, 0
    PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-not \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+
    PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_??, 0
    PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-not smaxv64si3/0
    PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
    PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-times \\tv_cmp_gt_i64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10
    PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-times \\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], -1 10
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smax_1.c scan-assembler-times smaxv64si3_exec 30
    PASS: gcc.target/gcn/cond_smax_1_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_smax_1_run.c execution test

    PASS: gcc.target/gcn/cond_smin_1.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smin_1.c scan-assembler-not \\ts_cmpk_lg_u32\\tvcc_lo, 0
    PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-not \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+
    PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_??, 0
    PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-not sminv64si3/0
    PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
    PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-times \\tv_cmp_lt_i64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10
    PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-times \\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], -1 10
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smin_1.c scan-assembler-times sminv64si3_exec 30
    PASS: gcc.target/gcn/cond_smin_1_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_smin_1_run.c execution test

    PASS: gcc.target/gcn/cond_umax_1.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umax_1.c scan-assembler-not \\ts_cmpk_lg_u32\\tvcc_lo, 0
    PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_??, 0
    PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-not umaxv64si3/0
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umax_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
    PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-times \\tv_cmp_gt_u64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 8
    PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-times \\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], 1 8
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umax_1.c scan-assembler-times umaxv64si3_exec 20
    PASS: gcc.target/gcn/cond_umax_1_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_umax_1_run.c execution test

    PASS: gcc.target/gcn/cond_umin_1.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umin_1.c scan-assembler-not \\ts_cmpk_lg_u32\\tvcc_lo, 0
    PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_??, 0
    PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-not uminv64si3/0
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umin_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
    PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-times \\tv_cmp_lt_u64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 8
    PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-times \\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], 1 8
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umin_1.c scan-assembler-times uminv64si3_exec 20
    PASS: gcc.target/gcn/cond_umin_1_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_umin_1_run.c execution test

    PASS: gcc.target/gcn/simd-math-1.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_acos"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_acosh"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_asin"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_asinh"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_atan"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_atan2"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_atanh"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_copysign"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_cos"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_cosh"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_erf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_exp"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_exp2"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_fmod"
    XFAIL: gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_gamma"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_hypot"
    XFAIL: gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_lgamma"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_log"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_log10"
    XFAIL: gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_log2"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_pow"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_remainder"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_rint"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_scalb"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_significand"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_sin"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_sinh"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_sqrt"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_tan"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_tanh"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_tgamma"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_acosf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_acoshf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_asinf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_asinhf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_atan2f"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_atanf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_atanhf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_copysignf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_cosf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_coshf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_erff"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_exp2f"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_expf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_fmodf"
    XFAIL: gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_gammaf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_hypotf"
    XFAIL: gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_lgammaf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_log10f"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_log2f"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_logf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_powf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_remainderf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_rintf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_scalbf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_significandf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_sinf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_sinhf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_sqrtf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_tanf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_tanhf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_tgammaf"

    @@ -125130,7 +125130,7 @@ PASS: gcc.target/gcn/simd-math-5-char-run.c (test for excess errors)
    PASS: gcc.target/gcn/simd-math-5-char-run.c execution test
    PASS: gcc.target/gcn/simd-math-5-char.c (test for excess errors)
    XFAIL: gcc.target/gcn/simd-math-5-char.c scan-assembler-times __divmodv64si4@rel32@lo 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5-char.c scan-assembler-times __divv64hi3@rel32@lo 1
    PASS: gcc.target/gcn/simd-math-5-char.c scan-assembler-times __divv64qi3@rel32@lo 0
    FAIL: gcc.target/gcn/simd-math-5-char.c scan-assembler-times __modv64qi3@rel32@lo 1
    PASS: gcc.target/gcn/simd-math-5-char.c scan-assembler-times __udivv64qi3@rel32@lo 0

    @@ -125171,8 +125171,8 @@ PASS: gcc.target/gcn/simd-math-5-long-run.c (test for excess errors)
    PASS: gcc.target/gcn/simd-math-5-long-run.c execution test
    PASS: gcc.target/gcn/simd-math-5-long.c (test for excess errors)
    XFAIL: gcc.target/gcn/simd-math-5-long.c scan-assembler-times __divmodv64di4@rel32@lo 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5-long.c scan-assembler-times __divv64di3@rel32@lo 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5-long.c scan-assembler-times __modv64di3@rel32@lo 1
    PASS: gcc.target/gcn/simd-math-5-long.c scan-assembler-times __udivv64di3@rel32@lo 0
    PASS: gcc.target/gcn/simd-math-5-long.c scan-assembler-times __umodv64di3@rel32@lo 0

    PASS: gcc.target/gcn/simd-math-5-short.c (test for excess errors)
    XFAIL: gcc.target/gcn/simd-math-5-short.c scan-assembler-times __divmodv64si4@rel32@lo 1
    PASS: gcc.target/gcn/simd-math-5-short.c scan-assembler-times __divv64hi3@rel32@lo 0
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5-short.c scan-assembler-times __divv64si3@rel32@lo 1
    FAIL: gcc.target/gcn/simd-math-5-short.c scan-assembler-times __modv64hi3@rel32@lo 1
    PASS: gcc.target/gcn/simd-math-5-short.c scan-assembler-times __udivv64hi3@rel32@lo 0
    PASS: gcc.target/gcn/simd-math-5-short.c scan-assembler-times __umodv64hi3@rel32@lo 0

    PASS: gcc.target/gcn/simd-math-5.c (test for excess errors)
    XFAIL: gcc.target/gcn/simd-math-5.c scan-assembler-times __divmodv64si4@rel32@lo 1
    PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times __divsi3@rel32@lo 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5.c scan-assembler-times __divv64si3@rel32@lo 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5.c scan-assembler-times __modv64si3@rel32@lo 1
    PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times __udivmodv64si4@rel32@lo 0
    PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times __udivsi3@rel32@lo 0
    PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times __udivv64si3@rel32@lo 0
    @@ -125242,13 +125242,13 @@ PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times __umodv64si3@rel32@lo 0

    PASS: gcc.target/gcn/smax_1.c (test for excess errors)
    PASS: gcc.target/gcn/smax_1.c scan-assembler-times \\tv_cmp_gt_i64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10
    FAIL: gcc.target/gcn/smax_1.c scan-assembler-times \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
    [-PASS:-]{+FAIL:+} gcc.target/gcn/smax_1.c scan-assembler-times vec_cmpv64didi 10
    PASS: gcc.target/gcn/smax_1_run.c (test for excess errors)
    PASS: gcc.target/gcn/smax_1_run.c execution test

    PASS: gcc.target/gcn/smin_1.c (test for excess errors)
    PASS: gcc.target/gcn/smin_1.c scan-assembler-times \\tv_cmp_lt_i64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10
    FAIL: gcc.target/gcn/smin_1.c scan-assembler-times \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
    [-PASS:-]{+FAIL:+} gcc.target/gcn/smin_1.c scan-assembler-times vec_cmpv64didi 10
    PASS: gcc.target/gcn/smin_1_run.c (test for excess errors)
    PASS: gcc.target/gcn/smin_1_run.c execution test

    PASS: gcc.target/gcn/sram-ecc-3.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-3.c scan-assembler (\\*zero_extendv64qiv64si_sdwa|\\*zero_extendv64qiv64si_shift)

    PASS: gcc.target/gcn/sram-ecc-4.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-4.c scan-assembler (\\*zero_extendv64hiv64si_sdwa|\\*zero_extendv64hiv64si_shift)

    PASS: gcc.target/gcn/sram-ecc-7.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-7.c scan-assembler (\\*zero_extendv64qiv64si_sdwa|\\*zero_extendv64qiv64si_shift)

    PASS: gcc.target/gcn/sram-ecc-8.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-8.c scan-assembler (\\*zero_extendv64hiv64si_sdwa|\\*zero_extendv64hiv64si_shift)

    PASS: gcc.target/gcn/umax_1.c (test for excess errors)
    PASS: gcc.target/gcn/umax_1.c scan-assembler-times \\tv_cmp_gt_u64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 8
    FAIL: gcc.target/gcn/umax_1.c scan-assembler-times \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
    [-PASS:-]{+FAIL:+} gcc.target/gcn/umax_1.c scan-assembler-times vec_cmpv64didi 8
    PASS: gcc.target/gcn/umax_1_run.c (test for excess errors)
    PASS: gcc.target/gcn/umax_1_run.c execution test

    PASS: gcc.target/gcn/umin_1.c (test for excess errors)
    PASS: gcc.target/gcn/umin_1.c scan-assembler-times \\tv_cmp_lt_u64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 8
    FAIL: gcc.target/gcn/umin_1.c scan-assembler-times \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
    [-PASS:-]{+FAIL:+} gcc.target/gcn/umin_1.c scan-assembler-times vec_cmpv64didi 8
    PASS: gcc.target/gcn/umin_1_run.c (test for excess errors)
    PASS: gcc.target/gcn/umin_1_run.c execution test


Grüße
 Thomas


> gcc/ChangeLog:
>
> 	* config/gcn/gcn.cc (gcn_vectorize_preferred_simd_mode): Prefer V32 on
> 	RDNA devices.
> ---
>  gcc/config/gcn/gcn.cc | 26 ++++++++++++++++++++++++++
>  1 file changed, 26 insertions(+)
>
> diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc
> index 498146dcde9..efb73af50c4 100644
> --- a/gcc/config/gcn/gcn.cc
> +++ b/gcc/config/gcn/gcn.cc
> @@ -5226,6 +5226,32 @@ gcn_vector_mode_supported_p (machine_mode mode)
>  static machine_mode
>  gcn_vectorize_preferred_simd_mode (scalar_mode mode)
>  {
> +  /* RDNA devices have 32-lane vectors with limited support for 64-bit vectors
> +     (in particular, permute operations are only available for cases that don't
> +     span the 32-lane boundary).
> +
> +     From the RDNA3 manual: "Hardware may choose to skip either half if the
> +     EXEC mask for that half is all zeros...". This means that preferring
> +     32-lanes is a good stop-gap until we have proper wave32 support.  */
> +  if (TARGET_RDNA2_PLUS)
> +    switch (mode)
> +      {
> +      case E_QImode:
> +	return V32QImode;
> +      case E_HImode:
> +	return V32HImode;
> +      case E_SImode:
> +	return V32SImode;
> +      case E_DImode:
> +	return V32DImode;
> +      case E_SFmode:
> +	return V32SFmode;
> +      case E_DFmode:
> +	return V32DFmode;
> +      default:
> +	return word_mode;
> +      }
> +
>    switch (mode)
>      {
>      case E_QImode:
> -- 
> 2.41.0

  reply	other threads:[~2024-03-28  7:02 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-22 15:54 Andrew Stubbs
2024-03-28  7:00 ` Thomas Schwinge [this message]
2024-04-08 10:45   ` GCN: '--param=gcn-preferred-vector-lane-width=[default,32,64]' (was: [committed] amdgcn: Prefer V32 on RDNA devices) Thomas Schwinge
2024-04-08 12:24     ` GCN: '--param=gcn-preferred-vector-lane-width=[default,32,64]' Andrew Stubbs
2024-04-08 20:17       ` GCN: '--param=gcn-preferred-vectorization-factor=[default,32,64]' (was: GCN: '--param=gcn-preferred-vector-lane-width=[default,32,64]') Thomas Schwinge
2024-04-11 17:52         ` [PATCH] contrib/check-params-in-docs.py: Ignore gcn-preferred-vectorization-factor Martin Jambor
2024-04-11 18:51           ` Thomas Schwinge
2024-04-12  7:08             ` [PATCH] contrib/check-params-in-docs.py: Ignore target-specific params Filip Kastl
2024-04-12  7:40               ` Thomas Schwinge
2024-04-12  8:39               ` Martin Jambor

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87bk6yali5.fsf@dem-tschwing-1.schwinge.ddns.net \
    --to=tschwinge@baylibre.com \
    --cc=ams@baylibre.com \
    --cc=gcc-patches@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).