From: Thomas Schwinge
To: Andrew Stubbs
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [committed] amdgcn: Prefer V32 on RDNA devices
In-Reply-To: <20240322155449.747518-1-ams@baylibre.com>
References: <20240322155449.747518-1-ams@baylibre.com>
Date: Thu, 28 Mar 2024 08:00:50 +0100
Message-ID: <87bk6yali5.fsf@dem-tschwing-1.schwinge.ddns.net>

Hi Andrew!

On 2024-03-22T15:54:48+0000, Andrew Stubbs wrote:
> This patch alters the default (preferred) vector size to 32 on RDNA devices to
> better match the actual hardware.  64-lane vectors will continue to be
> used where they are hard-coded (such as function prologues).
>
> We run these devices in wavefrontsize64 for compatibility, but they actually
> only have 32-lane vectors, natively.  If the upper part of a V64 is masked
> off (as it is in V32) then RDNA devices will skip execution of the upper part
> for most operations, so this adjustment shouldn't leave too much performance on
> the table.  One exception is memory instructions, so full wavefrontsize32
> support would be better.
>
> The advantage is that we avoid the missing V64 operations (such as permute and
> vec_extract).
>
> Committed to mainline.
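(A minimal sketch of the masking argument quoted above; it is not code
from GCC. It assumes the EXEC mask is a plain 64-bit value with one bit
per lane, lane 0 in the least significant bit. The helper names
'exec_mask_for_lanes' and 'upper_half_idle' are hypothetical and only
illustrate why a V32 operation in a wavefrontsize64 wave leaves the
upper half skippable.)

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: EXEC mask that enables the lowest LANES lanes of a
   64-lane wavefront (one bit per lane, lane 0 in bit 0).  */
static uint64_t
exec_mask_for_lanes (unsigned lanes)
{
  assert (lanes <= 64);
  /* Shifting a 64-bit value by 64 is undefined in C, so handle it apart.  */
  return lanes == 64 ? UINT64_MAX : (UINT64_C (1) << lanes) - 1;
}

/* Hypothetical helper: nonzero if no lane in the upper half (lanes 32..63)
   is enabled.  This is the condition under which, per the RDNA3 manual
   text quoted in the patch, hardware "may choose to skip" that half.  */
static int
upper_half_idle (uint64_t exec)
{
  return (exec >> 32) == 0;
}
```

With these definitions a V32 mask (32 lanes) leaves the upper half idle,
while a full V64 mask does not.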
In my GCN target '-march=gfx1100' testing, this commit
"amdgcn: Prefer V32 on RDNA devices" does resolve (or, make latent?) a
number of execution test FAILs (that is, regressions compared to earlier
'-march=gfx90a' etc. testing).

This commit also resolves (for my '-march=gfx1100' testing) one
pre-existing FAIL (that is, one already seen in earlier '-march=gfx90a'
etc. testing):

    PASS: gcc.dg/tree-ssa/scev-14.c (test for excess errors)
    [-FAIL:-]{+PASS:+} gcc.dg/tree-ssa/scev-14.c scan-tree-dump ivopts "Overflowness wrto loop niter:\tNo-overflow"

Does that mean this test case specifically (or, just its
'scan-tree-dump'?) needs to be adjusted for GCN V64 testing?

This commit, as you'd also mentioned elsewhere, however also causes a
number of regressions in 'gcc.target/gcn/gcn.exp', see list below.

Those can be "fixed" with 'dg-additional-options -march=gfx90a' (or
similar) in the affected test cases (let me know if you'd like me to
'git push' that), but I suppose something more elaborate may be in
order?  (Conditionalize those on 'target { ! gcn_rdna }', and add
respective scanning for 'target gcn_rdna'?  I can help with
effective-target 'gcn_rdna' (or similar), if you'd like me to.)

And/or, have a '-mpreferred-simd-mode=v64' (or similar) to be used for
such test cases, to override 'if (TARGET_RDNA2_PLUS)' etc. in
'gcn_vectorize_preferred_simd_mode'?

Best, probably, both these things, to properly test both V32 and V64?

    PASS: gcc.target/gcn/cond_fmaxnm_1.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_1.c scan-assembler-times smaxv64df3_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_1.c scan-assembler-times smaxv64sf3_exec 3
    PASS: gcc.target/gcn/cond_fmaxnm_1_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fmaxnm_1_run.c execution test
    PASS: gcc.target/gcn/cond_fmaxnm_2.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_2.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_2.c scan-assembler-times smaxv64df3_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_2.c scan-assembler-times smaxv64sf3_exec 3
    PASS: gcc.target/gcn/cond_fmaxnm_2_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fmaxnm_2_run.c execution test
    PASS: gcc.target/gcn/cond_fmaxnm_3.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times movv64df_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times movv64sf_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times smaxv64sf3 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_3.c scan-assembler-times smaxv64sf3 3
    PASS: gcc.target/gcn/cond_fmaxnm_3_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fmaxnm_3_run.c execution test
    PASS: gcc.target/gcn/cond_fmaxnm_4.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-times movv64df_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-times movv64sf_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-times smaxv64sf3 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_4.c scan-assembler-times smaxv64sf3 3
    PASS: gcc.target/gcn/cond_fmaxnm_4_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fmaxnm_4_run.c execution test
    PASS: gcc.target/gcn/cond_fmaxnm_5.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_5.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_5.c scan-assembler-times smaxv64df3_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_5.c scan-assembler-times smaxv64sf3_exec 3
    PASS: gcc.target/gcn/cond_fmaxnm_5_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fmaxnm_5_run.c execution test
    PASS: gcc.target/gcn/cond_fmaxnm_6.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_6.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_6.c scan-assembler-times smaxv64df3_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_6.c scan-assembler-times smaxv64sf3_exec 3
    PASS: gcc.target/gcn/cond_fmaxnm_6_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fmaxnm_6_run.c execution test
    PASS: gcc.target/gcn/cond_fmaxnm_7.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_7.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_7.c scan-assembler-times smaxv64df3_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_7.c scan-assembler-times smaxv64sf3_exec 3
    PASS: gcc.target/gcn/cond_fmaxnm_7_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fmaxnm_7_run.c execution test
    PASS: gcc.target/gcn/cond_fmaxnm_8.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_8.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_8.c scan-assembler-times smaxv64df3_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fmaxnm_8.c scan-assembler-times smaxv64sf3_exec 3
    PASS: gcc.target/gcn/cond_fmaxnm_8_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fmaxnm_8_run.c execution test
    PASS: gcc.target/gcn/cond_fminnm_1.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_1.c scan-assembler-times sminv64df3_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_1.c scan-assembler-times sminv64sf3_exec 3
    PASS: gcc.target/gcn/cond_fminnm_1_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fminnm_1_run.c execution test
    PASS: gcc.target/gcn/cond_fminnm_2.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_2.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_2.c scan-assembler-times sminv64df3_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_2.c scan-assembler-times sminv64sf3_exec 3
    PASS: gcc.target/gcn/cond_fminnm_2_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fminnm_2_run.c execution test
    PASS: gcc.target/gcn/cond_fminnm_3.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_3.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_3.c scan-assembler-times movv64df_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_3.c scan-assembler-times movv64sf_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_3.c scan-assembler-times sminv64sf3 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_3.c scan-assembler-times sminv64sf3 3
    PASS: gcc.target/gcn/cond_fminnm_3_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fminnm_3_run.c execution test
    PASS: gcc.target/gcn/cond_fminnm_4.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_4.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_4.c scan-assembler-times movv64df_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_4.c scan-assembler-times movv64sf_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_4.c scan-assembler-times sminv64sf3 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_4.c scan-assembler-times sminv64sf3 3
    PASS: gcc.target/gcn/cond_fminnm_4_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fminnm_4_run.c execution test
    PASS: gcc.target/gcn/cond_fminnm_5.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_5.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_5.c scan-assembler-times sminv64df3_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_5.c scan-assembler-times sminv64sf3_exec 3
    PASS: gcc.target/gcn/cond_fminnm_5_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fminnm_5_run.c execution test
    PASS: gcc.target/gcn/cond_fminnm_6.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_6.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_6.c scan-assembler-times sminv64df3_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_6.c scan-assembler-times sminv64sf3_exec 3
    PASS: gcc.target/gcn/cond_fminnm_6_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fminnm_6_run.c execution test
    PASS: gcc.target/gcn/cond_fminnm_7.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_7.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_7.c scan-assembler-times sminv64df3_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_7.c scan-assembler-times sminv64sf3_exec 3
    PASS: gcc.target/gcn/cond_fminnm_7_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fminnm_7_run.c execution test
    PASS: gcc.target/gcn/cond_fminnm_8.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_8.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_..
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_8.c scan-assembler-times sminv64df3_exec 3
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_fminnm_8.c scan-assembler-times sminv64sf3_exec 3
    PASS: gcc.target/gcn/cond_fminnm_8_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_fminnm_8_run.c execution test
    @@ -124634,12 +124634,12 @@ PASS: gcc.target/gcn/cond_shift_3.c scan-assembler-not movv64di_exec/2
    PASS: gcc.target/gcn/cond_shift_3.c scan-assembler-not v_cndmask_b32
    PASS: gcc.target/gcn/cond_shift_3.c scan-assembler-times \\tv_ashrrev_i32\\tv[0-9]+, 3, v[0-9]+ 1
    PASS: gcc.target/gcn/cond_shift_3.c scan-assembler-times \\tv_lshlrev_b32\\tv[0-9]+, 3, v[0-9]+ 10
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times vashlv64di3_exec 2
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times vashlv64si3_exec 18
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times vashrv64di3_exec 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times vashrv64si3_exec 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times vlshrv64di3_exec 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_3.c scan-assembler-times vlshrv64si3_exec 1
    PASS: gcc.target/gcn/cond_shift_3_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_shift_3_run.c execution test
    PASS: gcc.target/gcn/cond_shift_4.c (test for excess errors)
    @@ -124647,77 +124647,77 @@ PASS: gcc.target/gcn/cond_shift_4.c scan-assembler-not movv64di_exec/2
    PASS: gcc.target/gcn/cond_shift_4.c scan-assembler-not v_cndmask_b32
    PASS: gcc.target/gcn/cond_shift_4.c scan-assembler-times \\tv_ashrrev_i32\\tv[0-9]+, 3, v[0-9]+ 1
    PASS: gcc.target/gcn/cond_shift_4.c scan-assembler-times \\tv_lshlrev_b32\\tv[0-9]+, 3, v[0-9]+ 10
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times vashlv64di3_exec 2
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times vashlv64si3_exec 18
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times vashrv64di3_exec 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times vashrv64si3_exec 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times vlshrv64di3_exec 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_4.c scan-assembler-times vlshrv64si3_exec 1
    PASS: gcc.target/gcn/cond_shift_4_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_shift_4_run.c execution test
    PASS: gcc.target/gcn/cond_shift_8.c (test for excess errors)
    PASS: gcc.target/gcn/cond_shift_8.c scan-assembler-not movv64di_exec/0
    PASS: gcc.target/gcn/cond_shift_8.c scan-assembler-not movv64si_exec/0
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times vashlv64di3_exec 2
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times vashlv64si3_exec 18
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times vashrv64di3_exec 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times vashrv64si3_exec 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times vlshrv64di3_exec 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_8.c scan-assembler-times vlshrv64si3_exec 1
    PASS: gcc.target/gcn/cond_shift_8_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_shift_8_run.c execution test
    PASS: gcc.target/gcn/cond_shift_9.c (test for excess errors)
    PASS: gcc.target/gcn/cond_shift_9.c scan-assembler-not movv64di_exec/1
    PASS: gcc.target/gcn/cond_shift_9.c scan-assembler-not movv64si_exec/2
    PASS: gcc.target/gcn/cond_shift_9.c scan-assembler-not v_cndmask_b32
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times vashlv64di3_exec 2
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times vashlv64si3_exec 18
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times vashrv64di3_exec 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times vashrv64si3_exec 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times vlshrv64di3_exec 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_shift_9.c scan-assembler-times vlshrv64si3_exec 1
    PASS: gcc.target/gcn/cond_shift_9_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_shift_9_run.c execution test
    PASS: gcc.target/gcn/cond_smax_1.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smax_1.c scan-assembler-not \\ts_cmpk_lg_u32\\tvcc_lo, 0
    PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-not \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+
    PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_??, 0
    PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-not smaxv64si3/0
    PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
    PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-times \\tv_cmp_gt_i64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10
    PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-times \\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], -1 10
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smax_1.c scan-assembler-times smaxv64si3_exec 30
    PASS: gcc.target/gcn/cond_smax_1_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_smax_1_run.c execution test
    PASS: gcc.target/gcn/cond_smin_1.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smin_1.c scan-assembler-not \\ts_cmpk_lg_u32\\tvcc_lo, 0
    PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-not \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+
    PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_??, 0
    PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-not sminv64si3/0
    PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
    PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-times \\tv_cmp_lt_i64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10
    PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-times \\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], -1 10
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smin_1.c scan-assembler-times sminv64si3_exec 30
    PASS: gcc.target/gcn/cond_smin_1_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_smin_1_run.c execution test
    PASS: gcc.target/gcn/cond_umax_1.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umax_1.c scan-assembler-not \\ts_cmpk_lg_u32\\tvcc_lo, 0
    PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_??, 0
    PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-not umaxv64si3/0
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umax_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
    PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-times \\tv_cmp_gt_u64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 8
    PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-times \\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], 1 8
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umax_1.c scan-assembler-times umaxv64si3_exec 20
    PASS: gcc.target/gcn/cond_umax_1_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_umax_1_run.c execution test
    PASS: gcc.target/gcn/cond_umin_1.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umin_1.c scan-assembler-not \\ts_cmpk_lg_u32\\tvcc_lo, 0
    PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-not \\tv_writelane_b32\\tv[0-9]+, vcc_??, 0
    PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-not uminv64si3/0
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umin_1.c scan-assembler-times \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
    PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-times \\tv_cmp_lt_u64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 8
    PASS: gcc.target/gcn/cond_umin_1.c scan-assembler-times \\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], 1 8
    [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umin_1.c scan-assembler-times uminv64si3_exec 20
    PASS: gcc.target/gcn/cond_umin_1_run.c (test for excess errors)
    PASS: gcc.target/gcn/cond_umin_1_run.c execution test
    PASS: gcc.target/gcn/simd-math-1.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_acos"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_acosh"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_asin"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_asinh"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_atan"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_atan2"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_atanh"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_copysign"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_cos"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_cosh"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_erf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_exp"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_exp2"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_fmod"
    XFAIL: gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_gamma"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_hypot"
    XFAIL: gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_lgamma"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_log"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_log10"
    XFAIL: gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_log2"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_pow"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_remainder"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_rint"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_scalb"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_significand"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_sin"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_sinh"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_sqrt"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_tan"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_tanh"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64df_tgamma"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_acosf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_acoshf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_asinf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_asinhf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_atan2f"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_atanf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_atanhf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_copysignf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_cosf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_coshf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_erff"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_exp2f"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_expf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_fmodf"
    XFAIL: gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_gammaf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_hypotf"
    XFAIL: gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_lgammaf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_log10f"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_log2f"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_logf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_powf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_remainderf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_rintf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_scalbf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_significandf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_sinf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_sinhf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_sqrtf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_tanf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_tanhf"
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-1.c scan-tree-dump vect "v64sf_tgammaf"
    @@ -125130,7 +125130,7 @@ PASS: gcc.target/gcn/simd-math-5-char-run.c (test for excess errors)
    PASS: gcc.target/gcn/simd-math-5-char-run.c execution test
    PASS: gcc.target/gcn/simd-math-5-char.c (test for excess errors)
    XFAIL: gcc.target/gcn/simd-math-5-char.c scan-assembler-times __divmodv64si4@rel32@lo 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5-char.c scan-assembler-times __divv64hi3@rel32@lo 1
    PASS: gcc.target/gcn/simd-math-5-char.c scan-assembler-times __divv64qi3@rel32@lo 0
    FAIL: gcc.target/gcn/simd-math-5-char.c scan-assembler-times __modv64qi3@rel32@lo 1
    PASS: gcc.target/gcn/simd-math-5-char.c scan-assembler-times __udivv64qi3@rel32@lo 0
    @@ -125171,8 +125171,8 @@ PASS: gcc.target/gcn/simd-math-5-long-run.c (test for excess errors)
    PASS: gcc.target/gcn/simd-math-5-long-run.c execution test
    PASS: gcc.target/gcn/simd-math-5-long.c (test for excess errors)
    XFAIL: gcc.target/gcn/simd-math-5-long.c scan-assembler-times __divmodv64di4@rel32@lo 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5-long.c scan-assembler-times __divv64di3@rel32@lo 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5-long.c scan-assembler-times __modv64di3@rel32@lo 1
    PASS: gcc.target/gcn/simd-math-5-long.c scan-assembler-times __udivv64di3@rel32@lo 0
    PASS: gcc.target/gcn/simd-math-5-long.c scan-assembler-times __umodv64di3@rel32@lo 0
    PASS: gcc.target/gcn/simd-math-5-short.c (test for excess errors)
    XFAIL: gcc.target/gcn/simd-math-5-short.c scan-assembler-times __divmodv64si4@rel32@lo 1
    PASS: gcc.target/gcn/simd-math-5-short.c scan-assembler-times __divv64hi3@rel32@lo 0
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5-short.c scan-assembler-times __divv64si3@rel32@lo 1
    FAIL: gcc.target/gcn/simd-math-5-short.c scan-assembler-times __modv64hi3@rel32@lo 1
    PASS: gcc.target/gcn/simd-math-5-short.c scan-assembler-times __udivv64hi3@rel32@lo 0
    PASS: gcc.target/gcn/simd-math-5-short.c scan-assembler-times __umodv64hi3@rel32@lo 0
    PASS: gcc.target/gcn/simd-math-5.c (test for excess errors)
    XFAIL: gcc.target/gcn/simd-math-5.c scan-assembler-times __divmodv64si4@rel32@lo 1
    PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times __divsi3@rel32@lo 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5.c scan-assembler-times __divv64si3@rel32@lo 1
    [-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5.c scan-assembler-times __modv64si3@rel32@lo 1
    PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times __udivmodv64si4@rel32@lo 0
    PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times __udivsi3@rel32@lo 0
    PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times __udivv64si3@rel32@lo 0
    @@ -125242,13 +125242,13 @@ PASS: gcc.target/gcn/simd-math-5.c scan-assembler-times __umodv64si3@rel32@lo 0
    PASS: gcc.target/gcn/smax_1.c (test for excess errors)
    PASS: gcc.target/gcn/smax_1.c scan-assembler-times \\tv_cmp_gt_i64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10
    FAIL: gcc.target/gcn/smax_1.c scan-assembler-times \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
    [-PASS:-]{+FAIL:+} gcc.target/gcn/smax_1.c scan-assembler-times vec_cmpv64didi 10
    PASS: gcc.target/gcn/smax_1_run.c (test for excess errors)
    PASS: gcc.target/gcn/smax_1_run.c execution test
    PASS: gcc.target/gcn/smin_1.c (test for excess errors)
    PASS: gcc.target/gcn/smin_1.c scan-assembler-times \\tv_cmp_lt_i64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10
    FAIL: gcc.target/gcn/smin_1.c scan-assembler-times \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
    [-PASS:-]{+FAIL:+} gcc.target/gcn/smin_1.c scan-assembler-times vec_cmpv64didi 10
    PASS: gcc.target/gcn/smin_1_run.c (test for excess errors)
    PASS: gcc.target/gcn/smin_1_run.c execution test
    PASS: gcc.target/gcn/sram-ecc-3.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-3.c scan-assembler (\\*zero_extendv64qiv64si_sdwa|\\*zero_extendv64qiv64si_shift)
    PASS: gcc.target/gcn/sram-ecc-4.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-4.c scan-assembler (\\*zero_extendv64hiv64si_sdwa|\\*zero_extendv64hiv64si_shift)
    PASS: gcc.target/gcn/sram-ecc-7.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-7.c scan-assembler (\\*zero_extendv64qiv64si_sdwa|\\*zero_extendv64qiv64si_shift)
    PASS: gcc.target/gcn/sram-ecc-8.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-8.c scan-assembler (\\*zero_extendv64hiv64si_sdwa|\\*zero_extendv64hiv64si_shift)
    PASS: gcc.target/gcn/umax_1.c (test for excess errors)
    PASS: gcc.target/gcn/umax_1.c scan-assembler-times \\tv_cmp_gt_u64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 8
    FAIL: gcc.target/gcn/umax_1.c scan-assembler-times \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
    [-PASS:-]{+FAIL:+} gcc.target/gcn/umax_1.c scan-assembler-times vec_cmpv64didi 8
    PASS: gcc.target/gcn/umax_1_run.c (test for excess errors)
    PASS: gcc.target/gcn/umax_1_run.c execution test
    PASS: gcc.target/gcn/umin_1.c (test for excess errors)
    PASS: gcc.target/gcn/umin_1.c scan-assembler-times \\tv_cmp_lt_u64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 8
    FAIL: gcc.target/gcn/umin_1.c scan-assembler-times \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56
    [-PASS:-]{+FAIL:+} gcc.target/gcn/umin_1.c scan-assembler-times vec_cmpv64didi 8
    PASS: gcc.target/gcn/umin_1_run.c (test for excess errors)
    PASS: gcc.target/gcn/umin_1_run.c execution test


Grüße
 Thomas


> gcc/ChangeLog:
>
>   * config/gcn/gcn.cc (gcn_vectorize_preferred_simd_mode): Prefer V32 on
>   RDNA devices.
> ---
>  gcc/config/gcn/gcn.cc | 26 ++++++++++++++++++++++++++
>  1 file changed, 26 insertions(+)
>
> diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc
> index 498146dcde9..efb73af50c4 100644
> --- a/gcc/config/gcn/gcn.cc
> +++ b/gcc/config/gcn/gcn.cc
> @@ -5226,6 +5226,32 @@ gcn_vector_mode_supported_p (machine_mode mode)
>  static machine_mode
>  gcn_vectorize_preferred_simd_mode (scalar_mode mode)
>  {
> +  /* RDNA devices have 32-lane vectors with limited support for 64-bit vectors
> +     (in particular, permute operations are only available for cases that don't
> +     span the 32-lane boundary).
> +
> +     From the RDNA3 manual: "Hardware may choose to skip either half if the
> +     EXEC mask for that half is all zeros...". This means that preferring
> +     32-lanes is a good stop-gap until we have proper wave32 support.  */
> +  if (TARGET_RDNA2_PLUS)
> +    switch (mode)
> +      {
> +      case E_QImode:
> +        return V32QImode;
> +      case E_HImode:
> +        return V32HImode;
> +      case E_SImode:
> +        return V32SImode;
> +      case E_DImode:
> +        return V32DImode;
> +      case E_SFmode:
> +        return V32SFmode;
> +      case E_DFmode:
> +        return V32DFmode;
> +      default:
> +        return word_mode;
> +      }
> +
>    switch (mode)
>      {
>      case E_QImode:
> -- 
> 2.41.0
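
To make the '-mpreferred-simd-mode=v64' idea from earlier in this
message a bit more concrete, here is a stand-alone sketch of the
selection logic with such an override. It is only a model, not a patch
against 'gcn.cc'. The option does not exist; 'enum simd_pref',
'enum scalar_kind' and 'preferred_lanes' are hypothetical names; lane
counts stand in for the real 'V32*mode'/'V64*mode' values, and 0 stands
in for the 'word_mode' fallback. It also assumes that the pre-existing
(non-RDNA) path prefers the 64-lane mode for the same six scalar modes.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the scalar modes handled in the quoted patch; this is not
   GCC's machine_mode enumeration.  */
enum scalar_kind { QI, HI, SI, DI, SF, DF, SCALAR_OTHER };

/* Hypothetical values for a '-mpreferred-simd-mode=' option.  The option is
   only a proposal in this thread; it does not exist.  */
enum simd_pref { PREF_DEFAULT, PREF_V32, PREF_V64 };

/* Model of the preferred-SIMD-mode choice.  Returns the preferred number of
   lanes, or 0 for the scalar fallback (word_mode in the patch).  */
static int
preferred_lanes (enum scalar_kind mode, bool rdna2_plus, enum simd_pref pref)
{
  /* Modes without a vector variant fall back to scalar, whatever the
     override says.  */
  if (mode == SCALAR_OTHER)
    return 0;

  /* An explicit override wins over the per-device default.  */
  if (pref == PREF_V64)
    return 64;
  if (pref == PREF_V32)
    return 32;

  /* Default as in the patch: V32 on RDNA2+; otherwise (assumed) V64.  */
  return rdna2_plus ? 32 : 64;
}
```

With something like this, the existing V64 scan patterns could keep
being tested on RDNA by passing the override, for example
'preferred_lanes (SI, true, PREF_V64)' gives 64, while the default on
RDNA stays at 32.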