From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 1039) id 9600B397B830; Wed, 4 Aug 2021 20:02:47 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 9600B397B830 MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="utf-8" From: H.J. Lu To: gcc-cvs@gcc.gnu.org Subject: [gcc r12-2748] x86: Update STORE_MAX_PIECES X-Act-Checkin: gcc X-Git-Author: H.J. Lu X-Git-Refname: refs/heads/master X-Git-Oldrev: 09dba016db937e61be21ef1e9581065a9ed2847d X-Git-Newrev: 5738a64f8b3cf132b88b39af84b9f5f5a9a1554c Message-Id: <20210804200247.9600B397B830@sourceware.org> Date: Wed, 4 Aug 2021 20:02:47 +0000 (GMT) X-BeenThere: gcc-cvs@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-cvs mailing list List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Aug 2021 20:02:47 -0000 https://gcc.gnu.org/g:5738a64f8b3cf132b88b39af84b9f5f5a9a1554c commit r12-2748-g5738a64f8b3cf132b88b39af84b9f5f5a9a1554c Author: H.J. Lu Date: Tue Aug 3 06:17:22 2021 -0700 x86: Update STORE_MAX_PIECES Update STORE_MAX_PIECES to allow 16/32/64 bytes only if inter-unit move is enabled since vec_duplicate enabled by inter-unit move is used to implement store_by_pieces of 16/32/64 bytes. gcc/ PR target/101742 * config/i386/i386.h (STORE_MAX_PIECES): Allow 16/32/64 bytes only if TARGET_INTER_UNIT_MOVES_TO_VEC is true. gcc/testsuite/ PR target/101742 * gcc.target/i386/pr101742a.c: New test. * gcc.target/i386/pr101742b.c: Likewise. Diff: --- gcc/config/i386/i386.h | 26 +++++++++++++++----------- gcc/testsuite/gcc.target/i386/pr101742a.c | 16 ++++++++++++++++ gcc/testsuite/gcc.target/i386/pr101742b.c | 4 ++++ 3 files changed, 35 insertions(+), 11 deletions(-) diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index bed9cd9da18..21fe51bba40 100644 --- a/gcc/config/i386/i386.h +++ b/gcc/config/i386/i386.h @@ -1780,18 +1780,22 @@ typedef struct ix86_args { && TARGET_SSE_UNALIGNED_STORE_OPTIMAL) \ ? 16 : UNITS_PER_WORD))) -/* STORE_MAX_PIECES is the number of bytes at a time that we can - store efficiently. */ +/* STORE_MAX_PIECES is the number of bytes at a time that we can store + efficiently. Allow 16/32/64 bytes only if inter-unit move is enabled + since vec_duplicate enabled by inter-unit move is used to implement + store_by_pieces of 16/32/64 bytes. */ #define STORE_MAX_PIECES \ - ((TARGET_AVX512F && !TARGET_PREFER_AVX256) \ - ? 64 \ - : ((TARGET_AVX \ - && !TARGET_PREFER_AVX128 \ - && !TARGET_AVX256_SPLIT_UNALIGNED_STORE) \ - ? 32 \ - : ((TARGET_SSE2 \ - && TARGET_SSE_UNALIGNED_STORE_OPTIMAL) \ - ? 16 : UNITS_PER_WORD))) + (TARGET_INTER_UNIT_MOVES_TO_VEC \ + ? ((TARGET_AVX512F && !TARGET_PREFER_AVX256) \ + ? 64 \ + : ((TARGET_AVX \ + && !TARGET_PREFER_AVX128 \ + && !TARGET_AVX256_SPLIT_UNALIGNED_STORE) \ + ? 32 \ + : ((TARGET_SSE2 \ + && TARGET_SSE_UNALIGNED_STORE_OPTIMAL) \ + ? 16 : UNITS_PER_WORD))) \ + : UNITS_PER_WORD) /* If a memory-to-memory move would take MOVE_RATIO or more simple move-instruction pairs, we will do a cpymem or libcall instead. diff --git a/gcc/testsuite/gcc.target/i386/pr101742a.c b/gcc/testsuite/gcc.target/i386/pr101742a.c new file mode 100644 index 00000000000..67ea40587dd --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr101742a.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-O3 -mtune=nano-x2" } */ + +int n2; + +__attribute__ ((simd)) char +w7 (void) +{ + short int xb = n2; + int qp; + + for (qp = 0; qp < 2; ++qp) + xb = xb < 1; + + return xb; +} diff --git a/gcc/testsuite/gcc.target/i386/pr101742b.c b/gcc/testsuite/gcc.target/i386/pr101742b.c new file mode 100644 index 00000000000..ba19064077b --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr101742b.c @@ -0,0 +1,4 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-O3 -mtune=nano-x2 -mtune-ctrl=sse_unaligned_store_optimal" } */ + +#include "pr101742a.c"