From mboxrd@z Thu Jan 1 00:00:00 1970
From: Uros Bizjak
Date: Wed, 4 Aug 2021 20:46:43 +0200
Subject: Re: [PATCH v2] x86: Update STORE_MAX_PIECES
To: "H.J. Lu"
Cc: GCC Patches, Hongtao Liu
References: <20210803135646.2545430-1-hjl.tools@gmail.com>
List-Id: Gcc-patches mailing list
Content-Type: text/plain; charset="UTF-8"

On Wed, Aug 4, 2021 at 3:34 PM H.J. Lu wrote:
>
> On Tue, Aug 3, 2021 at 6:56 AM H.J. Lu wrote:
> >
> > 1. Update x86 STORE_MAX_PIECES to use OImode and XImode only if inter-unit
> > move is enabled since x86 uses vec_duplicate, which is enabled only when
> > inter-unit move is enabled, to implement store_by_pieces.
> > 2. Update op_by_pieces_d::op_by_pieces_d to set m_max_size to
> > STORE_MAX_PIECES for store_by_pieces and to COMPARE_MAX_PIECES for
> > compare_by_pieces.
> >
> > gcc/
> >
> >         PR target/101742
> >         * expr.c (op_by_pieces_d::op_by_pieces_d): Set m_max_size to
> >         STORE_MAX_PIECES for store_by_pieces and to COMPARE_MAX_PIECES
> >         for compare_by_pieces.
> >         * config/i386/i386.h (STORE_MAX_PIECES): Use OImode and XImode
> >         only if TARGET_INTER_UNIT_MOVES_TO_VEC is true.
> >
> > gcc/testsuite/
> >
> >         PR target/101742
> >         * gcc.target/i386/pr101742a.c: New test.
> >         * gcc.target/i386/pr101742b.c: Likewise.
> > ---
> >  gcc/config/i386/i386.h                    | 20 +++++++++++---------
> >  gcc/expr.c                                |  6 +++++-
> >  gcc/testsuite/gcc.target/i386/pr101742a.c | 16 ++++++++++++++++
> >  gcc/testsuite/gcc.target/i386/pr101742b.c |  4 ++++
> >  4 files changed, 36 insertions(+), 10 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101742a.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101742b.c
> >
> > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> > index bed9cd9da18..9b416abd5f4 100644
> > --- a/gcc/config/i386/i386.h
> > +++ b/gcc/config/i386/i386.h
> > @@ -1783,15 +1783,17 @@ typedef struct ix86_args {
> >  /* STORE_MAX_PIECES is the number of bytes at a time that we can
> >     store efficiently.  */
> >  #define STORE_MAX_PIECES \
> > -  ((TARGET_AVX512F && !TARGET_PREFER_AVX256) \
> > -   ? 64 \
> > -   : ((TARGET_AVX \
> > -       && !TARGET_PREFER_AVX128 \
> > -       && !TARGET_AVX256_SPLIT_UNALIGNED_STORE) \
> > -      ? 32 \
> > -      : ((TARGET_SSE2 \
> > -          && TARGET_SSE_UNALIGNED_STORE_OPTIMAL) \
> > -         ? 16 : UNITS_PER_WORD)))
> > +  (TARGET_INTER_UNIT_MOVES_TO_VEC \
> > +   ? ((TARGET_AVX512F && !TARGET_PREFER_AVX256) \
> > +      ? 64 \
> > +      : ((TARGET_AVX \
> > +          && !TARGET_PREFER_AVX128 \
> > +          && !TARGET_AVX256_SPLIT_UNALIGNED_STORE) \
> > +         ? 32 \
> > +         : ((TARGET_SSE2 \
> > +             && TARGET_SSE_UNALIGNED_STORE_OPTIMAL) \
> > +            ? 16 : UNITS_PER_WORD))) \
> > +   : UNITS_PER_WORD)
> >
> >  /* If a memory-to-memory move would take MOVE_RATIO or more simple
> >     move-instruction pairs, we will do a cpymem or libcall instead.
> >
>
> expr.c has been fixed.  Here is the v2 patch for x86 backend.
> OK for master?

OK, but please add the comment about vec_duplicate before the define
to explain the situation with TARGET_INTER_UNIT_MOVES_TO_VEC.

Thanks,
Uros.
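
For readers following the nested conditional in the quoted hunk, here is a
rough standalone sketch of the selection order in the patched
STORE_MAX_PIECES. It is only an illustration: the struct x86_tuning and the
function store_max_pieces are hypothetical names invented for this note
(GCC evaluates the TARGET_* macros directly and has no such struct), and the
comment text paraphrases the patch description rather than quoting whatever
comment ends up in i386.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the TARGET_* predicates used by the macro.
   GCC derives these from -march/-mtune, not from a struct like this.  */
struct x86_tuning
{
  bool inter_unit_moves_to_vec;       /* TARGET_INTER_UNIT_MOVES_TO_VEC */
  bool avx512f;                       /* TARGET_AVX512F */
  bool prefer_avx256;                 /* TARGET_PREFER_AVX256 */
  bool avx;                           /* TARGET_AVX */
  bool prefer_avx128;                 /* TARGET_PREFER_AVX128 */
  bool avx256_split_unaligned_store;  /* TARGET_AVX256_SPLIT_UNALIGNED_STORE */
  bool sse2;                          /* TARGET_SSE2 */
  bool sse_unaligned_store_optimal;   /* TARGET_SSE_UNALIGNED_STORE_OPTIMAL */
  int units_per_word;                 /* UNITS_PER_WORD */
};

/* Mirror of the patched macro.  Per the patch description, store_by_pieces
   on x86 is implemented with vec_duplicate, which is enabled only when
   inter-unit moves are enabled, so without them only word-sized stores
   are reported.  */
int
store_max_pieces (const struct x86_tuning *t)
{
  assert (t != NULL);

  if (!t->inter_unit_moves_to_vec)
    return t->units_per_word;
  if (t->avx512f && !t->prefer_avx256)
    return 64;
  if (t->avx && !t->prefer_avx128 && !t->avx256_split_unaligned_store)
    return 32;
  if (t->sse2 && t->sse_unaligned_store_optimal)
    return 16;
  return t->units_per_word;
}
```

With avx512f set and inter_unit_moves_to_vec set, the sketch returns 64;
clearing inter_unit_moves_to_vec alone drops the result to units_per_word,
which is the behavior change the patch introduces.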