From: "H.J. Lu"
To: gcc-patches@gcc.gnu.org
Cc: Uros Bizjak, Richard Sandiford
Subject: [PATCH] by_pieces: Properly set m_max_size in op_by_pieces
Date: Tue, 3 Aug 2021 06:56:46 -0700
Message-Id: <20210803135646.2545430-1-hjl.tools@gmail.com>

1. Update x86 STORE_MAX_PIECES to use OImode and XImode only if inter-unit
move is enabled since x86 uses vec_duplicate, which is enabled only when
inter-unit move is enabled, to implement store_by_pieces.
2. Update op_by_pieces_d::op_by_pieces_d to set m_max_size to
STORE_MAX_PIECES for store_by_pieces and to COMPARE_MAX_PIECES for
compare_by_pieces.

gcc/

	PR target/101742
	* expr.c (op_by_pieces_d::op_by_pieces_d): Set m_max_size to
	STORE_MAX_PIECES for store_by_pieces and to COMPARE_MAX_PIECES
	for compare_by_pieces.
	* config/i386/i386.h (STORE_MAX_PIECES): Use OImode and XImode
	only if TARGET_INTER_UNIT_MOVES_TO_VEC is true.

gcc/testsuite/

	PR target/101742
	* gcc.target/i386/pr101742a.c: New test.
	* gcc.target/i386/pr101742b.c: Likewise.
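For illustration only, the two selections described in points 1 and 2 can be sketched as a small self-contained C++ program fragment. This is not GCC code: `TargetFlags`, `PieceLimits`, `ByPiecesOp`, `store_max_pieces` and `max_size_for` are hypothetical names introduced here. GCC expresses the first selection as a target macro, and the patch infers the operation from `to_load`, `from` and `from_cfn` rather than from an explicit enum as assumed below.

```cpp
#include <cassert>

// Hypothetical stand-ins for the tuning flags tested by STORE_MAX_PIECES.
struct TargetFlags {
  bool inter_unit_moves_to_vec;
  bool avx512f;
  bool prefer_avx256;
  bool avx;
  bool prefer_avx128;
  bool avx256_split_unaligned_store;
  bool sse2;
  bool sse_unaligned_store_optimal;
  int units_per_word;  // word size in bytes
};

// Mirrors the shape of the patched macro: vector-sized stores are
// considered only when inter-unit moves to vector registers are enabled;
// otherwise the limit falls back to the word size.
int store_max_pieces(const TargetFlags &t) {
  assert(t.units_per_word > 0);
  if (!t.inter_unit_moves_to_vec)
    return t.units_per_word;
  if (t.avx512f && !t.prefer_avx256)
    return 64;
  if (t.avx && !t.prefer_avx128 && !t.avx256_split_unaligned_store)
    return 32;
  if (t.sse2 && t.sse_unaligned_store_optimal)
    return 16;
  return t.units_per_word;
}

// Hypothetical explicit operation kind (an assumption of this sketch).
enum class ByPiecesOp { Move, Store, Compare };

// Hypothetical container for the three per-operation limits.
struct PieceLimits {
  int move_max;
  int store_max;
  int compare_max;
};

// Picks the limit that matches the operation and adds 1, as the patch
// does when initializing m_max_size.
int max_size_for(ByPiecesOp op, const PieceLimits &l) {
  switch (op) {
    case ByPiecesOp::Store:
      return l.store_max + 1;
    case ByPiecesOp::Compare:
      return l.compare_max + 1;
    case ByPiecesOp::Move:
      return l.move_max + 1;
  }
  return l.move_max + 1;
}
```

In this sketch, a target with SSE2 and `sse_unaligned_store_optimal` set but `inter_unit_moves_to_vec` cleared gets `units_per_word` from `store_max_pieces`, and a store operation takes its bound from `store_max` instead of `move_max`.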
---
 gcc/config/i386/i386.h                    | 20 +++++++++++---------
 gcc/expr.c                                |  6 +++++-
 gcc/testsuite/gcc.target/i386/pr101742a.c | 16 ++++++++++++++++
 gcc/testsuite/gcc.target/i386/pr101742b.c |  4 ++++
 4 files changed, 36 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101742a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101742b.c

diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index bed9cd9da18..9b416abd5f4 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -1783,15 +1783,17 @@ typedef struct ix86_args {
 /* STORE_MAX_PIECES is the number of bytes at a time that we can store
    efficiently.  */
 #define STORE_MAX_PIECES \
-  ((TARGET_AVX512F && !TARGET_PREFER_AVX256) \
-   ? 64 \
-   : ((TARGET_AVX \
-       && !TARGET_PREFER_AVX128 \
-       && !TARGET_AVX256_SPLIT_UNALIGNED_STORE) \
-      ? 32 \
-      : ((TARGET_SSE2 \
-          && TARGET_SSE_UNALIGNED_STORE_OPTIMAL) \
-         ? 16 : UNITS_PER_WORD)))
+  (TARGET_INTER_UNIT_MOVES_TO_VEC \
+   ? ((TARGET_AVX512F && !TARGET_PREFER_AVX256) \
+      ? 64 \
+      : ((TARGET_AVX \
+          && !TARGET_PREFER_AVX128 \
+          && !TARGET_AVX256_SPLIT_UNALIGNED_STORE) \
+         ? 32 \
+         : ((TARGET_SSE2 \
+             && TARGET_SSE_UNALIGNED_STORE_OPTIMAL) \
+            ? 16 : UNITS_PER_WORD))) \
+   : UNITS_PER_WORD)
 
 /* If a memory-to-memory move would take MOVE_RATIO or more simple
    move-instruction pairs, we will do a cpymem or libcall instead.
diff --git a/gcc/expr.c b/gcc/expr.c
index b65cfcfdcd1..2964b38b9a5 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -1131,7 +1131,11 @@ op_by_pieces_d::op_by_pieces_d (rtx to, bool to_load,
                                 bool qi_vector_mode)
   : m_to (to, to_load, NULL, NULL),
     m_from (from, from_load, from_cfn, from_cfn_data),
-    m_len (len), m_max_size (MOVE_MAX_PIECES + 1),
+    m_len (len),
+    m_max_size (((!to_load && from == nullptr)
+                 ? STORE_MAX_PIECES
+                 : (from_cfn != nullptr
+                    ? COMPARE_MAX_PIECES : MOVE_MAX_PIECES)) + 1),
     m_push (push), m_qi_vector_mode (qi_vector_mode)
 {
   int toi = m_to.get_addr_inc ();
diff --git a/gcc/testsuite/gcc.target/i386/pr101742a.c b/gcc/testsuite/gcc.target/i386/pr101742a.c
new file mode 100644
index 00000000000..67ea40587dd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr101742a.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3 -mtune=nano-x2" } */
+
+int n2;
+
+__attribute__ ((simd)) char
+w7 (void)
+{
+  short int xb = n2;
+  int qp;
+
+  for (qp = 0; qp < 2; ++qp)
+    xb = xb < 1;
+
+  return xb;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr101742b.c b/gcc/testsuite/gcc.target/i386/pr101742b.c
new file mode 100644
index 00000000000..ba19064077b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr101742b.c
@@ -0,0 +1,4 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3 -mtune=nano-x2 -mtune-ctrl=sse_unaligned_store_optimal" } */
+
+#include "pr101742a.c"
-- 
2.31.1