From: "H.J. Lu"
Date: Wed, 4 Aug 2021 13:01:56 -0700
Subject: [PATCH v3] x86: Update STORE_MAX_PIECES
To: Uros Bizjak
Cc: GCC Patches, Hongtao Liu
References: <20210803135646.2545430-1-hjl.tools@gmail.com>

On Wed, Aug 4, 2021 at 11:46 AM Uros Bizjak wrote:
>
> On Wed, Aug 4, 2021 at 3:34 PM H.J. Lu wrote:
> >
> > On Tue, Aug 3, 2021 at 6:56 AM H.J. Lu wrote:
> > >
> > > 1. Update x86 STORE_MAX_PIECES to use OImode and XImode only if inter-unit
> > > move is enabled since x86 uses vec_duplicate, which is enabled only when
> > > inter-unit move is enabled, to implement store_by_pieces.
> > > 2. Update op_by_pieces_d::op_by_pieces_d to set m_max_size to
> > > STORE_MAX_PIECES for store_by_pieces and to COMPARE_MAX_PIECES for
> > > compare_by_pieces.
> > >
> > > gcc/
> > >
> > >	PR target/101742
> > >	* expr.c (op_by_pieces_d::op_by_pieces_d): Set m_max_size to
> > >	STORE_MAX_PIECES for store_by_pieces and to COMPARE_MAX_PIECES
> > >	for compare_by_pieces.
> > >	* config/i386/i386.h (STORE_MAX_PIECES): Use OImode and XImode
> > >	only if TARGET_INTER_UNIT_MOVES_TO_VEC is true.
> > >
> > > gcc/testsuite/
> > >
> > >	PR target/101742
> > >	* gcc.target/i386/pr101742a.c: New test.
> > >	* gcc.target/i386/pr101742b.c: Likewise.
> > > ---
> > >  gcc/config/i386/i386.h                    | 20 +++++++++++---------
> > >  gcc/expr.c                                |  6 +++++-
> > >  gcc/testsuite/gcc.target/i386/pr101742a.c | 16 ++++++++++++++++
> > >  gcc/testsuite/gcc.target/i386/pr101742b.c |  4 ++++
> > >  4 files changed, 36 insertions(+), 10 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101742a.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101742b.c
> > >
> > > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> > > index bed9cd9da18..9b416abd5f4 100644
> > > --- a/gcc/config/i386/i386.h
> > > +++ b/gcc/config/i386/i386.h
> > > @@ -1783,15 +1783,17 @@ typedef struct ix86_args {
> > >  /* STORE_MAX_PIECES is the number of bytes at a time that we can
> > >     store efficiently.  */
> > >  #define STORE_MAX_PIECES \
> > > -  ((TARGET_AVX512F && !TARGET_PREFER_AVX256) \
> > > -   ? 64 \
> > > -   : ((TARGET_AVX \
> > > -       && !TARGET_PREFER_AVX128 \
> > > -       && !TARGET_AVX256_SPLIT_UNALIGNED_STORE) \
> > > -      ? 32 \
> > > -      : ((TARGET_SSE2 \
> > > -	  && TARGET_SSE_UNALIGNED_STORE_OPTIMAL) \
> > > -	 ? 16 : UNITS_PER_WORD)))
> > > +  (TARGET_INTER_UNIT_MOVES_TO_VEC \
> > > +   ? ((TARGET_AVX512F && !TARGET_PREFER_AVX256) \
> > > +      ? 64 \
> > > +      : ((TARGET_AVX \
> > > +	  && !TARGET_PREFER_AVX128 \
> > > +	  && !TARGET_AVX256_SPLIT_UNALIGNED_STORE) \
> > > +	  ? 32 \
> > > +	  : ((TARGET_SSE2 \
> > > +	      && TARGET_SSE_UNALIGNED_STORE_OPTIMAL) \
> > > +	      ? 16 : UNITS_PER_WORD))) \
> > > +   : UNITS_PER_WORD)
> > >
> > >  /* If a memory-to-memory move would take MOVE_RATIO or more simple
> > >     move-instruction pairs, we will do a cpymem or libcall instead.
> >
> > expr.c has been fixed.  Here is the v2 patch for x86 backend.
> > OK for master?
>
> OK, but please add the comment about vec_duplicate before the define
> to explain the situation with TARGET_INTER_UNIT_MOVES_TO_VEC.

This is what I am checking in with

/* STORE_MAX_PIECES is the number of bytes at a time that we can store
   efficiently.  Allow 16/32/64 bytes only if inter-unit move is enabled
   since vec_duplicate enabled by inter-unit move is used to implement
   store_by_pieces of 16/32/64 bytes.  */

> Thanks,
> Uros.

Thanks.

-- 
H.J.

[-- Attachment: v3-0001-x86-Update-STORE_MAX_PIECES.patch --]

From 9487c165afb5b6083a3fc09a2e8b7bcabfe28765 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Tue, 3 Aug 2021 06:17:22 -0700
Subject: [PATCH v3] x86: Update STORE_MAX_PIECES

Update STORE_MAX_PIECES to allow 16/32/64 bytes only if inter-unit move
is enabled since vec_duplicate enabled by inter-unit move is used to
implement store_by_pieces of 16/32/64 bytes.

gcc/

	PR target/101742
	* config/i386/i386.h (STORE_MAX_PIECES): Allow 16/32/64 bytes
	only if TARGET_INTER_UNIT_MOVES_TO_VEC is true.

gcc/testsuite/

	PR target/101742
	* gcc.target/i386/pr101742a.c: New test.
	* gcc.target/i386/pr101742b.c: Likewise.
---
 gcc/config/i386/i386.h                    | 26 +++++++++++++----------
 gcc/testsuite/gcc.target/i386/pr101742a.c | 16 ++++++++++++++
 gcc/testsuite/gcc.target/i386/pr101742b.c |  4 ++++
 3 files changed, 35 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101742a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101742b.c

diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index bed9cd9da18..21fe51bba40 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -1780,18 +1780,22 @@ typedef struct ix86_args {
 	  && TARGET_SSE_UNALIGNED_STORE_OPTIMAL) \
 	 ? 16 : UNITS_PER_WORD)))
 
-/* STORE_MAX_PIECES is the number of bytes at a time that we can
-   store efficiently.  */
+/* STORE_MAX_PIECES is the number of bytes at a time that we can store
+   efficiently.  Allow 16/32/64 bytes only if inter-unit move is enabled
+   since vec_duplicate enabled by inter-unit move is used to implement
+   store_by_pieces of 16/32/64 bytes.  */
 #define STORE_MAX_PIECES \
-  ((TARGET_AVX512F && !TARGET_PREFER_AVX256) \
-   ? 64 \
-   : ((TARGET_AVX \
-       && !TARGET_PREFER_AVX128 \
-       && !TARGET_AVX256_SPLIT_UNALIGNED_STORE) \
-      ? 32 \
-      : ((TARGET_SSE2 \
-	  && TARGET_SSE_UNALIGNED_STORE_OPTIMAL) \
-	 ? 16 : UNITS_PER_WORD)))
+  (TARGET_INTER_UNIT_MOVES_TO_VEC \
+   ? ((TARGET_AVX512F && !TARGET_PREFER_AVX256) \
+      ? 64 \
+      : ((TARGET_AVX \
+	  && !TARGET_PREFER_AVX128 \
+	  && !TARGET_AVX256_SPLIT_UNALIGNED_STORE) \
+	  ? 32 \
+	  : ((TARGET_SSE2 \
+	      && TARGET_SSE_UNALIGNED_STORE_OPTIMAL) \
+	      ? 16 : UNITS_PER_WORD))) \
+   : UNITS_PER_WORD)
 
 /* If a memory-to-memory move would take MOVE_RATIO or more simple
    move-instruction pairs, we will do a cpymem or libcall instead.
diff --git a/gcc/testsuite/gcc.target/i386/pr101742a.c b/gcc/testsuite/gcc.target/i386/pr101742a.c
new file mode 100644
index 00000000000..67ea40587dd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr101742a.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3 -mtune=nano-x2" } */
+
+int n2;
+
+__attribute__ ((simd)) char
+w7 (void)
+{
+  short int xb = n2;
+  int qp;
+
+  for (qp = 0; qp < 2; ++qp)
+    xb = xb < 1;
+
+  return xb;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr101742b.c b/gcc/testsuite/gcc.target/i386/pr101742b.c
new file mode 100644
index 00000000000..ba19064077b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr101742b.c
@@ -0,0 +1,4 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3 -mtune=nano-x2 -mtune-ctrl=sse_unaligned_store_optimal" } */
+
+#include "pr101742a.c"
-- 
2.31.1
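
For readers who find the nested conditional in the patched STORE_MAX_PIECES hard to follow, the selection order can be sketched as a plain C function. This is only an illustration, not code from GCC: the struct x86_tuning type, its field names, and store_max_pieces are hypothetical stand-ins for the TARGET_* predicates and UNITS_PER_WORD that the macro tests.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the predicates tested by the macro.
   None of these names exist in GCC; they only mirror the TARGET_*
   conditions and UNITS_PER_WORD used in the patch.  */
struct x86_tuning
{
  bool inter_unit_moves_to_vec;      /* TARGET_INTER_UNIT_MOVES_TO_VEC */
  bool avx512f;                      /* TARGET_AVX512F */
  bool prefer_avx256;                /* TARGET_PREFER_AVX256 */
  bool avx;                          /* TARGET_AVX */
  bool prefer_avx128;                /* TARGET_PREFER_AVX128 */
  bool avx256_split_unaligned_store; /* TARGET_AVX256_SPLIT_UNALIGNED_STORE */
  bool sse2;                         /* TARGET_SSE2 */
  bool sse_unaligned_store_optimal;  /* TARGET_SSE_UNALIGNED_STORE_OPTIMAL */
  int units_per_word;                /* UNITS_PER_WORD */
};

/* Return the largest store size in bytes, following the same order of
   tests as the patched macro.  */
int
store_max_pieces (const struct x86_tuning *t)
{
  /* New outer guard: without inter-unit moves, never pick a vector size.  */
  if (!t->inter_unit_moves_to_vec)
    return t->units_per_word;

  if (t->avx512f && !t->prefer_avx256)
    return 64;

  if (t->avx && !t->prefer_avx128 && !t->avx256_split_unaligned_store)
    return 32;

  if (t->sse2 && t->sse_unaligned_store_optimal)
    return 16;

  return t->units_per_word;
}
```

In this sketch, a configuration with sse2 and sse_unaligned_store_optimal set but inter_unit_moves_to_vec clear returns units_per_word, whereas the macro before the patch would have selected 16 for the same predicates.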