From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH v4 12/12] constructor: Check if it is faster to load constant from memory
To: "H.J. Lu", Richard Biener
Cc: GCC Patches, Richard Sandiford, Uros Bizjak
References: <20210518191646.3071005-1-hjl.tools@gmail.com> <20210518191646.3071005-13-hjl.tools@gmail.com>
From: Bernd Edlinger
Date: Wed, 19 May 2021 15:27:55 +0200
X-Microsoft-Original-Message-ID: <3f1d5b0c-b0ea-fcc3-2abc-948c783bafb5@hotmail.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
List-Id: Gcc-patches mailing list

On
5/19/21 3:22 PM, H.J. Lu wrote:
> On Wed, May 19, 2021 at 2:33 AM Richard Biener wrote:
>>
>> On Tue, May 18, 2021 at 9:16 PM H.J. Lu wrote:
>>>
>>> When expanding a constant constructor, don't call expand_constructor if
>>> it is more efficient to load the data from the memory via move by pieces.
>>>
>>> gcc/
>>>
>>>         PR middle-end/90773
>>>         * expr.c (expand_expr_real_1): Don't call expand_constructor if
>>>         it is more efficient to load the data from the memory.
>>>
>>> gcc/testsuite/
>>>
>>>         PR middle-end/90773
>>>         * gcc.target/i386/pr90773-24.c: New test.
>>>         * gcc.target/i386/pr90773-25.c: Likewise.
>>> ---
>>>  gcc/expr.c                                 | 10 ++++++++++
>>>  gcc/testsuite/gcc.target/i386/pr90773-24.c | 22 ++++++++++++++++++++++
>>>  gcc/testsuite/gcc.target/i386/pr90773-25.c | 20 ++++++++++++++++++++
>>>  3 files changed, 52 insertions(+)
>>>  create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-24.c
>>>  create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-25.c
>>>
>>> diff --git a/gcc/expr.c b/gcc/expr.c
>>> index d09ee42e262..80e01ea1cbe 100644
>>> --- a/gcc/expr.c
>>> +++ b/gcc/expr.c
>>> @@ -10886,6 +10886,16 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>>>             unsigned HOST_WIDE_INT ix;
>>>             tree field, value;
>>>
>>> +           /* Check if it is more efficient to load the data from
>>> +              the memory directly.  FIXME: How many stores do we
>>> +              need here if not moved by pieces?  */
>>> +           unsigned HOST_WIDE_INT bytes
>>> +             = tree_to_uhwi (TYPE_SIZE_UNIT (type));
>>
>> that's prone to fail - it could be a VLA.
>
> What do you mean by fail?  Is it ICE or missed optimization?
> Do you have a testcase?
>

I think for a VLA the TYPE_SIZE_UNIT may be unknown (NULL), or something
like "x".

For instance, something like:

int test (int x)
{
  int vla[x];

  vla[x-1] = 0;
  return vla[x-1];
}


Bernd.

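To make the VLA concern concrete outside of GCC, here is a minimal
stand-alone C sketch.  The function name vla_bytes is hypothetical and not
part of the patch.  It only shows that the size of a variable-length array
is computed at run time, so there is no compile-time byte count to extract;
it does not show what GCC actually stores in TYPE_SIZE_UNIT for such a type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the patch: return the size in bytes
   of a VLA holding n ints.  */
size_t
vla_bytes (int n)
{
  assert (n > 0);       /* a VLA needs a positive length */
  int vla[n];

  vla[n - 1] = 0;       /* touch the array, as in the test function above */
  return sizeof vla;    /* sizeof a VLA is evaluated at run time */
}
```

Calling vla_bytes with different arguments yields different sizes from the
same declaration, which is why a check that assumes a constant size would
presumably need a guard before converting the size to an integer.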
>>
>>> +           if ((bytes / UNITS_PER_WORD) > 2
>>> +               && MOVE_MAX_PIECES > UNITS_PER_WORD
>>> +               && can_move_by_pieces (bytes, TYPE_ALIGN (type)))
>>> +             goto normal_inner_ref;
>>> +
>>
>> It looks like you're concerned about aggregate copies but this also handles
>> non-aggregates (which on GIMPLE might already be optimized of course).
>
> Here I check if we copy more than 2 words and we can move more than
> a word in a single instruction.
>
>> Also you say "if it's cheaper" but I see no cost considerations.  How do
>> we generally handle immed const vs. load from constant pool costs?
>
> This trades 2 (update to 8) stores with one load plus one store.  Is there
> a way to check which one is faster?
>
>>>             FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (init), ix,
>>>                                       field, value)
>>>               if (tree_int_cst_equal (field, index))
>>> diff --git a/gcc/testsuite/gcc.target/i386/pr90773-24.c b/gcc/testsuite/gcc.target/i386/pr90773-24.c
>>> new file mode 100644
>>> index 00000000000..4a4b62533dc
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/i386/pr90773-24.c
>>> @@ -0,0 +1,22 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2 -march=x86-64" } */
>>> +
>>> +struct S
>>> +{
>>> +  long long s1 __attribute__ ((aligned (8)));
>>> +  unsigned s2, s3, s4, s5, s6, s7, s8, s9, s10, s11, s12, s13, s14;
>>> +};
>>> +
>>> +const struct S array[] = {
>>> +  { 0, 60, 640, 2112543726, 39682, 48, 16, 33, 10, 96, 2, 0, 0, 4 }
>>> +};
>>> +
>>> +void
>>> +foo (struct S *x)
>>> +{
>>> +  x[0] = array[0];
>>> +}
>>> +/* { dg-final { scan-assembler-times "movups\[\\t \]%xmm\[0-9\]+, \\(%\[\^,\]+\\)" 1 } } */
>>> +/* { dg-final { scan-assembler-times "movups\[\\t \]%xmm\[0-9\]+, 16\\(%\[\^,\]+\\)" 1 } } */
>>> +/* { dg-final { scan-assembler-times "movups\[\\t \]%xmm\[0-9\]+, 32\\(%\[\^,\]+\\)" 1 } } */
>>> +/* { dg-final { scan-assembler-times "movups\[\\t \]%xmm\[0-9\]+, 48\\(%\[\^,\]+\\)" 1 } } */
>>> diff --git a/gcc/testsuite/gcc.target/i386/pr90773-25.c b/gcc/testsuite/gcc.target/i386/pr90773-25.c
>>> new file mode 100644
>>> index 00000000000..2520b670989
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/i386/pr90773-25.c
>>> @@ -0,0 +1,20 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2 -march=skylake" } */
>>> +
>>> +struct S
>>> +{
>>> +  long long s1 __attribute__ ((aligned (8)));
>>> +  unsigned s2, s3, s4, s5, s6, s7, s8, s9, s10, s11, s12, s13, s14;
>>> +};
>>> +
>>> +const struct S array[] = {
>>> +  { 0, 60, 640, 2112543726, 39682, 48, 16, 33, 10, 96, 2, 0, 0, 4 }
>>> +};
>>> +
>>> +void
>>> +foo (struct S *x)
>>> +{
>>> +  x[0] = array[0];
>>> +}
>>> +/* { dg-final { scan-assembler-times "vmovdqu\[\\t \]%ymm\[0-9\]+, \\(%\[\^,\]+\\)" 1 } } */
>>> +/* { dg-final { scan-assembler-times "vmovdqu\[\\t \]%ymm\[0-9\]+, 32\\(%\[\^,\]+\\)" 1 } } */
>>> --
>>> 2.31.1
>>>
>
>
>
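
The size check quoted in the patch can be modelled outside GCC as a small
stand-alone C function.  This is only a sketch: the function name and all
of its parameters are hypothetical stand-ins for UNITS_PER_WORD,
MOVE_MAX_PIECES and the result of can_move_by_pieces, and it says nothing
about which of the two code sequences is actually faster.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the quoted condition: prefer loading the
   constant from memory when it spans more than two words, the target
   can move more than one word per instruction, and a move by pieces
   is possible at all.  */
bool
prefer_load_from_memory (size_t bytes, size_t units_per_word,
                         size_t move_max_pieces, bool can_move)
{
  assert (units_per_word > 0);  /* guard the division below */
  return bytes / units_per_word > 2
         && move_max_pieces > units_per_word
         && can_move;
}
```

Assuming 8-byte words and a 16-byte maximum piece, a 64-byte object such as
struct S in the quoted tests (8 + 13 * 4 = 60 bytes, padded to 64 by the
8-byte alignment) satisfies the check, while a 16-byte object does not,
since 16 / 8 is not greater than 2.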