From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx2.tachyum.com (mx2.tachyum.com [50.229.46.110]) by sourceware.org (Postfix) with ESMTPS id D14103858C2C for ; Fri, 9 Jul 2021 02:39:17 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org D14103858C2C Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=tachyum.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=tachyum.com Received: by mx2.tachyum.com (Postfix, from userid 1000) id 4FF9710055F2; Thu, 8 Jul 2021 19:39:17 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-Spam-Level: X-Spam-Status: No, score=-9.1 required=5.0 tests=BAYES_00, BODY_8BITS, GIT_PATCH_0, KAM_DMARC_STATUS, KAM_NUMSUBJECT, NICE_REPLY_A, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 Received: from THQ-EX1.tachyum.com (thq-ex1.tachyum.com [10.7.1.6]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mx2.tachyum.com (Postfix) with ESMTPS id 6792010055F1 for ; Thu, 8 Jul 2021 19:39:16 -0700 (PDT) Received: from [10.0.96.2] (10.0.96.2) by THQ-EX1.tachyum.com (10.7.1.6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.14; Thu, 8 Jul 2021 19:39:15 -0700 Subject: Re: [RFA] Attach MEM_EXPR information when flushing BLKmode args to the stack - V2 From: Jeff Law To: GCC Patches References: <45e3b746-c762-cd92-6429-78af069bac9c@tachyum.com> Message-ID: Date: Thu, 8 Jul 2021 20:39:13 -0600 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: <45e3b746-c762-cd92-6429-78af069bac9c@tachyum.com> Content-Type: multipart/mixed; boundary="------------E70D21C894DBDDEF49AA7EFD" Content-Language: en-US X-Originating-IP: [10.0.96.2] X-ClientProxiedBy: THQ-EX3.tachyum.com (10.7.1.26) To THQ-EX1.tachyum.com (10.7.1.6) X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 09 Jul 2021 02:39:19 -0000 --------------E70D21C894DBDDEF49AA7EFD Content-Type: text/plain; charset="utf-8"; format=flowed Content-Transfer-Encoding: 8bit On 7/2/2021 10:13 AM, Jeff Law wrote: > > This is a minor missed optimization we found with our internal port. > > Given this code: > > typedef struct {short a; short b;} T; > > extern void g1(); > > void f(T s) > { >         if (s.a < 0) >                 g1(); > } > > > "s" is passed in a register, but it's still a BLKmode object because > the alignment of T is smaller than the alignment that an integer of > the same size would have (16 bits vs 32 bits). > > > Because "s" is BLKmode function.c is going to store it into a stack > slot and we'll load it from the that slot for each reference.  So on > the v850 (just to pick a port that likely has the same behavior we're > seeing) we have this RTL from CSE2: > > > (insn 2 4 3 2 (set (mem/c:SI (plus:SI (reg/f:SI 34 .fp) >                 (const_int -4 [0xfffffffffffffffc])) [2 S4 A32]) >         (reg:SI 6 r6)) "j.c":6:1 7 {*movsi_internal} >      (expr_list:REG_DEAD (reg:SI 6 r6) >         (nil))) > (note 3 2 8 2 NOTE_INSN_FUNCTION_BEG) > (insn 8 3 9 2 (set (reg:HI 44 [ s.a ]) >         (mem/c:HI (plus:SI (reg/f:SI 34 .fp) >                 (const_int -4 [0xfffffffffffffffc])) [1 s.a+0 S2 > A32])) "j.c":7:5 3 {*movhi_internal} >      (nil)) > (insn 9 8 10 2 (parallel [ >             (set (reg:SI 45) >                 (ashift:SI (subreg:SI (reg:HI 44 [ s.a ]) 0) >                     (const_int 16 [0x10]))) >             (clobber (reg:CC 32 psw)) >         ]) "j.c":7:5 94 {ashlsi3_clobber_flags} >      (expr_list:REG_DEAD (reg:HI 44 [ s.a ]) >         (expr_list:REG_UNUSED (reg:CC 32 psw) >             (nil)))) > (insn 10 9 11 2 (parallel [ >             (set (reg:SI 43) >                 (ashiftrt:SI (reg:SI 45) >                     (const_int 16 [0x10]))) >             (clobber (reg:CC 32 psw)) >         ]) "j.c":7:5 104 {ashrsi3_clobber_flags} >      (expr_list:REG_DEAD (reg:SI 45) >         (expr_list:REG_UNUSED (reg:CC 32 psw) >             (nil)))) > > > Insn 2 is the store into the stack. insn 8 is the load for s.a in the > conditional.  DSE1 replaces the MEM in insn 8 with (reg 6) since (reg > 6) has the value we want.  After that the store at insn 2 is dead.  > Sadly DSE never removes the store. > > The problem is RTL DSE considers a store with no MEM_EXPR as escaping, > which keeps the MEM live.  The lack of a MEM_EXPR is due to call to > change_address to twiddle the mode on the MEM for the store at insn > 2.  It should be safe to copy the MEM_EXPR (which should always be a > PARM_DECL) from the original memory to the memory returned by > change_address.  Doing so results in DSE1 removing the store at insn 2. > > It would be nice to remove the stack setup/teardown.   I'm not offhand > aware of mechanisms to remove the setup/teardown after we've already > allocated a slot, even if the slot is no longer used. > > Bootstrapped and regression tested on x86, though I don't think that's > a particularly useful test.  So I also ran it through my tester across > those pesky embedded targets without regressions as well. > > I didn't include a test simply because I didn't want to have an insane > target selector.  I guess if we really wanted a test we could look > after DSE1 is done and verify there aren't any MEMs left at all.  > Willing to try that if the consensus is we want this tested. > > OK for the trunk? So Richi questioned if using adjust_address rather than change_address was sufficient to solve the problem.  It is.  I've bootstrapped & regression tested on x86_64 and regression tested this against the usual set of crosses in the tester. OK for the trunk now? JEff --------------E70D21C894DBDDEF49AA7EFD Content-Type: text/plain; charset="UTF-8"; name="P" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="P" CSogZnVuY3Rpb24uYyAoYXNzaWduX3Bhcm1fc2V0dXBfYmxvY2spOiBVc2UgYWRqdXN0X2Fk ZHJlc3MgaW5zdGVhZAoJb2YgY2hhbmdlX2FkZHJlc3MgdG8gcHJlc2VydmUgTUVNX0VYUFIg YW5kIGZyaWVuZHMuCgpkaWZmIC0tZ2l0IGEvZ2NjL2Z1bmN0aW9uLmMgYi9nY2MvZnVuY3Rp b24uYwppbmRleCAwMGIyZmU3MGM3ZC4uYWYzZDU3YjMyYTMgMTAwNjQ0Ci0tLSBhL2djYy9m dW5jdGlvbi5jCisrKyBiL2djYy9mdW5jdGlvbi5jCkBAIC0zMDM2LDcgKzMwMzYsMTUgQEAg YXNzaWduX3Bhcm1fc2V0dXBfYmxvY2sgKHN0cnVjdCBhc3NpZ25fcGFybV9kYXRhX2FsbCAq YWxsLAogCQkgIHJlZyA9IGdlbl9ydHhfUkVHICh3b3JkX21vZGUsIFJFR05PIChlbnRyeV9w YXJtKSk7CiAJCSAgcmVnID0gY29udmVydF90b19tb2RlIChtb2RlLCBjb3B5X3RvX3JlZyAo cmVnKSwgMSk7CiAJCX0KLQkgICAgICBlbWl0X21vdmVfaW5zbiAoY2hhbmdlX2FkZHJlc3Mg KG1lbSwgbW9kZSwgMCksIHJlZyk7CisKKwkgICAgICAvKiBXZSB1c2UgYWRqdXN0X2FkZHJl c3MgdG8gZ2V0IGEgbmV3IE1FTSB3aXRoIHRoZSBtb2RlCisJCSBjaGFuZ2VkLiAgYWRqdXN0 X2FkZHJlc3MgaXMgYmV0dGVyIHRoYW4gY2hhbmdlX2FkZHJlc3MKKwkJIGZvciB0aGlzIHB1 cnBvc2UgYmVjYXVzZSBhZGp1c3RfYWRkcmVzcyBkb2VzIG5vdCBsb3NlCisJCSB0aGUgTUVN X0VYUFIgYXNzb2NpYXRlZCB3aXRoIHRoZSBNRU0uCisKKwkJIElmIHRoZSBNRU1fRVhQUiBp cyBsb3N0LCB0aGVuIG9wdGltaXphdGlvbnMgbGlrZSBEU0UKKwkJIGFzc3VtZSB0aGUgTUVN IGVzY2FwZXMgYW5kIHRodXMgaXMgbm90IHN1YmplY3QgdG8gRFNFLiAgKi8KKwkgICAgICBl bWl0X21vdmVfaW5zbiAoYWRqdXN0X2FkZHJlc3MgKG1lbSwgbW9kZSwgMCksIHJlZyk7CiAJ ICAgIH0KIAogI2lmZGVmIEJMT0NLX1JFR19QQURESU5HCg== --------------E70D21C894DBDDEF49AA7EFD--