From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx2.tachyum.com (mx2.tachyum.com [50.229.46.110]) by sourceware.org (Postfix) with ESMTPS id BE8683857808 for ; Fri, 2 Jul 2021 16:13:14 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org BE8683857808 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=tachyum.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=tachyum.com Received: by mx2.tachyum.com (Postfix, from userid 1000) id 4E0EA10055F4; Fri, 2 Jul 2021 09:13:14 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-Spam-Level: X-Spam-Status: No, score=-9.2 required=5.0 tests=BAYES_00, BODY_8BITS, GIT_PATCH_0, KAM_DMARC_STATUS, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 Received: from THQ-EX1.tachyum.com (thq-ex1.tachyum.com [10.7.1.6]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mx2.tachyum.com (Postfix) with ESMTPS id 91C7810055EA for ; Fri, 2 Jul 2021 09:13:13 -0700 (PDT) Received: from [10.0.96.2] (10.0.96.2) by THQ-EX1.tachyum.com (10.7.1.6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.14; Fri, 2 Jul 2021 09:13:12 -0700 To: GCC Patches From: Jeff Law Subject: [RFA] Attach MEM_EXPR information when flushing BLKmode args to the stack Message-ID: <45e3b746-c762-cd92-6429-78af069bac9c@tachyum.com> Date: Fri, 2 Jul 2021 10:13:11 -0600 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="------------C43673FD911D0FA2FC0ED47A" Content-Language: en-US X-Originating-IP: [10.0.96.2] X-ClientProxiedBy: THQ-EX3.tachyum.com (10.7.1.26) To THQ-EX1.tachyum.com (10.7.1.6) X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Jul 2021 16:13:16 -0000 --------------C43673FD911D0FA2FC0ED47A Content-Type: text/plain; charset="utf-8"; format=flowed Content-Transfer-Encoding: 8bit This is a minor missed optimization we found with our internal port. Given this code: typedef struct {short a; short b;} T; extern void g1(); void f(T s) {         if (s.a < 0)                 g1(); } "s" is passed in a register, but it's still a BLKmode object because the alignment of T is smaller than the alignment that an integer of the same size would have (16 bits vs 32 bits). Because "s" is BLKmode function.c is going to store it into a stack slot and we'll load it from the that slot for each reference.  So on the v850 (just to pick a port that likely has the same behavior we're seeing) we have this RTL from CSE2: (insn 2 4 3 2 (set (mem/c:SI (plus:SI (reg/f:SI 34 .fp)                 (const_int -4 [0xfffffffffffffffc])) [2 S4 A32])         (reg:SI 6 r6)) "j.c":6:1 7 {*movsi_internal}      (expr_list:REG_DEAD (reg:SI 6 r6)         (nil))) (note 3 2 8 2 NOTE_INSN_FUNCTION_BEG) (insn 8 3 9 2 (set (reg:HI 44 [ s.a ])         (mem/c:HI (plus:SI (reg/f:SI 34 .fp)                 (const_int -4 [0xfffffffffffffffc])) [1 s.a+0 S2 A32])) "j.c":7:5 3 {*movhi_internal}      (nil)) (insn 9 8 10 2 (parallel [             (set (reg:SI 45)                 (ashift:SI (subreg:SI (reg:HI 44 [ s.a ]) 0)                     (const_int 16 [0x10])))             (clobber (reg:CC 32 psw))         ]) "j.c":7:5 94 {ashlsi3_clobber_flags}      (expr_list:REG_DEAD (reg:HI 44 [ s.a ])         (expr_list:REG_UNUSED (reg:CC 32 psw)             (nil)))) (insn 10 9 11 2 (parallel [             (set (reg:SI 43)                 (ashiftrt:SI (reg:SI 45)                     (const_int 16 [0x10])))             (clobber (reg:CC 32 psw))         ]) "j.c":7:5 104 {ashrsi3_clobber_flags}      (expr_list:REG_DEAD (reg:SI 45)         (expr_list:REG_UNUSED (reg:CC 32 psw)             (nil)))) Insn 2 is the store into the stack. insn 8 is the load for s.a in the conditional.  DSE1 replaces the MEM in insn 8 with (reg 6) since (reg 6) has the value we want.  After that the store at insn 2 is dead.  Sadly DSE never removes the store. The problem is RTL DSE considers a store with no MEM_EXPR as escaping, which keeps the MEM live.  The lack of a MEM_EXPR is due to call to change_address to twiddle the mode on the MEM for the store at insn 2.  It should be safe to copy the MEM_EXPR (which should always be a PARM_DECL) from the original memory to the memory returned by change_address.  Doing so results in DSE1 removing the store at insn 2. It would be nice to remove the stack setup/teardown.   I'm not offhand aware of mechanisms to remove the setup/teardown after we've already allocated a slot, even if the slot is no longer used. Bootstrapped and regression tested on x86, though I don't think that's a particularly useful test.  So I also ran it through my tester across those pesky embedded targets without regressions as well. I didn't include a test simply because I didn't want to have an insane target selector.  I guess if we really wanted a test we could look after DSE1 is done and verify there aren't any MEMs left at all.  Willing to try that if the consensus is we want this tested. OK for the trunk? Jeff --------------C43673FD911D0FA2FC0ED47A Content-Type: text/plain; charset="UTF-8"; name="dse.patch" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="dse.patch" ZGlmZiAtLWdpdCBhL2djYy9mdW5jdGlvbi5jIGIvZ2NjL2Z1bmN0aW9uLmMKaW5kZXggMDBi MmZlNzBjN2QuLmU0NzkzMWEyNjk1IDEwMDY0NAotLS0gYS9nY2MvZnVuY3Rpb24uYworKysg Yi9nY2MvZnVuY3Rpb24uYwpAQCAtMzAzNiw3ICszMDM2LDIzIEBAIGFzc2lnbl9wYXJtX3Nl dHVwX2Jsb2NrIChzdHJ1Y3QgYXNzaWduX3Bhcm1fZGF0YV9hbGwgKmFsbCwKIAkJICByZWcg PSBnZW5fcnR4X1JFRyAod29yZF9tb2RlLCBSRUdOTyAoZW50cnlfcGFybSkpOwogCQkgIHJl ZyA9IGNvbnZlcnRfdG9fbW9kZSAobW9kZSwgY29weV90b19yZWcgKHJlZyksIDEpOwogCQl9 Ci0JICAgICAgZW1pdF9tb3ZlX2luc24gKGNoYW5nZV9hZGRyZXNzIChtZW0sIG1vZGUsIDAp LCByZWcpOworCisJICAgICAgLyogV2UgdXNlIGNoYW5nZV9hZGRyZXNzIHRvIGdldCBhIG5l dyBNRU0gd2l0aCB0aGUgbW9kZQorCQkgY2hhbmdlZC4gIEJ1dCB0aGF0IG5ldyBNRU0gd2ls bCBub3QgaGF2ZSBhIGZ1bGwgc2V0CisJCSBvZiBhdHRyaWJ1dGVzIHN1Y2ggYXMgTUVNX0VY UFIuCisKKwkJIFRoaXMgY2FuIGluaGliaXQgbGF0ZXIgb3B0aW1pemF0aW9ucyBsaWtlIERT RSB3aGljaAorCQkgYXNzdW1lcyBpZiB0aGVyZSBpcyBubyBNRU1fRVhQUiwgdGhlbiB0aGUg bWVtb3J5CisJCSBtdXN0IGVzY2FwZSB0aGUgY3VycmVudCBmdW5jdGlvbi4KKworCQkgVGhl IE1FTV9FWFBSIG9uIHRoZSBvcmlnaW5hbCBNRU0gZXhwcmVzc2lvbiBzaG91bGQgYmUKKwkJ IHRoZSBQQVJNX0RFQ0wgd2hpY2ggc2hvdWxkIGJlIHNhZmUgdG8gYXR0YWNoIHRvIHRoZQor CQkgbmV3IE1FTS4gICovCisJICAgICAgZ2NjX2Fzc2VydCAoTUVNX0VYUFIgKG1lbSkpOwor CSAgICAgIGdjY19hc3NlcnQgKFRSRUVfQ09ERSAoTUVNX0VYUFIgKG1lbSkpID09IFBBUk1f REVDTCk7CisJICAgICAgcnR4IG5ld19tZW0gPSBjaGFuZ2VfYWRkcmVzcyAobWVtLCBtb2Rl LCAwKTsKKwkgICAgICBzZXRfbWVtX2F0dHJpYnV0ZXMgKG5ld19tZW0sIE1FTV9FWFBSICht ZW0pLCAxKTsKKwkgICAgICBlbWl0X21vdmVfaW5zbiAobmV3X21lbSwgcmVnKTsKIAkgICAg fQogCiAjaWZkZWYgQkxPQ0tfUkVHX1BBRERJTkcK --------------C43673FD911D0FA2FC0ED47A--