From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pl1-x62e.google.com (mail-pl1-x62e.google.com [IPv6:2607:f8b0:4864:20::62e]) by sourceware.org (Postfix) with ESMTPS id 965F93857C4C for ; Tue, 27 Sep 2022 19:53:59 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 965F93857C4C Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=ventanamicro.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=ventanamicro.com Received: by mail-pl1-x62e.google.com with SMTP id d11so10012451pll.8 for ; Tue, 27 Sep 2022 12:53:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ventanamicro.com; s=google; h=to:subject:from:content-language:user-agent:mime-version:date :message-id:from:to:cc:subject:date; bh=uUUgkGKoWT0oOYZbwuGp3qSoIV/XP3O3RLkATGSuvfI=; b=iBxxXQVrGu9MMu+c1gEG+AK3TJP0txJIIcquN7jzSFa5esAEu0GFAjsqVQ30OKWTXn syfhZQZNEs+pluePpZog3V3wuxBuSKNrmQxQ6FY2tNO9xp/ojPAgHW9aaAs1qGSuUFsa Dorfxhu4T6p04d5apzrHps7GklPi3nVsm7YfBYcXTw9v72NYqYyuHLkx20GTXwzfiBPK FXQhct9qlk13/CZ/1RHDPejd8Cm4rca67vWdTp4A7VKquZuWGZnXpMr2D+Rmw5sgegE0 GfiJ0GxEvfG5BS/f5r0k3/keaKvQDIk+uNfs72KV0ailI8vwtK39Ym1Fsy/8VRtDpRlH N1Lw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=to:subject:from:content-language:user-agent:mime-version:date :message-id:x-gm-message-state:from:to:cc:subject:date; bh=uUUgkGKoWT0oOYZbwuGp3qSoIV/XP3O3RLkATGSuvfI=; b=URMVGyrbPRL1Gyw+i3ifMU0euDOvANtOG5bDg8mYCNpziQr84dLh0E9oQg7viTduV+ 2iahuOGibJavt4fQUR+S2R0AOMktkYNW3VcRb1e9vmPinpV69cDmDwQ9Oh/FhzMJ+MOZ fnFeAfzP9rN++l8faS6Ty4xsFOF6BOzTof+VHEXyGOS7YAZY0n3CciBq83DrAfcgE5e4 QQ61IedAVmOGHFr02cwPGumU7hIkqazEeh5wBT/u9Vth3ez64Xc2pK24E1wXWXKMzSc4 3HkfMNQ8xkodoZyPV/O3nAP1xDndedRTJFcm7S1MKBI7wHHLpstZHG2qBnN3QsOHbeM/ W8fg== X-Gm-Message-State: ACrzQf1P4LBwT0pbNAlczYRKr3ZI9K7pcR7eKEojaphG2e5uVhYeAWnC 2VyZAKT4oOPbS9RXKdy9l4f9DXnDC2Wy5x3/ X-Google-Smtp-Source: 
AMsMyM7AwlrA1q0EZwQCuxQ4POttc7AGui/b3rt88kln6F4HM1pkmgOmUgrHHihoi+5fTeQkNX/eUg== X-Received: by 2002:a17:90b:3ec1:b0:203:5eef:fe1e with SMTP id rm1-20020a17090b3ec100b002035eeffe1emr6161571pjb.143.1664308438288; Tue, 27 Sep 2022 12:53:58 -0700 (PDT) Received: from ?IPV6:2601:681:8600:13d0::f0a? ([2601:681:8600:13d0::f0a]) by smtp.gmail.com with ESMTPSA id v66-20020a622f45000000b00543780ba53asm2168988pfv.124.2022.09.27.12.53.57 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 27 Sep 2022 12:53:57 -0700 (PDT) Content-Type: multipart/mixed; boundary="------------r1ZD09LeaoeSRF06DHdx4zAC" Message-ID: Date: Tue, 27 Sep 2022 13:53:56 -0600 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.2.1 Content-Language: en-US From: Jeff Law Subject: [RFA] Avoid unnecessary load-immediate in coremark To: gcc-patches@gcc.gnu.org X-Spam-Status: No, score=-13.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: This is a multi-part message in MIME format. --------------r1ZD09LeaoeSRF06DHdx4zAC Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit This is another minor improvement to coremark.   I suspect this only improves code size as the load-immediate was likely issuing with the ret statement on multi-issue machines. Basically we're failing to utilize conditional equivalences during the post-reload CSE pass.  So if a particular block is only reached when a certain condition holds (say for example a4 == 0) and the block has an assignment like a4 = 0, we would fail to eliminate the unnecessary assignment. 
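As a source-level illustration of the kind of conditional equivalence involved, here is a minimal sketch loosely modeled on a coremark-style list search. The names `struct node` and `find_key` are hypothetical, and this fragment is not claimed to reproduce the missed optimization on any particular target or compiler version. It only shows a path on which the return register is already known to hold zero.

```c
#include <stddef.h>

/* Hypothetical list node, for illustration only.  */
struct node
{
  struct node *next;
  int key;
};

/* Search LIST for KEY.  The loop has two exits:
     - list == NULL      (nothing found)
     - list->key == key  (match found)
   On the first exit the register holding LIST is already zero, so a
   separate "load 0 into the return register" on that path would be
   redundant.  That is the sort of implicit assignment the message
   describes: the block is only reached when the condition holds.  */
struct node *
find_key (struct node *list, int key)
{
  while (list && list->key != key)
    list = list->next;
  return list;
}
```

Whether a given compiler actually emits the redundant load for this exact fragment depends on the target and on earlier passes; the point is only the shape of the control flow.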
The way this works: as we enter each block in reload_cse_regs_1, we look at the block's predecessors to see if all of them have the same implicit assignment. If they do, we create a dummy insn representing that implicit assignment. Before processing the first real insn, we enter the implicit assignment into the cselib hash tables.

This deferred action is necessary because of CODE_LABEL handling in cselib: when it sees a CODE_LABEL it wipes state. So we have to add the implicit assignment after processing the (optional) CODE_LABEL, but before processing real insns.

Note we have to walk all the block's predecessors to verify they all have the same implicit assignment. That could potentially be expensive, so we limit it to cases where there are only a few predecessors. For reference, on x86_64, 81% of the cases where implicit assignments can be found are for single-predecessor blocks. Cumulatively, 96% have at most two preds, 99.1% at most 3, 99.6% at most 4, 99.8% at most 5, and so on. While there were cases where all 19 preds had the same implicit assignment, capturing those cases just doesn't seem terribly important. I put the clamp at 3 preds. If folks think it's important, I could certainly make that a PARAM.

Bootstrapped and regression tested on x86. Bootstrapped on riscv as well.

OK for the trunk?
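The predecessor check and the deferred insertion described above can be sketched in plain C as a toy model. This is not the GCC code: `edge_equiv`, `implicit_set`, `insn_kind`, `common_implicit_set`, and `equiv_visible_at_first_real_insn` are hypothetical names, and registers and values are modeled as plain integers rather than RTL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PREDS 3   /* the clamp mentioned in the message */

/* Hypothetical model of one incoming edge.  has_equiv is true when the
   predecessor ends in a conditional jump implying reg == value on
   this edge.  */
struct edge_equiv
{
  bool has_equiv;
  int reg;
  long value;
};

struct implicit_set
{
  int reg;
  long value;
};

/* Return true and fill *OUT only if the block has between 1 and
   MAX_PREDS predecessors and every one implies the same equivalence.
   *OUT is left untouched on failure.  */
bool
common_implicit_set (const struct edge_equiv *preds, size_t n_preds,
                     struct implicit_set *out)
{
  assert (out != NULL);
  if (n_preds == 0 || n_preds > MAX_PREDS)
    return false;
  assert (preds != NULL);

  for (size_t i = 0; i < n_preds; i++)
    {
      if (!preds[i].has_equiv)
        return false;
      if (preds[i].reg != preds[0].reg || preds[i].value != preds[0].value)
        return false;
    }

  out->reg = preds[0].reg;
  out->value = preds[0].value;
  return true;
}

enum insn_kind { INSN_LABEL, INSN_REAL };

/* Toy model of the walk over a block's insns.  A label flushes the
   table.  With DEFER set, the equivalence is entered just before the
   first real insn; otherwise it is entered before the walk starts.
   Returns whether the equivalence is in the table when the first real
   insn is reached.  */
bool
equiv_visible_at_first_real_insn (const enum insn_kind *insns, size_t n,
                                  bool defer)
{
  bool in_table = !defer;
  bool pending = defer;

  for (size_t i = 0; i < n; i++)
    {
      if (insns[i] == INSN_LABEL)
        in_table = false;          /* label wipes recorded state */
      else
        {
          if (pending)
            {
              in_table = true;     /* enter the implicit set now */
              pending = false;
            }
          return in_table;
        }
    }
  return false;                    /* block had no real insn */
}
```

In this model, a block that starts with a label keeps the equivalence only when insertion is deferred, which is the reason given for not entering it at block entry.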
Jeff --------------r1ZD09LeaoeSRF06DHdx4zAC Content-Type: text/plain; charset=UTF-8; name="P" Content-Disposition: attachment; filename="P" Content-Transfer-Encoding: base64 Z2NjLwoJKiBwb3N0cmVsb2FkLmNjIChyZWxvYWRfY3NlX3JlZ3NfMSk6IFJlY29yZCBpbXBs aWNpdCBzZXRzIGZyb20KCWNvbmRpdGlvbmFsIGJyYW5jaGVzIGludG8gdGhlIGNzZWxpYiB0 YWJsZXMuCgpnY2MvdGVzdHN1aXRlLwoKCSogZ2NjLnRhcmdldC9yaXNjdi9pbXBsaWN0LXNl dC5jOiBOZXcgdGVzdC4KCgkKZGlmZiAtLWdpdCBhL2djYy9wb3N0cmVsb2FkLmNjIGIvZ2Nj L3Bvc3RyZWxvYWQuY2MKaW5kZXggNDFmNjFkMzI2NDguLjJmMTU1YTIzOWFlIDEwMDY0NAot LS0gYS9nY2MvcG9zdHJlbG9hZC5jYworKysgYi9nY2MvcG9zdHJlbG9hZC5jYwpAQCAtMzMs NiArMzMsNyBAQCBhbG9uZyB3aXRoIEdDQzsgc2VlIHRoZSBmaWxlIENPUFlJTkczLiAgSWYg bm90IHNlZQogI2luY2x1ZGUgImVtaXQtcnRsLmgiCiAjaW5jbHVkZSAicmVjb2cuaCIKIAor I2luY2x1ZGUgImNmZ2hvb2tzLmgiCiAjaW5jbHVkZSAiY2ZncnRsLmgiCiAjaW5jbHVkZSAi Y2ZnYnVpbGQuaCIKICNpbmNsdWRlICJjZmdjbGVhbnVwLmgiCkBAIC0yMjEsMTMgKzIyMiwx MDggQEAgcmVsb2FkX2NzZV9yZWdzXzEgKHZvaWQpCiAgIGluaXRfYWxpYXNfYW5hbHlzaXMg KCk7CiAKICAgRk9SX0VBQ0hfQkJfRk4gKGJiLCBjZnVuKQotICAgIEZPUl9CQl9JTlNOUyAo YmIsIGluc24pCi0gICAgICB7Ci0JaWYgKElOU05fUCAoaW5zbikpCi0JICBjZmdfY2hhbmdl ZCB8PSByZWxvYWRfY3NlX3NpbXBsaWZ5IChpbnNuLCB0ZXN0cmVnKTsKKyAgICB7CisgICAg ICAvKiBJZiBCQiBoYXMgYSBzbWFsbCBudW1iZXIgb2YgcHJlZGVjZXNzb3JzLCBzZWUgaWYg ZWFjaCBvZiB0aGUKKwkgaGFzIHRoZSBzYW1lIGltcGxpY2l0IHNldC4gIElmIHNvLCByZWNv cmQgdGhhdCBpbXBsaWNpdCBzZXQgc28KKwkgdGhhdCB3ZSBjYW4gYWRkIGl0IHRvIHRoZSBj c2VsaWIgdGFibGVzLiAgKi8KKyAgICAgIHJ0eF9pbnNuICppbXBsaWNpdF9zZXQ7CiAKLQlj c2VsaWJfcHJvY2Vzc19pbnNuIChpbnNuKTsKLSAgICAgIH0KKyAgICAgIGltcGxpY2l0X3Nl dCA9IE5VTEw7CisgICAgICBpZiAoRURHRV9DT1VOVCAoYmItPnByZWRzKSA8PSAzKQorCXsK KwkgIGVkZ2UgZTsKKwkgIGVkZ2VfaXRlcmF0b3IgZWk7CisJICBydHggc3JjID0gTlVMTF9S VFg7CisJICBydHggZGVzdCA9IE5VTExfUlRYOworCSAgYm9vbCBmb3VuZCA9IHRydWU7CisK KwkgIC8qIEl0ZXJhdGUgb3ZlciBlYWNoIGluY29taW5nIGVkZ2UgYW5kIHNlZSBpZiB0aGV5 CisJICAgICBhbGwgaGF2ZSB0aGUgc2FtZSBpbXBsaWNpdCBzZXQuICAqLworCSAgRk9SX0VB Q0hfRURHRSAoZSwgZWksIGJiLT5wcmVkcykKKwkgICAgeworCSAgICAgIC8qIElmIHRoZSBw 
cmVkZWNlc3NvciBkb2VzIG5vdCBlbmQgaW4gYSBjb25kaXRpb25hbAorCQkganVtcCwgdGhl biBpdCBkb2VzIG5vdCBoYXZlIGFuIGltcGxpY2l0IHNldC4gICovCisJICAgICAgaWYgKGUt PnNyYyAhPSBFTlRSWV9CTE9DS19QVFJfRk9SX0ZOIChjZnVuKQorCQkgICYmICFibG9ja19l bmRzX3dpdGhfY29uZGp1bXBfcCAoZS0+c3JjKSkKKwkJeworCQkgIGZvdW5kID0gZmFsc2U7 CisJCSAgYnJlYWs7CisJCX0KKworCSAgICAgIC8qIFdlIGtub3cgdGhlIHByZWRlY2Vzc29y IGVuZHMgd2l0aCBhIGNvbmRpdGlvbmFsCisJCSBqdW1wLiAgTm93IGRpZyBpbnRvIHRoZSBh Y3RhbCBmb3JtIG9mIHRoZSBqdW1wCisJCSB0byBwb3RlbnRpYWxseSBleHRyYWN0IGFuIGlt cGxpY2l0IHNldC4gICovCisJICAgICAgcnR4X2luc24gKmNvbmRqdW1wID0gQkJfRU5EIChl LT5zcmMpOworCSAgICAgIGlmIChjb25kanVtcAorCQkgICYmIGFueV9jb25kanVtcF9wIChj b25kanVtcCkKKwkJICAmJiBvbmx5anVtcF9wIChjb25kanVtcCkpCisJCXsKKwkJICAvKiBF eHRyYWN0IHRoZSBjb25kaXRpb24uICAqLworCQkgIHJ0eCBwYXQgPSBQQVRURVJOIChjb25k anVtcCk7CisJCSAgcnR4IGlfdF9lID0gU0VUX1NSQyAocGF0KTsKKwkJICBnY2NfYXNzZXJ0 IChHRVRfQ09ERSAoaV90X2UpID09IElGX1RIRU5fRUxTRSk7CisJCSAgcnR4IGNvbmQgPSBY RVhQIChpX3RfZSwgMCk7CisJCSAgaWYgKChHRVRfQ09ERSAoY29uZCkgPT0gRVEKKwkJICAg ICAgICYmIEdFVF9DT0RFIChYRVhQIChpX3RfZSwgMSkpID09IExBQkVMX1JFRgorCQkgICAg ICAgJiYgWEVYUCAoWEVYUCAoaV90X2UsIDEpLCAwKSA9PSBCQl9IRUFEIChiYikpCisJCSAg ICAgIHx8IChHRVRfQ09ERSAoY29uZCkgPT0gTkUKKwkJCSAgJiYgWEVYUCAoaV90X2UsIDIp ID09IHBjX3J0eAorCQkJICAmJiBlLT5zcmMtPm5leHRfYmIgPT0gYmIpKQorCQkgICAgewor CQkgICAgICAvKiBJZiB0aGlzIGlzIHRoZSBmaXJzdCB0aW1lIHRocm91Z2ggcmVjb3JkCisJ CQkgdGhlIHNvdXJjZSBhbmQgZGVzdGluYXRpb24uICAqLworCQkgICAgICBpZiAoIWRlc3Qp CisJCQl7CisJCQkgIGRlc3QgPSBYRVhQIChjb25kLCAwKTsKKwkJCSAgc3JjID0gWEVYUCAo Y29uZCwgMSk7CisJCQl9CisJCSAgICAgIC8qIElmIHRoaXMgaXMgbm90IHRoZSBmaXJzdCB0 aW1lIHRocm91Z2gsIHRoZW4KKwkJCSB2ZXJpZnkgdGhlIHNvdXJjZSBhbmQgZGVzdGluYXRp b24gbWF0Y2guICAqLworCQkgICAgICBlbHNlIGlmIChkZXN0ID09IFhFWFAgKGNvbmQsIDAp ICYmIHNyYyA9PSBYRVhQIChjb25kLCAxKSkKKwkJCTsKKwkJICAgICAgZWxzZQorCQkJewor CQkJICBmb3VuZCA9IGZhbHNlOworCQkJICBicmVhazsKKwkJCX0KKwkJICAgIH0KKwkJfQor CSAgICAgIGVsc2UKKwkJeworCQkgIGZvdW5kID0gZmFsc2U7CisJCSAgYnJlYWs7CisJCX0K 
KwkgICAgfQorCisJICAvKiBJZiBhbGwgdGhlIGluY29taW5nIGVkZ2VzIGhhZCB0aGUgc2Ft ZSBpbXBsaWNpdAorCSAgICAgc2V0LCB0aGVuIGNyZWF0ZSBhIGR1bW15IGluc24gZm9yIHRo YXQgc2V0LgorCisJICAgICBJdCB3aWxsIGJlIGVudGVyZWQgaW50byB0aGUgY3NlbGliIHRh YmxlcyBiZWZvcmUKKwkgICAgIHdlIHByb2Nlc3MgdGhlIGZpcnN0IHJlYWwgaW5zbiBpbiB0 aGlzIGJsb2NrLiAgKi8KKwkgIGlmIChkZXN0ICYmIGZvdW5kKQorCSAgICBpbXBsaWNpdF9z ZXQgPSBtYWtlX2luc25fcmF3IChnZW5fcnR4X1NFVCAoZGVzdCwgc3JjKSk7CisJfQorCisg ICAgICBGT1JfQkJfSU5TTlMgKGJiLCBpbnNuKQorCXsKKwkgIGlmIChJTlNOX1AgKGluc24p KQorCSAgICB7CisJICAgICAgLyogSWYgd2UgcmVjb3JkZWQgYW4gaW1wbGljaXQgc2V0LCBl bnRlciBpdAorCQkgaW50byB0aGUgdGFibGVzIGJlZm9yZSB0aGUgZmlyc3QgcmVhbCBpbnNu LgorCisJCSBXZSBoYXZlIHRvIGRvIGl0IHRoaXMgd2F5IGJlY2F1c2UgYSBDT0RFX0xBQkVM CisJCSB3aWxsIGZsdXNoIHRoZSBjc2VsaWIgdGFibGVzLiAgKi8KKwkgICAgICBpZiAoaW1w bGljaXRfc2V0KQorCQl7CisJCSAgY3NlbGliX3Byb2Nlc3NfaW5zbiAoaW1wbGljaXRfc2V0 KTsKKwkJICBpbXBsaWNpdF9zZXQgPSBOVUxMOworCQl9CisJICAgICAgY2ZnX2NoYW5nZWQg fD0gcmVsb2FkX2NzZV9zaW1wbGlmeSAoaW5zbiwgdGVzdHJlZyk7CisJICAgIH0KKworCSAg Y3NlbGliX3Byb2Nlc3NfaW5zbiAoaW5zbik7CisJfQorICAgIH0KIAogICAvKiBDbGVhbiB1 cC4gICovCiAgIGVuZF9hbGlhc19hbmFseXNpcyAoKTsKZGlmZiAtLWdpdCBhL2djYy90ZXN0 c3VpdGUvZ2NjLnRhcmdldC9yaXNjdi9pbXBsaWNpdC1zZXQuYyBiL2djYy90ZXN0c3VpdGUv Z2NjLnRhcmdldC9yaXNjdi9pbXBsaWNpdC1zZXQuYwpuZXcgZmlsZSBtb2RlIDEwMDY0NApp bmRleCAwMDAwMDAwMDAwMC4uOTExMDZiYjVkODAKLS0tIC9kZXYvbnVsbAorKysgYi9nY2Mv dGVzdHN1aXRlL2djYy50YXJnZXQvcmlzY3YvaW1wbGljaXQtc2V0LmMKQEAgLTAsMCArMSw0 MCBAQAorLyogeyBkZy1kbyBjb21waWxlIH0gKi8KKy8qIHsgZGctb3B0aW9ucyAiLU8yIC1k cCIgfSAqLworLyogVGhpcyB3YXMgZXh0cmFjdGVkIGZyb20gY29yZW1hcmsuICAqLworCisK K3R5cGVkZWYgc2lnbmVkIHNob3J0IGVlX3MxNjsKK3R5cGVkZWYgc3RydWN0IGxpc3RfZGF0 YV9zCit7CisgICAgZWVfczE2IGRhdGExNjsKKyAgICBlZV9zMTYgaWR4OworfSBsaXN0X2Rh dGE7CisKK3R5cGVkZWYgc3RydWN0IGxpc3RfaGVhZF9zCit7CisgICAgc3RydWN0IGxpc3Rf aGVhZF9zICpuZXh0OworICAgIHN0cnVjdCBsaXN0X2RhdGFfcyAqaW5mbzsKK30gbGlzdF9o ZWFkOworCisKK2xpc3RfaGVhZCAqCitjb3JlX2xpc3RfZmluZChsaXN0X2hlYWQgKmxpc3Qs 
IGxpc3RfZGF0YSAqaW5mbykKK3sKKyAgICBpZiAoaW5mby0+aWR4ID49IDApCisgICAgewor ICAgICAgICB3aGlsZSAobGlzdCAmJiAobGlzdC0+aW5mby0+aWR4ICE9IGluZm8tPmlkeCkp CisgICAgICAgICAgICBsaXN0ID0gbGlzdC0+bmV4dDsKKyAgICAgICAgcmV0dXJuIGxpc3Q7 CisgICAgfQorICAgIGVsc2UKKyAgICB7CisgICAgICAgIHdoaWxlIChsaXN0ICYmICgobGlz dC0+aW5mby0+ZGF0YTE2ICYgMHhmZikgIT0gaW5mby0+ZGF0YTE2KSkKKyAgICAgICAgICAg IGxpc3QgPSBsaXN0LT5uZXh0OworICAgICAgICByZXR1cm4gbGlzdDsKKyAgICB9Cit9CisK Ky8qIFRoZXJlIHdhcyBhbiB1bm5lY2Vzc2FyeSBhc3NpZ25tZW50IHRvIHRoZSByZXR1cm4g dmFsdWUgdW50aWwKKyAgIHJlY2VudGx5LiAgU2NhbiBmb3IgdGhhdCBpbiB0aGUgcmVzdWx0 aW5nIG91dHB1dC4gICovCisvKiB7IGRnLWZpbmFsIHsgc2Nhbi1hc3NlbWJsZXItbm90ICJs aVxcdGEwLDAiIH0gfSAqLworCg== --------------r1ZD09LeaoeSRF06DHdx4zAC--