From mboxrd@z Thu Jan 1 00:00:00 1970
Sender: Nathan Sidwell
Message-ID: <745eb337-facd-244a-bd02-d9b8b1f653a5@acm.org>
Date: Sun, 9 Apr 2023 18:55:13 -0400
MIME-Version: 1.0
To: Nick Clifton , binutils
From: Nathan Sidwell
Subject: bfd: optimize bfd_elf_hash
Content-Type: multipart/mixed; boundary="------------iFDPzraDFSz6IUehfaKYq809"

This is a multi-part message in MIME format.
--------------iFDPzraDFSz6IUehfaKYq809
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

I happened to be poking at llvm's implementation of the sysv hash and
noticed that binutils' version had similar optimization issues. Neither
gcc nor llvm can spot these transformations, and IMHO the if (...)
obscures the data flow anyway.

The bfd_elf_hash loop is taken straight from the sysV document, but it
is poorly optimized. This refactoring removes about 5 x86 insns from
the 15 insn loop.

1) The if (..) is meaningless -- we're xoring with that value, and of
course xor 0 is a nop.
On x86 (at least) we actually compute the xor'd value and then cmov.
Removing the if test removes the cmov.

2) The 'h ^= g' to clear the top 4 bits is not needed, as those 4 bits
will be shifted out in the next iteration. All we need to do is sink
a mask of those 4 bits out of the loop.

3) Anding with 0xf0 after shifting by 24 bits can allow better
encoding on RISC ISAs than masking with '0xf0 << 24' before shifting.
RISC ISAs often require materializing larger constants.

nathan
-- 
Nathan Sidwell
--------------iFDPzraDFSz6IUehfaKYq809
Content-Type: text/x-patch; charset=UTF-8; name="0001-bfd-optimize-bfd_elf_hash.patch"
Content-Disposition: attachment; filename="0001-bfd-optimize-bfd_elf_hash.patch"
Content-Transfer-Encoding: 7bit

From 4fc6668fcbee2945d79b40f51cdcaeac78f08aa3 Mon Sep 17 00:00:00 2001
From: Nathan Sidwell <nathan@acm.org>
Date: Sun, 9 Apr 2023 18:44:07 -0400
Subject: [PATCH] bfd: optimize bfd_elf_hash

The bfd_elf_hash loop is taken straight from the sysV document, but it
is poorly optimized. This refactoring removes about 5 x86 insns from
the 15 insn loop.

1) The if (..) is meaningless -- we're xoring with that value, and of
course xor 0 is a nop. On x86 (at least) we actually compute the xor'd
value and then cmov.  Removing the if test removes the cmov.

2) The 'h ^= g' to clear the top 4 bits is not needed, as those 4 bits
will be shifted out in the next iteration.  All we need to do is sink
a mask of those 4 bits out of the loop.

3) anding with 0xf0 after shifting by 24 bits can allow better
encoding on RISC ISAs than masking with '0xf0 << 24' before shifting.
RISC ISAs often require materializing larger constants.

	bfd/
	* elf.c (bfd_elf_hash): Refactor to optimize loop.
	(bfd_elf_gnu_hash): Refactor to use 32-bit type.
---
 bfd/elf.c | 31 +++++++++++--------------------
 1 file changed, 11 insertions(+), 20 deletions(-)

diff --git a/bfd/elf.c b/bfd/elf.c
index 87ec1623313..bda83469edd 100644
--- a/bfd/elf.c
+++ b/bfd/elf.c
@@ -196,23 +196,15 @@ _bfd_elf_swap_versym_out (bfd *abfd,
 unsigned long
 bfd_elf_hash (const char *namearg)
 {
-  const unsigned char *name = (const unsigned char *) namearg;
-  unsigned long h = 0;
-  unsigned long g;
-  int ch;
+  uint32_t h = 0;
 
-  while ((ch = *name++) != '\0')
+  for (const unsigned char *name = (const unsigned char *) namearg;
+       *name; name++)
     {
-      h = (h << 4) + ch;
-      if ((g = (h & 0xf0000000)) != 0)
-	{
-	  h ^= g >> 24;
-	  /* The ELF ABI says `h &= ~g', but this is equivalent in
-	     this case and on some machines one insn instead of two.  */
-	  h ^= g;
-	}
+      h = (h << 4) + *name;
+      h ^= (h >> 24) & 0xf0;
     }
-  return h & 0xffffffff;
+  return h & 0x0fffffff;
 }
 
 /* DT_GNU_HASH hash function.  Do not change this function; you will
@@ -221,13 +213,12 @@ bfd_elf_hash (const char *namearg)
 unsigned long
 bfd_elf_gnu_hash (const char *namearg)
 {
-  const unsigned char *name = (const unsigned char *) namearg;
-  unsigned long h = 5381;
-  unsigned char ch;
+  uint32_t h = 5381;
 
-  while ((ch = *name++) != '\0')
-    h = (h << 5) + h + ch;
-  return h & 0xffffffff;
+  for (const unsigned char *name = (const unsigned char *) namearg;
+       *name; name++)
+    h = (h << 5) + h + *name;
+  return h;
 }
 
 /* Create a tdata field OBJECT_SIZE bytes in length, zeroed out and with
-- 
2.39.2

--------------iFDPzraDFSz6IUehfaKYq809--
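
Points 1) and 2) of the message claim that the refactored loop computes the same value as the loop from the sysV document. Below is a minimal standalone sketch of that claim, not the code as committed. The function names elf_hash_sysv, elf_hash_refactored and check_elf_hash_variants are hypothetical and exist only for this illustration. Both variants use uint32_t so that their results can be compared directly, whereas the pre-patch bfd code used unsigned long and masked the result on return.

```c
#include <assert.h>
#include <stdint.h>

/* Loop shaped like the one the patch removes: test the top nibble,
   fold it into bits 4-7, then clear it.  */
static uint32_t
elf_hash_sysv (const char *namearg)
{
  const unsigned char *name = (const unsigned char *) namearg;
  uint32_t h = 0;
  uint32_t g;

  while (*name != '\0')
    {
      h = (h << 4) + *name++;
      if ((g = (h & 0xf0000000)) != 0)
        {
          h ^= g >> 24;
          h ^= g;               /* Clears the top 4 bits.  */
        }
    }
  return h;
}

/* Loop shaped like the one the patch adds: no branch, and the top
   nibble is left in place because the next 'h << 4' discards it.
   Only the final value needs the mask.  */
static uint32_t
elf_hash_refactored (const char *namearg)
{
  uint32_t h = 0;

  for (const unsigned char *name = (const unsigned char *) namearg;
       *name; name++)
    {
      h = (h << 4) + *name;
      h ^= (h >> 24) & 0xf0;    /* xor with 0 when the top nibble is clear.  */
    }
  return h & 0x0fffffff;
}

/* Hypothetical self-check: both variants must agree, including on
   names long enough for the top nibble to become non-zero.  */
static void
check_elf_hash_variants (void)
{
  static const char *const names[] = {
    "", "exit", "printf", "syscall",
    "a_rather_long_symbol_name_to_force_overflow"
  };

  for (unsigned i = 0; i < sizeof names / sizeof names[0]; i++)
    assert (elf_hash_sysv (names[i]) == elf_hash_refactored (names[i]));
}
```

Calling check_elf_hash_variants () from a test driver exercises both loops on short names, where the top nibble stays zero, and on longer ones, where it does not. A handful of strings is not a proof of equivalence. The argument in point 2) is what carries the general case.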