From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mo4-p00-ob.smtp.rzone.de (mo4-p00-ob.smtp.rzone.de [85.215.255.23]) by sourceware.org (Postfix) with ESMTPS id B740B385843E for ; Thu, 25 May 2023 14:32:03 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org B740B385843E Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=gjlay.de Authentication-Results: sourceware.org; spf=none smtp.mailfrom=gjlay.de ARC-Seal: i=1; a=rsa-sha256; t=1685025122; cv=none; d=strato.com; s=strato-dkim-0002; b=alFEMiJCbVYlVmJ9RP3Lk+6zXwypRs9yJWSjlHfPcWx7aPds6JCFDkAJB1FEBl7w1x ArQ5X0VKGI4b1TA+3XqRfFfD/myi/DugAzVXWJ7t3e9hxqR/wuBh/NCzuVQRuCe9wafX aoRuYUs1jb1CjJxaUXowgaNcbPWhbfip+YLo+l/TADUs3Cs+4FYOSBU1prc9CtgrG+0D TObb8O24cFSGOMud+L0l3S5GIlNnfcpvbwUgYQX7DDy0nq0LRy7alJ45pI2BX65WLutr wan/cQ4FUb+ekZ7VcsMXGk+drPNyIZNeC7OhC3hhmgsEhChfUKLrJZESFPmai27VIJ8N +wsg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; t=1685025122; s=strato-dkim-0002; d=strato.com; h=In-Reply-To:From:References:To:Subject:Date:Message-ID:Cc:Date:From: Subject:Sender; bh=iukrtmB6plDtAKUResAXdxM9mb6LqKnhqMiUOGbU1Fk=; b=liq0YvA1WmpETong8e+tcIRxnScefsTo4JWsysm/XPU5bzVunBsQRL3RWNiY2Xy5de 5I72UorN7Fs89IPZUi5RQDpe8zjtgq+Jv4x2+ZIqAT4n4PdHdJWVMM8TK13MtojmtD/L wjd+sQ1WUlJBCp7797u0k+7gJDacxwgnM/bSDGSDqADGkJfGMduMLA9yIo7BVhwF0Qxk OoY5IqR+KeA6EgASL/dc41T1FPi7owy225xZ+9qhnSQMHr957VSw9nYIc9T7nGSBVOHS Tp5p1vbFg/31awZuAN7oz5KCeWubt/llE5M3NoItC66yeoL3Zq1s8I/9B8TavLG7SOq6 r8tw== ARC-Authentication-Results: i=1; strato.com; arc=none; dkim=none X-RZG-CLASS-ID: mo00 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; t=1685025122; s=strato-dkim-0002; d=gjlay.de; h=In-Reply-To:From:References:To:Subject:Date:Message-ID:Cc:Date:From: Subject:Sender; bh=iukrtmB6plDtAKUResAXdxM9mb6LqKnhqMiUOGbU1Fk=; b=PghIBrBP3lzbsNkhIT2Q+x3OC/uo31eWDukmBvy/bMBPC/GKRKBdqjOl1vOpWLMF4i T8RvMJqwkRP38zgL2cto61BsXog2ifUOpwPY6WCi1QyD4SF3MywSkqOSAYVjNCWvUdUd ES/1osecY6BqMSwN0cBuO2B+fXI3kxKYS7w3b7ztEcTuouhFs0MDjZvnCStfgXtDUu1W 8ZVAdu+DcKFMbCeDUFk9Zu4znMXk6AYAkCWHPtnOx3RX7h34KAERoeCBQT5pA4C5GAkS j7K1+dIhzYeCjAHO/7HWwPmf5PeDgIALOzkF7QxOgpc5g9VQWBQFXqGysmyOUaGR4AyJ 7ppg== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; t=1685025122; s=strato-dkim-0003; d=gjlay.de; h=In-Reply-To:From:References:To:Subject:Date:Message-ID:Cc:Date:From: Subject:Sender; bh=iukrtmB6plDtAKUResAXdxM9mb6LqKnhqMiUOGbU1Fk=; b=E1g/lxA1U8PBbSct2Crmq1Zz9bGL5MhbXjF5D2m5Y+D7fse8pgBJVD20UMGp04yTGm gXPWs/2hbDvJSHQA28CA== X-RZG-AUTH: ":LXoWVUeid/7A29J/hMvvT3koxZnKT7Qq0xotTetVnKkRmM69o2y+LiO3MutATA==" Received: from [192.168.2.102] by smtp.strato.de (RZmta 49.4.0 DYNA|AUTH) with ESMTPSA id z691f1z4PEW1tr2 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256 bits)) (Client did not present a certificate); Thu, 25 May 2023 16:32:01 +0200 (CEST) Message-ID: <3c38dcd4-1e7a-2552-7df9-7033de7419fd@gjlay.de> Date: Thu, 25 May 2023 16:31:59 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.11.0 Subject: Re: Wrong cost computation / conclusion ins insn combine? Content-Language: en-US To: "Richard Earnshaw (lists)" , gcc@gcc.gnu.org References: <1ed99189-7db3-6b97-db6b-86ce295c075a@gjlay.de> From: Georg-Johann Lay In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,BODY_8BITS,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,NICE_REPLY_A,RCVD_IN_DNSWL_NONE,SPF_HELO_PASS,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Am 24.05.23 um 14:31 schrieb Richard Earnshaw (lists): > On 23/05/2023 19:41, Georg-Johann Lay wrote: >> For some time now I am staring at the following test case and what >> combine does with it: >> >> typedef struct >> { >>      unsigned b0 : 1; >>      unsigned b1 : 1; >>      unsigned b2 : 1; >>      unsigned b3 : 1; >>      unsigned b4 : 1; >>      unsigned b5 : 1; >>      unsigned b6 : 1; >>      unsigned b7 : 1; >> } b_t; >> >> Prior to combine, there is: >> >> insn_cost 4 for    18: r52:QI = r24:QI >> insn_cost 4 for     2: r47:QI = r52:QI >> insn_cost 4 for     6: r48:QI = zero_extract(r47:QI,0x1,0) >> insn_cost 4 for     7: r50:QI = 0x1 >> insn_cost 4 for     8: r49:QI = r48:QI ^ r50:QI >> insn_cost 8 for     9: zero_extract(r47:QI,0x1,0) = r49:QI >> insn_cost 4 for    15: r24:QI = r47:QI >> >> So insn 6 extracts bit 0, insn 8 flips it, and insn 9 inserts >> it as bit 0 again. >> >> Combine then starts looking for combinations, and at some point >> comes up with: >> >> >> Trying 7 -> 9: >>      7: r50:QI = ~r47:QI >>      9: zero_extract(r47:QI,0x1,0) = r50:QI >> Successfully matched this instruction: >> (set (zero_extract:QI (reg/v:QI 47 [ a ]) >>                        (const_int 1 [0x1]) >>                        (const_int 0 [0])) >>       (not:QI (reg/v:QI 47 [ a ]))) >> allowing combination of insns 7 and 9 >> original costs 4 + 8 = 12 >> replacement cost 12 >> deferring deletion of insn with uid = 7. >> modifying insn i3     9: zero_extract(r47:QI,0x1,0)=~r47:QI >> deferring rescan insn with uid = 9. >> >> So the cost is 12 and this insn is accepted. But down the line, >> combine cooks up this: >> >> Trying 2, 9 -> 15: >>      2: r47:QI = r52:QI >>      9: zero_extract(r47:QI,0x1,0) = ~r47:QI >>     15: r24:QI = r47:QI >> ... >> Successfully matched this instruction: >> (set (reg/i:QI 24 r24) >>       (ior:QI (and:QI (reg:QI 52) >>                       (const_int -2 [0xfffffffffffffffe])) >>               (and:QI (reg/v:QI 47 [ a ]) >>                       (const_int 1 [0x1])))) >> allowing combination of insns 2, 9 and 15 >> original costs 4 + 12 + 4 = 20 >> replacement costs 4 + 12 = 16 >> deferring deletion of insn with uid = 2. >> modifying insn i2     9: r47:QI=~r52:QI >> deferring rescan insn with uid = 9. >> modifying insn i3    15: r24:QI=r52:QI&0xfffffffffffffffe|r47:QI&0x1 >> deferring rescan insn with uid = 15. >> >> So this one has a cost of 16 which is more expensive than the first >> combination. For example it still needs to flip the bit. >> >> So why is combine choosing the expensive replacement over the cheap one? > > Because it thinks it is cheaper.  As your log shows, the calculation is: > > original costs 4 + 12 + 4 = 20 > replacement costs 4 + 12 = 16 > > But the real problem is that two of the instructions in this example are > simple register-register move operations which will likely be eliminated > during register allocation anyway (due to coalescing) and this throws > off the cost calculations. > > I've seen this sort of thing before; perhaps the best solution would be > to override the cost of a simple (register to register) set and give it > a cost of zero.  Then we'd see that this new sequence is worse than the > original. Are you proposing to temporarily patch some cost hook during combine? Combine can't work out allocation costs because it won't look at constraints. And when function need special classes, register alloc has to kick out the hard regs again and re-alloc them... And when constraints like "+r" or "0" are present, combine won't do a great job in computing (spared) register allocation costs. For some insns there is even predicates that disallow combine to drop in hard-regs like combine_pseudo_register_operand, but that' a bit too much for this case. Johann >> >> Also it combines hard-registers in the 2nd case, but not in the >> 1st one.  So the costs get biased towards 2nd. >> >> Can someone explain why combine takes the more expensive solution? >> >> Target is avr, compiled with >> >> $ avr-gcc-14 bits.c -dumpbase "" -S -Os -fdump-rtl-combine-details -dp >> >> Johann > > R.