From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <71f8f116-e3b8-4e70-b30a-a4bc042466a2@gjlay.de>
Date: Fri, 19 Jan 2024 17:05:50 +0100
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [middle-end PATCH] Prefer PLUS over IOR in RTL expansion of multi-word shifts/rotates.
Content-Language: en-US
To: Roger Sayle, gcc-patches@gcc.gnu.org
References: <023501da4a48$320e7540$962b5fc0$@nextmovesoftware.com>
From: Georg-Johann Lay
In-Reply-To: <023501da4a48$320e7540$962b5fc0$@nextmovesoftware.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

On 18.01.24 at 20:54, Roger Sayle wrote:
>
> This patch tweaks RTL expansion of multi-word shifts and rotates to use
> PLUS rather than IOR for disjunctive operations. During expansion of
> these operations, the middle-end creates RTL like (X<<C1) | (Y>>C2)
> where the constants C1 and C2 guarantee that bits don't overlap.
> Hence the IOR can be performed by any any_or_plus operation, such as
> IOR, XOR or PLUS; for word-size operations where carry chains aren't
> an issue these should all be equally fast (single-cycle) instructions.
> The benefit of this change is that targets with shift-and-add insns,
> like x86's lea, can benefit from the LSHIFT-ADD form.
>
> An example of a backend that benefits is ARC, which is demonstrated
> by these two simple functions:

But there are also back-ends where this is bad.

The reason is that with IOR, the back-end only needs to operate on
those sub-words where the sub-mask is non-zero. For PLUS this is not
the case, because the back-end does not know that the intermediate
carries will be zero. Hence, with PLUS, more instructions are needed.
An example is AVR, but probably many more targets with multi-word
operations are adversely affected.

Take for example the case with 2 words and a value of 1.

    LO |= 1
    HI |= 0

can be optimized to

    LO |= 1

but for addition this is not the case:

    LO += 1
    HI +=c 0 ;; The back-end does not know that the carry is always 0.

Johann

>
> unsigned long long foo(unsigned long long x) { return x<<2; }
>
> which with -O2 is currently compiled to:
>
> foo:    lsr     r2,r0,30
>         asl_s   r1,r1,2
>         asl_s   r0,r0,2
>         j_s.d   [blink]
>         or_s    r1,r1,r2
>
> with this patch becomes:
>
> foo:    lsr     r2,r0,30
>         add2    r1,r2,r1
>         j_s.d   [blink]
>         asl_s   r0,r0,2
>
> unsigned long long bar(unsigned long long x) { return (x<<2)|(x>>62); }
>
> which with -O2 is currently compiled to 6 insns + return:
>
> bar:    lsr     r12,r0,30
>         asl_s   r3,r1,2
>         asl_s   r0,r0,2
>         lsr_s   r1,r1,30
>         or_s    r0,r0,r1
>         j_s.d   [blink]
>         or      r1,r12,r3
>
> with this patch becomes 4 insns + return:
>
> bar:    lsr     r3,r1,30
>         lsr     r2,r0,30
>         add2    r1,r2,r1
>         j_s.d   [blink]
>         add2    r0,r3,r0
>
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures. Ok for mainline?
>
>
> 2024-01-18  Roger Sayle
>
> gcc/ChangeLog
>         * expmed.cc (expand_shift_1): Use add_optab instead of ior_optab
>         to generate PLUS instead or IOR when unioning disjoint bitfields.
>         * optabs.cc (expand_subword_shift): Likewise.
>         (expand_binop): Likewise for double-word rotate.
>
>
> Thanks in advance,
> Roger
> --
>
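
For illustration only, the two sides of the argument above can be sketched in plain C. This is not code from GCC or from the patch; all function names are hypothetical, and the 8-bit words in the second half are merely an assumption standing in for a target like AVR. The first pair shows that the word-size combine in a double-word shift gives the same result with IOR and PLUS because the bits are disjoint. The second pair shows the multi-word constant case from the reply: the IOR version touches only the low word, while the PLUS version has to propagate a carry into the high word.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit left shift by c (0 < c < 32) built from two 32-bit words.
   The high word is combined with IOR. */
uint64_t shl64_ior(uint32_t lo, uint32_t hi, unsigned c)
{
    assert(c > 0 && c < 32);
    uint32_t new_hi = (uint32_t)(hi << c) | (lo >> (32 - c));
    return ((uint64_t)new_hi << 32) | (uint32_t)(lo << c);
}

/* Same shift, but the high word is combined with PLUS.  (hi << c) has
   its low c bits clear and (lo >> (32 - c)) uses only those bits, so
   no carry can occur and the sum equals the IOR. */
uint64_t shl64_plus(uint32_t lo, uint32_t hi, unsigned c)
{
    assert(c > 0 && c < 32);
    uint32_t new_hi = (uint32_t)(hi << c) + (lo >> (32 - c));
    return ((uint64_t)new_hi << 32) | (uint32_t)(lo << c);
}

/* 16-bit value held in two 8-bit words, OR-ed with the constant 1.
   HI |= 0 is a no-op, so only the low word is touched. */
uint16_t or1_two_words(uint8_t lo, uint8_t hi)
{
    lo |= 1;
    return (uint16_t)((hi << 8) | lo);
}

/* Same value with the constant 1 added.  Without extra knowledge
   about lo, the carry out of the low word must be added to the
   high word (the "HI +=c 0" step). */
uint16_t add1_two_words(uint8_t lo, uint8_t hi)
{
    uint8_t sum = (uint8_t)(lo + 1);
    uint8_t carry = sum < lo;        /* 1 only if lo was 0xFF */
    hi = (uint8_t)(hi + carry);
    return (uint16_t)((hi << 8) | sum);
}
```

When bit 0 of the low word is known to be clear, or1_two_words and add1_two_words return the same value, but the addition still spends an operation on the high word unless the back-end can prove that the carry is zero.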