From: Richard Biener
Date: Mon, 22 Jan 2024 08:45:47 +0100
Subject: Re: [middle-end PATCH] Prefer PLUS over IOR in RTL expansion of multi-word shifts/rotates.
To: Georg-Johann Lay
Cc: Roger Sayle, gcc-patches@gcc.gnu.org

On Fri, Jan 19, 2024 at 5:06 PM Georg-Johann Lay wrote:
>
> On 18.01.24 at 20:54, Roger Sayle wrote:
> >
> > This patch tweaks RTL expansion of multi-word shifts and rotates to use
> > PLUS rather than IOR for disjunctive operations.  During expansion of
> > these operations, the middle-end creates RTL like (X<<C1) | (X>>C2)
> > where the constants C1 and C2 guarantee that bits don't overlap.
> > Hence the IOR can be performed by any any_or_plus operation, such as
> > IOR, XOR or PLUS; for word-size operations where carry chains aren't
> > an issue these should all be equally fast (single-cycle) instructions.
> > The benefit of this change is that targets with shift-and-add insns,
> > like x86's lea, can benefit from the LSHIFT-ADD form.
> >
> > An example of a backend that benefits is ARC, which is demonstrated
> > by these two simple functions:
>
> But there are also back-ends where this is bad.
>
> The reason is that with IOR, the back-end only needs to operate on
> those sub-words where the sub-mask is non-zero.  But for PLUS this
> is not the case, because the back-end does not know that the
> intermediate carry will be zero.  Hence, with PLUS, more instructions
> are needed.  An example is AVR, but maybe many more targets with
> multi-word operations are affected in a bad way.
>
> Take for example the case with 2 words and a value of 1.
>
> LO |= 1
> HI |= 0
>
> can be optimized to
>
> LO |= 1
>
> but for addition this is not the case:
>
> LO += 1
> HI +=c 0 ;; Does not know that carry is always 0.

I wonder if the PLUS can be done on the lowpart only to make this
detail obvious?

> Johann
>
>
> >
> > unsigned long long foo(unsigned long long x) { return x<<2; }
> >
> > which with -O2 is currently compiled to:
> >
> > foo:    lsr     r2,r0,30
> >         asl_s   r1,r1,2
> >         asl_s   r0,r0,2
> >         j_s.d   [blink]
> >         or_s    r1,r1,r2
> >
> > with this patch becomes:
> >
> > foo:    lsr     r2,r0,30
> >         add2    r1,r2,r1
> >         j_s.d   [blink]
> >         asl_s   r0,r0,2
> >
> > unsigned long long bar(unsigned long long x) { return (x<<2)|(x>>62); }
> >
> > which with -O2 is currently compiled to 6 insns + return:
> >
> > bar:    lsr     r12,r0,30
> >         asl_s   r3,r1,2
> >         asl_s   r0,r0,2
> >         lsr_s   r1,r1,30
> >         or_s    r0,r0,r1
> >         j_s.d   [blink]
> >         or      r1,r12,r3
> >
> > with this patch becomes 4 insns + return:
> >
> > bar:    lsr     r3,r1,30
> >         lsr     r2,r0,30
> >         add2    r1,r2,r1
> >         j_s.d   [blink]
> >         add2    r0,r3,r0
> >
> >
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > and make -k check, both with and without --target_board=unix{-m32}
> > with no new failures.  Ok for mainline?
> >
> >
> > 2024-01-18  Roger Sayle
> >
> > gcc/ChangeLog
> >         * expmed.cc (expand_shift_1): Use add_optab instead of ior_optab
> >         to generate PLUS instead of IOR when unioning disjoint bitfields.
> >         * optabs.cc (expand_subword_shift): Likewise.
> >         (expand_binop): Likewise for double-word rotate.
> >
> >
> > Thanks in advance,
> > Roger
> > --
> >