From: Richard Biener
Date: Thu, 25 Jan 2024 10:20:18 +0100
Subject: Re: [middle-end PATCH] Prefer PLUS over IOR in RTL expansion of multi-word shifts/rotates.
To: Georg-Johann Lay
Cc: Roger Sayle, gcc-patches@gcc.gnu.org

On Wed, Jan 24, 2024 at 4:50 PM Georg-Johann Lay wrote:
>
> On 22.01.24 at 08:45, Richard Biener wrote:
> > On Fri, Jan 19, 2024 at 5:06 PM Georg-Johann Lay wrote:
> >>
> >> On 18.01.24 at 20:54, Roger Sayle wrote:
> >>>
> >>> This patch tweaks RTL expansion of multi-word shifts and rotates to use
> >>> PLUS rather than IOR for disjunctive operations.  During expansion of
> >>> these operations, the middle-end creates RTL like (X<<C1) | (X>>C2)
> >>> where the constants C1 and C2 guarantee that bits don't overlap.
> >>> Hence the IOR can be performed by any any_or_plus operation, such as
> >>> IOR, XOR or PLUS; for word-size operations where carry chains aren't
> >>> an issue these should all be equally fast (single-cycle) instructions.
> >>> The benefit of this change is that targets with shift-and-add insns,
> >>> like x86's lea, can benefit from the LSHIFT-ADD form.
> >>>
> >>> An example of a backend that benefits is ARC, which is demonstrated
> >>> by these two simple functions:
> >>
> >> But there are also back-ends where this is bad.
> >>
> >> The reason is that with ORI, the back-end needs only to operate on
> >> those sub-words where the sub-mask is non-zero.  But for PLUS this
> >> is not the case because the back-end does not know that the intermediate
> >> carry will be zero.  Hence, with PLUS, more instructions are needed.
> >> An example is AVR, but maybe many more targets with multi-word operations
> >> are affected in a bad way.
> >>
> >> Take for example the case with 2 words and a value of 1.
> >>
> >> LO |= 1
> >> HI |= 0
> >>
> >> can be optimized to
> >>
> >> LO |= 1
> >>
> >> but for addition this is not the case:
> >>
> >> LO += 1
> >> HI +=c 0 ;; Does not know that the carry is always 0.
> >
> > I wonder if the PLUS can be done on the lowpart only to make this
> > detail obvious?
>
> For AVR, word_mode is HImode, but the hardware has only 8-bit registers.
>
> Moreover splitting insns is not wanted or not possible (due to CCmode).

Btw, it would be nice to have test coverage on AVR for the cases we're
talking about (if there isn't already).  That makes sure we don't regress
with whatever solution we end up with.

Richard.

> Johann
>
> >>> unsigned long long foo(unsigned long long x) { return x<<2; }
> >>>
> >>> which with -O2 is currently compiled to:
> >>>
> >>> foo:    lsr     r2,r0,30
> >>>         asl_s   r1,r1,2
> >>>         asl_s   r0,r0,2
> >>>         j_s.d   [blink]
> >>>         or_s    r1,r1,r2
> >>>
> >>> with this patch becomes:
> >>>
> >>> foo:    lsr     r2,r0,30
> >>>         add2    r1,r2,r1
> >>>         j_s.d   [blink]
> >>>         asl_s   r0,r0,2
> >>>
> >>> unsigned long long bar(unsigned long long x) { return (x<<2)|(x>>62); }
> >>>
> >>> which with -O2 is currently compiled to 6 insns + return:
> >>>
> >>> bar:    lsr     r12,r0,30
> >>>         asl_s   r3,r1,2
> >>>         asl_s   r0,r0,2
> >>>         lsr_s   r1,r1,30
> >>>         or_s    r0,r0,r1
> >>>         j_s.d   [blink]
> >>>         or      r1,r12,r3
> >>>
> >>> with this patch becomes 4 insns + return:
> >>>
> >>> bar:    lsr     r3,r1,30
> >>>         lsr     r2,r0,30
> >>>         add2    r1,r2,r1
> >>>         j_s.d   [blink]
> >>>         add2    r0,r3,r0
> >>>
> >>>
> >>> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> >>> and make -k check, both with and without --target_board=unix{-m32}
> >>> with no new failures.  Ok for mainline?
> >>>
> >>>
> >>> 2024-01-18  Roger Sayle
> >>>
> >>> gcc/ChangeLog
> >>>         * expmed.cc (expand_shift_1): Use add_optab instead of ior_optab
> >>>         to generate PLUS instead of IOR when unioning disjoint bitfields.
> >>>         * optabs.cc (expand_subword_shift): Likewise.
> >>>         (expand_binop): Likewise for double-word rotate.
> >>>
> >>>
> >>> Thanks in advance,
> >>> Roger
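
For readers following the thread, the disjointness argument in the quoted
patch description can be sketched in plain C. This is only an illustration
under stated assumptions, not code from the patch: the function names
(shl64_hi_ior, shl64_hi_plus, forms_agree) are hypothetical, and a 64-bit
value is modelled as two 32-bit words. With a shift count n where 0 < n < 32,
(hi << n) has its low n bits clear and (lo >> (32 - n)) can only set those
low n bits, so combining them with IOR or with PLUS should give the same
high word.

```c
#include <assert.h>
#include <stdint.h>

/* High word of a 64-bit left shift, built from two 32-bit words and
   combined with IOR.  Hypothetical helper; requires 0 < n < 32.  */
static uint32_t shl64_hi_ior(uint32_t lo, uint32_t hi, unsigned n)
{
    assert(n > 0 && n < 32);
    return (hi << n) | (lo >> (32 - n));
}

/* Same computation, but combining the two disjoint parts with PLUS.
   No carry can occur because the operands share no set bits.  */
static uint32_t shl64_hi_plus(uint32_t lo, uint32_t hi, unsigned n)
{
    assert(n > 0 && n < 32);
    return (hi << n) + (lo >> (32 - n));
}

/* Returns 1 if both forms match the high word of a native 64-bit shift.  */
static int forms_agree(uint64_t x, unsigned n)
{
    uint32_t lo = (uint32_t)x;
    uint32_t hi = (uint32_t)(x >> 32);
    uint32_t ref = (uint32_t)((x << n) >> 32);
    return shl64_hi_ior(lo, hi, n) == ref
        && shl64_hi_plus(lo, hi, n) == ref;
}
```

The sketch only covers the word-size case. It does not model the AVR concern
raised above, where a word-sized PLUS is itself split into 8-bit operations
with an explicit carry chain that the back-end cannot prove to be zero.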