public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <richard.guenther@gmail.com>
To: Georg-Johann Lay <avr@gjlay.de>
Cc: Roger Sayle <roger@nextmovesoftware.com>, gcc-patches@gcc.gnu.org
Subject: Re: [middle-end PATCH] Prefer PLUS over IOR in RTL expansion of multi-word shifts/rotates.
Date: Thu, 25 Jan 2024 10:20:18 +0100	[thread overview]
Message-ID: <CAFiYyc0FJs9jf2w62H-EcDb+-=acdrR=ugbFKVFCN6Z8P5fSdA@mail.gmail.com> (raw)
In-Reply-To: <3ad0a683-e227-4499-bc5a-c08a9b50eb66@gjlay.de>

On Wed, Jan 24, 2024 at 4:50 PM Georg-Johann Lay <avr@gjlay.de> wrote:
>
>
>
> Am 22.01.24 um 08:45 schrieb Richard Biener:
> > On Fri, Jan 19, 2024 at 5:06 PM Georg-Johann Lay <avr@gjlay.de> wrote:
> >>
> >>
> >>
> >> Am 18.01.24 um 20:54 schrieb Roger Sayle:
> >>>
> >>> This patch tweaks RTL expansion of multi-word shifts and rotates to use
> >>> PLUS rather than IOR for disjunctive operations.  During expansion of
> >>> these operations, the middle-end creates RTL like (X<<C1) | (Y>>C2)
> >>> where the constants C1 and C2 guarantee that bits don't overlap.
> >>> Hence the IOR can be performed by any any_or_plus operation, such as
> >>> IOR, XOR or PLUS; for word-size operations where carry chains aren't
> >>> an issue these should all be equally fast (single-cycle) instructions.
> >>> The benefit of this change is that targets with shift-and-add insns,
> >>> like x86's lea, can benefit from the LSHIFT-ADD form.
> >>>
> >>> An example of a backend that benefits is ARC, which is demonstrated
> >>> by these two simple functions:
> >>
> >> But there are also back-ends where this is bad.
> >>
> >> The reason is that with ORI, the back-end needs only to operate no
> >> these sub-words where the sub-mask is non-zero.  But for PLUS this
> >> is not the case because the back-end does not know that intermediate
> >> carry will be zero.  Hence, with PLUS, more instructions are needed.
> >> An example is AVR, but maybe much more target with multi-word operations
> >> are affected in a bad way.
> >>
> >> Take for example the case with 2 words and a value of 1.
> >>
> >> LO |= 1
> >> HI |= 0
> >>
> >> can be optimized to
> >>
> >> LO |= 1
> >>
> >> but for addition this is not the case:
> >>
> >> LO += 1
> >> HI +=c 0 ;; Does not know that always carry = 0.
> >
> > I wonder if the PLUS can be done on the lowpart only to make this
> > detail obvious?
>
> For AVR, word_mode is HImode, but the hardware has only 8-bit registers.
>
> Moreover splitting insns is not wanted or not possible (due to CCmode).

Btw, it would be nice to have test coverage on AVR for the cases we're
talking about (if there isn't already).  That makes sure we don't regress
with whatever solution we end up with.

Richard.

> Johann
>
> >>> unsigned long long foo(unsigned long long x) { return x<<2; }
> >>>
> >>> which with -O2 is currently compiled to:
> >>>
> >>> foo:    lsr     r2,r0,30
> >>>           asl_s   r1,r1,2
> >>>           asl_s   r0,r0,2
> >>>           j_s.d   [blink]
> >>>           or_s    r1,r1,r2
> >>>
> >>> with this patch becomes:
> >>>
> >>> foo:    lsr     r2,r0,30
> >>>           add2    r1,r2,r1
> >>>           j_s.d   [blink]
> >>>           asl_s   r0,r0,2
> >>>
> >>> unsigned long long bar(unsigned long long x) { return (x<<2)|(x>>62); }
> >>>
> >>> which with -O2 is currently compiled to 6 insns + return:
> >>>
> >>> bar:    lsr     r12,r0,30
> >>>           asl_s   r3,r1,2
> >>>           asl_s   r0,r0,2
> >>>           lsr_s   r1,r1,30
> >>>           or_s    r0,r0,r1
> >>>           j_s.d   [blink]
> >>>           or      r1,r12,r3
> >>>
> >>> with this patch becomes 4 insns + return:
> >>>
> >>> bar:    lsr     r3,r1,30
> >>>           lsr     r2,r0,30
> >>>           add2    r1,r2,r1
> >>>           j_s.d   [blink]
> >>>           add2    r0,r3,r0
> >>>
> >>>
> >>> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> >>> and make -k check, both with and without --target_board=unix{-m32}
> >>> with no new failures.  Ok for mainline?
> >>>
> >>>
> >>> 2024-01-18  Roger Sayle  <roger@nextmovesoftware.com>
> >>>
> >>> gcc/ChangeLog
> >>>           * expmed.cc (expand_shift_1): Use add_optab instead of ior_optab
> >>>           to generate PLUS instead or IOR when unioning disjoint bitfields.
> >>>           * optabs.cc (expand_subword_shift): Likewise.
> >>>           (expand_binop): Likewise for double-word rotate.
> >>>
> >>>
> >>> Thanks in advance,
> >>> Roger

  reply	other threads:[~2024-01-25  9:21 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-18 19:54 Roger Sayle
2024-01-19 11:03 ` Richard Biener
2024-01-19 13:26   ` Roger Sayle
2024-01-19 13:49     ` Richard Biener
2024-01-19 16:05 ` Georg-Johann Lay
2024-01-19 16:50   ` Jeff Law
2024-01-20  9:31     ` Uros Bizjak
2024-01-22  7:45   ` Richard Biener
2024-01-22 15:51     ` Jeff Law
2024-01-24 15:49     ` Georg-Johann Lay
2024-01-25  9:20       ` Richard Biener [this message]
2024-06-09  1:48 ` Jeff Law

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAFiYyc0FJs9jf2w62H-EcDb+-=acdrR=ugbFKVFCN6Z8P5fSdA@mail.gmail.com' \
    --to=richard.guenther@gmail.com \
    --cc=avr@gjlay.de \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=roger@nextmovesoftware.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).