public inbox for gcc-bugs@sourceware.org
* [Bug target/107949] New: PPC: Unnecessary rlwinm after lbzx
@ 2022-12-02  8:13 jens.seifert at de dot ibm.com
  2022-12-02 10:23 ` [Bug target/107949] " jens.seifert at de dot ibm.com
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: jens.seifert at de dot ibm.com @ 2022-12-02  8:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107949

            Bug ID: 107949
           Summary: PPC: Unnecessary rlwinm after lbzx
           Product: gcc
           Version: 12.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jens.seifert at de dot ibm.com
  Target Milestone: ---

extern unsigned char magic1[256];

unsigned int hash(const unsigned char inp[4])
{
   const unsigned long long INIT = 0x1ULL;
   unsigned long long h1 = INIT;
   h1 = magic1[((unsigned long long)inp[0]) ^ h1];
   h1 = magic1[((unsigned long long)inp[1]) ^ h1];
   h1 = magic1[((unsigned long long)inp[2]) ^ h1];
   h1 = magic1[((unsigned long long)inp[3]) ^ h1];
   return h1;
}

#ifdef __powerpc__
#define lbzx(b,c) ({ unsigned long long r; __asm__("lbzx %0,%1,%2":"=r"(r):"b"(b),"r"(c)); r; })
unsigned int hash2(const unsigned char inp[4])
{
   const unsigned long long INIT = 0x1ULL;
   unsigned long long h1 = INIT;
   h1 = lbzx(magic1, inp[0] ^ h1);
   h1 = lbzx(magic1, inp[1] ^ h1);
   h1 = lbzx(magic1, inp[2] ^ h1);
   h1 = lbzx(magic1, inp[3] ^ h1);
   return h1;
}
#endif

Extra rlwinm instructions get added.

hash(unsigned char const*):
.LCF0:
        addi 2,2,.TOC.-.LCF0@l
        lbz 9,0(3)
        addis 10,2,.LC0@toc@ha
        ld 10,.LC0@toc@l(10)
        lbz 6,1(3)
        lbz 7,2(3)
        lbz 8,3(3)
        xori 9,9,0x1
        lbzx 9,10,9
        xor 9,9,6
        rlwinm 9,9,0,0xff <= not necessary
        lbzx 9,10,9
        xor 9,9,7
        rlwinm 9,9,0,0xff <= not necessary
        lbzx 9,10,9
        xor 9,9,8
        rlwinm 9,9,0,0xff <= not necessary
        lbzx 3,10,9
        blr
        .long 0
        .byte 0,9,0,0,0,0,0,0
hash2(unsigned char const*):
.LCF1:
        addi 2,2,.TOC.-.LCF1@l
        lbz 7,0(3)
        lbz 8,1(3)
        lbz 10,2(3)
        lbz 6,3(3)
        addis 9,2,.LC1@toc@ha
        ld 9,.LC1@toc@l(9)
        xori 7,7,0x1
        lbzx 7,9,7
        xor 8,8,7
        lbzx 8,9,8
        xor 10,10,8
        lbzx 10,9,10
        xor 10,6,10
        lbzx 3,9,10
        rldicl 3,3,0,32
        blr

Tiny sample:
unsigned long long tiny(const unsigned char *inp)
{
  return inp[0] ^ inp[1];
}

tiny(unsigned char const*):
        lbz 9,0(3)
        lbz 10,1(3)
        xor 3,9,10
        rlwinm 3,3,0,0xff
        blr
        .long 0
        .byte 0,9,0,0,0,0,0,0

unsigned long long tiny2(const unsigned char *inp)
{
  unsigned long long a = inp[0];
  unsigned long long b = inp[1];
  return a ^ b;
}

tiny2(unsigned char const*):
        lbz 9,0(3)
        lbz 10,1(3)
        xor 3,9,10
        rlwinm 3,3,0,0xff
        blr
        .long 0
        .byte 0,9,0,0,0,0,0,0

lbz/lbzx produce a value x with 0 <= x < 256. XORing two such values cannot
leave that range, so the rlwinm mask with 0xff is redundant.
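
As a quick sanity check of the range argument, a minimal C sketch (not part of
the original report; the function names are made up for illustration) can
enumerate all byte pairs and confirm that masking the xor result with 0xff
never changes it:

```c
#include <assert.h>

/* Returns 1 if x ^ y has no bits set above bit 7 for every pair of
   byte values x, y in [0, 256), and 0 otherwise. */
static int xor_of_bytes_stays_in_byte_range(void)
{
    for (unsigned int x = 0; x < 256; x++)
        for (unsigned int y = 0; y < 256; y++)
            if (((x ^ y) & ~0xffu) != 0)
                return 0;
    return 1;
}

/* The operation "rlwinm rD,rS,0,0xff" performs, written in C:
   rotate by zero bits, then keep only the low byte. */
static unsigned int mask_low_byte(unsigned int v)
{
    return v & 0xffu;
}
```

Calling xor_of_bytes_stays_in_byte_range() is expected to return 1, which is
the property the optimizer would need to prove in order to drop the mask.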

* [Bug target/107949] PPC: Unnecessary rlwinm after lbzx
From: jens.seifert at de dot ibm.com @ 2022-12-02 10:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107949

--- Comment #1 from Jens Seifert <jens.seifert at de dot ibm.com> ---
hash2 is only provided to show what the code should look like (without rlwinm).

* [Bug rtl-optimization/107949] PPC: Unnecessary rlwinm after lbzx
From: pinskia at gcc dot gnu.org @ 2022-12-02 16:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107949

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2022-12-02
          Component|target                      |rtl-optimization
             Target|powerpc                     |powerpc, aarch64
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
aarch64 has the same issue (with almost the same RTL even)


(insn 6 3 7 2 (set (reg:QI 123 [ *inp_5(D) ])
        (mem:QI (reg/v/f:SI 121 [ inp ]) [0 *inp_5(D)+0 S1 A8])) "/app/example.cpp":4:17 554 {*movqi_internal}
     (nil))
(insn 7 6 8 2 (set (reg:QI 124 [ MEM[(const unsigned char *)inp_5(D) + 1B] ])
        (mem:QI (plus:SI (reg/v/f:SI 121 [ inp ])
                (const_int 1 [0x1])) [0 MEM[(const unsigned char *)inp_5(D) + 1B]+0 S1 A8])) "/app/example.cpp":4:17 554 {*movqi_internal}
     (expr_list:REG_DEAD (reg/v/f:SI 121 [ inp ])
        (nil)))
(insn 8 7 9 2 (set (reg:SI 125)
        (xor:SI (subreg:SI (reg:QI 123 [ *inp_5(D) ]) 0)
            (subreg:SI (reg:QI 124 [ MEM[(const unsigned char *)inp_5(D) + 1B] ]) 0))) "/app/example.cpp":4:17 215 {*boolsi3}
     (expr_list:REG_DEAD (reg:QI 124 [ MEM[(const unsigned char *)inp_5(D) + 1B] ])
        (expr_list:REG_DEAD (reg:QI 123 [ *inp_5(D) ])
            (nil))))
(note 9 8 21 2 NOTE_INSN_DELETED)
(insn 21 9 22 2 (set (reg:SI 3 3)
        (const_int 0 [0])) "/app/example.cpp":5:1 549 {*movsi_internal1}
     (nil))
(insn 22 21 17 2 (set (reg:SI 4 4 [orig:3+4 ] [3])
        (zero_extend:SI (subreg:QI (reg:SI 125) 3))) "/app/example.cpp":5:1 4 {zero_extendqisi2}
     (expr_list:REG_DEAD (reg:SI 125)
        (nil)))

We lose track of the fact that the loaded value was zero-extended (perhaps
because we didn't take into account that LOAD_EXTEND_OP(QImode) is ZERO_EXTEND?).
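
A rough C model of the RTL sequence above may make the point clearer. This is
only an illustration, not GCC code: the helper names are hypothetical, and the
"junk" parameter stands in for upper register bits that a QImode pseudo viewed
through an SImode subreg does not, by itself, say anything about.

```c
#include <assert.h>
#include <stdint.h>

/* Models a byte load on a target that guarantees zero extension
   (the property LOAD_EXTEND_OP (QImode) == ZERO_EXTEND describes). */
static uint32_t load_qi_zero_extended(const unsigned char *p)
{
    return (uint32_t)*p;
}

/* Models the same load when nothing is assumed about the upper bits:
   'junk' supplies arbitrary contents for bits 8..31. */
static uint32_t load_qi_unknown_upper(const unsigned char *p, uint32_t junk)
{
    return (junk & ~UINT32_C(0xff)) | (uint32_t)*p;
}

/* Models insn 8: the SImode xor of the two loaded values. */
static uint32_t xor_si(uint32_t a, uint32_t b)
{
    return a ^ b;
}

/* Models insn 22: zero_extend of the low byte of the xor result. */
static uint32_t zero_extend_qi(uint32_t r)
{
    return r & UINT32_C(0xff);
}
```

With load_qi_zero_extended the final zero_extend_qi is an identity operation,
whereas with load_qi_unknown_upper it is required. The extra rlwinm appears
when the compiler reasons as in the second case even though the hardware
behaves as in the first.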

* [Bug rtl-optimization/107949] PPC: Unnecessary rlwinm after lbzx
From: jens.seifert at de dot ibm.com @ 2022-12-10 12:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107949

--- Comment #3 from Jens Seifert <jens.seifert at de dot ibm.com> ---
*** Bug 108048 has been marked as a duplicate of this bug. ***

* [Bug rtl-optimization/107949] PPC: Unnecessary rlwinm after lbzx
From: segher at gcc dot gnu.org @ 2023-01-08 17:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107949

Segher Boessenkool <segher at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |segher at gcc dot gnu.org

--- Comment #4 from Segher Boessenkool <segher at gcc dot gnu.org> ---
How would GCC know no extension is needed?  The asm template is not parsed
at all, by design.  Making h1 an unsigned char might solve it here?
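
For reference, the suggestion to make h1 an unsigned char would look roughly
like the following when applied to the pure C version. This is a sketch only:
hash_narrow and fill_magic1 are hypothetical names, the table contents are
arbitrary placeholders (the report only declares magic1 as extern), and
whether this actually removes the rlwinm has not been checked here.

```c
#include <assert.h>

/* Placeholder definition; the report only has an extern declaration. */
unsigned char magic1[256];

/* Hypothetical helper: fill the table with an arbitrary pattern. */
static void fill_magic1(void)
{
    for (unsigned int i = 0; i < 256; i++)
        magic1[i] = (unsigned char)(i * 167u + 13u);
}

/* Reference version, as in the report. */
unsigned int hash(const unsigned char inp[4])
{
    const unsigned long long INIT = 0x1ULL;
    unsigned long long h1 = INIT;
    h1 = magic1[((unsigned long long)inp[0]) ^ h1];
    h1 = magic1[((unsigned long long)inp[1]) ^ h1];
    h1 = magic1[((unsigned long long)inp[2]) ^ h1];
    h1 = magic1[((unsigned long long)inp[3]) ^ h1];
    return h1;
}

/* Variant with h1 narrowed to unsigned char. Both xor operands are
   promoted to int, so every index stays within 0..255. */
unsigned int hash_narrow(const unsigned char inp[4])
{
    unsigned char h1 = 0x1;
    h1 = magic1[inp[0] ^ h1];
    h1 = magic1[inp[1] ^ h1];
    h1 = magic1[inp[2] ^ h1];
    h1 = magic1[inp[3] ^ h1];
    return h1;
}
```

Both functions compute the same result for any input, so comparing the
assembly generated for each would show whether the narrower type helps.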

* [Bug rtl-optimization/107949] PPC: Unnecessary rlwinm after lbzx
From: bergner at gcc dot gnu.org @ 2023-02-17 17:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107949

Peter Bergner <bergner at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bergner at gcc dot gnu.org

--- Comment #5 from Peter Bergner <bergner at gcc dot gnu.org> ---
(In reply to Segher Boessenkool from comment #4)
> How would GCC know no extension is needed?  The asm template is not parsed
> at all, by design.  Making h1 an unsigned char might solve it here?

The version with the inline asm isn't what Jens is worried about; that gives
the generated code he wants (i.e., no rlwinm).  He is asking why the pure C
version of the test case adds the unneeded rlwinm.

* [Bug rtl-optimization/107949] PPC: Unnecessary rlwinm after lbzx
From: segher at gcc dot gnu.org @ 2023-02-17 17:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107949

--- Comment #6 from Segher Boessenkool <segher at gcc dot gnu.org> ---
We generate loads into QImode regs, so we need to explicitly convert the
result to whatever larger mode is wanted later.  We also have define_insns
that do a zero-extended load directly into a bigger pseudo, but apparently
those aren't used.

This is one instance of a much more generic problem; on rs6000 this is
usually observed as SImode being extended to DImode more often than
needed / wanted.
