public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/95737] New: PPC: Unnecessary extsw after negative less than
@ 2020-06-18  8:10 jens.seifert at de dot ibm.com
  2020-06-19 16:40 ` [Bug target/95737] " wschmidt at gcc dot gnu.org
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: jens.seifert at de dot ibm.com @ 2020-06-18  8:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95737

            Bug ID: 95737
           Summary: PPC: Unnecessary extsw after negative less than
           Product: gcc
           Version: 8.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jens.seifert at de dot ibm.com
  Target Milestone: ---

unsigned long long negativeLessThan(unsigned long long a, unsigned long long b)
{
   return -(a < b);
}

gcc -m64 -O2 -save-temps negativeLessThan.C

creates

_Z16negativeLessThanyy:
.LFB0:
        .cfi_startproc
        subfc 4,4,3
        subfe 3,3,3
        extsw 3,3
        blr

The extsw is not necessary.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug target/95737] PPC: Unnecessary extsw after negative less than
  2020-06-18  8:10 [Bug target/95737] New: PPC: Unnecessary extsw after negative less than jens.seifert at de dot ibm.com
@ 2020-06-19 16:40 ` wschmidt at gcc dot gnu.org
  2020-06-19 16:42 ` wschmidt at gcc dot gnu.org
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: wschmidt at gcc dot gnu.org @ 2020-06-19 16:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95737

--- Comment #1 from Bill Schmidt <wschmidt at gcc dot gnu.org> ---
Please test this out of context of a return statement.  The problem with
unnecessary extends of return values is widely known and not specific to this
particular case.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug target/95737] PPC: Unnecessary extsw after negative less than
  2020-06-18  8:10 [Bug target/95737] New: PPC: Unnecessary extsw after negative less than jens.seifert at de dot ibm.com
  2020-06-19 16:40 ` [Bug target/95737] " wschmidt at gcc dot gnu.org
@ 2020-06-19 16:42 ` wschmidt at gcc dot gnu.org
  2020-06-19 17:43 ` jens.seifert at de dot ibm.com
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: wschmidt at gcc dot gnu.org @ 2020-06-19 16:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95737

Bill Schmidt <wschmidt at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |DUPLICATE
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #2 from Bill Schmidt <wschmidt at gcc dot gnu.org> ---
If you can show this is different from 65010 (not a return value issue), please
reopen.

*** This bug has been marked as a duplicate of bug 65010 ***

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug target/95737] PPC: Unnecessary extsw after negative less than
  2020-06-18  8:10 [Bug target/95737] New: PPC: Unnecessary extsw after negative less than jens.seifert at de dot ibm.com
  2020-06-19 16:40 ` [Bug target/95737] " wschmidt at gcc dot gnu.org
  2020-06-19 16:42 ` wschmidt at gcc dot gnu.org
@ 2020-06-19 17:43 ` jens.seifert at de dot ibm.com
  2020-06-21 15:35 ` wschmidt at linux dot ibm.com
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: jens.seifert at de dot ibm.com @ 2020-06-19 17:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95737

Jens Seifert <jens.seifert at de dot ibm.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|DUPLICATE                   |---

--- Comment #3 from Jens Seifert <jens.seifert at de dot ibm.com> ---
This is different as the extsw also happens if the result gets used e.g.
followed by a andc, which is my case. I obviously oversimplified the sample. It
has nothing to do with function result and ABI requirements. gcc assume that
the result of -(a < b) implemented by subfc, subfe is signed 32-bit. But the
result is already 64-bit.

unsigned long long branchlesconditional(unsigned long long a, unsigned long
long b, unsigned long long c)
{
   unsigned long long mask = -(a < b);
   return c &~ mask;
}

results in

_Z20branchlesconditionalyyy:
.LFB1:
        .cfi_startproc
        subfc 4,4,3
        subfe 3,3,3
        not 3,3
        extsw 3,3
        and 3,3,5
        blr

expected
subfc
subfe
andc

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug target/95737] PPC: Unnecessary extsw after negative less than
  2020-06-18  8:10 [Bug target/95737] New: PPC: Unnecessary extsw after negative less than jens.seifert at de dot ibm.com
                   ` (2 preceding siblings ...)
  2020-06-19 17:43 ` jens.seifert at de dot ibm.com
@ 2020-06-21 15:35 ` wschmidt at linux dot ibm.com
  2020-06-21 16:30 ` segher at gcc dot gnu.org
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: wschmidt at linux dot ibm.com @ 2020-06-21 15:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95737

--- Comment #4 from wschmidt at linux dot ibm.com ---
On 6/19/20 12:43 PM, jens.seifert at de dot ibm.com wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95737
>
> Jens Seifert <jens.seifert at de dot ibm.com> changed:
>
>             What    |Removed                     |Added
> ----------------------------------------------------------------------------
>               Status|RESOLVED                    |UNCONFIRMED
>           Resolution|DUPLICATE                   |---
>
> --- Comment #3 from Jens Seifert <jens.seifert at de dot ibm.com> ---
> This is different as the extsw also happens if the result gets used e.g.
> followed by a andc, which is my case. I obviously oversimplified the sample. It
> has nothing to do with function result and ABI requirements. gcc assume that
> the result of -(a < b) implemented by subfc, subfe is signed 32-bit. But the
> result is already 64-bit.
>
> unsigned long long branchlesconditional(unsigned long long a, unsigned long
> long b, unsigned long long c)
> {
>     unsigned long long mask = -(a < b);
>     return c &~ mask;
> }
>
> results in
>
> _Z20branchlesconditionalyyy:
> .LFB1:
>          .cfi_startproc
>          subfc 4,4,3
>          subfe 3,3,3
>          not 3,3
>          extsw 3,3
>          and 3,3,5
>          blr
>
> expected
> subfc
> subfe
> andc
>
Thanks for verifying, Jens!

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug target/95737] PPC: Unnecessary extsw after negative less than
  2020-06-18  8:10 [Bug target/95737] New: PPC: Unnecessary extsw after negative less than jens.seifert at de dot ibm.com
                   ` (3 preceding siblings ...)
  2020-06-21 15:35 ` wschmidt at linux dot ibm.com
@ 2020-06-21 16:30 ` segher at gcc dot gnu.org
  2022-01-06  5:39 ` guihaoc at gcc dot gnu.org
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: segher at gcc dot gnu.org @ 2020-06-21 16:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95737

Segher Boessenkool <segher at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |segher at gcc dot gnu.org
                 CC|                            |segher at gcc dot gnu.org
             Target|powerpc-*-*-*               |powerpc64*-*-*
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2020-06-21

--- Comment #5 from Segher Boessenkool <segher at gcc dot gnu.org> ---
In gimple this already is

negativeLessThan (long long unsigned int a, long long unsigned int b)
{
  _Bool _1;
  int _2;
  int _3;
  long long unsigned int _6;

  _1 = a_4(D) < b_5(D);
  _2 = (int) _1;
  _3 = -_2;
  _6 = (long long unsigned int) _3;
  return _6;
}

Then, it is expanded as a sign_extend:DI of a subreg:SI, and nowhere
does it see this isn't necessary (it isn't because that SI cannot be
negative).

The RTL code isn't optimised very well before combine, and that does

Trying 11 -> 12:
   11: {r128:SI=ca:SI-0x1;clobber ca:SI;}
      REG_UNUSED ca:SI
   12: r123:DI=sign_extend(r128:SI)
      REG_DEAD r128:SI
Failed to match this instruction:
(set (reg:DI 123)
    (sign_extend:DI (plus:SI (reg:SI 98 ca [+4 ])
            (const_int -1 [0xffffffffffffffff]))))

(note everything is made SImode in insn 11 before, it absorbed the
subreg).  Combine cannot keep track of known zero bits of hard regs
well, so it fails to see that XER[CA] is only ever 0 or 1 here (it
always is, but it doesn't know that either).

I'll try to add an extra pattern for this extend, that will do the
trick I think.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug target/95737] PPC: Unnecessary extsw after negative less than
  2020-06-18  8:10 [Bug target/95737] New: PPC: Unnecessary extsw after negative less than jens.seifert at de dot ibm.com
                   ` (4 preceding siblings ...)
  2020-06-21 16:30 ` segher at gcc dot gnu.org
@ 2022-01-06  5:39 ` guihaoc at gcc dot gnu.org
  2022-01-13 12:19 ` segher at gcc dot gnu.org
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: guihaoc at gcc dot gnu.org @ 2022-01-06  5:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95737

HaoChen Gui <guihaoc at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |guihaoc at gcc dot gnu.org

--- Comment #6 from HaoChen Gui <guihaoc at gcc dot gnu.org> ---
//source code
unsigned long long negativeLessThan(unsigned long long a, unsigned long long b)
{
   return -(a < b);
}

//P8 with -O2
        subfc 4,4,3
        subfe 3,3,3
        extsw 3,3


//P9 with -O2
        li 10,0
        li 9,1
        cmpld 0,3,4
        isel 3,9,10,0
        neg 3,3

Seems cmp+isel on P9 is sub-optimal.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug target/95737] PPC: Unnecessary extsw after negative less than
  2020-06-18  8:10 [Bug target/95737] New: PPC: Unnecessary extsw after negative less than jens.seifert at de dot ibm.com
                   ` (5 preceding siblings ...)
  2022-01-06  5:39 ` guihaoc at gcc dot gnu.org
@ 2022-01-13 12:19 ` segher at gcc dot gnu.org
  2022-01-13 16:34 ` segher at gcc dot gnu.org
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: segher at gcc dot gnu.org @ 2022-01-13 12:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95737

--- Comment #7 from Segher Boessenkool <segher at gcc dot gnu.org> ---
> Seems cmp+isel on P9 is sub-optimal.

For this particular test, perhaps.  But it is better overall, at least some
years ago.  It was benchmarked (with spec), on p9.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug target/95737] PPC: Unnecessary extsw after negative less than
  2020-06-18  8:10 [Bug target/95737] New: PPC: Unnecessary extsw after negative less than jens.seifert at de dot ibm.com
                   ` (6 preceding siblings ...)
  2022-01-13 12:19 ` segher at gcc dot gnu.org
@ 2022-01-13 16:34 ` segher at gcc dot gnu.org
  2022-01-14  9:16 ` guihaoc at gcc dot gnu.org
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: segher at gcc dot gnu.org @ 2022-01-13 16:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95737

--- Comment #8 from Segher Boessenkool <segher at gcc dot gnu.org> ---
Somewhat more constructive...  The problem here is that neg isn't pushed
"through" isel insns.  This in general means you need to negate both inputs
to the isel of course, but there are cases where that is advantageous, like
here when both of those inputs are constants (or actually registers, but
set to some constant).

This is one reason the new set[n]bc[r] insns are so useful to us: it is all
one insn, also on RTL level, so it naturally works out in such cases.  It
also is slightly faster anyway of course, fewer insns and all that, even
if for dataflow it is all the same.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug target/95737] PPC: Unnecessary extsw after negative less than
  2020-06-18  8:10 [Bug target/95737] New: PPC: Unnecessary extsw after negative less than jens.seifert at de dot ibm.com
                   ` (7 preceding siblings ...)
  2022-01-13 16:34 ` segher at gcc dot gnu.org
@ 2022-01-14  9:16 ` guihaoc at gcc dot gnu.org
  2022-05-18  5:24 ` cvs-commit at gcc dot gnu.org
  2022-07-29  2:52 ` guihaoc at gcc dot gnu.org
  10 siblings, 0 replies; 12+ messages in thread
From: guihaoc at gcc dot gnu.org @ 2022-01-14  9:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95737

--- Comment #9 from HaoChen Gui <guihaoc at gcc dot gnu.org> ---
Add a pattern to convert the plus mode to DI. 

+(define_insn_and_split "*my_split"
+  [(set (match_operand:DI 0 "gpc_reg_operand")
+       (sign_extend:DI (plus:SI (match_operand:SI 1 "ca_operand")
+                                (const_int -1))))]
+  ""
+  "#"
+  ""
+  [(parallel [(set (match_dup 0)
+                  (plus:DI (match_dup 2)
+                           (const_int -1)))
+             (clobber (match_dup 2))])]
+{
+  operands[2] = copy_rtx (operands[1]);
+  PUT_MODE (operands[2], DImode);
+})

With the patch, the "extsw" could be optimized out. I compared the performance
between P8 code (with the patch) and P9 code. The performance of P9 is better. 
ISA says that computation with CA causes additional latency. It should be true.
The only concern is P9 code uses more register.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug target/95737] PPC: Unnecessary extsw after negative less than
  2020-06-18  8:10 [Bug target/95737] New: PPC: Unnecessary extsw after negative less than jens.seifert at de dot ibm.com
                   ` (8 preceding siblings ...)
  2022-01-14  9:16 ` guihaoc at gcc dot gnu.org
@ 2022-05-18  5:24 ` cvs-commit at gcc dot gnu.org
  2022-07-29  2:52 ` guihaoc at gcc dot gnu.org
  10 siblings, 0 replies; 12+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-05-18  5:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95737

--- Comment #10 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by HaoChen Gui <guihaoc@gcc.gnu.org>:

https://gcc.gnu.org/g:a174dc1a7f2bf0a71475ff633b130a60c0c3ff4a

commit r13-582-ga174dc1a7f2bf0a71475ff633b130a60c0c3ff4a
Author: Haochen Gui <guihaoc@gcc.gnu.org>
Date:   Wed May 11 09:19:52 2022 +0800

    This patch adds a combine pattern for "CA minus one". The SImode "CA minus
one" can be converted to DImode as CA only has two values (0 or 1).

    gcc/
            PR target/95737
            * config/rs6000/rs6000.md (*subfsi3_carry_in_xx_64): New.

    gcc/testsuite/
            PR target/95737
            * gcc.target/powerpc/pr95737.c: New.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug target/95737] PPC: Unnecessary extsw after negative less than
  2020-06-18  8:10 [Bug target/95737] New: PPC: Unnecessary extsw after negative less than jens.seifert at de dot ibm.com
                   ` (9 preceding siblings ...)
  2022-05-18  5:24 ` cvs-commit at gcc dot gnu.org
@ 2022-07-29  2:52 ` guihaoc at gcc dot gnu.org
  10 siblings, 0 replies; 12+ messages in thread
From: guihaoc at gcc dot gnu.org @ 2022-07-29  2:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95737

HaoChen Gui <guihaoc at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #11 from HaoChen Gui <guihaoc at gcc dot gnu.org> ---
Fixed.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2022-07-29  2:52 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-18  8:10 [Bug target/95737] New: PPC: Unnecessary extsw after negative less than jens.seifert at de dot ibm.com
2020-06-19 16:40 ` [Bug target/95737] " wschmidt at gcc dot gnu.org
2020-06-19 16:42 ` wschmidt at gcc dot gnu.org
2020-06-19 17:43 ` jens.seifert at de dot ibm.com
2020-06-21 15:35 ` wschmidt at linux dot ibm.com
2020-06-21 16:30 ` segher at gcc dot gnu.org
2022-01-06  5:39 ` guihaoc at gcc dot gnu.org
2022-01-13 12:19 ` segher at gcc dot gnu.org
2022-01-13 16:34 ` segher at gcc dot gnu.org
2022-01-14  9:16 ` guihaoc at gcc dot gnu.org
2022-05-18  5:24 ` cvs-commit at gcc dot gnu.org
2022-07-29  2:52 ` guihaoc at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).