public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions
@ 2013-12-09 11:35 jengelh at inai dot de
  2013-12-09 13:16 ` [Bug rtl-optimization/59429] " hjl.tools at gmail dot com
                   ` (15 more replies)
  0 siblings, 16 replies; 17+ messages in thread
From: jengelh at inai dot de @ 2013-12-09 11:35 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429

            Bug ID: 59429
           Summary: Missed optimization opportunity in qsort-style
                    comparison functions
           Product: gcc
           Version: 4.8.1
            Status: UNCONFIRMED
          Severity: minor
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jengelh at inai dot de

The following code:

extern int comLG(int, int);
extern int comLE(int, int);
extern int comEL(int, int);
extern int comEG(int, int);
extern int comGL(int, int);
extern int comGE(int, int);
int comLG(int a, int b) { return (a < b) ? -1 : (a > b) ? 1 : 0; }
int comLE(int a, int b) { return (a < b) ? -1 : (a == b) ? 0 : 1; }
int comEL(int a, int b) { return (a == b) ? 0 : (a < b) ? -1 : 1; }
int comEG(int a, int b) { return (a == b) ? 0 : (a > b) ? 1 : -1; }
int comGL(int a, int b) { return (a > b) ? 1 : (a < b) ? -1 : 0; }
int comGE(int a, int b) { return (a > b) ? 1 : (a == b) ? 0 : -1; }

when compiled with:

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib64/gcc/x86_64-suse-linux/4.8/lto-wrapper
Target: x86_64-suse-linux
Configured with: ../configure --prefix=/usr --infodir=/usr/share/info
--mandir=/usr/share/man --libdir=/usr/lib64 --libexecdir=/usr/lib64
--enable-languages=c,c++,objc,fortran,obj-c++,java,ada
--enable-checking=release --with-gxx-include-dir=/usr/include/c++/4.8
--enable-ssp --disable-libssp --disable-plugin
--with-bugurl=http://bugs.opensuse.org/ --with-pkgversion='SUSE Linux'
--disable-libgcj --disable-libmudflap --with-slibdir=/lib64 --with-system-zlib
--enable-__cxa_atexit --enable-libstdcxx-allocator=new --disable-libstdcxx-pch
--enable-version-specific-runtime-libs --enable-linker-build-id
--program-suffix=-4.8 --enable-linux-futex --without-system-libunwind
--with-arch-32=i586 --with-tune=generic --build=x86_64-suse-linux
Thread model: posix
gcc version 4.8.1 20130909 [gcc-4_8-branch revision 202388] (SUSE Linux) 

yields object code with different realizations and sizes.
What I observe:

$ gcc -Os -c cmp.c
(also shows with -O3)

$ readelf -s cmp.o
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     8: 0000000000000000    16 FUNC    GLOBAL DEFAULT    1 comLG
     9: 0000000000000010    16 FUNC    GLOBAL DEFAULT    1 comLE
    10: 0000000000000020    18 FUNC    GLOBAL DEFAULT    1 comEL
    11: 0000000000000032    18 FUNC    GLOBAL DEFAULT    1 comEG
    12: 0000000000000044    18 FUNC    GLOBAL DEFAULT    1 comGL
    13: 0000000000000056    18 FUNC    GLOBAL DEFAULT    1 comGE

$ objdump -d cmp.o -Mintel
0000000000000000 <comLG>:
   0:   31 d2                   xor    edx,edx
   2:   39 f7                   cmp    edi,esi
   4:   b8 ff ff ff ff          mov    eax,0xffffffff
   9:   0f 9f c2                setg   dl
   c:   0f 4d c2                cmovge eax,edx
   f:   c3                      ret    

0000000000000010 <comLE>:
  10:   31 d2                   xor    edx,edx
  12:   39 f7                   cmp    edi,esi
  14:   b8 ff ff ff ff          mov    eax,0xffffffff
  19:   0f 95 c2                setne  dl
  1c:   0f 4d c2                cmovge eax,edx
  1f:   c3                      ret    

0000000000000020 <comEL>:
  20:   39 f7                   cmp    edi,esi
  22:   74 0b                   je     2f <comEL+0xf>
  24:   0f 9d c0                setge  al
  27:   0f b6 c0                movzx  eax,al
  2a:   8d 44 00 ff             lea    eax,[rax+rax*1-0x1]
  2e:   c3                      ret    
  2f:   31 c0                   xor    eax,eax
  31:   c3                      ret    

0000000000000032 <comEG>:
  32:   39 f7                   cmp    edi,esi
  34:   74 0b                   je     41 <comEG+0xf>
  36:   0f 9f c0                setg   al
  39:   0f b6 c0                movzx  eax,al
  3c:   8d 44 00 ff             lea    eax,[rax+rax*1-0x1]
  40:   c3                      ret    
  41:   31 c0                   xor    eax,eax
  43:   c3                      ret    

0000000000000044 <comGL>:
  44:   39 f7                   cmp    edi,esi
  46:   b8 01 00 00 00          mov    eax,0x1
  4b:   7f 08                   jg     55 <comGL+0x11>
  4d:   0f 9c c0                setl   al
  50:   0f b6 c0                movzx  eax,al
  53:   f7 d8                   neg    eax
  55:   c3                      ret    

0000000000000056 <comGE>:
  56:   39 f7                   cmp    edi,esi
  58:   b8 01 00 00 00          mov    eax,0x1
  5d:   7f 08                   jg     67 <comGE+0x11>
  5f:   0f 95 c0                setne  al
  62:   0f b6 c0                movzx  eax,al
  65:   f7 d8                   neg    eax
  67:   c3                      ret    

What I expected instead:

All functions to have the same asm.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
  2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
@ 2013-12-09 13:16 ` hjl.tools at gmail dot com
  2013-12-09 14:02 ` jengelh at inai dot de
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: hjl.tools at gmail dot com @ 2013-12-09 13:16 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429

--- Comment #1 from H.J. Lu <hjl.tools at gmail dot com> ---
gold has

--icf [none,all,safe]       Identical Code Folding. '--icf=safe' Folds ctors,
dtors and functions whose pointers are definitely not taken.

which should help it at link-time.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
  2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
  2013-12-09 13:16 ` [Bug rtl-optimization/59429] " hjl.tools at gmail dot com
@ 2013-12-09 14:02 ` jengelh at inai dot de
  2013-12-09 14:05 ` jengelh at inai dot de
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: jengelh at inai dot de @ 2013-12-09 14:02 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429

--- Comment #2 from Jan Engelhardt <jengelh at inai dot de> ---
Suppose all functions are used and taken.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
  2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
  2013-12-09 13:16 ` [Bug rtl-optimization/59429] " hjl.tools at gmail dot com
  2013-12-09 14:02 ` jengelh at inai dot de
@ 2013-12-09 14:05 ` jengelh at inai dot de
  2014-04-04 11:37 ` ktietz at gcc dot gnu.org
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: jengelh at inai dot de @ 2013-12-09 14:05 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429

--- Comment #3 from Jan Engelhardt <jengelh at inai dot de> ---
I took it "Component: rtl-optimization" meant Register Transfer Language, not
Runtime Linking. If needed, please reassign to whatever component is actually
applicable. I am looking for a compiler enhancement so that the functions
_become_ equal, not a way to eliminate duplicate functions.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
  2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
                   ` (2 preceding siblings ...)
  2013-12-09 14:05 ` jengelh at inai dot de
@ 2014-04-04 11:37 ` ktietz at gcc dot gnu.org
  2014-05-02 15:58 ` law at redhat dot com
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: ktietz at gcc dot gnu.org @ 2014-04-04 11:37 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429

Kai Tietz <ktietz at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2014-04-04
                 CC|                            |ktietz at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #4 from Kai Tietz <ktietz at gcc dot gnu.org> ---
This issue seems to be another problem caused by missing unification of
expressions.

For given example we get for 'comGE' the following gimple:

comGE (int a, int b)
{
  int iftmp.8_1;
  int _2;
  _Bool _6;
  int _7;

  <bb 2>:
  if (a_3(D) <= b_4(D))
    goto <bb 3>;
  else
    goto <bb 4>;

  <bb 3>:
  _6 = a_3(D) != b_4(D);
  _2 = (int) _6;
  _7 = -_2;

  <bb 4>:
  # iftmp.8_1 = PHI <_7(3), 1(2)>
  return iftmp.8_1;

}

As we see we don't optimize out the negate case.
For expression:
  (a_3(D) > b_4(D) ? 1 : -((int) (a_3(D) != b_4(D))))
we should transform it instead to:
  (a_3(D) < b_4(D) ? -1 : (int) (a_3(D) != b_4(D)))

The reference to gold's --icf option looks for me wrong here.  Such
optimizations - as done here for gold - can't be done by gcc.  At least not for
the given example provided.  In other TUs there always might users of direct
function-address in comparison.  So to find and reduce identical code-blocks is
just a linker feature (with some danger IMHO).

So back to this issue.  I would think it is more a problem to be solved on
gimple, and not that much a RTL issue in first hand.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
  2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
                   ` (3 preceding siblings ...)
  2014-04-04 11:37 ` ktietz at gcc dot gnu.org
@ 2014-05-02 15:58 ` law at redhat dot com
  2014-05-02 15:59 ` law at redhat dot com
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: law at redhat dot com @ 2014-05-02 15:58 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429

Jeffrey A. Law <law at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |law at redhat dot com

--- Comment #5 from Jeffrey A. Law <law at redhat dot com> ---


For expression:
  (a_3(D) > b_4(D) ? 1 : -((int) (a_3(D) != b_4(D))))
we should transform it instead to:
  (a_3(D) < b_4(D) ? -1 : (int) (a_3(D) != b_4(D)))

Ugh.  That seems fairly complex for the existing transformations.  Though I
guess we could match on

cond1 ? -1 : -(int) cond2

Where the conditions aren't necessarily related.  We'd turn that into

cond1'


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
  2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
                   ` (4 preceding siblings ...)
  2014-05-02 15:58 ` law at redhat dot com
@ 2014-05-02 15:59 ` law at redhat dot com
  2014-05-02 16:03 ` ktietz at gcc dot gnu.org
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: law at redhat dot com @ 2014-05-02 15:59 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429

--- Comment #6 from Jeffrey A. Law <law at redhat dot com> ---
Ignore my last comment.  I hadn't completed my thoughts yet.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
  2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
                   ` (5 preceding siblings ...)
  2014-05-02 15:59 ` law at redhat dot com
@ 2014-05-02 16:03 ` ktietz at gcc dot gnu.org
  2014-05-02 16:09 ` law at redhat dot com
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: ktietz at gcc dot gnu.org @ 2014-05-02 16:03 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429

--- Comment #7 from Kai Tietz <ktietz at gcc dot gnu.org> ---
Yeah, that was my initial idea here too.  But this transformation is just an
improvement for underlying unsigned one-bit values, as here the negate is
indeed an nop.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
  2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
                   ` (6 preceding siblings ...)
  2014-05-02 16:03 ` ktietz at gcc dot gnu.org
@ 2014-05-02 16:09 ` law at redhat dot com
  2014-05-02 16:14 ` ktietz at gcc dot gnu.org
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: law at redhat dot com @ 2014-05-02 16:09 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429

--- Comment #8 from Jeffrey A. Law <law at redhat dot com> ---
So I think we need to make a gross level determination of what a canonical form
for this stuff ought to be, then look at what it would take to transform into
that canonical form.

I think the key here is to realize that we want the negated constant to be the
true result of the first COND_EXPR and the 0/1 result to always be the result
of the second COND_EXPR.

That way the negated constant is carried by the PHI node and we can use the
result of the second COND_EXPR directly without additional code.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
  2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
                   ` (7 preceding siblings ...)
  2014-05-02 16:09 ` law at redhat dot com
@ 2014-05-02 16:14 ` ktietz at gcc dot gnu.org
  2014-05-02 16:33 ` law at redhat dot com
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: ktietz at gcc dot gnu.org @ 2014-05-02 16:14 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429

--- Comment #9 from Kai Tietz <ktietz at gcc dot gnu.org> ---
The general solution here for patterns like
'(a cmp1 b ? x : (a cmp2 b ? y : z))' would be to split such statment up into
3 conditions like:
if (a cmp1 b)
  r = x;
else if (a cmp2 b)
  r = y;
else if (!(a cmp2 b && a cmp1 b))
  r = z;

By this we can resort conditions and try to normalize it.  Of course this just
works for floating-points with fast-math, or just on integral types.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
  2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
                   ` (8 preceding siblings ...)
  2014-05-02 16:14 ` ktietz at gcc dot gnu.org
@ 2014-05-02 16:33 ` law at redhat dot com
  2014-05-02 16:42 ` ktietz at gcc dot gnu.org
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: law at redhat dot com @ 2014-05-02 16:33 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429

--- Comment #10 from Jeffrey A. Law <law at redhat dot com> ---
So if you go into a cascaded if-else form, don't we have to recover the
COND_EXPR form to get good code?

Remember, that to get good code here, we need to discover a non-branching form
of one (or more) COND/COND_EXPR expressions and we'd really like to avoid
arithmetic in that non-branching form.  ie, do a comparison and use the result
after casting it from a bool to a suitable integer type.

So I guess whatever form we want ought to be driven by which is easier to
analyze/rewrite into a COND_EXPR form that will be expanded into a setcc insn
without any arithmetic on the result.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
  2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
                   ` (9 preceding siblings ...)
  2014-05-02 16:33 ` law at redhat dot com
@ 2014-05-02 16:42 ` ktietz at gcc dot gnu.org
  2014-05-05  8:48 ` rguenth at gcc dot gnu.org
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: ktietz at gcc dot gnu.org @ 2014-05-02 16:42 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429

--- Comment #11 from Kai Tietz <ktietz at gcc dot gnu.org> ---
sure,  I just shown the transformation for doing the sorting.  After
prioritizing the order we should transform back to initial condition and just
changing comparison code plus the edges.

To handle this we could use a table describing transformation rules.  So we
don't need to calculate anything.
We have here one special-case to handle:

'(a cmp1 b ? x : (a eq-neq b))' as this would be in the long form '(a cmp1 b ?
(a eq-neq b ? 1 : 0))'


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
  2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
                   ` (10 preceding siblings ...)
  2014-05-02 16:42 ` ktietz at gcc dot gnu.org
@ 2014-05-05  8:48 ` rguenth at gcc dot gnu.org
  2014-05-30  8:33 ` ktietz at gcc dot gnu.org
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: rguenth at gcc dot gnu.org @ 2014-05-05  8:48 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429

--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
My idea was to have sth like tree-affine for conditions and that condition
optimizing passes (like ifcombine or ifcvt or phiopt) build an on-the-side
representation for this, also catching straight-line code

  a = b < 0;
  c = d > 1;
  d = a | c;
  if (d)
    ...

the tree-affine like condition can be normalized and simplified on-the-side
and you'd do simplifications/lookup using that machinery, only emitting
changed computes in a transform stage.  Basically you collect BB or edge
predicates (like ifcvt does) but with a more elaborate machinery that
allows for simplification (ifcombine has its own as well).

I wouldn't try to tackle this testcase from a pattern matching side.
In principle comment#4 looks like sth for phiopt.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
  2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
                   ` (11 preceding siblings ...)
  2014-05-05  8:48 ` rguenth at gcc dot gnu.org
@ 2014-05-30  8:33 ` ktietz at gcc dot gnu.org
  2021-07-25  0:22 ` [Bug tree-optimization/59429] " pinskia at gcc dot gnu.org
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: ktietz at gcc dot gnu.org @ 2014-05-30  8:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429

--- Comment #13 from Kai Tietz <ktietz at gcc dot gnu.org> ---
Hmm, I don't see what binary-combine helper would help here? It might be a good
thing to have such a abstract-representation for doing logical optimizations of
comparison chains.  Nevertheless this seems to me being not the solution for
the given issue here.

The issue - described in this problem - is more related to simple
pattern-matching. So normalization in such cases seems to be the way to go.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug tree-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
  2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
                   ` (12 preceding siblings ...)
  2014-05-30  8:33 ` ktietz at gcc dot gnu.org
@ 2021-07-25  0:22 ` pinskia at gcc dot gnu.org
  2024-01-24 16:56 ` pinskia at gcc dot gnu.org
  2024-03-22 10:55 ` redbeard0531 at gmail dot com
  15 siblings, 0 replies; 17+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-07-25  0:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pinskia at gcc dot gnu.org
           Keywords|                            |missed-optimization
          Component|rtl-optimization            |tree-optimization
           Severity|minor                       |enhancement

--- Comment #14 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The two on the trunk which seems to be not the same (at least on aarch64) is
comEL and comEG.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug tree-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
  2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
                   ` (13 preceding siblings ...)
  2021-07-25  0:22 ` [Bug tree-optimization/59429] " pinskia at gcc dot gnu.org
@ 2024-01-24 16:56 ` pinskia at gcc dot gnu.org
  2024-03-22 10:55 ` redbeard0531 at gmail dot com
  15 siblings, 0 replies; 17+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-01-24 16:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |redbeard0531 at gmail dot com

--- Comment #15 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
*** Bug 113585 has been marked as a duplicate of this bug. ***

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug tree-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
  2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
                   ` (14 preceding siblings ...)
  2024-01-24 16:56 ` pinskia at gcc dot gnu.org
@ 2024-03-22 10:55 ` redbeard0531 at gmail dot com
  15 siblings, 0 replies; 17+ messages in thread
From: redbeard0531 at gmail dot com @ 2024-03-22 10:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429

--- Comment #16 from Mathias Stearn <redbeard0531 at gmail dot com> ---
Trunk still generates different code for all cases (in some cases subtly so)
for both aarch64 and x86_64: https://www.godbolt.org/z/1s6sxrMWq. For both
arches, it seems like LE and LG generate the best code.

On aarch64, they probably all have the same throughput, but EL and EG have a
size penalty with an extra instruction.

On x86_64, there is much more variety. EL and EG both get end up with a branch
rather than being branchless, which is probably bad since comparison functions
are often called in ways that the result branches are unpredictable. GE and GL
appear to have regressed since this ticket was created. They now do the
comparison twice rather than reusing the flags from the first comparison:

comGL(int, int):
        xor     eax, eax
        cmp     edi, esi
        mov     edx, 1
        setl    al
        neg     eax
        cmp     edi, esi
        cmovg   eax, edx
        ret
comGE(int, int):
        xor     eax, eax
        cmp     edi, esi
        mov     edx, 1
        setne   al
        neg     eax
        cmp     edi, esi
        cmovg   eax, edx
        ret

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2024-03-22 10:55 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
2013-12-09 13:16 ` [Bug rtl-optimization/59429] " hjl.tools at gmail dot com
2013-12-09 14:02 ` jengelh at inai dot de
2013-12-09 14:05 ` jengelh at inai dot de
2014-04-04 11:37 ` ktietz at gcc dot gnu.org
2014-05-02 15:58 ` law at redhat dot com
2014-05-02 15:59 ` law at redhat dot com
2014-05-02 16:03 ` ktietz at gcc dot gnu.org
2014-05-02 16:09 ` law at redhat dot com
2014-05-02 16:14 ` ktietz at gcc dot gnu.org
2014-05-02 16:33 ` law at redhat dot com
2014-05-02 16:42 ` ktietz at gcc dot gnu.org
2014-05-05  8:48 ` rguenth at gcc dot gnu.org
2014-05-30  8:33 ` ktietz at gcc dot gnu.org
2021-07-25  0:22 ` [Bug tree-optimization/59429] " pinskia at gcc dot gnu.org
2024-01-24 16:56 ` pinskia at gcc dot gnu.org
2024-03-22 10:55 ` redbeard0531 at gmail dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).