public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions
@ 2013-12-09 11:35 jengelh at inai dot de
2013-12-09 13:16 ` [Bug rtl-optimization/59429] " hjl.tools at gmail dot com
` (15 more replies)
0 siblings, 16 replies; 17+ messages in thread
From: jengelh at inai dot de @ 2013-12-09 11:35 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429
Bug ID: 59429
Summary: Missed optimization opportunity in qsort-style
comparison functions
Product: gcc
Version: 4.8.1
Status: UNCONFIRMED
Severity: minor
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: jengelh at inai dot de
The following code:
extern int comLG(int, int);
extern int comLE(int, int);
extern int comEL(int, int);
extern int comEG(int, int);
extern int comGL(int, int);
extern int comGE(int, int);
int comLG(int a, int b) { return (a < b) ? -1 : (a > b) ? 1 : 0; }
int comLE(int a, int b) { return (a < b) ? -1 : (a == b) ? 0 : 1; }
int comEL(int a, int b) { return (a == b) ? 0 : (a < b) ? -1 : 1; }
int comEG(int a, int b) { return (a == b) ? 0 : (a > b) ? 1 : -1; }
int comGL(int a, int b) { return (a > b) ? 1 : (a < b) ? -1 : 0; }
int comGE(int a, int b) { return (a > b) ? 1 : (a == b) ? 0 : -1; }
when compiled with:
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib64/gcc/x86_64-suse-linux/4.8/lto-wrapper
Target: x86_64-suse-linux
Configured with: ../configure --prefix=/usr --infodir=/usr/share/info
--mandir=/usr/share/man --libdir=/usr/lib64 --libexecdir=/usr/lib64
--enable-languages=c,c++,objc,fortran,obj-c++,java,ada
--enable-checking=release --with-gxx-include-dir=/usr/include/c++/4.8
--enable-ssp --disable-libssp --disable-plugin
--with-bugurl=http://bugs.opensuse.org/ --with-pkgversion='SUSE Linux'
--disable-libgcj --disable-libmudflap --with-slibdir=/lib64 --with-system-zlib
--enable-__cxa_atexit --enable-libstdcxx-allocator=new --disable-libstdcxx-pch
--enable-version-specific-runtime-libs --enable-linker-build-id
--program-suffix=-4.8 --enable-linux-futex --without-system-libunwind
--with-arch-32=i586 --with-tune=generic --build=x86_64-suse-linux
Thread model: posix
gcc version 4.8.1 20130909 [gcc-4_8-branch revision 202388] (SUSE Linux)
yields object code with different realizations and sizes.
What I observe:
$ gcc -Os -c cmp.c
(also shows with -O3)
$ readelf -s cmp.o
Num: Value Size Type Bind Vis Ndx Name
8: 0000000000000000 16 FUNC GLOBAL DEFAULT 1 comLG
9: 0000000000000010 16 FUNC GLOBAL DEFAULT 1 comLE
10: 0000000000000020 18 FUNC GLOBAL DEFAULT 1 comEL
11: 0000000000000032 18 FUNC GLOBAL DEFAULT 1 comEG
12: 0000000000000044 18 FUNC GLOBAL DEFAULT 1 comGL
13: 0000000000000056 18 FUNC GLOBAL DEFAULT 1 comGE
$ objdump -d cmp.o -Mintel
0000000000000000 <comLG>:
0: 31 d2 xor edx,edx
2: 39 f7 cmp edi,esi
4: b8 ff ff ff ff mov eax,0xffffffff
9: 0f 9f c2 setg dl
c: 0f 4d c2 cmovge eax,edx
f: c3 ret
0000000000000010 <comLE>:
10: 31 d2 xor edx,edx
12: 39 f7 cmp edi,esi
14: b8 ff ff ff ff mov eax,0xffffffff
19: 0f 95 c2 setne dl
1c: 0f 4d c2 cmovge eax,edx
1f: c3 ret
0000000000000020 <comEL>:
20: 39 f7 cmp edi,esi
22: 74 0b je 2f <comEL+0xf>
24: 0f 9d c0 setge al
27: 0f b6 c0 movzx eax,al
2a: 8d 44 00 ff lea eax,[rax+rax*1-0x1]
2e: c3 ret
2f: 31 c0 xor eax,eax
31: c3 ret
0000000000000032 <comEG>:
32: 39 f7 cmp edi,esi
34: 74 0b je 41 <comEG+0xf>
36: 0f 9f c0 setg al
39: 0f b6 c0 movzx eax,al
3c: 8d 44 00 ff lea eax,[rax+rax*1-0x1]
40: c3 ret
41: 31 c0 xor eax,eax
43: c3 ret
0000000000000044 <comGL>:
44: 39 f7 cmp edi,esi
46: b8 01 00 00 00 mov eax,0x1
4b: 7f 08 jg 55 <comGL+0x11>
4d: 0f 9c c0 setl al
50: 0f b6 c0 movzx eax,al
53: f7 d8 neg eax
55: c3 ret
0000000000000056 <comGE>:
56: 39 f7 cmp edi,esi
58: b8 01 00 00 00 mov eax,0x1
5d: 7f 08 jg 67 <comGE+0x11>
5f: 0f 95 c0 setne al
62: 0f b6 c0 movzx eax,al
65: f7 d8 neg eax
67: c3 ret
What I expected instead:
All functions to have the same asm.
^ permalink raw reply [flat|nested] 17+ messages in thread
* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
@ 2013-12-09 13:16 ` hjl.tools at gmail dot com
2013-12-09 14:02 ` jengelh at inai dot de
` (14 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: hjl.tools at gmail dot com @ 2013-12-09 13:16 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429
--- Comment #1 from H.J. Lu <hjl.tools at gmail dot com> ---
gold has
--icf [none,all,safe] Identical Code Folding. '--icf=safe' Folds ctors,
dtors and functions whose pointers are definitely not taken.
which should help it at link-time.
^ permalink raw reply [flat|nested] 17+ messages in thread
* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
2013-12-09 13:16 ` [Bug rtl-optimization/59429] " hjl.tools at gmail dot com
@ 2013-12-09 14:02 ` jengelh at inai dot de
2013-12-09 14:05 ` jengelh at inai dot de
` (13 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: jengelh at inai dot de @ 2013-12-09 14:02 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429
--- Comment #2 from Jan Engelhardt <jengelh at inai dot de> ---
Suppose all functions are used and taken.
^ permalink raw reply [flat|nested] 17+ messages in thread
* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
2013-12-09 13:16 ` [Bug rtl-optimization/59429] " hjl.tools at gmail dot com
2013-12-09 14:02 ` jengelh at inai dot de
@ 2013-12-09 14:05 ` jengelh at inai dot de
2014-04-04 11:37 ` ktietz at gcc dot gnu.org
` (12 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: jengelh at inai dot de @ 2013-12-09 14:05 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429
--- Comment #3 from Jan Engelhardt <jengelh at inai dot de> ---
I took it "Component: rtl-optimization" meant Register Transfer Language, not
Runtime Linking. If needed, please reassign to whatever component is actually
applicable. I am looking for a compiler enhancement so that the functions
_become_ equal, not a way to eliminate duplicate functions.
^ permalink raw reply [flat|nested] 17+ messages in thread
* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
` (2 preceding siblings ...)
2013-12-09 14:05 ` jengelh at inai dot de
@ 2014-04-04 11:37 ` ktietz at gcc dot gnu.org
2014-05-02 15:58 ` law at redhat dot com
` (11 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: ktietz at gcc dot gnu.org @ 2014-04-04 11:37 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429
Kai Tietz <ktietz at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Last reconfirmed| |2014-04-04
CC| |ktietz at gcc dot gnu.org
Ever confirmed|0 |1
--- Comment #4 from Kai Tietz <ktietz at gcc dot gnu.org> ---
This issue seems to be another problem caused by missing unification of
expressions.
For given example we get for 'comGE' the following gimple:
comGE (int a, int b)
{
int iftmp.8_1;
int _2;
_Bool _6;
int _7;
<bb 2>:
if (a_3(D) <= b_4(D))
goto <bb 3>;
else
goto <bb 4>;
<bb 3>:
_6 = a_3(D) != b_4(D);
_2 = (int) _6;
_7 = -_2;
<bb 4>:
# iftmp.8_1 = PHI <_7(3), 1(2)>
return iftmp.8_1;
}
As we see we don't optimize out the negate case.
For expression:
(a_3(D) > b_4(D) ? 1 : -((int) (a_3(D) != b_4(D))))
we should transform it instead to:
(a_3(D) < b_4(D) ? -1 : (int) (a_3(D) != b_4(D)))
The reference to gold's --icf option looks for me wrong here. Such
optimizations - as done here for gold - can't be done by gcc. At least not for
the given example provided. In other TUs there always might users of direct
function-address in comparison. So to find and reduce identical code-blocks is
just a linker feature (with some danger IMHO).
So back to this issue. I would think it is more a problem to be solved on
gimple, and not that much a RTL issue in first hand.
^ permalink raw reply [flat|nested] 17+ messages in thread
* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
` (3 preceding siblings ...)
2014-04-04 11:37 ` ktietz at gcc dot gnu.org
@ 2014-05-02 15:58 ` law at redhat dot com
2014-05-02 15:59 ` law at redhat dot com
` (10 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: law at redhat dot com @ 2014-05-02 15:58 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429
Jeffrey A. Law <law at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |law at redhat dot com
--- Comment #5 from Jeffrey A. Law <law at redhat dot com> ---
For expression:
(a_3(D) > b_4(D) ? 1 : -((int) (a_3(D) != b_4(D))))
we should transform it instead to:
(a_3(D) < b_4(D) ? -1 : (int) (a_3(D) != b_4(D)))
Ugh. That seems fairly complex for the existing transformations. Though I
guess we could match on
cond1 ? -1 : -(int) cond2
Where the conditions aren't necessarily related. We'd turn that into
cond1'
^ permalink raw reply [flat|nested] 17+ messages in thread
* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
` (4 preceding siblings ...)
2014-05-02 15:58 ` law at redhat dot com
@ 2014-05-02 15:59 ` law at redhat dot com
2014-05-02 16:03 ` ktietz at gcc dot gnu.org
` (9 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: law at redhat dot com @ 2014-05-02 15:59 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429
--- Comment #6 from Jeffrey A. Law <law at redhat dot com> ---
Ignore my last comment. I hadn't completed my thoughts yet.
^ permalink raw reply [flat|nested] 17+ messages in thread
* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
` (5 preceding siblings ...)
2014-05-02 15:59 ` law at redhat dot com
@ 2014-05-02 16:03 ` ktietz at gcc dot gnu.org
2014-05-02 16:09 ` law at redhat dot com
` (8 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: ktietz at gcc dot gnu.org @ 2014-05-02 16:03 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429
--- Comment #7 from Kai Tietz <ktietz at gcc dot gnu.org> ---
Yeah, that was my initial idea here too. But this transformation is just an
improvement for underlying unsigned one-bit values, as here the negate is
indeed an nop.
^ permalink raw reply [flat|nested] 17+ messages in thread
* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
` (6 preceding siblings ...)
2014-05-02 16:03 ` ktietz at gcc dot gnu.org
@ 2014-05-02 16:09 ` law at redhat dot com
2014-05-02 16:14 ` ktietz at gcc dot gnu.org
` (7 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: law at redhat dot com @ 2014-05-02 16:09 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429
--- Comment #8 from Jeffrey A. Law <law at redhat dot com> ---
So I think we need to make a gross level determination of what a canonical form
for this stuff ought to be, then look at what it would take to transform into
that canonical form.
I think the key here is to realize that we want the negated constant to be the
true result of the first COND_EXPR and the 0/1 result to always be the result
of the second COND_EXPR.
That way the negated constant is carried by the PHI node and we can use the
result of the second COND_EXPR directly without additional code.
^ permalink raw reply [flat|nested] 17+ messages in thread
* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
` (7 preceding siblings ...)
2014-05-02 16:09 ` law at redhat dot com
@ 2014-05-02 16:14 ` ktietz at gcc dot gnu.org
2014-05-02 16:33 ` law at redhat dot com
` (6 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: ktietz at gcc dot gnu.org @ 2014-05-02 16:14 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429
--- Comment #9 from Kai Tietz <ktietz at gcc dot gnu.org> ---
The general solution here for patterns like
'(a cmp1 b ? x : (a cmp2 b ? y : z))' would be to split such statment up into
3 conditions like:
if (a cmp1 b)
r = x;
else if (a cmp2 b)
r = y;
else if (!(a cmp2 b && a cmp1 b))
r = z;
By this we can resort conditions and try to normalize it. Of course this just
works for floating-points with fast-math, or just on integral types.
^ permalink raw reply [flat|nested] 17+ messages in thread
* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
` (8 preceding siblings ...)
2014-05-02 16:14 ` ktietz at gcc dot gnu.org
@ 2014-05-02 16:33 ` law at redhat dot com
2014-05-02 16:42 ` ktietz at gcc dot gnu.org
` (5 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: law at redhat dot com @ 2014-05-02 16:33 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429
--- Comment #10 from Jeffrey A. Law <law at redhat dot com> ---
So if you go into a cascaded if-else form, don't we have to recover the
COND_EXPR form to get good code?
Remember, that to get good code here, we need to discover a non-branching form
of one (or more) COND/COND_EXPR expressions and we'd really like to avoid
arithmetic in that non-branching form. ie, do a comparison and use the result
after casting it from a bool to a suitable integer type.
So I guess whatever form we want ought to be driven by which is easier to
analyze/rewrite into a COND_EXPR form that will be expanded into a setcc insn
without any arithmetic on the result.
^ permalink raw reply [flat|nested] 17+ messages in thread
* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
` (9 preceding siblings ...)
2014-05-02 16:33 ` law at redhat dot com
@ 2014-05-02 16:42 ` ktietz at gcc dot gnu.org
2014-05-05 8:48 ` rguenth at gcc dot gnu.org
` (4 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: ktietz at gcc dot gnu.org @ 2014-05-02 16:42 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429
--- Comment #11 from Kai Tietz <ktietz at gcc dot gnu.org> ---
sure, I just shown the transformation for doing the sorting. After
prioritizing the order we should transform back to initial condition and just
changing comparison code plus the edges.
To handle this we could use a table describing transformation rules. So we
don't need to calculate anything.
We have here one special-case to handle:
'(a cmp1 b ? x : (a eq-neq b))' as this would be in the long form '(a cmp1 b ?
(a eq-neq b ? 1 : 0))'
^ permalink raw reply [flat|nested] 17+ messages in thread
* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
` (10 preceding siblings ...)
2014-05-02 16:42 ` ktietz at gcc dot gnu.org
@ 2014-05-05 8:48 ` rguenth at gcc dot gnu.org
2014-05-30 8:33 ` ktietz at gcc dot gnu.org
` (3 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: rguenth at gcc dot gnu.org @ 2014-05-05 8:48 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429
--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
My idea was to have sth like tree-affine for conditions and that condition
optimizing passes (like ifcombine or ifcvt or phiopt) build an on-the-side
representation for this, also catching straight-line code
a = b < 0;
c = d > 1;
d = a | c;
if (d)
...
the tree-affine like condition can be normalized and simplified on-the-side
and you'd do simplifications/lookup using that machinery, only emitting
changed computes in a transform stage. Basically you collect BB or edge
predicates (like ifcvt does) but with a more elaborate machinery that
allows for simplification (ifcombine has its own as well).
I wouldn't try to tackle this testcase from a pattern matching side.
In principle comment#4 looks like sth for phiopt.
^ permalink raw reply [flat|nested] 17+ messages in thread
* [Bug rtl-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
` (11 preceding siblings ...)
2014-05-05 8:48 ` rguenth at gcc dot gnu.org
@ 2014-05-30 8:33 ` ktietz at gcc dot gnu.org
2021-07-25 0:22 ` [Bug tree-optimization/59429] " pinskia at gcc dot gnu.org
` (2 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: ktietz at gcc dot gnu.org @ 2014-05-30 8:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429
--- Comment #13 from Kai Tietz <ktietz at gcc dot gnu.org> ---
Hmm, I don't see what binary-combine helper would help here? It might be a good
thing to have such a abstract-representation for doing logical optimizations of
comparison chains. Nevertheless this seems to me being not the solution for
the given issue here.
The issue - described in this problem - is more related to simple
pattern-matching. So normalization in such cases seems to be the way to go.
^ permalink raw reply [flat|nested] 17+ messages in thread
* [Bug tree-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
` (12 preceding siblings ...)
2014-05-30 8:33 ` ktietz at gcc dot gnu.org
@ 2021-07-25 0:22 ` pinskia at gcc dot gnu.org
2024-01-24 16:56 ` pinskia at gcc dot gnu.org
2024-03-22 10:55 ` redbeard0531 at gmail dot com
15 siblings, 0 replies; 17+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-07-25 0:22 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |pinskia at gcc dot gnu.org
Keywords| |missed-optimization
Component|rtl-optimization |tree-optimization
Severity|minor |enhancement
--- Comment #14 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The two on the trunk which seems to be not the same (at least on aarch64) is
comEL and comEG.
^ permalink raw reply [flat|nested] 17+ messages in thread
* [Bug tree-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
` (13 preceding siblings ...)
2021-07-25 0:22 ` [Bug tree-optimization/59429] " pinskia at gcc dot gnu.org
@ 2024-01-24 16:56 ` pinskia at gcc dot gnu.org
2024-03-22 10:55 ` redbeard0531 at gmail dot com
15 siblings, 0 replies; 17+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-01-24 16:56 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |redbeard0531 at gmail dot com
--- Comment #15 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
*** Bug 113585 has been marked as a duplicate of this bug. ***
^ permalink raw reply [flat|nested] 17+ messages in thread
* [Bug tree-optimization/59429] Missed optimization opportunity in qsort-style comparison functions
2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
` (14 preceding siblings ...)
2024-01-24 16:56 ` pinskia at gcc dot gnu.org
@ 2024-03-22 10:55 ` redbeard0531 at gmail dot com
15 siblings, 0 replies; 17+ messages in thread
From: redbeard0531 at gmail dot com @ 2024-03-22 10:55 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429
--- Comment #16 from Mathias Stearn <redbeard0531 at gmail dot com> ---
Trunk still generates different code for all cases (in some cases subtly so)
for both aarch64 and x86_64: https://www.godbolt.org/z/1s6sxrMWq. For both
arches, it seems like LE and LG generate the best code.
On aarch64, they probably all have the same throughput, but EL and EG have a
size penalty with an extra instruction.
On x86_64, there is much more variety. EL and EG both get end up with a branch
rather than being branchless, which is probably bad since comparison functions
are often called in ways that the result branches are unpredictable. GE and GL
appear to have regressed since this ticket was created. They now do the
comparison twice rather than reusing the flags from the first comparison:
comGL(int, int):
xor eax, eax
cmp edi, esi
mov edx, 1
setl al
neg eax
cmp edi, esi
cmovg eax, edx
ret
comGE(int, int):
xor eax, eax
cmp edi, esi
mov edx, 1
setne al
neg eax
cmp edi, esi
cmovg eax, edx
ret
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2024-03-22 10:55 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-12-09 11:35 [Bug rtl-optimization/59429] New: Missed optimization opportunity in qsort-style comparison functions jengelh at inai dot de
2013-12-09 13:16 ` [Bug rtl-optimization/59429] " hjl.tools at gmail dot com
2013-12-09 14:02 ` jengelh at inai dot de
2013-12-09 14:05 ` jengelh at inai dot de
2014-04-04 11:37 ` ktietz at gcc dot gnu.org
2014-05-02 15:58 ` law at redhat dot com
2014-05-02 15:59 ` law at redhat dot com
2014-05-02 16:03 ` ktietz at gcc dot gnu.org
2014-05-02 16:09 ` law at redhat dot com
2014-05-02 16:14 ` ktietz at gcc dot gnu.org
2014-05-02 16:33 ` law at redhat dot com
2014-05-02 16:42 ` ktietz at gcc dot gnu.org
2014-05-05 8:48 ` rguenth at gcc dot gnu.org
2014-05-30 8:33 ` ktietz at gcc dot gnu.org
2021-07-25 0:22 ` [Bug tree-optimization/59429] " pinskia at gcc dot gnu.org
2024-01-24 16:56 ` pinskia at gcc dot gnu.org
2024-03-22 10:55 ` redbeard0531 at gmail dot com
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).