public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/97950] New: Unoptimal code generation with __builtin_*_overflow{,_p} for short and __int128
@ 2020-11-23 13:49 denis.campredon at gmail dot com
2020-11-23 16:35 ` [Bug tree-optimization/97950] " jakub at gcc dot gnu.org
` (6 more replies)
0 siblings, 7 replies; 8+ messages in thread
From: denis.campredon at gmail dot com @ 2020-11-23 13:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97950
Bug ID: 97950
Summary: Unoptimal code generation with
__builtin_*_overflow{,_p} for short and __int128
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: denis.campredon at gmail dot com
Target Milestone: ---
For the following code, the generation is unoptimal on x86-64.
For most of the functions with `short` a jump is generated.
For the functions with `__int128` all but `mul_overflow_{,un}signed___int128`
seems to have extra `mov` produced.
The same problems apply to all the `__builtin_*_overflow_p`
-------------
#define TEST_OVERFLOW(type) \
bool mul_overflow_signed_##type(signed type a, signed type b, signed type c)
{return __builtin_mul_overflow(a,b,&c);} \
bool add_overflow_signed_##type(signed type a, signed type b, signed type c)
{return __builtin_add_overflow(a,b,&c);} \
bool sub_overflow_signed_##type(signed type a, signed type b, signed type c)
{return __builtin_sub_overflow(a,b,&c);} \
bool mul_overflow_unsigned_##type(unsigned type a, unsigned type b, unsigned
type c) {return __builtin_mul_overflow(a,b,&c);} \
bool add_overflow_unsigned_##type(unsigned type a, unsigned type b, unsigned
type c) {return __builtin_add_overflow(a,b,&c);} \
bool sub_overflow_unsigned_##type(unsigned type a, unsigned type b, unsigned
type c) {return __builtin_sub_overflow(a,b,&c);} \
TEST_OVERFLOW(short)
TEST_OVERFLOW(__int128)
-------------
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug tree-optimization/97950] Unoptimal code generation with __builtin_*_overflow{,_p} for short and __int128
2020-11-23 13:49 [Bug tree-optimization/97950] New: Unoptimal code generation with __builtin_*_overflow{,_p} for short and __int128 denis.campredon at gmail dot com
@ 2020-11-23 16:35 ` jakub at gcc dot gnu.org
2020-11-23 18:01 ` denis.campredon at gmail dot com
` (5 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-11-23 16:35 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97950
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jakub at gcc dot gnu.org
--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 49613
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49613&action=edit
gcc11-pr97950.patch
Untested fix for the short cases.
As for __int128, I think it would be better if you filed a separate issue,
because it has nothing to do with the short one, and mentioned which moves you
find redundant. Generally, some redundant moves will be there with double-word
arithmetics, it is a matter of when exactly it is the right time to lower
double-word operations (and which) into word ones, doing it too early prevents
e.g. STV from doing its job and e.g. doing for __int128 some operations in SSE
registers, while doing it too late may result in the already assigning pairs of
GP registers for the 128-bit pseudos and while some useless moves can be
recovered afterwards, certainly not all of them. x86 doesn't really have
instructions that would allow implementing the 128-bit multiplications with
overflow efficiently.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug tree-optimization/97950] Unoptimal code generation with __builtin_*_overflow{,_p} for short and __int128
2020-11-23 13:49 [Bug tree-optimization/97950] New: Unoptimal code generation with __builtin_*_overflow{,_p} for short and __int128 denis.campredon at gmail dot com
2020-11-23 16:35 ` [Bug tree-optimization/97950] " jakub at gcc dot gnu.org
@ 2020-11-23 18:01 ` denis.campredon at gmail dot com
2020-11-23 18:25 ` [Bug target/97950] " pinskia at gcc dot gnu.org
` (4 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: denis.campredon at gmail dot com @ 2020-11-23 18:01 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97950
--- Comment #2 from denis.campredon at gmail dot com ---
Thanks for your fast patch. I've opened PR97961 for the __int128 problem
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug target/97950] Unoptimal code generation with __builtin_*_overflow{,_p} for short and __int128
2020-11-23 13:49 [Bug tree-optimization/97950] New: Unoptimal code generation with __builtin_*_overflow{,_p} for short and __int128 denis.campredon at gmail dot com
2020-11-23 16:35 ` [Bug tree-optimization/97950] " jakub at gcc dot gnu.org
2020-11-23 18:01 ` denis.campredon at gmail dot com
@ 2020-11-23 18:25 ` pinskia at gcc dot gnu.org
2020-11-23 20:26 ` ubizjak at gmail dot com
` (3 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2020-11-23 18:25 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97950
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Severity|normal |enhancement
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug target/97950] Unoptimal code generation with __builtin_*_overflow{,_p} for short and __int128
2020-11-23 13:49 [Bug tree-optimization/97950] New: Unoptimal code generation with __builtin_*_overflow{,_p} for short and __int128 denis.campredon at gmail dot com
` (2 preceding siblings ...)
2020-11-23 18:25 ` [Bug target/97950] " pinskia at gcc dot gnu.org
@ 2020-11-23 20:26 ` ubizjak at gmail dot com
2020-11-23 20:36 ` jakub at gcc dot gnu.org
` (2 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: ubizjak at gmail dot com @ 2020-11-23 20:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97950
--- Comment #3 from Uroš Bizjak <ubizjak at gmail dot com> ---
Comment on attachment 49613
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49613
gcc11-pr97950.patch
>+(define_insn_and_split "*setcc_hi_1"
>+ [(set (match_operand:HI 0 "register_operand" "=q")
>+ (match_operator:HI 1 "ix86_comparison_operator"
>+ [(reg FLAGS_REG) (const_int 0)]))]
>+ "!TARGET_PARTIAL_REG_STALL"
>+ "#"
>+ "&& reload_completed"
>+ [(set (match_dup 2) (match_dup 1))
>+ (set (match_dup 0) (zero_extend:DI (match_dup 2)))]
zero_extend:HI
Perhaps you could apply SWI24 mode iterator to existing *setcc_si_1_and and
*setcc_si_1_movzbl?
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug target/97950] Unoptimal code generation with __builtin_*_overflow{,_p} for short and __int128
2020-11-23 13:49 [Bug tree-optimization/97950] New: Unoptimal code generation with __builtin_*_overflow{,_p} for short and __int128 denis.campredon at gmail dot com
` (3 preceding siblings ...)
2020-11-23 20:26 ` ubizjak at gmail dot com
@ 2020-11-23 20:36 ` jakub at gcc dot gnu.org
2020-11-24 9:46 ` cvs-commit at gcc dot gnu.org
2020-11-24 9:51 ` [Bug target/97950] Unoptimal code generation with __builtin_*_overflow{,_p} for short jakub at gcc dot gnu.org
6 siblings, 0 replies; 8+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-11-23 20:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97950
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Attachment #49613|0 |1
is obsolete| |
Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org
Status|UNCONFIRMED |ASSIGNED
Ever confirmed|0 |1
Last reconfirmed| |2020-11-23
--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 49615
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49615&action=edit
gcc11-pr97950.patch
Updated patch.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug target/97950] Unoptimal code generation with __builtin_*_overflow{,_p} for short and __int128
2020-11-23 13:49 [Bug tree-optimization/97950] New: Unoptimal code generation with __builtin_*_overflow{,_p} for short and __int128 denis.campredon at gmail dot com
` (4 preceding siblings ...)
2020-11-23 20:36 ` jakub at gcc dot gnu.org
@ 2020-11-24 9:46 ` cvs-commit at gcc dot gnu.org
2020-11-24 9:51 ` [Bug target/97950] Unoptimal code generation with __builtin_*_overflow{,_p} for short jakub at gcc dot gnu.org
6 siblings, 0 replies; 8+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-11-24 9:46 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97950
--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:a1dd66b108cba086f58448ccbe9bf57b0a342f9a
commit r11-5279-ga1dd66b108cba086f58448ccbe9bf57b0a342f9a
Author: Jakub Jelinek <jakub@redhat.com>
Date: Tue Nov 24 10:45:40 2020 +0100
i386: Add *setcc_hi_1* define_insn_and_split [PR97950]
As the following testcase shows, unlike char, int or long long sized
__builtin_*_overflow{,_p}, for short sized one in most cases the ce1 pass
doesn't optimize the jo/jno or jc/jnc jumps with setting of a pseudo to 0/1
into seto/setc. The reason is missing *setcc_hi_1* pattern. The following
patch implements it using mode iterators so that on i486 and pentium?
one can get the zero extension through and instead of movzbw.
2020-11-24 Jakub Jelinek <jakub@redhat.com>
PR target/97950
* config/i386/i386.md (*setcc_si_1_and): Macroize into...
(*setcc_<mode>_1_and): New define_insn_and_split with SWI24
iterator.
(*setcc_si_1_movzbl): Macroize into...
(*setcc_<mode>_1_movzbl): New define_insn_and_split with SWI24
iterator.
* gcc.target/i386/pr97950.c: New test.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug target/97950] Unoptimal code generation with __builtin_*_overflow{,_p} for short
2020-11-23 13:49 [Bug tree-optimization/97950] New: Unoptimal code generation with __builtin_*_overflow{,_p} for short and __int128 denis.campredon at gmail dot com
` (5 preceding siblings ...)
2020-11-24 9:46 ` cvs-commit at gcc dot gnu.org
@ 2020-11-24 9:51 ` jakub at gcc dot gnu.org
6 siblings, 0 replies; 8+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-11-24 9:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97950
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|Unoptimal code generation |Unoptimal code generation
|with |with
|__builtin_*_overflow{,_p} |__builtin_*_overflow{,_p}
|for short and __int128 |for short
Resolution|--- |FIXED
Status|ASSIGNED |RESOLVED
--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed for 11.1, too risky for backports.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2020-11-24 9:51 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-23 13:49 [Bug tree-optimization/97950] New: Unoptimal code generation with __builtin_*_overflow{,_p} for short and __int128 denis.campredon at gmail dot com
2020-11-23 16:35 ` [Bug tree-optimization/97950] " jakub at gcc dot gnu.org
2020-11-23 18:01 ` denis.campredon at gmail dot com
2020-11-23 18:25 ` [Bug target/97950] " pinskia at gcc dot gnu.org
2020-11-23 20:26 ` ubizjak at gmail dot com
2020-11-23 20:36 ` jakub at gcc dot gnu.org
2020-11-24 9:46 ` cvs-commit at gcc dot gnu.org
2020-11-24 9:51 ` [Bug target/97950] Unoptimal code generation with __builtin_*_overflow{,_p} for short jakub at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).