public inbox for gcc-bugs@sourceware.org
* [Bug target/106562] New: PRU: Inefficient code for zero check of 64-bit AND result
From: dimitar at gcc dot gnu.org @ 2022-08-08 19:41 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106562
Bug ID: 106562
Summary: PRU: Inefficient code for zero check of 64-bit AND result
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: dimitar at gcc dot gnu.org
Target Milestone: ---
GCC generates inefficient code for the following C snippet:

  #include <stdint.h>

  char test(uint64_t a, uint64_t b)
  {
    return a && b;
  }

Generated PRU assembly:

test:
or r0.b0, r14.b0, r14.b1
or r0.b0, r0.b0, r14.b2
or r0.b0, r0.b0, r14.b3
or r0.b0, r0.b0, r15.b0
or r0.b0, r0.b0, r15.b1
or r0.b0, r0.b0, r15.b2
or r0.b0, r0.b0, r15.b3
qbeq .L4, r0.b0, 0
mov r14, r16
mov r15, r17
sub r2, r2, 2
rsb r0, r16, 0
rsc r1, r17, 0
mov r17, r14
mov r18, r15
sbbo r3.b2, r2, 0, 2
ldi r16.b0, (63) & 0xffff
or r14, r0, r17
or r15, r1, r18
call %label(__pruabi_lsrll)
lbbo r3.b2, r2, 0, 2
add r2, r2, 2
ret
* [Bug target/106562] PRU: Inefficient code for zero check of 64-bit AND result
From: dimitar at gcc dot gnu.org @ 2022-08-08 19:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106562
Dimitar Dimitrov <dimitar at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Ever confirmed|0 |1
Status|UNCONFIRMED |ASSIGNED
Last reconfirmed| |2022-08-08
Assignee|unassigned at gcc dot gnu.org |dimitar at gcc dot gnu.org
* [Bug target/106562] PRU: Inefficient code for zero check of 64-bit (boolean) AND result
From: dimitar at gcc dot gnu.org @ 2022-09-18 18:04 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106562
--- Comment #1 from Dimitar Dimitrov <dimitar at gcc dot gnu.org> ---
I explored setting REGMODE_NATURAL_SIZE=4 for PRU. This required adjustments
in many places in the middle end to use REGMODE_NATURAL_SIZE instead of
word_mode. That, however, proved too intrusive, and I don't see how other
targets would benefit.
I'll instead go with defining an expansion of cbranchdi4 for PRU. That would be
contained in the PRU backend only, and thus safer.
* [Bug target/106562] PRU: Inefficient code for zero check of 64-bit (boolean) AND result
From: dimitar at gcc dot gnu.org @ 2022-10-05 18:11 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106562
--- Comment #2 from Dimitar Dimitrov <dimitar at gcc dot gnu.org> ---
With cbranchdi4 defined, the generated code is now 10 instructions:
test:
qbne .L5, r15, 0
qbeq .L4, r14, 0
.L5:
rsb r0, r16, 0
rsc r1, r17, 0
or r0, r0, r16
or r1, r1, r17
lsr r14, r1, 31
ret
.L4:
ldi r14, 0
ret
Defining BRANCH_COST=0, as the AVR backend does, shrinks the above to only 7
instructions.
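
As a rough C model of the difference between the two listings above: the original output ORs all eight bytes of the first argument together before one 8-bit compare, while the cbranchdi4 expansion tests the two 32-bit halves directly. The sketch below is only my reading of the assembly, not code from the patch. The function names are hypothetical, and the mapping of registers to C expressions in the comments is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Byte-wise zero test: models the chain of 8-bit ORs from the original
   report, one step per byte of the 64-bit value. */
int is_nonzero_bytewise(uint64_t x)
{
    uint8_t acc = 0;
    for (int i = 0; i < 8; i++)
        acc |= (uint8_t)(x >> (8 * i));
    return acc != 0;
}

/* Split zero test: two 32-bit comparisons, as in the qbne/qbeq pair. */
int is_nonzero_split32(uint64_t x)
{
    uint32_t lo = (uint32_t)x;
    uint32_t hi = (uint32_t)(x >> 32);
    if (hi != 0)                    /* assumed: qbne .L5, r15, 0 */
        return 1;
    return lo != 0;                 /* assumed: qbeq .L4, r14, 0 */
}

/* a && b: branch on a, then a branch-free "b != 0" computed as the top
   bit of (b | -b), which is what rsb/rsc + or + lsr appear to do. */
char test_split32(uint64_t a, uint64_t b)
{
    if (!is_nonzero_split32(a))
        return 0;                   /* assumed: .L4: ldi r14, 0 */
    return (char)((b | (0 - b)) >> 63);
}

/* Both zero tests must agree, and test_split32 must match a && b. */
void check_split32(void)
{
    const uint64_t v[] = { 0, 1, 0x100000000ull, 0x8000000000000000ull,
                           UINT64_MAX };
    for (int i = 0; i < 5; i++) {
        assert(is_nonzero_bytewise(v[i]) == is_nonzero_split32(v[i]));
        for (int j = 0; j < 5; j++)
            assert(test_split32(v[i], v[j]) == (char)(v[i] && v[j]));
    }
}
```

Calling check_split32() from a test harness exercises both halves and the sign-bit corner case.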
* [Bug target/106562] PRU: Inefficient code for zero check of 64-bit (boolean) AND result
From: cvs-commit at gcc dot gnu.org @ 2022-10-09 11:39 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106562
--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Dimitar Dimitrov <dimitar@gcc.gnu.org>:
https://gcc.gnu.org/g:e95e91eccd022a4a3a86da2749809fbad9afd20e
commit r13-3180-ge95e91eccd022a4a3a86da2749809fbad9afd20e
Author: Dimitar Dimitrov <dimitar@dinux.eu>
Date: Sun Sep 18 16:27:18 2022 +0300
pru: Add cbranchdi4 pattern
Manually expanding into 32-bit comparisons is much more efficient than
the default expansion into word-size comparisons. Note that word for PRU
is 8-bit.
PR target/106562
gcc/ChangeLog:
* config/pru/pru-protos.h (pru_noteq_condition): New
function declaration.
* config/pru/pru.cc (pru_noteq_condition): New function.
* config/pru/pru.md (cbranchdi4): Define new pattern.
gcc/testsuite/ChangeLog:
* gcc.target/pru/pr106562-1.c: New test.
* gcc.target/pru/pr106562-2.c: New test.
* gcc.target/pru/pr106562-3.c: New test.
* gcc.target/pru/pr106562-4.c: New test.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
* [Bug target/106562] PRU: Inefficient code for zero check of 64-bit (boolean) AND result
From: dimitar at gcc dot gnu.org @ 2023-06-07 19:02 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106562
--- Comment #4 from Dimitar Dimitrov <dimitar at gcc dot gnu.org> ---
The ideal PRU code sequence for the snippet would be:
char test(uint64_t a, uint64_t b)
{
return a && b;
}
or r14, r14, r15
or r16, r16, r17
umin r14, r14, 1
umin r14, r14, r16
ret
Thus I'm trying to implement the following conversion in
emit_store_flag_int():
"X != 0" -> "UMIN (X, 1)"
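
To see why the five-instruction sequence from comment #4 is equivalent to `a && b`, here is a small C model of it. It is a sketch only: umin32 is a hypothetical helper standing in for the PRU UMIN instruction, and the register-to-variable mapping in the comments is my assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned minimum; a stand-in for the PRU UMIN instruction. */
uint32_t umin32(uint32_t x, uint32_t y)
{
    return x < y ? x : y;
}

char test_umin(uint64_t a, uint64_t b)
{
    /* OR the halves: the 32-bit result is nonzero iff the 64-bit value is. */
    uint32_t a32 = (uint32_t)a | (uint32_t)(a >> 32);  /* or r14, r14, r15 */
    uint32_t b32 = (uint32_t)b | (uint32_t)(b >> 32);  /* or r16, r16, r17 */

    uint32_t r = umin32(a32, 1);  /* umin r14, r14, 1: r = (a != 0) */
    r = umin32(r, b32);           /* umin r14, r14, r16: 0 if b == 0, else r */
    return (char)r;
}

/* Compare against the plain C expression on a few corner cases. */
void check_umin(void)
{
    const uint64_t v[] = { 0, 1, 2, 0x100000000ull, UINT64_MAX };
    for (int i = 0; i < 5; i++)
        for (int j = 0; j < 5; j++)
            assert(test_umin(v[i], v[j]) == (char)(v[i] && v[j]));
}
```

The second UMIN works because its first operand is already 0 or 1, so the minimum with any nonzero b32 leaves it unchanged, while b32 == 0 forces the result to 0.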
* [Bug target/106562] PRU: Inefficient code for zero check of 64-bit (boolean) AND result
From: pinskia at gcc dot gnu.org @ 2023-06-07 19:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106562
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
See Also| |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104296
--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Dimitar Dimitrov from comment #4)
> Thus I'm trying to implementing the following conversion in
> emit_store_flag_int():
>
> "X != 0" -> "UMIN (X, 1)
That is basically what I mention in PR 104296.
* [Bug target/106562] PRU: Inefficient code for zero check of 64-bit (boolean) AND result
From: dimitar at gcc dot gnu.org @ 2023-08-29 19:38 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106562
--- Comment #6 from Dimitar Dimitrov <dimitar at gcc dot gnu.org> ---
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599276.html gives a good
analysis of why deferring expansion decisions to the backend is preferred.
Most backends already define cstore patterns, so it would not be valuable to
add generic code in emit_store_flag_int() as a fallback for when cstore
expansion fails. Such a fallback would simply go unused on most architectures.
Hence I intend to add a cstore pattern for PRU as a non-intrusive fix for this
PR.
* [Bug target/106562] PRU: Inefficient code for zero check of 64-bit (boolean) AND result
From: cvs-commit at gcc dot gnu.org @ 2023-08-30 19:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106562
--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Dimitar Dimitrov <dimitar@gcc.gnu.org>:
https://gcc.gnu.org/g:ee077d0c5793e1d4ad8d3b033ef2f0225ba6bd59
commit r14-3578-gee077d0c5793e1d4ad8d3b033ef2f0225ba6bd59
Author: Dimitar Dimitrov <dimitar@dinux.eu>
Date: Tue Jun 13 22:20:13 2023 +0300
pru: Add cstore expansion patterns
Add cstore patterns for the two specific operations which can be
efficiently expanded using the UMIN instruction:
X != 0
X == 0
The rest of the operations are rejected, and left to be expanded
by the common expansion code.
PR target/106562
gcc/ChangeLog:
* config/pru/predicates.md (const_0_operand): New predicate.
(pru_cstore_comparison_operator): Ditto.
* config/pru/pru.md (cstore<mode>4): New pattern.
(cstoredi4): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/pru/pr106562-10.c: New test.
* gcc.target/pru/pr106562-11.c: New test.
* gcc.target/pru/pr106562-5.c: New test.
* gcc.target/pru/pr106562-6.c: New test.
* gcc.target/pru/pr106562-7.c: New test.
* gcc.target/pru/pr106562-8.c: New test.
* gcc.target/pru/pr106562-9.c: New test.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
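
The commit message names the two comparisons handled by the new cstore patterns but does not spell out the emitted sequences. The C sketch below shows one way both can be expressed with an unsigned minimum. The helper names are hypothetical, and the XOR used for "X == 0" is my assumption rather than something taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned minimum; a stand-in for the PRU UMIN instruction. */
uint32_t umin_u32(uint32_t x, uint32_t y)
{
    return x < y ? x : y;
}

/* X != 0  ->  UMIN (X, 1) */
uint32_t store_ne0(uint32_t x)
{
    return umin_u32(x, 1);
}

/* X == 0  ->  UMIN (X, 1) with the low bit flipped (assumed form). */
uint32_t store_eq0(uint32_t x)
{
    return umin_u32(x, 1) ^ 1u;
}

/* 64-bit variants: OR the two halves first, then reuse the 32-bit form. */
uint32_t store_ne0_di(uint64_t x)
{
    return store_ne0((uint32_t)x | (uint32_t)(x >> 32));
}

uint32_t store_eq0_di(uint64_t x)
{
    return store_eq0((uint32_t)x | (uint32_t)(x >> 32));
}

/* Compare each helper against the plain C comparison. */
void check_cstore(void)
{
    const uint64_t v[] = { 0, 1, 0xffffffffull, 0x100000000ull, UINT64_MAX };
    for (int i = 0; i < 5; i++) {
        assert(store_ne0_di(v[i]) == (uint32_t)(v[i] != 0));
        assert(store_eq0_di(v[i]) == (uint32_t)(v[i] == 0));
        assert(store_ne0((uint32_t)v[i]) == (uint32_t)((uint32_t)v[i] != 0));
        assert(store_eq0((uint32_t)v[i]) == (uint32_t)((uint32_t)v[i] == 0));
    }
}
```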
* [Bug target/106562] PRU: Inefficient code for zero check of 64-bit (boolean) AND result
From: dimitar at gcc dot gnu.org @ 2023-08-30 20:13 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106562
Dimitar Dimitrov <dimitar at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Status|ASSIGNED |RESOLVED
--- Comment #8 from Dimitar Dimitrov <dimitar at gcc dot gnu.org> ---
Should be fixed now in trunk.