public inbox for gcc-bugs@sourceware.org
* [Bug target/82524] [7/8 Regression] expensive-optimizations produces wrong results
From: cvs-commit at gcc dot gnu.org @ 2021-10-12 16:21 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82524
--- Comment #21 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:
https://gcc.gnu.org/g:b37351e3279d192d5d4682f002abe5b2e133bba6
commit r12-4359-gb37351e3279d192d5d4682f002abe5b2e133bba6
Author: Uros Bizjak <ubizjak@gmail.com>
Date: Tue Oct 12 18:20:38 2021 +0200
i386: Improve workaround for PR82524 LRA limitation [PR85730]
As explained in PR82524, LRA is not able to reload a strict_low_part in-out
operand with a matched input operand. The patch introduces a workaround
where we allow LRA to generate an instruction with a non-matched input
operand, which is split post reload into an instruction that inserts the
non-matched input operand into the in-out operand and an instruction that
uses the matched operand.
The generated code improves from:
movsbl %dil, %edx
movl %edi, %eax
sall $3, %edx
movb %dl, %al
to:
movl %edi, %eax
movb %dil, %al
salb $3, %al
which is still not optimal, but the code is one instruction shorter and
does not use a temporary register.
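To make the two sequences above easier to compare, here is a minimal C sketch that models each of them on a 32-bit value. It is not the pr85730.c testcase and not code from the patch; the helper names (set_low_byte, shl3_low_old, shl3_low_new, check_equivalence) are hypothetical and exist only for illustration. The sketch assumes the operation is a left shift by 3 of the low byte with bits 8..31 preserved, as the assembly suggests.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: replace the low 8 bits of dst with byte and leave
   bits 8..31 untouched, which is what a strict_low_part write does. */
static uint32_t set_low_byte(uint32_t dst, uint8_t byte)
{
    return (dst & 0xFFFFFF00u) | byte;
}

/* Model of the old sequence: compute in a temporary, then insert. */
static uint32_t shl3_low_old(uint32_t x)
{
    uint32_t tmp = x & 0xFFu;            /* movsbl %dil, %edx ...        */
    if (tmp & 0x80u)
        tmp |= 0xFFFFFF00u;              /* ... with sign extension      */
    uint32_t res = x;                    /* movl %edi, %eax              */
    tmp <<= 3;                           /* sall $3, %edx                */
    return set_low_byte(res, (uint8_t)tmp); /* movb %dl, %al             */
}

/* Model of the new sequence: insert the input into the low part of the
   destination first, then operate on the destination in place. */
static uint32_t shl3_low_new(uint32_t x)
{
    uint32_t res = x;                                /* movl %edi, %eax */
    res = set_low_byte(res, (uint8_t)x);             /* movb %dil, %al  */
    uint8_t lo = (uint8_t)((res & 0xFFu) << 3);      /* salb $3, %al    */
    return set_low_byte(res, lo);
}

/* Both models agree for every possible low byte. */
static void check_equivalence(void)
{
    for (uint32_t b = 0; b < 256; b++) {
        uint32_t x = 0x12345600u | b;
        assert(shl3_low_old(x) == shl3_low_new(x));
    }
}
```

In this model the insert step in shl3_low_new is redundant because the source and destination hold the same value, which matches the commit's remark that the new code is still not optimal.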
2021-10-12 Uroš Bizjak <ubizjak@gmail.com>
gcc/
PR target/85730
PR target/82524
* config/i386/i386.md (*add<mode>_1_slp): Rewrite as
define_insn_and_split pattern. Add alternative 1 and split it
post reload to insert operand 1 into the low part of operand 0.
(*sub<mode>_1_slp): Ditto.
(*and<mode>_1_slp): Ditto.
(*<any_or:code><mode>_1_slp): Ditto.
(*ashl<mode>3_1_slp): Ditto.
(*<any_shiftrt:insn><mode>3_1_slp): Ditto.
(*<any_rotate:insn><mode>3_1_slp): Ditto.
(*neg<mode>_1_slp): New insn_and_split pattern.
(*one_cmpl<mode>_1_slp): Ditto.
gcc/testsuite/
PR target/85730
PR target/82524
* gcc.target/i386/pr85730.c: New test.
* [Bug target/82524] [7/8 Regression] expensive-optimizations produces wrong results
From: cvs-commit at gcc dot gnu.org @ 2023-11-08 20:56 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82524
--- Comment #22 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:
https://gcc.gnu.org/g:dced5ae64703507a7159972316a1dde48e5f7470
commit r14-5254-gdced5ae64703507a7159972316a1dde48e5f7470
Author: Uros Bizjak <ubizjak@gmail.com>
Date: Wed Nov 8 21:46:26 2023 +0100
i386: Apply LRA reload workaround to insns with high registers [PR82524]
LRA is not able to reload a zero_extracted in-out operand with a matched
input operand, just as it is not able to reload a strict_low_part in-out
operand. The patch applies the strict_low_part workaround to the
zero_extracted in-out operand case as well: we allow LRA to generate an
instruction with a non-matched input operand, which is split post reload
into an instruction that inserts the non-matched input operand into the
in-out operand and an instruction that uses the matched operand.
The generated code from the pr82524.c testcase improves from:
movl %esi, %ecx
movl %edi, %eax
movsbl %ch, %esi
addl %esi, %edx
movb %dl, %ah
to:
movl %edi, %eax
movl %esi, %ecx
movb %ch, %ah
addb %dl, %ah
The compiler is now also able to handle non-commutative operations:
movl %edi, %eax
movl %esi, %ecx
movb %ch, %ah
subb %dl, %ah
and unary operations:
movl %edi, %eax
movl %esi, %edx
movb %dh, %ah
negb %ah
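The three improved sequences above can be modelled in C as follows. This is only a sketch of what the assembly computes, assuming the arguments arrive in %edi, %esi and %edx and that "high register" means bits 8..15 of a 32-bit value; it is not the pr82524.c testcase, and the names get_byte1, set_byte1, add_byte1, sub_byte1 and neg_byte1 are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers modelling a high byte register such as %ah, %ch
   or %dh: bits 8..15 of the containing 32-bit register. */
static uint8_t get_byte1(uint32_t v)
{
    return (uint8_t)(v >> 8);
}

static uint32_t set_byte1(uint32_t dst, uint8_t b)
{
    return (dst & 0xFFFF00FFu) | ((uint32_t)b << 8);
}

/* movl %edi, %eax; movl %esi, %ecx; movb %ch, %ah; addb %dl, %ah */
static uint32_t add_byte1(uint32_t a, uint32_t b, uint32_t c)
{
    uint32_t res = set_byte1(a, get_byte1(b));      /* insert: movb %ch, %ah */
    return set_byte1(res, (uint8_t)(get_byte1(res) + (uint8_t)c));
}

/* Same shape for a non-commutative operation: subb %dl, %ah */
static uint32_t sub_byte1(uint32_t a, uint32_t b, uint32_t c)
{
    uint32_t res = set_byte1(a, get_byte1(b));      /* insert: movb %ch, %ah */
    return set_byte1(res, (uint8_t)(get_byte1(res) - (uint8_t)c));
}

/* movl %edi, %eax; movl %esi, %edx; movb %dh, %ah; negb %ah */
static uint32_t neg_byte1(uint32_t a, uint32_t b)
{
    uint32_t res = set_byte1(a, get_byte1(b));      /* insert: movb %dh, %ah */
    return set_byte1(res, (uint8_t)(0u - get_byte1(res)));
}
```

In each function the first statement corresponds to the insert instruction and the second to the byte operation on the matched operand; all arithmetic wraps modulo 256, as the 8-bit instructions do.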
The patch also makes the split conditions of the splitters more robust,
ensuring that only alternatives with unmatched operands are split.
PR target/82524
gcc/ChangeLog:
* config/i386/i386.md (*add<mode>_1_slp):
Split insn only for unmatched operand 0.
(*sub<mode>_1_slp): Ditto.
(*<any_logic:code><mode>_1_slp): Merge pattern from
"*and<mode>_1_slp" and "*<any_logic:code><mode>_1_slp" using
any_logic code iterator. Split insn only for unmatched operand 0.
(*neg<mode>1_slp): Split insn only for unmatched operand 0.
(*one_cmpl<mode>_1_slp): Ditto.
(*ashl<mode>3_1_slp): Ditto.
(*<any_shiftrt:insn><mode>_1_slp): Ditto.
(*<any_rotate:insn><mode>_1_slp): Ditto.
(*addqi_ext<mode>_1): Redefine as define_insn_and_split. Add
alternative 1 and split insn after reload for unmatched operand 0.
(*<plusminus:insn>qi_ext<mode>_2): Merge pattern from
"*addqi_ext<mode>_2" and "*subqi_ext<mode>_2" using plusminus code
iterator. Redefine as define_insn_and_split. Add alternative 1
and split insn after reload for unmatched operand 0.
(*subqi_ext<mode>_1): Redefine as define_insn_and_split. Add
alternative 1 and split insn after reload for unmatched operand 0.
(*<any_logic:code>qi_ext<mode>_0): Merge pattern from
"*andqi_ext<mode>_0" and "*<any_logic:code>qi_ext<mode>_0" using
any_logic code iterator.
(*<any_logic:code>qi_ext<mode>_1): Merge pattern from
"*andqi_ext<mode>_1" and "*<any_logic:code>qi_ext<mode>_1" using
any_logic code iterator. Redefine as define_insn_and_split. Add
alternative 1 and split insn after reload for unmatched operand 0.
(*<any_logic:code>qi_ext<mode>_1_cc): Merge pattern from
"*andqi_ext<mode>_1_cc" and "*xorqi_ext<mode>_1_cc" using any_logic
code iterator. Redefine as define_insn_and_split. Add alternative 1
and split insn after reload for unmatched operand 0.
(*<any_logic:code>qi_ext<mode>_2): Merge pattern from
"*andqi_ext<mode>_2" and "*<any_or:code>qi_ext<mode>_2" using
any_logic code iterator. Redefine as define_insn_and_split. Add
alternative 1 and split insn after reload for unmatched operand 0.
(*<any_logic:code>qi_ext<mode>_3): Redefine as define_insn_and_split.
Add alternative 1 and split insn after reload for unmatched operand 0.
(*negqi_ext<mode>_1): Rename from "*negqi_ext<mode>_2". Add
alternative 1 and split insn after reload for unmatched operand 0.
(*one_cmplqi_ext<mode>_1): Ditto.
(*ashlqi_ext<mode>_1): Ditto.
(*<any_shiftrt:insn>qi_ext<mode>_1): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr78904-1.c (test_sub): New test.
* gcc.target/i386/pr78904-1a.c (test_sub): Ditto.
* gcc.target/i386/pr78904-1b.c (test_sub): Ditto.
* gcc.target/i386/pr78904-2.c (test_sub): Ditto.
* gcc.target/i386/pr78904-2a.c (test_sub): Ditto.
* gcc.target/i386/pr78904-2b.c (test_sub): Ditto.
* gcc.target/i386/pr78952-4.c (test_sub): Ditto.
* gcc.target/i386/pr82524.c: New test.
* gcc.target/i386/pr82524-1.c: New test.
* gcc.target/i386/pr82524-2.c: New test.
* gcc.target/i386/pr82524-3.c: New test.