public inbox for gcc-bugs@sourceware.org
* [Bug target/82524] [7/8 Regression] expensive-optimizations produces wrong results
       [not found] <bug-82524-4@http.gcc.gnu.org/bugzilla/>
@ 2021-10-12 16:21 ` cvs-commit at gcc dot gnu.org
  2023-11-08 20:56 ` cvs-commit at gcc dot gnu.org
  1 sibling, 0 replies; 2+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-10-12 16:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82524

--- Comment #21 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:b37351e3279d192d5d4682f002abe5b2e133bba6

commit r12-4359-gb37351e3279d192d5d4682f002abe5b2e133bba6
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Tue Oct 12 18:20:38 2021 +0200

    i386: Improve workaround for PR82524 LRA limitation [PR85730]

    As explained in PR82524, LRA is not able to reload a strict_low_part in-out
    operand with a matched input operand.  The patch introduces a workaround:
    LRA is allowed to generate an instruction with a non-matched input operand,
    which is split post reload into an instruction that inserts the non-matched
    input operand into the in-out operand and an instruction that uses the
    matched operand.

    The generated code improves from:

            movsbl  %dil, %edx
            movl    %edi, %eax
            sall    $3, %edx
            movb    %dl, %al

    to:

            movl    %edi, %eax
            movb    %dil, %al
            salb    $3, %al

    which is still not optimal, but the code is one instruction shorter and
    does not use a temporary register.

    2021-10-12  Uroš Bizjak  <ubizjak@gmail.com>

    gcc/
            PR target/85730
            PR target/82524
            * config/i386/i386.md (*add<mode>_1_slp): Rewrite as
            define_insn_and_split pattern.  Add alternative 1 and split it
            post reload to insert operand 1 into the low part of operand 0.
            (*sub<mode>_1_slp): Ditto.
            (*and<mode>_1_slp): Ditto.
            (*<any_or:code><mode>_1_slp): Ditto.
            (*ashl<mode>3_1_slp): Ditto.
            (*<any_shiftrt:insn><mode>3_1_slp): Ditto.
            (*<any_rotate:insn><mode>3_1_slp): Ditto.
            (*neg<mode>_1_slp): New insn_and_split pattern.
            (*one_cmpl<mode>_1_slp): Ditto.

    gcc/testsuite/
            PR target/85730
            PR target/82524
            * gcc.target/i386/pr85730.c: New test.
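
For readers following the assembly above, here is a minimal C sketch of the
operation both sequences compute: shift only the low byte of a 32-bit value
left by 3 and leave the upper 24 bits alone. The function name shl3_low_byte
is hypothetical and is not taken from pr85730.c; whether GCC matches the
*ashl<mode>3_1_slp pattern for this exact source is an assumption, so treat
it as a model of the semantics rather than a reproducer.

```c
#include <stdint.h>

/* Hypothetical helper (not from the GCC testsuite): models the effect of
   "movl %edi, %eax; movb %dil, %al; salb $3, %al".  */
uint32_t
shl3_low_byte (uint32_t x)
{
  /* Shift within 8 bits only; bits shifted past bit 7 are discarded,
     as with salb $3, %al.  */
  uint8_t low = (uint8_t) (x << 3);

  /* The upper 24 bits are carried over unchanged (movl %edi, %eax).  */
  return (x & 0xffffff00u) | low;
}
```

The point of the strict_low_part form is visible in the return expression:
only the low byte of the destination is written, so the operation needs no
temporary register when the input already sits in the destination.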


* [Bug target/82524] [7/8 Regression] expensive-optimizations produces wrong results
       [not found] <bug-82524-4@http.gcc.gnu.org/bugzilla/>
  2021-10-12 16:21 ` [Bug target/82524] [7/8 Regression] expensive-optimizations produces wrong results cvs-commit at gcc dot gnu.org
@ 2023-11-08 20:56 ` cvs-commit at gcc dot gnu.org
  1 sibling, 0 replies; 2+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-11-08 20:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82524

--- Comment #22 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:dced5ae64703507a7159972316a1dde48e5f7470

commit r14-5254-gdced5ae64703507a7159972316a1dde48e5f7470
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Wed Nov 8 21:46:26 2023 +0100

    i386: Apply LRA reload workaround to insns with high registers [PR82524]

    LRA is not able to reload a zero_extracted in-out operand with a matched
    input operand, the same limitation it has with a strict_low_part in-out
    operand.  The patch extends the strict_low_part workaround to the
    zero_extracted in-out operand case: LRA is allowed to generate an
    instruction with a non-matched input operand, which is split post reload
    into an instruction that inserts the non-matched input operand into the
    in-out operand and an instruction that uses the matched operand.

    The generated code from the pr82524.c testcase improves from:

            movl    %esi, %ecx
            movl    %edi, %eax
            movsbl  %ch, %esi
            addl    %esi, %edx
            movb    %dl, %ah

    to:

            movl    %edi, %eax
            movl    %esi, %ecx
            movb    %ch, %ah
            addb    %dl, %ah

    The compiler is now also able to handle non-commutative operations:

            movl    %edi, %eax
            movl    %esi, %ecx
            movb    %ch, %ah
            subb    %dl, %ah

    and unary operations:

            movl    %edi, %eax
            movl    %esi, %edx
            movb    %dh, %ah
            negb    %ah

    The patch also makes the split condition of the splitters more robust,
    ensuring that only alternatives with unmatched operands are split.

            PR target/82524

    gcc/ChangeLog:

            * config/i386/i386.md (*add<mode>_1_slp):
            Split insn only for unmatched operand 0.
            (*sub<mode>_1_slp): Ditto.
            (*<any_logic:code><mode>_1_slp): Merge pattern from
            "*and<mode>_1_slp" and "*<any_logic:code><mode>_1_slp" using
            any_logic code iterator.  Split insn only for unmatched operand 0.
            (*neg<mode>1_slp): Split insn only for unmatched operand 0.
            (*one_cmpl<mode>_1_slp): Ditto.
            (*ashl<mode>3_1_slp): Ditto.
            (*<any_shiftrt:insn><mode>_1_slp): Ditto.
            (*<any_rotate:insn><mode>_1_slp): Ditto.
            (*addqi_ext<mode>_1): Redefine as define_insn_and_split.  Add
            alternative 1 and split insn after reload for unmatched operand 0.
            (*<plusminus:insn>qi_ext<mode>_2): Merge pattern from
            "*addqi_ext<mode>_2" and "*subqi_ext<mode>_2" using plusminus code
            iterator. Redefine as define_insn_and_split.  Add alternative 1
            and split insn after reload for unmatched operand 0.
            (*subqi_ext<mode>_1): Redefine as define_insn_and_split.  Add
            alternative 1 and split insn after reload for unmatched operand 0.
            (*<any_logic:code>qi_ext<mode>_0): Merge pattern from
            "*andqi_ext<mode>_0" and "*<any_logic:code>qi_ext<mode>_0" using
            any_logic code iterator.
            (*<any_logic:code>qi_ext<mode>_1): Merge pattern from
            "*andqi_ext<mode>_1" and "*<any_logic:code>qi_ext<mode>_1" using
            any_logic code iterator. Redefine as define_insn_and_split.  Add
            alternative 1 and split insn after reload for unmatched operand 0.
            (*<any_logic:code>qi_ext<mode>_1_cc): Merge pattern from
            "*andqi_ext<mode>_1_cc" and "*xorqi_ext<mode>_1_cc" using any_logic
            code iterator.  Redefine as define_insn_and_split.  Add alternative 1
            and split insn after reload for unmatched operand 0.
            (*<any_logic:code>qi_ext<mode>_2): Merge pattern from
            "*andqi_ext<mode>_2" and "*<any_or:code>qi_ext<mode>_2" using
            any_logic code iterator. Redefine as define_insn_and_split.  Add
            alternative 1 and split insn after reload for unmatched operand 0.
            (*<any_logic:code>qi_ext<mode>_3): Redefine as
            define_insn_and_split.  Add alternative 1 and split insn after
            reload for unmatched operand 0.
            (*negqi_ext<mode>_1): Rename from "*negqi_ext<mode>_2".  Add
            alternative 1 and split insn after reload for unmatched operand 0.
            (*one_cmplqi_ext<mode>_1): Ditto.
            (*ashlqi_ext<mode>_1): Ditto.
            (*<any_shiftrt:insn>qi_ext<mode>_1): Ditto.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr78904-1.c (test_sub): New test.
            * gcc.target/i386/pr78904-1a.c (test_sub): Ditto.
            * gcc.target/i386/pr78904-1b.c (test_sub): Ditto.
            * gcc.target/i386/pr78904-2.c (test_sub): Ditto.
            * gcc.target/i386/pr78904-2a.c (test_sub): Ditto.
            * gcc.target/i386/pr78904-2b.c (test_sub): Ditto.
            * gcc.target/i386/pr78952-4.c (test_sub): Ditto.
            * gcc.target/i386/pr82524.c: New test.
            * gcc.target/i386/pr82524-1.c: New test.
            * gcc.target/i386/pr82524-2.c: New test.
            * gcc.target/i386/pr82524-3.c: New test.
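
The three assembly sequences in this commit message all write bits 8..15 of
the result (the %ah register) and leave the remaining bits unchanged. The C
sketch below models their semantics as read from the assembly. All function
names are hypothetical and are not taken from the pr82524*.c tests, and it is
an assumption that GCC would select the *qi_ext patterns for this exact
source.

```c
#include <stdint.h>

/* Hypothetical helper: replace bits 8..15 of A with BYTE and keep the
   other bits unchanged (the zero_extract in-out destination).  */
static uint32_t
insert_byte1 (uint32_t a, uint8_t byte)
{
  return (a & 0xffff00ffu) | ((uint32_t) byte << 8);
}

/* Models "movb %ch, %ah; addb %dl, %ah": byte 1 of B plus low byte of C.  */
uint32_t
add_byte1 (uint32_t a, uint32_t b, uint32_t c)
{
  uint8_t hi = (uint8_t) (b >> 8);
  return insert_byte1 (a, (uint8_t) (hi + (uint8_t) c));
}

/* Models "movb %ch, %ah; subb %dl, %ah": the non-commutative case,
   byte 1 of B minus low byte of C.  */
uint32_t
sub_byte1 (uint32_t a, uint32_t b, uint32_t c)
{
  uint8_t hi = (uint8_t) (b >> 8);
  return insert_byte1 (a, (uint8_t) (hi - (uint8_t) c));
}

/* Models "movb %dh, %ah; negb %ah": the unary case, negated byte 1 of B.  */
uint32_t
neg_byte1 (uint32_t a, uint32_t b)
{
  uint8_t hi = (uint8_t) (b >> 8);
  return insert_byte1 (a, (uint8_t) -hi);
}
```

In each function the arithmetic wraps at 8 bits, which mirrors the byte-sized
addb, subb and negb instructions. The subtraction shows why the
non-commutative case needs the insert step: the operands cannot simply be
swapped to make the input match the destination.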

