public inbox for gcc-bugs@sourceware.org
* [Bug target/78904] zero-extracts are not effective
       [not found] <bug-78904-4@http.gcc.gnu.org/bugzilla/>
@ 2023-04-19 15:04 ` cvs-commit at gcc dot gnu.org
  2023-06-23  8:22 ` roger at nextmovesoftware dot com
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-04-19 15:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78904

--- Comment #15 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:0df6d181230f0480547ed08b4e4354db68242724

commit r14-85-g0df6d181230f0480547ed08b4e4354db68242724
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Wed Apr 19 17:00:52 2023 +0200

    i386: Emit compares between high registers and memory

    The following code:

    typedef __SIZE_TYPE__ size_t;

    struct S1s
    {
      char pad1;
      char val;
      short pad2;
    };

    extern char ts[256];

    _Bool foo (struct S1s a, size_t i)
    {
      return (ts[i] > a.val);
    }

    compiles with -O2 to:

            movl    %edi, %eax
            movsbl  %ah, %edi
            cmpb    %dil, ts(%rsi)
            setg    %al
            ret

    The compare could use the high register %ah instead of %dil:

            movl    %edi, %eax
            cmpb    ts(%rsi), %ah
            setl    %al
            ret

    Use the any_extract code iterator to handle signed and unsigned extracts
    from the high register, and introduce peephole2 patterns to propagate
    the norex memory operand into the compare insn.

    gcc/ChangeLog:

            PR target/78904
            PR target/78952
            * config/i386/i386.md (*cmpqi_ext<mode>_1_mem_rex64): New insn
            pattern.
            (*cmpqi_ext<mode>_1): Use nonimmediate_operand predicate
            for operand 0. Use any_extract code iterator.
            (*cmpqi_ext<mode>_1 peephole2): New peephole2 pattern.
            (*cmpqi_ext<mode>_2): Use any_extract code iterator.
            (*cmpqi_ext<mode>_3_mem_rex64): New insn pattern.
            (*cmpqi_ext<mode>_3): Use general_operand predicate
            for operand 1. Use any_extract code iterator.
            (*cmpqi_ext<mode>_3 peephole2): New peephole2 pattern.
            (*cmpqi_ext<mode>_4): Use any_extract code iterator.

    gcc/testsuite/ChangeLog:

            PR target/78904
            PR target/78952
            * gcc.target/i386/pr78952-3.c: New test.
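
The rewrite in the commit above swaps the compare operands and flips the condition (setg becomes setl). The following is a minimal sketch, not taken from the commit, that models both forms in portable C and checks that they agree. The helper names and the 32-bit "register image" are assumptions made for illustration; the byte in bits 8..15 stands in for %ah.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: the sign-extended byte held in bits 8..15 of a
   32-bit register image (the value %ah denotes once the struct is in
   %eax).  */
static int high_byte_signed (uint32_t reg)
{
  int b = (int) ((reg >> 8) & 0xff);
  return b >= 128 ? b - 256 : b;
}

/* Shape of the old sequence: extract first (movsbl %ah, %edi), then
   compare memory against the extracted value (setg).  */
static bool cmp_extract_then_compare (signed char mem, uint32_t reg)
{
  int val = high_byte_signed (reg);
  return mem > val;
}

/* Shape of the new sequence: the high byte is the first operand of the
   compare, so the condition is reversed (setl).  */
static bool cmp_high_byte_direct (signed char mem, uint32_t reg)
{
  return high_byte_signed (reg) < mem;
}

/* Exhaustively compare the two forms over all byte pairs.  */
static void check_compare_forms (void)
{
  for (uint32_t hi = 0; hi < 256; hi++)
    for (int m = -128; m < 128; m++)
      {
        uint32_t reg = 0xabcd0012u | (hi << 8);
        signed char mem = (signed char) m;
        assert (cmp_extract_then_compare (mem, reg)
                == cmp_high_byte_direct (mem, reg));
      }
}
```

Calling check_compare_forms () runs the exhaustive check; it only models the arithmetic and says nothing about which instructions a given compiler emits.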


* [Bug target/78904] zero-extracts are not effective
       [not found] <bug-78904-4@http.gcc.gnu.org/bugzilla/>
  2023-04-19 15:04 ` [Bug target/78904] zero-extracts are not effective cvs-commit at gcc dot gnu.org
@ 2023-06-23  8:22 ` roger at nextmovesoftware dot com
  2023-06-23 17:47 ` segher at gcc dot gnu.org
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 7+ messages in thread
From: roger at nextmovesoftware dot com @ 2023-06-23  8:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78904

Roger Sayle <roger at nextmovesoftware dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |roger at nextmovesoftware dot com

--- Comment #16 from Roger Sayle <roger at nextmovesoftware dot com> ---
Just to warn people in advance, the test case pr78904-1b.c is expected to start
FAILing with the commit of
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622079.html and is
scheduled to be resolved 24-48 hours later (over the weekend) by
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622078.html
As explained in https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622083.html
this is to investigate additional tweaks and whether alternate fixes are more
appropriate.


* [Bug target/78904] zero-extracts are not effective
       [not found] <bug-78904-4@http.gcc.gnu.org/bugzilla/>
  2023-04-19 15:04 ` [Bug target/78904] zero-extracts are not effective cvs-commit at gcc dot gnu.org
  2023-06-23  8:22 ` roger at nextmovesoftware dot com
@ 2023-06-23 17:47 ` segher at gcc dot gnu.org
  2023-06-24 22:12 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 7+ messages in thread
From: segher at gcc dot gnu.org @ 2023-06-23 17:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78904

--- Comment #17 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to Roger Sayle from comment #16)
> Just to warn people in advance, the test case pr78904-1b.c is expected to
> start FAILing with the commit of
> https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622079.html and is
> scheduled to be resolved 24-48 hours later (over the weekend) by
> https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622078.html
> As explained in
> https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622083.html this is to
> investigate additional tweaks and whether alternate fixes are more
> appropriate.

Thanks for the warning, Roger!  Much appreciated.

That fix is for x86 only, though?  Is that really the only target affected?


* [Bug target/78904] zero-extracts are not effective
       [not found] <bug-78904-4@http.gcc.gnu.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2023-06-23 17:47 ` segher at gcc dot gnu.org
@ 2023-06-24 22:12 ` cvs-commit at gcc dot gnu.org
  2023-11-14 17:36 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-06-24 22:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78904

--- Comment #18 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:

https://gcc.gnu.org/g:8f6c747c8638d4c3c47ba2d4c8be86909e183132

commit r14-2065-g8f6c747c8638d4c3c47ba2d4c8be86909e183132
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Sat Jun 24 23:05:25 2023 +0100

    i386: Add alternate representation for {and,or,xor}b %ah,%dh.

    A patch that I'm working on to improve RTL simplifications in the
    middle-end results in the regression of pr78904-1b.c, due to changes in
    the canonical representation of high-byte (%ah, %bh, %ch, %dh) logic.
    See also PR target/78904.

    This patch avoids/prevents those failures by adding support for the
    alternate representation, duplicating the existing *<code>qi_ext<mode>_2
    as *<code>qi_ext<mode>_3 (the new version also replacing any_or with
    any_logic to provide *andqi_ext<mode>_3 in the same pattern).  Removing
    the original pattern isn't trivial, as it's generated by define_split,
    but this can be investigated after the other pieces are approved.

    The current representation of this instruction is:

    (set (zero_extract:DI (reg/v:DI 87 [ aD.2763 ])
            (const_int 8 [0x8])
            (const_int 8 [0x8]))
        (subreg:DI (xor:QI (subreg:QI (zero_extract:DI (reg:DI 94)
                        (const_int 8 [0x8])
                        (const_int 8 [0x8])) 0)
                (subreg:QI (zero_extract:DI (reg/v:DI 87 [ aD.2763 ])
                        (const_int 8 [0x8])
                        (const_int 8 [0x8])) 0)) 0))

    After my proposed middle-end improvement, we attempt to recognize:

    (set (zero_extract:DI (reg/v:DI 87 [ aD.2763 ])
            (const_int 8 [0x8])
            (const_int 8 [0x8]))
        (zero_extract:DI (xor:DI (reg:DI 94)
                (reg/v:DI 87 [ aD.2763 ]))
            (const_int 8 [0x8])
            (const_int 8 [0x8])))

    2023-06-24  Roger Sayle  <roger@nextmovesoftware.com>

    gcc/ChangeLog
            * config/i386/i386.md (*<code>qi_ext<mode>_3): New define_insn.
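
The two RTL forms quoted above describe the same value: XORing two extracted bytes gives the same result as extracting the byte from a full-width XOR, because bitwise operations act on each bit independently. A minimal sketch in C, assuming 64-bit values in place of the DImode registers 87 and 94 (the function names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Models (zero_extract:DI x (const_int 8) (const_int 8)): the 8-bit
   field that starts at bit 8.  */
static uint64_t extract_8_8 (uint64_t x)
{
  return (x >> 8) & 0xff;
}

/* Models a store to (zero_extract:DI dest 8 8): replace bits 8..15 of
   DEST and leave every other bit untouched.  */
static uint64_t insert_8_8 (uint64_t dest, uint64_t field)
{
  return (dest & ~(uint64_t) 0xff00) | ((field & 0xff) << 8);
}

/* Current representation: a QImode xor of two extracted bytes.  */
static uint64_t xor_of_extracts (uint64_t r87, uint64_t r94)
{
  uint8_t lhs = (uint8_t) extract_8_8 (r94);
  uint8_t rhs = (uint8_t) extract_8_8 (r87);
  return insert_8_8 (r87, (uint8_t) (lhs ^ rhs));
}

/* Alternate representation: extract the byte from a DImode xor.  */
static uint64_t extract_of_xor (uint64_t r87, uint64_t r94)
{
  return insert_8_8 (r87, extract_8_8 (r94 ^ r87));
}

/* Compare both representations on a handful of sample values.  */
static void check_xor_forms (void)
{
  static const uint64_t samples[] = {
    0, 0xff00, 0x1234, 0xdeadbeefcafef00dull, ~(uint64_t) 0
  };
  const unsigned n = sizeof samples / sizeof samples[0];
  for (unsigned i = 0; i < n; i++)
    for (unsigned j = 0; j < n; j++)
      assert (xor_of_extracts (samples[i], samples[j])
              == extract_of_xor (samples[i], samples[j]));
}
```

The same argument applies to and and or; the sketch only covers the arithmetic and does not model how the RTL is matched against insn patterns.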


* [Bug target/78904] zero-extracts are not effective
       [not found] <bug-78904-4@http.gcc.gnu.org/bugzilla/>
                   ` (3 preceding siblings ...)
  2023-06-24 22:12 ` cvs-commit at gcc dot gnu.org
@ 2023-11-14 17:36 ` cvs-commit at gcc dot gnu.org
  2023-11-15 21:22 ` cvs-commit at gcc dot gnu.org
  2023-11-16 18:13 ` cvs-commit at gcc dot gnu.org
  6 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-11-14 17:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78904

--- Comment #19 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:b42a09b258c3ed8d1368e0ef0948034dcf0f8ac9

commit r14-5456-gb42a09b258c3ed8d1368e0ef0948034dcf0f8ac9
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Tue Nov 14 18:34:43 2023 +0100

    i386: Generate strict_low_part QImode insn with high input register

    The following testcase:

    struct S1
    {
      unsigned char val;
      unsigned char pad1;
      unsigned short pad2;
    };

    struct S2
    {
      unsigned char pad1;
      unsigned char val;
      unsigned short pad2;
    };

    struct S1 test_and (struct S1 a, struct S2 b)
    {
      a.val &= b.val;

      return a;
    }

    compiles with -O2 to:

            movl    %esi, %edx
            movl    %edi, %eax
            movzbl  %dh, %esi
            andb    %sil, %al

    ANDB could use high register %dh instead of %sil:

            movl    %edi, %eax
            movl    %esi, %edx
            andb    %dh, %al

    The patch introduces strict_low_part QImode insn patterns with one of
    their input arguments extracted from a high register.

            PR target/78904

    gcc/ChangeLog:

            * config/i386/i386.md (*addqi_ext<mode>_1_slp):
            New define_insn_and_split pattern.
            (*subqi_ext<mode>_1_slp): Ditto.
            (*<any_logic:code>qi_ext<mode>_1_slp): Ditto.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr78904-7.c: New test.
            * gcc.target/i386/pr78904-7a.c: New test.
            * gcc.target/i386/pr78904-7b.c: New test.
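
As a rough illustration of what a strict_low_part operation with one high-register input computes, the sketch below models `andb %dh, %al` on 32-bit register images: only the low byte of the destination changes. It assumes the little-endian layout used in the testcase (S1.val in bits 0..7, S2.val in bits 8..15); the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Models andb %dh, %al: AND the low byte of A with bits 8..15 of B and
   write the result back to the low byte only (the strict_low_part
   behaviour); bits 8..31 of A are preserved.  */
static uint32_t and_low_with_high (uint32_t a, uint32_t b)
{
  uint8_t lo = (uint8_t) (a & 0xff);
  uint8_t hi = (uint8_t) ((b >> 8) & 0xff);
  return (a & ~(uint32_t) 0xff) | (uint32_t) (uint8_t) (lo & hi);
}

/* Cross-check against a single full-width mask expression.  */
static void check_and_low_with_high (void)
{
  static const uint32_t samples[] = {
    0, 0xff, 0xff00, 0x11223344u, 0xdeadbeefu, 0xffffffffu
  };
  const unsigned n = sizeof samples / sizeof samples[0];
  for (unsigned i = 0; i < n; i++)
    for (unsigned j = 0; j < n; j++)
      {
        uint32_t a = samples[i], b = samples[j];
        uint32_t mask = 0xffffff00u | ((b >> 8) & 0xffu);
        assert (and_low_with_high (a, b) == (a & mask));
      }
}
```

For example, and_low_with_high (0x11223344, 0x0000f0ff) yields 0x11223340: the low byte 0x44 is ANDed with 0xf0 and the upper three bytes are unchanged.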


* [Bug target/78904] zero-extracts are not effective
       [not found] <bug-78904-4@http.gcc.gnu.org/bugzilla/>
                   ` (4 preceding siblings ...)
  2023-11-14 17:36 ` cvs-commit at gcc dot gnu.org
@ 2023-11-15 21:22 ` cvs-commit at gcc dot gnu.org
  2023-11-16 18:13 ` cvs-commit at gcc dot gnu.org
  6 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-11-15 21:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78904

--- Comment #20 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:e8676f9ded71f5e04c4e9d81ec656809f6ba54e6

commit r14-5511-ge8676f9ded71f5e04c4e9d81ec656809f6ba54e6
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Wed Nov 15 22:21:10 2023 +0100

    i386: Optimize strict_low_part QImode insn with high input registers

    The following testcase:

    struct S1
    {
      unsigned char val;
      unsigned char pad1;
      unsigned short pad2;
    };

    struct S2
    {
      unsigned char pad1;
      unsigned char val;
      unsigned short pad2;
    };

    struct S1 test_add (struct S1 a, struct S2 b, struct S2 c)
    {
      a.val = b.val + c.val;

      return a;
    }

    compiles with -O2 to:

            movl    %edi, %eax
            movzbl  %dh, %edx
            movl    %esi, %ecx
            movb    %dl, %al
            addb    %ch, %al

    The insert to %al can go directly from %dh:

            movl    %edi, %eax
            movl    %esi, %ecx
            movb    %dh, %al
            addb    %ch, %al

    The patch introduces strict_low_part QImode insn patterns with both of
    their input arguments extracted from high registers.  This invalid
    insn is split after reload into a lowpart insert from the high register
    and a <insn>qi_ext<mode>_1_slp instruction.

            PR target/78904

    gcc/ChangeLog:

            * config/i386/i386.md (*movstrictqi_ext<mode>_1): New insn pattern.
            (*addqi_ext<mode>_2_slp): New define_insn_and_split pattern.
            (*subqi_ext<mode>_2_slp): Ditto.
            (*<any_logic:code>qi_ext<mode>_2_slp): Ditto.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr78904-8.c: New test.
            * gcc.target/i386/pr78904-8a.c: New test.
            * gcc.target/i386/pr78904-8b.c: New test.
            * gcc.target/i386/pr78904-9.c: New test.
            * gcc.target/i386/pr78904-9a.c: New test.
            * gcc.target/i386/pr78904-9b.c: New test.
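
The split described above can be pictured as two steps: first insert one high byte into the low byte of the destination (the `movb %dh, %al` step), then add the other high byte into that low byte (the `addb %ch, %al` step). The sketch below is a model under the same little-endian layout assumption as the testcase, with hypothetical helper names, and checks that the two-step form matches the single combined operation.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 8..15 of a 32-bit register image.  */
static uint8_t high_byte (uint32_t reg)
{
  return (uint8_t) ((reg >> 8) & 0xff);
}

/* strict_low_part store: replace bits 0..7, keep bits 8..31.  */
static uint32_t insert_low_byte (uint32_t dest, uint8_t value)
{
  return (dest & ~(uint32_t) 0xff) | value;
}

/* Combined operation: low byte of A becomes high(X) + high(Y).  */
static uint32_t add_two_high (uint32_t a, uint32_t x, uint32_t y)
{
  return insert_low_byte (a, (uint8_t) (high_byte (x) + high_byte (y)));
}

/* Split form: a lowpart insert followed by an add with one high input.  */
static uint32_t add_two_high_split (uint32_t a, uint32_t x, uint32_t y)
{
  uint32_t r = insert_low_byte (a, high_byte (x));   /* movb-like step */
  uint8_t sum = (uint8_t) ((r & 0xff) + high_byte (y)); /* addb-like step */
  return insert_low_byte (r, sum);
}

/* Compare both forms on sample values, including byte overflow.  */
static void check_add_split (void)
{
  static const uint32_t samples[] = {
    0, 0x0100, 0xf000, 0x2000, 0xff00, 0xaabbccddu, 0xffffffffu
  };
  const unsigned n = sizeof samples / sizeof samples[0];
  for (unsigned i = 0; i < n; i++)
    for (unsigned j = 0; j < n; j++)
      for (unsigned k = 0; k < n; k++)
        assert (add_two_high (samples[i], samples[j], samples[k])
                == add_two_high_split (samples[i], samples[j], samples[k]));
}
```

With a = 0xaabbccdd, x = 0x0000f000 and y = 0x00002000, both forms give 0xaabbcc10, since 0xf0 + 0x20 wraps to 0x10 in eight bits.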


* [Bug target/78904] zero-extracts are not effective
       [not found] <bug-78904-4@http.gcc.gnu.org/bugzilla/>
                   ` (5 preceding siblings ...)
  2023-11-15 21:22 ` cvs-commit at gcc dot gnu.org
@ 2023-11-16 18:13 ` cvs-commit at gcc dot gnu.org
  6 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-11-16 18:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78904

--- Comment #21 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:8ebc7e0b0ddf4679cf09ed6836fac30ca01d3ba0

commit r14-5539-g8ebc7e0b0ddf4679cf09ed6836fac30ca01d3ba0
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Thu Nov 16 18:07:36 2023 +0100

    i386: Optimize QImode insn with high input registers

    Sometimes the compiler emits the following code with <insn>qi_ext<mode>_0:

            shrl    $8, %eax
            addb    %bh, %al

    The patch introduces new lowpart QImode insn patterns with both of
    their input arguments extracted from high registers.  This invalid
    insn is split after reload into a move from the high register
    and a <insn>qi_ext<mode>_0 instruction.  The combine pass is able to
    convert the shift to a zero/sign-extract sub-RTX, which we split to the
    optimal:

            movzbl  %bh, %edx
            addb    %ah, %dl

            PR target/78904

    gcc/ChangeLog:

            * config/i386/i386.md (*addqi_ext2<mode>_0):
            New define_insn_and_split pattern.
            (*subqi_ext2<mode>_0): Ditto.
            (*<code>qi_ext2<mode>_0): Ditto.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr78904-10.c: New test.
            * gcc.target/i386/pr78904-10a.c: New test.
            * gcc.target/i386/pr78904-10b.c: New test.
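
A small sketch of why the shift-based and extract-based sequences above agree on the QImode result: after a right shift by 8, the low byte of the register is exactly the byte that a high-register extract would read. The model below uses 32-bit register images and hypothetical function names, and compares only the low byte, which is the value both sequences are computing.

```c
#include <assert.h>
#include <stdint.h>

/* Shape of the first sequence: shift the whole register right by 8,
   then add the high byte of B into the low byte.  */
static uint8_t add_after_shift (uint32_t a, uint32_t b)
{
  uint32_t shifted = a >> 8;                        /* shrl $8, %eax */
  return (uint8_t) (shifted + ((b >> 8) & 0xff));   /* addb %bh, %al */
}

/* Shape of the second sequence: zero-extend the high byte of B, then
   add the high byte of A into the low byte.  */
static uint8_t add_two_extracts (uint32_t a, uint32_t b)
{
  uint32_t t = (b >> 8) & 0xff;                     /* movzbl %bh, %edx */
  return (uint8_t) (t + ((a >> 8) & 0xff));         /* addb %ah, %dl */
}

/* Compare the QImode results on sample values.  */
static void check_shift_vs_extract (void)
{
  static const uint32_t samples[] = {
    0, 0x0200, 0x1234, 0x5678, 0xff00, 0xabcdff00u, 0xffffffffu
  };
  const unsigned n = sizeof samples / sizeof samples[0];
  for (unsigned i = 0; i < n; i++)
    for (unsigned j = 0; j < n; j++)
      assert (add_after_shift (samples[i], samples[j])
              == add_two_extracts (samples[i], samples[j]));
}
```

The upper bits of the destination register differ between the two sequences; this sketch deliberately ignores them.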

