[Bug rtl-optimization/102840] New: [12 Regression] gcc.target/i386/pr22076.c by r12-4475

public inbox for gcc-bugs@sourceware.org

* [Bug rtl-optimization/102840] New: [12 Regression] gcc.target/i386/pr22076.c by r12-4475
@ 2021-10-19 13:39 hjl.tools at gmail dot com
  2021-10-19 14:12 ` [Bug rtl-optimization/102840] " rguenth at gcc dot gnu.org
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: hjl.tools at gmail dot com @ 2021-10-19 13:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102840

            Bug ID: 102840
           Summary: [12 Regression] gcc.target/i386/pr22076.c by r12-4475
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hjl.tools at gmail dot com
                CC: roger at nextmovesoftware dot com
  Target Milestone: ---

On Linux/x86-64, r12-4475 caused:

$ make check-gcc RUNTESTFLAGS="--target_board='unix{-m32}' i386.exp=pr22076.c"
...
FAIL: gcc.target/i386/pr22076.c scan-assembler-not movl
FAIL: gcc.target/i386/pr22076.c scan-assembler-times movq 2

* [Bug rtl-optimization/102840] [12 Regression] gcc.target/i386/pr22076.c by r12-4475
  2021-10-19 13:39 [Bug rtl-optimization/102840] New: [12 Regression] gcc.target/i386/pr22076.c by r12-4475 hjl.tools at gmail dot com
@ 2021-10-19 14:12 ` rguenth at gcc dot gnu.org
  2021-10-19 14:48 ` roger at nextmovesoftware dot com
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-10-19 14:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102840

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.0
             Target|                            |x86_64-*-* i?86-*-*

* [Bug rtl-optimization/102840] [12 Regression] gcc.target/i386/pr22076.c by r12-4475
  2021-10-19 13:39 [Bug rtl-optimization/102840] New: [12 Regression] gcc.target/i386/pr22076.c by r12-4475 hjl.tools at gmail dot com
  2021-10-19 14:12 ` [Bug rtl-optimization/102840] " rguenth at gcc dot gnu.org
@ 2021-10-19 14:48 ` roger at nextmovesoftware dot com
  2021-10-19 15:36 ` hjl.tools at gmail dot com
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: roger at nextmovesoftware dot com @ 2021-10-19 14:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102840

Roger Sayle <roger at nextmovesoftware dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2021-10-19
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #1 from Roger Sayle <roger at nextmovesoftware dot com> ---
I believe this test case is poorly written, and not correctly testing the
original issue in PR target/22076 which concerned suboptimal moving of
arguments via memory (fixed by prohibiting reload using mmx registers).

Prior to my patch, with -m32 -O2 -fomit-frame-pointer -mmmx -mno-sse2, GCC
generated:

test:   movq    .LC1, %mm0
        paddb   .LC0, %mm0
        movq    %mm0, x
        ret

.x:     .zero 8
.LC0:   .byte   1
        .byte   2
        .byte   3
        .byte   4
        .byte   5
        .byte   6
        .byte   7
        .byte   8
.LC1:   .byte   11
        .byte   22
        .byte   33
        .byte   44
        .byte   55
        .byte   66
        .byte   77
        .byte   88

which indeed doesn't use movl, and requires two movq.

After my patch, we now generate the much more efficient (dare I say optimal):
test:   movl    $807671820, %eax
        movl    $1616136252, %edx
        movl    %eax, x
        movl    %edx, x+4
        ret

which has evaluated the _mm_add_pi8 at compile-time, and effectively memsets x
to the correct value in the minimum possible number of cycles.  In fact,
failing to evaluate this at compile-time has been a regression since v4.1
(according to godbolt).

[p.s. I predict other platforms might also notice changes in their testsuites,
as the middle-end now generates more efficient instruction sequences].
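
The two movl immediates in the -m32 listing above can be cross-checked against
the .LC0/.LC1 byte tables.  The following is only a minimal sketch in plain C,
not code from GCC or from the test case: paddb_model, le32 and
immediates_match are hypothetical helper names, and the model assumes the
little-endian byte order that x86 uses.

```c
#include <stdint.h>

/* Byte operands copied from .LC0 and .LC1 in the listing above. */
static const uint8_t lc0[8] = {1, 2, 3, 4, 5, 6, 7, 8};
static const uint8_t lc1[8] = {11, 22, 33, 44, 55, 66, 77, 88};

/* Lane-wise wrapping byte addition, which is what paddb computes
   (hypothetical helper, used for illustration only). */
static void paddb_model(const uint8_t *a, const uint8_t *b, uint8_t *out)
{
    for (int i = 0; i < 8; i++)
        out[i] = (uint8_t)(a[i] + b[i]);
}

/* Assemble four bytes into a 32-bit value in little-endian order. */
static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Returns 1 if the folded sum matches the immediates stored to x and x+4. */
static int immediates_match(void)
{
    uint8_t sum[8];
    paddb_model(lc0, lc1, sum);
    return le32(sum) == 807671820u && le32(sum + 4) == 1616136252u;
}
```

Calling immediates_match() from a small main should return 1 if the constants
in the listing are the folded byte sums.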

* [Bug rtl-optimization/102840] [12 Regression] gcc.target/i386/pr22076.c by r12-4475
  2021-10-19 13:39 [Bug rtl-optimization/102840] New: [12 Regression] gcc.target/i386/pr22076.c by r12-4475 hjl.tools at gmail dot com
  2021-10-19 14:12 ` [Bug rtl-optimization/102840] " rguenth at gcc dot gnu.org
  2021-10-19 14:48 ` roger at nextmovesoftware dot com
@ 2021-10-19 15:36 ` hjl.tools at gmail dot com
  2021-10-19 17:42 ` roger at nextmovesoftware dot com
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: hjl.tools at gmail dot com @ 2021-10-19 15:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102840

--- Comment #2 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Roger Sayle from comment #1)
> I believe this test case is poorly written, and not correctly testing the
> original issue in PR target/22076 which concerned suboptimal moving of
> arguments via memory (fixed by prohibiting reload using mmx registers).
> 
> Prior to my patch, with -m32 -O2 -fomit-frame-pointer -mmmx -mno-sse2, GCC
> generated:
> 
> test:   movq    .LC1, %mm0
>         paddb   .LC0, %mm0
>         movq    %mm0, x
>         ret
> 
> .x:     .zero 8
> .LC0:   .byte   1
>         .byte   2
>         .byte   3
>         .byte   4
>         .byte   5
>         .byte   6
>         .byte   7
>         .byte   8
> .LC1:   .byte   11
>         .byte   22
>         .byte   33
>         .byte   44
>         .byte   55
>         .byte   66
>         .byte   77
>         .byte   88
> 
> which indeed doesn't use movl, and requires two movq.
> 
> After my patch, we now generate the much more efficient (dare I say optimal):
> test:   movl    $807671820, %eax
>         movl    $1616136252, %edx
>         movl    %eax, x
>         movl    %edx, x+4
>         ret
> 
> which has evaluated the _mm_add_pi8 at compile-time, and effectively memsets
> x to the correct value in the minimum possible number of cycles.  In fact,
> failing to evaluate this at compile-time is a regression since v4.1
> (according to godbolt)

If your analysis is correct, why does -m64 stay the same?

* [Bug rtl-optimization/102840] [12 Regression] gcc.target/i386/pr22076.c by r12-4475
  2021-10-19 13:39 [Bug rtl-optimization/102840] New: [12 Regression] gcc.target/i386/pr22076.c by r12-4475 hjl.tools at gmail dot com
                   ` (2 preceding siblings ...)
  2021-10-19 15:36 ` hjl.tools at gmail dot com
@ 2021-10-19 17:42 ` roger at nextmovesoftware dot com
  2021-10-19 18:08 ` hjl.tools at gmail dot com
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: roger at nextmovesoftware dot com @ 2021-10-19 17:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102840

--- Comment #3 from Roger Sayle <roger at nextmovesoftware dot com> ---
With -m64, before:
test:   movq    .LC1(%rip), %mm0
        paddb   .LC0(%rip), %mm0
        movq    %mm0, x(%rip)
        ret

And after:
test:   movq    .LC2(%rip), %rax
        movq    %rax, x(%rip)
        ret

So we have two movq before, and two movq after, but clearly we've avoided the
computation at run-time.

It's difficult (for me) to judge whether -m32's use of immediate constants
is now better than -m64's load memory/store memory idiom in the "average case",
but in the worst case [a data cache miss], the former is clearly better
[requiring fewer memory transactions].

* [Bug rtl-optimization/102840] [12 Regression] gcc.target/i386/pr22076.c by r12-4475
  2021-10-19 13:39 [Bug rtl-optimization/102840] New: [12 Regression] gcc.target/i386/pr22076.c by r12-4475 hjl.tools at gmail dot com
                   ` (3 preceding siblings ...)
  2021-10-19 17:42 ` roger at nextmovesoftware dot com
@ 2021-10-19 18:08 ` hjl.tools at gmail dot com
  2021-10-21 18:58 ` cvs-commit at gcc dot gnu.org
  2021-10-21 18:59 ` ubizjak at gmail dot com
  6 siblings, 0 replies; 8+ messages in thread
From: hjl.tools at gmail dot com @ 2021-10-19 18:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102840

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ubizjak at gmail dot com

--- Comment #4 from H.J. Lu <hjl.tools at gmail dot com> ---
Avoiding MMX registers isn't a bad thing.  I think we should adjust the test
to check that MMX registers aren't used.

* [Bug rtl-optimization/102840] [12 Regression] gcc.target/i386/pr22076.c by r12-4475
  2021-10-19 13:39 [Bug rtl-optimization/102840] New: [12 Regression] gcc.target/i386/pr22076.c by r12-4475 hjl.tools at gmail dot com
                   ` (4 preceding siblings ...)
  2021-10-19 18:08 ` hjl.tools at gmail dot com
@ 2021-10-21 18:58 ` cvs-commit at gcc dot gnu.org
  2021-10-21 18:59 ` ubizjak at gmail dot com
  6 siblings, 0 replies; 8+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-10-21 18:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102840

--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:6aceeb3fb64b0e82fc3301026669062797ec01a5

commit r12-4618-g6aceeb3fb64b0e82fc3301026669062797ec01a5
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Thu Oct 21 20:57:38 2021 +0200

    testsuite: Adjust pr22076.c to avoid compile-time optimization [PR102840]

    2021-10-21  Uroš Bizjak  <ubizjak@gmail.com>

            PR testsuite/102840

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr22076.c: Adjust to avoid compile time
            optimization.
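
For readers wondering what "avoid compile time optimization" can look like in
a test of this kind, here is one possible approach.  It is a hypothetical
sketch only, NOT the contents of the commit above.  It uses the GCC/Clang
generic vector extension rather than <mmintrin.h>, and assumes an ordinary
build without whole-program optimization.  The names mm0, mm1, x and test are
illustrative.

```c
/* Hypothetical sketch, not the actual pr22076.c.  An 8-byte vector of chars
   declared with the GCC/Clang vector extension. */
typedef char v8qi __attribute__ ((vector_size (8)));

/* Non-const globals with external linkage: the compiler generally has to
   assume their values may change before test() runs, so the addition is
   not expected to be folded to a constant (assumption: no -fwhole-program
   or LTO). */
v8qi mm0 = {1, 2, 3, 4, 5, 6, 7, 8};
v8qi mm1 = {11, 22, 33, 44, 55, 66, 77, 88};
v8qi x;

void test (void)
{
  /* Lane-wise byte addition, evaluated at run time. */
  x = mm0 + mm1;
}
```

With operands that are not compile-time constants inside test, the generated
code has to load them and perform the addition, so a check on the instruction
sequence exercises code generation rather than constant folding.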

* [Bug rtl-optimization/102840] [12 Regression] gcc.target/i386/pr22076.c by r12-4475
  2021-10-19 13:39 [Bug rtl-optimization/102840] New: [12 Regression] gcc.target/i386/pr22076.c by r12-4475 hjl.tools at gmail dot com
                   ` (5 preceding siblings ...)
  2021-10-21 18:58 ` cvs-commit at gcc dot gnu.org
@ 2021-10-21 18:59 ` ubizjak at gmail dot com
  6 siblings, 0 replies; 8+ messages in thread
From: ubizjak at gmail dot com @ 2021-10-21 18:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102840

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #6 from Uroš Bizjak <ubizjak at gmail dot com> ---
Fixed.
