public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug rtl-optimization/113542] New: gcc.target/arm/bics_3.c regression after change for pr111267
@ 2024-01-22 15:46 roger at nextmovesoftware dot com
  2024-01-24 12:48 ` [Bug rtl-optimization/113542] " rearnsha at gcc dot gnu.org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: roger at nextmovesoftware dot com @ 2024-01-22 15:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113542

            Bug ID: 113542
           Summary: gcc.target/arm/bics_3.c regression after change for
                    pr111267
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: roger at nextmovesoftware dot com
  Target Milestone: ---

This patch is a placeholder for tracking the reported failures of
FAIL: gcc.target/arm/bics_3.c scan-assembler-times bics\tr[0-9]+, r[0-9]+,
r[0-9]+ 2
FAIL: gcc.target/arm/bics_3.c scan-assembler-times bics\tr[0-9]+, r[0-9]+,
r[0-9]+, .sl #2 1
See https://linaro.atlassian.net/browse/GNU-1117

Alas, I've been unable to reproduce the failure on cross compilers to either
arm-linux-gnueabihf nor armv8l-unknown-linux-gnueabihf, so I suspect that
there's some configuration option or compile-time flag I'm missing that's
required to trigger these failures (which I'm hoping is "missed optimization"
rather than "wrong code").

Hopefully, filing this PR provides a mechanism to allow folks to help me
investigate this issue.  My apologies for the temporary inconvenience.
Setting the component to "rtl-optimization" until this is confirmed to be a
target (ARM backend) issue.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug rtl-optimization/113542] gcc.target/arm/bics_3.c regression after change for pr111267
  2024-01-22 15:46 [Bug rtl-optimization/113542] New: gcc.target/arm/bics_3.c regression after change for pr111267 roger at nextmovesoftware dot com
@ 2024-01-24 12:48 ` rearnsha at gcc dot gnu.org
  2024-01-24 16:42 ` [Bug target/113542] [14 Regression] " rearnsha at gcc dot gnu.org
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2024-01-24 12:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113542

Richard Earnshaw <rearnsha at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2024-01-24

--- Comment #1 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
Options to reproduce

-O2 -mcpu=cortex-m3 -mthumb

The problem is really a back-end issue.  But the cause is that the fwprop pass
is now merging

propagating insn 9 into insn 10, replacing:
(set (reg:SI 124 [ _7 ])
    (ne:SI (reg:CC 100 cc)
        (const_int 0 [0])))

with the flag setting instruction to form

(parallel [
        (set (reg:SI 124 [ _7 ])
            (ne:SI (reg:SI 122 [ _2 ])
                (const_int 0 [0])))
        (clobber (reg:CC 100 cc))
    ])

That's OK, but it means that the combine pass is no-longer able to merge the
flag setter with an earlier result producer.

A similar thing starts to happen arm state this is dropped because the costs
are working out as the same (it has to reduce the cost).

So I think it's that the cost model for thumb2 needs tweaking.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/113542] [14 Regression] gcc.target/arm/bics_3.c regression after change for pr111267
  2024-01-22 15:46 [Bug rtl-optimization/113542] New: gcc.target/arm/bics_3.c regression after change for pr111267 roger at nextmovesoftware dot com
  2024-01-24 12:48 ` [Bug rtl-optimization/113542] " rearnsha at gcc dot gnu.org
@ 2024-01-24 16:42 ` rearnsha at gcc dot gnu.org
  2024-01-25  9:51 ` mkuvyrkov at gcc dot gnu.org
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2024-01-24 16:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113542

Richard Earnshaw <rearnsha at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization

--- Comment #2 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
The costing code is expecting

(parallel [
        (set (reg:SI 124 [ _7 ])
            (ne:SI (reg:SI 122 [ _2 ])
                (const_int 0 [0])))
        (clobber (reg:CC 100 cc))
    ])


To result in the assembler output

SUBS r124, R122, #1
SBC  r124, R122, r124

so really should have a cost of 8 (two insns).  But for some reason the thumb2
back-end is not generating that output in this case.  Overall, that means that
for bic_si_test

BIC r0, r0, r1
SUBS r1, r0, #1
SBC r0, r0, r1

is neither better nor worse than

BICS r0, r0, r1
IT ne
MOVNE r0, #1

and certainly better than

BICS r0, r0, r1
ITE ne
MOVNE r2, #1
MOVEQ r2, #0

at least when it comes to code size.

So the test is somewhat flaky, but there is a further problem with the compiler
not generating the expected sequence for NE(reg, 0) in Thumb2.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/113542] [14 Regression] gcc.target/arm/bics_3.c regression after change for pr111267
  2024-01-22 15:46 [Bug rtl-optimization/113542] New: gcc.target/arm/bics_3.c regression after change for pr111267 roger at nextmovesoftware dot com
  2024-01-24 12:48 ` [Bug rtl-optimization/113542] " rearnsha at gcc dot gnu.org
  2024-01-24 16:42 ` [Bug target/113542] [14 Regression] " rearnsha at gcc dot gnu.org
@ 2024-01-25  9:51 ` mkuvyrkov at gcc dot gnu.org
  2024-01-31 14:33 ` rguenth at gcc dot gnu.org
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: mkuvyrkov at gcc dot gnu.org @ 2024-01-25  9:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113542

Maxim Kuvyrkov <mkuvyrkov at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mkuvyrkov at gcc dot gnu.org

--- Comment #3 from Maxim Kuvyrkov <mkuvyrkov at gcc dot gnu.org> ---
Copy-pasting my comment from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111267#c15 :

I've looked into the reason for the above failures, and it seems to be not an
issue.

After the patch fwprop1 decides to do an additional propagation, which was
considered as "would increase complexity of pattern" before the patch.  This
results in change from "bics; mov" to "bic; subs".  If I understand ARM
assembler correctly, handling of sign was shifted from "bics" to "subs"
instruction.

This is the actual code: BEFORE:
        bics    r0, r0, r1      @ 9     [c=4 l=4] 
*andsi_notsi_si_compare0_scratch
        mov     r0, #1  @ 23    [c=4 l=4]  *thumb2_movsi_vfp/1
        it      eq
        moveq   r0, #0  @ 26    [c=8 l=4]  *p *thumb2_movsi_vfp/2
        bx      lr      @ 29    [c=8 l=4]  *thumb2_return

and AFTER:
        bic     r0, r0, r1      @ 8     [c=4 l=4]  andsi_notsi_si
        subs    r0, r0, #0      @ 22    [c=4 l=4]  cmpsi2_addneg/0
        it      ne
        movne   r0, #1  @ 23    [c=8 l=4]  *p *thumb2_movsi_vfp/2
        bx      lr      @ 26    [c=8 l=4]  *thumb2_return

If I don't hear anything to the contrary, I'll update the testcase to accept
both "bic" and "bics".

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/113542] [14 Regression] gcc.target/arm/bics_3.c regression after change for pr111267
  2024-01-22 15:46 [Bug rtl-optimization/113542] New: gcc.target/arm/bics_3.c regression after change for pr111267 roger at nextmovesoftware dot com
                   ` (2 preceding siblings ...)
  2024-01-25  9:51 ` mkuvyrkov at gcc dot gnu.org
@ 2024-01-31 14:33 ` rguenth at gcc dot gnu.org
  2024-02-21 12:47 ` mkuvyrkov at gcc dot gnu.org
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-01-31 14:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113542

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |14.0

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/113542] [14 Regression] gcc.target/arm/bics_3.c regression after change for pr111267
  2024-01-22 15:46 [Bug rtl-optimization/113542] New: gcc.target/arm/bics_3.c regression after change for pr111267 roger at nextmovesoftware dot com
                   ` (3 preceding siblings ...)
  2024-01-31 14:33 ` rguenth at gcc dot gnu.org
@ 2024-02-21 12:47 ` mkuvyrkov at gcc dot gnu.org
  2024-03-07 20:47 ` law at gcc dot gnu.org
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: mkuvyrkov at gcc dot gnu.org @ 2024-02-21 12:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113542

Maxim Kuvyrkov <mkuvyrkov at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rearnsha at gcc dot gnu.org
           Assignee|mkuvyrkov at gcc dot gnu.org       |unassigned at gcc dot gnu.org

--- Comment #4 from Maxim Kuvyrkov <mkuvyrkov at gcc dot gnu.org> ---
Reply from Richard Earnshaw on gcc-patches@ to my patch to make the testcase
accept both "bic" and "bics" instructions:

The test was added (r6-823-g0454e698401a3e) specifically to check that a BICS
instruction was being generated.  Whether or not that is right is somewhat
debatable, but this change seems to be papering over a different issue.

Either we should generate BICS, making this change incorrect, or we should
disable the test for thumb code on the basis that this isn't really a win.

But really, we should fix the compiler to do better here.  We really want
something like

        BICS  r0, r0, r1  // r0 is 0 or non-zero
        MOVNE r0, #1      // convert all non-zero to 1

in Arm state (ie using the BICS instruction to set the result to zero); and in
thumb2, perhaps something like:

        BICS  r0, r0, r1
        IT    ne
        MOVNE r0, #1

or maybe even better:

        BIC  r0, r0, r1
        SUBS r1, r0, #1
        SBC  r0, r0, r1

which is slightly better than BICS because SUBS breaks a condition-code chain
(all the flag bits are set).

There are similar quality issues for other NE(arith-op, 0) cases; we just don't
have tests for those.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/113542] [14 Regression] gcc.target/arm/bics_3.c regression after change for pr111267
  2024-01-22 15:46 [Bug rtl-optimization/113542] New: gcc.target/arm/bics_3.c regression after change for pr111267 roger at nextmovesoftware dot com
                   ` (4 preceding siblings ...)
  2024-02-21 12:47 ` mkuvyrkov at gcc dot gnu.org
@ 2024-03-07 20:47 ` law at gcc dot gnu.org
  2024-03-08 17:08 ` cvs-commit at gcc dot gnu.org
  2024-03-08 17:15 ` rearnsha at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: law at gcc dot gnu.org @ 2024-03-07 20:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113542

Jeffrey A. Law <law at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
                 CC|                            |law at gcc dot gnu.org

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/113542] [14 Regression] gcc.target/arm/bics_3.c regression after change for pr111267
  2024-01-22 15:46 [Bug rtl-optimization/113542] New: gcc.target/arm/bics_3.c regression after change for pr111267 roger at nextmovesoftware dot com
                   ` (5 preceding siblings ...)
  2024-03-07 20:47 ` law at gcc dot gnu.org
@ 2024-03-08 17:08 ` cvs-commit at gcc dot gnu.org
  2024-03-08 17:15 ` rearnsha at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-03-08 17:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113542

--- Comment #5 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Earnshaw <rearnsha@gcc.gnu.org>:

https://gcc.gnu.org/g:ac829a89fb56cfd914d5e29ed4695e499b0dbc95

commit r14-9399-gac829a89fb56cfd914d5e29ed4695e499b0dbc95
Author: Richard Earnshaw <rearnsha@arm.com>
Date:   Fri Mar 8 16:23:53 2024 +0000

    arm: testsuite: tweak bics_3.c [PR113542]

    This test was too simple, which meant that the compiler was sometimes
    able to find a better optimization of the code than using a BICS
    instruction.  Fix this by changing the test slightly to produce a
    sequence where BICS should always be the preferred solution.

    gcc/testsuite:
            PR target/113542
            * gcc.target/arm/bics_3.c: Adjust code to something which should
            always result in BICS.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/113542] [14 Regression] gcc.target/arm/bics_3.c regression after change for pr111267
  2024-01-22 15:46 [Bug rtl-optimization/113542] New: gcc.target/arm/bics_3.c regression after change for pr111267 roger at nextmovesoftware dot com
                   ` (6 preceding siblings ...)
  2024-03-08 17:08 ` cvs-commit at gcc dot gnu.org
@ 2024-03-08 17:15 ` rearnsha at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2024-03-08 17:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113542

Richard Earnshaw <rearnsha at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #6 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
Change the test slightly to avoid the insn matching issues.  This does leave
open the question of how best to optimize the slightly simpler sequences, where
we could do even better than we do now, but that's an enhancement and not
appropriate for gcc-14.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2024-03-08 17:15 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-22 15:46 [Bug rtl-optimization/113542] New: gcc.target/arm/bics_3.c regression after change for pr111267 roger at nextmovesoftware dot com
2024-01-24 12:48 ` [Bug rtl-optimization/113542] " rearnsha at gcc dot gnu.org
2024-01-24 16:42 ` [Bug target/113542] [14 Regression] " rearnsha at gcc dot gnu.org
2024-01-25  9:51 ` mkuvyrkov at gcc dot gnu.org
2024-01-31 14:33 ` rguenth at gcc dot gnu.org
2024-02-21 12:47 ` mkuvyrkov at gcc dot gnu.org
2024-03-07 20:47 ` law at gcc dot gnu.org
2024-03-08 17:08 ` cvs-commit at gcc dot gnu.org
2024-03-08 17:15 ` rearnsha at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).