public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/90706] [9/10 Regression] Useless code generated for stack / register operations on AVR
       [not found] <bug-90706-4@http.gcc.gnu.org/bugzilla/>
@ 2020-03-12 11:59 ` jakub at gcc dot gnu.org
  2020-03-13 23:25 ` bseifert at gmx dot at
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-03-12 11:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|9.3                         |9.4

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 9.3.0 has been released, adjusting target milestone.


* [Bug rtl-optimization/90706] [9/10 Regression] Useless code generated for stack / register operations on AVR
       [not found] <bug-90706-4@http.gcc.gnu.org/bugzilla/>
  2020-03-12 11:59 ` [Bug rtl-optimization/90706] [9/10 Regression] Useless code generated for stack / register operations on AVR jakub at gcc dot gnu.org
@ 2020-03-13 23:25 ` bseifert at gmx dot at
  2020-04-29 16:50 ` gjl at gcc dot gnu.org
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: bseifert at gmx dot at @ 2020-03-13 23:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706

--- Comment #8 from Berni <bseifert at gmx dot at> ---
I just compiled AVR gcc 9.3.0 and tested the code again. Still no improvement!


* [Bug rtl-optimization/90706] [9/10 Regression] Useless code generated for stack / register operations on AVR
       [not found] <bug-90706-4@http.gcc.gnu.org/bugzilla/>
  2020-03-12 11:59 ` [Bug rtl-optimization/90706] [9/10 Regression] Useless code generated for stack / register operations on AVR jakub at gcc dot gnu.org
  2020-03-13 23:25 ` bseifert at gmx dot at
@ 2020-04-29 16:50 ` gjl at gcc dot gnu.org
  2021-06-01  8:14 ` [Bug rtl-optimization/90706] [9/10/11/12 " rguenth at gcc dot gnu.org
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: gjl at gcc dot gnu.org @ 2020-04-29 16:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706

--- Comment #9 from Georg-Johann Lay <gjl at gcc dot gnu.org> ---
(In reply to Berni from comment #8)
> I just compiled AVR gcc 9.3.0 and tested the code again. Still no
> improvement!


Don't expect anything from v9 (or from v10 for that matter). The problem is in
the middle-end, and problems there that affect targets like avr will not be
fixed -- except in the rare case you manage to show that the issue affects a
target that is important enough and report it for that target.

And don't expect anything from v11+ either. The avr backend will likely be
removed from the compiler, see PR92729. The deprecation is for v11 and wasn't
even worth a mention in the v10 release notes caveats
https://gcc.gnu.org/gcc-10/changes.html

The general recommendation is to switch to clang / llvm, where the respective
backend is improving, has left experimental status, and is not suffering from
self-destruction...


* [Bug rtl-optimization/90706] [9/10/11/12 Regression] Useless code generated for stack / register operations on AVR
       [not found] <bug-90706-4@http.gcc.gnu.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2020-04-29 16:50 ` gjl at gcc dot gnu.org
@ 2021-06-01  8:14 ` rguenth at gcc dot gnu.org
  2022-05-27  9:40 ` [Bug rtl-optimization/90706] [10/11/12/13 " rguenth at gcc dot gnu.org
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-06-01  8:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|9.4                         |9.5

--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 9.4 is being released, retargeting bugs to GCC 9.5.


* [Bug rtl-optimization/90706] [10/11/12/13 Regression] Useless code generated for stack / register operations on AVR
       [not found] <bug-90706-4@http.gcc.gnu.org/bugzilla/>
                   ` (3 preceding siblings ...)
  2021-06-01  8:14 ` [Bug rtl-optimization/90706] [9/10/11/12 " rguenth at gcc dot gnu.org
@ 2022-05-27  9:40 ` rguenth at gcc dot gnu.org
  2022-06-28 10:37 ` jakub at gcc dot gnu.org
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-05-27  9:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|9.5                         |10.4

--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 9 branch is being closed


* [Bug rtl-optimization/90706] [10/11/12/13 Regression] Useless code generated for stack / register operations on AVR
       [not found] <bug-90706-4@http.gcc.gnu.org/bugzilla/>
                   ` (4 preceding siblings ...)
  2022-05-27  9:40 ` [Bug rtl-optimization/90706] [10/11/12/13 " rguenth at gcc dot gnu.org
@ 2022-06-28 10:37 ` jakub at gcc dot gnu.org
  2022-11-01 15:55 ` gjl at gcc dot gnu.org
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-06-28 10:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|10.4                        |10.5

--- Comment #12 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 10.4 is being released, retargeting bugs to GCC 10.5.


* [Bug rtl-optimization/90706] [10/11/12/13 Regression] Useless code generated for stack / register operations on AVR
       [not found] <bug-90706-4@http.gcc.gnu.org/bugzilla/>
                   ` (5 preceding siblings ...)
  2022-06-28 10:37 ` jakub at gcc dot gnu.org
@ 2022-11-01 15:55 ` gjl at gcc dot gnu.org
  2022-12-13 14:10 ` vmakarov at gcc dot gnu.org
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: gjl at gcc dot gnu.org @ 2022-11-01 15:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706

--- Comment #13 from Georg-Johann Lay <gjl at gcc dot gnu.org> ---
Created attachment 53812
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53812&action=edit
Test case with 32-bit integer.

This problem is still present in current master (future v13) and also occurs
with 32-bit integers.

> avr-gcc -S -Os mul.c -fdump-rtl-ira

With v8, mul.s has 15 instructions.

With newer versions, mul.s has 26 additional instructions: 
* 12 silly, useless stores into / loads from the frame.
* 12 instructions to set up the frame.
* More instructions due to sub-optimal register allocation.
* A 6-byte stack frame is used where v8 needs no frame at all.

In the IRA dump, there is:

Pass 0 for finding pseudo/allocno costs
    a0 (r53,l0) best NO_REGS, allocno NO_REGS
    a2 (r49,l0) best GENERAL_REGS, allocno GENERAL_REGS
    a1 (r48,l0) best NO_REGS, allocno NO_REGS
...
Pass 1 for finding pseudo/allocno costs
    r53: preferred NO_REGS, alternative NO_REGS, allocno NO_REGS
    r49: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
    r48: preferred NO_REGS, alternative NO_REGS, allocno NO_REGS
...
      Spill a0(r53,l0)
      Spill a1(r48,l0)
      Allocno a2r49 of GENERAL_REGS(30) ...

So there are 2 register spills for no reason that lead to that code bloat.
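
To check a dump for this symptom without reading it by hand, here is a
minimal C sketch that counts the "Spill" lines in the text of an IRA dump.
The function name count_ira_spills is made up for illustration, and the only
format it assumes is the "Spill aN(...)" line shape shown in the excerpt
above; other GCC versions may print spills differently.

```c
#include <stddef.h>
#include <string.h>

/* Count lines whose first non-blank text is "Spill a", the form the
   IRA dump excerpt above uses for spilled allocnos.  */
size_t count_ira_spills(const char *dump)
{
    size_t spills = 0;
    const char *line = dump;

    while (*line != '\0') {
        const char *p = line;

        /* Skip the indentation the dump puts in front of each entry.  */
        while (*p == ' ' || *p == '\t')
            p++;
        if (strncmp(p, "Spill a", 7) == 0)
            spills++;

        /* Advance to the next line, or stop at the end of the text.  */
        const char *nl = strchr(line, '\n');
        if (nl == NULL)
            break;
        line = nl + 1;
    }
    return spills;
}
```

Fed the two "Spill" lines and the "Allocno" line quoted above, it returns 2.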


* [Bug rtl-optimization/90706] [10/11/12/13 Regression] Useless code generated for stack / register operations on AVR
       [not found] <bug-90706-4@http.gcc.gnu.org/bugzilla/>
                   ` (6 preceding siblings ...)
  2022-11-01 15:55 ` gjl at gcc dot gnu.org
@ 2022-12-13 14:10 ` vmakarov at gcc dot gnu.org
  2022-12-15 19:20 ` cvs-commit at gcc dot gnu.org
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: vmakarov at gcc dot gnu.org @ 2022-12-13 14:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706

Vladimir Makarov <vmakarov at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at gcc dot gnu.org

--- Comment #14 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
What I see is that the input to RA has changed significantly since gcc-8 (see
insns marked by !).  A lot of subregs are generated now and there is no
promotion of (argument) hard regs (insns 44-47) because of
https://gcc.gnu.org/legacy-ml/gcc-patches/2018-10/msg01356.html.


    1: NOTE_INSN_DELETED                             1: NOTE_INSN_DELETED
    4: NOTE_INSN_BASIC_BLOCK 2                       4: NOTE_INSN_BASIC_BLOCK 2
    2: r44:SF=r22:SF                                44: r56:QI=r22:QI
      REG_DEAD r22:SF                                  REG_DEAD r22:QI
    3: NOTE_INSN_FUNCTION_BEG                       45: r57:QI=r23:QI
    6: r45:QI=0x1                                      REG_DEAD r23:QI
      REG_EQUAL 0x1                                 46: r58:QI=r24:QI
    7: r18:SF=0.0                                      REG_DEAD r24:QI
!   8: r22:SF=r44:SF                                47: r59:QI=r25:QI
      REG_DEAD r44:SF                                  REG_DEAD r25:QI
    9: r24:QI=call [`__gtsf2'] argc:0               48: r52:QI=r56:QI
      REG_DEAD r25:QI                                  REG_DEAD r56:QI
      REG_DEAD r23:QI                               49: r53:QI=r57:QI
      REG_DEAD r22:QI                                  REG_DEAD r57:QI
      REG_DEAD r18:SF                               50: r54:QI=r58:QI
      REG_CALL_DECL `__gtsf2'                          REG_DEAD r58:QI
      REG_EH_REGION 0xffffffff80000000              51: r55:QI=r59:QI
   10: NOTE_INSN_DELETED                               REG_DEAD r59:QI
   11: cc0=cmp(r24:QI,0)                             3: NOTE_INSN_FUNCTION_BEG
      REG_DEAD r24:QI                                6: r46:QI=0x1
   12: pc={(cc0>0)?L14:pc}                             REG_EQUAL 0x1
      REG_BR_PROB 633507684                          7: r18:SF=0.0
   22: NOTE_INSN_BASIC_BLOCK 3                    !  52: clobber r60:SI
   13: r45:QI=0                                   !  53: r60:SI#0=r52:QI
      REG_EQUAL 0                                      REG_DEAD r52:QI
   14: L14:                                       !  54: r60:SI#1=r53:QI
   23: NOTE_INSN_BASIC_BLOCK 4                         REG_DEAD r53:QI
   19: r24:QI=r45:QI                              !  55: r60:SI#2=r54:QI
      REG_DEAD r45:QI                                  REG_DEAD r54:QI
   20: use r24:QI                                 !  56: r60:SI#3=r55:QI
                                                       REG_DEAD r55:QI
                                                  !  57: r22:SF=r60:SI#0
                                                       REG_DEAD r60:SI
                                                      9: r24:QI=call [`__gtsf2'] argc:0
                                                       REG_DEAD r25:QI
                                                       REG_DEAD r23:QI
                                                       REG_DEAD r22:QI
                                                       REG_DEAD r18:SF
                                                       REG_CALL_DECL `__gtsf2'
                                                       REG_EH_REGION 0xffffffff80000000
                                                    34: r50:QI=r24:QI
                                                       REG_DEAD r24:QI
                                                    10: NOTE_INSN_DELETED
                                                    11: pc={(r50:QI>0)?L13:pc}
                                                       REG_DEAD r50:QI
                                                       REG_BR_PROB 633507684
                                                    21: NOTE_INSN_BASIC_BLOCK 3
                                                    12: r46:QI=0
                                                       REG_EQUAL 0
                                                    13: L13:
                                                    22: NOTE_INSN_BASIC_BLOCK 4
                                                    18: r24:QI=r46:QI
                                                       REG_DEAD r46:QI
                                                    19: use r24:QI
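
As a reading aid for insns 52-57 in the right-hand column, here is a small C
sketch of what they compute: four QImode bytes are written into the four
byte-sized subregs of an SImode pseudo, which is then read as an SFmode
value.  The function names are invented for illustration, and the sketch
assumes that subreg byte 0 is the least significant byte (AVR is
little-endian) and that the host float is IEEE 754 binary32.  It models the
semantics only, not anything the compiler executes.

```c
#include <stdint.h>
#include <string.h>

/* Assemble a 32-bit value from four bytes, byte 0 being the least
   significant one, mirroring the four r60:SI#N=...:QI stores.  */
uint32_t assemble_si(const uint8_t b[4])
{
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}

/* Reinterpret the 32 bits as a float, as insn 57 does when it reads
   r60:SI#0 in SFmode.  Assumes an IEEE 754 binary32 host float.  */
float si_as_sf(uint32_t bits)
{
    float f;

    memcpy(&f, &bits, sizeof f);
    return f;
}
```

The value is unchanged by the round trip through r60, which is why the
spill of r60 shows up as stores immediately followed by loads of the same
bytes.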

Currently, GCC generates the following AVR code:

check:
        push r28
        push r29
        rcall .
        rcall .
        push __tmp_reg__
        in r28,__SP_L__
        in r29,__SP_H__
/* prologue: function */
/* frame size = 5 */
/* stack size = 7 */
.L__stack_usage = 7
        ldi r18,lo8(1)
        std Y+5,r18
        ldi r18,0
        ldi r19,0
        ldi r20,0
        ldi r21,0
!       std Y+1,r22
!       std Y+2,r23
!       std Y+3,r24
!       std Y+4,r25
!       ldd r22,Y+1
!       ldd r23,Y+2
!       ldd r24,Y+3
!       ldd r25,Y+4
        rcall __gtsf2
        cp __zero_reg__,r24
        brlt .L2
        std Y+5,__zero_reg__
.L2:
        ldd r24,Y+5
/* epilogue start */
        pop __tmp_reg__
        pop __tmp_reg__
        pop __tmp_reg__
        pop __tmp_reg__
        pop __tmp_reg__
        pop r29
        pop r28
        ret

There are a lot of loads and stores.  That is because pseudo r60 (p60) was
assigned memory instead of a hard register:

a2(r60,l0) costs: ADDW_REGS:32000 SIMPLE_LD_REGS:32000 LD_REGS:32000 NO_LD_REGS:32000 GENERAL_REGS:32000 MEM:12000
r60: preferred NO_REGS, alternative NO_REGS, allocno NO_REGS
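
The cost line above can be read as a plain minimum search: every register
class is priced at 32000 and memory at 12000, so memory wins.  The C sketch
below only illustrates that comparison with the numbers from the dump.  The
struct and function names are invented, and IRA's real class selection is
more involved than a single minimum.

```c
#include <stddef.h>

struct class_cost {
    const char *name;
    int cost;
};

/* The costs printed for a2(r60,l0) in the dump above.  */
static const struct class_cost r60_costs[] = {
    { "ADDW_REGS",      32000 },
    { "SIMPLE_LD_REGS", 32000 },
    { "LD_REGS",        32000 },
    { "NO_LD_REGS",     32000 },
    { "GENERAL_REGS",   32000 },
    { "MEM",            12000 },
};

/* Return the entry with the lowest cost; ties keep the earlier entry.
   Returns NULL when the table is empty.  */
const struct class_cost *cheapest(const struct class_cost *c, size_t n)
{
    const struct class_cost *best = NULL;

    for (size_t i = 0; i < n; i++)
        if (best == NULL || c[i].cost < best->cost)
            best = &c[i];
    return best;
}
```

For the r60 table the cheapest entry is MEM, which matches the
"preferred NO_REGS" result in the dump.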

After some investigation I found that IRA calculates a wrong cost for moving
general hard regs of SFmode.

The following patch solves the problem:

diff --git a/gcc/ira.cc b/gcc/ira.cc
index d28a67b2546..cb4bfca739d 100644
--- a/gcc/ira.cc
+++ b/gcc/ira.cc
@@ -1627,14 +1627,22 @@ ira_init_register_move_cost (machine_mode mode)
                 *p2 != LIM_REG_CLASSES; p2++)
              if (ira_class_hard_regs_num[*p2] > 0
                  && (ira_reg_class_max_nregs[*p2][mode]
-                     <= ira_class_hard_regs_num[*p2]))
+                     <= ira_class_hard_regs_num[*p2])
+                 && hard_reg_set_intersect_p (ok_regs,
+                                              reg_class_contents[cl1])
+                 && hard_reg_set_intersect_p (ok_regs,
+                                              reg_class_contents[*p2]))
                cost = MAX (cost, ira_register_move_cost[mode][cl1][*p2]);

            for (p1 = &reg_class_subclasses[cl1][0];
                 *p1 != LIM_REG_CLASSES; p1++)
              if (ira_class_hard_regs_num[*p1] > 0
                  && (ira_reg_class_max_nregs[*p1][mode]
-                     <= ira_class_hard_regs_num[*p1]))
+                     <= ira_class_hard_regs_num[*p1])
+                 && hard_reg_set_intersect_p (ok_regs,
+                                              reg_class_contents[cl2])
+                 && hard_reg_set_intersect_p (ok_regs,
+                                              reg_class_contents[*p1]))
                cost = MAX (cost, ira_register_move_cost[mode][*p1][cl2]);

            ira_assert (cost <= 65535);
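
To make the effect of the added conditions easier to see, here is a toy
model in C.  It is not GCC code: the three register classes, the 8-register
machine and all cost numbers are invented, and ok_regs is assumed to be the
set of hard registers able to hold the mode in question.  The sketch only
shows how taking the maximum over sub-classes changes once classes without
any such register are skipped.

```c
#include <stdint.h>

typedef uint8_t regset;   /* one bit per hard register, 8 registers */

/* Hypothetical register classes for the toy model.  */
enum { CL_ALL, CL_LOW, CL_TOP, N_CLASSES };

static const regset class_contents[N_CLASSES] = {
    [CL_ALL] = 0xFF,   /* r0..r7 */
    [CL_LOW] = 0x0F,   /* r0..r3 */
    [CL_TOP] = 0x80,   /* r7 only */
};

/* Made-up move costs, indexed [from][to].  */
static const int move_cost[N_CLASSES][N_CLASSES] = {
    [CL_ALL] = { [CL_ALL] = 4,   [CL_LOW] = 4,   [CL_TOP] = 100 },
    [CL_LOW] = { [CL_ALL] = 4,   [CL_LOW] = 4,   [CL_TOP] = 100 },
    [CL_TOP] = { [CL_ALL] = 100, [CL_LOW] = 100, [CL_TOP] = 4   },
};

static int intersects(regset a, regset b)
{
    return (a & b) != 0;
}

/* Maximum cost of moving from class FROM into any of the N classes
   listed in TO, starting from 0.  With FILTER set, a pair only counts
   if both classes contain a register from OK_REGS, which is the shape
   of the conditions added by the patch.  */
int max_move_cost(int from, const int *to, int n, regset ok_regs, int filter)
{
    int cost = 0;

    for (int i = 0; i < n; i++) {
        if (filter
            && !(intersects(ok_regs, class_contents[from])
                 && intersects(ok_regs, class_contents[to[i]])))
            continue;
        if (move_cost[from][to[i]] > cost)
            cost = move_cost[from][to[i]];
    }
    return cost;
}
```

With ok_regs = 0x7F (r7 cannot hold the mode), the unfiltered maximum over
CL_LOW and CL_TOP is 100, while the filtered one is 4, because CL_TOP no
longer contributes its expensive entry.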


With this patch RA generates the following better code:

check:
        push r12
        push r13
        push r14
        push r15
        push r28
/* prologue: function */
/* frame size = 0 */
/* stack size = 5 */
.L__stack_usage = 5
        ldi r28,lo8(1)
        ldi r18,0
        ldi r19,0
        ldi r20,0
        ldi r21,0
!       mov r12,r22
!       mov r13,r23
!       mov r14,r24
!       mov r15,r25
!       mov r25,r15
!       mov r24,r14
!       mov r23,r13
!       mov r22,r12
        rcall __gtsf2
        cp __zero_reg__,r24
        brlt .L2
        ldi r28,0
.L2:
        mov r24,r28
/* epilogue start */
        pop r28
        pop r15
        pop r14
        pop r13
        pop r12
        ret

There are still a lot of moves in the generated code.  I'll think about how
to solve the problem.  I think coalescing could do this.
Unfortunately, IRA/LRA do not coalesce moves involving subregs.  Maybe
implementing coalescing at the end of LRA could be a solution.
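
To show why the second group of marked moves is redundant, here is a small C
sketch.  It is not the coalescing pass mentioned above and not anything from
IRA/LRA; it is a simpler value-tracking check, with invented names, that
flags a move whose destination already holds the source value.  The move
table is copied from the eight marked mov instructions in the listing.

```c
#include <assert.h>
#include <stddef.h>

#define N_REGS 32

struct move {
    int dst;
    int src;
};

/* The eight marked moves from the listing above, as (dst, src).  */
static const struct move listing_moves[] = {
    { 12, 22 }, { 13, 23 }, { 14, 24 }, { 15, 25 },
    { 25, 15 }, { 24, 14 }, { 23, 13 }, { 22, 12 },
};

/* Count moves that copy a value the destination already holds.
   Each register starts with its own distinct value; a move gives
   the destination the value of the source.  */
size_t count_redundant_moves(const struct move *m, size_t n)
{
    int value[N_REGS];
    size_t redundant = 0;

    for (int r = 0; r < N_REGS; r++)
        value[r] = r;

    for (size_t i = 0; i < n; i++) {
        assert(m[i].dst >= 0 && m[i].dst < N_REGS);
        assert(m[i].src >= 0 && m[i].src < N_REGS);
        if (value[m[i].dst] == value[m[i].src])
            redundant++;                    /* nothing changes */
        else
            value[m[i].dst] = value[m[i].src];
    }
    return redundant;
}
```

For the table above the function reports 4 redundant moves, namely the
copies back into r22..r25.  Once those are gone, the copies into r12..r15
are only needed if r12..r15 are read later, which this sketch does not
analyse.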

In any case, the full PR solution will take some time.  First, I am
going to submit the patch above after thoroughly testing it on a few major
targets.  Then I'll work on removing the redundant moves.  I'll
periodically publish updates on the PR progress.


* [Bug rtl-optimization/90706] [10/11/12/13 Regression] Useless code generated for stack / register operations on AVR
       [not found] <bug-90706-4@http.gcc.gnu.org/bugzilla/>
                   ` (7 preceding siblings ...)
  2022-12-13 14:10 ` vmakarov at gcc dot gnu.org
@ 2022-12-15 19:20 ` cvs-commit at gcc dot gnu.org
  2022-12-16 13:59 ` gjl at gcc dot gnu.org
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-12-15 19:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706

--- Comment #15 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Vladimir Makarov <vmakarov@gcc.gnu.org>:

https://gcc.gnu.org/g:12abd5a7d13209f79664ea603b3f3517f71b8c4f

commit r13-4727-g12abd5a7d13209f79664ea603b3f3517f71b8c4f
Author: Vladimir N. Makarov <vmakarov@redhat.com>
Date:   Thu Dec 15 14:11:05 2022 -0500

    IRA: Check that reg classes contain a hard reg of given mode in reg move cost calculation

    IRA calculates wrong AVR costs for moving general hard regs of SFmode.  To
    calculate the costs we did not exclude sub-classes which do not contain
    hard regs of given mode.  This was the reason for spilling a pseudo in the
    PR. The patch fixes this.

            PR rtl-optimization/90706

    gcc/ChangeLog:

            * ira-costs.cc: Include print-rtl.h.
            (record_reg_classes, scan_one_insn): Add code to print debug info.
            * ira.cc (ira_init_register_move_cost): Check that at least one
            hard reg of the mode are in the class contents to calculate the
            register move costs.

    gcc/testsuite/ChangeLog:

            * gcc.target/avr/pr90706.c: New.


* [Bug rtl-optimization/90706] [10/11/12/13 Regression] Useless code generated for stack / register operations on AVR
       [not found] <bug-90706-4@http.gcc.gnu.org/bugzilla/>
                   ` (8 preceding siblings ...)
  2022-12-15 19:20 ` cvs-commit at gcc dot gnu.org
@ 2022-12-16 13:59 ` gjl at gcc dot gnu.org
  2022-12-16 18:33 ` vmakarov at gcc dot gnu.org
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: gjl at gcc dot gnu.org @ 2022-12-16 13:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706

Georg-Johann Lay <gjl at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |gjl at gcc dot gnu.org

--- Comment #16 from Georg-Johann Lay <gjl at gcc dot gnu.org> ---
Created attachment 54113
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54113&action=edit
More elaborate C test case.

This is a more complicated test case, compile with

> avr-gcc -c pi-i.c -mmcu=atmega8 -Os -mcall-prologues -fno-tree-loop-optimize -fno-move-loop-invariants && avr-size pi-i.o

Code sizes are:

664 with avr-gcc v8.5
992 with avr-gcc v11.3
834 with avr-gcc master with the change from comment #14

So there is a clear improvement with the patch from comment #14, but size is
still +25% compared to v8. What also has an effect is -fno-split-wide-types.
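
For reference, the percentages follow directly from the three sizes listed
above.  A minimal C helper (the name percent_over is made up) that computes
the relative change against a baseline:

```c
/* Relative size change of SIZE versus BASELINE, in percent.
   BASELINE must be non-zero.  */
double percent_over(unsigned size, unsigned baseline)
{
    return 100.0 * ((double)size - (double)baseline) / (double)baseline;
}
```

With the v8.5 size of 664 as baseline, 834 comes out at roughly +25.6% and
992 at roughly +49.4%.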

The test case mostly operates on float; unfortunately I don't have a similar
test-case for 32-bit integers at hand.


* [Bug rtl-optimization/90706] [10/11/12/13 Regression] Useless code generated for stack / register operations on AVR
       [not found] <bug-90706-4@http.gcc.gnu.org/bugzilla/>
                   ` (9 preceding siblings ...)
  2022-12-16 13:59 ` gjl at gcc dot gnu.org
@ 2022-12-16 18:33 ` vmakarov at gcc dot gnu.org
  2023-03-02 22:22 ` cvs-commit at gcc dot gnu.org
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: vmakarov at gcc dot gnu.org @ 2022-12-16 18:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706

--- Comment #17 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
I've reverted my patch as it resulted in two new PRs.  I'll do more work on
this PR and I'll start this job in Jan.


* [Bug rtl-optimization/90706] [10/11/12/13 Regression] Useless code generated for stack / register operations on AVR
       [not found] <bug-90706-4@http.gcc.gnu.org/bugzilla/>
                   ` (10 preceding siblings ...)
  2022-12-16 18:33 ` vmakarov at gcc dot gnu.org
@ 2023-03-02 22:22 ` cvs-commit at gcc dot gnu.org
  2023-03-04 13:45 ` gjl at gcc dot gnu.org
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-03-02 22:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706

--- Comment #18 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Vladimir Makarov <vmakarov@gcc.gnu.org>:

https://gcc.gnu.org/g:2639f9d2313664e6b4ed2f8131fefa60aeeb0518

commit r13-6424-g2639f9d2313664e6b4ed2f8131fefa60aeeb0518
Author: Vladimir N. Makarov <vmakarov@redhat.com>
Date:   Thu Mar 2 16:29:05 2023 -0500

    IRA: Use minimal cost for hard register movement

    This is the 2nd attempt to fix PR90706.  IRA calculates wrong AVR
    costs for moving general hard regs of SFmode.  This was the reason for
    spilling a pseudo in the PR.  In this patch we use smaller move cost
    of hard reg in its natural and operand modes.

            PR rtl-optimization/90706

    gcc/ChangeLog:

            * ira-costs.cc: Include print-rtl.h.
            (record_reg_classes, scan_one_insn): Add code to print debug info.
            (record_operand_costs): Find and use smaller cost for hard reg
            move.

    gcc/testsuite/ChangeLog:

            * gcc.target/avr/pr90706.c: New.


* [Bug rtl-optimization/90706] [10/11/12/13 Regression] Useless code generated for stack / register operations on AVR
       [not found] <bug-90706-4@http.gcc.gnu.org/bugzilla/>
                   ` (11 preceding siblings ...)
  2023-03-02 22:22 ` cvs-commit at gcc dot gnu.org
@ 2023-03-04 13:45 ` gjl at gcc dot gnu.org
  2023-03-31 12:42 ` cvs-commit at gcc dot gnu.org
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: gjl at gcc dot gnu.org @ 2023-03-04 13:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706

Georg-Johann Lay <gjl at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |8.5.0

--- Comment #19 from Georg-Johann Lay <gjl at gcc dot gnu.org> ---
(In reply to CVS Commits from comment #18)
> https://gcc.gnu.org/g:2639f9d2313664e6b4ed2f8131fefa60aeeb0518
> 
> commit r13-6424-g2639f9d2313664e6b4ed2f8131fefa60aeeb0518
> Author: Vladimir N. Makarov <vmakarov@redhat.com>
> Date:   Thu Mar 2 16:29:05 2023 -0500
> 
>     IRA: Use minimal cost for hard register movement

Thank you; the code looks clean now. (For my test case from comment #16 I
needed -fno-split-wide-types, which is a different story.)

Is there any chance your fix will be back-ported?


* [Bug rtl-optimization/90706] [10/11/12/13 Regression] Useless code generated for stack / register operations on AVR
       [not found] <bug-90706-4@http.gcc.gnu.org/bugzilla/>
                   ` (12 preceding siblings ...)
  2023-03-04 13:45 ` gjl at gcc dot gnu.org
@ 2023-03-31 12:42 ` cvs-commit at gcc dot gnu.org
  2023-03-31 12:45 ` vmakarov at gcc dot gnu.org
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-03-31 12:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706

--- Comment #20 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-12 branch has been updated by Vladimir Makarov
<vmakarov@gcc.gnu.org>:

https://gcc.gnu.org/g:88792f04e5c63025506244b9ac7186a3cc10c25a

commit r12-9372-g88792f04e5c63025506244b9ac7186a3cc10c25a
Author: Vladimir N. Makarov <vmakarov@redhat.com>
Date:   Thu Mar 2 16:29:05 2023 -0500

    IRA: Use minimal cost for hard register movement

    This is the 2nd attempt to fix PR90706.  IRA calculates wrong AVR
    costs for moving general hard regs of SFmode.  This was the reason for
    spilling a pseudo in the PR.  In this patch we use smaller move cost
    of hard reg in its natural and operand modes.

            PR rtl-optimization/90706

    gcc/ChangeLog:

            * ira-costs.cc: Include print-rtl.h.
            (record_reg_classes, scan_one_insn): Add code to print debug info.
            (record_operand_costs): Find and use smaller cost for hard reg
            move.

    gcc/testsuite/ChangeLog:

            * gcc.target/avr/pr90706.c: New.


* [Bug rtl-optimization/90706] [10/11/12/13 Regression] Useless code generated for stack / register operations on AVR
       [not found] <bug-90706-4@http.gcc.gnu.org/bugzilla/>
                   ` (13 preceding siblings ...)
  2023-03-31 12:42 ` cvs-commit at gcc dot gnu.org
@ 2023-03-31 12:45 ` vmakarov at gcc dot gnu.org
  2023-03-31 14:38 ` gjl at gcc dot gnu.org
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: vmakarov at gcc dot gnu.org @ 2023-03-31 12:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706

--- Comment #21 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to CVS Commits from comment #20)
> The releases/gcc-12 branch has been updated by Vladimir Makarov
> <vmakarov@gcc.gnu.org>:
> 
> https://gcc.gnu.org/g:88792f04e5c63025506244b9ac7186a3cc10c25a
> 
> 

The trunk with the patch behaved well for a few weeks, so I backported it to
the gcc-12 branch.  The gcc-12 branch with the patch was successfully
bootstrapped and tested on x86-64.


* [Bug rtl-optimization/90706] [10/11/12/13 Regression] Useless code generated for stack / register operations on AVR
       [not found] <bug-90706-4@http.gcc.gnu.org/bugzilla/>
                   ` (14 preceding siblings ...)
  2023-03-31 12:45 ` vmakarov at gcc dot gnu.org
@ 2023-03-31 14:38 ` gjl at gcc dot gnu.org
  2023-05-21 15:25 ` gjl at gcc dot gnu.org
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 19+ messages in thread
From: gjl at gcc dot gnu.org @ 2023-03-31 14:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706

Georg-Johann Lay <gjl at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|                            |10.0, 11.0, 12.0, 9.0
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
      Known to work|                            |13.0

--- Comment #22 from Georg-Johann Lay <gjl at gcc dot gnu.org> ---
Fixed in 12.3+


* [Bug rtl-optimization/90706] [10/11/12/13 Regression] Useless code generated for stack / register operations on AVR
       [not found] <bug-90706-4@http.gcc.gnu.org/bugzilla/>
                   ` (15 preceding siblings ...)
  2023-03-31 14:38 ` gjl at gcc dot gnu.org
@ 2023-05-21 15:25 ` gjl at gcc dot gnu.org
  2024-03-05 19:28 ` gjl at gcc dot gnu.org
  2024-05-18  8:15 ` gjl at gcc dot gnu.org
  18 siblings, 0 replies; 19+ messages in thread
From: gjl at gcc dot gnu.org @ 2023-05-21 15:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706

--- Comment #23 from Georg-Johann Lay <gjl at gcc dot gnu.org> ---
Created attachment 55130
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55130&action=edit
Test case for -Os -mmcu=attiny40

As it appears, this bug is not fixed completely.  For the -mmcu=avrtiny
architecture, there is still bloat for even the smallest test cases like:

$ avr-gcc bloat.c -mmcu=attiny40 -Os -S

char func3 (char c)
{
    return 1 + c;
}

"GCC: (GNU) 14.0.0 20230520 (experimental)" compiles this to:

func3:
        push r28                 ;  22  [c=4 l=1]  pushqi1/0
        push r29                 ;  23  [c=4 l=1]  pushqi1/0
        push __tmp_reg__         ;  27  [c=4 l=1]  *addhi3_sp
        in r28,__SP_L__  ;  38  [c=4 l=2]  *movhi/7
        in r29,__SP_H__
/* prologue: function */
/* frame size = 1 */
/* stack size = 3 */
        mov r20,r24      ;  18  [c=4 l=1]  movqi_insn/0
        subi r20,lo8(-(1))       ;  19  [c=4 l=1]  *addqi3/1
        mov r24,r20      ;  21  [c=4 l=1]  movqi_insn/0
/* epilogue start */
        pop __tmp_reg__  ;  33  [c=4 l=1]  *addhi3_sp
        pop r29          ;  34  [c=4 l=1]  popqi
        pop r28          ;  35  [c=4 l=1]  popqi
        ret              ;  36  [c=0 l=1]  return_from_epilogue

For reference, avr-gcc v8 generates for this function:

func3:
/* prologue: function */
/* frame size = 0 */
/* stack size = 0 */
.L__stack_usage = 0
        subi r24,lo8(-(1))       ;  6   [c=4 l=1]  addqi3/1
/* epilogue start */
        ret              ;  17  [c=0 l=1]  return


* [Bug rtl-optimization/90706] [10/11/12/13 Regression] Useless code generated for stack / register operations on AVR
       [not found] <bug-90706-4@http.gcc.gnu.org/bugzilla/>
                   ` (16 preceding siblings ...)
  2023-05-21 15:25 ` gjl at gcc dot gnu.org
@ 2024-03-05 19:28 ` gjl at gcc dot gnu.org
  2024-05-18  8:15 ` gjl at gcc dot gnu.org
  18 siblings, 0 replies; 19+ messages in thread
From: gjl at gcc dot gnu.org @ 2024-03-05 19:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706

--- Comment #24 from Georg-Johann Lay <gjl at gcc dot gnu.org> ---
(In reply to Georg-Johann Lay from comment #23)
> As it appears, this bug is not fixed completely.  For the -mmcu=avrtiny
> architecture, there is still bloat for even the smallest test cases like:

That is a different story; followed up in PR113927.


* [Bug rtl-optimization/90706] [10/11/12/13 Regression] Useless code generated for stack / register operations on AVR
       [not found] <bug-90706-4@http.gcc.gnu.org/bugzilla/>
                   ` (17 preceding siblings ...)
  2024-03-05 19:28 ` gjl at gcc dot gnu.org
@ 2024-05-18  8:15 ` gjl at gcc dot gnu.org
  18 siblings, 0 replies; 19+ messages in thread
From: gjl at gcc dot gnu.org @ 2024-05-18  8:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706

Georg-Johann Lay <gjl at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|10.5                        |12.3
