public inbox for gcc-bugs@sourceware.org
* [Bug c/101981] New: GCC10 produces bigger asm for simple switch than GCC7 - cortexM4
@ 2021-08-19 14:55 dumoulin.thibaut at gmail dot com
  2021-08-20  8:30 ` [Bug target/101981] " rguenth at gcc dot gnu.org
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: dumoulin.thibaut at gmail dot com @ 2021-08-19 14:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101981

            Bug ID: 101981
           Summary: GCC10 produces bigger asm for simple switch than GCC7
                    - cortexM4
           Product: gcc
           Version: 10.3.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: dumoulin.thibaut at gmail dot com
  Target Milestone: ---

For cortex-m4 -Os, GCC10 produces bigger assembly code than GCC7 for very
simple switch statements.

Here is the C code example to trigger the regression:

*file: switch.c*
```C
int switchFunction(int foo) {
  switch (foo) {
  case 0:
    return 0;
  case 1:
    return 1;
  default:
    return -1;
  }
}

int main() {
  return 0;
}
```
To reproduce, I downloaded the toolchain from here:
https://developer.arm.com/tools-and-software/open-source-software/developer-tools/gnu-toolchain/gnu-rm/downloads

Compile command: `arm-none-eabi-gcc -Os -mcpu=cortex-m4 ./switch.c`

For GCC7 (arm-none-eabi-gcc (GNU Tools for Arm Embedded Processors
7-2018-q2-update) 7.3.1 20180622 (release) [ARM/embedded-7-branch revision
261907]) it produces this assembly code:
```asm
000080f8 <switchFunction>:
    80f8:       2801            cmp     r0, #1
    80fa:       bf88            it      hi
    80fc:       f04f 30ff       movhi.w r0, #4294967295 ; 0xffffffff
    8100:       4770            bx      lr
```

GCC10 (arm-none-eabi-gcc (GNU Arm Embedded Toolchain 10.3-2021.07) 10.3.1
20210621 (release)) produces one more instruction (25% more instructions,
12 bytes instead of 10):
```asm
00008100 <switchFunction>:
    8100:       b118            cbz     r0, 810a <switchFunction+0xa>
    8102:       2801            cmp     r0, #1
    8104:       bf18            it      ne
    8106:       f04f 30ff       movne.w r0, #4294967295 ; 0xffffffff
    810a:       4770            bx      lr
```

Note: GCC10 produces the same code as if it was a simple `if`:
```C
int ifFunction(int foo) {
  if (foo == 0) {
    return 0;
  } else if (foo == 1) {
    return 1;
  } else {
    return -1;
  }
}
```

Shouldn't GCC10 produce the same code as GCC7?


* [Bug target/101981] GCC10 produces bigger asm for simple switch than GCC7 - cortexM4
  2021-08-19 14:55 [Bug c/101981] New: GCC10 produces bigger asm for simple switch than GCC7 - cortexM4 dumoulin.thibaut at gmail dot com
@ 2021-08-20  8:30 ` rguenth at gcc dot gnu.org
  2021-08-20 12:12 ` dumoulin.thibaut at gmail dot com
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-08-20  8:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101981

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|c                           |target
             Target|cortex-m4                   |arm
              Build|-Os                         |

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
It's likely caused by switch lowering, where costing for such special cases is
going to be interesting, to say the least.


* [Bug target/101981] GCC10 produces bigger asm for simple switch than GCC7 - cortexM4
  2021-08-19 14:55 [Bug c/101981] New: GCC10 produces bigger asm for simple switch than GCC7 - cortexM4 dumoulin.thibaut at gmail dot com
  2021-08-20  8:30 ` [Bug target/101981] " rguenth at gcc dot gnu.org
@ 2021-08-20 12:12 ` dumoulin.thibaut at gmail dot com
  2021-08-20 13:13 ` marxin at gcc dot gnu.org
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: dumoulin.thibaut at gmail dot com @ 2021-08-20 12:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101981

--- Comment #2 from Thibaut M. <dumoulin.thibaut at gmail dot com> ---
I'm not sure I understand your statement.

It looks like the switch lowering is wrong here, because the code generated by
the new GCC now takes more time than the code from the previous one. It looks
like a regression.


* [Bug target/101981] GCC10 produces bigger asm for simple switch than GCC7 - cortexM4
  2021-08-19 14:55 [Bug c/101981] New: GCC10 produces bigger asm for simple switch than GCC7 - cortexM4 dumoulin.thibaut at gmail dot com
  2021-08-20  8:30 ` [Bug target/101981] " rguenth at gcc dot gnu.org
  2021-08-20 12:12 ` dumoulin.thibaut at gmail dot com
@ 2021-08-20 13:13 ` marxin at gcc dot gnu.org
  2021-08-20 14:20 ` [Bug target/101981] GCC10 produces bigger asm for simple switch than GCC7 - cortexM4 since r8-2701-g9dc3d6a96167b4c8 marxin at gcc dot gnu.org
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: marxin at gcc dot gnu.org @ 2021-08-20 13:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101981

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2021-08-20
     Ever confirmed|0                           |1
           Assignee|unassigned at gcc dot gnu.org      |marxin at gcc dot gnu.org
             Status|UNCONFIRMED                 |ASSIGNED

--- Comment #3 from Martin Liška <marxin at gcc dot gnu.org> ---
I'll take a look.


* [Bug target/101981] GCC10 produces bigger asm for simple switch than GCC7 - cortexM4 since r8-2701-g9dc3d6a96167b4c8
  2021-08-19 14:55 [Bug c/101981] New: GCC10 produces bigger asm for simple switch than GCC7 - cortexM4 dumoulin.thibaut at gmail dot com
                   ` (2 preceding siblings ...)
  2021-08-20 13:13 ` marxin at gcc dot gnu.org
@ 2021-08-20 14:20 ` marxin at gcc dot gnu.org
  2021-08-20 15:16 ` dumoulin.thibaut at gmail dot com
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: marxin at gcc dot gnu.org @ 2021-08-20 14:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101981

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|arm                         |arm, x86_64
             Status|ASSIGNED                    |NEW
            Summary|GCC10 produces bigger asm   |GCC10 produces bigger asm
                   |for simple switch than GCC7 |for simple switch than GCC7
                   |- cortexM4                  |- cortexM4 since
                   |                            |r8-2701-g9dc3d6a96167b4c8

--- Comment #4 from Martin Liška <marxin at gcc dot gnu.org> ---
I can confirm the minor regression that started with r8-2701-g9dc3d6a96167b4c8
(aka switch lowering made on the TREE level).

One can also see it on x86_64-linux:

gcc-7
switchFunction:
.LFB0:
        .cfi_startproc
        movl    %edi, %eax
        cmpl    $2, %edi
        movl    $-1, %edx
        cmovnb  %edx, %eax
        ret
        .cfi_endproc

gcc-12
switchFunction:
.LFB0:
        .cfi_startproc
        movl    %edi, %eax
        testl   %edi, %edi
        je      .L2
        cmpl    $1, %edi
        movl    $-1, %edx
        cmovne  %edx, %eax

Note the test case is quite special, as the shorter version effectively does:

int switchFunction(int foo) {
  switch (foo) {
  case 0:
    return foo;
  case 1:
    return foo;
  default:
    return -1;
  }
}

That's the trick that's used.
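
A minimal sketch of that trick in C, for illustration only: because cases 0 and 1
both return the value of `foo`, the whole switch can collapse into a single
unsigned range check, which is roughly what the GCC 7 sequence (`cmp r0, #1`
followed by a conditional move of -1) computes. The function names
`switch_version`, `range_check_version` and `check_versions_agree` are
hypothetical and do not come from GCC or from the report.

```C
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* The switch from the original report. */
int switch_version(int foo) {
  switch (foo) {
  case 0:
    return 0;
  case 1:
    return 1;
  default:
    return -1;
  }
}

/* Hand-written equivalent (hypothetical name): one unsigned comparison
   covers both cases, because negative inputs convert to large unsigned
   values and therefore fail the "<= 1" test. */
int range_check_version(int foo) {
  return ((unsigned int)foo <= 1u) ? foo : -1;
}

/* Asserts that both versions agree on a few representative inputs. */
void check_versions_agree(void) {
  const int samples[] = {INT_MIN, -2, -1, 0, 1, 2, 3, 1000, INT_MAX};
  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
    assert(switch_version(samples[i]) == range_check_version(samples[i]));
  }
}
```

Calling `check_versions_agree()` from a test harness exercises both forms; it
says nothing about which form a given GCC version will actually emit.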


* [Bug target/101981] GCC10 produces bigger asm for simple switch than GCC7 - cortexM4 since r8-2701-g9dc3d6a96167b4c8
  2021-08-19 14:55 [Bug c/101981] New: GCC10 produces bigger asm for simple switch than GCC7 - cortexM4 dumoulin.thibaut at gmail dot com
                   ` (3 preceding siblings ...)
  2021-08-20 14:20 ` [Bug target/101981] GCC10 produces bigger asm for simple switch than GCC7 - cortexM4 since r8-2701-g9dc3d6a96167b4c8 marxin at gcc dot gnu.org
@ 2021-08-20 15:16 ` dumoulin.thibaut at gmail dot com
  2021-08-25 14:01 ` marxin at gcc dot gnu.org
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: dumoulin.thibaut at gmail dot com @ 2021-08-20 15:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101981

--- Comment #5 from Thibaut M. <dumoulin.thibaut at gmail dot com> ---
Thanks Martin!
Do you think it can be patched?


* [Bug target/101981] GCC10 produces bigger asm for simple switch than GCC7 - cortexM4 since r8-2701-g9dc3d6a96167b4c8
  2021-08-19 14:55 [Bug c/101981] New: GCC10 produces bigger asm for simple switch than GCC7 - cortexM4 dumoulin.thibaut at gmail dot com
                   ` (4 preceding siblings ...)
  2021-08-20 15:16 ` dumoulin.thibaut at gmail dot com
@ 2021-08-25 14:01 ` marxin at gcc dot gnu.org
  2021-09-09 12:51 ` dumoulin.thibaut at gmail dot com
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: marxin at gcc dot gnu.org @ 2021-08-25 14:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101981

--- Comment #6 from Martin Liška <marxin at gcc dot gnu.org> ---
(In reply to Thibaut M. from comment #5)
> Thanks Martin!
> Do you think it can be patched?

Dunno. The pass was moved from RTL to GIMPLE and I don't see a simple way to
fix it. It's expected that such a change would lead to a regression.


* [Bug target/101981] GCC10 produces bigger asm for simple switch than GCC7 - cortexM4 since r8-2701-g9dc3d6a96167b4c8
  2021-08-19 14:55 [Bug c/101981] New: GCC10 produces bigger asm for simple switch than GCC7 - cortexM4 dumoulin.thibaut at gmail dot com
                   ` (5 preceding siblings ...)
  2021-08-25 14:01 ` marxin at gcc dot gnu.org
@ 2021-09-09 12:51 ` dumoulin.thibaut at gmail dot com
  2021-09-14 12:53 ` marxin at gcc dot gnu.org
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: dumoulin.thibaut at gmail dot com @ 2021-09-09 12:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101981

--- Comment #7 from Thibaut M. <dumoulin.thibaut at gmail dot com> ---
There are other regressions (in terms of code size), but I'm not sure whether
they have the same cause.
For example, this code:

```C
#include <stdio.h>

void big_switch(int a) {
  switch (a) {
  default:
    printf("default a(%d)\n", a);
  case 0x1:
  case 0x2:
  // case 0x3:
  // case 0x4:
  case 0x5:
  case 0x6:
  // case 0x7:
  // case 0x8:
  // case 0x9:
  case 0xA:
  case 0xD:
  case 0xE:
  case 0xF:
  // case 0x10:
  case 0x11:
  case 0x12:
  // case 0x13:
  // case 0x14:
  case 0x15:
  case 0x17:
  case 0x18:
  case 0x19:
  case 0x1A:
  // case 0x1B:
  case 0x1C:
  case 0x1D: {
    printf("a(%d)\n", a);
  } break;
  }
}

int main(void) {
  big_switch(2);
  return 0;
}
```

With the compile command `arm-none-eabi-gcc -Os -mcpu=cortex-m4 ./switch.c`,
the asm is smaller with GCC7.3.1 than with GCC10.3.1.

GCC7.3.1 (20 lines):
```asm
00008134 <big_switch>:
    cmp     r0, #29
    push    {r4, lr}
    mov     r4, r0
    bhi.n   8146 <big_switch+0x12>
    movs    r2, #1
    ldr     r3, [pc, #28]   ; (815c <big_switch+0x28>)
    lsls    r2, r0
    ands    r3, r2
    cbnz    r3, 814e <big_switch+0x1a>
    mov     r1, r4
    ldr     r0, [pc, #20]   ; (8160 <big_switch+0x2c>)
    bl      8264 <printf>
    mov     r1, r4
    ldr     r0, [pc, #16]   ; (8164 <big_switch+0x30>)
    ldmia.w sp!, {r4, lr}
    b.w     8264 <printf>
    nop
    .word   0x37a6e466
    .word   0x0000f5f8
    .word   0x0000f600
```

GCC10.3.1 (24 lines):
```asm
0000813c <big_switch>:
    subs    r3, r0, #1
    push    {r4, lr}
    mov     r4, r0
    cmp     r3, #28
    bhi.n   8168 <big_switch+0x2c>
    tbb     [pc, r3]
    .short  0x1313
    .word   0x13130f0f
    .word   0x130f0f0f
    .word   0x13130f0f
    .word   0x13130f13
    .word   0x0f130f0f
    .word   0x13131313
    .short  0x130f
    .byte   0x13
    .byte   0x00
    mov     r1, r0
    ldr     r0, [pc, #16]   ; (817c <big_switch+0x40>)
    bl      828c <printf>
    mov     r1, r4
    ldr     r0, [pc, #8]    ; (817c <big_switch+0x40>)
    ldmia.w sp!, {r4, lr}
    b.w     828c <printf>
    .word   0x0001011c
```

Not a big difference in terms of instructions in this case, but as the switch
grows, the difference becomes huge (in my case it went from 75 to 94 lines).
Code size increases by about 10%.


@Martin, do you know if this is linked?


* [Bug target/101981] GCC10 produces bigger asm for simple switch than GCC7 - cortexM4 since r8-2701-g9dc3d6a96167b4c8
  2021-08-19 14:55 [Bug c/101981] New: GCC10 produces bigger asm for simple switch than GCC7 - cortexM4 dumoulin.thibaut at gmail dot com
                   ` (6 preceding siblings ...)
  2021-09-09 12:51 ` dumoulin.thibaut at gmail dot com
@ 2021-09-14 12:53 ` marxin at gcc dot gnu.org
  2021-09-14 12:56 ` marxin at gcc dot gnu.org
  2021-11-04 12:53 ` marxin at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: marxin at gcc dot gnu.org @ 2021-09-14 12:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101981

--- Comment #8 from Martin Liška <marxin at gcc dot gnu.org> ---
> 
> Not a big difference in terms of instructions in this case, but as the switch
> grows, the difference becomes huge (in my case it went from 75 to 94 lines).
> Code size increases by about 10%.
> 
> 
> @Martin, do you know if this is linked?

Yes, it's slightly related. This one is about a slightly changed emission of
bit-tests:

GCC 7 emits:

  <bb 2> [100.00%]:
  _6 = (unsigned int) a_2(D);
  if (_6 > 29)
    goto <bb 4>; [50.00%]
  else
    goto <bb 3>; [50.00%]

  <bb 3> [50.00%]:
  _8 = 1 << _6;
  _9 = _8 & 933684326;
  if (_9 != 0)
    goto <bb 5>; [46.00%]
  else
    goto <bb 4>; [54.00%]

  <bb 4> [77.00%]:
  printf ("default a(%d)\n", a_2(D));

  <bb 5> [100.00%]:
  printf ("a(%d)\n", a_2(D)); [tail call]
  return;

while GCC master does:

  <bb 2> [local count: 1073741824]:
  _13 = (unsigned int) a_2(D);
  _12 = _13 > 29;
  if (_12 != 0)
    goto <bb 4>; [20.00%]
  else
    goto <bb 3>; [80.00%]

  <bb 3> [local count: 858993464]:
  _11 = 933684326 >> _13;
  _10 = _11 & 1;
  _9 = _10 != 0;
  if (_9 != 0)
    goto <bb 5>; [58.62%]
  else
    goto <bb 4>; [41.38%]

  <bb 4> [local count: 354334800]:
<L0>:
  printf ("default a(%d)\n", a_2(D));

  <bb 5> [local count: 1073741824]:
<L1>:
  printf ("a(%d)\n", a_2(D));
  return;

which is pretty much the same code. Note that your line-count metric is a bit
misleading, as directives like .word or .byte have a different size in the
final binary, and the same goes for instructions.
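
As a rough illustration of the two bit-test forms in the dumps above, the
sketch below rebuilds the constant 933684326 (0x37a6e466, the `.word` in the
GCC 7 listing) from the case labels of `big_switch` and checks that both forms
select the same values. The helper names (`build_mask`, `hit_gcc7`,
`hit_master`, `check_bit_tests`) are hypothetical; this is not GCC's
implementation, only a C rendering of what the GIMPLE expresses.

```C
#include <assert.h>
#include <stddef.h>

/* Case labels of big_switch that skip the "default" printf. */
static const unsigned int labels[] = {
  0x1, 0x2, 0x5, 0x6, 0xA, 0xD, 0xE, 0xF, 0x11,
  0x12, 0x15, 0x17, 0x18, 0x19, 0x1A, 0x1C, 0x1D
};

/* Build the bit mask: bit n is set when n is one of the case labels. */
unsigned int build_mask(void) {
  unsigned int mask = 0;
  for (size_t i = 0; i < sizeof labels / sizeof labels[0]; i++) {
    mask |= 1u << labels[i];
  }
  return mask;
}

/* GCC 7 form: shift a single bit up and AND it with the mask.
   The range check comes first, so the shift count never exceeds 29. */
int hit_gcc7(unsigned int a, unsigned int mask) {
  return a <= 29 && ((1u << a) & mask) != 0;
}

/* GCC master form: shift the mask down and test bit 0. */
int hit_master(unsigned int a, unsigned int mask) {
  return a <= 29 && ((mask >> a) & 1u) != 0;
}

/* Asserts that the mask matches the dump and that both forms agree. */
void check_bit_tests(void) {
  unsigned int mask = build_mask();
  assert(mask == 933684326u);
  for (unsigned int a = 0; a < 64; a++) {
    assert(hit_gcc7(a, mask) == hit_master(a, mask));
  }
}
```

Both forms need one shift, one AND and one compare, which is consistent with
the remark that the two dumps are pretty much the same code.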


* [Bug target/101981] GCC10 produces bigger asm for simple switch than GCC7 - cortexM4 since r8-2701-g9dc3d6a96167b4c8
  2021-08-19 14:55 [Bug c/101981] New: GCC10 produces bigger asm for simple switch than GCC7 - cortexM4 dumoulin.thibaut at gmail dot com
                   ` (7 preceding siblings ...)
  2021-09-14 12:53 ` marxin at gcc dot gnu.org
@ 2021-09-14 12:56 ` marxin at gcc dot gnu.org
  2021-11-04 12:53 ` marxin at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: marxin at gcc dot gnu.org @ 2021-09-14 12:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101981

--- Comment #9 from Martin Liška <marxin at gcc dot gnu.org> ---
So GCC 7 emits:

$ nm --print-size pr101981-2.o
00000000 00000034 T big_switch

and GCC master emits:

$ nm --print-size pr101981-2.o
00000000 00000030 T big_switch

So the code is smaller with the current master (0x30 = 48 bytes versus
0x34 = 52 bytes).


* [Bug target/101981] GCC10 produces bigger asm for simple switch than GCC7 - cortexM4 since r8-2701-g9dc3d6a96167b4c8
  2021-08-19 14:55 [Bug c/101981] New: GCC10 produces bigger asm for simple switch than GCC7 - cortexM4 dumoulin.thibaut at gmail dot com
                   ` (8 preceding siblings ...)
  2021-09-14 12:56 ` marxin at gcc dot gnu.org
@ 2021-11-04 12:53 ` marxin at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: marxin at gcc dot gnu.org @ 2021-11-04 12:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101981

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |WORKSFORME
             Status|NEW                         |RESOLVED

--- Comment #10 from Martin Liška <marxin at gcc dot gnu.org> ---
Thus closing as WORKSFORME.

