public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/106615] New: redundant load and store introduced in if-true-branch
@ 2022-08-14 15:48 absoler at smail dot nju.edu.cn
  2022-08-14 16:28 ` [Bug tree-optimization/106615] " pinskia at gcc dot gnu.org
  2024-02-16  5:56 ` absoler at smail dot nju.edu.cn
  0 siblings, 2 replies; 3+ messages in thread
From: absoler at smail dot nju.edu.cn @ 2022-08-14 15:48 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106615

            Bug ID: 106615
           Summary: redundant load and store introduced in if-true-branch
           Product: gcc
           Version: 12.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: absoler at smail dot nju.edu.cn
  Target Milestone: ---

Given the code:

int g_36 = 0x0B137D37L;
int *g_35 = &g_36;
int g_44[2] = {0x964071C1L,0x964071C1L};

unsigned char  func_1(void);
int * func_16(int * p_17, unsigned long int  p_18);
int * func_19(int * p_20);


unsigned char func_1() { func_16(func_19(g_35), g_44[1]); }
int* func_16(int *a, unsigned long int b) {
        unsigned int c=1;
        *a = g_44;
        if ((g_44[0] = c) <= b)
                ;
        else
                *a = 0;
}
int* func_19(int *d) {
        g_44[1] |= g_36 ;
        return d;
}

When this is compiled with gcc-12.1 at -O1, the generated assembly for
func_16 is:

0000000000401186 <func_16>:
  401186:       b8 60 40 40 00          mov    $0x404060,%eax
  40118b:       89 07                   mov    %eax,(%rdi)
  40118d:       c7 05 c9 2e 00 00 01    movl   $0x1,0x2ec9(%rip)        # 404060 <g_44>
  401194:       00 00 00 
  401197:       b8 00 00 00 00          mov    $0x0,%eax
  40119c:       48 85 f6                test   %rsi,%rsi
  40119f:       74 02                   je     4011a3 <func_16+0x1d>
  4011a1:       8b 07                   mov    (%rdi),%eax    # %rdi holds the address of g_36
  4011a3:       89 07                   mov    %eax,(%rdi)
  4011a5:       c3                      retq  

We can see that when the if-condition is true, g_36 is loaded into %eax and
stored back unchanged. Shouldn't this load and store be redundant?


* [Bug tree-optimization/106615] redundant load and store introduced in if-true-branch
From: pinskia at gcc dot gnu.org @ 2022-08-14 16:28 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106615

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
           Keywords|                            |missed-optimization
   Last reconfirmed|                            |2022-08-14
             Status|UNCONFIRMED                 |NEW
           Severity|normal                      |enhancement

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Reduced testcase:
int g_44;

void func_16(int *a, unsigned long int b) {
        *a = 5;
        if ((g_44 = 1) <= b)
                ;
        else
                *a = 0;
}

---- CUT ----
So CSElim on the gimple level goes from:

  *a_3(D) = 5;
  g_44 = 1;
  if (b_6(D) != 0)
    goto <bb 4>; [50.00%]
  else
    goto <bb 3>; [50.00%]

  <bb 3> [local count: 536870913]:
  *a_3(D) = 0;

  <bb 4> [local count: 1073741824]:
  return;

to:

  *a_3(D) = 5;
  g_44 = 1;
  if (b_6(D) != 0)
    goto <bb 3>; [50.00%]
  else
    goto <bb 4>; [50.00%]

  <bb 3> [local count: 536870912]:
  cstore_8 = MEM <int> [(void *)a_3(D)];

  <bb 4> [local count: 1073741824]:
  # cstore_9 = PHI <cstore_8(3), 0(2)>
  MEM <int> [(void *)a_3(D)] = cstore_9;

cselim does this on the expectation that the load inside the if branch can be
removed later (for example, at -O2 with 1 instead of 5, GCC can remove the
load). When the load cannot be removed, nothing afterwards undoes the
transformation.
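
For readers less used to GIMPLE dumps, the two shapes above can be written back
as C. This is only a hand-written sketch of the before/after forms, not output
from GCC; the names func_16_before, func_16_after and the temporary cstore are
made up for illustration. One reading of why the load survives with 5 is that a
may point at g_44, in which case the reload yields 1 rather than 5; with 1
instead of 5 both cases agree, which would fit the -O2 remark above.

```c
int g_44;

/* Shape before cselim: the store of 0 happens only on the else path. */
void func_16_before(int *a, unsigned long b) {
    *a = 5;
    g_44 = 1;
    if (b != 0)
        ;               /* (g_44 = 1) <= b is the same as b != 0 */
    else
        *a = 0;         /* conditional store */
}

/* Shape after cselim: one unconditional store, fed by a PHI-like temporary.
   "cstore" is a hypothetical name mirroring cstore_8/cstore_9 in the dump. */
void func_16_after(int *a, unsigned long b) {
    int cstore;
    *a = 5;
    g_44 = 1;
    if (b != 0)
        cstore = *a;    /* reload: 5, or 1 if a happens to point at g_44 */
    else
        cstore = 0;
    *a = cstore;        /* store executed on both paths */
}
```

Both functions leave *a and g_44 in the same final state for every input, so
the rewrite is valid; the cost is the extra load and store on the b != 0 path.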


* [Bug tree-optimization/106615] redundant load and store introduced in if-true-branch
From: absoler at smail dot nju.edu.cn @ 2024-02-16  5:56 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106615

--- Comment #2 from absoler at smail dot nju.edu.cn ---
(In reply to Andrew Pinski from comment #1)
> Reduced testcase:
> int g_44;
> 
> void func_16(int *a, unsigned long int b) {
> 	*a = 5;
> 	if ((g_44 = 1) <= b)
> 		;
> 	else
> 		*a = 0;
> }
> 
> ---- CUT ----
> So CSElim on the gimple level goes from:
> 
>   *a_3(D) = 5;
>   g_44 = 1;
>   if (b_6(D) != 0)
>     goto <bb 4>; [50.00%]
>   else
>     goto <bb 3>; [50.00%]
> 
>   <bb 3> [local count: 536870913]:
>   *a_3(D) = 0;
> 
>   <bb 4> [local count: 1073741824]:
>   return;
> 
> to:
> 
>   *a_3(D) = 5;
>   g_44 = 1;
>   if (b_6(D) != 0)
>     goto <bb 3>; [50.00%]
>   else
>     goto <bb 4>; [50.00%]
> 
>   <bb 3> [local count: 536870912]:
>   cstore_8 = MEM <int> [(void *)a_3(D)];
> 
>   <bb 4> [local count: 1073741824]:
>   # cstore_9 = PHI <cstore_8(3), 0(2)>
>   MEM <int> [(void *)a_3(D)] = cstore_9;
> 
> Thinking it might be able to remove the load inside the if branch (for an
> example at -O2 with 1 instead of 5, GCC can remove the load).
> And then nothing afterwards will undo that transformation.

Now I have found a similar case. gcc-13.2.0 introduces a redundant load and
store with "-O2 -ffinite-loops" but not with "-O1" alone, while gcc-12.2.0
introduces them with just "-O2".

source:
```
int g_4[1][1][1] = {{{1L}}};
int g_80 = 0L;
int g_161 = 0x37FF6474L;
int *g_160 = &g_161;
int **g_159 = &g_160;
char f;

void func_1() {
  int *a[4];
  int b = 0, c;
  for (; b < 4; b++)
    a[b] = (int *)g_4;
  *a[3] |= 6;
  **g_159 = g_80;
  int d[1][1];
  unsigned e;
  for (b = 0; b < 1; b++)
    for (c = 0; c < 1; c++)
      d[b][c] = e;
  if (0 > f)
    *a[3] = 0;
}

```

problematic binary generated by gcc-13.2.0 -O2 -ffinite-loops, or by
gcc-12.2.0 -O2:
```
00000000004016e0 <func_1>:
func_1():
  4016e0:       mov    0x29c9(%rip),%rax        # 4040b0 <g_159>
  4016e7:       mov    0x105bb(%rip),%edx        # 411ca8 <g_80>
/root/myCSmith/test/output2.c:42
  4016ed:       orl    $0x6,0x29ec(%rip)        # 4040e0 <g_4>
/root/myCSmith/test/output2.c:43
  4016f4:       mov    (%rax),%rax
  4016f7:       mov    %edx,(%rax)
/root/myCSmith/test/output2.c:50
  4016f9:       xor    %eax,%eax
/root/myCSmith/test/output2.c:49
  4016fb:       cmpb   $0x0,0x2992(%rip)        # 404094 <g_277+0x14>
  401702:       js     40170a <func_1+0x2a>
/root/myCSmith/test/output2.c:50
  401704:       mov    0x29d6(%rip),%eax        # 4040e0 <g_4>
  40170a:       mov    %eax,0x29d0(%rip)        # 4040e0 <g_4>
/root/myCSmith/test/output2.c:51
  401710:       retq   
  401711:       data16 nopw %cs:0x0(%rax,%rax,1)
  40171c:       nopl   0x0(%rax)
```

better binary generated by gcc-13.2.0 -O2:
```
0000000000401470 <func_1>:
func_1():
/root/myCSmith/test/output2.c:41
  401470:       mov    0x2be9(%rip),%rax        # 404060 <g_159>
  401477:       mov    0x1072f(%rip),%edx        # 411bac <g_80>
/root/myCSmith/test/output2.c:40
  40147d:       orl    $0x6,0x2bf0(%rip)        # 404074 <g_4>
/root/myCSmith/test/output2.c:47
  401484:       cmpb   $0x0,0x1071d(%rip)        # 411ba8 <f>
/root/myCSmith/test/output2.c:41
  40148b:       mov    (%rax),%rax
  40148e:       mov    %edx,(%rax)
/root/myCSmith/test/output2.c:47
  401490:       js     401498 <func_1+0x28>
/root/myCSmith/test/output2.c:49
  401492:       retq   
  401493:       nopl   0x0(%rax,%rax,1)
/root/myCSmith/test/output2.c:48
  401498:       movl   $0x0,0x2bd2(%rip)        # 404074 <g_4>
/root/myCSmith/test/output2.c:49
  4014a2:       retq   
  4014a3:       data16 nopw %cs:0x0(%rax,%rax,1)
  4014ae:       xchg   %ax,%ax
```
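
The difference between the two listings can be summarised in C. This is a
hand-written sketch of the tail of func_1 only, under some assumptions: g_4 is
flattened to a plain int, f is declared signed char (plain char is signed on
x86-64 Linux), the store through **g_159 is left out, and the function names
and tmp are made up for illustration.

```c
int g_4 = 1;        /* stands in for g_4[0][0][0] */
signed char f;      /* assumption: plain char is signed on this target */

/* Shape of the "better" binary: g_4 is written only when f is negative. */
void tail_conditional_store(void) {
    g_4 |= 6;
    if (f < 0)
        g_4 = 0;
}

/* Shape of the "problematic" binary: g_4 is always written, and when f is
   not negative the value written is the one that was just reloaded. */
void tail_load_and_store(void) {
    int tmp = 0;    /* corresponds to xor %eax,%eax */
    g_4 |= 6;
    if (!(f < 0))
        tmp = g_4;  /* redundant load */
    g_4 = tmp;      /* store-back on both paths */
}
```

Both shapes produce the same final value of g_4; the second one just pays for
an extra load and store whenever f is not negative.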
