public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/62151] New: wrong code at -O2 and -O3 on x86_64-linux-gnu
@ 2014-08-15  6:14 su at cs dot ucdavis.edu
  2014-08-15  7:33 ` [Bug tree-optimization/62151] [5 Regression] " mpolacek at gcc dot gnu.org
                   ` (19 more replies)
  0 siblings, 20 replies; 21+ messages in thread
From: su at cs dot ucdavis.edu @ 2014-08-15  6:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62151

            Bug ID: 62151
           Summary: wrong code at -O2 and -O3 on x86_64-linux-gnu
           Product: gcc
           Version: 5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: su at cs dot ucdavis.edu

The current gcc trunk miscompiles the following code on x86_64-linux-gnu at -O2
and -O3 in both 32-bit and 64-bit modes.

This is a regression from 4.9.x. 

$ gcc-trunk -v
Using built-in specs.
COLLECT_GCC=gcc-trunk
COLLECT_LTO_WRAPPER=/usr/local/gcc-trunk/libexec/gcc/x86_64-unknown-linux-gnu/4.10.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-trunk/configure --prefix=/usr/local/gcc-trunk
--enable-languages=c,c++ --disable-werror --enable-multilib
Thread model: posix
gcc version 4.10.0 20140814 (experimental) [trunk revision 213937] (GCC) 
$ 
$ gcc-trunk -Os small.c; a.out
$ gcc-4.9 -O2 small.c; a.out
$     
$ gcc-trunk -O2 small.c; a.out
Aborted (core dumped)
$ gcc-trunk -O3 small.c; a.out
Aborted (core dumped)
$ 


----------------------------


int a, c, d, e, f, g, h, i;
short b;

int
fn1 ()
{
  b = 0;
  for (;;)
    {
      int j[2];
      j[f] = 0;
      if (h)
    d = 0;
      else
    {
      for (; f; f++)
        ;
      for (a = 0; a < 1; a++)
        for (;;)
          {
        i = b & ((b ^ 1) & 83647) ? b : b - 1;
        g = 1 ? i : 0;
        e = j[0];
        if (c)
          break;
        return 0;
          }
    }
    }
}

int
main ()
{
  fn1 ();
  if (g != -1) 
    __builtin_abort (); 
  return 0;
}


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/62151] [5 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu
  2014-08-15  6:14 [Bug tree-optimization/62151] New: wrong code at -O2 and -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
@ 2014-08-15  7:33 ` mpolacek at gcc dot gnu.org
  2014-08-15  7:34 ` mpolacek at gcc dot gnu.org
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: mpolacek at gcc dot gnu.org @ 2014-08-15  7:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62151

Marek Polacek <mpolacek at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2014-08-15
                 CC|                            |mpolacek at gcc dot gnu.org,
                   |                            |zhenqiang.chen at linaro dot org
            Summary|wrong code at -O2 and -O3   |[5 Regression] wrong code
                   |on x86_64-linux-gnu         |at -O2 and -O3 on
                   |                            |x86_64-linux-gnu
     Ever confirmed|0                           |1

--- Comment #1 from Marek Polacek <mpolacek at gcc dot gnu.org> ---
Started with r211885.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/62151] [5 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu
  2014-08-15  6:14 [Bug tree-optimization/62151] New: wrong code at -O2 and -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
  2014-08-15  7:33 ` [Bug tree-optimization/62151] [5 Regression] " mpolacek at gcc dot gnu.org
@ 2014-08-15  7:34 ` mpolacek at gcc dot gnu.org
  2014-08-15  7:42 ` [Bug rtl-optimization/62151] " mpolacek at gcc dot gnu.org
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: mpolacek at gcc dot gnu.org @ 2014-08-15  7:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62151

Marek Polacek <mpolacek at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |5.0


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug rtl-optimization/62151] [5 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu
  2014-08-15  6:14 [Bug tree-optimization/62151] New: wrong code at -O2 and -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
  2014-08-15  7:33 ` [Bug tree-optimization/62151] [5 Regression] " mpolacek at gcc dot gnu.org
  2014-08-15  7:34 ` mpolacek at gcc dot gnu.org
@ 2014-08-15  7:42 ` mpolacek at gcc dot gnu.org
  2014-08-15  7:55 ` jakub at gcc dot gnu.org
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: mpolacek at gcc dot gnu.org @ 2014-08-15  7:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62151

--- Comment #2 from Marek Polacek <mpolacek at gcc dot gnu.org> ---
Seems like with -fno-move-loop-invariants even gcc 4.9 miscompiles this.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug rtl-optimization/62151] [5 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu
  2014-08-15  6:14 [Bug tree-optimization/62151] New: wrong code at -O2 and -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
                   ` (2 preceding siblings ...)
  2014-08-15  7:42 ` [Bug rtl-optimization/62151] " mpolacek at gcc dot gnu.org
@ 2014-08-15  7:55 ` jakub at gcc dot gnu.org
  2014-08-15  8:04 ` amker at gcc dot gnu.org
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: jakub at gcc dot gnu.org @ 2014-08-15  7:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62151

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
With -O3 -fno-move-loop-invariants it started with r208165.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug rtl-optimization/62151] [5 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu
  2014-08-15  6:14 [Bug tree-optimization/62151] New: wrong code at -O2 and -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
                   ` (3 preceding siblings ...)
  2014-08-15  7:55 ` jakub at gcc dot gnu.org
@ 2014-08-15  8:04 ` amker at gcc dot gnu.org
  2014-08-15  9:53 ` rguenth at gcc dot gnu.org
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: amker at gcc dot gnu.org @ 2014-08-15  8:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62151

amker at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amker at gcc dot gnu.org

--- Comment #4 from amker at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #3)
> With -O3 -fno-move-loop-invariants it started with r208165.

It's my checkin, I shall have a look.
Thanks.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug rtl-optimization/62151] [5 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu
  2014-08-15  6:14 [Bug tree-optimization/62151] New: wrong code at -O2 and -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
                   ` (4 preceding siblings ...)
  2014-08-15  8:04 ` amker at gcc dot gnu.org
@ 2014-08-15  9:53 ` rguenth at gcc dot gnu.org
  2014-08-15  9:53 ` rguenth at gcc dot gnu.org
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu.org @ 2014-08-15  9:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62151

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
Note that probably also made a latent issue pop up.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug rtl-optimization/62151] [5 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu
  2014-08-15  6:14 [Bug tree-optimization/62151] New: wrong code at -O2 and -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
                   ` (5 preceding siblings ...)
  2014-08-15  9:53 ` rguenth at gcc dot gnu.org
@ 2014-08-15  9:53 ` rguenth at gcc dot gnu.org
  2014-08-15 15:39 ` amker at gcc dot gnu.org
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu.org @ 2014-08-15  9:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62151

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code
           Priority|P3                          |P1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug rtl-optimization/62151] [5 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu
  2014-08-15  6:14 [Bug tree-optimization/62151] New: wrong code at -O2 and -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
                   ` (6 preceding siblings ...)
  2014-08-15  9:53 ` rguenth at gcc dot gnu.org
@ 2014-08-15 15:39 ` amker at gcc dot gnu.org
  2014-08-16 12:11 ` amker at gcc dot gnu.org
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: amker at gcc dot gnu.org @ 2014-08-15 15:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62151

--- Comment #6 from amker at gcc dot gnu.org ---
(In reply to Richard Biener from comment #5)
> Note that probably also made a latent issue pop up.

Indeed.  After preliminary investigation, I think this case reveals two latent
issues.  The first one is a missed optimization opportunity on GIMPLE, while
the second one is a wrong code issue in RTL combine.

I will firstly look into the combine issue and send a report later.

Thanks,
bin


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug rtl-optimization/62151] [5 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu
  2014-08-15  6:14 [Bug tree-optimization/62151] New: wrong code at -O2 and -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
                   ` (7 preceding siblings ...)
  2014-08-15 15:39 ` amker at gcc dot gnu.org
@ 2014-08-16 12:11 ` amker at gcc dot gnu.org
  2014-08-19 18:55 ` ebotcazou at gcc dot gnu.org
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: amker at gcc dot gnu.org @ 2014-08-16 12:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62151

--- Comment #7 from amker at gcc dot gnu.org ---
It's a combine pass issue and it happens on x86 too.  Dump before combine pass
is fine as below.

   30: r83:SI=0
   71: flags:CC=cmp(r83:SI,0x1)
      REG_DEAD r83:SI
   72: {r83:SI=-ltu(flags:CC,0);clobber flags:CC;}
      REG_DEAD flags:CC
      REG_UNUSED flags:CC
   41: [`i']=r83:SI
   42: [`g']=r83:SI

Combine pass optimizes it in a wrong way and generates below dump.

   30: NOTE_INSN_DELETED
   71: NOTE_INSN_DELETED
   72: NOTE_INSN_DELETED
   41: [`i']=0xffffffffffffffff
   42: [`g']=r83:SI

Apparently, insns 30/71/72/41 are combined.  The intermediate state of code
during combine is as below.

   30: NOTE_INSN_DELETED
   71: NOTE_INSN_DELETED
   72: r83:SI=0xffffffffffffffff
   41: [`i']=0xffffffffffffffff
   42: [`g']=r83:SI

Since r83 is used across 41, insn72 shouldn't be deleted here.  Though
try_combine take this into consideration at the first place, it falsely deletes
it in calling to distribute_notes as in below calling stack.

#0  set_insn_deleted (insn=0xb7bd6438) at ../../gcc/gcc/emit-rtl.c:4040
#1  0x08f22c3c in distribute_notes (notes=0xb7bd5f10, from_insn=0xb7bd6414,
i3=0xb7bd6120, i2=0xb7bd6438, elim_i2=0x0, elim_i1=0xb7bd5cf0, elim_i0=0x0) at
../../gcc/gcc/combine.c:13560
#2  0x08f09fc4 in try_combine (i3=0xb7bd6120, i2=0xb7bd6438, i1=0xb7bd6414,
i0=0xb7bd6000, new_direct_jump_p=0xbfffed14, last_combined_insn=0xb7bd6120) at
../../gcc/gcc/combine.c:4177
#3  0x08f01605 in combine_instructions (f=0xb7b36ec0, nregs=107) at
../../gcc/gcc/combine.c:1385
#4  0x08f23b10 in rest_of_handle_combine () at ../../gcc/gcc/combine.c:13896

I assume distribute_notes is trying to get notes of insn71 and link it to
appropriate instruction.  Code snippet of distribute_notes shows it checks
whether tem(i2/insn72) is a set to r83 (which is the register of the REG_DEAD
note of insn71), since the check is true, it just deletes tem(i2/insn72).


      if (place == 0)
        {
          basic_block bb = this_basic_block;

          for (tem = PREV_INSN (tem); place == 0; tem = PREV_INSN (tem))
        {
          if (!NONDEBUG_INSN_P (tem))
            {
              if (tem == BB_HEAD (bb))
            break;
              continue;
            }

          /* If the register is being set at TEM, see if that is all
             TEM is doing.  If so, delete TEM.  Otherwise, make this
             into a REG_UNUSED note instead. Don't delete sets to
             global register vars.  */
          if ((REGNO (XEXP (note, 0)) >= FIRST_PSEUDO_REGISTER
               || !global_regs[REGNO (XEXP (note, 0))])
              && reg_set_p (XEXP (note, 0), PATTERN (tem)))
            {
              rtx set = single_set (tem);
              rtx inner_dest = 0;
#ifdef HAVE_cc0
              rtx cc0_setter = NULL_RTX;
#endif

              if (set != 0)
            for (inner_dest = SET_DEST (set);
                 (GET_CODE (inner_dest) == STRICT_LOW_PART
                  || GET_CODE (inner_dest) == SUBREG
                  || GET_CODE (inner_dest) == ZERO_EXTRACT);
                 inner_dest = XEXP (inner_dest, 0))
              ;

              /* Verify that it was the set, and not a clobber that
             modified the register.

             CC0 targets must be careful to maintain setter/user
             pairs.  If we cannot delete the setter due to side
             effects, mark the user with an UNUSED note instead
             of deleting it.  */

              if (set != 0 && ! side_effects_p (SET_SRC (set))
              && rtx_equal_p (XEXP (note, 0), inner_dest)
#ifdef HAVE_cc0
              && (! reg_mentioned_p (cc0_rtx, SET_SRC (set))
                  || ((cc0_setter = prev_cc0_setter (tem)) != NULL
                  && sets_cc0_p (PATTERN (cc0_setter)) > 0))
#endif
              )
            {
              /* Move the notes and links of TEM elsewhere.
                 This might delete other dead insns recursively.
                 First set the pattern to something that won't use
                 any register.  */
              rtx old_notes = REG_NOTES (tem);

              PATTERN (tem) = pc_rtx;
              REG_NOTES (tem) = NULL;

              distribute_notes (old_notes, tem, tem, NULL_RTX,
                        NULL_RTX, NULL_RTX, NULL_RTX);
              distribute_links (LOG_LINKS (tem));

              SET_INSN_DELETED (tem);
              if (tem == i2)
                i2 = NULL_RTX;


To be honest, I don't fully understand above code, but it seems wrong to delete
an instruction after from_insn which sets a previously DEAD register.

Quoted from same function, there is :

      /* If the register is set or already dead at PLACE, we needn't do
         anything with this note if it is still a REG_DEAD note.
         We check here if it is set at all, not if is it totally replaced,
         which is what `dead_or_set_p' checks, so also check for it being
         set partially.  */

      if (place && REG_NOTE_KIND (note) == REG_DEAD)
        {
          unsigned int regno = REGNO (XEXP (note, 0));
          reg_stat_type *rsp = &reg_stat[regno];

          if (dead_or_set_p (place, XEXP (note, 0))
          || reg_bitfield_target_p (XEXP (note, 0), PATTERN (place)))
        {
          /* Unless the register previously died in PLACE, clear
             last_death.  [I no longer understand why this is
             being done.] */
          if (rsp->last_death != place)
            rsp->last_death = 0;
          place = 0;
        }
          else
        rsp->last_death = place;

It says we don't need to do anything about this REG_DEAD note if the register
is set or already dead at place.  This seems right to be and we need to adapt
previous code by considering register setting.  For example, below code can fix
the issue.

          if (from_insn
          && CALL_P (from_insn)
          && find_reg_fusage (from_insn, USE, XEXP (note, 0)))
        place = from_insn;
          else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3)))
        place = i3;
          else if (i2 != 0 && next_nonnote_nondebug_insn (i2) == i3
               && (reg_referenced_p (XEXP (note, 0), PATTERN (i2))
               || reg_set_p (XEXP (note, 0), PATTERN (i2))))
        place = i2;
          else if ((rtx_equal_p (XEXP (note, 0), elim_i2)
            && !(i2mod
                 && reg_overlap_mentioned_p (XEXP (note, 0),
                             i2mod_old_rhs)))
               || rtx_equal_p (XEXP (note, 0), elim_i1)
               || rtx_equal_p (XEXP (note, 0), elim_i0))
        break;

          tem = i3;

Or it can be fixed by setting tem to the first register setting instruction (in
this case, i2), rather than i3 by default.

I will try to test a patch, meanwhile, I am wondering if any combine expert has
something to share.  BTW, combine pass seems to be another ad-hoc
implementation like reload, I saw several "Don't know" in comments during this
investigation!

Thanks,
bin


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug rtl-optimization/62151] [5 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu
  2014-08-15  6:14 [Bug tree-optimization/62151] New: wrong code at -O2 and -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
                   ` (8 preceding siblings ...)
  2014-08-16 12:11 ` amker at gcc dot gnu.org
@ 2014-08-19 18:55 ` ebotcazou at gcc dot gnu.org
  2014-08-20  2:32 ` amker at gcc dot gnu.org
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: ebotcazou at gcc dot gnu.org @ 2014-08-19 18:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62151

Eric Botcazou <ebotcazou at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ebotcazou at gcc dot gnu.org

--- Comment #8 from Eric Botcazou <ebotcazou at gcc dot gnu.org> ---
> I will try to test a patch, meanwhile, I am wondering if any combine expert
> has something to share.

distribute_notes is certainly an endless source of bugs.

> BTW, combine pass seems to be another ad-hoc implementation like reload, I saw
> several "Don't know" in comments during this investigation!

You would need to define what an "ad-hoc implementation" is exactly.  See the
top of the file for the origin of the pass.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug rtl-optimization/62151] [5 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu
  2014-08-15  6:14 [Bug tree-optimization/62151] New: wrong code at -O2 and -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
                   ` (9 preceding siblings ...)
  2014-08-19 18:55 ` ebotcazou at gcc dot gnu.org
@ 2014-08-20  2:32 ` amker at gcc dot gnu.org
  2014-10-02  2:18 ` segher at gcc dot gnu.org
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: amker at gcc dot gnu.org @ 2014-08-20  2:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62151

--- Comment #9 from amker at gcc dot gnu.org ---
(In reply to Eric Botcazou from comment #8)
> > I will try to test a patch, meanwhile, I am wondering if any combine expert
> > has something to share.
> 
> distribute_notes is certainly an endless source of bugs.
One patch is sent at https://gcc.gnu.org/ml/gcc-patches/2014-08/msg01718.html
It's a scratch patch for discussion's purpose. Though I think it's right to
find distribution place for REG_DEAD note starting from i2 if it kills the
register, I am not sure if it's fixing the issue in a case by case manner.  So
it's highly appreciated if any expert in this area can share some opinions, or
review the patch?

> 
> > BTW, combine pass seems to be another ad-hoc implementation like reload, I saw
> > several "Don't know" in comments during this investigation!
> 
> You would need to define what an "ad-hoc implementation" is exactly.  See
> the top of the file for the origin of the pass.
Err, I may used the wrong word.  These are just frustrating words after seeing
several "Don't know" comments in the code.  Please ignore them.

Thanks.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug rtl-optimization/62151] [5 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu
  2014-08-15  6:14 [Bug tree-optimization/62151] New: wrong code at -O2 and -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
                   ` (10 preceding siblings ...)
  2014-08-20  2:32 ` amker at gcc dot gnu.org
@ 2014-10-02  2:18 ` segher at gcc dot gnu.org
  2014-10-02  2:30 ` segher at gcc dot gnu.org
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: segher at gcc dot gnu.org @ 2014-10-02  2:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62151

--- Comment #10 from Segher Boessenkool <segher at gcc dot gnu.org> ---
Author: segher
Date: Thu Oct  2 02:18:01 2014
New Revision: 215789

URL: https://gcc.gnu.org/viewcvs?rev=215789&root=gcc&view=rev
Log:
2014-10-01  Segher Boessenkool  <segher@kernel.crashing.org>

gcc/
    PR rtl-optimization/62151
    * combine.c (can_combine_p): Allow the destination register of INSN
    to be clobbered in I3.
    (subst): Do not substitute into clobbers of registers.

gcc/testsuite/
    * gcc.dg/combine-clobber.c: New.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/combine.c
    trunk/gcc/testsuite/ChangeLog


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug rtl-optimization/62151] [5 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu
  2014-08-15  6:14 [Bug tree-optimization/62151] New: wrong code at -O2 and -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
                   ` (11 preceding siblings ...)
  2014-10-02  2:18 ` segher at gcc dot gnu.org
@ 2014-10-02  2:30 ` segher at gcc dot gnu.org
  2014-12-09  8:48 ` amker.cheng at gmail dot com
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: segher at gcc dot gnu.org @ 2014-10-02  2:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62151

Segher Boessenkool <segher at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |segher at gcc dot gnu.org

--- Comment #11 from Segher Boessenkool <segher at gcc dot gnu.org> ---
With the patch in #c10, insn 30/71/72 (from #c7) are now combined, hiding
the problem.  It looks like the problematic situation can still happen
though, with some very improbable instruction sequence (the three-insn
combination has to be more expensive than the original first three insns,
but the three-insn combination and the four-insn combination together have
to be cheaper than the original four insns).


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug rtl-optimization/62151] [5 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu
  2014-08-15  6:14 [Bug tree-optimization/62151] New: wrong code at -O2 and -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
                   ` (12 preceding siblings ...)
  2014-10-02  2:30 ` segher at gcc dot gnu.org
@ 2014-12-09  8:48 ` amker.cheng at gmail dot com
  2014-12-09  9:27 ` amker.cheng at gmail dot com
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: amker.cheng at gmail dot com @ 2014-12-09  8:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62151

bin.cheng <amker.cheng at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amker.cheng at gmail dot com,
                   |                            |law at redhat dot com

--- Comment #12 from bin.cheng <amker.cheng at gmail dot com> ---
Seems there are two problems in distribute_notes when handling REG_DEAD note. 
Quote from source code:
          if (from_insn
          && CALL_P (from_insn)
          && find_reg_fusage (from_insn, USE, XEXP (note, 0)))
        place = from_insn;
          else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3)))
        place = i3;
          else if (i2 != 0 && next_nonnote_nondebug_insn (i2) == i3
               && reg_referenced_p (XEXP (note, 0), PATTERN (i2)))
        place = i2;
          else if ((rtx_equal_p (XEXP (note, 0), elim_i2)
            && !(i2mod
                 && reg_overlap_mentioned_p (XEXP (note, 0),
                             i2mod_old_rhs)))
               || rtx_equal_p (XEXP (note, 0), elim_i1)
               || rtx_equal_p (XEXP (note, 0), elim_i0))
        break;
          tem_insn = i3;

Issue 1, Checks against elim_i0/elim_i1/elim_i2 and their values are not
correct for some cases.  In this case as below

  i0:  r0 <- i0src
  i1:  r1 <- i1src (using r0)
             REG_DEAD (r0)
  i2:  r0 <- i2src (using r1)
  i3:  r3 <- i3src (using r0)
  ix:  other use of r0
In try_combine, elim_i0/elim_i1/elim_i2 are set to 0/r1/0 correspondingly. 
When calling distribute_notes for reg_notes in i1, the check "rtx_equal_p (XEXP
(note, 0), elim_i0)" evaluates to FALSE.  As a result, GCC falls through to
below code which is to handle another note distribution case.  This is why I
thought the logic doesn't make any sense at the first glance.

So If variables elim_i0/elim_i1/elim_i2 are set to correct values,
distribution_notes acts as expected by discarding the REG_DEAD note.

The point is, when we distributing REG_DEAD note in i1, elim_i0 should be set
to i1dest (r0 in the example), which means r0 is eliminated as far as i1 is
handled, no matter if it is re-set by i2 or not.

Another issue is about tem_insn, I think in any cases, we should start
searching reference/def of the noted register from "from_insn" (if it exist). 
Think about below case:

  ix:  rA <- xxx
  i0:  r0 <- const_0
  i1:  r1 <- rA & r0
             REG_DEAD (rA)
  i2:  r0 <- i2src (using r1)
  iy:  rA <- yyy
  i3:  r3 <- i3src (using r0)

Starting from i3, we will delete insn iy, but what should be deleted actually
is insn ix!


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug rtl-optimization/62151] [5 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu
  2014-08-15  6:14 [Bug tree-optimization/62151] New: wrong code at -O2 and -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
                   ` (13 preceding siblings ...)
  2014-12-09  8:48 ` amker.cheng at gmail dot com
@ 2014-12-09  9:27 ` amker.cheng at gmail dot com
  2014-12-10  8:14 ` amker at gcc dot gnu.org
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: amker.cheng at gmail dot com @ 2014-12-09  9:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62151

--- Comment #13 from bin.cheng <amker.cheng at gmail dot com> ---
The check itself is suspicious too.  Why do we want to check elim_i2/elim_i1
when distributing REG_DEAD note from i1?


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug rtl-optimization/62151] [5 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu
  2014-08-15  6:14 [Bug tree-optimization/62151] New: wrong code at -O2 and -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
                   ` (14 preceding siblings ...)
  2014-12-09  9:27 ` amker.cheng at gmail dot com
@ 2014-12-10  8:14 ` amker at gcc dot gnu.org
  2014-12-11  3:10 ` amker.cheng at gmail dot com
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: amker at gcc dot gnu.org @ 2014-12-10  8:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62151

--- Comment #14 from amker at gcc dot gnu.org ---
Working on a patch.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug rtl-optimization/62151] [5 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu
  2014-08-15  6:14 [Bug tree-optimization/62151] New: wrong code at -O2 and -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
                   ` (15 preceding siblings ...)
  2014-12-10  8:14 ` amker at gcc dot gnu.org
@ 2014-12-11  3:10 ` amker.cheng at gmail dot com
  2014-12-11  8:52 ` amker at gcc dot gnu.org
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: amker.cheng at gmail dot com @ 2014-12-11  3:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62151

--- Comment #15 from bin.cheng <amker.cheng at gmail dot com> ---
Hmm, words on tem_insn issue at the end of comment #12 isn't mature.  It's more
complicated than that.

Turns out live range of register which is noted as DEAD in i1/i2 can be
extended because we propagate its use into i3, like below example:

ix: rx <- frags:CCZ!=0
i1: r1 <- r0 & 0xfffe
          REG_DEAD (r0)
i2: r2 <- r1 | rx
          REG_DEAD (r1)
          REG_DEAD (rx)
i3: r3 <- r2 & 0xfffd
          REG_DEAD (r2)

Which can be combined into:

ix: rx <- frags:CCZ!=0
i1: INSN_DELETED
i2: r2 <- r0 & 0xfffc
          REG_DEAD (r0)
i3: r3 <- r2 | rx
          REG_DEAD (rx)

Apparently, live ranges of both r0 and rx are extended.  The REG_DEAD(r0) in
new i2 and REG_DEAD(rx) in new i3 are distributed from the original i1 and i2. 
In this case, if we set tem_insn to from_insn, gcc will delete the previous
definition of r0/rx, resulting in wrong code.

I think the case I given in #12 may still hold.  Interesting thing is, the
comment in distribute_notes is describing that case, rather than what it
handles as in this comment:
Quoted from source code:

          /* ...
         If the register is not used as an input in either I3 or I2
         and it is not one of the registers we were supposed to eliminate,
         there are two possibilities.  We might have a non-adjacent I2
         or we might have somehow eliminated an additional register
         from a computation.  For example, we might have had A & B where
         we discover that B will always be zero.  In this case we will
         eliminate the reference to A.

         In both cases, we must search to see if we can find a previous
         use of A and put the death note there.  */

I will rephrase the comments along with code changes.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug rtl-optimization/62151] [5 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu
  2014-08-15  6:14 [Bug tree-optimization/62151] New: wrong code at -O2 and -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
                   ` (16 preceding siblings ...)
  2014-12-11  3:10 ` amker.cheng at gmail dot com
@ 2014-12-11  8:52 ` amker at gcc dot gnu.org
  2014-12-22 10:25 ` amker at gcc dot gnu.org
  2014-12-22 10:29 ` amker at gcc dot gnu.org
  19 siblings, 0 replies; 21+ messages in thread
From: amker at gcc dot gnu.org @ 2014-12-11  8:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62151

--- Comment #16 from amker at gcc dot gnu.org ---
For calls of distribute_notes with from_insn != NULL, I kind of understand why
it is vulnerable, at least when handling REG_DEAD notes.

When we distribute REG_DEAD note of one register from FROM_INSN, generally it
means live range shrink of that register.  For example:
  i1: r1 <- const_0
  i2: r2 <- rx & const_0
            REG_DEAD (rx)
  i3: r3 <- i3src (using r2)
In this case, we need to search backward in instruction flow, find previous
reference of rx and put a REG_DEAD note there, or delete the definition of rx
if there is no other reference.

But in combine, thing is complicate when use of rx in r1 is combined into i2. 
It means live range extend of register rx.  In this case, we need to search
forward and decide whether we need to put a REG_DEAD for rx at i2.  Sometimes
we are lucky when i2 is the predecessor of i3, which means we can infer the
insert point directly.  This is why the code handles this specially (as
quoting):

          /* ...
         If the register is used as an input in I3, it dies there.
         Similarly for I2, if it is nonzero and adjacent to I3.
             ...  */

          else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3)))
        place = i3;
          else if (i2 != 0 && next_nonnote_nondebug_insn (i2) == i3
               && reg_referenced_p (XEXP (note, 0), PATTERN (i2)))
        place = i2;

For other cases, GCC just start from i3 and search backward in flow.  This
causes this PR.

Thing becomes even complicated when we support four insn combination because
it's possible that i1dest is re-set in i2, like below example:
  i0: r0 <- const_0
  i1: r1 <- i1src (using r0)
            REG_DEAD (r0)
  i2: r0 <- i2src (using r1)
  i3: r3 <- i3src (using r0)

Since GCC wrongly handles live range shrink of r0 as live range extend, it
searches from r3 backward in flow, resulting in i2 deleted.

So distribute_notes is complicated and vulnerable because it tries to
understand live range change by guessing based on information like
elim_ix/i2/i3, etc., rather than real live range information.

>From my point of view, all distribute_notes needs to do when "FROM_INSN!=NULL"
are three live range changes:
1) live range shrink, like the first example.
2) live range shrink, but the instruction has already deleted by combine, like
the last example.
3) live range extend, like below example:
      before combine       after combine
  i0: r0 <- const        i0: INSN_DELETED
  i1: r1 <- rx & r0      i1: INSN_DELETED
            REG_DEAD(rx)
  i2: r2 <- r1           i2: r2 <- rx & const
                                   REG_DEAD(rx)
  i3: r3 <- i3src        i3: r3 <- i3src'

Problem is if we can infer live range change from order of i0/i1/i2/i3, using
INSN_LUID maybe?


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug rtl-optimization/62151] [5 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu
  2014-08-15  6:14 [Bug tree-optimization/62151] New: wrong code at -O2 and -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
                   ` (17 preceding siblings ...)
  2014-12-11  8:52 ` amker at gcc dot gnu.org
@ 2014-12-22 10:25 ` amker at gcc dot gnu.org
  2014-12-22 10:29 ` amker at gcc dot gnu.org
  19 siblings, 0 replies; 21+ messages in thread
From: amker at gcc dot gnu.org @ 2014-12-22 10:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62151

--- Comment #17 from amker at gcc dot gnu.org ---
Author: amker
Date: Mon Dec 22 10:25:10 2014
New Revision: 219008

URL: https://gcc.gnu.org/viewcvs?rev=219008&root=gcc&view=rev
Log:

    PR rtl-optimization/62151
    * combine.c (try_combine): New local variables local_elim_i1
    and local_elim_i0.  Set elim_i1 and elim_i0 using the local
    version variables.  Distribute notes from i0notes or i1notes
    using the local variables.

    gcc/testsuite/ChangeLog
    PR rtl-optimization/62151
    * gcc.c-torture/execute/pr62151.c: New test.


Added:
    trunk/gcc/testsuite/gcc.c-torture/execute/pr62151.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/combine.c
    trunk/gcc/testsuite/ChangeLog


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug rtl-optimization/62151] [5 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu
  2014-08-15  6:14 [Bug tree-optimization/62151] New: wrong code at -O2 and -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
                   ` (18 preceding siblings ...)
  2014-12-22 10:25 ` amker at gcc dot gnu.org
@ 2014-12-22 10:29 ` amker at gcc dot gnu.org
  19 siblings, 0 replies; 21+ messages in thread
From: amker at gcc dot gnu.org @ 2014-12-22 10:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62151

amker at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #18 from amker at gcc dot gnu.org ---
Should be fixed now.


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2014-12-22 10:29 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-15  6:14 [Bug tree-optimization/62151] New: wrong code at -O2 and -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
2014-08-15  7:33 ` [Bug tree-optimization/62151] [5 Regression] " mpolacek at gcc dot gnu.org
2014-08-15  7:34 ` mpolacek at gcc dot gnu.org
2014-08-15  7:42 ` [Bug rtl-optimization/62151] " mpolacek at gcc dot gnu.org
2014-08-15  7:55 ` jakub at gcc dot gnu.org
2014-08-15  8:04 ` amker at gcc dot gnu.org
2014-08-15  9:53 ` rguenth at gcc dot gnu.org
2014-08-15  9:53 ` rguenth at gcc dot gnu.org
2014-08-15 15:39 ` amker at gcc dot gnu.org
2014-08-16 12:11 ` amker at gcc dot gnu.org
2014-08-19 18:55 ` ebotcazou at gcc dot gnu.org
2014-08-20  2:32 ` amker at gcc dot gnu.org
2014-10-02  2:18 ` segher at gcc dot gnu.org
2014-10-02  2:30 ` segher at gcc dot gnu.org
2014-12-09  8:48 ` amker.cheng at gmail dot com
2014-12-09  9:27 ` amker.cheng at gmail dot com
2014-12-10  8:14 ` amker at gcc dot gnu.org
2014-12-11  3:10 ` amker.cheng at gmail dot com
2014-12-11  8:52 ` amker at gcc dot gnu.org
2014-12-22 10:25 ` amker at gcc dot gnu.org
2014-12-22 10:29 ` amker at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).