public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/52082] New: Memory loads not rematerialized
@ 2012-02-01 10:30 jakub at gcc dot gnu.org
  2012-02-03  2:30 ` [Bug rtl-optimization/52082] " pinskia at gcc dot gnu.org
  2021-09-05  4:52 ` pinskia at gcc dot gnu.org
  0 siblings, 2 replies; 3+ messages in thread
From: jakub at gcc dot gnu.org @ 2012-02-01 10:30 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52082

             Bug #: 52082
           Summary: Memory loads not rematerialized
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Keywords: missed-optimization, ra
          Severity: normal
          Priority: P3
         Component: middle-end
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: jakub@gcc.gnu.org
                CC: vmakarov@gcc.gnu.org
            Target: x86_64-linux


On the following testcase at -O2 (distilled from genautomata.c):

struct S { unsigned long *s1; struct S *s2; };
int v1 __attribute__((visibility ("hidden")));
struct T
{
  int a, b, c;
} *v2 __attribute__((visibility ("hidden")));
struct S **v3 __attribute__((visibility ("hidden")));
struct S **v4 __attribute__((visibility ("hidden")));

int __attribute__((noinline, noclone))
foo (unsigned long *x, unsigned long *y, int z)
{
  int j, k, l;
  unsigned int i;
  struct S *m;

  for (j = 0; j < v1; j++)
    if (y[j])
      for (i = 0; i < 8 * sizeof (unsigned long); i++)
        if ((y[j] >> i) & 1)
          {
            k = j * 8 * sizeof (unsigned long) + i;
            if (k >= v2->c)
              break;
            for (m = (z ? v4 [k] : v3 [k]); m != ((void *)0); m = m->s2)
              {
                for (l = 0; l < v1; l++)
                  if ((x [l] & m->s1 [l]) != m->s1 [l] && m->s1 [l])
                    break;
                if (l >= v1)
                  return 0;
              }
          }
  return 1;
}

tree LIM moves the loads from v2/v3/v4 before the loop, but unfortunately the
register pressure is high, so the pseudos holding the v3/v4 pointers don't get
a hard register and are immediately spilled to the stack.  I wonder whether
we couldn't instead just rematerialize them and put the original MEM loads back
into the loop (assuming they don't alias with anything on the way, but that
must be the case here since LIM hoisted them in the first place; after all,
this loop doesn't have any MEM stores at all).
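
As a source-level analogy only (the real decision is made by the register
allocator on RTL, not in C), the difference between keeping a hoisted copy and
rematerializing the load can be sketched as follows.  The names `table`,
`sum_hoisted` and `sum_rematerialized` are hypothetical and not part of the
testcase; the two forms are interchangeable only because nothing in the loop
stores to memory.

```c
#include <stddef.h>

/* Hypothetical global standing in for v3 in the testcase. */
static long **table;

/* Hoisted form: the global is loaded once before the loop.  Under high
   register pressure the copy in `base` may end up in a stack slot and be
   reloaded from there on each use.  */
static long
sum_hoisted (int n)
{
  long **base = table;
  long sum = 0;
  for (int k = 0; k < n; k++)
    if (base[k] != NULL)
      sum += *base[k];
  return sum;
}

/* Rematerialized form: the original load is re-issued at the point of use,
   so no copy has to stay live (or be spilled) across the loop.  Valid here
   because the loop contains no stores that could alias `table`.  */
static long
sum_rematerialized (int n)
{
  long sum = 0;
  for (int k = 0; k < n; k++)
    {
      long **base = table;
      if (base[k] != NULL)
        sum += *base[k];
    }
  return sum;
}
```

An optimizing compiler is free to turn either form into the other, so this
only illustrates the intended shape; it is not a way to force the behaviour.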



* [Bug rtl-optimization/52082] Memory loads not rematerialized
  2012-02-01 10:30 [Bug middle-end/52082] New: Memory loads not rematerialized jakub at gcc dot gnu.org
@ 2012-02-03  2:30 ` pinskia at gcc dot gnu.org
  2021-09-05  4:52 ` pinskia at gcc dot gnu.org
  1 sibling, 0 replies; 3+ messages in thread
From: pinskia at gcc dot gnu.org @ 2012-02-03  2:30 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52082

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-02-03
          Component|middle-end                  |rtl-optimization
     Ever Confirmed|0                           |1

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> 2012-02-03 02:29:47 UTC ---
Confirmed.



* [Bug rtl-optimization/52082] Memory loads not rematerialized
  2012-02-01 10:30 [Bug middle-end/52082] New: Memory loads not rematerialized jakub at gcc dot gnu.org
  2012-02-03  2:30 ` [Bug rtl-optimization/52082] " pinskia at gcc dot gnu.org
@ 2021-09-05  4:52 ` pinskia at gcc dot gnu.org
  1 sibling, 0 replies; 3+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-05  4:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52082

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
One thing I noticed that LLVM does to reduce the register pressure is:
(z ? v4 [k] : v3 [k])

Gets pulled out of the loop such that it is:
tmpaddr = z ? v4 : v3;

and then inside the loop it does:
(tmpaddr)[k]

GCC still has (I changed the bb order just so it is easier to see what is going
on):
  if (z_39(D) != 0)
    goto <bb 9>; [50.00%]
  else
    goto <bb 11>; [50.00%]

  <bb 11> [local count: 5427362]:
  _21 = v3.3_18 + _157;
  iftmp.1_40 = *_21;
  goto <bb 10>; [100.00%]

  <bb 9> [local count: 5427362]:
  _17 = v4.2_14 + _157;
  iftmp.1_41 = *_17;

  <bb 10> [local count: 10854724]:
  # m_8 = PHI <iftmp.1_40(11), iftmp.1_41(9)>
  if (m_8 != 0B)
    goto <bb 12>; [94.50%]
  else
    goto <bb 19>; [5.50%]

We should be able to do something similar, it seems, and need two fewer
registers: one to hold z and one to hold either v3 or v4.  This won't be
enough for this testcase, but it will be something.
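
A minimal sketch of that transformation at the source level, assuming
simplified stand-ins for the testcase: the helper names `count_heads_branchy`
and `count_heads_hoisted` are hypothetical, and the loop body is reduced to a
null check so that only the address selection is in focus.

```c
#include <stddef.h>

struct S { unsigned long *s1; struct S *s2; };

/* Stand-ins for the hidden globals v3 and v4 in the testcase. */
static struct S **v3, **v4;

/* Shape GCC currently keeps: z and both base pointers are needed inside
   the loop, because the selection happens on every iteration.  */
static size_t
count_heads_branchy (int z, int n)
{
  size_t count = 0;
  for (int k = 0; k < n; k++)
    if ((z ? v4[k] : v3[k]) != NULL)
      count++;
  return count;
}

/* Shape described for LLVM: select the base pointer once before the loop,
   then only index it inside the loop.  */
static size_t
count_heads_hoisted (int z, int n)
{
  struct S **tmpaddr = z ? v4 : v3;
  size_t count = 0;
  for (int k = 0; k < n; k++)
    if (tmpaddr[k] != NULL)
      count++;
  return count;
}
```

In the hoisted form only `tmpaddr` has to stay live across the loop instead of
z, v3 and v4, which is where the saving of two registers would come from.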

