public inbox for gcc@gcc.gnu.org
* Shrink wrapping issues
From: Jakub Jelinek @ 2011-11-05  9:51 UTC
  To: Bernd Schmidt; +Cc: Richard Henderson, gcc

Hi!

On the following testcase with -m64 -O3 -mavx2 (it is just an example;
you can replace the loop there with any code that doesn't touch the
stack or frame pointer at all), only f3 is shrink-wrapped, and in that case
it on the other hand doesn't add the vzeroupper before leaving the AVX-using
code that IMNSHO it should.  But I wonder why we can't also shrink-wrap
the first two functions (well, for f2 it wouldn't be textbook
shrink-wrapping, but essentially throwing away the prologue/epilogue).

From a quick look, f1 probably isn't shrink-wrapped because the set
of bbs that need the prologue/epilogue around them doesn't end in a return,
but in a tail call.  Can't we just add a prologue before the bar call
and throw the epilogue away?  (Normally the epilogue in a function that
ends only in a tail call is just emitted after the barrier and
optimized away, I think; couldn't we do the same here?)

And f2 is something that IMHO happens very often, especially with AVX/AVX2
code: the prologue is expensive because it realigns the stack.  The reason
for that is that until reload we don't know whether something will be
spilled to the stack, and we need/want 32-byte aligned stack slots
for that spilling.  Isn't the case when none of the bbs actually needs the
stack/frame pointer just a special case of shrink wrapping?  Can't we
simply throw the prologue/epilogue away then and just end the function
in simple_return?  f4 is another testcase for the same thing,
this time with no AVX/AVX2 intrinsics, but with a loop which the vectorizer
vectorizes using 256-bit vectors.

#include <x86intrin.h>

__m256i a[16], b[16], f;
__m256d g[16], h;
extern void bar (void);
extern void baz (void);

void
f1 (int c)
{
  int i;
  if (c)
    for (i = 0; i < 16; i++)
      a[i] = _mm256_i64gather_epi64 (NULL, b[i], 1);
  else
    {
      bar ();
      baz ();
    }
}

void
f2 (void)
{
  int i;
  for (i = 0; i < 16; i++)
    a[i] = _mm256_i64gather_epi64 (NULL, b[i], 1);
}

int
f3 (int c)
{
  int i;
  if (c)
    for (i = 0; i < 16; i++)
      a[i] = _mm256_i64gather_epi64 (NULL, b[i], 1);
  else
    {
      bar ();
      baz ();
    }
  return c;
}

float x[8], y[8];

void
f4 (void)
{
  int i;
  for (i = 0; i < 8; i++)
    x[i] = y[i] * 2 - x[i];
}
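
For anyone who wants to look at the control-flow shape without AVX2 hardware, here is a minimal scalar sketch of f3. It is not part of the testcase above: the name f3_scalar, the acc array, the calls counter and the local definitions of bar and baz are all hypothetical stand-ins. The loop path touches only globals, while the else path makes two real calls; because the function returns c afterwards, the call to baz cannot become a tail call, which is the difference from f1.

```c
/* Hypothetical stand-ins for the extern bar/baz of the testcase;
   they only count how often the call path was taken.  */
static int calls;

__attribute__ ((noinline)) void
bar (void)
{
  calls++;
}

__attribute__ ((noinline)) void
baz (void)
{
  calls++;
}

/* Hypothetical global that the loop path writes to.  */
int acc[16];

/* Scalar analogue of f3: the loop needs no stack or frame pointer,
   only the else branch contains calls.  Returning c after baz ()
   keeps that call from being a tail call, unlike in f1.  */
__attribute__ ((noinline)) int
f3_scalar (int c)
{
  int i;
  if (c)
    for (i = 0; i < 16; i++)
      acc[i] += c;
  else
    {
      bar ();
      baz ();
    }
  return c;
}
```

Comparing the assembly from gcc -O2 -S with and without -fno-shrink-wrap should show whether the prologue was moved into the else path; what is actually emitted depends on the target and the compiler version.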


	Jakub
