[Bug tree-optimization/53243] New: Use vector comparisons for if cascades

From: "drepper.fsp at gmail dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/53243] New: Use vector comparisons for if cascades
Date: Sat, 05 May 2012 04:06:00 -0000
Message-ID: <bug-53243-4@http.gcc.gnu.org/bugzilla/>

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53243

             Bug #: 53243
           Summary: Use vector comparisons for if cascades
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: drepper.fsp@gmail.com
            Target: x86_64-linux


Created attachment 27312
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27312
Test program (compile with and without -DOLD)

The vector units can perform multiple comparisons concurrently, but gcc does
not use this automatically, even in situations where it would lead to better
performance.  Assume a function like this:

void
f(float a)
{
  if (a < 1.0)
    cb(1);
  else if (a < 2.0)
    cb(2);
  else if (a < 3.0)
    cb(3);
  else if (a < 4.0)
    cb(4);
  else if (a < 5.0)
    cb(5);
  else if (a < 6.0)
    cb(6);
  else if (a < 7.0)
    cb(7);
  else if (a < 8.0)
    cb(8);
  else
    ++o;
}

This assumes that neither the first nor the second if is marked as likely with
__builtin_expect; otherwise the following *might* not apply.
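
To make the caveat concrete, here is a minimal sketch of a cascade whose first
test is hinted as likely.  The likely() macro and the function name
first_branch_hot are illustrative assumptions, not part of the report, and the
cascade is shortened and returns a value instead of calling cb().  With such a
hint the common case is expected to cost a single compare and a well-predicted
branch, so the fixed cost of a vector compare may not pay off.

```c
/* likely() is a conventional wrapper around __builtin_expect; it is an
   assumption of this sketch, not something the report defines.  */
#define likely(x) __builtin_expect (!!(x), 1)

/* Shortened cascade: returns the branch number instead of calling cb().  */
static int
first_branch_hot (float a)
{
  if (likely (a < 1.0f))   /* hinted as the common case */
    return 1;
  else if (a < 2.0f)
    return 2;
  else if (a < 3.0f)
    return 3;
  return 0;                /* none of the tests matched */
}
```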

The routine can be rewritten for AVX machines like this:

void
f(float a)
{
  const __m256 fv = _mm256_set_ps(8.0,7.0,6.0,5.0,4.0,3.0,2.0,1.0);
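  /* NLT is the exact negation of a < fv, so a value equal to a bound, or a
     NaN, takes the same branch as in the scalar cascade.  */
  __m256 r = _mm256_cmp_ps(_mm256_set1_ps(a), fv, _CMP_NLT_US);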
  int i = _mm256_movemask_ps(r);
  asm goto ("bsr %0, %0; jz %l[less1]; .pushsection .rodata; "
            "1: .quad %l2, %l3, %l4, %l5, %l6, %l7, %l8, %l9; "
            ".popsection; jmp *1b(,%0,8)"
            : : "r" (i) : :
            less1, less2, less3, less4, less5, less6, less7, less8, gt8);
  __builtin_unreachable ();
 less1:
  cb(1);
  return;
 less2:
  cb(2);
  return;
 less3:
  cb(3);
  return;
 less4:
  cb(4);
  return;
 less5:
  cb(5);
  return;
 less6:
  cb(6);
  return;
 less7:
  cb(7);
  return;
 less8:
  cb(8);
  return;
 gt8:
  ++o;
}
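
For readers without AVX hardware, the same idea can be sketched portably with
GCC's generic vector extensions and no inline asm.  This is not the code from
the attachment: the names classify_scalar, classify_vector and check_agreement
are hypothetical, the functions return the branch number (9 for the final
else) instead of calling cb(), and the bit mask is assembled with a plain loop
where AVX would use a single movemask instruction.  It also compares
a < bound directly and takes the lowest set bit, rather than taking the
highest set bit of the flipped comparison as the asm version does.

```c
#include <assert.h>

typedef float v8sf __attribute__ ((vector_size (32)));
typedef int v8si __attribute__ ((vector_size (32)));

/* Scalar reference: the branch the cascade picks, 9 for the final else.  */
static int
classify_scalar (float a)
{
  if (a < 1.0f) return 1;
  else if (a < 2.0f) return 2;
  else if (a < 3.0f) return 3;
  else if (a < 4.0f) return 4;
  else if (a < 5.0f) return 5;
  else if (a < 6.0f) return 6;
  else if (a < 7.0f) return 7;
  else if (a < 8.0f) return 8;
  return 9;
}

/* Vector form: evaluate all eight tests at once, then find the first hit.  */
static int
classify_vector (float a)
{
  const v8sf fv = { 1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f, 7.0f, 8.0f };
  const v8sf av = { a, a, a, a, a, a, a, a };
  v8si lt = av < fv;            /* lane k is -1 where a < fv[k], else 0 */
  unsigned mask = 0;
  for (int k = 0; k < 8; ++k)   /* stand-in for a movemask instruction */
    mask |= (unsigned) (lt[k] & 1) << k;
  /* The lowest set bit is the first test that succeeded.  */
  return mask ? __builtin_ctz (mask) + 1 : 9;
}

/* Both forms should agree, including on exact bounds and on NaN.  */
static void
check_agreement (void)
{
  for (int i = -8; i <= 40; ++i)
    {
      float a = i * 0.25f;
      assert (classify_scalar (a) == classify_vector (a));
    }
  assert (classify_vector (__builtin_nanf ("")) == 9);
}
```

Calling check_agreement() sweeps values from -2.0 to 10.0 in steps of 0.25, so
every bound is hit exactly once.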

This might not generate the absolute best code, but with the attached test
program it runs 20% faster.

The same technique can be applied to integer comparisons.  More complex if
cascades can also be simplified a lot by masking the integer bsr result
accordingly.  The result should still be faster than the scalar cascade.
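
As one possible reading of the masking remark, consider an integer cascade in
which each test is guarded by an extra flag.  The sketch below is an
assumption about what is meant, not code from the report: the names
pick_scalar and pick_vector, the bounds, and the enabled bit set are all
hypothetical.  It uses GCC's generic vector extensions, ANDs the comparison
mask with the guard bits, and takes the lowest set bit to find the first test
that would have succeeded.

```c
typedef int v8si __attribute__ ((vector_size (32)));

/* Scalar reference, equivalent to eight chained else-ifs of the form
   "else if ((enabled & (1u << k)) && x < limit[k])".  Returns the branch
   number, or 9 when no test matches.  */
static int
pick_scalar (int x, unsigned enabled)
{
  static const int limit[8] = { 10, 20, 30, 40, 50, 60, 70, 80 };
  for (int k = 0; k < 8; ++k)
    if (((enabled >> k) & 1) && x < limit[k])
      return k + 1;
  return 9;
}

/* Vector form: compare against all bounds at once, then mask out the
   tests whose guard bit is clear before scanning for the first hit.  */
static int
pick_vector (int x, unsigned enabled)
{
  const v8si limit = { 10, 20, 30, 40, 50, 60, 70, 80 };
  const v8si xv = { x, x, x, x, x, x, x, x };
  v8si lt = xv < limit;         /* lane k is -1 where x < limit[k] */
  unsigned mask = 0;
  for (int k = 0; k < 8; ++k)   /* stand-in for a movemask instruction */
    mask |= (unsigned) (lt[k] & 1) << k;
  mask &= enabled & 0xffu;      /* drop tests that are switched off */
  return mask ? __builtin_ctz (mask) + 1 : 9;
}
```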
