[Bug target/54716] New: Select best typed instruction for bitwise operations

public inbox for gcc-bugs@sourceware.org

* [Bug target/54716] New: Select best typed instruction for bitwise operations
@ 2012-09-26 12:37 glisse at gcc dot gnu.org
  2012-09-26 13:44 ` [Bug target/54716] " jakub at gcc dot gnu.org
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: glisse at gcc dot gnu.org @ 2012-09-26 12:37 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54716

             Bug #: 54716
           Summary: Select best typed instruction for bitwise operations
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: glisse@gcc.gnu.org


Hello,

consider these 3 versions of OR:

#include <x86intrin.h>

__m256d f(__m256d x,__m256d y){
  return (__m256d)((__m256i)x|(__m256i)y);
}

__m256d g(__m256d x,__m256d y){
  return _mm256_or_pd(x,y);
}

__m256i h(__m256i x,__m256i y){
  return x|y;
}

With -mavx, they compile to vorps, vorpd, vorps.
With -mavx2, they compile to vpor, vorpd, vpor.

Functions g and h are fine, but for f (which is about the only way to write OR
using the C vector extensions, since | doesn't apply to floats) it would be
great to see vorpd in both cases.
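
As a minimal sketch of the pattern used in f, the same cast-through-integer idiom can be written with 128-bit generic vectors, so it builds without -mavx. The type and function names (v2df, v2di, or_pd, or_pd_sets_sign) are illustrative assumptions and do not come from the report.

```c
/* Minimal sketch using GCC/Clang generic vector extensions.
   All names here are illustrative, not taken from the bug report. */
typedef double v2df __attribute__ ((vector_size (16)));
typedef long long v2di __attribute__ ((vector_size (16)));

/* Bitwise OR of two double vectors: reinterpret the bits as integers,
   OR them, and reinterpret the result back as doubles.  */
v2df
or_pd (v2df x, v2df y)
{
  return (v2df) ((v2di) x | (v2di) y);
}

/* Returns 1 if ORing in the bits of -0.0 sets the sign of lane 0 only.  */
int
or_pd_sets_sign (void)
{
  v2df x = { 1.0, 2.0 };
  v2df y = { -0.0, 0.0 };
  v2df r = or_pd (x, y);
  return r[0] == -1.0 && r[1] == 2.0;
}
```

The casts only reinterpret bits, so which OR instruction is emitted for or_pd is entirely the compiler's choice; that choice is what this report is about.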



* [Bug target/54716] Select best typed instruction for bitwise operations
  2012-09-26 12:37 [Bug target/54716] New: Select best typed instruction for bitwise operations glisse at gcc dot gnu.org
@ 2012-09-26 13:44 ` jakub at gcc dot gnu.org
  2012-09-26 13:46 ` jakub at gcc dot gnu.org
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: jakub at gcc dot gnu.org @ 2012-09-26 13:44 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54716

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-09-26
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |rth at gcc dot gnu.org,
                   |                            |uros at gcc dot gnu.org
     Ever Confirmed|0                           |1



* [Bug target/54716] Select best typed instruction for bitwise operations
  2012-09-26 12:37 [Bug target/54716] New: Select best typed instruction for bitwise operations glisse at gcc dot gnu.org
  2012-09-26 13:44 ` [Bug target/54716] " jakub at gcc dot gnu.org
@ 2012-09-26 13:46 ` jakub at gcc dot gnu.org
  2012-09-26 13:56 ` ubizjak at gmail dot com
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: jakub at gcc dot gnu.org @ 2012-09-26 13:46 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54716

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-09-26 13:46:25 UTC ---
Created attachment 28282
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28282
gcc48-pr54716.patch

Untested patch to optimize this.

Unfortunately it will also change generated code for:
__m256d
i (__m256d x, __m256d y)
{
  return (__m256d) _mm256_or_si256 ((__m256i) x, (__m256i) y);
}

Not sure if that is an issue or not.  If we wanted to emit what the user for
whatever reason asked for, the builtin expander could perhaps in those cases
copy one of the arguments into a temporary pseudo before expansion.



* [Bug target/54716] Select best typed instruction for bitwise operations
  2012-09-26 12:37 [Bug target/54716] New: Select best typed instruction for bitwise operations glisse at gcc dot gnu.org
  2012-09-26 13:44 ` [Bug target/54716] " jakub at gcc dot gnu.org
  2012-09-26 13:46 ` jakub at gcc dot gnu.org
@ 2012-09-26 13:56 ` ubizjak at gmail dot com
  2012-09-26 14:01 ` ubizjak at gmail dot com
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: ubizjak at gmail dot com @ 2012-09-26 13:56 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54716

--- Comment #2 from Uros Bizjak <ubizjak at gmail dot com> 2012-09-26 13:55:28 UTC ---
(In reply to comment #1)
> Created attachment 28282 [details]
> gcc48-pr54716.patch

Does this patch also fix the xfail in gcc.target/i386/xorps-sse2.c?

IIRC, we used to generate the correct instructions for float arguments, but
deliberately removed this functionality for some reason. I tried to look for
the reason in the SVN history, but didn't find anything relevant.



* [Bug target/54716] Select best typed instruction for bitwise operations
  2012-09-26 12:37 [Bug target/54716] New: Select best typed instruction for bitwise operations glisse at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2012-09-26 13:56 ` ubizjak at gmail dot com
@ 2012-09-26 14:01 ` ubizjak at gmail dot com
  2012-09-26 14:20 ` ubizjak at gmail dot com
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: ubizjak at gmail dot com @ 2012-09-26 14:01 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54716

--- Comment #3 from Uros Bizjak <ubizjak at gmail dot com> 2012-09-26 14:01:13 UTC ---
Maybe also relevant: [1].

[1] http://gcc.gnu.org/ml/gcc-patches/2007-08/msg01546.html



* [Bug target/54716] Select best typed instruction for bitwise operations
  2012-09-26 12:37 [Bug target/54716] New: Select best typed instruction for bitwise operations glisse at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2012-09-26 14:01 ` ubizjak at gmail dot com
@ 2012-09-26 14:20 ` ubizjak at gmail dot com
  2012-09-26 14:23 ` jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: ubizjak at gmail dot com @ 2012-09-26 14:20 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54716

Uros Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hjl.tools at gmail dot com

--- Comment #4 from Uros Bizjak <ubizjak at gmail dot com> 2012-09-26 14:19:35 UTC ---
Based on the claim in [1], AMD chips do not care at all which variant is used,
so we use the *ps variants, which are one byte shorter. Let's ask HJ about Intel.

[1] http://gcc.gnu.org/ml/gcc-patches/2007-08/msg01564.html
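
To make the one-byte difference concrete, here is a small sketch listing the register-to-register encodings of the legacy SSE forms next to their VEX (AVX) counterparts. The array and function names are illustrative assumptions. The byte values follow the standard opcode maps (0F 56 for orps, a 66 prefix for orpd, 66 0F EB for por, and a two-byte VEX prefix for the AVX forms).

```c
/* Encodings of "OR xmm1 into xmm0" in each form (AT&T operand order in
   the comments).  Names are illustrative only.  */
const unsigned char enc_orps[]  = { 0x0f, 0x56, 0xc1 };       /* orps  %xmm1, %xmm0 */
const unsigned char enc_orpd[]  = { 0x66, 0x0f, 0x56, 0xc1 }; /* orpd  %xmm1, %xmm0 */
const unsigned char enc_por[]   = { 0x66, 0x0f, 0xeb, 0xc1 }; /* por   %xmm1, %xmm0 */
const unsigned char enc_vorps[] = { 0xc5, 0xf8, 0x56, 0xc1 }; /* vorps %xmm1, %xmm0, %xmm0 */
const unsigned char enc_vorpd[] = { 0xc5, 0xf9, 0x56, 0xc1 }; /* vorpd %xmm1, %xmm0, %xmm0 */

/* Extra bytes paid for the *pd form over the *ps form, legacy SSE.  */
int
size_penalty_sse (void)
{
  return (int) (sizeof enc_orpd - sizeof enc_orps);
}

/* Same comparison for the VEX-encoded forms.  */
int
size_penalty_avx (void)
{
  return (int) (sizeof enc_vorpd - sizeof enc_vorps);
}
```

In the legacy encoding the *pd and integer forms carry a 66 prefix that *ps lacks; with VEX the prefix is folded into the VEX byte, so the two forms have the same length.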



* [Bug target/54716] Select best typed instruction for bitwise operations
  2012-09-26 12:37 [Bug target/54716] New: Select best typed instruction for bitwise operations glisse at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2012-09-26 14:20 ` ubizjak at gmail dot com
@ 2012-09-26 14:23 ` jakub at gcc dot gnu.org
  2012-09-26 14:57 ` jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: jakub at gcc dot gnu.org @ 2012-09-26 14:23 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54716

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-09-26 14:23:29 UTC ---
(In reply to comment #2)
> (In reply to comment #1)
> > Created attachment 28282 [details]
> > gcc48-pr54716.patch
> 
> Does this patch also fix xfail in gcc.target/i386/xorps-sse2.c?
> 
> IIRC, we generated correct instructions for float arguments, but deliberatly
> removed this functionality for some reason. I tried to look for the reason in
> the SVN history, but didn't found anything relevant.

It doesn't. The optimization there is keyed on both operands being casts from
a float vector, not on the cast of the result, and in that testcase one of the
arguments is not a SUBREG of a floating-point vector.
I think doing the optimization is questionable if both operands aren't float
vectors, because then we could very well pessimize the generated code instead
of improving it.  If both are float vectors, then most likely we'll get rid of
two reinterpretation penalties and, in the worst case, add one on the result.
To fix up xorps-sse2.c, we could relax the expander's predicates from
nonimmediate_operand to, say, "", and treat a CONST_VECTOR like any kind of
load for the purpose of the test, then of course force it into a register if
it is not a nonimmediate_operand.
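
One possible reading of the heuristic described above can be written as a toy decision function. This is only a model for discussion and not the code from the patch: the enum, its values, and the function name are all hypothetical, and letting a constant operand count as compatible with the float domain is an assumption about what "handle CONST_VECTOR as any kind of load" means.

```c
/* Toy model only; this is NOT GCC's code.  All names are hypothetical.  */
enum operand_origin
{
  FROM_FLOAT_VECTOR,  /* integer-vector view (cast) of a float vector */
  FROM_INT_VECTOR,    /* genuinely produced by integer-vector code */
  CONSTANT_VECTOR     /* a constant, loadable in either domain */
};

/* Returns 1 when emitting the float-typed logical instruction looks
   profitable, 0 when the integer-typed form should be kept.  */
int
prefer_float_logic (enum operand_origin a, enum operand_origin b)
{
  int a_ok = a == FROM_FLOAT_VECTOR || a == CONSTANT_VECTOR;
  int b_ok = b == FROM_FLOAT_VECTOR || b == CONSTANT_VECTOR;
  int any_float = a == FROM_FLOAT_VECTOR || b == FROM_FLOAT_VECTOR;

  /* Both operands must be usable in the float domain, and at least one
     must actually come from a float vector.  */
  return a_ok && b_ok && any_float;
}
```

Under this model a mix of one float-derived and one integer-derived operand keeps the integer instruction, which matches the concern that switching domains there could pessimize the code.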


* [Bug target/54716] Select best typed instruction for bitwise operations
  2012-09-26 12:37 [Bug target/54716] New: Select best typed instruction for bitwise operations glisse at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2012-09-26 14:23 ` jakub at gcc dot gnu.org
@ 2012-09-26 14:57 ` jakub at gcc dot gnu.org
  2012-09-26 15:41 ` jakub at gcc dot gnu.org
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: jakub at gcc dot gnu.org @ 2012-09-26 14:57 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54716

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #28282|0                           |1
        is obsolete|                            |

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-09-26 14:57:15 UTC ---
Created attachment 28286
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28286
gcc48-pr54716.patch

Updated patch that fixes also xorps-sse2.c.



* [Bug target/54716] Select best typed instruction for bitwise operations
  2012-09-26 12:37 [Bug target/54716] New: Select best typed instruction for bitwise operations glisse at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2012-09-26 14:57 ` jakub at gcc dot gnu.org
@ 2012-09-26 15:41 ` jakub at gcc dot gnu.org
  2012-09-26 15:53 ` glisse at gcc dot gnu.org
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: jakub at gcc dot gnu.org @ 2012-09-26 15:41 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54716

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-09-26 15:40:53 UTC ---
On this testcase:
#define vector __attribute__ ((vector_size (16)))

__attribute__((noinline, noclone))
vector float foo(vector float f, vector float h)
{
  vector int g = { 0x80000000, 0, 0x80000000, 0 };
  vector int f_int = (vector int) f;
  return ((vector float) (f_int ^ g)) + h;
}

vector float a = { 1.0, 2.0, 3.0, 4.0 }, b = { 5.0, 6.0, 7.0, 8.0 },
             c = { 9.0, 10.0, 11.0, 12.0 }, r;

int
main ()
{
  int i;
  for (i = 0; i < 1000000000; i++)
    {
      asm volatile ("" : : : "memory");
      r = foo(a + b, a + c) - a;
      asm volatile ("" : : : "memory");
    }
  return 0;
}

I haven't noticed a measurable performance difference on an Intel SNB 2600
CPU, though, so perhaps the patch isn't needed.
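
For readers unfamiliar with the idiom in foo, the XOR with 0x80000000 flips the IEEE sign bit of lanes 0 and 2, which is why the integer operation sits in the middle of otherwise floating-point code. The sketch below isolates that step. It uses an unsigned element type for the mask, and the names (v4sf, v4su, flip_sign_even_lanes, flip_sign_check) are illustrative assumptions rather than part of the testcase.

```c
typedef float v4sf __attribute__ ((vector_size (16)));
typedef unsigned int v4su __attribute__ ((vector_size (16)));

/* XOR with 0x80000000 in lanes 0 and 2 flips the sign bit there and
   leaves lanes 1 and 3 untouched.  */
v4sf
flip_sign_even_lanes (v4sf f)
{
  v4su mask = { 0x80000000u, 0, 0x80000000u, 0 };
  return (v4sf) ((v4su) f ^ mask);
}

/* Returns 1 if lanes 0 and 2 were negated and lanes 1 and 3 preserved.  */
int
flip_sign_check (void)
{
  v4sf in = { 1.0f, 2.0f, 3.0f, 4.0f };
  v4sf out = flip_sign_even_lanes (in);
  return out[0] == -1.0f && out[1] == 2.0f
	 && out[2] == -3.0f && out[3] == 4.0f;
}
```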



* [Bug target/54716] Select best typed instruction for bitwise operations
  2012-09-26 12:37 [Bug target/54716] New: Select best typed instruction for bitwise operations glisse at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2012-09-26 15:41 ` jakub at gcc dot gnu.org
@ 2012-09-26 15:53 ` glisse at gcc dot gnu.org
  2012-09-26 20:59 ` ubizjak at gmail dot com
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: glisse at gcc dot gnu.org @ 2012-09-26 15:53 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54716

--- Comment #8 from Marc Glisse <glisse at gcc dot gnu.org> 2012-09-26 15:53:00 UTC ---
(In reply to comment #7)
> I haven't noticed a measurable performance difference though on Intel SNB 2600
> CPU though, so perhaps the patch isn't needed.

Ah, I assumed they had a good reason for creating so many variants of the same
instruction. If there is no difference (or even a difference in the wrong
direction because of the instruction size), feel free to close the bug.



* [Bug target/54716] Select best typed instruction for bitwise operations
  2012-09-26 12:37 [Bug target/54716] New: Select best typed instruction for bitwise operations glisse at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2012-09-26 15:53 ` glisse at gcc dot gnu.org
@ 2012-09-26 20:59 ` ubizjak at gmail dot com
  2012-09-28 12:21 ` jakub at gcc dot gnu.org
  2012-11-10 12:55 ` glisse at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: ubizjak at gmail dot com @ 2012-09-26 20:59 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54716

--- Comment #9 from Uros Bizjak <ubizjak at gmail dot com> 2012-09-26 20:59:24 UTC ---
(In reply to comment #8)
> (In reply to comment #7)
> > I haven't noticed a measurable performance difference though on Intel SNB 2600
> > CPU though, so perhaps the patch isn't needed.
> 
> Ah, I assumed they had a good reason for creating so many variants of the same
> instruction. If there is no difference (or even a difference in the wrong
> direction because of the instruction size), feel free to close the bug.

I think we should still go with the proposed patch. Insn size is handled by
choosing the *PS mode attribute for -Os in the insn pattern, and there is no
size difference for AVX. If some target prefers the *PS variants, the
X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL flag is always available.

But please put the code into a helper function. Due to the VI mode iterator,
the code is emitted eight times!



* [Bug target/54716] Select best typed instruction for bitwise operations
  2012-09-26 12:37 [Bug target/54716] New: Select best typed instruction for bitwise operations glisse at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2012-09-26 20:59 ` ubizjak at gmail dot com
@ 2012-09-28 12:21 ` jakub at gcc dot gnu.org
  2012-11-10 12:55 ` glisse at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: jakub at gcc dot gnu.org @ 2012-09-28 12:21 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54716

--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-09-28 12:21:01 UTC ---
Author: jakub
Date: Fri Sep 28 12:20:54 2012
New Revision: 191827

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=191827
Log:
    PR target/54716
    * config/i386/predicates.md (nonimmediate_or_const_vector_operand):
    New predicate.
    * config/i386/i386.c (ix86_expand_vector_logical_operator): New
    function.
    * config/i386/i386-protos.h (ix86_expand_vector_logical_operator): New
    prototype.
    * config/i386/sse.md (<code><mode>3 VI logic): Use it.

    * gcc.target/i386/xorps-sse2.c: Remove xfails.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386-protos.h
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/i386/predicates.md
    trunk/gcc/config/i386/sse.md
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.target/i386/xorps-sse2.c



* [Bug target/54716] Select best typed instruction for bitwise operations
  2012-09-26 12:37 [Bug target/54716] New: Select best typed instruction for bitwise operations glisse at gcc dot gnu.org
                   ` (10 preceding siblings ...)
  2012-09-28 12:21 ` jakub at gcc dot gnu.org
@ 2012-11-10 12:55 ` glisse at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: glisse at gcc dot gnu.org @ 2012-11-10 12:55 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54716

Marc Glisse <glisse at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #11 from Marc Glisse <glisse at gcc dot gnu.org> 2012-11-10 12:55:43 UTC ---
It looks like Jakub's patch fixed this completely. I now see

-mavx: vorpd, vorpd, vorps
-mavx2: vorpd, vorpd, vpor

so closing.

