public inbox for gcc-bugs@sourceware.org
* [Bug target/35714] x86 poor code with pmaddwd
  2008-03-27  1:03 [Bug target/35714] New: x86 poor code with pmaddwd astrange at ithinksw dot com
@ 2008-03-27  1:03 ` astrange at ithinksw dot com
  2008-03-27  7:42 ` ubizjak at gmail dot com
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: astrange at ithinksw dot com @ 2008-03-27  1:03 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from astrange at ithinksw dot com  2008-03-27 01:02 -------
Created an attachment (id=15384)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15384&action=view)
source


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35714



* [Bug target/35714]  New: x86 poor code with pmaddwd
@ 2008-03-27  1:03 astrange at ithinksw dot com
  2008-03-27  1:03 ` [Bug target/35714] " astrange at ithinksw dot com
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: astrange at ithinksw dot com @ 2008-03-27  1:03 UTC (permalink / raw)
  To: gcc-bugs

> /usr/local/gcc44/bin/gcc -v
Using built-in specs.
Target: i386-apple-darwin9.2.0
Configured with: ../gcc/configure --prefix=/usr/local/gcc44
--enable-threads=posix --with-arch=core2 --with-tune=core2 --with-gmp=/sw
--with-mpfr=/sw --disable-nls --disable-bootstrap --enable-checking=yes,rtl
CFLAGS=-g LDFLAGS=/usr/lib/libiconv.dylib --enable-languages=c,c++,objc
Thread model: posix
gcc version 4.4.0 20080326 (experimental) (GCC)
> /usr/local/gcc44/bin/gcc -Os -march=core2 -fno-pic -fomit-frame-pointer -flax-vector-conversions -S pmaddwd.c

generates:
_madd_swapped:
        subl    $12, %esp
        movaps  LC0, %xmm1
        addl    $12, %esp
        pmaddwd %xmm1, %xmm0
        ret
.globl _madd
_madd:
        subl    $12, %esp
        movaps  LC0, %xmm1
        addl    $12, %esp
        pmaddwd %xmm0, %xmm1
        movaps  %xmm1, %xmm0
        ret

Both of these should be:
_madd:
        pmaddwd LC0, %xmm0
        ret

since the stack isn't referenced and pmaddwd is commutative. (The variable
being renamed to LC0 is PR 31043.)
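
For readers who want to check the commutativity claim, here is a minimal
scalar sketch of what pmaddwd computes. It is written from the instruction's
documented behaviour, not taken from the attachment, and the names
pmaddwd_model and pmaddwd_commutes are made up for this illustration.

```c
#include <stdint.h>
#include <string.h>

/* Scalar reference model of SSE2 pmaddwd: multiply eight pairs of signed
   16-bit lanes, then add adjacent 32-bit products into four 32-bit lanes.  */
static void pmaddwd_model(const int16_t a[8], const int16_t b[8],
                          int32_t dst[4])
{
    for (int i = 0; i < 4; i++) {
        int32_t lo = (int32_t)a[2 * i] * b[2 * i];
        int32_t hi = (int32_t)a[2 * i + 1] * b[2 * i + 1];
        /* Add as unsigned so the single overflowing case (all four inputs
           equal to -32768) wraps like the hardware does.  The conversion
           back to int32_t relies on the usual two's-complement behaviour
           of GCC and Clang.  */
        dst[i] = (int32_t)((uint32_t)lo + (uint32_t)hi);
    }
}

/* Returns nonzero when swapping the two sources gives the same result.  */
static int pmaddwd_commutes(const int16_t a[8], const int16_t b[8])
{
    int32_t ab[4], ba[4];
    pmaddwd_model(a, b, ab);
    pmaddwd_model(b, a, ba);
    return memcmp(ab, ba, sizeof ab) == 0;
}
```

Swapping the arguments only swaps the factors of each 16-bit product, so the
results are identical; that is why either source could be the memory operand.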


-- 
           Summary: x86 poor code with pmaddwd
           Product: gcc
           Version: 4.4.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: astrange at ithinksw dot com
 GCC build triplet: i386-apple-darwin9.2.0
  GCC host triplet: i386-apple-darwin9.2.0
GCC target triplet: i386-apple-darwin9.2.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35714



* [Bug target/35714] x86 poor code with pmaddwd
  2008-03-27  1:03 [Bug target/35714] New: x86 poor code with pmaddwd astrange at ithinksw dot com
  2008-03-27  1:03 ` [Bug target/35714] " astrange at ithinksw dot com
@ 2008-03-27  7:42 ` ubizjak at gmail dot com
  2008-03-30 20:40 ` pinskia at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: ubizjak at gmail dot com @ 2008-03-27  7:42 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from ubizjak at gmail dot com  2008-03-27 07:41 -------
Combine doesn't want to merge the memory operand into the sse2_pmaddwd pattern.
Perhaps this is a limitation of the combiner, since the pmaddwd pattern uses
each of its input operands more than once.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35714



* [Bug target/35714] x86 poor code with pmaddwd
  2008-03-27  1:03 [Bug target/35714] New: x86 poor code with pmaddwd astrange at ithinksw dot com
  2008-03-27  1:03 ` [Bug target/35714] " astrange at ithinksw dot com
  2008-03-27  7:42 ` ubizjak at gmail dot com
@ 2008-03-30 20:40 ` pinskia at gcc dot gnu dot org
  2008-05-06 12:56 ` ubizjak at gmail dot com
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2008-03-30 20:40 UTC (permalink / raw)
  To: gcc-bugs



-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35714



* [Bug target/35714] x86 poor code with pmaddwd
  2008-03-27  1:03 [Bug target/35714] New: x86 poor code with pmaddwd astrange at ithinksw dot com
                   ` (2 preceding siblings ...)
  2008-03-30 20:40 ` pinskia at gcc dot gnu dot org
@ 2008-05-06 12:56 ` ubizjak at gmail dot com
  2008-05-07 13:13 ` uros at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: ubizjak at gmail dot com @ 2008-05-06 12:56 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from ubizjak at gmail dot com  2008-05-06 12:55 -------
This is due to this code snippet from i386.c, ix86_expand_binop_builtin ():

--cut here--
  /* ??? Using ix86_fixup_binary_operands is problematic when
     we've got mismatched modes.  Fake it.  */

  xops[0] = target;
  xops[1] = op0;
  xops[2] = op1;

  if (tmode == mode0 && tmode == mode1)
    {
      target = ix86_fixup_binary_operands (UNKNOWN, tmode, xops);
      op0 = xops[1];
      op1 = xops[2];
    }
  else if (optimize || !ix86_binary_operator_ok (UNKNOWN, tmode, xops))
    {
      op0 = force_reg (mode0, op0);
      op1 = force_reg (mode1, op1);
      target = gen_reg_rtx (tmode);
    }
--cut here--

Since UNKNOWN is not a commutative operator, this code disables many
optimizations that could be performed by treating builtins individually using
ix86_fixup_binary_operands_no_copy ().
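
The effect of treating the operator as non-commutative can be sketched with a
toy model. This is not GCC code: the enum, the struct and
toy_fixup_binary_operands are hypothetical, and the only rule modelled is that
the first source is tied to the destination register while the second source
may be memory.

```c
#include <stdbool.h>

/* Hypothetical operand kinds for illustration; GCC itself works on RTL.  */
enum operand_kind { OPERAND_REG, OPERAND_MEM };

struct binop_operands {
    enum operand_kind src1;   /* tied to the destination, must be a register */
    enum operand_kind src2;   /* may be a register or a memory reference */
    int extra_loads;          /* separate loads needed to make src1 legal */
};

static struct binop_operands
toy_fixup_binary_operands(enum operand_kind src1, enum operand_kind src2,
                          bool commutative)
{
    struct binop_operands r = { src1, src2, 0 };

    /* A commutative operation may swap its sources so that the memory
       operand lands in the slot that accepts memory.  */
    if (commutative && r.src1 == OPERAND_MEM && r.src2 == OPERAND_REG) {
        r.src1 = OPERAND_REG;
        r.src2 = OPERAND_MEM;
    }

    /* Anything still in memory in the first slot needs its own load.  */
    if (r.src1 == OPERAND_MEM) {
        r.src1 = OPERAND_REG;
        r.extra_loads++;
    }
    return r;
}
```

With commutative set, a (memory, register) pair needs no extra load; with it
clear, as when the code is UNKNOWN, the same pair costs one load.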


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35714



* [Bug target/35714] x86 poor code with pmaddwd
  2008-03-27  1:03 [Bug target/35714] New: x86 poor code with pmaddwd astrange at ithinksw dot com
                   ` (3 preceding siblings ...)
  2008-05-06 12:56 ` ubizjak at gmail dot com
@ 2008-05-07 13:13 ` uros at gcc dot gnu dot org
  2008-05-07 13:34 ` ubizjak at gmail dot com
  2009-10-28  9:37 ` ubizjak at gmail dot com
  6 siblings, 0 replies; 8+ messages in thread
From: uros at gcc dot gnu dot org @ 2008-05-07 13:13 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from uros at gcc dot gnu dot org  2008-05-07 13:12 -------
Subject: Bug 35714

Author: uros
Date: Wed May  7 13:12:02 2008
New Revision: 135041

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=135041
Log:
        PR target/35714
        * config/i386/mmx.md (mmx_subv2sf3): New expander.
        (*mmx_subv2sf3): Rename from mmx_subv2sf3 insn pattern.
        (*mmx_eqv2sf3): Rename from mmx_eqv2sf3 insn pattern.
        (mmx_eqv2sf3): New expander.  Use ix86_fixup_binary_operands_no_copy
        to handle nonimmediate operands.
        (*mmx_paddwd): Rename from mmx_paddwd insn pattern.
        (mmx_paddwd): New expander.  Use ix86_fixup_binary_operands_no_copy
        to handle nonimmediate operands.
        (*mmx_pmulhrwv4hi3): Rename from mmx_pmulhrwv4hi3 insn pattern.
        (mmx_pmulhrwv4hi3): New expander.  Use
        ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
        (*sse2_umulv1siv1di3): Rename from sse2_umulv1siv1di3 insn pattern.
        (sse2_umulv1siv1di3): New expander.  Use
        ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
        (*mmx_eq<mode>3): Rename from mmx_eq<mode>3 insn pattern.
        (mmx_eq<mode>3): New expander.  Use
        ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
        (*mmx_uavgv8qi3): Rename from mmx_uavgv8qi3 insn pattern.
        (mmx_uavgv8qi3): New expander.  Use
        ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
        (*mmx_uavgv4hi3): Rename from mmx_uavgv4hi3 insn pattern.
        (mmx_uavgv4hi3): New expander.  Use
        ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.

        * config/i386/sse.md
        (*sse_movhlps): Rename from sse_movhlps insn pattern.
        (sse_movhlps): New expander.  Use ix86_fixup_binary_operands
        to handle nonimmediate operands.
        (*sse_movlhps): Rename from sse_movlhps insn pattern.
        (sse_movlhps): New expander.  Use ix86_fixup_binary_operands
        to handle nonimmediate operands.
        (*sse_loadhps): Rename from sse_loadhps insn pattern.
        (sse_loadhps): New expander.  Use ix86_fixup_binary_operands
        to handle nonimmediate operands.
        (*sse_loadlps): Rename from sse_loadlps insn pattern.
        (sse_loadlps): New expander.  Use ix86_fixup_binary_operands
        to handle nonimmediate operands.
        (*sse2_unpckhpd): Rename from sse2_unpckhpd insn pattern.
        (sse2_unpckhpd): New expander.  Use
        ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
        (*sse2_unpcklpd): Rename from sse2_unpcklpd insn pattern.
        (sse2_unpcklpd): New expander.  Use
        ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
        (*sse_loadhpd): Rename from sse_loadhpd insn pattern.
        (sse_loadhpd): New expander.  Use ix86_fixup_binary_operands
        to handle nonimmediate operands.
        (*sse_loadlpd): Rename from sse_loadlpd insn pattern.
        (sse_loadlpd): New expander.  Use ix86_fixup_binary_operands
        to handle nonimmediate operands.
        (*sse2_<plusminus_insn><mode>3): Rename from
        sse2_<plusminus_insn><mode>3 insn pattern.
        (sse2_<plusminus_insn><mode>3): New expander.  Use
        ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
        (*sse2_umulv2siv2di3): Rename from sse2_umulv2siv2di3 insn pattern.
        (sse2_umulv2siv2di3): New expander.  Use
        ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
        (*sse4_1_mulv2siv2di3): Rename from sse4_1_mulv2siv2di3 insn pattern.
        (sse4_1_mulv2siv2di3): New expander.  Use
        ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
        (*sse2_pmaddwd): Rename from sse2_pmaddwd insn pattern.
        (sse2_pmaddwd): New expander.  Use
        ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
        (*sse2_eq<mode>3): Rename from sse2_eq<mode>3 insn pattern.
        (sse2_eq<mode>3): New expander.  Use
        ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
        (*sse4_1_eqv2di3): Rename from sse4_1_eqv2di3 insn pattern.
        (sse4_1_eqv2di3): New expander.  Use
        ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
        (*sse2_uavgv16qi3): Rename from sse2_uavgv16qi3 insn pattern.
        (sse2_uavgv16qi3): New expander.  Use
        ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
        (*sse2_uavgv8hi3): Rename from sse2_uavgv8hi3 insn pattern.
        (sse2_uavgv8hi3): New expander.  Use
        ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
        (*ssse3_pmulhrswv8hi3): Rename from ssse3_pmulhrswv8hi3 insn pattern.
        (ssse3_pmulhrswv8hi3): New expander.  Use
        ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.
        (*ssse3_pmulhrswv4hi3): Rename from ssse3_pmulhrswv4hi3 insn pattern.
        (ssse3_pmulhrswv4hi3): New expander.  Use
        ix86_fixup_binary_operands_no_copy to handle nonimmediate operands.

        (<sse>_vm<plusminus_insn><mode>3): Do not use ix86_binary_operator_ok.
        (<sse>_vmmul<mode>3): Ditto.
        (divv4sf3): Do not use ix86_fixup_binary_operands_no_copy.
        (divv2df3): Ditto.
        (ssse3_pmaddubsw128): Use register_operand for operand 1.
        (ssse3_pmaddubsw): Ditto.

        * config/i386/i386.c (ix86_fixup_binary_operands): Assert that src1
        and src2 must have the same mode when swapped.
        (ix86_expand_binop_builtin): Do not use ix86_fixup_binary_operands
        and ix86_binary_operator_ok.  Do not force operands in registers
        when optimizing.

testsuite/ChangeLog:

        PR target/35714
        * gcc.target/i386/pr35714.c: New test.


Added:
    trunk/gcc/testsuite/gcc.target/i386/pr35714.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/i386/mmx.md
    trunk/gcc/config/i386/sse.md
    trunk/gcc/testsuite/ChangeLog


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35714



* [Bug target/35714] x86 poor code with pmaddwd
  2008-03-27  1:03 [Bug target/35714] New: x86 poor code with pmaddwd astrange at ithinksw dot com
                   ` (4 preceding siblings ...)
  2008-05-07 13:13 ` uros at gcc dot gnu dot org
@ 2008-05-07 13:34 ` ubizjak at gmail dot com
  2009-10-28  9:37 ` ubizjak at gmail dot com
  6 siblings, 0 replies; 8+ messages in thread
From: ubizjak at gmail dot com @ 2008-05-07 13:34 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from ubizjak at gmail dot com  2008-05-07 13:33 -------
The problem with memory operands has been fixed by the patch, so we now
generate an optimal one-insn sequence for both functions in:

--cut here--
#include <emmintrin.h>

extern __m128i a;

__m128i madd (__m128i b)
{
  return _mm_madd_epi16(a, b);
}

__m128i madd_swapped (__m128i b)
{
  return _mm_madd_epi16(b, a);
}
--cut here--

The original testcase passes an immediate operand to the expanders. Since
immediates don't satisfy the insn operand constraints, we move them to a
register. Since there is no direct imm->reg load insn for V4SF, they are first
pushed to memory and then loaded into a register.

To solve this problem, we should push immediates to memory when the insn
supports memory operands. Alternatively, we could perhaps find the original
memory location (if available) and pass this location to the expander.


-- 

ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |ubizjak at gmail dot com
                   |dot org                     |
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2008-05-07 13:33:55
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35714



* [Bug target/35714] x86 poor code with pmaddwd
  2008-03-27  1:03 [Bug target/35714] New: x86 poor code with pmaddwd astrange at ithinksw dot com
                   ` (5 preceding siblings ...)
  2008-05-07 13:34 ` ubizjak at gmail dot com
@ 2009-10-28  9:37 ` ubizjak at gmail dot com
  6 siblings, 0 replies; 8+ messages in thread
From: ubizjak at gmail dot com @ 2009-10-28  9:37 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from ubizjak at gmail dot com  2009-10-28 09:36 -------
The original testcase is now also fixed in mainline; on x86_64 it compiles to
(-O2):

madd:
        pmaddwd a(%rip), %xmm0
        ret

madd_swapped:
        pmaddwd a(%rip), %xmm0
        ret

        .section        .rodata
        .align 16
        .type   a, @object
        .size   a, 16
a:
        .value  -22725
        .value  -12873
        .value  -22725
        .value  -12873
        .value  -22725
        .value  -12873
        .value  -22725
        .value  -12873


-- 

ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED
   Target Milestone|---                         |4.5.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35714


