From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-333223-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
MIME-Version: 1.0
In-Reply-To: <alpine.DEB.2.02.1211301317160.3783@laptop-mg.saclay.inria.fr>
References: <alpine.DEB.2.02.1210131032460.9651@stedding.saclay.inria.fr>	<CAFULd4YHdLF1ZyxrMG8MhRjo40f-EfAJZnDOEBc80pOGa4WNGQ@mail.gmail.com>	<alpine.DEB.2.02.1210141057010.3752@laptop-mg.saclay.inria.fr>	<alpine.DEB.2.02.1211301317160.3783@laptop-mg.saclay.inria.fr>
Date: Fri, 30 Nov 2012 13:55:00 -0000
Message-ID: <CAFULd4bVFbr1KC6FrjWOhEfmqEr_v2VuQMFjg4+TpTB681=HhA@mail.gmail.com>
Subject: Re: [i386] scalar ops that preserve the high part of a vector
From: Uros Bizjak <ubizjak@gmail.com>
To: Marc Glisse <marc.glisse@inria.fr>
Cc: gcc-patches@gcc.gnu.org
Content-Type: text/plain; charset=ISO-8859-1
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
X-SW-Source: 2012-11/txt/msg02545.txt.bz2

On Fri, Nov 30, 2012 at 1:34 PM, Marc Glisse <marc.glisse@inria.fr> wrote:

> Hello,
>
> I experimented with the simplify-rtx transformation you suggested, see:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54855
>
> It works when the argument is a register, but not for memory (which is where
> the constant is in the testcase). And the description of the operation in
> sse.md does seem problematic. It says the second argument is:
>
>             (match_operand:VF_128 2 "nonimmediate_operand" "xm,xm"))
>
> but Intel's documentation says "The source operand can be an XMM register or
> a 64-bit memory location", not quite the same.
>
> Do you think the .md description should really stay this way, or could we
> change it to something that better reflects "64-bit memory location"?

For reference, we are talking about:

(define_insn "<sse>_vm<plusminus_insn><mode>3"
  [(set (match_operand:VF_128 0 "register_operand" "=x,x")
	(vec_merge:VF_128
	  (plusminus:VF_128
	    (match_operand:VF_128 1 "register_operand" "0,x")
	    (match_operand:VF_128 2 "nonimmediate_operand" "xm,xm"))
	  (match_dup 1)
	  (const_int 1)))]
  "TARGET_SSE"
  "@
   <plusminus_mnemonic><ssescalarmodesuffix>\t{%2, %0|%0, %2}
   v<plusminus_mnemonic><ssescalarmodesuffix>\t{%2, %1, %0|%0, %1, %2}"
  [(set_attr "isa" "noavx,avx")
   (set_attr "type" "sseadd")
   (set_attr "prefix" "orig,vex")
   (set_attr "mode" "<ssescalarmode>")])

No. Looking at your description, operand 2 should be a scalar operand
(we use the _s{s,d} scalar instructions here), and for doubles it
should refer to a 64-bit memory location. I don't remember all the
details about vec_merge scalar instructions, but it looks to me that
the canonical representation should be more like your proposal:

+(define_insn "*sse2_vm<plusminus_insn>v2df3"
+  [(set (match_operand:V2DF 0 "register_operand" "=x,x")
+    (vec_concat:V2DF
+      (plusminus:DF
+        (vec_select:DF
+          (match_operand:V2DF 1 "register_operand" "0,x")
+          (parallel [(const_int 0)]))
+        (match_operand:DF 2 "nonimmediate_operand" "xm,xm"))
+      (vec_select:DF (match_dup 1) (parallel [(const_int 1)]))))]
+  "TARGET_SSE2"

Uros.