Date: Fri, 30 Nov 2012 22:36:00 -0000
From: Marc Glisse
To: Uros Bizjak
cc: gcc-patches@gcc.gnu.org
Subject: Re: [i386] scalar ops that preserve the high part of a vector

On Fri, 30 Nov 2012, Uros Bizjak wrote:

> For reference, we are talking about:
>
> (define_insn "_vm3"
>   [(set (match_operand:VF_128 0 "register_operand" "=x,x")
>         (vec_merge:VF_128
>           (plusminus:VF_128
>             (match_operand:VF_128 1 "register_operand" "0,x")
>             (match_operand:VF_128 2 "nonimmediate_operand" "xm,xm"))
>           (match_dup 1)
>           (const_int 1)))]
>   "TARGET_SSE"
>   "@
>    \t{%2, %0|%0, %2}
>    v\t{%2, %1, %0|%0, %1, %2}"
>   [(set_attr "isa" "noavx,avx")
>    (set_attr "type" "sseadd")
>    (set_attr "prefix" "orig,vex")
>    (set_attr "mode" "")])
>
> No, looking at your description, the operand 2 should be scalar
> operand (we use _s{s,d} scalar instruction here), and for doubles this
> should refer to 64bit memory
> location. I don't remember all the
> details about vec_merge scalar instructions, but it looks to me that
> canonical representation should be more like your proposal:
>
> +(define_insn "*sse2_vmv2df3"
> +  [(set (match_operand:V2DF 0 "register_operand" "=x,x")
> +     (vec_concat:V2DF
> +       (plusminus:DF
> +         (vec_select:DF
> +           (match_operand:V2DF 1 "register_operand" "0,x")
> +           (parallel [(const_int 0)]))
> +         (match_operand:DF 2 "nonimmediate_operand" "xm,xm"))
> +       (vec_select:DF (match_dup 1) (parallel [(const_int 1)]))))]
> +  "TARGET_SSE2"

Thank you. Among the following possible patterns, my choice (if nobody
objects) is to use 4) for V2DF and 3) (rewritten without iterators) for
V4SF.

The question is then what should be done about the builtins and
intrinsics. _mm_add_sd takes two __m128d. If I change the signature of
__builtin_ia32_addsd, I can make _mm_add_sd pass __B[0] as the second
argument, but I don't know if I am allowed to change that signature.
Otherwise I guess I'll need to keep a separate expander for it (I'd
rather not). And then there are several operations other than +/- to
handle.
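
To make the signature question concrete, here is a minimal sketch in
plain C. It only models the semantics. The type v2df and the functions
addsd_vec_vec_model, addsd_vec_scalar_model and mm_add_sd_model are
hypothetical names made up for this illustration, not the real builtin
or header code. It shows that a wrapper which keeps the two-vector
interface and forwards element 0 of its second argument gives the same
result as the current two-vector form.

```c
#include <assert.h>

/* Hypothetical stand-in for a vector of two doubles; e[0] is the low part. */
typedef struct { double e[2]; } v2df;

/* Model of the current builtin signature: both arguments are full
   vectors, although only element 0 of b is actually used. */
static v2df addsd_vec_vec_model(v2df a, v2df b)
{
    v2df r = a;                 /* the high part is preserved from a */
    r.e[0] = a.e[0] + b.e[0];
    return r;
}

/* Model of the changed signature (an assumption): the second argument
   is a scalar, matching what the instruction reads. */
static v2df addsd_vec_scalar_model(v2df a, double b)
{
    v2df r = a;                 /* the high part is preserved from a */
    r.e[0] = a.e[0] + b;
    return r;
}

/* Model of the intrinsic wrapper: it keeps its two-vector interface
   and passes element 0 of b to the scalar form. */
static v2df mm_add_sd_model(v2df a, v2df b)
{
    return addsd_vec_scalar_model(a, b.e[0]);
}

/* Checks that both forms agree on one pair of inputs. */
static void check_wrapper_agrees(v2df a, v2df b)
{
    v2df x = addsd_vec_vec_model(a, b);
    v2df y = mm_add_sd_model(a, b);
    assert(x.e[0] == y.e[0]);
    assert(x.e[1] == y.e[1]);
}
```

With a = {1, 2} and b = {10, 20}, both forms return {11, 2}: the low
elements are added and the high element of a is kept.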

1) Current pattern:

  [(set (match_operand:VF_128 0 "register_operand" "=x,x")
        (vec_merge:VF_128
          (plusminus:VF_128
            (match_operand:VF_128 1 "register_operand" "0,x")
            (match_operand:VF_128 2 "nonimmediate_operand" "xm,xm"))
          (match_dup 1)
          (const_int 1)))]

2) Minimal fix:

  [(set (match_operand:VF_128 0 "register_operand" "=x,x")
        (vec_merge:VF_128
          (plusminus:VF_128
            (match_operand:VF_128 1 "register_operand" "0,x")
            (vec_duplicate:VF_128
              (match_operand: 2 "nonimmediate_operand" "xm,xm")))
          (match_dup 1)
          (const_int 1)))]

3) With the operation in scalar mode:

  [(set (match_operand:VF_128 0 "register_operand" "=x,x")
        (vec_merge:VF_128
          (vec_duplicate:VF_128
            (plusminus:
              (vec_select:
                (match_operand:VF_128 1 "register_operand" "0,x")
                (parallel [(const_int 0)]))
              (match_operand: 2 "nonimmediate_operand" "xm,xm")))
          (match_dup 1)
          (const_int 1)))]

4) Special version which only makes sense for vectors of 2 elements:

  [(set (match_operand:V2DF 0 "register_operand" "=x,x")
        (vec_concat:V2DF
          (plusminus:DF
            (vec_select:DF
              (match_operand:V2DF 1 "register_operand" "0,x")
              (parallel [(const_int 0)]))
            (match_operand:DF 2 "nonimmediate_operand" "xm,xm"))
          (vec_select:DF (match_dup 1) (parallel [(const_int 1)]))))]

-- 
Marc Glisse
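
As a sanity check on patterns 1) to 4), here is a rough plain C model
of the RTL operators involved, specialised to two doubles and to
addition. Everything in it is an assumption made for illustration: the
type v2df, the helper names, and the patternN functions are not GCC
code. The vec_merge helper follows the documented convention that a
set bit N in the mask selects element N from the first vector. For
equal low parts of operand 2, all four forms should produce the same
register value; the difference lies in how much of operand 2 the
pattern claims to read.

```c
#include <assert.h>

/* Hypothetical stand-in for a V2DF value; e[0] is element 0. */
typedef struct { double e[2]; } v2df;

/* vec_merge: bit i of mask set -> element i comes from x, else from y. */
static v2df vec_merge_model(v2df x, v2df y, unsigned mask)
{
    v2df r;
    for (int i = 0; i < 2; i++)
        r.e[i] = ((mask >> i) & 1u) ? x.e[i] : y.e[i];
    return r;
}

/* vec_duplicate: broadcast a scalar to every element. */
static v2df vec_duplicate_model(double s)
{
    v2df r = {{s, s}};
    return r;
}

/* vec_concat: build a vector from a low and a high scalar. */
static v2df vec_concat_model(double lo, double hi)
{
    v2df r = {{lo, hi}};
    return r;
}

/* Element-wise addition, standing in for plusminus in vector mode. */
static v2df plus_model(v2df a, v2df b)
{
    v2df r = {{a.e[0] + b.e[0], a.e[1] + b.e[1]}};
    return r;
}

/* 1) Current pattern: operand 2 is a full vector. */
static v2df pattern1(v2df op1, v2df op2)
{
    return vec_merge_model(plus_model(op1, op2), op1, 1);
}

/* 2) Minimal fix: operand 2 is a scalar, duplicated before the add. */
static v2df pattern2(v2df op1, double op2)
{
    return vec_merge_model(plus_model(op1, vec_duplicate_model(op2)), op1, 1);
}

/* 3) Operation in scalar mode, result duplicated before the merge. */
static v2df pattern3(v2df op1, double op2)
{
    return vec_merge_model(vec_duplicate_model(op1.e[0] + op2), op1, 1);
}

/* 4) Two-element special case using vec_concat. */
static v2df pattern4(v2df op1, double op2)
{
    return vec_concat_model(op1.e[0] + op2, op1.e[1]);
}

/* Returns 1 when two modelled vectors are element-wise equal. */
static int same_v2df(v2df a, v2df b)
{
    return a.e[0] == b.e[0] && a.e[1] == b.e[1];
}
```

In this model only pattern1 takes a whole vector as operand 2, which
mirrors the objection that the current pattern describes a 128-bit
memory operand where the instruction reads a scalar.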