Date: Tue, 04 Dec 2012 14:05:00 -0000
From: Marc Glisse
To: Uros Bizjak
Cc: gcc-patches@gcc.gnu.org, "H.J. Lu"
Subject: Re: [i386] scalar ops that preserve the high part of a vector

On Mon, 3 Dec 2012, Uros Bizjak wrote:

> On Mon, Dec 3, 2012 at 4:34 PM, Marc Glisse wrote:
>
>>> However, looking a bit more into the usage cases for these patterns,
>>> they are only used through intrinsics with _m128 operands. While your
>>> proposed patch makes these patterns more general (they can use 64bit
>>> aligned memory), this is not their usual usage, and for their intended
>>> usage, your proposed improvement complicates these patterns
>>> unnecessarily. Following on these facts, I'd say that we leave these
>>> special patterns (since they serve their purpose well) and rather
>>> introduce new patterns for "other" uses.
>>
>>
>> You mean like in the original patch?
>> http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01279.html
>>
>> (it only had the V2DF version, not the V4SF one)
>>
>> Funny how we switched sides, now I am the one who would rather have a single
>> pattern instead of having one for the builtin and one for recog. It seems
>> that once we add the new pattern, keeping the old one is a waste of
>> maintenance time, and the few extra rtx from the slightly longer pattern for
>> these seldom used builtins should be negligible.
>
> Yes, I didn't notice at the time that the intention of existing
> patterns was to implement intrinsics that exclusively use _m128
> operands.
>
>> But I don't mind, if that's the version you prefer, I'll update the patch.
>
> Actually, both approaches have their benefits and drawbacks.
> Specialized vec_merge patterns can be efficiently macroized, and
> support builtins with _m128 operands in a simple and efficient way.
> You are proposing patterns that do not macroize well (this is what was
> learned from your last patch) and require breakup of existing
> macroized patterns.
>
> So, we are actually adding new functionality - operations on an array
> of values. IMO, this warrants new patterns, but please find a way for
> V2DF and V4SF to macroize in the same way.

I am still confused as to what is wanted. If the quantity to minimize is the
number of entries in sse.md, we should replace the existing vec_merge pattern
with this one: it macroizes just as well, it directly matches for V4SF, and
the piece of code needed in simplify-rtx for V2DF isn't too absurd.
(then we need to adjust the builtins as in one of the previous patches)

  [(set (match_operand:VF_128 0 "register_operand" "=x,x")
        (vec_merge:VF_128
          (vec_duplicate:VF_128
            (plusminus:
              (vec_select:
                (match_operand:VF_128 1 "register_operand" "0,x")
                (parallel [(const_int 0)]))
              (match_operand: 2 "nonimmediate_operand" "xm,xm")))
          (match_dup 1)
          (const_int 1)))]

Then there is the question (i) of possibly introducing a specialized version
for V2DF (different pattern) instead of adding code to simplify-rtx. And
finally there is the question (ii) of keeping the old define_insn in addition
to the new one(s), just for the builtins.

My preference is:
(i) specialized pattern for V2DF
(ii) remove

It seems like you might be ok with:
(i) simplify-rtx
(ii) remove

Do you agree?

--
Marc Glisse