From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-333485-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 30760 invoked by alias); 4 Dec 2012 14:05:28 -0000
Received: (qmail 30555 invoked by uid 22791); 4 Dec 2012 14:05:25 -0000
X-SWARE-Spam-Status: No, hits=-7.9 required=5.0	tests=AWL,BAYES_00,KHOP_RCVD_UNTRUST,KHOP_THREADED,RCVD_IN_DNSWL_HI,RCVD_IN_HOSTKARMA_W,RP_MATCHES_RCVD,TW_ZJ
X-Spam-Check-By: sourceware.org
Received: from mail1-relais-roc.national.inria.fr (HELO mail1-relais-roc.national.inria.fr) (192.134.164.82)    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Tue, 04 Dec 2012 14:05:16 +0000
Received: from ip-53.net-81-220-141.rev.numericable.fr (HELO laptop-mg.local) ([81.220.141.53])  by mail1-relais-roc.national.inria.fr with ESMTP/TLS/DHE-RSA-AES256-SHA; 04 Dec 2012 15:05:15 +0100
Date: Tue, 04 Dec 2012 14:05:00 -0000
From: Marc Glisse <marc.glisse@inria.fr>
To: Uros Bizjak <ubizjak@gmail.com>
cc: gcc-patches@gcc.gnu.org, "H.J. Lu" <hjl.tools@gmail.com>
Subject: Re: [i386] scalar ops that preserve the high part of a vector
In-Reply-To: <CAFULd4a5my_Hre6k5kTUm2UqpBX06kKKe9jjMiFisBJSUJPxyw@mail.gmail.com>
Message-ID: <alpine.DEB.2.02.1212041436150.3790@laptop-mg.saclay.inria.fr>
References: <alpine.DEB.2.02.1210131032460.9651@stedding.saclay.inria.fr> <CAFULd4YHdLF1ZyxrMG8MhRjo40f-EfAJZnDOEBc80pOGa4WNGQ@mail.gmail.com> <alpine.DEB.2.02.1210141057010.3752@laptop-mg.saclay.inria.fr> <alpine.DEB.2.02.1211301317160.3783@laptop-mg.saclay.inria.fr> <CAFULd4bVFbr1KC6FrjWOhEfmqEr_v2VuQMFjg4+TpTB681=HhA@mail.gmail.com> <alpine.DEB.2.02.1211302244290.3783@laptop-mg.saclay.inria.fr> <alpine.DEB.2.02.1212011800400.19206@stedding.saclay.inria.fr> <CAFULd4Z=8xeVScUM7JSqkHHXJrmO7qVa40q-h-mUE+Vehb86vw@mail.gmail.com> <alpine.DEB.2.02.1212021215330.3747@laptop-mg.saclay.inria.fr> <CAFULd4Z4Ywk+BQupB+=tDKiWx5QrCt9VL83XF4Xtgu03=y0VZw@mail.gmail.com> <alpine.DEB.2.02.1212030958050.3737@laptop-mg.saclay.inria.fr> <CAFULd4a5my_Hre6k5kTUm2UqpBX06kKKe9jjMiFisBJSUJPxyw@mail.gmail.com>
User-Agent: Alpine 2.02 (DEB 1266 2009-07-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
X-SW-Source: 2012-12/txt/msg00227.txt.bz2

On Mon, 3 Dec 2012, Uros Bizjak wrote:

> On Mon, Dec 3, 2012 at 4:34 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>
>>> However, looking a bit more into the usage cases for these patterns,
>>> they are only used through intrinsics with __m128 operands. While your
>>> proposed patch makes these patterns more general (they can use 64bit
>>> aligned memory), this is not their usual usage, and for their intended
>>> usage, your proposed improvement complicates these patterns
>>> unnecessarily. Following on these facts, I'd say that we leave these
>>> special patterns (since they serve their purpose well) and rather
>>> introduce new patterns for "other" uses.
>>
>>
>> You mean like in the original patch?
>> http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01279.html
>>
>> (it only had the V2DF version, not the V4SF one)
>>
>> Funny how we switched sides, now I am the one who would rather have a single
>> pattern instead of having one for the builtin and one for recog. It seems
>> that once we add the new pattern, keeping the old one is a waste of
>> maintenance time, and the few extra rtx from the slightly longer pattern for
>> these seldom-used builtins should be negligible.
>
> Yes, I didn't notice at the time that the intention of existing
> patterns was to implement intrinsics that exclusively use __m128
> operands.
>
>> But I don't mind, if that's the version you prefer, I'll update the patch.
>
> Actually, both approaches have their benefits and drawbacks.
> Specialized vec_merge patterns can be efficiently macroized, and
> support builtins with __m128 operands in a simple and efficient way.
> You are proposing patterns that do not macroize well (this is what was
> learned from your last patch) and require breakup of existing
> macroized patterns.
>
> So, we are actually adding new functionality - operations on an array
> of values. IMO, this warrants new patterns, but please find a way for
> V2DF and V4SF to macroize in the same way.

I am still confused as to what is wanted. If the quantity to minimize is
the number of entries in sse.md, we should replace the existing
vec_merge pattern with the one below: it macroizes just as well, it
directly matches for V4SF, and the piece of code needed in simplify-rtx
for V2DF isn't too absurd. (We would then need to adjust the builtins, as
in one of the previous patches.)

[(set (match_operand:VF_128 0 "register_operand" "=x,x")
       (vec_merge:VF_128
 	(vec_duplicate:VF_128
 	  (plusminus:<ssescalarmode>
 	    (vec_select:<ssescalarmode>
 	      (match_operand:VF_128 1 "register_operand" "0,x")
 	      (parallel [(const_int 0)]))
 	    (match_operand:<ssescalarmode> 2 "nonimmediate_operand" "xm,xm")))
       (match_dup 1)
       (const_int 1)))]
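
To spell out what this pattern computes, here is a rough C model of it
for the plus case. It is only a sketch, not GCC code: the names
model_vec_merge and model_scalar_add are made up for illustration, and
plain arrays of double stand in for both V2DF and V4SF. It assumes the
documented vec_merge semantics, where a set bit in the mask takes the
element from the first vector and a clear bit takes it from the second.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of RTL (vec_merge v1 v2 mask): element i of the
   result comes from v1 when bit i of mask is set, from v2 otherwise. */
void model_vec_merge(const double *v1, const double *v2,
                     unsigned mask, double *out, size_t n)
{
    assert(n <= 8 * sizeof mask);
    for (size_t i = 0; i < n; i++)
        out[i] = ((mask >> i) & 1u) ? v1[i] : v2[i];
}

/* Hypothetical model of the pattern above with plusminus = plus:
   (vec_merge (vec_duplicate (plus (vec_select op1 [0]) op2)) op1 1).
   n is 2 for V2DF and 4 for V4SF. */
void model_scalar_add(const double *op1, double op2, double *out, size_t n)
{
    double dup[8];
    assert(n >= 1 && n <= 8);

    /* plus: scalar operation on element 0 of op1 and the scalar op2 */
    double sum = op1[0] + op2;

    /* vec_duplicate: broadcast the scalar result to every element */
    for (size_t i = 0; i < n; i++)
        dup[i] = sum;

    /* vec_merge with mask 1: keep only element 0 of the broadcast,
       take all the other elements from op1 unchanged */
    model_vec_merge(dup, op1, 1u, out, n);
}
```

With op1 = {1.5, 7.0} and op2 = 2.25, the model gives {3.75, 7.0}: the
low element is updated and the high part of the vector is preserved,
whatever the number of elements.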

Then there is the question (i) of possibly introducing a specialized
version for V2DF (different pattern) instead of adding code to
simplify-rtx.

And finally there is the question (ii) of keeping the old define_insn in
addition to the new one(s), just for the builtins.

My preference is:
(i) specialized pattern for V2DF
(ii) remove

It seems like you might be ok with:
(i) simplify-rtx
(ii) remove

Do you agree?

-- 
Marc Glisse