From: Uros Bizjak
To: Michael Zolotukhin
Cc: Kirill Yukhin, "H.J. Lu", Marc Glisse, gcc-patches List
Date: Fri, 07 Dec 2012 08:46:00 -0000
Subject: Re: [i386] scalar ops that preserve the high part of a vector

On Fri, Dec 7, 2012 at 7:49 AM, Michael Zolotukhin wrote:

> Hi guys,
> Could I ask several questions just to clarify the things up?
>
> 1) Does the root problem lay in the fact that even for scalar
> additions we perform the addition on the whole vector and only then
> drop the higher parts of the vector? I.e. to fix the test from the PR
> we need to replace plus on vector mode with plus on scalar mode?

Yes, the existing pattern is used to implement intrinsics, and it is
modelled with a vector operand 2.
But we in fact emit a scalar operation, so we would like to model the
pattern with a scalar operand 2. This way, the same pattern can be used
to emit intrinsics _and_ to optimize the code from the testcase at the
same time. Also, please note that the alignment requirements for vector
and scalar operands are different.

> 2) Is one of the main requirements having the same pattern for V4SF
> and V2DF version?

It is not required, but having a macroized pattern avoids pattern
explosion, eases maintenance (it is easier to understand similar
functionality if it is described in a uniform way), and in some cases,
macroization opportunities force the author to rethink the RTL
description, making the patterns more "universal".

Uros.
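
For illustration only, the two ways of modelling the operation discussed above can be sketched in C with the GNU vector extension. This is not the testcase from the PR (it is not quoted in this message) and it is not the RTL pattern itself; the type name v2df and both function names are hypothetical. The first function mirrors the vector-modelled form (add whole vectors, keep only the low lane of the sum), the second mirrors the scalar-modelled form (operand 2 is a plain double).

```c
/* GNU C vector extension: two doubles in one 16-byte vector.
   The name v2df is chosen for this sketch only. */
typedef double v2df __attribute__((vector_size(16)));

/* Vector-modelled form (hypothetical helper): operand 2 is a full
   vector. Both lanes are added, then only the low lane of the sum
   is merged back into a, so the high lane of a is preserved. */
static v2df add_low_vector_model(v2df a, v2df b)
{
    v2df sum = a + b;   /* addition on both lanes */
    a[0] = sum[0];      /* low lane from sum, high lane from a */
    return a;
}

/* Scalar-modelled form (hypothetical helper): operand 2 is a scalar.
   Only the low lane is touched; the high lane of a is preserved. */
static v2df add_low_scalar_model(v2df a, double b)
{
    a[0] += b;
    return a;
}
```

Both functions compute the same result: with a = {1.0, 2.0} and b = {10.0, 20.0}, each returns {11.0, 2.0}. The difference is only in what operand 2 looks like, which is the point of modelling the pattern with a scalar operand.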