From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 22333 invoked by alias); 8 Apr 2010 05:48:41 -0000
Received: (qmail 22101 invoked by uid 22791); 8 Apr 2010 05:48:40 -0000
X-SWARE-Spam-Status: No, hits=-1.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SARE_MSGID_LONG45
X-Spam-Check-By: sourceware.org
Received: from mail-iw0-f200.google.com (HELO mail-iw0-f200.google.com) (209.85.223.200) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Thu, 08 Apr 2010 05:48:35 +0000
Received: by iwn38 with SMTP id 38so1281241iwn.8 for ; Wed, 07 Apr 2010 22:48:33 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.231.113.7 with HTTP; Wed, 7 Apr 2010 22:48:33 -0700 (PDT)
In-Reply-To: <4BBB6358.4050602@codesourcery.com>
References: <4BBB6358.4050602@codesourcery.com>
Date: Thu, 08 Apr 2010 06:16:00 -0000
Received: by 10.231.149.10 with SMTP id r10mr702416ibv.63.1270705713433; Wed, 07 Apr 2010 22:48:33 -0700 (PDT)
Message-ID:
Subject: Re: lower subreg optimization
From: roy rosen
To: Jim Wilson
Cc: gcc@gcc.gnu.org
Content-Type: text/plain; charset=ISO-8859-1
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-owner@gcc.gnu.org
X-SW-Source: 2010-04/txt/msg00126.txt.bz2

2010/4/6, Jim Wilson :
> On 04/06/2010 02:24 AM, roy rosen wrote:
> > (insn 33 32 34 7 a.c:25 (set (subreg:V2HI (reg:V4HI 114) 0)
> >         (plus:V2HI (subreg:V2HI (reg:V4HI 112) 0)
> >             (subreg:V2HI (reg:V4HI 113) 0))) 118 {addv2hi3} (nil))
> >
>
> Only subregs are decomposed. So use vec_select instead of subreg. I see
> you already have a vec_concat to combine the two v2hi into one v4hi, so
> there is no need for the subreg in the dest. You should try eliminating
> that first and see if that helps. If that isn't enough, then replace the
> subregs in the source with vec_select operations.
>
> Jim
>

Thanks Jim,

I have implemented your suggestion. I am now using vec_select, and the
lower-subreg optimization no longer decomposes the instruction.

The problem now is that I am stuck with redundant instructions (which I
translate to move insns). For example:

(insn 37 32 38 7 a.c:25 (set (reg:V2HI 116)
        (vec_concat:V2HI (vec_select:HI (reg:V4HI 112)
                (parallel [
                        (const_int 0 [0x0])
                    ]))
            (vec_select:HI (reg:V4HI 112)
                (parallel [
                        (const_int 1 [0x1])
                    ])))) 121 {v4hi_extract_low_v2hi} (expr_list:REG_DEAD (reg:V4HI 112)
        (nil)))

This instruction eventually has to be optimized out somehow. It extracts
a V2HI from a V4HI. A V4HI is stored in a register pair (like r0:r1), so
extracting a V2HI simply means taking one of these registers; this does
not need an instruction.

I saw in arm/neon.md that they have a similar problem:

; FIXME: We wouldn't need the following insns if we could write subregs of
; vector registers. Make an attempt at removing unnecessary moves, though
; we're really at the mercy of the register allocator.

(define_insn "move_lo_quad_v4si"
  [(set (match_operand:V4SI 0 "s_register_operand" "+w")
        (vec_concat:V4SI
         (match_operand:V2SI 1 "s_register_operand" "w")
         (vec_select:V2SI (match_dup 0)
                          (parallel [(const_int 2) (const_int 3)]))))]
  "TARGET_NEON"
{
  int dest = REGNO (operands[0]);
  int src = REGNO (operands[1]);

  if (dest != src)
    return "vmov\t%e0, %P1";
  else
    return "";
}
  [(set_attr "neon_type" "neon_bp_simple")]
)

Their solution is also not complete.

What is the proper way to handle such a case, and how do I let gcc know
that this is a simple move instruction so that gcc can optimize it out?

Thanks, Roy.