Date: Tue, 06 Apr 2010 09:24:00 -0000
Subject: lower subreg optimization
From: roy rosen
To: gcc@gcc.gnu.org

Hi,

I have encountered several problems with the lower-subreg optimization in my port.
In some cases, insns are decomposed in the subreg1 pass and are never recomposed later, so the final code uses two insns where one would do.
For example, I have the following dump before subreg1:

(note 30 93 31 7 [bb 7] NOTE_INSN_BASIC_BLOCK)

(insn 31 30 32 7 a.c:25 (set (reg:V4HI 112)
        (mem:V4HI (reg/f:SI 98 [ __vect_p_41 ]) [2 S8 A64])) 115 {*movv4hi_load} (nil))

(insn 32 31 33 7 a.c:25 (set (reg:V4HI 113)
        (mem:V4HI (reg/f:SI 99 [ __vect_p_36 ]) [2 S8 A64])) 115 {*movv4hi_load} (nil))

(insn 33 32 34 7 a.c:25 (set (subreg:V2HI (reg:V4HI 114) 0)
        (plus:V2HI (subreg:V2HI (reg:V4HI 112) 0)
            (subreg:V2HI (reg:V4HI 113) 0))) 118 {addv2hi3} (nil))

(insn 34 33 35 7 a.c:25 (set (subreg:V2HI (reg:V4HI 114) 4)
        (plus:V2HI (subreg:V2HI (reg:V4HI 112) 4)
            (subreg:V2HI (reg:V4HI 113) 4))) 118 {addv2hi3} (nil))

(insn 35 34 36 7 a.c:25 (set (reg:V4HI 114)
        (vec_concat:V4HI (subreg:V2HI (reg:V4HI 114) 0)
            (subreg:V2HI (reg:V4HI 114) 4))) 119 {concat_v2hi_to_v4hi} (expr_list:REG_EQUAL (plus:V4HI (reg:V4HI 112)
            (reg:V4HI 113))
        (nil)))

(insn 36 35 37 7 a.c:25 (set (mem:V4HI (reg/f:SI 97 [ __vect_p_47 ]) [2 S8 A64])
        (reg:V4HI 114)) 116 {*movv4hi_store} (nil))

which turns into:

(note 30 93 94 7 [bb 7] NOTE_INSN_BASIC_BLOCK)

(insn 94 30 95 7 a.c:25 (set (reg:SI 142)
        (mem:SI (reg/f:SI 98 [ __vect_p_41 ]) [2 S4 A64])) 62 {movsi_load} (nil))

(insn 95 94 96 7 a.c:25 (set (reg:SI 143 [+4 ])
        (mem:SI (plus:SI (reg/f:SI 98 [ __vect_p_41 ])
                (const_int 4 [0x4])) [2 S4 A32])) 62 {movsi_load} (nil))

(insn 96 95 97 7 a.c:25 (set (reg:SI 144)
        (mem:SI (reg/f:SI 99 [ __vect_p_36 ]) [2 S4 A64])) 62 {movsi_load} (nil))

(insn 97 96 33 7 a.c:25 (set (reg:SI 145 [+4 ])
        (mem:SI (plus:SI (reg/f:SI 99 [ __vect_p_36 ])
                (const_int 4 [0x4])) [2 S4 A32])) 62 {movsi_load} (nil))

(insn 33 97 34 7 a.c:25 (set (subreg:V2HI (reg:V4HI 114) 0)
        (plus:V2HI (subreg:V2HI (reg:SI 142) 0)
            (subreg:V2HI (reg:SI 144) 0))) 118 {addv2hi3} (nil))

(insn 34 33 35 7 a.c:25 (set (subreg:V2HI (reg:V4HI 114) 4)
        (plus:V2HI (subreg:V2HI (reg:SI 143 [+4 ]) 0)
            (subreg:V2HI (reg:SI 145 [+4 ]) 0))) 118 {addv2hi3} (nil))

(insn 35 34 36 7 a.c:25 (set (reg:V4HI 114)
        (vec_concat:V4HI (subreg:V2HI (reg:V4HI 114) 0)
            (subreg:V2HI (reg:V4HI 114) 4))) 119 {concat_v2hi_to_v4hi} (nil))

(insn 36 35 98 7 a.c:25 (set (mem:V4HI (reg/f:SI 97 [ __vect_p_47 ]) [2 S8 A64])
        (reg:V4HI 114)) 116 {*movv4hi_store} (nil))

Notice that the loads are now done in SI mode, which is twice as expensive as in V4HI mode: each V4HI load has become two SI loads.

Can someone please help with that?
Should this code be decomposed and then recomposed (which does not happen), or should it not be decomposed in the first place?
What should I change in order to end up with a single V4HI load?

Thanks, Roy.