From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 127282 invoked by alias); 14 Nov 2019 23:09:19 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 127271 invoked by uid 89); 14 Nov 2019 23:09:19 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-10.6 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_2,GIT_PATCH_3,KAM_ASCII_DIVIDERS,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.1 spammy= X-HELO: mx0a-001b2d01.pphosted.com Received: from mx0b-001b2d01.pphosted.com (HELO mx0a-001b2d01.pphosted.com) (148.163.158.5) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 14 Nov 2019 23:09:17 +0000 Received: from pps.filterd (m0098413.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.27/8.16.0.27) with SMTP id xAEMxB3s014918; Thu, 14 Nov 2019 18:09:14 -0500 Received: from pps.reinject (localhost [127.0.0.1]) by mx0b-001b2d01.pphosted.com with ESMTP id 2w9fakhvbj-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 14 Nov 2019 18:09:14 -0500 Received: from m0098413.ppops.net (m0098413.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.27/8.16.0.27) with SMTP id xAEMxIOh015606; Thu, 14 Nov 2019 18:09:13 -0500 Received: from ppma03dal.us.ibm.com (b.bd.3ea9.ip4.static.sl-reverse.com [169.62.189.11]) by mx0b-001b2d01.pphosted.com with ESMTP id 2w9fakhvb7-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 14 Nov 2019 18:09:13 -0500 Received: from pps.filterd (ppma03dal.us.ibm.com [127.0.0.1]) by ppma03dal.us.ibm.com (8.16.0.27/8.16.0.27) with SMTP id xAEN5fhM003371; Thu, 14 Nov 2019 23:09:13 GMT Received: from b03cxnp08026.gho.boulder.ibm.com (b03cxnp08026.gho.boulder.ibm.com [9.17.130.18]) by ppma03dal.us.ibm.com with ESMTP id 2w9f8r8kg5-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 14 Nov 2019 23:09:12 +0000 Received: from b03ledav006.gho.boulder.ibm.com (b03ledav006.gho.boulder.ibm.com [9.17.130.237]) by b03cxnp08026.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id xAEN9Cnb33554886 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 14 Nov 2019 23:09:12 GMT Received: from b03ledav006.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 073C9C6066; Thu, 14 Nov 2019 23:09:12 +0000 (GMT) Received: from b03ledav006.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 75074C6069; Thu, 14 Nov 2019 23:09:11 +0000 (GMT) Received: from ibm-toto.the-meissners.org (unknown [9.32.77.177]) by b03ledav006.gho.boulder.ibm.com (Postfix) with ESMTPS; Thu, 14 Nov 2019 23:09:11 +0000 (GMT) Date: Thu, 14 Nov 2019 23:14:00 -0000 From: Michael Meissner To: Michael Meissner , gcc-patches@gcc.gnu.org, Segher Boessenkool , David Edelsohn Subject: [PATCH], V7, #6 of 7, Fix issues with vector extract and prefixed instructions Message-ID: <20191114230909.GF7528@ibm-toto.the-meissners.org> Mail-Followup-To: Michael Meissner , gcc-patches@gcc.gnu.org, Segher Boessenkool , David Edelsohn References: <20191114222509.GA7581@ibm-toto.the-meissners.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20191114222509.GA7581@ibm-toto.the-meissners.org> User-Agent: Mutt/1.5.21 (2010-09-15) X-SW-Source: 2019-11/txt/msg01307.txt.bz2 This patch fixes two issues with vector extracts and prefixed instructions. The first is if you use a vector extract on a vector that is located in memory and to access the vector, you use a PC-relative address with a veriable index. I.e.: #include static vector int vi; int get (int n) { return vec_extract (vi, n); } In this case, the current code re-uses the temporary for calculating the offset of the element to load up the address of the vector, losing the offset. This code prevents the combiner from combining loading the vector from memory and the vector extract if the vector is accessed via a PC-relative address. Instead, the vector is loaded up into a register, and the variable extract from a register is done. I needed to add a new constraint (em) in addition to new predicate functions. I discovered that with the predicate function alone, the register allocator would re-create the address. The constraint prevents this combination. I also modified the vector extract code to generate a single PC-relative load if the vector has a PC-relative address and the offset is constant. I have built a bootstrap compiler with this change, and there were no regressions in the test suite. Can I check this into the FSF trunk? 2019-11-14 Michael Meissner * config/rs6000/constraints.md (em constraint): New constraint for non-prefixed memory. * config/rs6000/predicates.md (non_prefixed_memory): New predicate. (reg_or_non_prefixed_memory): New predicate. * config/rs6000/rs6000.c (rs6000_adjust_vec_address): Add support for optimizing extracting a constant vector element from a vector that uses a prefixed address. If the element number is variable and the address uses a prefixed address, abort. * config/rs6000/vsx.md (vsx_extract__var, VSX_D iterator): Do not allow combining prefixed memory with a variable vector extract. (vsx_extract_v4sf_var): Do not allow combining prefixed memory with a variable vector extract. (vsx_extract__var, VSX_EXTRACT_I iterator): Do not allow combining prefixed memory with a variable vector extract. (vsx_extract__mode_var): Do not allow combining prefixed memory with a variable vector extract. * doc/md.texi (PowerPC constraints): Document the em constraint. Index: gcc/config/rs6000/constraints.md =================================================================== --- gcc/config/rs6000/constraints.md (revision 278173) +++ gcc/config/rs6000/constraints.md (working copy) @@ -202,6 +202,11 @@ (define_constraint "H" ;; Memory constraints +(define_memory_constraint "em" + "A memory operand that does not contain a prefixed address." + (and (match_code "mem") + (match_test "non_prefixed_memory (op, mode)"))) + (define_memory_constraint "es" "A ``stable'' memory operand; that is, one which does not include any automodification of the base register. Unlike @samp{m}, this constraint Index: gcc/config/rs6000/predicates.md =================================================================== --- gcc/config/rs6000/predicates.md (revision 278177) +++ gcc/config/rs6000/predicates.md (working copy) @@ -1836,3 +1836,24 @@ (define_predicate "prefixed_memory" { return address_is_prefixed (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT); }) + +;; Return true if the operand is a memory address that does not use a prefixed +;; address. +(define_predicate "non_prefixed_memory" + (match_code "mem") +{ + /* If the operand is not a valid memory operand even if it is not prefixed, + do not return true. */ + if (!memory_operand (op, mode)) + return false; + + return !address_is_prefixed (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT); +}) + +;; Return true if the operand is either a register or it is a non-prefixed +;; memory operand. +(define_predicate "reg_or_non_prefixed_memory" + (match_code "reg,subreg,mem") +{ + return gpc_reg_operand (op, mode) || non_prefixed_memory (op, mode); +}) Index: gcc/config/rs6000/rs6000.c =================================================================== --- gcc/config/rs6000/rs6000.c (revision 278178) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -6734,6 +6734,7 @@ rs6000_adjust_vec_address (rtx scalar_re rtx element_offset; rtx new_addr; bool valid_addr_p; + bool pcrel_p = pcrel_local_address (addr, Pmode); /* Vector addresses should not have PRE_INC, PRE_DEC, or PRE_MODIFY. */ gcc_assert (GET_RTX_CLASS (GET_CODE (addr)) != RTX_AUTOINC); @@ -6771,6 +6772,38 @@ rs6000_adjust_vec_address (rtx scalar_re else if (REG_P (addr) || SUBREG_P (addr)) new_addr = gen_rtx_PLUS (Pmode, addr, element_offset); + /* Optimize PC-relative addresses with a constant offset. */ + else if (pcrel_p && CONST_INT_P (element_offset)) + { + rtx addr2 = addr; + HOST_WIDE_INT offset = INTVAL (element_offset); + + if (GET_CODE (addr2) == CONST) + addr2 = XEXP (addr2, 0); + + if (GET_CODE (addr2) == PLUS) + { + offset += INTVAL (XEXP (addr2, 1)); + addr2 = XEXP (addr2, 0); + } + + gcc_assert (SIGNED_34BIT_OFFSET_P (offset)); + if (offset) + { + addr2 = gen_rtx_PLUS (Pmode, addr2, GEN_INT (offset)); + new_addr = gen_rtx_CONST (Pmode, addr2); + } + else + new_addr = addr2; + } + + /* With only one temporary base register, we can't support a PC-relative + address added to a variable offset. This is because the PADDI instruction + requires RA to be 0 when doing a PC-relative add (i.e. no register to add + to). */ + else if (pcrel_p) + gcc_unreachable (); + /* Optimize D-FORM addresses with constant offset with a constant element, to include the element offset in the address directly. */ else if (GET_CODE (addr) == PLUS) @@ -6785,8 +6818,11 @@ rs6000_adjust_vec_address (rtx scalar_re HOST_WIDE_INT offset = INTVAL (op1) + INTVAL (element_offset); rtx offset_rtx = GEN_INT (offset); - if (IN_RANGE (offset, -32768, 32767) - && (scalar_size < 8 || (offset & 0x3) == 0)) + if (TARGET_PREFIXED_ADDR && SIGNED_34BIT_OFFSET_P (offset)) + new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx); + + else if (SIGNED_16BIT_OFFSET_P (offset) + && (scalar_size < 8 || (offset & 0x3) == 0)) new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx); else { @@ -6834,11 +6870,11 @@ rs6000_adjust_vec_address (rtx scalar_re new_addr = gen_rtx_PLUS (Pmode, base_tmp, element_offset); } - /* If we have a PLUS, we need to see whether the particular register class - allows for D-FORM or X-FORM addressing. */ - if (GET_CODE (new_addr) == PLUS) + /* If we have a PLUS or a PC-relative address without the PLUS, we need to + see whether the particular register class allows for D-FORM or X-FORM + addressing. */ + if (GET_CODE (new_addr) == PLUS || pcrel_p) { - rtx op1 = XEXP (new_addr, 1); addr_mask_type addr_mask; unsigned int scalar_regno = reg_or_subregno (scalar_reg); @@ -6855,10 +6891,16 @@ rs6000_adjust_vec_address (rtx scalar_re else gcc_unreachable (); - if (REG_P (op1) || SUBREG_P (op1)) - valid_addr_p = (addr_mask & RELOAD_REG_INDEXED) != 0; - else + if (pcrel_p) valid_addr_p = (addr_mask & RELOAD_REG_OFFSET) != 0; + else + { + rtx op1 = XEXP (new_addr, 1); + if (REG_P (op1) || SUBREG_P (op1)) + valid_addr_p = (addr_mask & RELOAD_REG_INDEXED) != 0; + else + valid_addr_p = (addr_mask & RELOAD_REG_OFFSET) != 0; + } } else if (REG_P (new_addr) || SUBREG_P (new_addr)) Index: gcc/config/rs6000/vsx.md =================================================================== --- gcc/config/rs6000/vsx.md (revision 278173) +++ gcc/config/rs6000/vsx.md (working copy) @@ -3248,9 +3248,10 @@ (define_insn "vsx_vslo_" ;; Variable V2DI/V2DF extract (define_insn_and_split "vsx_extract__var" [(set (match_operand: 0 "gpc_reg_operand" "=v,wa,r") - (unspec: [(match_operand:VSX_D 1 "input_operand" "v,m,m") - (match_operand:DI 2 "gpc_reg_operand" "r,r,r")] - UNSPEC_VSX_EXTRACT)) + (unspec: + [(match_operand:VSX_D 1 "reg_or_non_prefixed_memory" "v,em,em") + (match_operand:DI 2 "gpc_reg_operand" "r,r,r")] + UNSPEC_VSX_EXTRACT)) (clobber (match_scratch:DI 3 "=r,&b,&b")) (clobber (match_scratch:V2DI 4 "=&v,X,X"))] "VECTOR_MEM_VSX_P (mode) && TARGET_DIRECT_MOVE_64BIT" @@ -3318,9 +3319,10 @@ (define_insn_and_split "*vsx_extract_v4s ;; Variable V4SF extract (define_insn_and_split "vsx_extract_v4sf_var" [(set (match_operand:SF 0 "gpc_reg_operand" "=wa,wa,?r") - (unspec:SF [(match_operand:V4SF 1 "input_operand" "v,m,m") - (match_operand:DI 2 "gpc_reg_operand" "r,r,r")] - UNSPEC_VSX_EXTRACT)) + (unspec:SF + [(match_operand:V4SF 1 "reg_or_non_prefixed_memory" "v,em,em") + (match_operand:DI 2 "gpc_reg_operand" "r,r,r")] + UNSPEC_VSX_EXTRACT)) (clobber (match_scratch:DI 3 "=r,&b,&b")) (clobber (match_scratch:V2DI 4 "=&v,X,X"))] "VECTOR_MEM_VSX_P (V4SFmode) && TARGET_DIRECT_MOVE_64BIT" @@ -3681,7 +3683,7 @@ (define_insn_and_split "*vsx_extract__var" [(set (match_operand: 0 "gpc_reg_operand" "=r,r,r") (unspec: - [(match_operand:VSX_EXTRACT_I 1 "input_operand" "v,v,m") + [(match_operand:VSX_EXTRACT_I 1 "reg_or_non_prefixed_memory" "v,v,em") (match_operand:DI 2 "gpc_reg_operand" "r,r,r")] UNSPEC_VSX_EXTRACT)) (clobber (match_scratch:DI 3 "=r,r,&b")) @@ -3701,7 +3703,7 @@ (define_insn_and_split "*vsx_extract_ 0 "gpc_reg_operand" "=r,r,r") (zero_extend: (unspec: - [(match_operand:VSX_EXTRACT_I 1 "input_operand" "v,v,m") + [(match_operand:VSX_EXTRACT_I 1 "reg_or_non_prefixed_memory" "v,v,em") (match_operand:DI 2 "gpc_reg_operand" "r,r,r")] UNSPEC_VSX_EXTRACT))) (clobber (match_scratch:DI 3 "=r,r,&b")) Index: gcc/doc/md.texi =================================================================== --- gcc/doc/md.texi (revision 278173) +++ gcc/doc/md.texi (working copy) @@ -3373,6 +3373,9 @@ asm ("st %1,%0" : "=m<>" (mem) : "r" (va is not. +@item em +A memory operand that does not contain a prefixed address. + @item es A ``stable'' memory operand; that is, one which does not include any automodification of the base register. This used to be useful when -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meissner@linux.ibm.com, phone: +1 (978) 899-4797