From: Ajit Agarwal
Date: Mon, 3 Jun 2024 21:28:22 +0530
Subject: Re: [Patch, rs6000, aarch64, middle-end] Add implementation for different targets for pair mem fusion
To: Alex Coplan, "Kewen.Lin", Segher Boessenkool, Michael Meissner, Peter Bergner, David Edelsohn, gcc-patches, richard.sandiford@arm.com
Message-ID: <6a87ceb9-0de2-49f2-8998-17391c213c7d@linux.ibm.com>
References: <53ba46de-6c01-4c68-bd98-1ba6950a793a@linux.ibm.com>
 <95a33b0a-8090-4218-a62c-da1f53bebbb7@linux.ibm.com>
 <9efb06e2-74f1-42f1-8a52-931d13a57ebc@linux.ibm.com>
 <957bd4b7-11dd-4ebd-adf1-1c0815884944@linux.ibm.com>
 <99a53e50-ed04-4bd8-baa5-f13d5376585a@linux.ibm.com>

Hello Richard:

On 03/06/24 8:24 pm, Richard Sandiford wrote:
> Ajit Agarwal writes:
>> Hello Richard:
>>
>> On 03/06/24 7:47 pm, Richard Sandiford wrote:
>>> Ajit Agarwal writes:
>>>> On 03/06/24 5:03 pm, Richard Sandiford wrote:
>>>>> Ajit Agarwal writes:
>>>>>>> [...]
>>>>>>> If it is intentional, what distinguishes things like vperm and xxinsertw
>>>>>>> (and all other unspecs) from plain addition?
>>>>>>>
>>>>>>>   [(set (match_operand:VSX_F 0 "vsx_register_operand" "=wa")
>>>>>>>         (plus:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "wa")
>>>>>>>                     (match_operand:VSX_F 2 "vsx_register_operand" "wa")))]
>>>>>>>
>>>>>>
>>>>>> Plain addition is not supported currently.
>>>>>> We have not seen many cases with plain addition and this patch
>>>>>> will not accept plain addition.
>>>>>>
>>>>>>
>>>>>>> This is why the intention behind the patch is important.  As it stands,
>>>>>>> it isn't clear what criteria the patch is using to distinguish "valid"
>>>>>>> fuse candidates from "invalid" ones.
>>>>>>>
>>>>>>
>>>>>> The intention behind this patch is that all variants of UNSPEC
>>>>>> instructions are supported, and uses without UNSPEC are not supported.
>>>>>
>>>>> But why make the distinction this way though?  UNSPEC is a very
>>>>> GCC-specific concept.  Whether something is an UNSPEC or some other
>>>>> RTL code depends largely on historical accident.  E.g. we have specific
>>>>> codes for VEC_SELECT, VEC_MERGE, and VEC_DUPLICATE, but don't have one
>>>>> for VEC_PERM (even though VEC_PERM_EXPR exists in gimple).
>>>>>
>>>>> It seems unlikely that GCC's choice about whether to represent something
>>>>> as an UNSPEC or as another RTL code lines up neatly with the kind of
>>>>> codegen decisions that a good assembly programmer would make.
>>>>>
>>>>> I suppose another way of asking is to turn this around and say: what
>>>>> kind of uses are you trying to exclude?  Presumably things are worse
>>>>> if you remove this function override.  But what makes them worse?
>>>>> What kind of uses cause the regression?
>>>>>
>>>>
>>>> Uses of the fused load, where uses of the load with the low address
>>>> are replaced by uses of the load with the high address.
>>>>
>>>> Similarly, uses of the load with the high address are replaced by
>>>> uses of the load with the low address.
>>>
>>> It sounds like something is going wrong with the subreg updates.
>>> Can you give an example of where this occurs?  For instance...
>>>
>>>> This is the semantics of lxvp instructions, which can occur through
>>>> UNSPEC uses; otherwise it breaks the functionality, and we have seen
>>>> failures in almost all vect regressions and SPEC benchmarks.
>>>
>>> ...could you take one of the simpler vect regressions, show the before
>>> and after RTL, and why the transformation is wrong?
>>
>> Before the change:
>>
>> (insn 32 30 103 5 (set (reg:V16QI 127 [ _32 ])
>>         (mem:V16QI (reg:DI 130 [ ivtmp.37 ]) [1 MEM [(short unsigned int *)_55]+0 S16 A128])) {vsx_movv16qi_64bit}
>>      (nil))
>> (insn 103 32 135 5 (set (reg:V16QI 173 [ _32 ])
>>         (mem:V16QI (plus:DI (reg:DI 130 [ ivtmp.37 ])
>>                 (const_int 16 [0x10])) [1 MEM [(short unsigned int *)_55]+0 S16 A128])) {vsx_movv16qi_64bit}
>>      (nil))
>> (insn 135 103 34 5 (set (reg:DI 155)
>>         (plus:DI (reg:DI 130 [ ivtmp.37 ])
>>             (const_int 16 [0x10]))) 66 {*adddi3}
>>      (nil))
>> (insn 34 135 104 5 (set (reg:V16QI 143 [ _27 ])
>>         (unspec:V16QI [
>>                 (reg:V16QI 127 [ _32 ]) repeated x2
>>                 (reg:V16QI 152)
>>             ] UNSPEC_VPERM)) {altivec_vperm_v16qi_direct}
>>      (expr_list:REG_DEAD (reg:V16QI 127 [ _32 ])
>>         (nil)))
>> (insn 104 34 35 5 (set (reg:V16QI 174 [ _27 ])
>>         (unspec:V16QI [
>>                 (reg:V16QI 173 [ _32 ]) repeated x2
>>                 (reg:V16QI 152)
>>             ] UNSPEC_VPERM))
>>      {altivec_vperm_v16qi_direct}
>>
>>
>> After the change:
>>
>> (insn 103 30 135 5 (set (reg:OO 127 [ _32 ])
>>         (mem:OO (reg:DI 130 [ ivtmp.37 ]) [1 MEM [(short unsigned int *)_55]+0 S16 A128])) {*movoo}
>>      (nil))
>> (insn 135 103 34 5 (set (reg:DI 155)
>>         (plus:DI (reg:DI 130 [ ivtmp.37 ])
>>             (const_int 16 [0x10]))) 66 {*adddi3}
>>      (nil))
>> (insn 34 135 104 5 (set (reg:V16QI 143 [ _27 ])
>>         (unspec:V16QI [
>>                 (subreg:V16QI (reg:OO 127 [ _32 ]) 16)
>>                 (subreg:V16QI (reg:OO 127 [ _32 ]) 16)
>>                 (reg:V16QI 152)
>>             ] UNSPEC_VPERM)) {altivec_vperm_v16qi_direct}
>>      (expr_list:REG_DEAD (reg:OO 127 [ _32 ])
>>         (nil)))
>> (insn 104 34 35 5 (set (reg:V16QI 174 [ _27 ])
>>         (unspec:V16QI [
>>                 (subreg:V16QI (reg:OO 127 [ _32 ]) 0)
>>                 (subreg:V16QI (reg:OO 127 [ _32 ]) 0)
>>                 (reg:V16QI 152)
>>             ] UNSPEC_VPERM)) {altivec_vperm_v16qi_direct}
>>
>> After the change the tests pass.
>
> But isn't this an example of the optimisation working on unspecs,
> and working correctly?
>

Yes, this is working fine.

> I meant instead: could you give an example of the vect regressions
> that you saw with the unspec test removed?  You mentioned that many
> vect tests regressed without the unspec test, so it would be helpful
> to see these failures in action.  That is, it'd be helpful to take
> a compiler that doesn't have the unspec tests and show:
>
> - the relevant rtl of one of the failing tests before the pass runs
>   (when the rtl is still correct)
>
> - the relevant rtl of one of the failing tests after the pass runs
>   (when the rtl is now incorrect)
>
> - the reason why the rtl after the pass is wrong
>

I meant to say that this is the semantics of lxvp instructions, which
can occur through UNSPEC uses. If we don't use the above semantics in
UNSPEC, the vect regressions and SPEC fail functionally.

I will find a test without UNSPEC and let you know.

Thanks & Regards
Ajit

> Thanks,
> Richard