From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID:
 <5c2be215-e632-4d39-8f9b-0af1c9b34c3f@linux.ibm.com>
Date: Tue, 11 Jun 2024 16:56:17 +0530
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [patch, rs6000, middle-end 0/1] v1: Add implementation for different targets for pair mem fusion
To: Alex Coplan, "Kewen.Lin", Segher Boessenkool, Michael Meissner, Peter Bergner, David Edelsohn, gcc-patches, richard.sandiford@arm.com
References: <115b83b5-db17-4192-9760-d6668a37cf85@linux.ibm.com>
 <3de58984-09a5-4b8c-8d70-b45c9b10f8ee@linux.ibm.com>
 <23b191a0-32de-412c-b1eb-4cf9e465ed83@linux.ibm.com>
 <140fa158-1fa1-4f5e-973e-a8f082d66688@linux.ibm.com>
 <96c0c0be-1fa6-4d49-b61b-aa2eef0ebc5a@linux.ibm.com>
Content-Language: en-US
From: Ajit Agarwal
In-Reply-To:
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
List-Id:

Hello Richard:

On 11/06/24 4:36 pm, Richard Sandiford wrote:
> Ajit Agarwal writes:
>>>>>> After LRA reload:
>>>>>>
>>>>>> (insn 9299 2472 2412 187 (set (reg:V2DF 51 19 [orig:240 vect__302.545 ] [240])
>>>>>>         (mem:V2DF (plus:DI (reg:DI 8 8 [orig:1285 ivtmp.886 ] [1285])
>>>>>>                 (const_int 16 [0x10])) [1 MEM [(real(kind=8) *)_4188]+16 S16 A64])) "shell_lam.fppized.f":238:72 1190 {vsx_movv2df_64bit}
>>>>>>      (nil))
>>>>>> (insn 2412 9299 2477 187 (set (reg:V2DF 51 19 [orig:240 vect__302.545 ] [240])
>>>>>>         (neg:V2DF (fma:V2DF (reg:V2DF 39 7 [ MEM [(real(kind=8) *)_4050]+16 ])
>>>>>>                 (reg:V2DF 44 12 [3119])
>>>>>>                 (neg:V2DF (reg:V2DF 51 19 [orig:240 vect__302.545 ] [240]))))) {*vsx_nfmsv2df4}
>>>>>>      (nil))
>>>>>>
>>>>>> (insn 2473 9311 9312 187 (set (reg:V2DF 38 6 [orig:905 vect__302.545 ] [905])
>>>>>>         (neg:V2DF (fma:V2DF (reg:V2DF 44 12 [3119])
>>>>>>                 (reg:V2DF 38 6 [orig:2561 MEM [(real(kind=8) *)_4050] ] [2561])
>>>>>>                 (neg:V2DF (reg:V2DF 47 15 [5266]))))) {*vsx_nfmsv2df4}
>>>>>>      (nil))
>>>>>>
>>>>>> In the above allocated code it assign registers 51 and 47 and they are not sequential.
>>>>>
>>>>> The reload for 2412 looks valid. What was the original pre-reload
>>>>> version of insn 2473? Also, what happened to insn 2472? Was it deleted?
>>>>>
>>>>
>>>> This is preload version of 2473:
>>>>
>>>> (insn 2473 2396 2478 161 (set (reg:V2DF 905 [ vect__302.545 ])
>>>>         (neg:V2DF (fma:V2DF (reg:V2DF 4283 [3119])
>>>>                 (subreg:V2DF (reg:OO 2561 [ MEM [(real(kind=8) *)_4050] ]) 0)
>>>>                 (neg:V2DF (subreg:V2DF (reg:OO 2572 [ vect__300.543_236 ]) 0))))) {*vsx_nfmsv2df4}
>>>>      (expr_list:REG_DEAD (reg:OO 2572 [ vect__300.543_236 ])
>>>>         (expr_list:REG_DEAD (reg:OO 2561 [ MEM [(real(kind=8) *)_4050] ])
>>>>             (nil))))
>>>>
>>>> insn 2472 is replaced with 9299 after reload.
>>>
>>> You'd have to check the dumps to be sure, but I think 9299 is instead
>>> generated as an input reload of 2412, rather than being a replacement
>>> of insn 2472.
>>
>> Yes it is generated for 2412. The predecessor of 2412 is load from
>> plus offset as in 2412 we have subreg:V2DF (reg OO 2572) 16).
>>
>> This is not correct as we are not generating lxvp and it is
>> normal load lxv.
>> As normal load is generated in predecessor insn of 2412 with
>> plus constant offset it breaks the correctness.
>
> Not using lxvp is a deliberate choice though.
>
> If a (reg:OO R) has been spilled, there's no requirement for LRA
> to load both halves of R when only one half is needed. LRA just
> loads what it needs into whichever registers happen to be free.
>
> If the reload of R instead used lxvp, LRA would be forced to free
> up another register for the other half of R, even though that value
> would never be used.
>

If (reg:OO R) 16 is reloaded after R has been spilled, the value is
loaded from offset 16, when it should be loaded from offset zero.

In the load fusion pass we replace (reg:V2DI R) with
subreg (reg:OO R) 16, so the reloaded value comes from offset 16,
and that is what breaks correctness.

Similarly, we replace (reg:V2DI R) 16 with subreg (reg:OO R) 0, so the
value that should come from offset 16 is instead loaded from offset
zero, which again breaks correctness.

These replacements are the semantics the load fusion pass uses in
order to generate lxvp.

>>> That is, LRA needs to reload (subreg:V2DF (reg:OO 2572) 16)
>>> from memory for insn 2412. It can use the destination of insn 2412 (r51)
>>> as a temporary to do that. It doesn't need to load the other half of
>>> reg:OO 2572 for this instruction. That in itself looks ok.
>>>
>>> So it looks like the problem is specific to insn 2473. Perhaps LRA
>>> thinks that r47 already contains the low half of (reg:OO 2572),
>>> left behind by some previous instruction not shown above?
>>> If LRA is wrong about that -- if r47 doesn't already contain the
>>> low half of (reg:OO 2572) -- then there's a bug somewhere.
>>> But we need to track down and fix the bug rather than try to sidestep
>>> it in the fusion pass.
>>>
>>
>> Similarly for 2473 normal load with 0 offset are generated in predecessor
>> insn as we are generating subreg:V2DF (reg OO 2572) 0 in 2473. As we are not
>> generating lxvp this is not correct and breaks the code.
>
> That too sounds ok, for the reasons above.
>
>> Above code is valid if we are generating lxvp that generates
>> sequential registers, but we are not geneating lxvp and normal
>> load is generated and this breaks the code.
>
> I think you said earlier that the code is miscompiled (fails at
> runtime). If that's due to an RA issue, then presumably there is
> an instruction that, after RA, is reading the wrong value. In other
> words, there's presumably a register input somewhere that has the wrong
> contents. Have you isolated which instruction and register that is?
>

The code compiles and runs successfully, but we get a miscompare error.

> Thanks,
> Richard

Thanks & Regards
Ajit