Date: Wed, 6 Dec 2023 10:22:57 +0800
From: "Kewen.Lin"
To: Ajit Agarwal
Cc: Segher Boessenkool, David Edelsohn, Peter Bergner, Michael Meissner, GCC Patches
Subject: Re: [PATCH v2] rs6000: Add new pass for replacement of contiguous addresses vector load lxv with lxvp
References: <77426697-1571-e180-add9-cfb6d10f8424@linux.ibm.com> <57d3fbcb-98b6-4658-8d08-e30f8c68a18c@linux.ibm.com> <11198028-5b04-4ebd-9374-a78dc85376a8@linux.ibm.com>
In-Reply-To: <11198028-5b04-4ebd-9374-a78dc85376a8@linux.ibm.com>

on 2023/12/6 02:01, Ajit Agarwal wrote:
> Hello Kewen:
>
>
> On 05/12/23 7:13 pm, Ajit Agarwal wrote:
>> Hello Kewen:
>>
>> On 04/12/23 7:31 am, Kewen.Lin wrote:
>>> Hi Ajit,
>>>
>>> on 2023/12/1 17:10, Ajit Agarwal wrote:
>>>> Hello Kewen:
>>>>
>>>> On 24/11/23 3:01 pm, Kewen.Lin wrote:
>>>>> Hi Ajit,
>>>>>
>>>>> Don't forget to CC David (CC-ed) :), some comments are inlined below.
>>>>>
>>>>> on 2023/10/8 03:04, Ajit Agarwal wrote:
>>>>>> Hello All:
>>>>>>
>>>>>> This patch adds a new pass to replace contiguous addresses vector load lxv with mma instruction
>>>>>> lxvp.
>>>>>
>>>>> IMHO the current binding of lxvp (and lxvpx, stxvp{x,}) to MMA looks wrong, it's only
>>>>> Power10 and VSX required, these instructions should perform well without MMA support.
>>>>> So one patch to separate their support from MMA seems to go first.
>>>>>
>>>>
>>>> I will make the changes for Power10 and VSX.
>>>>
>>>>>> This patch addresses one regression failure in ARM architecture.
>>>>>
>>>>> Could you explain this? I don't see any test case for this.
>>>>
>>>> I have submitted v1 of the patch and there was a regression failure for Linaro.
>>>> I have fixed it in version V2.
>>>
>>> OK, thanks for clarifying. So some unexpected changes on generic code in v1
>>> caused the failure exposed on arm.
>>>
>>>>
>>>>
>>>>> Besides, it seems a bad idea to put this pass after reload? As register allocation
>>>>> finishes, this pairing has to be restricted by the reg No. (I didn't see any
>>>>> checking on the reg No. relationship for pairing btw.)
>>>>>
>>>>
>>>> Adding before reload pass deletes one of the lxv and replaced with lxvp. This
>>>> fails in reload pass while freeing reg_eqivs as ira populates them and then
>>>
>>> I can't find reg_eqivs, I guessed you meant reg_equivs and moved this pass right before
>>> pass_reload (between pass_ira and pass_reload)? IMHO it's unexpected as those two passes
>>> are closely correlated. I was expecting to put it somewhere before ira.
>>
>> Yes they are tied together and moving before reload will not work.
>>
>>>
>>>> vecload pass deletes some of insns and while freeing in reload pass as insn
>>>> is already deleted in vecload pass reload pass segfaults.
>>>>
>>>> Moving vecload pass before ira will not make register pairs with lxvp and
>>>> in ira and that will be a problem.
>>>
>>> Could you elaborate on the obstacle for moving such pass before pass_ira?
>>>
>>> Based on the status quo, the lxvp is bundled with OOmode, then I'd expect
>>> we can generate OOmode move (load) and use the components with unspec (or
>>> subreg with Peter's patch) to replace all the previous use places, it looks
>>> doable to me.
>>
>> Moving before ira passes, we delete the offset lxv and generate lxvp and replace all
>> the uses, that I am doing. But the offset lxvp register generated by ira are not
>> register pair and generate random register and hence we cannot generate lxvp.
>>
>> For example one lxv is generated with register 32 and other pair is generated
>> with register 45 by ira if we move it before ira passes.
>
> It generates the following.
> lxvp %vs32,0(%r4)
> xvf32ger 0,%vs34,%vs32
> xvf32gerpp 0,%vs34,%vs45

What do the RTL insns for these insns look like?

I'd expect you to use UNSPEC_MMA_EXTRACT to extract V16QI from the result of lxvp;
the current define_insn_and_split "*vsx_disassemble_pair" should be able to take
care of it further (e.g. reg and regoff).

BR,
Kewen

> xxmfacc 0
> stxvp %vs2,0(%r3)
> stxvp %vs0,32(%r3)
> blr
>
>
> Instead of vs33 ira generates vs45 if we move before pass_ira.
>
> Thanks & Regards
> Ajit
>
>
>> Thanks & Regards
>> Ajit
>>>
>>
>>>>
>>>> Making after reload pass is the only solution I see as ira and reload pass
>>>> makes register pairs and vecload pass will be easier with generation of
>>>> lxvp.
>>>>
>>>> Please suggest.
>>>>
>>>>> Looking forward to the comments from Segher/David/Peter/Mike etc.
>>>
>>> Still looking forward. :)
>>>
>>> BR,
>>> Kewen
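
To make the register-number restriction raised in this thread concrete, here is a minimal Python sketch (a model only, not GCC code) of the check a post-reload pass would need before replacing two lxv loads with one lxvp. The VectorLoad class and all function names are hypothetical. The model assumes only what the thread states, plus the ISA rule that lxvp targets an even/odd pair of consecutive VSX registers; it ignores endianness and which half of the pair receives the lower address.

```python
from dataclasses import dataclass

VECTOR_BYTES = 16  # one lxv loads a single 16-byte vector


@dataclass(frozen=True)
class VectorLoad:
    """Hypothetical model of one lxv after register allocation."""
    dest_vsr: int  # destination VSX register number, 0..63
    base_gpr: int  # general-purpose register holding the base address
    offset: int    # byte displacement from the base


def addresses_contiguous(a: VectorLoad, b: VectorLoad) -> bool:
    """True if both loads use the same base and sit exactly 16 bytes apart."""
    return a.base_gpr == b.base_gpr and abs(a.offset - b.offset) == VECTOR_BYTES


def registers_form_pair(a: VectorLoad, b: VectorLoad) -> bool:
    """True if the destinations are an even register and its odd successor."""
    lo, hi = sorted((a.dest_vsr, b.dest_vsr))
    return lo % 2 == 0 and hi == lo + 1


def can_fuse_to_lxvp(a: VectorLoad, b: VectorLoad) -> bool:
    """Both conditions must hold for the two loads to become one lxvp."""
    return addresses_contiguous(a, b) and registers_form_pair(a, b)


if __name__ == "__main__":
    # Register numbers taken from the example in the thread.
    wanted = (VectorLoad(32, 4, 0), VectorLoad(33, 4, 16))
    got = (VectorLoad(32, 4, 0), VectorLoad(45, 4, 16))
    print("vs32/vs33:", can_fuse_to_lxvp(*wanted))
    print("vs32/vs45:", can_fuse_to_lxvp(*got))
```

With the register numbers from the thread, vs32/vs33 passes the check and vs32/vs45 does not, even though the addresses are contiguous in both cases. A pass that runs before IRA and loads into a single OOmode pseudo, as suggested above, would presumably leave the pairing to the register allocator rather than verify it afterwards.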