From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <7f9a1d2e-84af-95a5-76b9-bb50a458842c@linux.ibm.com>
Date: Fri, 14 Apr 2023 14:31:06 +0800
Subject: Re: [PATCH] VECT: Add WHILE_LEN pattern for decrement IV support for auto-vectorization
To: "juzhe.zhong@rivai.ai"
Cc: gcc-patches , jeffreyalaw , rdapp , "richard.sandiford" , rguenther
References: <20230407014741.139387-1-juzhe.zhong@rivai.ai>
 <63723855B0BF2130+2023041120125573846623@rivai.ai>
 <139DA38AFC9CA5B5+2023041216004591287739@rivai.ai>
 <8D9731A1540E082A+202304122122129793085@rivai.ai>
 <0fb64ee1-1bf6-2a5f-4214-7567af5facbd@linux.ibm.com>
 <9D802FBA3E3A18CE+2023041321440689154856@rivai.ai>
 <2E7B1DB75F2F78AE+2023041411394350100020@rivai.ai>
From: "Kewen.Lin"
In-Reply-To: <2E7B1DB75F2F78AE+2023041411394350100020@rivai.ai>

Hi Juzhe,

on 2023/4/14 11:39, juzhe.zhong@rivai.ai wrote:
> And also I already decided to remove the WHILE_LEN pattern since it seems to be unnecessary.
> And as Richard said, it's just simple arithmetic and it's not worthwhile to do that.
>
> So, I plan to replace WHILE_LEN with MIN_EXPR and do everything RVV specific in the RISC-V port.

Yeah, MIN_EXPR is enough for IBM ports, but given the special semantics
of vsetvli, "vl = ceil(AVL / 2) for VLMAX < AVL < 2*VLMAX", I'm not sure
it's a good idea for RV, since it seems to put the burden on the RV
backend. Take one case where the iteration count is known: on an
environment with the above capability, using the vector setting as in
[1], and assuming the given iteration count is 10 and the loop is fully
unrolled, with MIN_EXPR the lengths for the two iterations would be
folded into 8/2, while with WHILE_LEN artificial folding can make the
lengths 5/5.
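
To make the 8/2 versus 5/5 point concrete, here is a minimal sketch in C of the two length sequences for a decrementing IV. It makes several assumptions: VLMAX = 8 is only inferred from the 8/2 split, the names len_min, len_balanced and collect_lengths are hypothetical (they are not GCC or RISC-V interfaces), and len_balanced merely models the "vl = ceil(AVL / 2)" rule as quoted, not what any real hardware or backend necessarily does.

```c
#include <assert.h>
#include <stddef.h>

/* Length chosen by a plain MIN_EXPR: take as many elements as fit. */
static size_t len_min(size_t remaining, size_t vlmax)
{
    return remaining < vlmax ? remaining : vlmax;
}

/* Hypothetical model of the quoted balancing rule:
   vl = ceil(AVL / 2) when VLMAX < AVL < 2*VLMAX, otherwise MIN. */
static size_t len_balanced(size_t remaining, size_t vlmax)
{
    if (remaining > vlmax && remaining < 2 * vlmax)
        return (remaining + 1) / 2;   /* ceil(remaining / 2) */
    return len_min(remaining, vlmax);
}

/* Run a decrementing-IV loop over n elements and record the length
   used by each iteration.  Returns the number of lengths stored. */
static size_t collect_lengths(size_t n, size_t vlmax,
                              size_t (*pick)(size_t, size_t),
                              size_t *lens, size_t cap)
{
    size_t count = 0;
    assert(vlmax > 0);
    while (n > 0 && count < cap) {
        size_t vl = pick(n, vlmax);
        lens[count++] = vl;
        n -= vl;                      /* the IV counts down to zero */
    }
    return count;
}
```

Calling collect_lengths(10, 8, len_min, ...) records the lengths 8 and 2, while collect_lengths(10, 8, len_balanced, ...) records 5 and 5. Both variants finish in two iterations; only the split between them differs.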
I assume that on an environment with the above capability 5/5 is better
than 8/2? That means if we use MIN, the RV backend has to try to turn
8/2 into 5/5. Or is that trivial, since the RV backend already supports,
or plans to support, this kind of vsetvli load re-balancing?

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615634.html

> I think it's more reasonable for IBM use and more target use in the future.
>

If RV needs WHILE_LEN, IMHO they can co-exist, like: for ports defining
len_{load,store} but no while_len, use MIN; for ports defining
while_len, use WHILE_LEN.

> So, this patch will need to be changed to "introduce a new flow to do vectorization loop control", which is a new loop control flow
> with saturating subtraction of n down to zero, and add a target hook for it so that we can switch to this flow?

Yes, if you don't need WHILE_LEN, this proposal is more of an
enhancement to the current partial vectorization with length (mainly in
length preparation and loop control). But why would we need a new target
hook? Do you want to keep the existing length handling in
vect_set_loop_controls_directly unchanged? That doesn't seem necessary.

IIUC, not requiring WHILE_LEN also means that this patch doesn't
necessarily block the other RV backend patches on vector-with-length
exploitation, since the existing vector-with-length support already
works functionally.

BR,
Kewen