From: "Kewen.Lin"
To: "juzhe.zhong@rivai.ai"
Cc: gcc-patches, jeffreyalaw, rdapp, "richard.sandiford", rguenther
Subject: Re: [PATCH] VECT: Add WHILE_LEN pattern for decrement IV support for auto-vectorization
Date: Fri, 14 Apr 2023 13:40:15 +0800
Message-ID: <4894ac03-b31e-c3c7-9030-6052ac312df7@linux.ibm.com>
In-Reply-To: <0B74E8705E17A2A0+2023041411091537265710@rivai.ai>

Hi Juzhe,

>>> Yes, we just wants to add the whole vector register length in bytes.
> OK, I learn it and appreciate you give me the information.
>
>>> I wonder if you also want WHILE_LEN to have the implicit effect
>>> to update vector length register?
>>> From this perspective, Richi's
>>> suggestion on "tieing the scalar result with the uses" looks better IMHO.
> No, I don't want to make WHILE_LEN have implicit side-effect.
> Just tieing the scalar result with the uses.
> Updating vector length register, I let RISC-V backend port to do that.
> I don't want to involve any RISC-V specific feature into GCC middle-end.
>

Good, thanks for clarifying, that makes more sense.
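
To make sure we mean the same thing, here is a minimal scalar sketch of how I read "tying the scalar result with the uses". It is not taken from the patch: while_len and vec_add are hypothetical names, and I'm assuming the result is simply min (remain, vf), although a target might be free to choose a different value. The point is only that the returned length is an ordinary scalar that feeds the length-controlled operations and the decrementing IV, with no hidden side effect.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for WHILE_LEN: the number of elements handled
   in this iteration.  Assumption: the result is min (remain, vf).  */
static size_t
while_len (size_t remain, size_t vf)
{
  assert (vf > 0);  /* A zero VF would never make progress.  */
  return remain < vf ? remain : vf;
}

/* Scalar model of a length-controlled vector add driven by a
   decrementing IV.  Returns the number of "vector" iterations.  */
static size_t
vec_add (int *dst, const int *a, const int *b, size_t n, size_t vf)
{
  size_t iters = 0;
  size_t remain = n;  /* The decrementing IV.  */

  while (remain > 0)
    {
      /* The scalar result is tied to its uses below.  */
      size_t len = while_len (remain, vf);

      /* Models one length-controlled load/add/store of LEN elements.  */
      for (size_t i = 0; i < len; i++)
        dst[i] = a[i] + b[i];

      dst += len;
      a += len;
      b += len;
      remain -= len;
      iters++;
    }
  return iters;
}
```

With n = 10 and vf = 4 this runs three iterations with lengths 4, 4 and 2, so the last iteration needs no separate epilogue.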

>>> No, for both cases, IV is variable, the dumping at loop2_doloop for the proposed sequence says
>>> "Doloop: Possible infinite iteration case.", it seems to show that for the proposed sequence compiler
>>> isn't able to figure out the loop is finite, it may miss the range information on n, or it isn't
>>> able to analyze how the invariant involves, but I didn't look into it, all my guesses.
> Ok, I think it may be fixed in the future.

Yeah, it can be.  It only matters for us when adopting
--param vect-partial-vector-usage=2, which is not the default.

>
> So, I wonder whether you are basically agree with the concept of this patch?
> Would you mind giving more suggestions that I can fix this patch to make more benefits for IBM (s390 or rs6000)?
> For example, will you try this patch to see whether it can work for IBM in case of multiple rgroup of SLP?

The concept looks good to me.  For IBM ports, it can benefit the length
preparation for the case of --param vect-partial-vector-usage=2 (except
for a possibly missed doloop opportunity), and it's neutral for the case
of --param vect-partial-vector-usage=1.

IMHO, if possible you can extend the current function
vect_set_loop_controls_directly rather than adding a new function
vect_set_loop_controls_by_while_len, since the existing function already
handles both masks and lengths (controls).  And as the comments on
vect_gen_len show, once you change the length preparation, you have to
adjust the corresponding costs as well.

And sure, once this becomes stable (all decisions from the discussions
are settled and it gets fully reviewed in stage 1), I'll test it on
Power10 and get back to you.

BR,
Kewen