From: "Kewen.Lin"
To: Segher Boessenkool
Cc: Rainer Orth, Mike Stump, David Edelsohn, Kewen Lin, gcc-patches@gcc.gnu.org, Alexandre Oliva, Vladimir Makarov
Subject: Re: [PATCH] [testsuite] [powerpc] adjust -m32 counts for fold-vec-extract*
Date: Wed, 31 May 2023 17:00:05 +0800
Message-ID: <64d67b5d-6ab5-79bf-0375-49e426b7559e@linux.ibm.com>
In-Reply-To: <20230525112200.GJ19790@gate.crashing.org>
References: <0737fbfc-726c-ffca-5f36-d6b3f0decfec@linux.ibm.com> <20230525112200.GJ19790@gate.crashing.org>

Hi Segher,

on 2023/5/25 19:22, Segher Boessenkool wrote:
> Hi!
>
> On Thu, May 25, 2023 at 07:05:55AM -0300, Alexandre Oliva wrote:
>> On May 25, 2023, "Kewen.Lin" wrote:
>>> So both lp64 and ilp32 have the same count, could we merge it and
>>> remove the selectors?
>>
>> We could, but... I thought I wouldn't, since they were different
>> before, and they're likely to diverge again in the future. I thought
>> that combining them might suggest that they ought to be the same, when
>> we already know that this is not the case.
>>
>> I'll prepare an alternate patch that combines them.
>
> Fwiw, updating the insn counts blindly like this has very small value on
> the one hand, and negative value on the other. In total, negative
> value.
>
> If it is not possible to keep these tests up-to-date easily the test
> should be improved. If tests regressed otoh we should ***not*** paper
> over that with patches like this, but investigate what happened instead:
> such regressions are *real*.
>
> So which is it here? I am assuming it is a not-to-well written testcase
> without all the necessary noipa attrs, and/or putting more than one
> thing to test per function directly.
> Insn counts then shift easily if
> the compiler decides to factor (CSE etc.) your code differently, but
> that is a testcase artifact then, not something we want to adjust counts
> for all of the time.
>
> It is feasible to do these insn count things only for trivial tiny
> snippets. Everything bigger will regress all of the time, no one will
> look at it properly, and instead people will just do blind "update
> counts" patches like this :-/ *Good* insn count tests are quite
> valuable, but harder to write. But maintenance costs noticably bigger
> than zero for a testcase are not good, how many testcases do we run in
> the testsuite?
>
> So, can we fix the underlying problem here please?

Thanks for all the comments and the good question.

I looked into this issue and found that the current 32-bit counts are
mainly for AIX (no updates are needed there). The generated assembly
differs between 32-bit AIX and 32-bit Linux, and the difference seems to
be related to whether the compiler saves the frame pointer or not.

Take the function testbc_var from fold-vec-extract-char.p7.c as an
example:

  #include <altivec.h>

  unsigned char
  testbc_var (vector bool char vbc2, signed int si)
  {
    return vec_extract (vbc2, si);
  }

1) On 32-bit AIX, with trunk:

  .testbc_var:
      li 9,32
      addi 10,1,-64
      stxvw4x 34,10,9
      rlwinm 3,3,0,28,31
      addi 9,3,-64      // these two lines
      add 3,9,1         // can be combined, see below.
      lbz 3,32(3)
      blr

with old gcc (without r11-6615):

  .testbc_var:
      addi 10,1,-64
      li 9,32
      stxvw4x 34,10,9
      rlwinm 3,3,0,28,31
      add 3,10,3        // better
      lbz 3,32(3)
      blr

Apparently an extra, unnecessary addi is created. The test case expects
one addi to adjust the stack for temporary space and one add to prepare
the index for the extracted byte.

2) The same thing happens on 64-bit AIX and 64-bit Linux:

trunk:

  .testbc_var:
      li 9,48
      addi 10,1,-64
      stxvw4x 34,10,9
      rldicl 5,5,0,60
      addi 9,5,-64      // similar to aix 32-bit
      add 5,9,1         // ....
      lbz 3,48(5)
      blr

vs. optimized:

  .testbc_var:
      addi 10,1,-64
      li 9,48
      stxvw4x 34,10,9
      rldicl 5,5,0,60
      add 5,10,5        // better
      lbz 3,48(5)
      blr

3) But for 32-bit Linux, trunk and old gcc (without r11-6615) generate
the same code:

  testbc_var:
      stwu 1,-48(1)
      li 9,16
      rlwinm 3,3,0,28,31
      stxvw4x 34,1,9
      add 3,1,3
      lbz 3,16(3)
      addi 1,1,48
      blr

So the expected count that was adjusted for 32-bit AIX broke 32-bit
Linux.

As shown above, the behavior change (one more addi) on 64-bit and on
32-bit AIX produces worse code than before, but we previously just
updated the counts. I have therefore changed the component of PR101169
to rtl-optimization for further investigation.

Per 3), what Alexandre proposed as the fix for 32-bit Linux actually
restores the expected count to what it was before. :)

BR,
Kewen
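
P.S. For anyone who wants to re-check the per-function addi/add counts
quoted above, here is a minimal Python sketch. It is only an
illustration and not how the testsuite counts: scan-assembler-times
matches Tcl regexps against the whole assembly file, whereas
count_insn below is a hypothetical helper that simply compares the
first token of each line. The strings are the testbc_var sequences
from this mail with labels and comments removed.

```python
# Instruction sequences for testbc_var copied from the listings in this
# mail (labels and trailing comments dropped).
AIX32_TRUNK = """
li 9,32
addi 10,1,-64
stxvw4x 34,10,9
rlwinm 3,3,0,28,31
addi 9,3,-64
add 3,9,1
lbz 3,32(3)
blr
"""

AIX32_OLD = """
addi 10,1,-64
li 9,32
stxvw4x 34,10,9
rlwinm 3,3,0,28,31
add 3,10,3
lbz 3,32(3)
blr
"""

LINUX32 = """
stwu 1,-48(1)
li 9,16
rlwinm 3,3,0,28,31
stxvw4x 34,1,9
add 3,1,3
lbz 3,16(3)
addi 1,1,48
blr
"""


def count_insn(asm, mnemonic):
    """Count lines whose first token is exactly `mnemonic`.

    Hypothetical helper for illustration only; comparing whole tokens
    keeps "add" from also matching "addi".
    """
    count = 0
    for line in asm.splitlines():
        tokens = line.split()
        if tokens and tokens[0] == mnemonic:
            count += 1
    return count


if __name__ == "__main__":
    for name, asm in [("aix32 trunk", AIX32_TRUNK),
                      ("aix32 old", AIX32_OLD),
                      ("linux32", LINUX32)]:
        print(name, "addi:", count_insn(asm, "addi"),
              "add:", count_insn(asm, "add"))
```

For this one function, trunk on 32-bit AIX has two addi where old gcc
and 32-bit Linux each have one, and add occurs once in all three. The
counts in the real test are totals over every function in the file, so
these numbers are not the values used in the dg-final directives.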