From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID:
 <5a2e511a-3a29-4cdd-88b1-12a6274d5a79@linux.ibm.com>
Date: Sun, 24 Mar 2024 00:07:03 +0530
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH v2] rs6000: Stackoverflow in optimized code on PPC [PR100799]
Content-Language: en-US
To: Peter Bergner, Jakub Jelinek, "Kewen.Lin", Segher Boessenkool,
 David Edelsohn, Michael Meissner, gcc-patches
References: <8e8dad73-43a6-4764-a496-b600e6a220e1@linux.ibm.com>
 <76307976-77b0-48f0-90b9-6dcec02e3c8f@linux.ibm.com>
 <6a592b9f-d536-4a0e-aa00-ee8d4a778afc@linux.ibm.com>
From: Ajit Agarwal
In-Reply-To: <6a592b9f-d536-4a0e-aa00-ee8d4a778afc@linux.ibm.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit

On 23/03/24 9:33 pm, Peter Bergner wrote:
> On 3/23/24 4:33 AM, Ajit Agarwal wrote:
>>>> -  else if (align_words < GP_ARG_NUM_REG)
>>>> +  else if (align_words < GP_ARG_NUM_REG
>>>> +           || (cum->hidden_string_length
>>>> +               && cum->actual_parm_length <= GP_ARG_NUM_REG))
>>>      {
>>>        if (TARGET_32BIT && TARGET_POWERPC64)
>>>          return
>>>            rs6000_mixed_function_arg (mode, type, align_words);
>>>
>>>        return gen_rtx_REG (mode, GP_ARG_MIN_REG + align_words);
>>>      }
>>>    else
>>>      return NULL_RTX;
>>>
>>> The old code for the unused hidden parameter (which was the 9th param) would
>>> fall thru to the "return NULL_RTX;" which would make the callee assume there
>>> was a parameter save area allocated.  Now instead, we'll return a reg rtx,
>>> probably of r11 (r3 thru r10 are our param regs) and I'm guessing we'll now
>>> see a copy of r11 into a pseudo like we do for the other param regs.
>>> Is that a problem?  Given it's an unused parameter, it'll probably get deleted
>>> as dead code, but could it cause any issues?  What if we have more than one
>>> unused hidden parameter and we return r12 and r13 which have specific uses
>>> in our ABIs (eg, r13 is our TCB pointer), so it may not actually look dead.
>>> Have you verified what the callee RTL looks like after expand for these
>>> unused hidden parameters?  Is there a rtx we can return that isn't a NULL_RTX
>>> which triggers the assumption of a parameter save area, but isn't a reg rtx
>>> which might lead to some rtl being generated?  Would a (const_int 0) or
>>> something else work?
>>>
>>>
>> For the above use case it will return
>>
>> (reg:DI 5 %r5) and below check entry_parm =
>> (reg:DI 5 %r5) and the following check will not return TRUE and hence
>> parameter save area will not be allocated.
>
> Why r5?!?!  The 8th (integer) param would return r10, so I'd assume if
> the next param was a hidden param, then it'd get the next gpr, so r11.
> How does it jump back to r5 which may have been used by the 3rd param?
>
>

My mistake, it is r11 only for the hidden param.

>
>
>
>> It will not generate any rtx in the callee rtl code but it just used to
>> check whether to allocate parameter save area or not when number of args > 8.
>>
>>   /* If there is no incoming register, we need a stack.  */
>>   entry_parm = rs6000_function_arg (args_so_far, arg);
>>   if (entry_parm == NULL)
>>     return true;
>>
>>   /* Likewise if we need to pass both in registers and on the stack.  */
>>   if (GET_CODE (entry_parm) == PARALLEL
>>       && XEXP (XVECEXP (entry_parm, 0, 0), 0) == NULL_RTX)
>>     return true;
>
> Yes, this code in rs6000_parm_needs_stack() uses the rs6000_function_arg()
> return value as a boolean to tell us whether a parameter save area is required
> so what we return is unimportant other than to know it's not NULL_RTX.
>
> I'm more concerned about the use of the target hook targetm.calls.function_arg
> used in the generic parts of the compiler.  What will that code do differently
> now that we return a reg rtx rather than NULL_RTX?  Might that code use
> the reg rtx to emit something?  I'd feel better if you could verify what
> happens in that code when we return a reg rtx for that 9th hidden param which
> isn't really being passed in a register.
>

As per my understanding, and from debugging the OpenBLAS test case, the
reg rtx returned is used inside the IF condition quoted above only to
check whether a parameter save area is needed or not.

In the generic code, where targetm.calls.function_arg is called in
calls.cc, the returned rtx is used for the PARALLEL case: if an argument
has to be passed both in registers and on the stack, the generic code
emits a store with respect to the returned rtx.  If we identify that the
argument needs only registers, it emits nothing.

Thanks & Regards
Ajit
>
> Peter
>
>
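
For reference, the point about the return value being used only as a
boolean can be sketched in plain C.  This is a toy model and not the
rs6000 code: the toy_* names, the integer register numbers standing in
for rtx values, and the "patched" flag are all hypothetical.  The field
names hidden_string_length and actual_parm_length are taken from the
quoted patch, but their exact semantics here are assumed.

```c
#include <assert.h>
#include <stdbool.h>

/* As in the thread: r3 thru r10 are the eight GPR parameter registers.  */
enum { GP_ARG_MIN_REG = 3, GP_ARG_NUM_REG = 8 };

/* Hypothetical stand-in for NULL_RTX ("no incoming register").  */
enum { TOY_NO_REG = -1 };

/* Hypothetical, simplified argument-cursor state; the field names come
   from the quoted patch, their meaning here is an assumption.  */
struct toy_cum
{
  bool hidden_string_length;
  int actual_parm_length;
};

/* Toy version of the condition in the patch.  Returns a register number,
   or TOY_NO_REG where the real code would return NULL_RTX.  */
static int
toy_function_arg (const struct toy_cum *cum, int align_words, bool patched)
{
  assert (cum && align_words >= 0);

  if (align_words < GP_ARG_NUM_REG
      || (patched
          && cum->hidden_string_length
          && cum->actual_parm_length <= GP_ARG_NUM_REG))
    return GP_ARG_MIN_REG + align_words;

  return TOY_NO_REG;
}

/* Toy version of the first check in rs6000_parm_needs_stack: only whether
   a register came back matters, not which register it is.  */
static bool
toy_parm_needs_stack (const struct toy_cum *cum, int align_words,
                      bool patched)
{
  return toy_function_arg (cum, align_words, patched) == TOY_NO_REG;
}
```

In this model, with eight visible parameters and one hidden length
parameter (align_words == 8), the unpatched condition yields no register
and the stack check is true, while the patched condition yields r11 and
the stack check is false.  The model says nothing about what the generic
code in calls.cc does with the returned value.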