Date: Fri, 15 Mar 2024 14:12:44 +0100
From: Stefan Schulze Frielinghaus
To: Andreas Krebbel, gcc-patches@gcc.gnu.org
Subject: Re: RFC: New mechanism for hard reg operands to inline asm

On Fri, Jun 04, 2021 at 06:02:27PM +0000, Andreas Krebbel via Gcc wrote:
> Hi,
>
> I wonder if we could replace the register asm construct for
> inline assemblies with something a bit nicer and more obvious.
> E.g.
> turning this (real world example from IBM Z kernel code):
>
> int diag8_response(int cmdlen, char *response, int *rlen)
> {
>   register unsigned long reg2 asm ("2") = (addr_t) cpcmd_buf;
>   register unsigned long reg3 asm ("3") = (addr_t) response;
>   register unsigned long reg4 asm ("4") = cmdlen | 0x40000000L;
>   register unsigned long reg5 asm ("5") = *rlen; /* <-- */
>   asm volatile(
>     " diag %2,%0,0x8\n"
>     " brc 8,1f\n"
>     " agr %1,%4\n"
>     "1:\n"
>     : "+d" (reg4), "+d" (reg5)
>     : "d" (reg2), "d" (reg3), "d" (*rlen): "cc");
>   *rlen = reg5;
>   return reg4;
> }
>
> into this:
>
> int diag8_response(int cmdlen, char *response, int *rlen)
> {
>   unsigned long len = cmdlen | 0x40000000L;
>
>   asm volatile(
>     " diag %2,%0,0x8\n"
>     " brc 8,1f\n"
>     " agr %1,%4\n"
>     "1:\n"
>     : "+{r4}" (len), "+{r5}" (*rlen)
>     : "{r2}" ((addr_t)cpcmd_buf), "{r3}" ((addr_t)response), "d" (*rlen): "cc");
>   return len;
> }
>
> Apart from being much easier to read because the hard regs become part
> of the inline assembly it solves also a couple of other issues:
>
> - function calls might clobber register asm variables see BZ100908
> - the constraints for the register asm operands are superfluous
> - one register asm variable cannot be used for 2 different inline
>   assemblies if the value is expected in different hard regs
>
> I've started with a hackish implementation for IBM Z using the
> TARGET_MD_ASM_ADJUST hook and let all the places parsing constraints
> skip over the {} parts. But perhaps it would be useful to make this a
> generic mechanism for all targets?!
>
> Andreas

Hi all,

I would like to resurrect this topic

https://gcc.gnu.org/pipermail/gcc/2021-June/236269.html

and have come up with a first implementation in order to discuss this
further. Basically, I see two ways to implement this. The first is to
let LRA assign the registers; the second is to introduce extra moves
just before/after asm statements.
Currently I went for the latter and emit extra moves during expand into
the hard regs specified by the input/output constraints. Before going
forward I would like to get some feedback on whether this approach makes
sense to you at all or whether you see any show stoppers. I was
wondering whether my current approach is robust enough in the sense that
no other pass could potentially remove the extra moves I introduced. In
particular I was first worried about code motion. Initially I thought I
had to make use not only of hard regs but of hard regs which are flagged
as register-asms in order to prevent optimizations from fiddling around
with those moves. However, after some more investigation I tend to
conclude that this is not necessary. Any thoughts about this approach?

With the current approach I can at least handle cases like:

int __attribute__ ((noipa)) foo (int x) { return x; }
int test (int x)
{
  asm ("foo %0,%1\n" :: "{r3}" (foo (x + 1)), "{r2}" (x));
  return x;
}

Note, this is written with the s390 ABI in mind where the first int
argument and return value are passed in register r2. The point here is
that r2 needs to be altered and restored multiple times until we reach
the closing } of function test(). Luckily, during expand we get all this
basically for free.

This brings me to the general question of what should be allowed and
what not. Evaluation order of input expressions is probably unspecified,
similar to function arguments. However, what about this one:

int test (int x)
{
  register int y asm ("r5") = x + 1;
  asm ("foo %0,%1\n" : "={r4}" (y) : "{r1}" (y));
  return y;
}

IMHO the input is just fine but the output constraint is misleading and
it is not obvious in which register variable y resides after the asm
statement. With my current implementation, where I don't bail out, it is
register r4, contrary to the decl.
Interestingly, the other way around, where one register is "aliased" by
multiple variables, is accepted by vanilla GCC:

int foo (int x, int y)
{
  register int a asm ("r1") = x;
  register int b asm ("r1") = y;
  return a + b;
}

Though, probably not intentionally.

Cheers,
Stefan
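
P.S. For comparison, here is a minimal sketch of how the foo() case
above has to be spelled today with register asm variables. It is not
taken from the patch. As assumptions made purely for illustration,
x86-64 registers (rdi/rsi) and an lea instruction stand in for the s390
registers and the placeholder "foo" mnemonic, the name test_today is
made up, and other targets fall back to plain C. The call is evaluated
into a temporary first, because a call placed between the register
assignments and the asm may clobber a register asm variable (the
BZ100908 item quoted above).

```c
/* Sketch only: today's register-asm spelling of the foo() example.
   x86-64 registers and "lea" are illustrative stand-ins (assumption),
   and test_today is a hypothetical name.  __asm__ is used instead of
   asm so the file also builds under strict -std=c11.  */

static long __attribute__ ((noinline)) foo (long x) { return x; }

long test_today (long x)
{
  /* Evaluate the call before pinning anything: the call itself uses
     rdi for its argument and would overwrite a value placed there.  */
  long t = foo (x + 1);

#if defined (__GNUC__) && defined (__x86_64__)
  /* Pin the operands only now, directly in front of the asm.  */
  register long a __asm__ ("rdi") = x;
  register long b __asm__ ("rsi") = t;
  long r;
  /* r = a + b, computed from the two pinned hard registers.  */
  __asm__ ("lea (%1,%2), %0" : "=r" (r) : "r" (a), "r" (b));
  return r;
#else
  /* Fallback for other targets: same result without inline asm.  */
  return x + t;
#endif
}
```

With hard register constraints, the temporary and both register
declarations would fold into the operand list, and the ordering would be
taken care of by the moves emitted during expand.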