From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by sourceware.org (Postfix) with ESMTPS id 1A99D3857810 for ; Thu, 19 Nov 2020 20:34:36 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 1A99D3857810 Received: from pps.filterd (m0098419.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 0AJKUfLo012388 for ; Thu, 19 Nov 2020 15:34:35 -0500 Received: from ppma03wdc.us.ibm.com (ba.79.3fa9.ip4.static.sl-reverse.com [169.63.121.186]) by mx0b-001b2d01.pphosted.com with ESMTP id 34wy33sd1s-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Thu, 19 Nov 2020 15:34:35 -0500 Received: from pps.filterd (ppma03wdc.us.ibm.com [127.0.0.1]) by ppma03wdc.us.ibm.com (8.16.0.42/8.16.0.42) with SMTP id 0AJKM3UU012002 for ; Thu, 19 Nov 2020 20:34:35 GMT Received: from b01cxnp22034.gho.pok.ibm.com (b01cxnp22034.gho.pok.ibm.com [9.57.198.24]) by ppma03wdc.us.ibm.com with ESMTP id 34t6v9gvfk-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Thu, 19 Nov 2020 20:34:35 +0000 Received: from b01ledav006.gho.pok.ibm.com (b01ledav006.gho.pok.ibm.com [9.57.199.111]) by b01cxnp22034.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 0AJKYYYY16777802 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 19 Nov 2020 20:34:34 GMT Received: from b01ledav006.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id B1E30AC05F; Thu, 19 Nov 2020 20:34:34 +0000 (GMT) Received: from b01ledav006.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id EB226AC059; Thu, 19 Nov 2020 20:34:33 +0000 (GMT) Received: from [9.160.24.122] (unknown [9.160.24.122]) by b01ledav006.gho.pok.ibm.com (Postfix) with ESMTP; Thu, 19 Nov 2020 20:34:33 +0000 (GMT) Subject: Re: [PATCH 3/4] powerpc: Runtime selection between sc and scv for syscalls To: "Paul A. Clarke" Cc: libc-alpha@sourceware.org, tuliom@linux.ibm.com References: <20201118144703.75569-1-msc@linux.ibm.com> <20201118144703.75569-4-msc@linux.ibm.com> <20201118171219.GA15596@li-24c3614c-2adc-11b2-a85c-85f334518bdb.ibm.com> From: Matheus Castanho Message-ID: Date: Thu, 19 Nov 2020 17:34:32 -0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.4.0 MIME-Version: 1.0 In-Reply-To: <20201118171219.GA15596@li-24c3614c-2adc-11b2-a85c-85f334518bdb.ibm.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-TM-AS-GCONF: 00 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.312, 18.0.737 definitions=2020-11-19_10:2020-11-19, 2020-11-19 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 malwarescore=0 lowpriorityscore=0 phishscore=0 bulkscore=0 mlxlogscore=999 clxscore=1015 impostorscore=0 priorityscore=1501 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2009150000 definitions=main-2011190138 X-Spam-Status: No, score=-11.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 Nov 2020 20:34:37 -0000 On 11/18/20 4:00 PM, Paul A. Clarke wrote: > On Wed, Nov 18, 2020 at 11:47:02AM -0300, Matheus Castanho via Libc-alpha wrote: >> Linux kernel v5.9 added support for system calls using the scv >> instruction for POWER9 and later. The new codepath provides better >> performance (see below) if compared to using sc. For the >> foreseeable future, both sc and scv mechanisms will co-exist, so this >> patch enables glibc to do a runtime check and always use scv when it is >> available. > > nit: "always" is perhaps too strong here, as there are exceptions, as noted > in your message further below. Right. I'll change that on v2. > >> Before issuing the system call to the kernel, we check hwcap2 in the TCB >> for PPC_FEATURE2_SCV to see if scv is supported by the kernel. If not, >> we fallback to sc and keep the old behavior. >> >> The kernel implements a different error return convention for scv, so >> when returning from a system call we need to handle the return value >> differently depending on the instruction we used to enter the kernel. >> >> For syscalls implemented in ASM, entry and exit are implemented by >> different macros (PSEUDO and PSEUDO_RET, resp.), which may be used in >> sequence (e.g. for templated syscalls) or with other instructions in >> between (e.g. clone). To avoid accessing the TCB a second time on >> PSEUDO_RET to check which instruction we used, the value read from >> hwcap2 is cached on a non-volatile register. >> >> This is not needed when using INTERNAL_SYSCALL macro, since entry and >> exit are bundled into the same inline asm directive. >> >> Since system calls may be called before the TCB has been setup (e.g. >> inside the dynamic loader), we also check the value of the thread >> pointer before effectively accessing the TCB. For such situations in >> which the availability of scv cannot be determined, sc is always used. >> >> Support for scv in syscalls implemented in their own ASM file (clone and >> vfork) will be added later. For now simply use sc as before. >> >> Average performance over 1M calls for each syscall "type": >> - stat: C wrapper calling INTERNAL_SYSCALL >> - getpid: templated ASM syscall >> - syscall: call to gettid using syscall function >> >> Standard: >> stat : 1.573445 us / ~3619 cycles >> getpid : 0.164986 us / ~379 cycles >> syscall : 0.162743 us / ~374 cycles >> >> With scv: >> stat : 1.537049 us / ~3535 cycles <~ -84 cycles / -2.32% >> getpid : 0.109923 us / ~253 cycles <~ -126 cycles / -33.25% >> syscall : 0.116410 us / ~268 cycles <~ -106 cycles / -28.34% >> >> Tested on powerpc, powerpc64, powerpc64le (with and without scv) >> --- >> sysdeps/powerpc/powerpc32/sysdep.h | 19 ++-- >> sysdeps/powerpc/powerpc64/sysdep.h | 90 ++++++++++++++++++- >> .../unix/sysv/linux/powerpc/powerpc64/clone.S | 9 +- >> .../unix/sysv/linux/powerpc/powerpc64/vfork.S | 6 +- >> sysdeps/unix/sysv/linux/powerpc/syscall.S | 11 ++- >> sysdeps/unix/sysv/linux/powerpc/sysdep.h | 78 +++++++++++----- >> 6 files changed, 174 insertions(+), 39 deletions(-) >> >> diff --git a/sysdeps/powerpc/powerpc32/sysdep.h b/sysdeps/powerpc/powerpc32/sysdep.h >> index 829eec266a..bff18bdc8b 100644 >> --- a/sysdeps/powerpc/powerpc32/sysdep.h >> +++ b/sysdeps/powerpc/powerpc32/sysdep.h >> @@ -90,9 +90,12 @@ GOT_LABEL: ; \ >> cfi_endproc; \ >> ASM_SIZE_DIRECTIVE(name) >> >> -#define DO_CALL(syscall) \ >> - li 0,syscall; \ >> - sc >> +#define DO_CALL(syscall) \ >> + li 0,syscall; \ >> + DO_CALL_SC > > nit: there are some innocuous whitespace changes which could be avoided to minimize > the diff (moving the '\' closer and changing from spaces to a tab). > Fixed. >> + >> +#define DO_CALL_SC \ >> + sc >> >> #undef JUMPTARGET >> #ifdef PIC >> @@ -106,14 +109,20 @@ GOT_LABEL: ; \ >> # define HIDDEN_JUMPTARGET(name) __GI_##name##@local >> #endif >> >> +#define TAIL_CALL_SYSCALL_ERROR \ >> + b __syscall_error@local >> + >> #define PSEUDO(name, syscall_name, args) \ >> .section ".text"; \ >> ENTRY (name) \ >> DO_CALL (SYS_ify (syscall_name)); >> >> +#define RET_SC \ >> + bnslr+; >> + >> #define PSEUDO_RET \ >> - bnslr+; \ >> - b __syscall_error@local >> + RET_SC; \ >> + TAIL_CALL_SYSCALL_ERROR >> #define ret PSEUDO_RET >> >> #undef PSEUDO_END >> diff --git a/sysdeps/powerpc/powerpc64/sysdep.h b/sysdeps/powerpc/powerpc64/sysdep.h >> index d557098898..2d7dde64da 100644 >> --- a/sysdeps/powerpc/powerpc64/sysdep.h >> +++ b/sysdeps/powerpc/powerpc64/sysdep.h >> @@ -17,6 +17,7 @@ >> . */ >> >> #include >> +#include >> >> #ifdef __ASSEMBLER__ >> >> @@ -263,10 +264,72 @@ LT_LABELSUFFIX(name,_name_end): ; \ >> TRACEBACK_MASK(name,mask); \ >> END_2(name) >> >> +/* We will allocate a new frame to save LR and the non-volatile register used to >> + read the TCB when checking for scv support on syscall code. We actually just >> + need the minimum frame size plus room for 1 reg (64 bits). But the ABI > > nit: Since everything is in bytes below, suggest changing "64 bits" to "8 bytes". Fixed. > >> + mandates stack frames should be aligned at 16 Bytes, so we end up allocating >> + a bit more space then what will actually be used. */ >> +#define SCV_FRAME_SIZE (FRAME_MIN_SIZE+16) > > 8 for the register save area + 8 more to maintain 16-byte alignment. OK. > >> +#define SCV_FRAME_NVOLREG_SAVE FRAME_MIN_SIZE >> + >> +/* Allocate frame and save register */ >> +#define NVOLREG_SAVE \ >> + stdu r1,-SCV_FRAME_SIZE(r1); \ >> + std r31,SCV_FRAME_NVOLREG_SAVE(r1); \ >> + cfi_adjust_cfa_offset(SCV_FRAME_SIZE); >> + >> +/* Restore register and destroy frame */ >> +#define NVOLREG_RESTORE \ >> + ld r31,SCV_FRAME_NVOLREG_SAVE(r1); \ >> + addi r1,r1,SCV_FRAME_SIZE; \ >> + cfi_adjust_cfa_offset(-SCV_FRAME_SIZE); >> + >> +/* Check PPC_FEATURE2_SCV bit from hwcap2 in the TCB and update CR0 >> + * accordingly. First, we check if the thread pointer != 0, so we don't try to >> + * access the TCB before it has been initialized, e.g. inside the dynamic >> + * loader. If it is already initialized, check if scv is available. On both >> + * negative cases, go to JUMPFALSE (label given by the macro's caller). We >> + * save the value we read from the TCB in a non-volatile register so we can >> + * reuse it later when exiting from the syscall in PSEUDO_RET. */ > > Florian already mentioned removing the leading '*' for proper formatting. Fixed. > >> + .macro CHECK_SCV_SUPPORT REG JUMPFALSE >> + >> + /* Check if thread pointer has already been setup */ >> + cmpdi r13,0 >> + beq \JUMPFALSE >> + >> + /* Read PPC_FEATURE2_SCV from TCB and store it in REG */ >> + ld \REG,TCB_HWCAP(PT_THREAD_POINTER) >> + andis. \REG,\REG,PPC_FEATURE2_SCV>>16 >> + >> + beq \JUMPFALSE >> + .endm >> + >> +/* Before doing the syscall, check if we can use scv. scv is supported by P9 >> + * and later with Linux v5.9 and later. If so, use it. Otherwise, fallback to >> + * sc. We use a non-volatile register to save hwcap2 from the TCB, so we need >> + * to save its content beforehand. */ > > Here, too. Also need one more space after '.'. Fixed and fixed. > >> #define DO_CALL(syscall) \ >> - li 0,syscall; \ >> + li r0,syscall; \ >> + NVOLREG_SAVE; \ >> + CHECK_SCV_SUPPORT r31 0f; \ >> + DO_CALL_SCV; \ >> + b 1f; \ >> +0: DO_CALL_SC; \ >> +1: >> + >> +/* DO_CALL_SC and DO_CALL_SCV expect the syscall number to be loaded on r0. */ > > nit: s/loaded on/in/ Fixed too. > > rest looks OK. > > With the comments fixed, LGTM. Fixing the nits is up to you. > > PC > Queued changes for v2. Thanks, Matheus Castanho