From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 48) id 5E652386F447; Mon, 15 Jun 2020 21:53:47 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 5E652386F447 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1592258027; bh=3BDUG3XGeWgpCmHf/qw240AFG970NOjVaGzqFsADWwM=; h=From:To:Subject:Date:From; b=rPXRQ2mkDrwXzmkQqOc1RuK2LIM0hCsWH8q4MWClrTNzgjhslcN4EhWtBANHs/N/i RW9H9O4G/fbJkjII4cHPLMw3oI7qJxTMfAx0YBHDM3rMkvlJdeF/6v0SmhHx8Yf62t yHqNjzb9wBnHZ3H0Oxq/QMyKD8q/NKqsfHZwCkys= From: "markalle at us dot ibm.com" To: gcc-bugs@gcc.gnu.org Subject: [Bug inline-asm/95692] New: PPC64, suspicious store in front of inline assembly section Date: Mon, 15 Jun 2020 21:53:47 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: new X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gcc X-Bugzilla-Component: inline-asm X-Bugzilla-Version: 10.1.0 X-Bugzilla-Keywords: X-Bugzilla-Severity: normal X-Bugzilla-Who: markalle at us dot ibm.com X-Bugzilla-Status: UNCONFIRMED X-Bugzilla-Resolution: X-Bugzilla-Priority: P3 X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter target_milestone attachments.created Message-ID: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: gcc-bugs@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-bugs mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Jun 2020 21:53:47 -0000 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D95692 Bug ID: 95692 Summary: PPC64, suspicious store in front of inline assembly section Product: gcc Version: 10.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: inline-asm Assignee: unassigned at gcc dot gnu.org Reporter: markalle at us dot ibm.com Target Milestone: --- Created attachment 48735 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=3D48735&action=3Dedit Reproducer, includes Makefile (which downloads and configures openmpi-4.0.3) and a couple .c filles that use the patcher out of Open MPI *** System/version info: The system on which the suspicious behavior was observed is PPC64 running Redhat 8.1. % uname -a Linux f8n02 4.18.0-147.13.2.el8_1.ppc64le #1 SMP Wed May 13 15:23:36 UTC 20= 20 ppc64le ppc64le ppc64le GNU/Linux % cat /etc/redhat-release Red Hat Enterprise Linux release 8.1 (Ootpa) % gcc --version gcc (GCC) 8.3.1 20190507 (Red Hat 8.3.1-4) Copyright (C) 2018 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. I also downloaded and built gcc-10.1.0 and observed the same behavior *** Problem: Open MPI has a symbol patcher it uses to intercept calls like munmap to know when virtual to physical memory mappings can change. On PPC64 the patcher code replaces "munmap" with a branch to a new function intercept_munmap(). The target function intercept_munmap() wants its own r2 table of contents, so it has a BEGIN/END macro where it saves, overwrites, then restores r2. We can edit the code in small ways to produces passing/failing versions of this. Starting with one of the passing versions, the instructions for the pre-patched version of the target function "intercept_munmap()" are 0x0000000010000ff4 <+28>: std r3,128(r31) 0x0000000010000ff8 <+32>: std r4,136(r31) 0x0000000010000ffc <+36>: std r2,104(r31) 0x0000000010001000 <+40>: nop 0x0000000010001004 <+44>: nop 0x0000000010001008 <+48>: nop 0x000000001000100c <+52>: nop 0x0000000010001010 <+56>: nop where the table of contents r2 is being saved in a local variable, and those place holder NOP operations will be modified to overwrite r2. The failing versions of the program are the ones where intercept_munmap() is generated as 0x0000000010000ff4 <+28>: std r2,24(r1) 0x0000000010000ff8 <+32>: std r3,128(r31) 0x0000000010000ffc <+36>: std r4,136(r31) 0x0000000010001000 <+40>: std r2,104(r31) 0x0000000010001004 <+44>: nop 0x0000000010001008 <+48>: nop 0x000000001000100c <+52>: nop 0x0000000010001010 <+56>: nop 0x0000000010001014 <+60>: nop Here that first save of r2 to $r1+24 is picking up the pre-patched value, and if that gets used later it will fail. When it comes to tracing through the Open MPI code, the central function in the patcher is mca_patcher_overwrite_apply_patch() at openmpi-4.0.3_mod/opal/mca/patcher/overwrite/patcher_overwrite_module.c:203 The main values there are sys_addr =3D munmap from glibc hook_addr =3D intercept_munmap, the function to replace munmap At the bottom of mca_patcher_overwrite_apply_patch() the instructions at sys_addr are replaced by instructions that change r11 to hook_addr and branch there. Above that the target code in intercept_munmap() is also modified to save, overwrite, and later restore the table of contents r2. If it matters, the particulars of how this failing case accesses bad data relative to r2 is that when intercept_munmap() calls fn() via a function pointer, the return from that only restores r2 to $r1+24, and then the syscall() that disassembles as [plt_call.syscall@@GLIBC_2.17] 0x0000000010000bc0 <+0>: std r2,24(r1) 0x0000000010000bc4 <+4>: ld r12,-32304(r2) 0x0000000010000bc8 <+8>: mtctr r12 0x0000000010000bcc <+12>: bctr 0x0000000010000bd0 <+16>: .long 0x0 0x0000000010000bd4 <+20>: .long 0x0 0x0000000010000bd8 <+24>: .long 0x0 0x0000000010000bdc <+28>: .long 0x0 is pulling from the wrong table of contents. If r2 had been restoredd to $r31+128 then the syscall branch would have gone to the right place. I think the root of the problem is the early store of r2, and it should be considered a bug that gcc put this in front of the inline assembly that's put in place by OPAL_PATCHER_BEGIN. For what it's worth, editing openmpi-4.0.3_mod/opal/mca/patcher/patcher.h to add "ld r2,24(r1)" or similar (to make it contain instructions that write to r2) doesn't keep gcc from putting the suspicious early save of r2 in front of OPAL_PATCHER_BEGIN. *** Explanation of reproducer package: The Makefile wgets openmpi-4.0.3.tar.gz and configures it which makes all the necessary headers exist and take the PPC64 path. Also the main entry function mca_patcher_overwrite_patch_symbol() is static in Open MPI so we edit out the static so we can call it directly. After that the compilation is pretty simple. The code defines an intercept_munmap() that can be disassembled to see the suspicious early store of r2. Running ./x will segv in the syscall to SYS_munmap when it tries to branch to $r2-32304 when r2 is loaded with the wrong table of contents. *** Example: % make % gdb ./x (gdb) break intercept_munmap (gdb) run (gdb) disassemble $pc =3D> 0x0000000010000ff4 <+28>: std r2,24(r1) 0x0000000010000ff8 <+32>: std r3,128(r31) 0x0000000010000ffc <+36>: std r4,136(r31) 0x0000000010001000 <+40>: std r2,104(r31) 0x0000000010001004 <+44>: lis r2,0 0x0000000010001008 <+48>: ori r2,r2,0 0x000000001000100c <+52>: rldicr r2,r2,32,31 0x0000000010001010 <+56>: oris r2,r2,4098 0x0000000010001014 <+60>: ori r2,r2,32512 0x0000000010001018 <+64>: addis r9,r2,-2 0x000000001000101c <+68>: addi r9,r9,-28612 0x0000000010001020 <+72>: std r9,112(r31) 0x0000000010001024 <+76>: ld r9,112(r31) 0x0000000010001028 <+80>: mr r12,r9 0x000000001000102c <+84>: mtctr r12 0x0000000010001030 <+88>: bctrl 0x0000000010001034 <+92>: ld r2,24(r1) 0x0000000010001038 <+96>: ld r5,136(r31) 0x000000001000103c <+100>: ld r4,128(r31) 0x0000000010001040 <+104>: li r3,91 0x0000000010001044 <+108>: bl 0x10000bc0 <00000022.plt_call.syscall@@GLIBC_2.17> 0x0000000010001048 <+112>: ld r2,24(r1) 0x000000001000104c <+116>: mr r9,r3 0x0000000010001050 <+120>: stw r9,96(r31) 0x0000000010001054 <+124>: ld r2,104(r31) 0x0000000010001058 <+128>: nop 0x000000001000105c <+132>: mr r3,r9 0x0000000010001060 <+136>: addi r1,r31,160 0x0000000010001064 <+140>: ld r0,16(r1) 0x0000000010001068 <+144>: mtlr r0 0x000000001000106c <+148>: ld r31,-8(r1) 0x0000000010001070 <+152>: blr 0x0000000010001074 <+156>: .long 0x0 0x0000000010001078 <+160>: .long 0x1000000 0x000000001000107c <+164>: .long 0x1000180 (gdb) p /x $r2 0x200000307300 (old toc saved at $r1+24) (gdb) stepi 9 (gdb) p /x $r2 0x10027f00 (after the instructions that replace $r2) (gdb) stepi 7 (branches into foo() via function pointer) (gdb) fin (come back from foo()) (gdb) p /x $r2 0x10027f00 (this is the new value and would be fine if it stayed) (gdb) p /x *(long long*)((char*)$r2 - 32304) 0x10003ef0 (gdb) info symbol 0x10003ef0 syscall@plt (just confirming that the current $r2 has good data at $r2-32304) (gdb) stepi (executing the instruction at <+92>) (gdb) p /x $r2 0x200000307300 (old toc saved at $r1+24) (gdb) p /x *(long long*)((char*)$r2 - 32304) 0x200000304fb0 (gdb) info symbol 0x200000304fb0 __exit_funcs_lock (I think this is leftover, not the function we want to branch to) (gdb) stepi 4 (gdb) disassemble $pc 0x0000000010000bc0 <+0>: std r2,24(r1) 0x0000000010000bc4 <+4>: ld r12,-32304(r2) 0x0000000010000bc8 <+8>: mtctr r12 0x0000000010000bcc <+12>: bctr 0x0000000010000bd0 <+16>: .long 0x0 0x0000000010000bd4 <+20>: .long 0x0 0x0000000010000bd8 <+24>: .long 0x0 0x0000000010000bdc <+28>: .long 0x0 (gdb) stepi 3 (gdb) p /x $r12 0x200000304fb0 Now it's about to branch into __exit_funcs_lock and segv when it wanted to branch to syscall@plt which is what it would have got if $r2 had the right table of contents. *** section of preprocessed test.i If -save-temps is added to the Makefile, here's a clip of the output for te= st.i int intercept_munmap(void *start, size_t length) { unsigned long toc_save; asm volatile ("std 2, %0" : "=3Dm" (toc_save));= asm volatile ("nop; nop; nop ; nop; nop");; volatile MyFunction_t fn; fn =3D foo; fn(); int result =3D syscall( # 26 "test.c" 3 4 91 # 26 "test.c" , start, length); asm volatile ("ld 2, %0" : : "m" (toc_save));; }=